What issues have you faced with MongoDB at scale?

  • I have read a lot of negative things about MongoDB on Hacker News. Have you guys faced any significant issues with Mongo while scaling up?

  • That's an interesting question. A couple of things I'd watch out for:

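    1. Sharding early can be premature - a lot of people manage to scale up with a single replica set.
    2. Replication and backups aren't the same thing, and should be treated differently. A replica will faithfully copy an accidental delete, whereas a backup lets you go back to a point before it happened.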
    3. IMO, the key to scaling Mongo successfully is to prepare a good plan for deployment, as well as tracking metrics diligently.

    I'd also suggest looking into databases like RethinkDB, which provides a more powerful querying interface for JSON documents, as well as easier scaling and clustering.

  • Indexes are pretty much the reason databases are so fast. For me, the risk with MongoDB is that it can make developers lazy: it is easy to just throw a bunch of data into the database, and re-architecting the database later is very difficult.
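
    To make the index point concrete, here is a toy Python sketch. It is not MongoDB code: the `users` list, the `email` field and the helper names are made up for illustration, and a plain dictionary stands in for the index (real MongoDB indexes are B-trees, not hash maps). It only shows the difference between scanning every document and going through a prebuilt lookup structure.

```python
# Toy model: a "collection" is a list of dicts; an "index" is a dict
# mapping a field value to the position of the matching document.
users = [{"_id": i, "email": f"user{i}@example.com"} for i in range(100_000)]


def find_by_scan(docs, email):
    """Full collection scan: checks documents one by one (O(n))."""
    for doc in docs:
        if doc["email"] == email:
            return doc
    return None


def build_index(docs, field):
    """Build a value -> position map once, up front."""
    return {doc[field]: pos for pos, doc in enumerate(docs)}


def find_by_index(docs, index, value):
    """Indexed lookup: one hash lookup instead of a scan."""
    pos = index.get(value)
    return docs[pos] if pos is not None else None


email_index = build_index(users, "email")
target = "user99999@example.com"

# Both paths return the same document; only the amount of work differs.
assert find_by_scan(users, target) == find_by_index(users, email_index, target)
```

    The trade-off is the same as in a real database: the index costs memory and has to be kept up to date on every write, which is why it pays to decide up front which fields you will query on.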

    Personal blog: paraschopra.com/blog/


  • It varies from use case to use case. MongoDB gives you a flexible schema. This can be either a bane or a boon depending on your use case. For example, I was developing an ecommerce store from the ground up. In the beginning it felt good because I could modify the schema as requirements changed. But as the codebase grew I found myself struggling with the ORM layer. Most of the validation had to be done in the application layer. You also end up struggling with deeply nested documents.
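
    A minimal sketch of what validation in the application layer can look like in Python. The `PRODUCT_SCHEMA` fields and the `validate_product` helper are hypothetical, picked to fit the ecommerce example; the idea is just to reject malformed documents before they reach the database. Newer MongoDB versions also support server-side validation rules on a collection, which can take over part of this job.

```python
# Hypothetical schema for a product document: field name -> expected type.
PRODUCT_SCHEMA = {
    "name": str,
    "price_cents": int,
    "tags": list,
}


def validate_product(doc):
    """Return a list of problems; an empty list means the document is OK."""
    errors = []
    # Every schema field must be present and have the expected type.
    for field, expected_type in PRODUCT_SCHEMA.items():
        if field not in doc:
            errors.append(f"missing field: {field}")
        elif not isinstance(doc[field], expected_type):
            errors.append(f"wrong type for {field}: {type(doc[field]).__name__}")
    # Fields outside the schema are flagged (except the database-assigned _id).
    for field in doc:
        if field not in PRODUCT_SCHEMA and field != "_id":
            errors.append(f"unexpected field: {field}")
    # One example of a value-level rule on top of the type checks.
    if isinstance(doc.get("price_cents"), int) and doc["price_cents"] < 0:
        errors.append("price_cents must be non-negative")
    return errors


good = {"name": "Mug", "price_cents": 899, "tags": ["kitchen"]}
bad = {"name": "Mug", "price_cents": "8.99", "colour": "red"}
print(validate_product(good))
print(validate_product(bad))
```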

    Coming to read performance: MongoDB performs well for search and aggregation. In fact it has a very powerful aggregation framework. But all of this only holds while your data fits in RAM. As soon as MongoDB has to start reading data from disk, performance drops off, and it gets worse the more of your reads miss RAM.

    The same goes for write performance. You have to understand how MongoDB works underneath. Writes are applied in RAM and recorded in a 'journal'; the data files are flushed to disk periodically rather than on every write. I think the journal file size limit is 1GB. While MongoDB is flushing to disk, performance is very bad. MongoDB can handle thousands of writes per second precisely because it doesn't write each one directly to the data files on disk.

    Horizontal scalability is where MongoDB scores very well. Data is divided into shards and replicated across machines. But the major point is that for MongoDB to perform well you have to keep the data in RAM. This is the case with most NoSQL databases.
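
    As a rough illustration of "data is divided into shards", here is a toy Python router that maps a shard key to one of several shards by hashing it. The shard names and the `shard_for` helper are made up, and the modulo scheme is a simplification: MongoDB itself splits the key space (ranged or hashed) into chunks and moves those chunks between shards, which this sketch does not model.

```python
import hashlib

SHARDS = ["shard0", "shard1", "shard2"]  # hypothetical shard names


def shard_for(shard_key_value, shards=SHARDS):
    """Pick a shard from a stable hash of the shard key value."""
    digest = hashlib.sha256(str(shard_key_value).encode("utf-8")).hexdigest()
    return shards[int(digest, 16) % len(shards)]


# Route 9,000 made-up user ids and count how many land on each shard.
counts = {name: 0 for name in SHARDS}
for user_id in range(9_000):
    counts[shard_for(user_id)] += 1
print(counts)
```

    The same key always goes to the same shard, and a well-distributed key spreads documents fairly evenly. A poorly chosen shard key (for example one where most documents share a few values) would pile data onto a few shards, which is where a lot of sharding pain comes from.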

    MongoDB is very easy to manage. This is why so many budding companies use it. It is flexible, easy to scale, fast up to a certain limit, and has very little management overhead.

  • @harshul thanks.

  • In a key-value store like this, the keys (field names) are stored with every record. So reducing the size of your keys has an impact on Mongo's storage and performance. Or at least it used to a couple of years ago.
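
    A back-of-the-envelope way to see the key-size effect in Python. JSON is used here only as a stand-in for MongoDB's actual BSON encoding, and the field names are invented, so treat the numbers as an illustration of the trend rather than real storage figures. Storage engines that compress data on disk are likely to shrink the gap.

```python
import json


def encoded_size(docs):
    """Total size in bytes of the documents when serialized as JSON."""
    return sum(len(json.dumps(doc).encode("utf-8")) for doc in docs)


# Same data, stored once with long field names and once with short ones.
long_keys = [
    {"customer_identifier": i, "shipping_address_postcode": "12345"}
    for i in range(1000)
]
short_keys = [{"cid": i, "zip": "12345"} for i in range(1000)]

saved = encoded_size(long_keys) - encoded_size(short_keys)
print(f"long: {encoded_size(long_keys)} bytes")
print(f"short: {encoded_size(short_keys)} bytes")
print(f"saved: {saved} bytes")
```

    Because the names are repeated in every document, the saving grows linearly with the number of documents. The cost is readability, so a mapping layer in the application is usually needed to keep short names manageable.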

    Schemaless is not meant for lazy programmers who don't want to think about a schema. So don't use Mongo just so you don't have to think about your user stories and the data you are mapping.
