Secondary reads:
The safest way to read consistent MongoDB data (and in particular the only safe way to read consistent, non-duplicate data on sharded clusters) is to perform primary-only reads. That, however, leaves the secondaries underutilized, serving only as high-availability standbys. Most of our clients use secondary reads for additional scaling and better hardware utilization, and we recommend taking the following points into account:
- When relying on secondary reads, it's imperative to size the instances so that the smallest possible majority of members (for example, two members of a three-member replica set, or three members of a five-member replica set) can handle the full production load. The reason is that if a replica set member becomes unavailable (down for maintenance, an AWS instance disappearing, etc.), you still want MongoDB's high availability features to keep everything running smoothly rather than triggering a failure cascade.
- The replication mechanism introduces a slight delay before writes are applied on a secondary, so the application(s) have to be OK with occasionally reading data that does not include the latest update. A typical example is one application or user updating a document and another app/user reading that document from a secondary immediately after the first write. We strongly recommend that queries that have to provide a guaranteed consistent view of the data are executed on the primary (see the read preference sketch after this list).
- In general, we advise against using secondary reads on a sharded cluster due to the potential for receiving duplicated data if a chunk of data has been moved from one shard to another, but hasn’t been deleted yet on the source shard. Primary reads have mechanisms in their code path that avoid this behavior. I’ve only added this point for future reference and we can discuss this in more depth when some collections need to be sharded.
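As a minimal sketch of the read-preference split described above, assuming the standard PyMongo driver (the connection string, collection, and field names are hypothetical):

```python
from bson import ObjectId
from pymongo import MongoClient, ReadPreference

# Hypothetical connection string, database, and collection names.
client = MongoClient("mongodb://localhost:27017/?replicaSet=rs0")
db = client["appdb"]

# Reads that can tolerate slightly stale data: prefer a secondary,
# falling back to the primary if no secondary is available.
reporting = db.get_collection(
    "orders", read_preference=ReadPreference.SECONDARY_PREFERRED)
recent = list(reporting.find({"status": "shipped"}).limit(100))

# Reads that must reflect the latest write (read-your-own-writes):
# route them explicitly to the primary.
consistent = db.get_collection(
    "orders", read_preference=ReadPreference.PRIMARY)
order_id = ObjectId()  # placeholder for an _id written moments ago
doc = consistent.find_one({"_id": order_id})
```

Setting the read preference per collection handle (or per query) keeps the default safe while letting the tolerant code paths opt in to secondary reads explicitly.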
Data sizes, index sizes and counts, RAM
In general, we recommend keeping the number of indexes on a single collection in the low to mid-twenties range. The problem with a large number of indexes, especially indexes that are very similar to each other, like the ones on the indexing structures, is that they can prevent the query optimizer from selecting the best index in a performant manner. They also take up RAM, as MongoDB prioritizes holding index data in its internal WiredTiger (WT) cache over holding (uncompressed) document data. The other issue is write performance: all of these indexes take time to update when data is written, and while the database engine is usually good at only updating the indexes that contain the changed fields, the presence of the indexing structure means that every index built on it has to be updated on every write.
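A quick way to get an overview of the index counts and spot near-duplicate definitions is to enumerate the indexes per collection; a small sketch, assuming PyMongo and a hypothetical database name:

```python
from pymongo import MongoClient

# Hypothetical connection string and database name.
client = MongoClient("mongodb://localhost:27017/")
db = client["appdb"]

for name in db.list_collection_names():
    info = db[name].index_information()
    # index_information() maps index name -> {'key': [(field, direction), ...], ...}
    print(f"{name}: {len(info)} indexes")
    for idx_name, spec in sorted(info.items()):
        print(f"  {idx_name}: {spec['key']}")
```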
A related issue is sizing. We normally recommend sizing the machines/AWS instances so the WT cache (which defaults to 50% of RAM) can hold the working set. We define the working set as the indexes and the data the application is currently using, plus a little headroom. With the current instance sizes, the WT cache comes in at roughly 15 GB, but the indexes on the collection alone are a little over 20 GB. So right now the database engine can only ever hold part of the index data in memory, leaves relatively little RAM for caching the actual data, and therefore spends a lot of time paging data in from the OS file cache and compressing/decompressing it. From experience, we suspect that the index size versus the available RAM is likely to be a major contributor to the performance issues you are seeing.
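To check this ratio on your own deployment, the configured WT cache size and the total index size can be compared directly; a rough sketch, assuming PyMongo and a hypothetical database name:

```python
from pymongo import MongoClient

# Hypothetical connection details; the database name is a placeholder.
client = MongoClient("mongodb://localhost:27017/")
db = client["appdb"]

# Configured WiredTiger cache size, from serverStatus.
status = client.admin.command("serverStatus")
cache_bytes = status["wiredTiger"]["cache"]["maximum bytes configured"]

# Total index size across all collections, from collStats (bytes).
index_bytes = 0
for name in db.list_collection_names():
    stats = db.command("collStats", name)
    index_bytes += stats["totalIndexSize"]

gib = 1024 ** 3
print(f"WT cache configured: {cache_bytes / gib:.1f} GiB")
print(f"Total index size:    {index_bytes / gib:.1f} GiB")
```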
The consensus was a general recommendation to reduce the number of indexes and to consider moving to a larger instance size, especially in light of the expected data growth.
One suggestion that we would like to discuss with you on-site is to test some of the known slow queries with hinted indexes to bypass the plan evaluation phase. We can also use this technique to test the performance impact of creating an index for the indexing structure without any sort fields, which pushes the sort into a separate sort stage.
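A minimal sketch of the hinting pattern, assuming PyMongo; the collection, field, and index names below are hypothetical and would be replaced with the slow queries we identify together:

```python
from pymongo import ASCENDING, DESCENDING, MongoClient

# Hypothetical collection, field, and index names for illustration.
client = MongoClient("mongodb://localhost:27017/")
coll = client["appdb"]["orders"]

# Force a specific index by name, skipping most of the plan evaluation phase.
cursor = (coll.find({"customerId": 12345, "status": "open"})
              .hint("customerId_1_status_1"))

# Variant hinting an index without sort fields: the sort runs as a separate
# (possibly in-memory) stage, which is the cost we would like to measure.
cursor_sorted = (coll.find({"customerId": 12345, "status": "open"})
                     .sort("createdAt", DESCENDING)
                     .hint([("customerId", ASCENDING), ("status", ASCENDING)]))

# explain() shows which plan was chosen and how long each stage took.
plan = (coll.find({"customerId": 12345, "status": "open"})
            .hint("customerId_1_status_1")
            .explain())
```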
Queries via the aggregation framework vs regular queries
I had a word with our internal aggregation framework specialist and she confirmed my thinking: running the queries as an aggregation pipeline is going to be slightly less performant than running the same query with the same matching/projection/sort as a regular query. This is mainly due to the additional overhead of analyzing the pipeline and then pushing the information down to the query engine. That said, the overhead is minimal and would only matter if we were trying to shave milliseconds off your query times. From this perspective (and unless we find evidence to the contrary in the diagnostic data), we think we can safely ignore it.
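For reference, the two equivalent forms look like this; a sketch in PyMongo with hypothetical collection and field names:

```python
from pymongo import DESCENDING, MongoClient

# Hypothetical collection and field names.
client = MongoClient("mongodb://localhost:27017/")
coll = client["appdb"]["orders"]

query = {"status": "open"}
projection = {"_id": 0, "customerId": 1, "total": 1}

# Regular query: match / projection / sort handled directly by the query engine.
as_find = coll.find(query, projection).sort("total", DESCENDING)

# Same operation as an aggregation pipeline: the pipeline is analyzed first and
# the stages are then pushed down to the query engine, adding a small overhead.
as_aggregate = coll.aggregate([
    {"$match": query},
    {"$project": projection},
    {"$sort": {"total": -1}},
])
```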
MongoDB versions
For example, a project running MongoDB 3.4.6 in production whose engineers use a fair number of $in queries could benefit from 3.4.9, which includes a performance improvement for $in queries. In general, we recommend running the latest stable release to pick up fixes for known performance issues and security vulnerabilities.
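A small sketch of how to confirm the exact server version the application is talking to, and the kind of query the improvement above applies to; connection details and names are hypothetical, assuming PyMongo:

```python
from pymongo import MongoClient

# Hypothetical connection details and collection/field names.
client = MongoClient("mongodb://localhost:27017/")

# buildInfo reports the exact server version in use.
version = client.admin.command("buildInfo")["version"]
print("MongoDB server version:", version)  # e.g. "3.4.6"

# A typical $in query: matching one field against a list of candidate values.
coll = client["appdb"]["orders"]
open_or_pending = coll.find({"status": {"$in": ["open", "pending"]}})
```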