MongoDB Sharding
Did some testing of MongoDB sharding on Friday.
Some interesting conclusions:
- While data is being rebalanced between shards, the same record can appear in a result set more than once.
- We inserted 1,000,000 records, and it took ~15 minutes to rebalance about 368k of them from one shard to the other, or roughly 400 records per second (the dataset was weighted such that initially all records went to one shard).
- Query times for an unindexed collection were pretty similar to an unindexed SQL table of similar size.
- Getting data to be easily accessible and manipulable will require lots of denormalization, which in turn requires well-organized code and a separate integrity-checker.
- If queries can't be spread efficiently across multiple shards, the main benefit of Mongo would be its document-oriented schema rather than more rigid tables.
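
One way to cope with the duplicate results seen during rebalancing is to de-duplicate on the client. The following is only a minimal sketch, not something we ran in the test: it assumes each record carries a hashable `_id` (an ObjectId or integer, not an embedded document) and that a duplicate is an exact copy of a record with the same `_id`. The function name `dedupe_by_id` and the sample data are made up for illustration.

```python
def dedupe_by_id(docs):
    """Yield each document once, keyed on its _id, keeping the first copy seen.

    Assumption: _id is hashable and identifies the record across shards.
    """
    seen = set()
    for doc in docs:
        doc_id = doc["_id"]
        if doc_id in seen:
            continue  # already returned this record, skip the extra copy
        seen.add(doc_id)
        yield doc


# Hypothetical result set in which record 1 was returned twice.
results = [
    {"_id": 1, "name": "a"},
    {"_id": 2, "name": "b"},
    {"_id": 1, "name": "a"},
]
unique = list(dedupe_by_id(results))
```

Because it is a generator, this can wrap a cursor without loading everything first, though the set of seen ids still grows with the size of the result.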
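
To make the "separate integrity-checker" idea a bit more concrete, here is a rough sketch of what one check could look like. Everything in it is hypothetical: it assumes a `users` collection as the source of truth and a `posts` collection that keeps a denormalized copy of the author's name in `author_name`. It works on plain lists of dicts, so the records could come from any driver.

```python
def find_stale_copies(users, posts):
    """Return (post_id, problem) pairs for posts whose denormalized
    author_name no longer matches the users collection.

    Assumption: users is authoritative; posts hold a copied author_name.
    """
    names = {user["_id"]: user["name"] for user in users}
    problems = []
    for post in posts:
        expected = names.get(post["author_id"])
        if expected is None:
            problems.append((post["_id"], "missing author"))
        elif post["author_name"] != expected:
            problems.append((post["_id"], "stale author_name"))
    return problems


# Hypothetical sample data: post 11 has an outdated name, post 12 an unknown author.
users = [{"_id": 1, "name": "Ann"}, {"_id": 2, "name": "Bob"}]
posts = [
    {"_id": 10, "author_id": 1, "author_name": "Ann"},
    {"_id": 11, "author_id": 2, "author_name": "Robert"},
    {"_id": 12, "author_id": 3, "author_name": "Cy"},
]
problems = find_stale_copies(users, posts)
```

A real checker would probably run as a scheduled job and work in batches rather than holding every user name in memory, but the shape of the comparison would be the same.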