As we continue to test our Fractal Tree Indexing with MongoDB, I’ve been updating my benchmark infrastructure so I can compare performance, correctness, and resource utilization. Sysbench has long been a standard for testing MySQL performance, so I created a version that is compatible with MongoDB. You can grab my current version of Sysbench for MongoDB here.
So what exactly is Sysbench? According to the Sysbench homepage, “Sysbench is a modular, cross-platform and multi-threaded benchmark tool for evaluating OS [Operating System] parameters that are important for a system running a database under intensive load.”
- Sysbench schema
- 16 copies of the same collection, named sbtest1 … sbtest16, each with 10 million documents
- each has a secondary index on “k”
- documents structured as follows
{
  "_id" : 1 … 10000000
  "k" : random integer between 1 and 10000000
  "c" : 10 segments of 11 random digits plus hyphen (119 characters total)
  "pad" : 5 segments of 11 random digits plus hyphen (59 characters total)
}
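To make the document layout concrete, here is a minimal Python sketch that builds one document of the shape described above. It is not the loader from the benchmark itself; the helper names `random_segments` and `make_document` are hypothetical, and the sketch assumes the segments are joined by single hyphens, which is what gives the stated 119 and 59 character lengths.

```python
import random

NUM_DOCS = 10_000_000  # documents per collection, per the schema above


def random_segments(rng, count, digits=11):
    """Return `count` groups of `digits` random digits joined by hyphens."""
    return "-".join(
        "".join(rng.choice("0123456789") for _ in range(digits))
        for _ in range(count)
    )


def make_document(doc_id, rng):
    """Build one sbtest-style document as a plain dict (hypothetical helper)."""
    return {
        "_id": doc_id,
        "k": rng.randint(1, NUM_DOCS),
        "c": random_segments(rng, 10),   # 10 * 11 digits + 9 hyphens = 119 chars
        "pad": random_segments(rng, 5),  # 5 * 11 digits + 4 hyphens = 59 chars
    }


rng = random.Random(42)
doc = make_document(1, rng)
print(len(doc["c"]), len(doc["pad"]))  # prints: 119 59
```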
- Sysbench workload – a single sysbench “transaction” consists of the following operations
- each client thread chooses a random collection to perform all the following operations on
- all operations are keyed by _id, sequentially if a range operation
- 10 random point lookups of “c”
- 1 range lookup of “c” of 100 documents
- 1 sum of “k” of 100 documents using the aggregation framework
- 1 ordered range lookup of “c” of 100 documents
- 1 distinct range lookup of “c” of 100 documents
- update attribute “k” (indexed) in 1 random document
- update attribute “c” (unindexed) in 1 random document
- delete 1 random document by _id, then insert a new document using the same _id
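The operation mix above can be sketched as a plan of operations, without touching a database. This is only an illustration in Python: the operation names, the `build_transaction` helper, and the choice of an inclusive `_id` range of exactly 100 values are assumptions, not taken from the benchmark source.

```python
import random

NUM_COLLECTIONS = 16     # sbtest1 … sbtest16
NUM_DOCS = 10_000_000    # documents per collection
RANGE_SIZE = 100         # documents touched by each range operation


def build_transaction(rng):
    """Return one transaction as a list of (operation, collection, params) tuples."""
    # Every operation in the transaction runs against one randomly chosen collection.
    coll = "sbtest%d" % rng.randint(1, NUM_COLLECTIONS)

    def random_id():
        return rng.randint(1, NUM_DOCS)

    def random_range():
        # Sequential _id values; inclusive bounds are an assumption of this sketch.
        first = rng.randint(1, NUM_DOCS - RANGE_SIZE + 1)
        return {"first_id": first, "last_id": first + RANGE_SIZE - 1}

    ops = [("point_lookup_c", coll, {"_id": random_id()}) for _ in range(10)]
    ops.append(("range_lookup_c", coll, random_range()))
    ops.append(("sum_k", coll, random_range()))
    ops.append(("ordered_range_lookup_c", coll, random_range()))
    ops.append(("distinct_range_lookup_c", coll, random_range()))
    ops.append(("update_k", coll, {"_id": random_id()}))
    ops.append(("update_c", coll, {"_id": random_id()}))
    # The delete and the following insert reuse the same _id.
    replaced_id = random_id()
    ops.append(("delete", coll, {"_id": replaced_id}))
    ops.append(("insert", coll, {"_id": replaced_id}))
    return ops


ops = build_transaction(random.Random(1))
print(len(ops))  # prints: 18
```

Counting the delete and the insert separately, one transaction issues 18 operations, 14 of which are reads.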
Benchmark Environment
- Sun x4150, (2) Xeon 5460, 16GB RAM, StorageTek Controller (256MB, write-back), 4x10K SAS/RAID 0
- Ubuntu 10.04 Server (64-bit), ext4 file system
- MongoDB v2.2.3 and MongoDB v2.2.0 + Fractal Tree Indexes
Benchmark Results – Performance
Our throughput was higher than MongoDB’s at every concurrency level. Our largest win was at 8 concurrent threads, where we are 133% faster (40.50 tps vs. 17.38 tps). Not bad for a first pass, as we have many more ideas to come that will push our line higher.
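For readers who want to check the percentage, here is the arithmetic as a short Python snippet. It assumes the 40.50 tps figure belongs to the Fractal Tree build and the 17.38 tps figure to stock MongoDB, which is the only reading consistent with "faster"; the variable names are illustrative.

```python
mongodb_tps = 17.38        # stock MongoDB at 8 threads (assumed assignment)
fractal_tree_tps = 40.50   # MongoDB + Fractal Tree Indexes at 8 threads

# "X% faster" here means the relative increase in throughput.
speedup_pct = (fractal_tree_tps / mongodb_tps - 1) * 100
print(round(speedup_pct))  # prints: 133
```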
Benchmark Results – Compression
Compared to MongoDB’s file system size (61.36GB), our zlib-compressed size on disk takes up only 31.36% of the space (19.24GB). Compression has always been a strength of Fractal Tree Indexes, and some data can compress much more than this. I recently wrote another blog about our compression abilities on a different data set.
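The same two sizes can also be read as a compression factor. A quick Python check, using only the figures quoted above (variable names are illustrative):

```python
mongodb_gb = 61.36             # MongoDB size on disk
fractal_tree_zlib_gb = 19.24   # Fractal Tree Indexes with zlib, size on disk

space_used_pct = fractal_tree_zlib_gb / mongodb_gb * 100
compression_factor = mongodb_gb / fractal_tree_zlib_gb

print(round(space_used_pct, 2))      # prints: 31.36
print(round(compression_factor, 2))  # prints: 3.19
```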
We will continue to share our results with the community and get people’s thoughts on applications where this might help, suggestions for next steps, and any other feedback. Please drop us a line if you are interested in becoming a beta tester.
I’ll be presenting my benchmarking infrastructure at Percona Live in April. If you are attending the show be sure to stop by our booth and learn more about TokuDB.
Hi Tim, I can’t figure out how to use this tool. How do I combine it with Sysbench? Can you help me?
This tool is a complete implementation of Sysbench, specifically developed to run the workload against MongoDB.
Do you have any examples of how to use this modified version of sysbench? 🙂 I could always just look in the scripts, but a README or something would be nice.
Thanks for your work on creating this
Chris,
Thanks for the feedback. I’m working on improving my benchmarks with both documentation and functionality. Stay tuned.
Hey Tim,
Do you have a manual, README, or other instructions showing how to run your sysbench-mongodb? I downloaded the benchmark from GitHub, but it contains a bunch of .bash scripts and two Java files in the “src/” dir. It would be highly appreciated if you could provide a clear set of instructions.
Sorry for the overdue response, I just cleaned it up yesterday. The README is now available, and you can easily run against MongoDB or TokuMX.
TokuMX has really reduced our disk size requirement and increased our insertion speed. The only problem is that our read queries are taking too much time in TokuMX. Can you add a link to a read benchmark like the one for insertion?
Pranjal, the majority of the Sysbench workload is reads, including point and range queries on both primary key and secondary indexes. One reason that your read queries are slower may be the type of compression you are using on your collections, as higher compression requires more CPU, which can increase the latency of queries. Which compression type are you using? Have you run your tests with the various algorithms available in TokuMX?
I encourage you to post more information about your schema and workload in our tokumx-user Google Group.
Hi,
How do I run mongodb-sysbench against a MongoDB database with auth and SSL enabled?
Thanks
Hemant
Good question. There is a simple explanation at http://mongodb.github.io/casbah/guide/connecting.html, and you can easily add a few lines of code to accomplish authentication and/or SSL.
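As a rough illustration of what such a connection involves, the Python sketch below assembles a standard MongoDB connection string with credentials and the `ssl` and `authSource` options. It does not touch the benchmark code; the `build_mongodb_uri` helper, the user name, the password, and the database name are all hypothetical, and whether the benchmark can be pointed at a URI like this without code changes is an assumption.

```python
from urllib.parse import quote_plus


def build_mongodb_uri(host, port, database, username, password,
                      use_ssl=True, auth_source="admin"):
    """Assemble a MongoDB connection string (hypothetical helper)."""
    # User name and password must be percent-encoded inside a URI.
    credentials = "%s:%s" % (quote_plus(username), quote_plus(password))
    options = "authSource=%s" % quote_plus(auth_source)
    if use_ssl:
        options += "&ssl=true"
    return "mongodb://%s@%s:%d/%s?%s" % (credentials, host, port, database, options)


# Example values only; none of them come from the benchmark's configuration.
uri = build_mongodb_uri("localhost", 27017, "sbtest", "bench", "p@ss/word")
print(uri)
# prints: mongodb://bench:p%40ss%2Fword@localhost:27017/sbtest?authSource=admin&ssl=true
```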
in the config.bash, I see these:
# total number of simultaneous insertion threads (for loader)
# valid values : integer > 0
export NUM_LOADER_THREADS=8
# total number of simultaneous benchmark threads
# valid values : integer > 0
export NUM_WRITER_THREADS=64
I don’t know the difference between these two. Could you tell me?