I went to MongoSF today – quite an event, and I hope to have a chance to write more about it. This post is about one replication problem and how MongoDB solves it.

If you’re using MySQL replication and your master goes down, it is possible for some writes to have been executed on the master but not on the slave that gets promoted to master. When the old master comes back up, it contains updates that make it inconsistent with the data on the new master. In the MySQL world we can choose to ignore this problem (or maybe even replay those changes on the new master and hope it works out), re-clone the old master from the new one, or use mk-table-checksum to find the inconsistencies and re-sync them with mk-table-sync. The latter two operations can be very expensive for large databases.
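To make the checksum idea concrete, here is a minimal Python sketch of what mk-table-checksum does conceptually: hash the same primary-key ranges of a table on the master and on the slave, then compare. The host names, the table name (orders) and the integer primary key (id) are assumptions for illustration, and the real tool is far more careful (it can, for example, run the checksum queries through replication so both sides see a consistent view).

```python
import hashlib

import pymysql  # any DB-API driver works the same way; pymysql is an assumption


def chunk_checksums(conn, table, chunk_size=10000):
    """Hash fixed-size primary-key ranges of a table: {chunk_start: md5}."""
    checksums = {}
    with conn.cursor() as cur:
        cur.execute("SELECT MIN(id), MAX(id) FROM " + table)
        lo, hi = cur.fetchone()
        if lo is None:  # empty table
            return checksums
        for start in range(lo, hi + 1, chunk_size):
            cur.execute(
                "SELECT * FROM " + table +
                " WHERE id >= %s AND id < %s ORDER BY id",
                (start, start + chunk_size))
            digest = hashlib.md5(repr(cur.fetchall()).encode()).hexdigest()
            checksums[start] = digest
    return checksums


master = pymysql.connect(host="master.example.com", user="check", db="app")
slave = pymysql.connect(host="slave.example.com", user="check", db="app")

on_master = chunk_checksums(master, "orders")
on_slave = chunk_checksums(slave, "orders")
stale = sorted(s for s in on_master if on_slave.get(s) != on_master[s])
print("chunks that need re-syncing:", stale)
```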

The approach MongoDB uses in Replica Sets is for the failed master to scan its log files for the object ids of everything that was modified after the point the slave last synchronized successfully, and to retrieve those objects back from the new master (or delete them if they no longer exist). This allows quick synchronization without any complex machinery for rolling back changes. There is a catch with this approach, though – because MongoDB has no local durability yet, it only works as long as the server itself stays up (for example, when the network goes down but the process survives); once Single Server Durability is implemented it will be pretty cool.
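To illustrate the idea, here is a minimal Python sketch of that resync step. It assumes a simplified oplog represented as (optime, collection, _id) tuples; the function and variable names are hypothetical, not MongoDB’s actual implementation.

```python
from pymongo import MongoClient  # only find_one/replace_one/delete_one are used


def resync_failed_master(local_db, new_primary_db, local_oplog, last_synced):
    """Re-fetch every object this failed master modified after the last
    optime the slave is known to have replicated; delete local copies of
    objects that no longer exist on the new primary."""
    touched = {(coll, oid)
               for optime, coll, oid in local_oplog
               if optime > last_synced}
    for coll, oid in touched:
        fresh = new_primary_db[coll].find_one({"_id": oid})
        if fresh is None:
            # Gone on the new primary: drop the stale local copy too.
            local_db[coll].delete_one({"_id": oid})
        else:
            # Overwrite the possibly diverged local copy with the primary's.
            local_db[coll].replace_one({"_id": oid}, fresh, upsert=True)


# Hypothetical usage: both hosts and the database name "app" are made up.
local = MongoClient("mongodb://failed-master:27017").app
primary = MongoClient("mongodb://new-primary:27017").app
resync_failed_master(local, primary, local_oplog=[], last_synced=0)
```

The nice property is that no undo information is needed: knowing which objects were touched is enough, because their current state can always be re-fetched from the new master.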

What is really interesting is that it should be possible to apply the same concept to MySQL replication, possibly with the help of tools like MMM. Row-based replication makes it possible to identify the rows which were changed on the master after the failover to the slave happened; they can be dumped to a local file (in case one wants to synchronize them manually) and then fetched again from the new master, as sketched below.
This will of course require the IDEMPOTENT slave mode (slave_exec_mode=IDEMPOTENT), but otherwise it should work unless there were DDL operations in between.
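As a rough illustration of what that could look like, the following Python sketch uses the python-mysql-replication package to read the failed master’s binlog from the last position the slave had applied and to collect the primary keys of every row touched after that point. The connection settings, the binlog coordinates, and the assumption of an integer id primary key are all illustrative, not part of any existing tool.

```python
from pymysqlreplication import BinLogStreamReader
from pymysqlreplication.row_event import (
    DeleteRowsEvent, UpdateRowsEvent, WriteRowsEvent)


def rows_changed_after_failover(failed_master, log_file, log_pos):
    """Return {(schema, table, id), ...} modified after the failover point."""
    stream = BinLogStreamReader(
        connection_settings=failed_master,
        server_id=999,  # must be unique among servers in the topology
        log_file=log_file, log_pos=log_pos, resume_stream=True,
        only_events=[DeleteRowsEvent, UpdateRowsEvent, WriteRowsEvent])
    touched = set()
    for event in stream:  # non-blocking by default: stops at end of binlog
        for row in event.rows:
            values = row.get("after_values") or row.get("values")
            touched.add((event.schema, event.table, values["id"]))
    stream.close()
    return touched


# Hypothetical coordinates recorded by the failover tool (e.g. MMM):
changed = rows_changed_after_failover(
    {"host": "failed-master", "port": 3306,
     "user": "repl", "password": "secret"},
    log_file="mysql-bin.000042", log_pos=120)
```

Each of those rows could then be dumped to a file and re-fetched from the new master, with the slave running in IDEMPOTENT mode so that re-applying them is safe.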

In general, after listening to the great presentation on MongoDB replication by Dwight Merriman, as well as having previously looked at how replication is done in Redis, I should say things can be done in a much simpler way when there is no schema and when you do not have to deal with complex features like triggers or multiple storage engines.

Comments
Mark Callaghan

When a MongoDB master goes down it can lose data from any page in the database for which changes were in progress. Many deployments won’t be able to use it after recovery without restoring it from a backup or another master. There are many reasons for it to go down, including the MongoDB software crashing, being killed (by a person, the OOM killer or a buggy script), or an unplanned hardware reboot.

Hopefully, single-server durability will fix this. The current approach only allows for quick synchronization when you don’t care about losing that data. That is an odd definition of ‘synchronization’.

I get that a MySQL master can fail before propagating some changes to slaves and that requires manual conflict resolution if the master is to be put back into service. But the following only happens when you configure it to happen:
>>>
When the old master comes back up, it contains updates that make it inconsistent with the data on the new master.
>>>

Mark Callaghan

I am at SFO waiting for a delayed flight. Thanks for keeping me active.

I have no doubt that MongoDB and MyISAM are useful. But I wonder whether enough users understand the crash behavior. And I look on with envy at the advanced replication features in MongoDB and other members of the N(ot)O(nly)SQL family.

MongoDB/MyISAM behavior on a crash is very different from innodb_flush_log_at_trx_commit=2. With that setting for InnoDB you can lose the transactions from the last second. The behavior for MongoDB/MyISAM allows you to lose pieces from any transaction ever committed (last week or last year). I think the window for MongoDB is worse because the database is accessed via mmap and there can be a crash in the middle of writing a page leading to a corrupt page. For MyISAM the crash occurs between page transfers so individual pages won’t be corrupted but the data structures (sets of pages) can be corrupted.

Asad Hasan

I need to implement this for a project. Can I please get more information on where I can begin reading about this?

Thank you.