BalaGowri's home: comparison

Over the past few years NO SQL has become very popular, specially with a capability to boast about an illustrious reference list like twitter, facebook and foursquare.

Being a MySQL user I was curious to explore into this and see what it has and what it offers. Certainly there are plenty of NO SQL implementations, first wanted to check casandra but the first few lines of its documentation sounded like a no go zone, hence decided to try Mongo DB. Going through it I realized that there were alot of similarities. So listed them here for quick reference.

Criteria	MySQL	MongoDB
Type of database	Relational	Document
Installation	On Linux you need to go through a certain amount of steps like user creation, permission setting, creating the data directory	Pretty stright forward create the the data directory at the default location /data/db. It was pretty simpler than MySQL
Starting Up	starting up using safe_mysqld (techincally it invokes mysqld)	pretty straight forward run mongod for a basic start up
Queries	Using SQL	No SQL
	Need to execute create statements ex. When you wanted to create a table with columns x, y you need to do the following. 1. Create database mine; 2. create table a(x int, name varchar(50))engine = MyISAM; 3. Then seperately do an insert insert into a values(10, "hiii");	No create statements for DBs, collections, etc Updated and created as and when needed(lazy loading), so to do the same we need to, db.mine.insert({x: 10, y: "hiii"}) similarly you have functions for update, save, remove, find
	if you want to select data; we can select * from a;	On MongoDB its, db.mine.find()
Storage Engines	A lot of Storage Engines to choose from	Upto now a single storage engine implementation
Operations/ Transactions	When you fire a query to the MySQL server the client needs to wait till the MySQL server returns the result.	In contrast Mongo DB uses 'Fire and forget'. So technically speaking client is not aware whether the query executed successfully or not. It doesn’t return OK. For system where you need safe operations immediately following the query getLastError command is issued and the exception handled. For Requirements like analytics , status message updates, pics comments this fire and forget is perfect. Anyhow for Financial type of system we need to use the work around.
	storage engine like INNODB supprts transactions and has ACID compliance	Not typically made for transactional systems
Indexing	Supports mutiple types of indexes. For optimzation EXPLAIN is used with the select statements	Very similar to MySQL indexing. For optimzation similar to MySQL this has the EXPLAIN statement and another statement called as hint Main difference is that you can define the order of indexing for composite indexes. Which gives more control to the query definition Eg. db.mine.ensureIndex({"x" : 1}) 1 indicates the direction
Geo Spatial Indexing	MyISAM storage engine supports geo spatial indexing.	Is created by passing "2d" instead of passing 1 or -1 to the ensureIndex function. Pretty handy stuff with straight forward support for querying nerarest locations, find entries within a shape and find distances. Compound geo spatial indexing also supported, this will be handy for finding nearest ATM etc. I think this is a must check for people developing location based systems
Back Up	MySQL back up is more complex, specific to the storage engines. Snapshot based back ups on LVM2 volumes is similar to the fsync command in mongo	supports fsync command which is a warm back up with point of time support. Supports mongodump - which does a hot back up but without point of time support
Replication
Mode of Replication	asynchorouns	asynchronous
	Replication is based on Binary log - which has all the writes	Replication is based on OpLog, similar to the binary log but has idompotent operations. Which basically means if the same statement is executed mutiple times it will not make the date inconsistant . Anyhow the statements shoule be executed in order
	set up is straight forward after having a master and slave in place	quiet similar - at start up itself the server can be specified as slave in the start up line, ./mongod --dbpath ~/data/slave --port 10001 --slave --source localhost:10000
	supports complex replication set ups, like two way replication, circular replication, etc	does not support replication from the slave
Replica Set	MySQL cluster supports Replica sets- anyhow MySQL cluster replica sets has synchrous replication	Similarly for HA support MondoDB supports replica sets - which is a master slave cluster with automatic failover. Similar to MySQL cluster in the replica sets. There are arbiter nodes which decide on the primary and secondary node election in case of failures
Sharding	MySQL doesn’t have inbuilt sharding support. Sharding is usually handled at application or ORM level for MySQL	sharding support is given - mongos is a process that will interface multiple mongods which will have the data distributed. Very good feature for scalability

For those who want to try out mongodb check this book, http://oreilly.com/catalog/0636920001096

Next up I want to try Couch DB, which is another No SQL database and implemented on Erlang!

BalaGowri's home

Sunday, November 14, 2010

MongoDB for a MySQL user

Search This Blog

sayanthan's tweets

sayanthan's tweets

Followers

Blog Archive

Known Blogs