Being a MySQL user I was curious to explore into this and see what it has and what it offers. Certainly there are plenty of NO SQL implementations, first wanted to check casandra but the first few lines of its documentation sounded like a no go zone, hence decided to try Mongo DB. Going through it I realized that there were alot of similarities. So listed them here for quick reference.
Criteria | MySQL | MongoDB |
Type of database | Relational | Document |
Installation | On Linux you need to go through a certain amount of steps like user creation, permission setting, creating the data directory | Pretty stright forward create the the data directory at the default location /data/db. It was pretty simpler than MySQL |
Starting Up | starting up using safe_mysqld (techincally it invokes mysqld) | pretty straight forward run mongod for a basic start up |
Queries | Using SQL | No SQL |
Need to execute create statements ex. When you wanted to create a table with columns x, y you need to do the following. 1. Create database mine; 2. create table a(x int, name varchar(50))engine = MyISAM; 3. Then seperately do an insert insert into a values(10, "hiii"); | No create statements for DBs, collections, etc Updated and created as and when needed(lazy loading), so to do the same we need to, db.mine.insert({x: 10, y: "hiii"}) similarly you have functions for update, save, remove, find | |
if you want to select data; we can select * from a; | On MongoDB its, db.mine.find() | |
Storage Engines | A lot of Storage Engines to choose from | Upto now a single storage engine implementation |
Operations/ Transactions | When you fire a query to the MySQL server the client needs to wait till the MySQL server returns the result. | In contrast Mongo DB uses 'Fire and forget'. So technically speaking client is not aware whether the query executed successfully or not. It doesn’t return OK. For system where you need safe operations immediately following the query getLastError command is issued and the exception handled. For Requirements like analytics , status message updates, pics comments this fire and forget is perfect. Anyhow for Financial type of system we need to use the work around. |
storage engine like INNODB supprts transactions and has ACID compliance | Not typically made for transactional systems | |
Indexing | Supports mutiple types of indexes. For optimzation EXPLAIN is used with the select statements | Very similar to MySQL indexing. For optimzation similar to MySQL this has the EXPLAIN statement and another statement called as hint Main difference is that you can define the order of indexing for composite indexes. Which gives more control to the query definition Eg. db.mine.ensureIndex({"x" : 1}) 1 indicates the direction |
Geo Spatial Indexing | MyISAM storage engine supports geo spatial indexing. | Is created by passing "2d" instead of passing 1 or -1 to the ensureIndex function. Pretty handy stuff with straight forward support for querying nerarest locations, find entries within a shape and find distances. Compound geo spatial indexing also supported, this will be handy for finding nearest ATM etc. I think this is a must check for people developing location based systems |
Back Up | MySQL back up is more complex, specific to the storage engines. Snapshot based back ups on LVM2 volumes is similar to the fsync command in mongo | supports fsync command which is a warm back up with point of time support. Supports mongodump - which does a hot back up but without point of time support |
Replication | ||
Mode of Replication | asynchorouns | asynchronous |
Replication is based on Binary log - which has all the writes | Replication is based on OpLog, similar to the binary log but has idompotent operations. Which basically means if the same statement is executed mutiple times it will not make the date inconsistant . Anyhow the statements shoule be executed in order | |
set up is straight forward after having a master and slave in place | quiet similar - at start up itself the server can be specified as slave in the start up line, ./mongod --dbpath ~/data/slave --port 10001 --slave --source localhost:10000 | |
supports complex replication set ups, like two way replication, circular replication, etc | does not support replication from the slave | |
Replica Set | MySQL cluster supports Replica sets- anyhow MySQL cluster replica sets has synchrous replication | Similarly for HA support MondoDB supports replica sets - which is a master slave cluster with automatic failover. Similar to MySQL cluster in the replica sets. There are arbiter nodes which decide on the primary and secondary node election in case of failures |
Sharding | MySQL doesn’t have inbuilt sharding support. Sharding is usually handled at application or ORM level for MySQL | sharding support is given - mongos is a process that will interface multiple mongods which will have the data distributed. Very good feature for scalability |
For those who want to try out mongodb check this book, http://oreilly.com/catalog/0636920001096
Next up I want to try Couch DB, which is another No SQL database and implemented on Erlang!