The name NoSQL is somewhat misleading: NoSQL is not anti-SQL, and it is not meant to replace SQL databases completely. The real meaning is "non-relational" or "not only SQL", and both readings indicate that NoSQL is simply a different approach for different problems. For some tasks an RDBMS just won't work well, but in other areas SQL is a good solution and will remain one for a long time.

The definition of a NoSQL database

NoSQL is a generic name for a spectrum of databases that don't follow the relational model.

NoSQL databases can be divided into different genres, such as key value, document oriented, graph, and column oriented. But most NoSQL databases share some basic characteristics.

No joins: Most NoSQL databases don't use joins to query data. Data is organized as large sets, usually aggregated and denormalized. A query may look up a key through a hash function, or be distributed to many nodes and run in parallel.
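As a rough sketch of this idea (not how any particular product works), the snippet below stores a denormalized order as one value and uses a hash of the key to pick the node that holds it. The order record, NODE_COUNT, node_for, put and get are all made up for illustration.

```python
import hashlib

# Hypothetical denormalized "order" aggregate: the customer and the line items
# are embedded, so reading an order is one key lookup and needs no join.
order = {
    "order_id": "o-1001",
    "customer": {"name": "Alice", "city": "Berlin"},
    "items": [
        {"sku": "book-42", "qty": 1, "price": 12.5},
        {"sku": "pen-7", "qty": 3, "price": 1.2},
    ],
}

NODE_COUNT = 4  # assumed cluster size, for illustration only
nodes = [dict() for _ in range(NODE_COUNT)]  # each dict stands in for one node


def node_for(key: str) -> int:
    """Map a key to a node index with a stable hash of the key."""
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % NODE_COUNT


def put(key, value):
    """Store the value on the node chosen by the key's hash."""
    nodes[node_for(key)][key] = value


def get(key):
    """Fetch the value from the node chosen by the key's hash."""
    return nodes[node_for(key)].get(key)


put(order["order_id"], order)

# Everything needed to total the order comes back from a single lookup.
total = sum(item["qty"] * item["price"] for item in get("o-1001")["items"])
```

The same data in a normalized relational design would typically live in three tables (orders, customers, order items) and be reassembled with joins at query time.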

Schema free: NoSQL databases are good at storing and processing unstructured or semi-structured data. Even for structured data, NoSQL databases aim to handle fast change. Most of them are designed to be schema free, so changing the data structure costs almost nothing.
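A small illustration of what schema free means in practice, using plain Python dictionaries as stand-ins for stored records. The users collection and its field names are invented for this example.

```python
# Hypothetical collection: every record carries only the fields it needs.
users = [
    {"id": 1, "name": "Alice"},
    {"id": 2, "name": "Bob", "twitter": "@bob"},
    {"id": 3, "name": "Carol", "tags": ["admin", "ops"]},
]

# Adding a new field to one record does not touch the others
# and needs no migration step.
users[0]["signup_source"] = "newsletter"

# Readers have to tolerate missing fields, for example with a default.
handles = [user.get("twitter", "(none)") for user in users]
with_twitter = [user["name"] for user in users if "twitter" in user]
```

The price of this flexibility is that the application code, not the database, is responsible for dealing with records of different shapes.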

Scalability: NoSQL databases are driven by big data and were pioneered by companies like Google and Amazon, both of which need to deal with huge amounts of data every day.

Where NoSQL comes from

To talk about where NoSQL comes from, we should first look at the limits of relational databases.

In the relational model, data is normalized into tables. A table consists of rows, a row has fields, and each field has a type. A table can relate to other tables through foreign keys.
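To make the model concrete, here is a minimal sketch using the sqlite3 module from the Python standard library. The authors and posts tables and their contents are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked

# Two normalized tables; every field has a declared type.
conn.execute("CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE posts ("
    " id INTEGER PRIMARY KEY,"
    " title TEXT NOT NULL,"
    " author_id INTEGER NOT NULL REFERENCES authors(id))"  # foreign key
)

conn.execute("INSERT INTO authors (id, name) VALUES (1, 'Alice')")
conn.executemany(
    "INSERT INTO posts (id, title, author_id) VALUES (?, ?, ?)",
    [(1, "Hello NoSQL", 1), (2, "Why joins cost", 1)],
)

# Putting a post back together with its author requires a join.
rows = conn.execute(
    "SELECT posts.title, authors.name FROM posts"
    " JOIN authors ON authors.id = posts.author_id"
    " ORDER BY posts.id"
).fetchall()
```

Here rows holds one (title, author name) tuple per post, reassembled from the two tables at query time.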

For a small amount of data this model works well, but as the data grows, some problems show up.

The first one is performance: join operations are time consuming, and write operations involve locks, which are another performance bottleneck.

The second one is scalability. When performance goes down, the solution is to scale up or scale out. Scaling up will hit a performance limit again as the business grows, so eventually scaling out is the only solution. If you have ever worked on scaling a serious relational database, you know it is quite a challenge, if not impossible.

The third problem is the fixed schema: if you want to add some extra fields to some rows, you need to add those columns to every row. In addition, applications are usually designed in an object-oriented way, so interacting with the relational model involves some kind of object-relational mapping (ORM).
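The sketch below shows both points with the standard sqlite3 module: a column added for one row shows up (as NULL) on every row, and the application still needs a mapping step to turn rows into objects. The users table and the User class are hypothetical, and the hand-written mapping only stands in for what a real ORM library automates.

```python
import sqlite3
from dataclasses import dataclass
from typing import Optional

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO users VALUES (?, ?)", [(1, "Alice"), (2, "Bob")])

# Only Bob has a Twitter handle, but the column is added to the whole table.
conn.execute("ALTER TABLE users ADD COLUMN twitter TEXT")
conn.execute("UPDATE users SET twitter = '@bob' WHERE id = 2")

rows = conn.execute("SELECT id, name, twitter FROM users ORDER BY id").fetchall()


@dataclass
class User:
    """Application-side object; the field order matches the SELECT above."""
    id: int
    name: str
    twitter: Optional[str]


# Hand-written stand-in for an ORM: one object per row.
users = [User(*row) for row in rows]
```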

NoSQL is a response to those problems with relational databases.

Google played a very important role in the NoSQL movement by publishing papers about the design of its infrastructure.

  • The Google File System
  • MapReduce: Simplified Data Processing on Large Clusters
  • Bigtable: A Distributed Storage System for Structured Data

Now we will look at some NoSQL genres and related projects.

Key value database: Memcached, Redis

A key value database stores data as key value pairs, and data is queried by its key. By using some kind of hash function to locate the key, key value databases provide good performance.

Key value databases are usually used as cache services, ranging from a basic key value cache to stores that support advanced data structures.
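A toy in-memory version of this idea can be sketched as follows. TinyKV and its methods are invented for this article and are not the API of Memcached or Redis; the method names only loosely echo the kind of commands such stores offer (set with an expiry, get, push onto a list).

```python
import time


class TinyKV:
    """Toy key value store with optional expiry and a list value type."""

    def __init__(self, clock=time.monotonic):
        self._data = {}      # key -> value; a Python dict is a hash table
        self._expires = {}   # key -> absolute expiry time
        self._clock = clock  # injectable so expiry can be tested without sleeping

    def _purge_if_expired(self, key):
        deadline = self._expires.get(key)
        if deadline is not None and self._clock() >= deadline:
            self._data.pop(key, None)
            self._expires.pop(key, None)

    def set(self, key, value, ttl=None):
        """Store a value; with ttl (seconds) it behaves like a cache entry."""
        self._data[key] = value
        if ttl is None:
            self._expires.pop(key, None)
        else:
            self._expires[key] = self._clock() + ttl

    def get(self, key, default=None):
        self._purge_if_expired(key)
        return self._data.get(key, default)

    def rpush(self, key, *values):
        """Append to the list stored at key, creating it if needed."""
        self._purge_if_expired(key)
        items = self._data.setdefault(key, [])
        if not isinstance(items, list):
            raise TypeError(f"value at {key!r} is not a list")
        items.extend(values)
        return len(items)


kv = TinyKV()
kv.set("session:42", "alice", ttl=30)   # basic cache entry that expires
kv.rpush("jobs", "resize", "email")     # richer data structure under one key
```

Every operation goes through the key, which is why lookups stay fast no matter how much data the store holds.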

Document database: MongoDB

Data is stored as documents. A document has a JSON-like format, which is easy for application code to manipulate.

Document databases support SQL-like queries and are ready to scale out. One of the most popular document databases right now is MongoDB.
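To give a feel for querying documents by their fields, here is a small sketch in plain Python. The articles collection, lookup and find are hypothetical helpers written for this article, not MongoDB's API; the dotted path for nested fields is just a convention assumed here.

```python
# Hypothetical collection of JSON-like documents; shapes may differ.
articles = [
    {"_id": 1, "title": "Intro to NoSQL", "author": {"name": "Alice"}, "tags": ["nosql"]},
    {"_id": 2, "title": "Scaling out", "author": {"name": "Bob"}, "tags": ["scaling"]},
    {"_id": 3, "title": "Graph basics", "author": {"name": "Alice"}},
]

_MISSING = object()  # sentinel for "this document has no such field"


def lookup(doc, path):
    """Follow a dotted path such as 'author.name' into nested dicts."""
    current = doc
    for part in path.split("."):
        if not isinstance(current, dict) or part not in current:
            return _MISSING
        current = current[part]
    return current


def find(collection, criteria):
    """Return the documents whose fields equal every value in criteria."""
    return [
        doc
        for doc in collection
        if all(lookup(doc, path) == expected for path, expected in criteria.items())
    ]


by_alice = find(articles, {"author.name": "Alice"})
titles = [doc["title"] for doc in by_alice]
```

The query is itself a small document, and it reaches into nested fields without any join.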

Graph database: Neo4j

A graph database saves data as nodes and the relationships between those nodes. This fits some kinds of data very well, such as social graphs or the hyperlink relationships between web pages.
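The following sketch models a tiny social graph as nodes and typed relationships, then answers a typical graph question: who can be reached within two hops? The edge data and the helper functions are invented for illustration and say nothing about how Neo4j stores or queries data.

```python
from collections import deque

# Hypothetical social graph: (source node, relationship type, target node).
edges = [
    ("alice", "FOLLOWS", "bob"),
    ("bob", "FOLLOWS", "carol"),
    ("carol", "FOLLOWS", "dave"),
    ("alice", "FOLLOWS", "erin"),
]


def build_adjacency(edge_list, rel_type):
    """Index outgoing relationships of one type by their source node."""
    adjacency = {}
    for source, rel, target in edge_list:
        if rel == rel_type:
            adjacency.setdefault(source, []).append(target)
    return adjacency


def within_hops(adjacency, start, max_hops):
    """Breadth-first search: nodes reachable from start in at most max_hops."""
    distances = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if distances[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbor in adjacency.get(node, []):
            if neighbor not in distances:
                distances[neighbor] = distances[node] + 1
                queue.append(neighbor)
    distances.pop(start)
    return distances


follows = build_adjacency(edges, "FOLLOWS")
reachable = within_hops(follows, "alice", 2)
```

In a relational database each extra hop would usually mean another self-join on the relationship table, which is why this kind of traversal is a natural fit for a graph model.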

Column oriented database: Bigtable

A column oriented database saves data by column: data within the same column is stored together, just as a row oriented database stores the data of the same row together.

A column oriented database also has keys and values: the key consists of the row id and the column id, and the value is the cell in the table.
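The difference between the two layouts, and the (row id, column id) key, can be sketched with ordinary Python structures. The sample table is made up, and real column stores add much more (compression, column families, versioning) that is left out here.

```python
# Row oriented layout: the values of one row are kept together.
rows = [
    {"id": "r1", "name": "Alice", "age": 30},
    {"id": "r2", "name": "Bob", "age": 25},
    {"id": "r3", "name": "Carol", "age": 35},
]

# Column oriented layout: the values of one column are kept together.
columns = {}
for row in rows:
    for column_id, value in row.items():
        if column_id != "id":
            columns.setdefault(column_id, []).append(value)

# An aggregate over one column only has to read that column.
average_age = sum(columns["age"]) / len(columns["age"])

# Key value view of the same table: key = (row id, column id), value = cell.
cells = {
    (row["id"], column_id): value
    for row in rows
    for column_id, value in row.items()
    if column_id != "id"
}
bob_name = cells[("r2", "name")]
```

Reading a whole row means touching every column, while scanning one column across many rows is cheap, so this layout favors analytical queries over large data sets.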