Types of NoSQL databases and extensive details

Please read the first article as background for this one: Why there is a need for NoSql Database?

What about NoSQL:

  • NoSQL is not a completely “No Schema” DB
  • There are mainly 3 types of NoSQL DB
    • Document DB
    • Data Structure Oriented DB
    • Column Oriented DB

What is a Document DB?

  • Documents are sets of key-value pairs
  • Documents can also be stored in JSON format
  • Because of JSON, a document is treated as an object
  • JSON documents are used as key-value pairs
  • A document can have any set of keys
  • Any key can be associated with an arbitrarily complex value, which can itself be a JSON document
  • Documents can be added with different sets of keys
    • Missing keys
    • Extra keys
    • Keys can be added in the future when needed
    • The application must know whether a certain key is present
    • Queries are made on keys
    • Indexes are set on keys to make searches efficient
  • Examples: CouchDB, MongoDB, Riak (Redis is covered below as a Data Structure Oriented DB)
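
As a minimal sketch of the points above, the following Python snippet models a document collection as plain dictionaries. The names `documents`, `build_index` and `find_by_key` are hypothetical and invented for illustration; they are not the API of any real document database.

```python
from collections import defaultdict

# Three documents in one collection; each has a different set of keys.
documents = {
    "u1": {"name": "Asha", "city": "Pune"},
    "u2": {"name": "Ben"},  # missing key: city
    "u3": {
        "name": "Chen",
        "city": "Pune",
        # extra key whose value is itself a nested document
        "address": {"street": "MG Road", "zip": "411001"},
    },
}

def build_index(docs, key):
    """Map each value of `key` to the ids of the documents that contain it."""
    index = defaultdict(list)
    for doc_id, doc in docs.items():
        if key in doc:  # documents without the key are simply skipped
            index[doc[key]].append(doc_id)
    return index

def find_by_key(index, value):
    """Look up document ids through the index instead of scanning every document."""
    return index.get(value, [])

city_index = build_index(documents, "city")
pune_ids = find_by_key(city_index, "Pune")
print(pune_ids)  # ['u1', 'u3']
```

Note that the application code has to check whether a key is present (`if key in doc`), because nothing in the store guarantees it.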

Example of Document DB – CouchDB

  • The value is a plain string in JSON format
  • Queries are views
  • Views are documents in the DB that specify searches
  • Views can be complex
  • Views can use map/reduce to process and summarize results
  • Writes data to an append-only file, which is extremely efficient and makes writes significantly faster
  • Single headed database
  • Can run in cluster environment (not available in core)
  • From CAP Theorem –
    • Partition Tolerance
    • Availability
    • In a non-clustered environment, availability is the main focus
    • In a clustered environment, consistency is the main focus
  • BigCouch
    • Integrates clustering with CouchDB
    • Cloudant is merging CouchDB and BigCouch
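
To make the map/reduce idea behind views more concrete, here is a rough Python sketch. It is not CouchDB code (CouchDB views are normally written as JavaScript functions stored in design documents); the documents and the helper names `map_order_totals`, `reduce_sum` and `run_view` are assumptions made up for this example.

```python
from collections import defaultdict

docs = [
    {"_id": "o1", "type": "order", "customer": "asha", "total": 40},
    {"_id": "o2", "type": "order", "customer": "ben", "total": 15},
    {"_id": "o3", "type": "order", "customer": "asha", "total": 25},
    {"_id": "c1", "type": "customer", "name": "asha"},
]

def map_order_totals(doc):
    """Map step: emit (key, value) pairs only for documents the view cares about."""
    if doc.get("type") == "order":
        yield doc["customer"], doc["total"]

def reduce_sum(values):
    """Reduce step: summarize all values emitted under one key."""
    return sum(values)

def run_view(documents, map_fn, reduce_fn):
    """Apply the map step to every document, group by key, then reduce each group."""
    grouped = defaultdict(list)
    for doc in documents:
        for key, value in map_fn(doc):
            grouped[key].append(value)
    return {key: reduce_fn(values) for key, values in grouped.items()}

totals = run_view(docs, map_order_totals, reduce_sum)
print(totals)  # {'asha': 65, 'ben': 15}
```

The map step decides which documents take part in the view, and the reduce step produces the summary per key.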

Example of Document DB – MongoDB

  • The value is a plain string in JSON format
  • Queries are JSON documents specifying fields and values to match
  • Query results can be processed by built-in map/reduce
  • Single headed database
  • Can run in cluster environment (not available in core)
  • From CAP Theorem –
    • Partition Tolerance
    • Availability
    • In a non-clustered environment, availability is the main focus
    • In a clustered environment, consistency is the main focus
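
The idea of a query being a JSON document of fields and values to match can be sketched in plain Python as follows. This is only an in-memory illustration covering exact equality; `matches` and `find` are hypothetical helpers written for this example, not MongoDB's implementation, and real query documents support far more than equality.

```python
def matches(document, query):
    """Return True if every field in `query` is present in `document` with an equal value."""
    return all(
        key in document and document[key] == value
        for key, value in query.items()
    )

def find(collection, query):
    """Return all documents in the collection that match the query document."""
    return [doc for doc in collection if matches(doc, query)]

people = [
    {"name": "Asha", "city": "Pune", "active": True},
    {"name": "Ben", "city": "Delhi", "active": True},
    {"name": "Chen", "city": "Pune"},  # no "active" key, so it cannot match on it
]

result = find(people, {"city": "Pune", "active": True})
print([doc["name"] for doc in result])  # ['Asha']
```

An empty query document matches everything, since there are no fields left to check.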

Example of Document DB – Riak

  • A document database with more flexible document types
  • Supports JSON, XML, plain text
  • A plugin architecture supports adding other document types
  • Queries must know the structure of the JSON or XML for proper results
  • Query results can be processed by built-in map/reduce
  • Built-in control over replication and distribution
  • Core is designed to run in a cluster environment
  • From CAP Theorem –
    • Partition Tolerance
    • Availability
    • Note: Tradeoff between availability and consistency is tunable
  • Writes data to an append-only file, which is extremely efficient and makes writes significantly faster
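
One common way to reason about a tunable tradeoff between availability and consistency is quorum arithmetic: with `n` replicas, a read that consults `r` replicas is guaranteed to overlap a write acknowledged by `w` replicas when `r + w > n`. The sketch below only illustrates that arithmetic; the function name is hypothetical, and how Riak actually exposes these settings is not covered in this article.

```python
def read_sees_latest_write(n, r, w):
    """Return True if any r-replica read must overlap any w-replica write among n replicas."""
    if not (1 <= r <= n and 1 <= w <= n):
        raise ValueError("r and w must be between 1 and n")
    return r + w > n

# Quorum reads and writes overlap, leaning toward consistency.
print(read_sees_latest_write(3, 2, 2))  # True

# A single replica is enough for each operation, leaning toward availability.
print(read_sees_latest_write(3, 1, 1))  # False
```

Lower values of `r` and `w` let requests succeed when fewer replicas are reachable, at the price of possibly reading stale data.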

Data Structure Oriented DB – Redis:

  • In-memory DB for the fastest read and write speed
  • Top choice if the dataset can fit in memory
  • Great for raw speed
  • Data isn’t saved on disk and is lost in case of a crash
  • Can be configured to save to disk, but at a cost in performance
  • Limited scalability with some replication
  • Cluster replication support is coming
  • Redis differs from a plain key-value store
    • The value can be a data structure (lists or sets)
    • You can do union and intersection on sets
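
The set operations mentioned above can be sketched with ordinary Python sets, so no Redis server is needed to follow along. The helpers `sunion` and `sinter` are named after the Redis commands SUNION and SINTER, but they are a plain in-memory imitation written for this example, and the key names are made up.

```python
# Each key maps to a set value, the way a Redis key can hold a set.
store = {
    "tags:post1": {"nosql", "redis", "cache"},
    "tags:post2": {"nosql", "mongodb"},
}

def sunion(db, *keys):
    """Return the union of the sets stored at the given keys; missing keys count as empty."""
    result = set()
    for key in keys:
        result |= db.get(key, set())
    return result

def sinter(db, *keys):
    """Return the intersection of the sets stored at the given keys."""
    sets = [db.get(key, set()) for key in keys]
    return set.intersection(*sets) if sets else set()

all_tags = sunion(store, "tags:post1", "tags:post2")
common_tags = sinter(store, "tags:post1", "tags:post2")
print(sorted(all_tags))     # ['cache', 'mongodb', 'nosql', 'redis']
print(sorted(common_tags))  # ['nosql']
```

Because the value is a real data structure, the combination happens inside the store instead of in application code.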

Column Oriented DB

  • Also known as a “sparse row store”
  • Equivalent to a “relational table”: a “set of rows” identified by key
  • Concept starts with columns
  • Data is organized in the columns
  • Columns are stored contiguously
  • Columns tend to have similar data
  • A row can have as many columns as needed
  • Columns are essentially keys that let you look up values in the rows
  • Columns can be added at any time
  • Unused columns in a row do not occupy storage
  • NULLs don’t exist
  • Writes data to an append-only file, which is extremely efficient and makes writes significantly faster
  • Built-in control over replication and distribution
  • Examples: HBase and Cassandra
  • HBase
    • From CAP Theorem
      • Partition Tolerance
      • Consistency
  • Cassandra
    • From CAP Theorem
      • Partition Tolerance
      • Availability
      • Note: Tradeoff between availability and consistency is tunable
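
The “sparse row store” idea can be illustrated with a dictionary of dictionaries: each row key maps to only the columns that row actually uses. This is a simplified model under that assumption, not how HBase or Cassandra lay data out on disk; `rows`, `put` and `get` are hypothetical names for this sketch.

```python
# Each row stores only the columns it actually has.
rows = {
    "user:1": {"name": "Asha", "email": "asha@example.com"},
    "user:2": {"name": "Ben"},
}

def put(table, row_key, column, value):
    """Set one column in one row, creating the row if it does not exist yet."""
    table.setdefault(row_key, {})[column] = value

def get(table, row_key, column, default=None):
    """Read one column from one row; an absent column is simply not there."""
    return table.get(row_key, {}).get(column, default)

# A new column can be added at any time, and only for the rows that need it.
put(rows, "user:2", "phone", "555-0100")

print(get(rows, "user:2", "phone"))  # 555-0100
print("phone" in rows["user:1"])     # False
```

The row `user:1` holds no placeholder for `phone`, which is the sense in which unused columns take no storage and NULLs don't exist.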

Additional functionality supported by NoSQL DBs:

  • Scripting Language Support
    • JavaScript
      • CouchDB, MongoDB
    • Pig
      • Hadoop
    • Hive
      • Hadoop
    • Lua
      • Redis
  • RESTful Interface:
    • CouchDB and Riak
    • CouchDB can be considered the best fit for web application frameworks
    • Riak also provides a traditional protocol buffer interface