Big Data: August 2016

Introduction

NoSQL stands for "not only SQL". It is a non-relational database technology, although some NoSQL databases support SQL-like query languages as well. NoSQL databases are widely used in real-time web applications such as those run by Google, Amazon, and Facebook.

Why use NoSQL?

We use NoSQL to gain advantages such as:
Simpler design
Easier horizontal scaling across machines
Better control over availability
Lower cost, since commodity hardware is used
Better performance than relational database management systems for some workloads

To gain these advantages, you generally have to compromise on consistency.

Understanding CAP Theorem

The CAP theorem, also known as Brewer's theorem, states that it is impossible for a distributed computer system to simultaneously provide all three of the following guarantees:

Consistency (all nodes see the same data at the same time)
Availability (every request receives a response about whether it succeeded or failed)
Partition tolerance (the system continues to operate despite arbitrary partitioning due to network failures)

NoSQL databases can run on a single server or across multiple commodity servers. They employ a distributed architecture with salient features such as:

Many NoSQL databases run on commodity servers
Commodity servers are combined to run as a single system
Redundant storage
Geographic distribution
No single point of failure, i.e. an outage on a single machine will not bring the whole system down

Categories of NoSQL

Key-Value Store
Column-Oriented Database
Document Store
Graph Database

Relational Database:- For comparison, this is a database model where data is organised in rows and columns, with a unique key identifying each row (tuple). Popular relational databases include Oracle and Microsoft SQL Server.

Key-Value Store:- The fundamental data model is the associative array (map or dictionary), where data is represented as a collection of key-value pairs. This model can be extended to an ordered model that keeps keys in lexicographic order. The extension is computationally powerful because it can efficiently retrieve selected key ranges. Popular databases in this category include Memcached and Redis.
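As a rough illustration of the ordered key-value model described above, here is a minimal in-memory sketch in Python. The class name OrderedKVStore and its methods are hypothetical and not taken from any real database; the sketch only shows how keeping keys sorted makes key-range retrieval cheap.

```python
import bisect


class OrderedKVStore:
    """Toy key-value store (hypothetical) that keeps keys in lexicographic order."""

    def __init__(self):
        self._keys = []   # keys, always kept sorted
        self._data = {}   # key -> value

    def put(self, key, value):
        # Insert new keys at their sorted position; existing keys are overwritten.
        if key not in self._data:
            bisect.insort(self._keys, key)
        self._data[key] = value

    def get(self, key, default=None):
        return self._data.get(key, default)

    def range(self, start, end):
        """Return (key, value) pairs with start <= key < end, in key order."""
        lo = bisect.bisect_left(self._keys, start)
        hi = bisect.bisect_left(self._keys, end)
        return [(k, self._data[k]) for k in self._keys[lo:hi]]


store = OrderedKVStore()
store.put("user:003", "carol")
store.put("user:001", "alice")
store.put("order:17", "book")
store.put("user:002", "bob")

# ";" is the character right after ":" in ASCII, so this range
# covers every key that starts with "user:".
user_entries = store.range("user:", "user;")
print(user_entries)
```

The range lookup uses binary search over the sorted key list, so it does not need to scan keys outside the requested range.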

Column-Oriented Database:- These databases store each record as a collection of one or more key/value pairs (columns). No predefined table structure is needed to work with the data. A record can hold a single column or many columns, and the set of columns can differ from one record to the next.
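The idea that each record carries its own set of columns can be sketched with nested dictionaries. This is only an illustration in Python; the names rows, put_column, and columns_of are made up for the example and do not correspond to the API of any particular column-oriented database.

```python
# row key -> {column name -> value}; no table schema is declared up front
rows = {}


def put_column(row_key, column, value):
    """Store one column value for a record, creating the record if needed."""
    rows.setdefault(row_key, {})[column] = value


def columns_of(row_key):
    """Return the column names present for a record, sorted for display."""
    return sorted(rows.get(row_key, {}))


# Two records with different columns
put_column("user1", "name", "Alice")
put_column("user1", "email", "alice@example.com")
put_column("user2", "name", "Bob")
put_column("user2", "phone", "555-0100")

print(columns_of("user1"))
print(columns_of("user2"))
```

Here "user1" has an email column and "user2" has a phone column, and neither record had to match a pre-structured table.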

Document Store:- These databases store data as documents, usually in a format such as JSON or BSON. Each document has a unique key that identifies it in the database. Documents can be organised in various ways, such as collections or tags.
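A small Python sketch of the document model follows: JSON documents grouped into collections, each identified by a unique key. The functions insert_document and find_by_id are hypothetical helpers written for this example, not the interface of a real document store.

```python
import json
import uuid

# collection name -> {document id -> JSON text}
collections = {}


def insert_document(collection, document):
    """Serialise the document to JSON, store it, and return its unique key."""
    doc_id = str(uuid.uuid4())
    collections.setdefault(collection, {})[doc_id] = json.dumps(document)
    return doc_id


def find_by_id(collection, doc_id):
    """Return the stored document as a dict, or None if it does not exist."""
    raw = collections.get(collection, {}).get(doc_id)
    return json.loads(raw) if raw is not None else None


post = {"title": "What is NoSQL?", "tags": ["nosql", "big data"]}
post_id = insert_document("posts", post)
print(find_by_id("posts", post_id))
```

Documents in the same collection do not need to share the same fields, which is what separates this model from rows in a relational table.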

Graph Database:- These databases are designed to store related data that can be represented as a graph. Consider a social network: person x is married to person y, person x is a cousin of person z, and person z is a friend of person x. Other examples of graph data include public transport links, road maps, and network topologies.
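The social network example can be written down as labelled edges between people. The sketch below is plain Python with made-up helper names (relate, related, neighbours); a real graph database would offer its own query language instead.

```python
# node -> list of (relationship, other node)
edges = {}


def relate(source, relationship, target):
    """Record a labelled, directed edge from source to target."""
    edges.setdefault(source, []).append((relationship, target))


def related(node, relationship):
    """Return the nodes reached from `node` via the given relationship."""
    return [target for rel, target in edges.get(node, []) if rel == relationship]


def neighbours(node):
    """Return every node directly connected from `node`, sorted."""
    return sorted({target for _, target in edges.get(node, [])})


relate("x", "married_to", "y")
relate("x", "cousin_of", "z")
relate("z", "friend_of", "x")

print(related("x", "married_to"))
print(neighbours("x"))
```

Storing relationships as first-class edges means a question like "who is x connected to?" is a direct lookup, with no join across tables.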

Monday, 29 August 2016

What is NoSQL? Categories of NoSQL Databases
