Wednesday 8 February 2017

Big Data


What is Big Data?
Big data is exactly what the name suggests: a collection of data so large that it cannot be processed using conventional computing techniques. A number of tools and frameworks have been built to handle it.

 
Big data covers large amounts of data, which may be:

> Structured (relational data sets)
> Semi-structured (XML data)
> Unstructured (doc, text, PDF, etc.)
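
To make the three categories concrete, here is a small Python sketch. The sample records (names, cities) are made up purely for illustration; the point is only how much structure the program can rely on in each case.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Structured: fixed columns, like a relational table (sample data is made up).
structured = "id,name,city\n1,Asha,Pune\n2,Ravi,Delhi\n"
rows = list(csv.DictReader(io.StringIO(structured)))

# Semi-structured: tags describe the data, but fields may differ per record.
semi_structured = (
    "<users>"
    "<user id='1'><name>Asha</name></user>"
    "<user id='2'><name>Ravi</name><city>Delhi</city></user>"
    "</users>"
)
root = ET.fromstring(semi_structured)
names = [user.findtext("name") for user in root.findall("user")]

# Unstructured: free text with no schema; we can only scan the raw words.
unstructured = "Asha moved from Pune to Delhi last year."
word_total = len(unstructured.split())

print(rows[0]["city"], names, word_total)
```

With structured data every row has the same columns, with semi-structured data the tags tell you what is present, and with unstructured data the program has to work out the meaning on its own.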


Tools or Frameworks used:

Hadoop: An Apache open-source framework, written in Java, that stores and processes data across clusters of computers using simple programming models.
Hadoop MapReduce is the part of the framework used to write applications that process huge amounts of data in parallel.
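
The idea behind MapReduce can be sketched in a few lines of plain Python. This is not the Hadoop API and it runs on a single machine; the function names (map_phase, shuffle, reduce_phase, word_count) are made up for this illustration. It only shows the three steps a word-count job goes through: map each line to (word, 1) pairs, group the pairs by word, then reduce each group to a total.

```python
from collections import defaultdict


def map_phase(line):
    # Map step: emit a (word, 1) pair for every word in the line.
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    # Shuffle step: group all values that share the same key.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    # Reduce step: combine the grouped values into one result per key.
    return key, sum(values)


def word_count(lines):
    pairs = [pair for line in lines for pair in map_phase(line)]
    groups = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in groups.items())


counts = word_count(["big data is big", "data is everywhere"])
print(counts)
```

In a real Hadoop cluster the map and reduce steps run on many machines at once, each working on its own slice of the data, which is where the parallelism comes from.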

Two popular tools built on top of Hadoop:

> Hive and
> Pig

Pig and Hive are two tools that ease the complexity of writing programs in Java. Pig lets you describe operations on the data, such as filter and join, in its own scripting language called Pig Latin. Hive is similar but offers an SQL-like query language, Hive Query Language (HQL), which comes with some restrictions compared to standard SQL.
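
To show what "filter" and "join" mean here, the sketch below does both in plain Python on two tiny made-up data sets (users and orders are hypothetical). It is not Pig Latin or HQL; it is roughly the logic that a Pig FILTER followed by a JOIN, or an HQL SELECT with JOIN and WHERE, would express in a line or two.

```python
# Hypothetical sample data, only for illustration.
users = [
    {"id": 1, "name": "Asha", "age": 34},
    {"id": 2, "name": "Ravi", "age": 16},
]
orders = [
    {"user_id": 1, "item": "book"},
    {"user_id": 1, "item": "pen"},
    {"user_id": 2, "item": "bag"},
]

# Filter: keep only the users who are 18 or older.
adults = [user for user in users if user["age"] >= 18]

# Join: index the orders by user id, then match them to the filtered users.
orders_by_user = {}
for order in orders:
    orders_by_user.setdefault(order["user_id"], []).append(order)

joined = [
    {"name": user["name"], "item": order["item"]}
    for user in adults
    for order in orders_by_user.get(user["id"], [])
]

print(joined)
```

Writing this by hand for huge data sets spread over many machines is the hard part that Pig and Hive take care of.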
   

This is just a brief overview of what big data is, which tools it uses, and how those tools are used to work with the data. There are many sites available where one can learn big data and its tools.

