Hadoop Tutorial

In this Hadoop tutorial we will cover what Hadoop is, why to use it, the Hadoop architecture, Big Data, MapReduce, and some parts of the Hadoop ecosystem.

Applications such as Facebook, Twitter, LinkedIn, Google, and Yahoo generate huge amounts of data, and they need a framework that can handle it. These companies need to process that data in three ways: 1. data analysis, 2. proper handling of the data, and 3. converting the data into an understandable custom format.
Apache Hadoop’s MapReduce and HDFS components were originally derived from Google’s MapReduce and Google File System (GFS) papers, respectively.

In 2003-2004 Google introduced two new techniques behind its search engine: 1. a file system, GFS (Google File System), and 2. a data analysis framework called MapReduce, to make searching and analyzing data fast. Google published both as white papers.

In 2005-2006 these techniques were implemented in a single framework, named Hadoop, which Yahoo adopted. Hadoop was created by Doug Cutting and Mike Cafarella in 2005, originally to support distribution for the Nutch search engine project. Cutting, who joined Yahoo! in 2006, named it after his son’s toy elephant. Doug Cutting, chief architect of Cloudera, knows the curious story behind the name better than anyone.

When he was creating the open source software that supports the processing of large data sets, Cutting knew the project would need a good name. Fortunately, he had one up his sleeve, thanks to his son. Cutting’s son, then 2, was just beginning to talk and called his beloved stuffed yellow elephant “Hadoop” (with the stress on the first syllable). The son, who is now 12, is frustrated with this. In Cutting’s words: “He’s always saying, ‘Why don’t you say my name, and why don’t I get royalties? I deserve to be famous for this.’” 🙂

Big Data Hadoop

In this Hadoop tutorial we will cover the following modules.

Module 1-

    1. Understanding Big Data
    2. Introduction to Hadoop
    3. Hadoop Architecture

Module 2-

 

    1. Hadoop Distributed File System (HDFS)
    2. HDFS Architecture
    3. JobTracker and TaskTracker Architecture
    4. Hadoop Configuration
        1. Hadoop Environment Setup (Hadoop 1.x)
        2. Hadoop Environment Setup (Hadoop 2.x)
        3. Hadoop Installation on Ubuntu
    5. Data Loading Technique

Module 3-

    1. Introduction to MapReduce
    2. MapReduce Flow Chart with Sample Example
    3. MapReduce Programming: Hello World (WordCount) Program

Module 4-

    1. Advanced MapReduce
    2. YARN
    3. YARN Programming

Module 5-

    1. Introduction to Sqoop
    2. Programming with Sqoop

Module 6-

    1. Analytics Using Hive
    2. Understanding HiveQL

Module 7-

    1. NoSQL Databases
    2. Understanding HBase
    3. ZooKeeper

Module 8-

    1. Real-World Data Sets and Analysis
    2. Project Discussion