Hadoop Tutorial


Hi, in this Hadoop tutorial we will cover what Hadoop is, why to use it, the Hadoop architecture, Big Data, MapReduce, and some of the projects in the Hadoop ecosystem.

Nowadays, applications such as Facebook, Twitter, LinkedIn, Google, and Yahoo hold huge amounts of data and need a framework that can handle it. These companies need to process that data in three ways: 1. data analysis, 2. proper handling of the data, and 3. converting the data into an understandable custom format.

Apache Hadoop's MapReduce and HDFS components were originally derived from Google's MapReduce and Google File System (GFS) papers, respectively.

In 2003-2004 Google introduced two new techniques behind its search engine, meant to make searching and analyzing data fast: 1. a file system called GFS (Google File System) and 2. a data analysis framework called MapReduce. Google published these techniques only as white papers.

In 2005-2006 Yahoo took these techniques from Google's papers and implemented them in a single framework, which was given the name Hadoop. Hadoop was created by Doug Cutting and Mike Cafarella in 2005, and it was originally developed to support distribution for the Nutch search engine project. Cutting, who was working at Yahoo! at the time, named it after his son's toy elephant.

There is a curious story behind the name, and no one knows it better than Doug Cutting, chief architect of Cloudera. When he was creating the open source software that supports the processing of large data sets, Cutting knew the project would need a good name. Fortunately, he had one up his sleeve, thanks to his son. Cutting's son, then 2, was just beginning to talk and called his beloved stuffed yellow elephant "Hadoop" (with the stress on the first syllable). The son, who is now 12, is frustrated with this. According to Cutting, "He's always saying 'Why don't you say my name, and why don't I get royalties? I deserve to be famous for this.'"

In this Hadoop tutorial we will cover the following modules.

Module 1-
  1. Understanding Big Data
  2. Introduction to Hadoop
  3. Hadoop Architecture
Module 2-
  1. Hadoop Distributed File System (HDFS)
  2. HDFS Architecture
  3. JobTracker and TaskTracker Architecture
  4. Hadoop Configuration
  5. Hadoop Environment Setup (Hadoop 1.x)
  6. Hadoop Environment Setup (Hadoop 2.x)
  7. Hadoop Installation on Ubuntu
  8. Data Loading Techniques
Module 3-
  1. Introduction to MapReduce
  2. MapReduce Flow Chart with a Sample Example
  3. MapReduce Programming: Hello World (WordCount) Program
Module 4-
  1. Advanced MapReduce
  2. YARN
  3. YARN Programming
Module 5-
  1. Introduction to Sqoop
  2. Programming with Sqoop
Module 6-
  1. Analytics Using Hive
  2. Understanding HiveQL
Module 7-
  1. NoSQL Databases
  2. Understanding HBase
  3. ZooKeeper
Module 8-
  1. Real-World Data Sets and Analysis
  2. Project Discussion