
Hadoop Tutorial

In this Hadoop tutorial, we will cover what Hadoop is, why to use it, the Hadoop architecture, Big Data, MapReduce, and some of the tools in the Hadoop ecosystem.

Nowadays, applications such as Facebook, Twitter, LinkedIn, Google, and Yahoo generate huge amounts of data, and they need a framework that can handle it. These companies need to process that data in three ways: 1. data analysis, 2. proper handling of the data, and 3. converting the data into an understandable, custom format.
Apache Hadoop’s MapReduce and HDFS components were originally derived from Google’s MapReduce and Google File System (GFS) papers, respectively.

In 2003-2004, Google introduced two new techniques behind its search engine: 1. a file system called GFS (Google File System), and 2. a framework for data analysis called MapReduce. Together they made searching and analyzing data fast. Google published both techniques as white papers.

In 2005-2006, Yahoo took these techniques from Google's papers and implemented them in a single framework named Hadoop. Hadoop was created by Doug Cutting and Mike Cafarella in 2005. Cutting, who was working at Yahoo! at the time, named it after his son’s toy elephant. It was originally developed to support distribution for the Nutch search engine project. The curious story behind the name is best told by Doug Cutting himself, who later became chief architect of Cloudera.

When he was creating the open source software that supports the processing of large data sets, Cutting knew the project would need a good name. Fortunately, he had one up his sleeve, thanks to his son. Cutting’s son, then 2, was just beginning to talk and called his beloved stuffed yellow elephant “Hadoop” (with the stress on the first syllable). The son, who is now 12, is frustrated with this. According to Cutting, he is always saying, ‘Why don’t you say my name, and why don’t I get royalties? I deserve to be famous for this.’ 🙂

In this Hadoop tutorial, we will go through the following modules.

Module 1-

    1. Understanding Big Data
    2. Introduction to Hadoop
    3. Hadoop Architecture

Module 2-

    1. Hadoop Distributed File System (HDFS)
    2. HDFS Architecture
    3. JobTracker and TaskTracker Architecture
    4. Hadoop Configuration
    5. Hadoop Environment Setup (Hadoop 1.x)
    6. Hadoop Environment Setup (Hadoop 2.x)
    7. Hadoop Installation on Ubuntu
    8. Data Loading Technique

Module 3-

    1. Introduction to MapReduce
    2. MapReduce Flow Chart with Sample Example
    3. MapReduce Programming: Hello World or WordCount Program

Module 4-

    1. Advanced MapReduce
    2. YARN
    3. YARN Programming

Module 5-

    1. Introduction to Sqoop
    2. Programming with Sqoop

Module 6-

    1. Analytics Using Hive
    2. Understanding HiveQL

Module 7-

    1. NoSQL Databases
    2. Understanding HBase
    3. ZooKeeper

Module 8-

    1. Real-World Data Sets and Analysis
    2. Project Discussion
Published by
Dinesh Rajput
