Nowadays, applications such as Facebook, Twitter, LinkedIn, Google, and Yahoo generate huge amounts of data and need a framework that can handle it. These companies need to process that data in several ways: 1. data analysis, 2. proper handling of the data, and 3. converting the data into an understandable custom format.
Apache Hadoop’s MapReduce and HDFS components were originally derived from Google’s MapReduce and Google File System (GFS) papers, respectively.
In 2003-2004, Google introduced two new techniques behind its search engine: 1. a file system, GFS (Google File System), and 2. a data analysis framework called MapReduce, designed to make searching and analyzing data fast. Google only published these techniques as white papers.
In 2005-2006, Yahoo took these techniques from Google's papers and implemented them in a single framework named Hadoop. Hadoop was created by Doug Cutting and Mike Cafarella in 2005. Cutting, who was working at Yahoo! at the time, named it after his son’s toy elephant. It was originally developed to support distribution for the Nutch search engine project. Doug Cutting, chief architect of Cloudera and one of the creators of Hadoop, tells the curious story behind the name.
When he was creating the open source software that supports the processing of large data sets, Cutting knew the project would need a good name. Fortunately, he had one up his sleeve, thanks to his son. Cutting’s son, then 2, was just beginning to talk and called his beloved stuffed yellow elephant “Hadoop” (with the stress on the first syllable). The son, who is now 12, is frustrated with this, Cutting says: “He’s always saying, ‘Why don’t you say my name, and why don’t I get royalties? I deserve to be famous for this.’” 🙂
In this Hadoop tutorial, we will cover the following modules:
- Data Loading Techniques
- Advanced MapReduce
- YARN Programming
- Introduction to Sqoop
- Programming with Sqoop
- Analytics Using Hive
- Understanding HiveQL
- NoSQL Databases
- Understanding HBase
- Real-World Data Sets and Analysis
- Project Discussion