Hadoop Archive

JobTracker and TaskTracker Design

JobTracker and TaskTracker come into the picture when a data set needs to be processed. In a Hadoop system, five services always run in the background (called Hadoop daemon services). Daemon services of Hadoop: NameNodes, Secondary …

HDFS Architecture

In this Hadoop tutorial we describe the HDFS architecture. HDFS has two main components. Main components of HDFS: NameNodes: master of the system; maintain and manage the blocks which …

What is HDFS?

What is HDFS? HDFS is a file system designed for storing very large files with streaming data access patterns, running on clusters of commodity hardware. Highly fault-tolerant: “Hardware failure is the norm rather than the …

Hadoop Architecture

Here we describe the Hadoop architecture. At a high level, the Hadoop architecture has two main modules, HDFS and MapReduce. That is, HDFS + MapReduce = Hadoop Framework. The following picture shows the high-level architecture of Hadoop …

What is Hadoop?

What is Hadoop? First of all, we need to understand what a DFS (Distributed File System) is, and why a DFS is needed. DFS (Distributed File System): a distributed file system is a client/server-based application that allows clients to access and process data …

Understanding Big Data

What is Big Data? Lots of data (terabytes or petabytes). Big Data is the term for a collection of data sets so large and complex that it becomes difficult to process using on-hand database management …