Tag: Summer Training 2015

Create Your First Hadoop Program

Find the number of products sold in each country. Input: the input data set is a CSV file, SalesJan2009.csv. Prerequisites: this tutorial was developed on Ubuntu Linux. You should already have Hadoop (version 2.2.0 was used for this tutorial) and Java (version 1.8.0 was used for this tutorial) installed on the system. Before we start with the actual process, change…

Introduction To Flume and Sqoop

Before we learn more about Flume and Sqoop, let's look at the issues with loading data into Hadoop. Analytical processing using Hadoop requires loading huge amounts of data from diverse sources into Hadoop clusters. This process of bulk-loading data into Hadoop from heterogeneous sources and then processing it comes with a certain set of challenges.…

Hadoop – HDFS Overview

The Hadoop Distributed File System (HDFS) was developed using a distributed file system design. It runs on commodity hardware. Unlike other distributed systems, HDFS is highly fault-tolerant and designed to use low-cost hardware. HDFS holds very large amounts of data and provides easier access. To store such huge data, the files are stored across multiple machines. These files are…

Hadoop – Big Data Solutions

In this approach, an enterprise will have a computer to store and process big data. Here data will be stored in an RDBMS like Oracle Database, MS SQL Server or DB2, and sophisticated software can be written to interact with the database, process the required data and present it to the users for analysis.…