Apache Pig explained (Big Data Hadoop Online Training, OnlineITGuru)

Apache Pig is a tool and platform, generally used with Hadoop, for analyzing large data sets. It was developed at Yahoo in 2006. It has gone through several releases, and the latest version is 0.17, which was released in June 2017. Data manipulation operations in Hadoop can be performed using Apache Pig. For writing data analysis programs, Pig provides a high-level language known as Pig Latin, and programmers write their analysis scripts in it. Scripts written in Pig Latin are internally converted into Map and Reduce tasks. Apache Pig contains a component known as the Pig Engine, which accepts Pig Latin scripts as input and converts them into MapReduce jobs. Pig therefore enables data workers to write complex transformations without prior knowledge of Java. Pig can also invoke code written in languages such as Java, Jython and JRuby through its User Defined Functions (UDFs).
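To make the conversion of a Pig Latin script into Map and Reduce tasks more concrete, here is a minimal Python sketch. The Pig Latin statements in the comments, the `logs.txt` file name and the field names are illustrative assumptions. The Python functions only imitate, on a single machine, the kind of map, shuffle and reduce steps such a script might be translated into; they are not what the Pig Engine actually generates.

```python
from collections import defaultdict

# Hypothetical Pig Latin script that this sketch imitates:
#   logs   = LOAD 'logs.txt' USING PigStorage(',') AS (user:chararray, url:chararray);
#   byuser = GROUP logs BY user;
#   counts = FOREACH byuser GENERATE group, COUNT(logs);

def map_phase(lines):
    """Emit a (user, 1) pair for every input record."""
    for line in lines:
        user, _url = line.split(",")
        yield user, 1

def shuffle(pairs):
    """Group the mapped values by key, as happens between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    """Sum the values of each key, which corresponds to COUNT per group."""
    return {key: sum(values) for key, values in groups.items()}

def count_by_user(lines):
    """Chain the three phases for a list of 'user,url' records."""
    return reduce_phase(shuffle(map_phase(lines)))

records = ["alice,/home", "bob,/cart", "alice,/checkout"]
print(count_by_user(records))
```

The three-line script in the comments expresses the same grouping and counting that takes several hand-written functions here, which is the convenience the paragraph above describes.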


Pig works with data from many sources, both structured and unstructured, and stores the results in the Hadoop Distributed File System (HDFS). It is part of the Hadoop ecosystem, which also includes Hive, HBase, ZooKeeper and other utilities that fill functionality gaps in the framework. A major advantage of Pig is that it follows a multi-query approach, which reduces the number of times the data has to be scanned. It is often claimed to reduce development time by almost 16 times.
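The benefit of scanning the data fewer times can be illustrated with a small Python sketch. It is only an analogy under stated assumptions: the record fields (`status`, `bytes`) and both function names are hypothetical, and the code does not show how Pig itself merges queries.

```python
def two_scans(records):
    """Answer two questions separately, reading the data once per question."""
    scans = 0
    scans += 1
    errors = [r for r in records if r["status"] >= 500]
    scans += 1
    total_bytes = sum(r["bytes"] for r in records)
    return errors, total_bytes, scans

def shared_scan(records):
    """Answer both questions while reading the data only once."""
    scans = 1
    errors, total_bytes = [], 0
    for r in records:
        if r["status"] >= 500:
            errors.append(r)
        total_bytes += r["bytes"]
    return errors, total_bytes, scans

records = [
    {"status": 200, "bytes": 120},
    {"status": 500, "bytes": 30},
    {"status": 503, "bytes": 50},
]
print(two_scans(records)[2], shared_scan(records)[2])
```

Both functions return the same answers; the second simply does so in a single pass, which matters when each pass means reading a very large data set.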


To perform a particular task, programmers write a script in the Pig Latin language and execute it through any of the execution mechanisms. Once submitted, the script goes through a series of transformations to produce the desired output.

Components :  

Pig has several components. The architecture of Pig is shown below. Let us discuss the components in detail.

Parser : Pig scripts are first handled by the Parser. It checks the syntax of the script, does type checking and performs other miscellaneous checks. The output of the Parser is a DAG (Directed Acyclic Graph), which represents the Pig Latin statements and logical operators.
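As a rough illustration of a plan represented as a directed acyclic graph, the Python sketch below models a few operators and the data flow between them. The operator names and the `execution_order` helper are hypothetical, and the structure is an assumption made for illustration, not Pig's internal representation. It uses `graphlib`, which is part of the standard library from Python 3.9.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Each operator maps to the set of operators whose output it consumes.
logical_plan = {
    "load_users": set(),
    "load_orders": set(),
    "filter_orders": {"load_orders"},
    "join": {"load_users", "filter_orders"},
    "store": {"join"},
}

def execution_order(plan):
    """Return the operators so that every input comes before its consumer."""
    return list(TopologicalSorter(plan).static_order())

order = execution_order(logical_plan)
print(order)
```

Because the graph has no cycles, such an ordering always exists; `TopologicalSorter` raises `graphlib.CycleError` if a plan accidentally refers back to itself.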

 [Figure: Architecture of Pig]

Optimizer : The output of the Parser (the logical plan) is passed to the logical optimizer, which carries out logical optimizations such as projection and pushdown.
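The idea behind a pushdown optimization can be sketched in Python with a toy plan. This is a simplified assumption, not Pig's optimizer: the plan is just a list of hypothetical `("filter", predicate)` and `("sort", key)` steps, and the only rule applied is moving a filter ahead of the sort that precedes it, so that fewer rows reach the more expensive step.

```python
def push_filters_down(plan):
    """Move each filter step ahead of an immediately preceding sort step."""
    plan = list(plan)
    changed = True
    while changed:
        changed = False
        for i in range(1, len(plan)):
            if plan[i][0] == "filter" and plan[i - 1][0] == "sort":
                plan[i - 1], plan[i] = plan[i], plan[i - 1]
                changed = True
    return plan

def run(plan, rows):
    """Execute the toy plan step by step on a list of rows."""
    for op, arg in plan:
        if op == "filter":
            rows = [r for r in rows if arg(r)]
        elif op == "sort":
            rows = sorted(rows, key=arg)
    return rows

plan = [("sort", lambda r: r), ("filter", lambda r: r > 1)]
optimized = push_filters_down(plan)
rows = [5, 2, 8, 1]
print([op for op, _ in optimized])
print(run(plan, rows) == run(optimized, rows))
```

The rewritten plan produces the same rows as the original one; it just sorts a smaller input.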

Compiler : The task of the compiler is to compile the optimized logical plan into a series of MapReduce jobs.

Execution Engine : The task of the execution engine is to submit the MapReduce jobs to Hadoop in sorted order. Finally, these MapReduce jobs are executed on Hadoop to produce the desired results.

MapReduce : A MapReduce job usually splits the input data set into independent chunks, which are processed by map tasks in a completely parallel manner. The framework takes care of scheduling and monitoring the tasks, and re-executes any task that fails.
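The behaviour described above (independent chunks, parallel map tasks, re-execution on failure) can be imitated on one machine with a short Python sketch. Everything here is an assumption for illustration: threads stand in for cluster nodes, `flaky_sum` is a hypothetical task that deliberately fails on its first attempt for each chunk, and the retry limit is arbitrary. It is not Hadoop's scheduler.

```python
from concurrent.futures import ThreadPoolExecutor

def split_into_chunks(data, size):
    """Cut the input into independent chunks of at most `size` items."""
    return [data[i:i + size] for i in range(0, len(data), size)]

def run_with_retry(task, chunk, max_attempts=3):
    """Run one map task, re-executing it if it fails."""
    for attempt in range(1, max_attempts + 1):
        try:
            return task(chunk)
        except RuntimeError:
            if attempt == max_attempts:
                raise

def run_map_tasks(task, data, chunk_size=3, workers=4):
    """Process all chunks in parallel and return results in chunk order."""
    chunks = split_into_chunks(data, chunk_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda c: run_with_retry(task, c), chunks))

failed_once = set()

def flaky_sum(chunk):
    """Hypothetical map task: fails the first time it sees a chunk."""
    key = tuple(chunk)
    if key not in failed_once:
        failed_once.add(key)
        raise RuntimeError("simulated task failure")
    return sum(chunk)

print(run_map_tasks(flaky_sum, list(range(1, 10))))
```

Each chunk is summed independently, and the simulated failures are absorbed by the retry loop rather than aborting the whole job.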

Features of PIG :

UDFs : Pig provides the facility to create User Defined Functions in other programming languages such as Java and invoke them in Pig scripts.

Extensibility : Building on the existing operators, users can develop their own functions to read, process and write data.

Rich set of operators : Operations such as join, sort and filter are performed using its rich set of operators.

Effective handling : Pig handles all kinds of data, both structured and unstructured, and stores the results in HDFS.
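As an example of the UDF feature listed above, here is a minimal sketch of a function written in Python, intended for use through Pig's Jython support. The function name `extract_domain`, the file name `udfs.py` and the alias `myudfs` are hypothetical. The `outputSchema` decorator is assumed to be supplied by the Pig runtime; a stand-in is defined so that the function can also be tried in plain Python.

```python
try:
    # Assumed to be available when Pig runs the script through Jython;
    # it is not installed in a plain Python environment.
    from pig_util import outputSchema
except ImportError:
    def outputSchema(schema):
        """Stand-in decorator so the function can be tried outside Pig."""
        def wrap(func):
            func.output_schema = schema
            return func
        return wrap

@outputSchema("domain:chararray")
def extract_domain(email):
    """Return the lower-cased domain part of an e-mail address, or None."""
    if email is None or "@" not in email:
        return None
    return email.split("@", 1)[1].lower()

# Hypothetical use from a Pig script, assuming this file is saved as udfs.py:
#   REGISTER 'udfs.py' USING jython AS myudfs;
#   domains = FOREACH users GENERATE myudfs.extract_domain(email);

print(extract_domain("Ann@Example.COM"))
```

Returning `None` for missing or malformed input is a design choice of this sketch, so that one bad record does not fail the whole job.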

Advantages of PIG :

 In comparison to SQL, Pig has the following advantages:

It declares execution plans.

It uses lazy evaluation.

It can store data at any point during a pipeline.

It uses Extract, Transform and Load (ETL).

MapReduce tasks can be done easily using the Pig Latin language.
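Lazy evaluation, one of the advantages listed above, means that defining the steps of a pipeline does not run them; work starts only when output is requested. Python generators behave in a similar way, so they give a convenient analogy. The function names and the `trace` list below are hypothetical and exist only to make the order of events visible.

```python
def load(rows, trace):
    """Yield rows one by one, recording each read in the trace."""
    for row in rows:
        trace.append(("load", row))
        yield row

def keep_even(rows, trace):
    """Pass through even numbers only, recording each check in the trace."""
    for row in rows:
        trace.append(("filter", row))
        if row % 2 == 0:
            yield row

trace = []
pipeline = keep_even(load([1, 2, 3, 4], trace), trace)
work_before = len(trace)   # the pipeline is only defined, nothing has run
result = list(pipeline)    # requesting the output triggers the work
print(work_before, result, len(trace))
```

Deferring the work until the whole pipeline is known is what gives a system the chance to inspect and rearrange the steps before running them.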

Applications :  

For processing time-sensitive data loads.

For processing huge data resources such as web logs.


Recommended Audience:

Software developers

ETL  developers

Project Managers

Team Leads

Business Analysts


There are few prerequisites for learning Big Data Hadoop. It is good to have some knowledge of OOP concepts, but it is not mandatory. Our trainers will teach you those OOP concepts if you are not familiar with them.
