DATABASE FUNDAMENTALS
BASICS OF BIG DATA
Question
A. Create Hive and HBase Queries
B. Distributing Storage
C. Processing the dataset and summarizing the result
D. Resource Management and Job Scheduling
Detailed explanation-1: The fundamental idea of YARN is to split resource management and job scheduling/monitoring into separate daemons: a global ResourceManager (RM) and a per-application ApplicationMaster (AM). An application is either a single job or a DAG of jobs.
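The split described above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the YARN API: the class names mirror the YARN terms, but the methods (allocate, release, run), the container counts, and the task names are all hypothetical and chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ResourceManager:
    """Toy stand-in for the global RM: it only hands out containers."""
    free_containers: int
    granted: dict = field(default_factory=dict)  # app_id -> containers held

    def allocate(self, app_id: str, wanted: int) -> int:
        # Grant as many containers as are free, up to the request.
        n = min(wanted, self.free_containers)
        self.free_containers -= n
        self.granted[app_id] = self.granted.get(app_id, 0) + n
        return n

    def release(self, app_id: str) -> None:
        # Return every container the application holds to the pool.
        self.free_containers += self.granted.pop(app_id, 0)


@dataclass
class ApplicationMaster:
    """Toy stand-in for a per-application AM: it tracks one app's tasks."""
    app_id: str
    pending_tasks: list

    def run(self, rm: ResourceManager) -> list:
        finished = []
        while self.pending_tasks:
            got = rm.allocate(self.app_id, len(self.pending_tasks))
            if got == 0:
                break  # nothing free; a real AM would wait and retry
            # Run one task per granted container, then hand containers back.
            finished += self.pending_tasks[:got]
            self.pending_tasks = self.pending_tasks[got:]
            rm.release(self.app_id)
        return finished


# Hypothetical cluster with two containers and one three-task application.
rm = ResourceManager(free_containers=2)
am = ApplicationMaster("app-1", ["map-0", "map-1", "reduce-0"])
done = am.run(rm)
```

The point of the sketch is the division of labor: the ResourceManager knows nothing about tasks, and the ApplicationMaster knows nothing about other applications.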
Detailed explanation-2: One of Apache Hadoop’s core components, YARN is responsible for allocating system resources to the various applications running in a Hadoop cluster and scheduling tasks to be executed on different cluster nodes.
Detailed explanation-3: YARN supports an extensible resource model. By default YARN tracks CPU and memory for all nodes, applications, and queues, but the resource definition can be extended to include arbitrary “countable” resources. A countable resource is a resource that is consumed while a container is running, but is released afterwards.
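The idea of a countable resource can be sketched as simple bookkeeping: usage goes up when a container starts and comes back down when it finishes. The sketch below is a toy model and not YARN code; the class NodeResources, its methods, and the resource names (including the extra "gpu" resource) are assumptions made for illustration.

```python
class NodeResources:
    """Toy accounting of countable resources on one node."""

    def __init__(self, capacity):
        # capacity maps a resource name to the total amount on the node.
        self.capacity = dict(capacity)
        self.used = {name: 0 for name in capacity}

    def fits(self, request):
        # A request fits only if every named resource stays within capacity.
        return all(
            self.used.get(name, 0) + amount <= self.capacity.get(name, 0)
            for name, amount in request.items()
        )

    def start_container(self, request):
        # Consume the requested amounts while the container runs.
        if not self.fits(request):
            return False
        for name, amount in request.items():
            self.used[name] = self.used.get(name, 0) + amount
        return True

    def finish_container(self, request):
        # Release the amounts once the container is done.
        for name, amount in request.items():
            self.used[name] -= amount


# CPU and memory plus one hypothetical extra countable resource, "gpu".
node = NodeResources({"memory_mb": 4096, "vcores": 4, "gpu": 1})
request = {"memory_mb": 2048, "vcores": 1, "gpu": 1}

first = node.start_container(request)   # the single gpu is now in use
second = node.start_container(request)  # rejected: no gpu left
node.finish_container(request)          # gpu released
third = node.start_container(request)   # accepted again
```

Adding a new resource type here means adding one more key to the capacity dictionary, which is the sense in which such a model is extensible.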
Detailed explanation-4: The ResourceManager (RM) is responsible for tracking the resources in a cluster and scheduling applications (e.g., MapReduce jobs). Prior to Hadoop 2.4, the ResourceManager was a single point of failure in a YARN cluster.
Detailed explanation-5: Apache Oozie acts as a scheduler for Hadoop jobs. It schedules the jobs and binds them together as one logical unit of work. Oozie jobs come in two kinds, one of which is the Oozie workflow: a sequential set of actions to be executed.
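The notion of a workflow as a sequential set of actions can be sketched in a few lines. Real Oozie workflows are defined in XML and run by the Oozie server, so the following is only a toy analogy: the function run_workflow and the action names are hypothetical and are not part of Oozie.

```python
def run_workflow(actions):
    """Run (name, callable) pairs in order and stop at the first failure.

    Returns the names that completed and the name that failed (or None).
    """
    completed = []
    for name, action in actions:
        try:
            action()
        except Exception:
            return completed, name
        completed.append(name)
    return completed, None


# Three hypothetical actions bound together as one logical unit of work.
log = []
actions = [
    ("ingest", lambda: log.append("ingest")),
    ("transform", lambda: log.append("transform")),
    ("export", lambda: log.append("export")),
]
completed, failed = run_workflow(actions)
```

Each action runs only after the previous one has succeeded, which is the ordering guarantee the explanation describes.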