COMPUTER FUNDAMENTALS

EMERGING TRENDS IN COMPUTING

CLOUD COMPUTING

Question
The number of maps is usually driven by the total size of ____
A
inputs
B
outputs
C
tasks
D
None of the mentioned
Explanation: 

Detailed explanation-1: The number of maps is usually driven by the total size of the inputs, that is, the total number of blocks of the input files. The right level of parallelism for maps seems to be around 10-100 maps per node, although it has been set as high as 300 maps per node for very CPU-light map tasks.
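As a rough illustration of the per-node guideline above, the following sketch divides a job's total map count by the number of worker nodes and checks the result against the 10-100 range. The function names and the sample figures are hypothetical and chosen only for illustration; they are not part of any Hadoop API.

```python
def maps_per_node(total_maps, num_nodes):
    """Average number of map tasks each node would run for one job."""
    if num_nodes <= 0:
        raise ValueError("num_nodes must be positive")
    return total_maps / num_nodes


def within_suggested_range(total_maps, num_nodes, low=10, high=100):
    """True if the average maps per node falls in the quoted 10-100 range."""
    return low <= maps_per_node(total_maps, num_nodes) <= high


# Hypothetical job: 8,000 map tasks spread over 100 nodes.
print(maps_per_node(8000, 100))
print(within_suggested_range(8000, 100))
```

This is only an average; it does not account for how the scheduler actually places tasks or for nodes with different capacities.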

Detailed explanation-2: The number of map tasks for a given job is driven by the number of input splits. One map task is created for each input split, which typically corresponds to an HDFS block. So, over the lifetime of a MapReduce job, the number of map tasks equals the number of input splits.

Detailed explanation-3: The number of map tasks depends on the input file and its format. Typically, a file in a Hadoop cluster is broken down into blocks, each with a default size of 128 MB. Depending on its size, the input file is split into multiple chunks, and a map task then runs for each chunk.
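The relationship between input size and map count can be sketched with simple arithmetic. The example below assumes one split per 128 MB block, computed separately for each file, and rounds up for the final partial block. It is an approximation: the function name is hypothetical, and a real Hadoop input format may produce a slightly different count depending on the file format and split settings.

```python
BLOCK_SIZE_BYTES = 128 * 1024 * 1024  # assumed block size: 128 MB


def estimate_map_tasks(file_sizes_bytes, split_size_bytes=BLOCK_SIZE_BYTES):
    """Estimate map tasks as one per split, with splits counted per file."""
    if split_size_bytes <= 0:
        raise ValueError("split_size_bytes must be positive")
    total = 0
    for size in file_sizes_bytes:
        if size > 0:
            # Integer ceiling division: a partial last block still needs a map.
            total += (size + split_size_bytes - 1) // split_size_bytes
    return total


MB = 1024 * 1024
print(estimate_map_tasks([1024 * MB]))       # one 1 GB file
print(estimate_map_tasks([300 * MB]))        # one 300 MB file
print(estimate_map_tasks([10 * MB, 10 * MB]))  # two small files
```

Under these assumptions a 1 GB file yields 8 map tasks, a 300 MB file yields 3, and two 10 MB files yield 2, since each file gets at least one split of its own.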
