DATABASE FUNDAMENTALS
BASICS OF BIG DATA
Question
Consider the pseudo-code for MapReduce’s WordCount example (not shown here). Suppose you now want to determine the average number of words per sentence. Which part of the (pseudo-)code do you need to adapt?
Only map()
Only reduce()
map() and reduce()
The code does not have to be changed.
Explanation:
Detailed explanation-1: In WordCount, map() tokenizes the input text into words and emits one key/value pair per word, where the key is the word and the value is 1. reduce() then sums the values for each key, which gives the number of occurrences of each word in the input file. Averaging words per sentence needs a different emitted pair (a per-sentence word count) and a different aggregation (a mean instead of a sum), so both map() and reduce() have to be adapted.
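Since the WordCount pseudo-code is not shown, here is a minimal sketch of it in plain Python. It is not Hadoop code: `map_fn`, `reduce_fn`, and the in-memory `run_job` helper are hypothetical names chosen for illustration, and `run_job` only imitates the map, shuffle, and reduce phases on a small list of lines.

```python
from collections import defaultdict


def map_fn(_key, line):
    """Emit (word, 1) for every whitespace-separated token in the line."""
    for word in line.lower().split():
        yield word, 1


def reduce_fn(word, counts):
    """Sum the 1s that were emitted for a single word."""
    yield word, sum(counts)


def run_job(records, mapper, reducer):
    """Hypothetical in-memory stand-in for the map, shuffle, and reduce phases."""
    groups = defaultdict(list)
    for key, value in records:
        for out_key, out_value in mapper(key, value):
            groups[out_key].append(out_value)  # shuffle: group values by key
    results = {}
    for key in sorted(groups):
        for out_key, out_value in reducer(key, iter(groups[key])):
            results[out_key] = out_value
    return results


# Input records are (line offset, line text) pairs, as in the classic example.
lines = [(0, "the quick brown fox"), (1, "the lazy dog"), (2, "the end")]
word_counts = run_job(lines, map_fn, reduce_fn)
print(word_counts)
```

The key is the word itself, so every distinct word ends up in its own reduce() call. That is the part that stops working once the quantity of interest is a per-sentence statistic.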
Detailed explanation-2: A map() function can emit anywhere from zero to an unlimited number of key/value pairs per input record; the framework imposes no fixed maximum. A reduce() function receives the values for a key as an iterator, which in Hadoop is single-pass: iterating over the values more than once requires buffering them first.
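The following sketch shows one possible way to adapt both functions for the average number of words per sentence. It is an illustration under stated assumptions, not the quiz's reference solution: sentences are assumed to end at `.`, `!`, or `?`, each input record is assumed to contain whole sentences, and `map_fn`, `reduce_fn`, and `run_job` are hypothetical names. The mapper emits zero, one, or several pairs depending on the record, and the reducer reads its values in a single pass.

```python
import re
from collections import defaultdict

# Assumption: a sentence ends at one or more of '.', '!' or '?'.
SENTENCE_END = re.compile(r"[.!?]+")


def map_fn(_key, text):
    """Emit ("sentence", word_count) once per non-empty sentence."""
    for sentence in SENTENCE_END.split(text):
        words = sentence.split()
        if words:  # empty fragments emit nothing at all
            yield "sentence", len(words)


def reduce_fn(key, lengths):
    """Average the per-sentence word counts in a single pass over the iterator."""
    total = 0
    count = 0
    for n in lengths:
        total += n
        count += 1
    yield key, total / count


def run_job(records, mapper, reducer):
    """Hypothetical in-memory stand-in for the map, shuffle, and reduce phases."""
    groups = defaultdict(list)
    for key, value in records:
        for out_key, out_value in mapper(key, value):
            groups[out_key].append(out_value)
    results = {}
    for key in groups:
        # iter(...) mimics a values iterator that can be consumed only once
        for out_key, out_value in reducer(key, iter(groups[key])):
            results[out_key] = out_value
    return results


records = [
    (0, "The quick brown fox jumps. The dog sleeps!"),
    (1, ""),
    (2, "Is this the end?"),
]
emitted_per_record = [len(list(map_fn(k, v))) for k, v in records]
average = run_job(records, map_fn, reduce_fn)
print(emitted_per_record, average)
```

Using one constant key sends every count to a single reduce() call, which keeps the mean exact. If a combiner were added, it would need to pass along (sum, count) pairs, because an average of partial averages is generally not the overall average.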