
Which two resources should you expect to be bottlenecks?

CCD-410

You need to create a job that does frequency analysis on input data. You will do this by writing a
Mapper that uses TextInputFormat and splits each value (a line of text from an input file) into
individual characters. For each one of these characters, you will emit the character as a key and
an IntWritable as the value. As this will produce proportionally more intermediate data than input
data, which two resources should you expect to be bottlenecks?

A.
Processor and network I/O

B.
Disk I/O and network I/O

C.
Processor and RAM

D.
Processor and disk I/O

2 Answers



  1. Fujmin on Jul 24, 2013

    Answer: B
    Disk I/O and network I/O

    +6 Votes (6 up, 0 down)



  2. Chatar on Dec 29, 2013

    Yes, option B is correct. More key/value pairs in the intermediate output mean more writing to the local file system by the mappers (disk I/O) and more data transferred across the network during the shuffle, sort, and merge phases (network I/O).

    +1 Votes (1 up, 0 down)
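
To get a feel for how much larger the intermediate data is than the input, here is a rough back-of-the-envelope sketch in Python. It is not Hadoop code. The function names are made up for illustration, and the sizes are assumptions: a 4-byte value (the serialized size of an IntWritable) and a key written as a 1-byte length prefix plus the character's UTF-8 bytes (roughly how a short Text key is serialized). Per-record framing, compression, and any combiner are ignored.

```python
def intermediate_bytes(line: str, value_bytes: int = 4, key_overhead: int = 1) -> int:
    """Estimate the serialized map output for one input line.

    Assumed sizes (not measured): each character becomes one record whose key
    is a 1-byte length prefix plus its UTF-8 bytes, and whose value is 4 bytes.
    """
    total = 0
    for ch in line:
        key_bytes = len(ch.encode("utf-8")) + key_overhead
        total += key_bytes + value_bytes
    return total


def amplification(line: str) -> float:
    """Ratio of estimated intermediate bytes to input bytes for one line."""
    input_bytes = len(line.encode("utf-8"))
    if input_bytes == 0:
        return 0.0  # an empty line produces no records
    return intermediate_bytes(line) / input_bytes


if __name__ == "__main__":
    line = "hello world"
    print("input bytes:       ", len(line.encode("utf-8")))
    print("intermediate bytes:", intermediate_bytes(line))
    print("amplification:     ", amplification(line))
```

Under these assumptions an 11-byte line turns into about 66 bytes of map output, roughly six times the input. All of that has to be spilled to local disk by the mappers and then copied over the network to the reducers, which is why B is the expected answer.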