How many Mappers will run?

CCD-410

Your cluster’s HDFS block size is 64 MB. You have a directory containing 100 plain text files, each of
which is 100 MB in size. The InputFormat for your job is TextInputFormat. Determine how many
Mappers will run.

A. 64
B. 100
C. 200
D. 640

2 Answers



  1. oramin on Jul 24, 2013

    Answer: C (200)

    Explanation:
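    Each 100 MB file is larger than the 64 MB block size, so it is stored as two blocks
    (64 MB and 36 MB). With TextInputFormat and the default split settings, each block
    becomes one input split and each split gets its own mapper, so 100 files x 2 splits =
    200 mappers would run.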
    Note:
    If you’re not compressing the files, Hadoop will process a large file (say 10 GB) with a
    number of mappers determined by the block size of the file.
    Say your block size is 64 MB; then you will have about 160 mappers processing this 10 GB
    file (160 x 64 MB = 10 GB). Depending on how CPU-intensive your mapper logic is, this might
    be an acceptable block size, but if you find that your mappers finish in under a minute,
    you might want to increase the work done by each mapper by increasing the block size to
    128, 256, or 512 MB (the actual size depends on how you intend to process the data).
    Reference: http://stackoverflow.com/questions/11014493/hadoop-mapreduce-appropriate-inputfiles-size (first answer, second paragraph)

    +4 votes (4 up, 0 down)



  2. nihir on Oct 30, 2013

    The test engine gives a different answer.

    -2 votes (1 up, 3 down)
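
The split arithmetic behind the first answer can be checked with a small sketch. This is not Hadoop code: it is a plain-Python approximation of how the default file-based input formats appear to cut a file into splits, assuming the split size equals the block size (default minimum and maximum split settings), uncompressed and non-empty files, and a 10% "slop" that lets the last split run slightly over. The names count_splits and SPLIT_SLOP are chosen here for illustration.

```python
# Rough sketch of per-file split counting; an approximation, not the Hadoop API.
MB = 1024 * 1024
SPLIT_SLOP = 1.1  # assumption: the final split may be up to 10% larger than split_size


def count_splits(file_size, split_size):
    """Return how many input splits (one mapper each) a single file yields."""
    splits = 0
    remaining = file_size
    # Cut full-size splits while clearly more than one split's worth remains.
    while remaining / split_size > SPLIT_SLOP:
        splits += 1
        remaining -= split_size
    # Whatever is left (for example the 36 MB tail) becomes the last split.
    if remaining > 0:
        splits += 1
    return splits


block = 64 * MB
per_file = count_splits(100 * MB, block)        # 64 MB + 36 MB
total_mappers = 100 * per_file                  # 100 files in the directory
big_file = count_splits(10 * 1024 * MB, block)  # the 10 GB example from the note

print(per_file, total_mappers, big_file)
```

Under these assumptions the sketch gives 2 splits per file, 200 mappers for the directory, and 160 mappers for the 10 GB file. A different minimum or maximum split size, or a non-splittable compression codec, would change the counts.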