MapReduce Quiz

MapReduce Beginner Quiz

This MapReduce quiz contains a set of 57 MCQ questions on MapReduce and Hadoop which will help you clear the beginner level.



1) A large block size makes data transfer more efficient, since transfer time dominates seek time?

  1. TRUE
  2. FALSE

Answer : A

 
 
2) Clients access the blocks directly from ________ for reads and writes

  1. data nodes
  2. name node
  3. secondarynamenode
  4. none of the above

Answer : A

 
 
3) Information about the locations of a file's blocks is stored at __________

  1. data nodes
  2. name node
  3. secondarynamenode
  4. none of the above

Answer : B

 
 
4) Hadoop is 'rack-aware', and HDFS replicates data blocks on nodes in different racks

  1. TRUE
  2. FALSE

Answer : A

 
 
5) Which node stores the checksum?

  1. datanode
  2. secondarynamenode
  3. namenode
  4. all of the above

Answer : A

 
 
6) The MapReduce programming model is _____________

  1. Platform Dependent but not language-specific
  2. Neither platform- nor language-specific
  3. Platform independent but language-specific
  4. Platform Dependent and language-specific

Answer : B

 
 
7) Which is optional in a MapReduce program?

  1. Mapper
  2. Reducer
  3. both are optional
  4. both are mandatory

Answer : B
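
The Reducer is optional: a map-only job is declared by setting the number of reduce tasks to zero, in which case mapper output goes straight to HDFS. A minimal driver sketch (the mapper class name and paths are illustrative, not from the quiz):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class MapOnlyJob {
      public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "map-only");
        job.setJarByClass(MapOnlyJob.class);
        job.setMapperClass(WordCountMapper.class); // hypothetical mapper class
        job.setNumReduceTasks(0);                  // zero reducers: the Reducer is omitted
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
      }
    }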

 
 
8) TaskTrackers reside on _________ and run ________ tasks.

  1. datanode, map/reduce
  2. datanode, reducer
  3. datanode, mapper
  4. namenode, map/reduce

Answer : A

 
 
9) Sqoop is a tool that can be used to

  1. Import tables from an RDBMS into HDFS
  2. Export files from HDFS into RDBMS tables
  3. Use a JDBC interface
  4. all of the above

Answer : D
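
For instance, a typical Sqoop import over JDBC (the connection string, table, and target directory are illustrative):

    sqoop import --connect jdbc:mysql://dbhost/sales --table orders --target-dir /data/orders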

 
 
10) _________ is a distributed, reliable, and available service for efficiently moving large amounts of data as it is produced

  1. FLUME
  2. SQOOP
  3. PIG
  4. HIVE

Answer : A

 
 
11) ________ is a workflow engine that runs on a server, typically outside the cluster
  1. Oozie
  2. Zookeeper
  3. Chukwa
  4. Mahout

Answer : A

 
 
12) For better load balancing and to avoid potential performance issues, one should use

  1. a custom Partitioner
  2. a custom Combiner
  3. a custom Reducer
  4. more reducers

Answer : A

 
 
13) Anything written using the OutputCollector.collect method will be written to __________

  1. Local file system
  2. HDFS
  3. Windows file systems only
  4. none of the above

Answer : B
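
In the old (org.apache.hadoop.mapred) API, output is emitted through OutputCollector.collect, and the job's final output lands in HDFS. A minimal mapper sketch (the class and its logic are illustrative):

    import java.io.IOException;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapred.MapReduceBase;
    import org.apache.hadoop.mapred.Mapper;
    import org.apache.hadoop.mapred.OutputCollector;
    import org.apache.hadoop.mapred.Reporter;

    public class LineLengthMapper extends MapReduceBase
        implements Mapper<LongWritable, Text, Text, IntWritable> {
      public void map(LongWritable key, Text value,
                      OutputCollector<Text, IntWritable> output, Reporter reporter)
          throws IOException {
        // collect() hands the pair to the framework for shuffle and, ultimately, HDFS
        output.collect(value, new IntWritable(value.getLength()));
      }
    }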

 
 
14) A join operation is typically performed in the ___________

  1. mapper
  2. reducer
  3. shuffle and sort
  4. none of the above

Answer : B

 
 
15) When is the earliest that the reduce() method of any reduce task in a given job is called?

  1. immediately after all map tasks have completed
  2. As soon as a map task emits at least one record
  3. As soon as at least one map task has finished processing its complete input split
  4. none of the above

Answer : A

 
 
16) Which daemon distributes individual tasks to data nodes?

  1. tasktracker
  2. jobtracker
  3. namenode
  4. datanode

Answer : B

 
 
17) The __________ object allows the mapper to interact with the rest of the Hadoop system

  1. Context object
  2. InputSplit
  3. RecordReader
  4. Shuffle and Sort

Answer : A
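
In the new (org.apache.hadoop.mapreduce) API, the Context object carries output, counters, and configuration between the mapper and the framework. A minimal word-count mapper sketch:

    import java.io.IOException;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;

    public class WordCountMapper
        extends Mapper<LongWritable, Text, Text, IntWritable> {
      private static final IntWritable ONE = new IntWritable(1);

      @Override
      protected void map(LongWritable key, Text value, Context context)
          throws IOException, InterruptedException {
        for (String word : value.toString().split("\\s+")) {
          if (!word.isEmpty()) {
            context.write(new Text(word), ONE); // Context hands output to the framework
          }
        }
      }
    }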

 
 
18) How many instances of JobTracker can run on a Hadoop cluster?

  1. only one
  2. maximum two
  3. any number but should not be more than number of datanodes
  4. none of the above

Answer : A

 
 
19) How many instances of TaskTracker run on a Hadoop cluster?

  1. an unlimited number of TaskTrackers on each datanode
  2. one TaskTracker for each datanode
  3. a maximum of 2 TaskTrackers for each datanode
  4. none of the above

Answer : B

 
 
20) Which daemon processes must run on the namenode?

  1. tasktracker and jobtracker
  2. namenode and jobtracker
  3. namenode and secondarynamenode
  4. none of the above

Answer : B

 
 
21) Which daemon processes must run on a datanode?

  1. tasktracker and datanode
  2. namenode and jobtracker
  3. datanode and secondarynamenode
  4. tasktracker and jobtracker

Answer : A

 
 
22) Which daemon process must run on the secondary namenode?

  1. tasktracker
  2. namenode
  3. secondarynamenode
  4. datanode

Answer : C

 
 
23) What is the command for checking disk usage in Hadoop?

  1. hadoop fs -disk -space
  2. hadoop fs -diskusage
  3. hadoop fs -du
  4. None of the above

Answer : C
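
For example, to report the space consumed by files under a directory (the path is illustrative):

    hadoop fs -du /user/hadoop/input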

 
 
24) Which one is not a master daemon?

  1. Namenode
  2. Jobtracker
  3. Tasktracker
  4. None of these.

Answer : B

 
 
25) Does HDFS allow appends to files through the command line?

  1. True
  2. False

Answer : B

 
 
26) HDFS works on the principle of

  1. Write Once, Read Many
  2. Write Many, Read Many
  3. Write Many, Read Once
  4. None

Answer : A

 
 
27) The datanode decides where to store the data.

  1. True
  2. False

Answer : B

 
 
28) In HDFS, reading is parallel but writing is not parallel.

  1. True
  2. False

Answer : A

 
 
29) The ___ command checks for various inconsistencies in HDFS

  1. FSCK
  2. FETCHDT
  3. SAFEMODE
  4. SAFEANDRECOVERY

Answer : A

 
 
30) The hadoop fsck command is used to

  1. Check the integrity of HDFS
  2. Check the status of datanodes in the cluster
  3. Check the status of the NameNode in the cluster
  4. Check the status of the Secondary NameNode

Answer : A
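
For example, to check the health of the whole filesystem and show block locations (these are standard fsck flags):

    hadoop fsck / -files -blocks -locations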

 
 
31) How can you determine the available HDFS space in your cluster?

  1. using the hadoop dfsadmin -report command
  2. using the hadoop fsck / command
  3. using the Secondary NameNode web UI
  4. using the DataNode web UI

Answer : A
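
The report lists configured capacity, used space, and remaining space for the cluster as a whole and per datanode:

    hadoop dfsadmin -report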

 
 
32) How does the HDFS architecture provide data reliability?

  1. Reliance on SAN devices as a DataNode interface.
  2. Storing multiple replicas of data blocks on different DataNodes
  3. DataNodes make copies of their data blocks, and put them on different local disks.
  4. Reliance on RAID on each DataNode.

Answer : B

 
 
33) The path in which the HDFS data will be stored is specified in the following file

  1. hdfs-site.xml
  2. yarn-site.xml
  3. mapred-site.xml
  4. core-site.xml

Answer : A
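
A minimal sketch of the relevant hdfs-site.xml entry; the property name varies by version (dfs.data.dir in Hadoop 1.x, dfs.datanode.data.dir in 2.x) and the path below is illustrative:

    <property>
      <name>dfs.datanode.data.dir</name>
      <value>/var/hadoop/hdfs/data</value>
    </property>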

 
 
34) What is the default partitioning mechanism?

  1. Round Robin
  2. User needs to configure
  3. Hash Partitioning
  4. None

Answer : C
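
The default is Hadoop's HashPartitioner, which picks a reducer from the hash of the key; its core logic is essentially the following method:

    public int getPartition(K key, V value, int numReduceTasks) {
      // mask the sign bit so the result is a valid, non-negative partition index
      return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }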

 
 
35) Is it possible to change the HDFS block size?

  1. TRUE
  2. FALSE

Answer : A
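
Yes: the default block size is configurable, e.g. in hdfs-site.xml (the property is dfs.blocksize in Hadoop 2.x, dfs.block.size in older releases; the 128 MB value below is illustrative):

    <property>
      <name>dfs.blocksize</name>
      <value>134217728</value>
    </property>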

 
 
36) The NameNode contains

  1. Metadata and all data blocks
  2. Metadata and recently used blocks
  3. Metadata only
  4. None of the above

Answer : C

 
 
37) The NameNode can be formatted at any time without data loss

  1. TRUE
  2. FALSE

Answer : B

 
 
38) How do you list the files in an HDFS directory?

  1. ls
  2. hadoop ls
  3. hadoop fs -ls
  4. hadoop ls -fs

Answer : C
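
For example (the path is illustrative):

    hadoop fs -ls /user/hadoop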

 
 
39) HDFS achieves high availability and fault tolerance through

  1. splitting files into blocks
  2. keeping a copy of frequently accessed data blocks in the NameNode
  3. replicating data blocks on multiple datanodes in the cluster
  4. None of the above

Answer : C

 
 
40) The NameNode keeps both metadata and data files

  1. TRUE
  2. FALSE

Answer : B

 
 
41) Big Data poses a challenge to traditional systems in terms of

  1. Network bandwidth
  2. Operating system
  3. Storage and processing
  4. None of the above

Answer : C

 
 
42) What are supported programming languages for Map Reduce?

  1. The most common programming language is Java, but scripting languages are also supported via Hadoop streaming.
  2. Any programming language that can comply with Map Reduce concept can be supported.
  3. Only Java supported since Hadoop was written in Java.
  4. Currently Map Reduce supports Java, C, C++ and COBOL.

Answer : A
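
With Hadoop Streaming, any executable that reads stdin and writes stdout can act as the mapper or reducer. A typical invocation (the jar path and the Python scripts are illustrative):

    hadoop jar $HADOOP_HOME/share/hadoop/tools/lib/hadoop-streaming-*.jar \
      -input /data/input \
      -output /data/output \
      -mapper mapper.py \
      -reducer reducer.py \
      -file mapper.py -file reducer.py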

 
 
43) Which node performs housekeeping functions for the NameNode?

  1. DataNode
  2. NameNode
  3. Secondary NameNode
  4. Edge Node

Answer : C

 
 
44) What is the implementation language of the Hadoop MapReduce framework?

  1. Java
  2. C
  3. FORTRAN
  4. Python

Answer : A

 
 
45) If the NameNode is down and a job is submitted

  1. It will connect to the Secondary NameNode to process the job
  2. It will wait until the NameNode comes up
  3. It gets the files from the local disk
  4. The job will fail

Answer : D

 
 
46) The Reducer class defines

  1. How to process one key at a time
  2. How to process multiple keys together
  3. Depending on the logic, anything can be done
  4. Depends on the number of keys

Answer : A
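
The framework calls reduce() once per key, passing all of that key's values together. A minimal sketch:

    import java.io.IOException;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Reducer;

    public class SumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
      @Override
      protected void reduce(Text key, Iterable<IntWritable> values, Context context)
          throws IOException, InterruptedException {
        int sum = 0;
        for (IntWritable v : values) {
          sum += v.get(); // every value here belongs to the same single key
        }
        context.write(key, new IntWritable(sum));
      }
    }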

 
 
47) The number of partitions is equal to the

  1. number of reducers
  2. number of mappers
  3. number of input splits
  4. number of output directories

Answer : A

 
 
48) A combiner class can be created by extending

  1. Combiner Class
  2. Mapper class
  3. Reducer Class
  4. Partitioner Class

Answer : C
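
Because a combiner is a Reducer subclass, a reducer whose logic is associative and commutative (such as the SumReducer sketched above) can be reused directly when configuring the job:

    job.setCombinerClass(SumReducer.class); // a combiner extends Reducer
    job.setReducerClass(SumReducer.class);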

 
 
49) To customize partitioning, one needs to implement

  1. logic written in the Mapper
  2. logic written in the Reducer
  3. a Partitioner
  4. a Combiner

Answer : C
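
A minimal custom Partitioner sketch, routing keys by their first character so that related keys reach the same reducer (the routing rule is illustrative); it is wired into the job with job.setPartitionerClass(FirstLetterPartitioner.class):

    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Partitioner;

    public class FirstLetterPartitioner extends Partitioner<Text, IntWritable> {
      @Override
      public int getPartition(Text key, IntWritable value, int numReduceTasks) {
        String s = key.toString();
        int first = s.isEmpty() ? 0 : s.charAt(0); // char promotes to a non-negative int
        return first % numReduceTasks;
      }
    }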

 
 
50) The output of the reducer is written to

  1. a temp directory
  2. HDFS
  3. the local disk
  4. None of the above

Answer : B

 
 
51) The default input type in MapReduce is JSON.

  1. TRUE
  2. FALSE

Answer : B

 
 
52) A JobTracker runs in its own JVM process.

  1. TRUE
  2. FALSE

Answer : A

 
 
53) The MapReduce programming model is inspired by functional languages and targets data-intensive computations.

  1. TRUE
  2. FALSE

Answer : A

 
 
54) The output of a MapReduce process is a set of <key, value, type> triples.

  1. TRUE
  2. FALSE

Answer : B

 
 
55) The Map function is applied on the input data and produces a list of intermediate <key,value> pairs.

  1. TRUE
  2. FALSE

Answer : A

 
 
56) The Reduce function is applied to all intermediate pairs with different keys.

  1. TRUE
  2. FALSE

Answer : B

 
 
57) By default, the MapReduce framework stores results in HDFS.

  1. TRUE
  2. FALSE

Answer : A