MapReduce Beginner Quiz
This MapReduce quiz contains a set of beginner-level multiple-choice questions on MapReduce and HDFS to help you test your knowledge.
1) A large block size makes data transfer more efficient?
- TRUE
- FALSE
2) Clients access the blocks directly from ________ for read and write
- data nodes
- name node
- secondarynamenode
- none of the above
3) Information about locations of the blocks of a file is stored at __________
- data nodes
- name node
- secondarynamenode
- none of the above
4) Hadoop is ‘Rack-aware’ and HDFS replicates data blocks on nodes on different racks
- TRUE
- FALSE
5) Which node stores the checksum?
- datanode
- secondarynamenode
- namenode
- all of the above
6) The MapReduce programming model is _____________
- Platform Dependent but not language-specific
- Neither platform- nor language-specific
- Platform independent but language-specific
- Platform Dependent and language-specific
7) Which is optional in a MapReduce program?
- Mapper
- Reducer
- both are optional
- both are mandatory
8) TaskTrackers reside on _________ and run ________ tasks.
- datanode, map/reduce
- datanode, reducer
- datanode, mapper
- namenode, map/reduce
9) Sqoop is a tool which can be used to
- Import tables from an RDBMS into HDFS
- Export files from HDFS into RDBMS tables
- Use a JDBC interface
- all of the above
10) _________ is a distributed, reliable, available service for efficiently moving large amounts of data as it is produced
- FLUME
- SQOOP
- PIG
- HIVE
11) ________ is a workflow engine that runs on a server, typically outside the cluster
- Oozie
- Zookeeper
- Chukwa
- Mahout
12) For better load balancing and to avoid potential performance issues, use
- a custom Partitioner
- a custom Combiner
- a custom Reducer
- more reducers
13) Anything written using the OutputCollector.collect method will be written to __________
- Local file system
- HDFS
- Windows file systems only
- none of the above
14) A join operation is performed at the ___________
- mapper
- reducer
- shuffle and sort
- none of the above
15) When is the earliest that the reduce() method of any reduce task in a given job is called?
- immediately after all map tasks have completed
- As soon as a map task emits at least one record
- As soon as at least one map task has finished processing its complete input split
- none of the above
16) Which daemon distributes the individual tasks to data nodes?
- tasktracker
- jobtracker
- namenode
- datanode
17) The __________ object allows the mapper to interact with the rest of the Hadoop system
- Context object
- InputSplit
- Recordreader
- Shuffle and Sort
18) How many instances of JobTracker can run on a Hadoop Cluster?
- only one
- maximum two
- any number, but not more than the number of datanodes
- none of the above
19) How many instances of TaskTracker run on a Hadoop cluster?
- unlimited TaskTrackers on each datanode
- one TaskTracker for each datanode
- a maximum of 2 TaskTrackers for each datanode
- none of the above
20) Which daemon processes must run on the namenode?
- tasktracker and jobtracker
- namenode and jobtracker
- namenode and secondarynamenode
- none of the above
21) Which daemon processes must run on the datanode?
- tasktracker and datanode
- namenode and jobtracker
- datanode and secondarynamenode
- tasktracker and jobtracker
22) Which daemon process must run on the secondarynamenode?
- tasktracker
- namenode
- secondarynamenode
- datanode
23) What is the command for checking disk usage in Hadoop?
- hadoop fs -disk -space
- hadoop fs -diskusage
- hadoop fs -du
- None of the above
24) Which one is not a master daemon?
- Namenode
- Jobtracker
- Tasktracker
- None of these.
25) Does HDFS allow appends to files through the command line?
- True
- False
26) HDFS works on the principle of
- Write Once, Read Many
- Write Many, Read Many
- Write Many, Read Once
- None
27) The data node decides where to store the data.
- True
- False
28) Reading is parallel and writing is not parallel in HDFS
- True
- False
29) ___ is the command to check for various inconsistencies in HDFS
- FSCK
- FETCHDT
- SAFEMODE
- SAFEANDRECOVERY
30) The hadoop fsck command is used to
- Check the integrity of HDFS
- Check the status of data nodes in the cluster
- Check the status of the NameNode in the cluster
- Check the status of the Secondary NameNode
31) How can you determine the available HDFS space in your cluster?
- using the hadoop dfsadmin -report command
- using hadoop fsck / command
- using secondary namenode web UI
- Using Data Node Web UI
32) How does the HDFS architecture provide data reliability?
- Reliance on SAN devices as a DataNode interface.
- Storing multiple replicas of data blocks on different DataNodes
- DataNodes make copies of their data blocks, and put them on different local disks.
- Reliance on RAID on each DataNode.
33) The path where HDFS data will be stored is specified in which of the following files?
- hdfs-site.xml
- yarn-site.xml
- mapred-site.xml
- core-site.xml
34) What is the default partitioning mechanism?
- Round Robin
- User needs to configure
- Hash Partitioning
- None
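For context on question 34: Hadoop's default partitioner is the HashPartitioner, which assigns each intermediate key to a reducer by hashing it. A minimal Python sketch of the idea, using Python's built-in hash() as a stand-in for Java's key.hashCode():

```python
def hash_partition(key, num_reducers):
    # Mirrors Hadoop's HashPartitioner logic:
    # (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks.
    # Masking keeps the value non-negative before the modulo.
    return (hash(key) & 0x7FFFFFFF) % num_reducers

# Every occurrence of the same key maps to the same partition,
# so a single reducer sees all values for that key.
assert hash_partition("hadoop", 4) == hash_partition("hadoop", 4)
assert 0 <= hash_partition("hdfs", 4) < 4
```

This is why hash partitioning gives a roughly even spread of keys across reducers without any user configuration.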
35) Is it possible to change the HDFS block size?
- TRUE
- FALSE
36) The NameNode contains
- Metadata and all data blocks
- Metadata and recently used blocks
- Metadata only
- None of the above
37) The NameNode can be formatted at any time without data loss
- TRUE
- FALSE
38) How do you list the files in an HDFS directory?
- ls
- hadoop ls
- hadoop fs -ls
- hadoop ls -fs
39) HDFS achieves high availability and fault tolerance through
- Splitting files into blocks
- Keeping a copy of frequently accessed data blocks in the NameNode
- Replicating blocks on multiple data nodes across the cluster
- None of the above
40) The NameNode keeps both metadata and data files
- TRUE
- FALSE
41) Big Data poses challenges to traditional systems in terms of
- Network bandwidth
- Operating system
- Storage and processing
- None of the above
42) What are the supported programming languages for MapReduce?
- The most common programming language is Java, but scripting languages are also supported via Hadoop Streaming.
- Any programming language that can comply with the MapReduce concept can be supported.
- Only Java is supported, since Hadoop was written in Java.
- Currently, MapReduce supports Java, C, C++ and COBOL.
43) Which node performs housekeeping functions for the NameNode?
- Datanode
- namenode
- Secondary NameNode
- Edge Node
44) What is the implementation language of the Hadoop MapReduce framework?
- Java
- C
- FORTRAN
- Python
45) If the NameNode is down and a job is submitted
- It will connect with the Secondary NameNode to process the job
- It waits until the NameNode comes up
- It gets files from the local disk
- The job will fail
46) The Reducer class defines
- How to process one key at a time
- How to process multiple keys together
- Depends on the logic; anything can be done
- Depends on the number of keys
47) The number of partitions is equal to the
- Number of Reducers
- Number of Mappers
- Number of Input Split
- Number of output directories
48) A combiner class can be created by extending
- Combiner Class
- Mapper class
- Reducer Class
- Partitioner Class
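For context on question 48: a combiner is created by extending the Reducer class because it performs reducer-style aggregation locally on each mapper's output before the shuffle. A minimal Python sketch of what a word-count combiner does (the combine function name is illustrative, not a Hadoop API):

```python
from collections import defaultdict

def combine(mapper_output):
    # A combiner applies reducer-style aggregation to one mapper's
    # local output, shrinking the data sent across the network.
    partial = defaultdict(int)
    for key, value in mapper_output:
        partial[key] += value
    return list(partial.items())

# One mapper emitted ("the", 1) three times; the combiner collapses
# them into a single ("the", 3) record before the shuffle.
records = [("the", 1), ("cat", 1), ("the", 1), ("the", 1)]
assert dict(combine(records)) == {"the": 3, "cat": 1}
```

Because the combiner's input and output types match the reducer's, the same aggregation logic (and hence the Reducer class) can usually be reused for it.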
49) For custom partitioning, one needs to implement
- Logic written in the Mapper
- Logic written in the Reducer
- a Partitioner
- a Combiner
50) The output of the reducer is written to
- a temp directory
- HDFS
- Local disk
- None of the above
51) The default input type in MapReduce is JSON.
- TRUE
- FALSE
52) A JobTracker runs in its own JVM process.
- TRUE
- FALSE
53) The MapReduce programming model is inspired by functional languages and targets data-intensive computations.
- TRUE
- FALSE
54) The output of a MapReduce process is a set of <key, value, type> triples.
- TRUE
- FALSE
55) The Map function is applied on the input data and produces a list of intermediate <key,value> pairs.
- TRUE
- FALSE
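The Map-produces-intermediate-pairs model described above can be sketched in a few lines of Python (a simulation of the programming model only, not the Hadoop framework itself):

```python
from itertools import groupby
from operator import itemgetter

def map_fn(line):
    # Map: emit one intermediate <key, value> pair per word.
    return [(word, 1) for word in line.split()]

def reduce_fn(key, values):
    # Reduce: aggregate all values that share a single key.
    return (key, sum(values))

lines = ["hello world", "hello hadoop"]
intermediate = [pair for line in lines for pair in map_fn(line)]
# Shuffle/sort: group intermediate pairs by key, as the framework
# does between the map and reduce phases.
intermediate.sort(key=itemgetter(0))
results = dict(
    reduce_fn(k, (v for _, v in group))
    for k, group in groupby(intermediate, key=itemgetter(0))
)
assert results == {"hello": 2, "world": 1, "hadoop": 1}
```

Note that each call to reduce_fn sees exactly one key with all of its values, which matches the model tested in questions 55 and 56.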
56) The Reduce function is applied to all intermediate pairs with different keys.
- TRUE
- FALSE
57) By default, the MapReduce framework stores results in HDFS.
- TRUE
- FALSE