Big Data Quiz
Big Data Quiz: This expert-level Big Data and Hadoop quiz contains a set of 60 questions that will help you prepare for any exam designed for experts.
1) What is an identity mapper in Hadoop?
- When the same mapper runs on different datasets in the same job.
- When the same mapper runs on different datasets in different jobs.
- When no mapper is defined for a job, the identity mapper class is used.
- Both b and c are correct.
2) Which is the default scheduler used in the MapReduce framework?
- Capacity scheduler.
- Fair scheduler.
- Job scheduler.
- Lazy scheduler.
3) You are executing Pig in MapReduce mode and you want to execute in local mode. How can you achieve it?
- pig -x local
- pig -x mapreduce
- Both are correct
- None of these
4) The process in Hadoop 2.0 that scales the NameNode horizontally is known as
- Federation
- Fencing
- Namenode HA
- None of these.
5) What is true about the partitioner in a MapReduce job?
- The number of partitions is the same as the number of reducers.
- The default partitioner in MapReduce is the hash partitioner.
- All of these.
- None of these.
6) Which is not a role of the reporter in a MapReduce program?
- Reporting the progress of mappers and reducers.
- Setting application-level status messages.
- Updating counters.
- Helping relaunch failed jobs with the help of counters.
7) What does mapred.tip.id refer to at debugging time?
- Id of the mapper currently running.
- Id of the reducer currently running.
- Id of the task currently running.
- Id of the job currently running.
8) Which scheduler below supports multiple queues?
- Fair scheduler
- Capacity scheduler.
- Lazy scheduler.
- Default scheduler.
9) How can you set a debug script in a Hadoop MapReduce job?
- JobConf.setMapDebugScript(String)
- JobConf.setReduceDebugScript(String)
- JobConf.setJobScript(String)
- None of these.
10) What is the default data type in Pig?
- Bytearray
- Chararray
- Textarray
- None
11) Which operator in Pig is not associated with loading and storing?
- Load
- Store
- Dump
- Split
12) How can you enable the MemStore-Local Allocation Buffer, a feature that works to prevent heap fragmentation under heavy write loads?
- hbase.hregion.memstore.mslab.enabled
- hbase.hregion.memstore.enabled
- hbase.hregion.memstore.mslab.job.enabled
- none
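For reference, the MSLAB feature is toggled with that property in hbase-site.xml; a minimal sketch of the setting (surrounding configuration omitted):

```xml
<!-- hbase-site.xml: enable the MemStore-Local Allocation Buffer (MSLAB) -->
<property>
  <name>hbase.hregion.memstore.mslab.enabled</name>
  <value>true</value>
</property>
```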
13) Which hashing algorithm is used for the hash function?
- Murmur
- Lazy
- Default
- None.
14) Which is the policy configuration file used by RPC servers to make authorization decisions on client requests?
- hadoop.policy.file= hbase-policy.xml
- hadoop.policy.file.apache= hbase-policy.xml
- hadoop.policy.file.enable= hbase-policy.xml
- none
15) How can you specify the destination directory in Sqoop?
- --target-dir <dir>
- --destination-dir <dir>
- --hdfs-dir <dir>
- All of the above
16) How can you enable compression in Sqoop?
- -z
- --compress
- Both a and b
- None.
17) Say you are importing from PostgreSQL through Sqoop in direct mode; you can split the import into separate files after individual files reach a certain size. How can you do it?
- --direct-split-size
- --input-split-size
- --postgresql-split-size
- None
18) What is the default port to access name node web UI?
- 50060
- 50050
- 50070
- None of the above
19) How many methods does the Writable interface define?
- Two
- Three
- Six
- None of the above
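For context, Hadoop's Writable interface defines exactly two methods, write(DataOutput) and readFields(DataInput). A rough Python analogy of that two-method contract (the IntPair class and its field layout are illustrative, not part of any Hadoop API):

```python
import struct
from io import BytesIO

class IntPair:
    """Python analogy of the Writable contract: one method to serialize
    the object's fields, one to repopulate them from a stream."""
    def __init__(self, first=0, second=0):
        self.first, self.second = first, second

    def write(self, out):
        # Mirrors Writable.write(DataOutput): serialize fields in order.
        out.write(struct.pack(">ii", self.first, self.second))

    def read_fields(self, inp):
        # Mirrors Writable.readFields(DataInput): repopulate this object.
        self.first, self.second = struct.unpack(">ii", inp.read(8))

buf = BytesIO()
IntPair(3, 7).write(buf)
buf.seek(0)
p = IntPair()
p.read_fields(buf)
print(p.first, p.second)  # 3 7
```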
20) What are supported programming languages for Map Reduce?
- The most common programming language is Java, but scripting languages are also supported via Hadoop streaming.
- Any programming language that can comply with Map Reduce concept can be supported.
- Only Java supported since Hadoop was written in Java.
- Currently Map Reduce supports Java, C, C++ and COBOL.
21) What is a map-side join?
- Map-side join is a technique in which data is eliminated at the map step
- Map-side join is done in the map phase and done in memory
- Map-side join is a form of map-reduce API which joins data from different locations
- None of these answers are correct
22) What is a reduce-side join?
- Reduce-side join is a technique to eliminate data from initial data set at reduce step
- Reduce-side join is a technique for merging data from different sources based on a specific key. There are no memory restrictions
- Reduce-side join is a set of API to merge data from different sources.
- None of these answers are correct
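To make the reduce-side join idea concrete, here is a minimal Python sketch of the pattern: records from two sources are tagged, grouped by key (as the shuffle phase would do), and merged per key in the reduce step. All dataset contents below are made up for illustration:

```python
from collections import defaultdict

# Records from two "sources", tagged so the reduce step knows their origin.
users  = [("u1", "Alice"), ("u2", "Bob")]
orders = [("u1", "book"), ("u1", "pen"), ("u2", "mug")]

# Shuffle phase analogy: group every tagged record by its join key.
grouped = defaultdict(list)
for k, v in users:
    grouped[k].append(("user", v))
for k, v in orders:
    grouped[k].append(("order", v))

# Reduce phase analogy: for each key, combine user info with order info.
joined = []
for k, records in sorted(grouped.items()):
    names = [v for tag, v in records if tag == "user"]
    items = [v for tag, v in records if tag == "order"]
    for name in names:
        for item in items:
            joined.append((k, name, item))

print(joined)  # [('u1', 'Alice', 'book'), ('u1', 'Alice', 'pen'), ('u2', 'Bob', 'mug')]
```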
23) What is Avro?
- Avro is a Java serialization library
- Avro is a Java compression library
- Avro is a Java library that creates splittable files
- None of these answers are correct
24) Can you run MapReduce jobs directly on Avro data?
- Yes, Avro was specifically designed for data processing via Map-Reduce
- Yes, but additional extensive coding is required
- No, Avro was specifically designed for data storage only
- Avro specifies metadata that allows easier data access. This data cannot be used as part of map-reduce execution, rather input specification only.
25) Can a custom data type for MapReduce processing be implemented?
- No, Hadoop does not provide techniques for custom data types.
- Yes, but only for mappers.
- Yes, custom data types can be implemented as long as they implement the Writable interface.
- Yes, but only for reducers.
26) Which node acts as an access point for the external applications, tools, and users that need to utilize the Hadoop environment?
- DataNode
- NameNode
- JobTracker
- N/A
27) Which object can be used to get the progress of a particular job?
- Map
- Reducer
- Context
- Progress
28) Which node performs housekeeping functions for the NameNode?
- DataNode
- NameNode
- Secondary NameNode
- Edge Node
29) Which of the following utilities allows you to create and run MapReduce jobs with any executable or script as the mapper and/or the reducer?
- Oozie
- Sqoop
- Hadoop Streaming
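As background, Hadoop Streaming works by piping input records to any executable's stdin and reading tab-separated key/value pairs from its stdout. A minimal word-count mapper sketch in Python (the job itself would be launched with the streaming jar; the sample input line below is made up):

```python
import sys

def map_line(line):
    """Word-count mapper step: emit one tab-separated 'word<TAB>1' pair
    per word, the key/value format Hadoop Streaming expects on stdout."""
    return [f"{word}\t1" for word in line.split()]

def run(stream=sys.stdin):
    # In a real streaming job, Hadoop pipes the input split to stdin
    # and collects the pairs this script prints to stdout.
    for line in stream:
        for pair in map_line(line):
            print(pair)

# Simulate a tiny input split here instead of reading real stdin.
run(["big data big"])
```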
30) Which MapReduce stage serves as a barrier, where all previous stages must be completed before it may proceed?
- Combine
- Group
- Reduce
- Write
31) Which TACC resource has support for Hadoop MapReduce?
- Ranger
- Longhorn
- Lonestar
- Spur
32) What is the implementation language of the Hadoop MapReduce framework?
- Java
- C
- FORTRAN
- Python
33) Which MapReduce phase is theoretically able to utilize features of the underlying file system in order to optimize parallel execution?
- Split
- Map
- Combine
- None of the above
34) The _______ shell is used to execute Pig Latin statements.
- Execute
- Run
- Grunt
- N/A
35) The _______ operator is used to view the logical, physical, and MapReduce execution plans used to compute a relation.
- Show
- Describe
- Display
- Explain
36) Pig was developed by
- Facebook
- Yahoo
- LinkedIn
37) Pig is a
- Declarative language
- Data flow language
- Both
- N/A
38) Which of the following is not a daemon of YARN?
- Resource Manager
- Node Manager
- Application Master
- JobTracker
39) What happens when a Map task crashes while running a MapReduce job on a cluster configured with MapReduce version 1 (MRv1)?
- The framework closes the JVM instance and restarts
- The job immediately fails
- The JobTracker attempts to re-run the task on the same node
- The JobTracker attempts to re-run the task on a different node
40) Which daemon reports available slots for scheduling a Map or Reduce operation in MapReduce version 1 (MRv1)?
- TaskTracker
- JobTracker
- Secondary NameNode
- DataNode
41) How is the number of mappers determined for a MapReduce job?
- The number of mappers is calculated by the NameNode based on the number of HDFS blocks in the files.
- The developer specifies the number in the job configuration.
- The JobTracker chooses the number based on the number of available nodes.
- The number of mappers is equal to the number of InputSplits calculated by the client submitting the job.
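As a rough illustration of how split counts arise: with the default FileInputFormat, one split is produced per HDFS block, so the number of mappers for a single large file is approximately the file size divided by the block size (the sizes below are made up):

```python
import math

block_size = 128 * 1024 * 1024   # a common HDFS block size: 128 MB
file_size  = 300 * 1024 * 1024   # hypothetical 300 MB input file

# One InputSplit (and hence one mapper) per block, rounding up.
num_splits = math.ceil(file_size / block_size)
print(num_splits)  # 3
```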
42) Which daemon instantiates Java Virtual Machines in a cluster running MapReduce v1 (MRv1)?
- ResourceManager
- TaskTracker
- JobTracker
- DataNode
43) How many files does the reduce task generate?
- One file altogether
- One file per reducer
- Depends on the input file
- None
44) If the NameNode is down and a job is submitted
- It will connect to the Secondary NameNode to process the job
- It waits until the NameNode comes up
- It gets files from the local disk
- The job will fail
45) What is partitioning in MapReduce?
- Making map output into equal partitions
- When map output exceeds the limit, creating a new one
- Assigning map output keys to reducers
- None of the above
46) The Reducer class defines
- How to process one key at a time
- How to process multiple keys together
- Depends on the logic; anything can be done
- Depends on the number of keys
47) By default, the number of map tasks depends upon
- Number of machines
- Number of files
- Configurable
- Number of splits
48) How does MapReduce hide data dispersion?
- Through Map Reduce components
- By defining data as keys and values
- By clustering machines
- HDFS takes care of it
49) What is the input format for Hadoop Archive files?
- TextInputFormat
- SequenceFileInputFormat
- None of these
- There is no suitable Input Format Type
50) Which object does the map() method use to send output to the MapReduce framework?
- JobClient
- Config
- Context
- It can directly write
51) The number of partitions is equal to
- Number of Reducers
- Number of Mappers
- Number of Input Split
- Number of output directories
52) A combiner class can be created by extending
- Combiner Class
- Mapper class
- Reducer Class
- Partitioner Class
53) The distributed cache can be used to add
- A data file
- A jar file library
- Both 1 and 2
- None of the above
54) For custom partitioning, one needs to implement
- Logic to be written in the Mapper
- Logic to be written in the Reducer
- Partitioner
- Combiner
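For reference, the behavior a custom Partitioner overrides is the default HashPartitioner rule: mask the key's hash to a non-negative value, then take it modulo the number of reduce tasks. A Python sketch of that rule (Python's hash stands in for Java's hashCode, so the exact assignments differ from Hadoop's):

```python
def get_partition(key, num_reduce_tasks):
    # Equivalent in spirit to Hadoop's default
    # (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks
    return (hash(key) & 0x7FFFFFFF) % num_reduce_tasks

# Each key maps to exactly one reducer index in [0, num_reduce_tasks).
parts = [get_partition(k, 4) for k in ("apple", "banana", "cherry")]
print(parts)
```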
55) The _______________ file controls debugging metrics in Hadoop.
- core-site.xml
- properties
- hadoop-env.sh
- hadoop-metrics.properties
56) What is the default input key type for TextInputFormat?
- LongWritable
- ShortWritable
- NullWritable
- Text
57) The output of the reducer is written to
- temp directory
- HDFS
- Local disk
- None of the above
58) How do you specify UNIX time in milliseconds in Flume?
- %u
- %b
- %t
- None
59) How do you specify the long month name (January, February) in Flume?
- %b
- %B
- %M
- %l
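As context for the two Flume questions above: escape sequences such as %t (Unix time in milliseconds) and %B (the locale's long month name) expand from the event's timestamp header, typically inside an HDFS sink path. A sketch of such a configuration (the agent and sink names are illustrative):

```properties
# Illustrative Flume HDFS sink path using timestamp escape sequences:
# %B expands to the long month name, %t to Unix time in milliseconds.
agent.sinks.k1.type = hdfs
agent.sinks.k1.hdfs.path = /flume/events/%B/%t
```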