Big Data Quiz

Big Data Quiz: This beginner-level Big Data and Hadoop quiz contains a set of 60 questions that will help you prepare for any exam designed for beginners.

1) Big Data refers to datasets that grow so large that it is difficult to capture, store, manage, share, analyze and visualize with the typical database software tools.

TRUE
FALSE

Answer : A

2) Default block size in HDFS (Hadoop 1.x) is ____________

128 KB
64 MB
64 KB
128 MB

Answer : B

3) Which of the following statements is/are TRUE regarding Hadoop?
i) Performs best with a ‘modest’ number of large files
ii) Performs best with a large number of small files

i)
ii)
Both i) & ii)
none of the above

Answer : A
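
One way to see why a modest number of large files suits Hadoop better is to count what the NameNode has to keep track of. The sketch below is only an illustration: it assumes the 64 MB block size used in this quiz and treats every file and every block as one tracked object. The function name and file sizes are made up for the example.

```python
import math

MB = 1024 * 1024
BLOCK_SIZE = 64 * MB  # block size assumed in this quiz


def namenode_objects(file_sizes, block_size=BLOCK_SIZE):
    """Count file entries plus block entries for a list of non-empty files."""
    blocks = sum(math.ceil(size / block_size) for size in file_sizes)
    return len(file_sizes) + blocks


one_large_file = [1024 * MB]          # a single 1 GB file
many_small_files = [1 * MB] * 1024    # the same data as 1024 files of 1 MB

print(namenode_objects(one_large_file))    # 17
print(namenode_objects(many_small_files))  # 2048
```

The same amount of data costs far more metadata when it is spread over many small files, because each small file still needs its own file entry and at least one block.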

4) By default, each block is replicated _______ times

Answer : C

5) A large block size makes data transfer more efficient, because seek time becomes small relative to transfer time.

TRUE
FALSE

Answer : A

6) Which of the following is NOT a daemon process?

secondarynamenode
jobtracker
tasktracker
mapreducer

Answer : D

7) SPOF (single point of failure) can be handled by using _________

secondarynamenode
backupserver
jobtracker
passive nodes

Answer : D

8) Clients access the blocks directly from ________for read and write

data nodes
name node
secondarynamenode
none of the above

Answer : A

9) Information about locations of the blocks of a file is stored at __________

data nodes
name node
secondarynamenode
none of the above

Answer : B

10) What makes data into Big Data?

volume
velocity
variety
all of the above

Answer : D

11) Which of the following statement(s) is/are TRUE?
i) Hadoop is comprised of five separate daemons.
ii) Each daemon runs in its own Java Virtual Machine (JVM).

Only ii)
Only i)
Only i) & ii)
All i), ii) & iii)

Answer : C

12) Hadoop is ‘Rack-aware’ and HDFS replicates data blocks on nodes on different racks

TRUE
FALSE

Answer : A

13) Which node stores the checksum?

datanode
secondarynamenode
namenode
all of the above

Answer : A

14) MapReduce programming model is _____________

Platform Dependent but not language-specific
Neither platform- nor language-specific
Platform independent but language-specific
Platform Dependent and language-specific

Answer : B

15) Which is optional in a MapReduce program?

Mapper
Reducer
both are optional
both are mandatory

Answer : B

16) TaskTrackers reside on _________ and run ________ tasks.

datanode, map/reduce
datanode, reducer
datanode, mapper
namenode, map/reduce

Answer : A

17) The Hadoop API uses its own types such as LongWritable, Text and IntWritable in place of basic Java types. They have almost the same features as the default Java classes. What are these Writable data types optimized for?

file system storage
network transmissions
data retrieval
all of the above

Answer : B

18) What is the default input format?

sequencefileformat
BinaryFileFormat
TextInputFormat
none of the above

Answer : C
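
With TextInputFormat, each record handed to the mapper has the byte offset of a line as its key and the line's text as its value. The following Python sketch mimics that behaviour on an in-memory byte string. It is a simplified illustration, not Hadoop code, and the function name is made up for the example.

```python
def text_input_records(data: bytes):
    """Yield (byte_offset, line) pairs, one per line of the input."""
    offset = 0
    for raw_line in data.splitlines(keepends=True):
        # The value is the line without its terminator; the key is where it starts.
        yield offset, raw_line.rstrip(b"\r\n").decode("utf-8")
        offset += len(raw_line)


records = list(text_input_records(b"big data\nhadoop\nhive\n"))
print(records)  # [(0, 'big data'), (9, 'hadoop'), (16, 'hive')]
```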

19) Which is TRUE about HIVE?

No support for update and delete
No support for singleton inserts
Correlated sub queries are not supported
all of the above

Answer : D

20) Sqoop is a tool which can be used to

Import tables from an RDBMS into HDFS
Export files from HDFS into RDBMS tables
Connect to the database through a JDBC interface
all of the above

Answer : D

21) Which tool can be used to transfer data from Microsoft SQL Server databases to Hadoop or HIVE?

HBASE
PIG
SQOOP
Flume

Answer : C

22) _________ is a distributed, reliable, available service for efficiently moving large amounts of data as it is produced

FLUME
SQOOP
PIG
HIVE

Answer : A

23) ________ is a workflow engine that runs on a server, typically outside the cluster

Oozie
Zookeeper
Chukwa
Mahout

Answer : A

24) Custom OutputFormats must provide a __________ implementation

InputWriter
RecordWriter
OutWriter
WritableComparable

Answer : B

25) Combiner is

Like a ‘mini-Reducer’
Runs locally on a single Mapper’s output
Output from the Combiner is sent to the Reducers
all of the above

Answer : D
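
The three properties listed above can be seen in a small word-count simulation. This is a plain Python sketch rather than Hadoop code: the function names and sample lines are invented for the example, and each list in `mapper_outputs` stands in for the output of one mapper.

```python
from collections import Counter


def map_words(line):
    """Mapper: emit a (word, 1) pair for every word in the line."""
    return [(word, 1) for word in line.split()]


def combine(pairs):
    """Combiner: sum the counts locally, on a single mapper's output."""
    local = Counter()
    for word, count in pairs:
        local[word] += count
    return sorted(local.items())


def reduce_counts(pairs):
    """Reducer: sum the (possibly pre-summed) counts for each word."""
    totals = Counter()
    for word, count in pairs:
        totals[word] += count
    return dict(totals)


mapper_outputs = [map_words("big data big cluster"), map_words("data data node")]
combined = [combine(pairs) for pairs in mapper_outputs]

# Only the combiner output travels to the reducers.
shuffled = [pair for output in combined for pair in output]

print(sum(len(pairs) for pairs in mapper_outputs))  # 7 pairs without a combiner
print(len(shuffled))                                # 5 pairs with a combiner
print(reduce_counts(shuffled))
```

The reducer produces the same totals either way; the combiner only reduces the number of pairs that have to be sent across the network.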

26) Which of the following gives better load balancing and helps avoid potential performance issues?

custom Partitioner
custom combiner
custom reducer
use more reducers

Answer : A
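
To illustrate how a partitioner affects load balancing, the sketch below compares hash-style partitioning (non-negative hash modulo the number of reducers) with a custom rule. It assumes integer keys whose hash is the value itself, and a deliberately skewed key set in which every key is a multiple of 4. The custom rule is hypothetical and only makes sense for this key set.

```python
from collections import Counter

NUM_REDUCERS = 4


def hash_partition(key, num_reducers=NUM_REDUCERS):
    """Hash-style partitioning: non-negative hash modulo the reducer count."""
    return (key & 0x7FFFFFFF) % num_reducers


def custom_partition(key, num_reducers=NUM_REDUCERS):
    """Hypothetical custom rule that relies on every key being a multiple of 4."""
    return (key // 4) % num_reducers


keys = list(range(0, 32, 4))  # 0, 4, 8, ..., 28

print(Counter(hash_partition(k) for k in keys))    # every key lands on reducer 0
print(Counter(custom_partition(k) for k in keys))  # two keys on each reducer
```

Both functions still send identical keys to the same reducer, which is the property a partitioner must keep; only the spread across reducers changes.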

27) Anything written using the OutputCollector.collect method will be written to __________

Local file system
HDFS
Windows file systems only
none of the above

Answer : B

28) Which component of the HIVE architecture submits the individual map-reduce jobs from the DAG to the Execution Engine?

compiler
optimizer
driver
none of the above

Answer : C

29) Which HIVE command will load data from an HDFS file/directory to the table?

LOAD DATA INPATH '/user/myname/AB.txt' OVERWRITE INTO TABLE invites PARTITION (ds='2008-08-15');
LOAD DATA LOCAL INPATH '/user/myname/AB.txt' OVERWRITE INTO TABLE invites PARTITION (ds='2008-08-15');
Both statements are correct
none of the above

Answer : A

30) Which HIVE command will display tables created by user?

show table;
select * from tab;
show tables;
none of the above

Answer : C

31) Which HIVE file format is not splittable after compression?

RCFILE
SEQUENCEFILE
TEXTFILE
all of the above

Answer : C

32) HIVE command: LOAD DATA INPATH '/user/myname/log.txt' INTO TABLE mylog;

Load the data from local file ‘/user/myname/log.txt’ to table mylog
Load the data from HDFS file ‘/user/myname/log.txt’ to table mylog
Overwrite the data from local file ‘/user/myname/log.txt’ to table mylog
none of the above

Answer : B

33) HIVE command: LOAD DATA LOCAL INPATH '/examples/files/ab1.txt' OVERWRITE INTO TABLE sample;

Load the data from local file ‘/examples/files/ab1.txt’ to table sample
Load the data from HDFS file ‘/examples/files/ab1.txt’ to table sample
Overwrites the data from local file ‘/examples/files/ab1.txt’ to table sample
all of the above

Answer : C

34) A join operation is performed at ___________

mapper
reducer
shuffle and sort
none of the above

Answer : B
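
A reduce-side join can be sketched in a few lines of Python. The mappers tag each record with its source, the shuffle groups records by the join key, and the reducer pairs the two sides. The sample tables, tags and function names below are invented for illustration.

```python
from collections import defaultdict

# Hypothetical sample records: (user_id, name) and (user_id, item)
users = [(1, "amit"), (2, "ravi")]
orders = [(1, "book"), (1, "pen"), (2, "lamp"), (3, "cup")]


def map_tagged(records, tag):
    """Mapper: emit (join_key, (tag, value)) so the reducer can tell sources apart."""
    return [(key, (tag, value)) for key, value in records]


def shuffle(pairs):
    """Shuffle and sort: group all values by key, with keys in sorted order."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return sorted(groups.items())


def reduce_join(key, tagged_values):
    """Reducer: pair every user value with every order value for this key."""
    names = [value for tag, value in tagged_values if tag == "user"]
    items = [value for tag, value in tagged_values if tag == "order"]
    return [(key, name, item) for name in names for item in items]


pairs = map_tagged(users, "user") + map_tagged(orders, "order")
joined = [row for key, values in shuffle(pairs) for row in reduce_join(key, values)]
print(joined)  # [(1, 'amit', 'book'), (1, 'amit', 'pen'), (2, 'ravi', 'lamp')]
```

The order for user 3 is dropped because no matching user record reaches the reducer, which gives inner-join behaviour.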

35) When is the earliest that the reduce() method of any reduce task in a given job is called?

immediately after all map tasks have completed
As soon as a map task emits at least one record
As soon as at least one map task has finished processing its complete input split
none of the above

Answer : A

36) You have built a MapReduce job that denormalizes a very large table, resulting in an extremely large amount of output data. Which two cluster resources will your job stress the most?

RAM, Network
Network, Disk I/O
CPU, RAM
all of the above

Answer : B

37) You have 10 files in the directory /user/amit/example. Each file is 640 MB. You submit a MapReduce job with /user/amit/example as the input path. How much data does a single mapper process?

all files in the directory
A single file
A single input split
none of the above

Answer : C
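
For the scenario in this question, the number of input splits (and therefore map tasks) can be estimated with a short calculation. The sketch assumes that the split size equals the 64 MB block size used in this quiz and that a split never spans two files. The function name is made up for the example.

```python
import math

MB = 1024 * 1024


def count_splits(file_sizes, split_size):
    """Count input splits, assuming a split never spans two files."""
    return sum(math.ceil(size / split_size) for size in file_sizes)


files = [640 * MB] * 10                 # the ten 640 MB files from the question
splits = count_splits(files, 64 * MB)   # assumes split size == 64 MB block size
print(splits)  # 100
```

Under these assumptions the job runs 100 map tasks, and each mapper reads exactly one 64 MB split rather than a whole file or the whole directory.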

38) ___________ ensures that no (key, value) pair is processed more than once

InputSplit
RecordReader
mapper
reducer

Answer : B

39) ___________ reads the record and passes it to the mapper

RecordReader
reducer
InputSplit
none of the above

Answer : A

40) Which of the following is the correct sequence of operations for a MR job?

RecordReader, shuffle and sort, mapper, reducer, InputSplit
InputSplit, RecordReader, mapper, shuffle and sort, reducer
InputSplit, RecordReader, reducer, shuffle and sort, mapper
none of the above

Answer : B
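
The sequence in the correct answer can be traced with a small in-memory simulation. This is a simplified Python sketch, not Hadoop code: splits are cut by line count instead of bytes, the record key is a line number within the split, and the sample data and function names are invented.

```python
from itertools import groupby


def input_splits(lines, lines_per_split):
    """InputSplit: cut the input into chunks (by line count, for simplicity)."""
    return [lines[i:i + lines_per_split] for i in range(0, len(lines), lines_per_split)]


def record_reader(split):
    """RecordReader: turn a split into (key, value) records for the mapper."""
    for line_number, line in enumerate(split):
        yield line_number, line


def mapper(_key, line):
    """Mapper: parse 'city,temperature' and emit (city, temperature)."""
    city, temp = line.split(",")
    yield city, int(temp)


def shuffle_and_sort(pairs):
    """Shuffle and sort: order by key and group the values of each key."""
    ordered = sorted(pairs, key=lambda kv: kv[0])
    return [(key, [v for _, v in group])
            for key, group in groupby(ordered, key=lambda kv: kv[0])]


def reducer(city, temps):
    """Reducer: keep the maximum temperature seen for each city."""
    return city, max(temps)


lines = ["pune,31", "delhi,40", "pune,35", "delhi,38"]
mapped = [kv
          for split in input_splits(lines, 2)
          for key, value in record_reader(split)
          for kv in mapper(key, value)]
result = [reducer(city, temps) for city, temps in shuffle_and_sort(mapped)]
print(result)  # [('delhi', 40), ('pune', 35)]
```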

41) Which daemon distributes the individual tasks to the data nodes?

tasktracker
jobtracker
namenode
datanode

Answer : B

42) __________ allows the mapper to interact with the rest of the Hadoop system

Context object
InputSplit
RecordReader
Shuffle and Sort

Answer : A

43) How many instances of JobTracker can run on a Hadoop Cluster?

only one
maximum two
any number but should not be more than number of datanodes
none of the above

Answer : A

44) How many instances of TaskTracker run on a Hadoop cluster?

unlimited TaskTrackers on each datanode
one TaskTracker for each datanode
maximum two TaskTrackers for each datanode
none of the above

Answer : B

45) Which PIG LATIN statement is used for per-record transformation of data (projection)?

JOIN
FOREACH – GENERATE
FLATTEN
FILTER

Answer : B

46) Which PIG statement is used to remove nesting?

JOIN
FOREACH – GENERATE
FLATTEN
none of the above

Answer : C

47) Consider a relation that has a tuple of the form (a,(b,c)). What is the output if we apply the statement GENERATE $0, FLATTEN($1)?

(a,b,c)
(a,b) and (a,c)
invalid operation
none of the above

Answer : A
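
The difference between the first two options comes down to whether the nested field is a tuple or a bag. The Python sketch below mimics both cases: flattening a tuple splices its fields into the parent record, while flattening a bag produces one output record per element. Python tuples stand in for Pig tuples and a list stands in for a bag; the function names are made up for the example.

```python
def flatten_tuple_field(record, index):
    """Mimic FLATTEN on a nested tuple: splice its fields into the parent tuple."""
    nested = record[index]
    return record[:index] + tuple(nested) + record[index + 1:]


def flatten_bag_field(record, index):
    """Mimic FLATTEN on a bag: emit one output tuple per element of the bag."""
    return [record[:index] + tuple(item) + record[index + 1:]
            for item in record[index]]


print(flatten_tuple_field(("a", ("b", "c")), 1))       # ('a', 'b', 'c')
print(flatten_bag_field(("a", [("b",), ("c",)]), 1))   # [('a', 'b'), ('a', 'c')]
```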

48) Which command invokes the Grunt shell in local mode, using the local file system?

pig
pig -x local
pig local
all of the above

Answer : B

49) ______ is currently a better choice for low-latency access.

HBase
HIVE
PIG
all of the above

Answer : A

50) The port number used to view namenode and dfshealth information in the browser is ________

50070
50060
50030
none of the above

Answer : A

51) To view the jobtracker web page, open ________ in the browser

http://localhost:50070/
http://localhost:50060/
http://localhost:50030/
none of the above

Answer : C

52) To view the tasktracker web page, open ________ in the browser

http://localhost:50070/
http://localhost:50060/
http://localhost:50030/
none of the above

Answer : B

53) Which daemon processes must run on the namenode?

tasktracker and jobtracker
namenode and jobtracker
namenode and secondarynamenode
none of the above

Answer : B

54) Which daemon processes must run on a datanode?

tasktracker and datanode
namenode and jobtracker
datanode and secondarynamenode
tasktracker and jobtracker

Answer : A

55) Which daemon process must run on the secondarynamenode?

tasktracker
namenode
secondarynamenode
datanode

Answer : C

56) Hadoop was named after the toy elephant of Doug Cutting’s son.

TRUE
FALSE

Answer : A

57) Which of the following accurately describe Hadoop?

distributed computing approach
open source
java based
all of the above

Answer : D

58) We can update and delete rows of a table in HIVE.

TRUE
FALSE

Answer : B

59) HIVE is NOT designed for

OLTP
low latency applications
user facing/interactive applications
all of the above

Answer : D

60) In HIVE, which of these will create a directory in HDFS?

table
partition
bucket
all of the above

Answer : D

About The Author

admin
