Statistical Analysis Beginner Quiz
1) The least squares regression line for predicting Y using X:
- is determined by a function of the distances between the observed Y values and the predicted Y values.
- has the smallest sum of the squared residuals of any line through the data values.
- has the sum of the residuals about the line equal to zero.
- has all of the above properties.
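A minimal sketch for checking the properties listed in question 1. The data values and the function names are hypothetical and chosen only for illustration; the fit uses the standard closed-form least squares formulas.

```python
# Hypothetical data, chosen only for illustration.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

def fit_line(xs, ys):
    """Return (intercept, slope) of the least squares line of y on x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

def sum_squared_residuals(intercept, slope, xs, ys):
    """Sum of squared vertical distances between observed and predicted y."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))

intercept, slope = fit_line(xs, ys)

# Residuals about the fitted line sum to zero (up to rounding error).
residual_sum = sum(y - (intercept + slope * x) for x, y in zip(xs, ys))

# A line with a perturbed slope has a larger sum of squared residuals.
best_ssr = sum_squared_residuals(intercept, slope, xs, ys)
other_ssr = sum_squared_residuals(intercept, slope + 0.1, xs, ys)
```

Comparing `best_ssr` with the value for any other intercept and slope gives the same result: the fitted line has the smaller sum of squared residuals.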
2) Let X and Y be random variables with corr(X,Y) = 0.7. The slope of the regression line when Y is regressed on X is
- Positive
- Negative
- 1
- 0
3) Which of the following is a test for autocorrelation?
- Durbin-Watson Test
- Hausman Test
- LM Test
- None of the above
4) Which of the following is a test for heteroscedasticity?
- LM Test
- Durbin-Watson Test
- Hausman Test
- KS Test
5) What is panel data?
- Cross sectional data
- Time series data
- Cross sectional data and Time series data
- None of the above
6) The LM test statistic follows which distribution?
- Chi-square
- F distribution
- t distribution
- None of the above
7) What is the nature of the dependent variable in a logistic regression model?
- Continuous
- Categorical
- Both
- None
8) How do you calculate the odds ratio in a logistic regression model for a one-unit change in the regressor, where β is the parameter estimate of the model?
- Exp(β)
- Log(β)
- (β)-1
- 1/ β
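A minimal sketch relating the odds ratio in question 8 to the coefficient. The intercept `beta0`, the slope `beta`, and the regressor value `x` are hypothetical values, not estimates from any real model.

```python
import math

beta0 = -1.0  # hypothetical intercept
beta = 0.7    # hypothetical parameter estimate for the regressor

def odds(x):
    """Odds p / (1 - p) implied by the logistic model at regressor value x."""
    p = 1.0 / (1.0 + math.exp(-(beta0 + beta * x)))
    return p / (1.0 - p)

x = 2.0
# Ratio of the odds after and before a one-unit increase in the regressor.
odds_ratio = odds(x + 1.0) / odds(x)
```

The ratio does not depend on the starting value `x`; it matches `math.exp(beta)` up to rounding error.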
9) MAPE stands for
- Mean average percent error
- Mean average percentile error
- Mean absolute percent error
- None of the above.
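A minimal sketch of how MAPE is usually computed. The function name `mape` and the actual and forecast values are hypothetical; the sketch assumes no actual value is zero.

```python
def mape(actual, forecast):
    """Mean of |actual - forecast| / |actual|, expressed in percent.

    Assumes every actual value is nonzero.
    """
    terms = [abs((a - f) / a) for a, f in zip(actual, forecast)]
    return 100.0 * sum(terms) / len(terms)

actual = [100.0, 200.0, 400.0]    # hypothetical observed values
forecast = [110.0, 180.0, 400.0]  # hypothetical forecasts

# Percent errors are 10%, 10% and 0%, so the mean is about 6.67%.
result = mape(actual, forecast)
```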
10) A time series consisting of uncorrelated observations with zero mean and constant variance is termed
- White noise
- Constant series
- Noise
- None of the above
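A minimal sketch of the kind of series described in question 10, simulated with independent Gaussian draws. The seed, sample size, and unit variance are assumptions made for illustration; independence is stronger than the "uncorrelated" condition in the question.

```python
import random

random.seed(0)  # fixed seed so the run is repeatable
n = 5000
series = [random.gauss(0.0, 1.0) for _ in range(n)]

# Sample mean and variance should be close to 0 and 1.
mean = sum(series) / n
variance = sum((v - mean) ** 2 for v in series) / n

# Lag-1 sample autocorrelation should be close to 0.
lag1_cov = sum((series[t] - mean) * (series[t - 1] - mean) for t in range(1, n)) / n
lag1_autocorr = lag1_cov / variance
```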
11) Time series methods
- discover a pattern in historical data and project it into the future
- include cause-effect relationships
- are useful when historical information is not available
- All of the alternatives are true.
12) Forecast errors
- are the difference in successive values of a time series
- are the differences between actual and forecast values
- should all be nonnegative
- should be summed to judge the goodness of a forecasting model
13) Gradual shifting of a time series over a long period of time is called
- periodicity
- cycle
- regression
- trend
14) Short-term, unanticipated, and nonrecurring factors in a time series provide the random variability known as
- uncertainty
- the forecast error
- the residuals
- the irregular component
15) If data for a time series analysis is collected on an annual basis only, which component may be ignored?
- trend
- seasonal
- cyclical
- irregular
16) One measure of the accuracy of a forecasting model is the
- smoothing constant
- trend component
- mean absolute deviation
- seasonal index
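A minimal sketch of the mean absolute deviation as a forecast accuracy measure, using hypothetical actual and forecast values. It also shows why signed forecast errors are a poor summary: positive and negative errors can cancel.

```python
actual = [20.0, 22.0, 25.0, 23.0]    # hypothetical observed values
forecast = [21.0, 21.0, 23.0, 25.0]  # hypothetical forecasts

# Forecast error = actual value minus forecast value.
errors = [a - f for a, f in zip(actual, forecast)]  # [-1.0, 1.0, 2.0, -2.0]

# The signed errors cancel, even though no forecast was exact.
error_sum = sum(errors)  # 0.0

# Mean absolute deviation: average size of the errors, ignoring sign.
mad = sum(abs(e) for e in errors) / len(errors)  # 1.5
```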
17) The order of an MA process is determined by the
- ACF
- PACF
- None of the above
- Both the ACF and the PACF
18) A purely random process is a stationary series with
- Zero variance
- Zero mean
- Positive mean
- Zero mean and zero variance
19) A series that is inherently nonstationary is a
- Random walk with drift
- Random walk without drift
- Both of the above
- None of the above
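A minimal simulation sketch for question 19. It estimates the variance of a random walk across many simulated paths at two different times. The function name, drift value, step variance, path count, and seed are all hypothetical choices.

```python
import random

def cross_path_variance(drift, t, n_paths=2000, seed=1):
    """Variance across simulated paths of a random walk's value after t steps."""
    rng = random.Random(seed)
    finals = []
    for _ in range(n_paths):
        value = 0.0
        for _ in range(t):
            # Each step adds the drift plus a standard normal shock.
            value += drift + rng.gauss(0.0, 1.0)
        finals.append(value)
    mean = sum(finals) / n_paths
    return sum((v - mean) ** 2 for v in finals) / n_paths

var_short = cross_path_variance(drift=0.0, t=10)
var_long = cross_path_variance(drift=0.0, t=100)
var_long_drift = cross_path_variance(drift=0.5, t=100)
```

In this sketch the variance grows roughly in proportion to the number of steps, with or without drift, so it is not constant over time.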
20) A series may be trend stationary or difference stationary. Test statistic used to distinguish between the two is
- Dickey-Fuller test
- Engle-Granger test
- Error correction mechanism
- F-test
21) A correlogram is
- The test statistic used to test the chosen ARIMA model for goodness of fit
- Plots of the autocorrelation function and partial autocorrelation function against lag length
- Plots of the autocorrelation function and partial autocorrelation function against time
- Plots of error term against time
22) One of the easiest ways of detecting autocorrelation is the graphical method where we
- Plot error terms against the standardized values
- Plot the error terms against each X variable
- Plot the error term against the Y variable
- Plot the error term against time
23) In a clustering exercise, 5 distinct clusters are formed. The distance between each cluster mean and the others is
- Maximized by the clustering process
- Minimized by the clustering process
- Equidistant
- Can’t say anything
24) If X and Y are independent, then Corr(X,Y) is
- Positive
- Negative
- 0
- Can’t say
25) If X and Y are random variables with Corr(X,Y) = 0, then X and Y are
- Independent
- Can’t say
- Dependent
- None of the above
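A minimal sketch for questions 24 and 25, using a hypothetical data set in which Y is completely determined by X yet the Pearson correlation is zero. The helper `pearson` is written out by hand for illustration.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)

xs = [-2.0, -1.0, 0.0, 1.0, 2.0]  # hypothetical values, symmetric about 0
ys = [x ** 2 for x in xs]         # Y depends on X exactly, but not linearly

r = pearson(xs, ys)  # 0.0
```

Correlation measures only linear association, so a zero value does not rule out dependence.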
26) 95% of students at school weigh between 62 kg and 90 kg. Assuming this data is normally distributed, what are the mean and standard deviation?
- Mean = 66 kg, SD = 7 kg
- Mean = 76 kg, SD = 7 kg
- Mean = 86 kg, SD = 7 kg
- Mean = 76 kg, SD = 14 kg
27) A machine produces electrical components.
99.7% of the components have lengths between 1.176 cm and 1.224 cm.
Assuming this data is normally distributed, what are the mean and standard deviation?
- Mean = 1.210 cm, SD = 0.008 cm
- Mean = 1.190 cm, SD = 0.008 cm
- Mean = 1.200 cm, SD = 0.004 cm
- Mean = 1.200 cm, SD = 0.008 cm
28) 68% of the marks in a test are between 51 and 64.
Assuming this data is normally distributed, what are the mean and standard deviation?
- Mean = 57, SD = 6.5
- Mean = 57, SD = 7
- Mean = 57.5, SD = 6.5
- Mean = 57.5, SD = 13
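Questions 26 to 28 can be worked with the empirical rule for normal distributions: about 68%, 95% and 99.7% of values lie within 1, 2 and 3 standard deviations of the mean. A minimal sketch, assuming the stated interval is exactly mean ± k standard deviations (the function name `mean_and_sd` is hypothetical):

```python
def mean_and_sd(lower, upper, k):
    """Mean and SD of a normal distribution whose interval mean +/- k*SD is [lower, upper]."""
    mean = (lower + upper) / 2  # the interval is centred on the mean
    sd = (upper - lower) / (2 * k)  # the interval is 2k standard deviations wide
    return mean, sd

weights = mean_and_sd(62, 90, k=2)        # 95% interval:   (76.0, 7.0)
lengths = mean_and_sd(1.176, 1.224, k=3)  # 99.7% interval: about (1.2, 0.008)
marks = mean_and_sd(51, 64, k=1)          # 68% interval:   (57.5, 6.5)
```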
29) Which are other names for the normal distribution? Select all that apply.
- Typical curve
- Gaussian curve
- Regular distribution
- Gamma distribution
30) Select all of the statements that are true about normal distributions.
- They are symmetric around their mean.
- They are defined by their mean and skew.
- They are discrete distributions.
- They have high density in their tails.
31) A normal distribution has a mean of 40 and a standard deviation of 5. 68% of the distribution can be found between what two numbers?
- 30 and 50
- 0 and 45
- 0 and 68
- 35 and 45
32) A normal distribution has a mean of 20 and a standard deviation of 3. Approximately 95% of the distribution can be found between what two numbers?
- 17 and 23
- 14 and 26
- 10 and 30
- 0 and 23
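The percentages in questions 31 and 32 are approximations. A minimal sketch using `statistics.NormalDist` from the Python standard library (Python 3.8 or later) to compute the exact probabilities for the distribution in question 32:

```python
from statistics import NormalDist

dist = NormalDist(mu=20, sigma=3)

# Probability of falling within two standard deviations of the mean.
within_two_sd = dist.cdf(26) - dist.cdf(14)  # about 0.954

# Probability of falling within one standard deviation of the mean.
within_one_sd = dist.cdf(23) - dist.cdf(17)  # about 0.683
```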
33) A standard normal distribution has:
- a mean of 1 and a standard deviation of 1
- a mean of 0 and a standard deviation of 1
- a mean larger than its standard deviation
- all scores within one standard deviation of the mean
34) The total area under the curve of the standard normal distribution is not necessarily 1.0.
- TRUE
- FALSE
35) A normal distribution is defined by
- Two parameters (mean, skewness)
- Two parameters (mean, standard deviation)
- One parameter (skewness)
- One parameter (mean)
36) A normal distribution is
- Skewed to the left
- Symmetric
- Skewed to the right
37) Which one holds true for a normal distribution?
- 50% of values are less than the mean
- 40% of values are greater than the mean
- All values are greater than 0
38) Which one holds for a normal distribution?
- mean≠median=mode
- mean=median=mode
- mean=median≠mode
39) What is the median of the numbers 4, 2, 11, 6, 2, 9?
- 4
- 5
- 6
- 9
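A minimal sketch of the median calculation used in question 39: sort the values, then take the middle value, or the average of the two middle values when the count is even. The function is written by hand for illustration, and the odd-length list is a hypothetical example.

```python
def median(values):
    """Middle value of the sorted data; average of the two middle values if the count is even."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

# Sorted: 2, 2, 4, 6, 9, 11. The two middle values are 4 and 6.
even_case = median([4, 2, 11, 6, 2, 9])  # 5.0

# Hypothetical odd-length list. Sorted: 1, 5, 7.
odd_case = median([7, 1, 5])  # 5
```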
40) What extra number must be included with the following list of numbers to increase the median by 1?
16, 7, 24, 2, 11
- 13
- 14
- 18
- 26
41) What extra number must be included with the following list of numbers to decrease the median by 3?
24, 14, 18, 28, 3, 9
- 13
- 14
- 20
- There is no such number
42) What is the median of the squares of the first ten natural numbers?
- 5
- 5
- 5
- 5
43) Choose the correct one. Measures of spread are:
- Range, Mean, Mean deviation, Standard Deviation
- Range, Mean deviation, Standard Deviation
- Mean, Mode, Median, Range
- Mean, Mode, Mean deviation, Standard Deviation
44) For the numbers 13, 16, 12, 11, 8, 14, 12 and 18,
which of the following is true?
- median > mean > mode
- mean > median > mode
- mean > mode > median
- median > mode > mean
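A minimal sketch for comparing the three measures in question 44, using the `statistics` module from the Python standard library:

```python
from statistics import mean, median, mode

data = [13, 16, 12, 11, 8, 14, 12, 18]

data_mean = mean(data)      # 104 / 8 = 13
data_median = median(data)  # sorted middle values 12 and 13 give 12.5
data_mode = mode(data)      # 12 is the only repeated value
```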
45) The numbers 7, 6, 10, 13, 7, 2, 5, 6 and x have only one mode,
and the mean, median and mode are all equal.
What is the value of x?
- x = 5
- x = 6
- x = 6.5
- x = 7
46) The mean of the numbers 11, 18, 5, 24, 12, 3 and x is 13.
What are the median and mode?
- Median = 24, mode = 18
- Median = 12, mode = 12
- Median = 11, mode = 11
- Median = 12, mode = 18
47) The population standard deviation of the numbers 3, 8, 12, 17, and 25 is 7.563 correct to 3 decimal places.
What happens if each of the five numbers is multiplied by 3?
- The standard deviation remains the same
- The standard deviation is increased by 3
- The standard deviation is multiplied by 3
- The standard deviation is multiplied by 9
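A minimal sketch for question 47, using `statistics.pstdev` (population standard deviation) to compare the spread before and after scaling the data:

```python
from statistics import pstdev

data = [3, 8, 12, 17, 25]
scaled = [3 * v for v in data]

original_sd = pstdev(data)  # about 7.563
scaled_sd = pstdev(scaled)

# How many times larger the standard deviation is after scaling.
ratio = scaled_sd / original_sd
```

Multiplying every value by a constant multiplies each deviation from the mean by that constant, so the standard deviation scales by the same factor and the variance by its square.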
48) Time-Series Analysis is used to assess
- Seasonality
- Propensity
- Standard Deviation
- None of the above
49) In a class of 100, the mean on a certain exam was 50 and the standard deviation was 0. This means
- half the class had scores less than 50
- there was a high correlation between ability and grade
- everyone had a score of exactly 50
- half the class had 0’s and half had 50’s
50) A list of 5 pulse rates is: 70, 64, 80, 74, 92. What is the median for this list?
- 74
- 80
- 76
- 70
51) The probability that a river will flood in any given year has been estimated from 200 years of historical data to be one in four. This means:
- The river will flood exactly once every four years.
- In the next 100 years, the river will flood exactly 25 times.
- In the last 100 years, the river flooded exactly 25 times.
- In the next 100 years, the river will flood about 25 times.
52) Men tend to marry women who are slightly younger than themselves.
Suppose that every man married a woman who was exactly 0.5 years younger than himself. Which of the following is CORRECT?
- The correlation is 0.5.
- The correlation is 1.
- The correlation is −1.
- The correlation is 0.
53) Given the data set 4, 10, 7, 7, 6, 9, 3, 8, 9, find the mode.
- 7
- 9
- 7,3
- 7,9
54) Find the mean for the following series: 15, 16, 18, 19, 25, 25, 36, 25, 45
- 24
- 5
- 89
- 25
55) Find the median for the following series: 15, 16, 18, 19, 25, 27, 36, 30, 45
- 25
- 27
- 30
- 19
56) You asked ten of your classmates about their weight. On the basis of this information, you stated that the average weight of all students in your university or college is 158 pounds. This is an example of
- descriptive statistics
- statistical inference
- population
- sample and population
57) What is the mean?
- The mean is a measure of central tendency
- The mean is a measure of variation
- The mean is the number of extreme values
- The mean is a measure of data richness
58) Which of the following is a measure of central location?
- SD
- Distribution
- Mode
- Variance
59) Which of the following is not a measure of central location?
- Mean
- Median
- Mode
- Variance
60) The value that has half of the observations above it and half of the observations below it is called the
- Range
- Mean
- Median
- Mode