Tuesday, June 9, 2026

500+ STATISTICIAN II INTERVIEW PREPARATION QUESTIONS AND ANSWERS

STATISTICIAN II INTERVIEW PREPARATION 

1.               What is the mean of 2, 4, 6, 8?

A.              4

B.              5

C.              6

D.              7 → B

2.               Which measure of central tendency is most affected by outliers?

A.              Median

B.              Mode

C.              Mean

D.              Range → C
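
The effect of an outlier on the mean and the median can be checked with a small sketch. The data values below are made up for illustration; only Python's standard `statistics` module is assumed.

```python
import statistics

# Hypothetical data: five typical values, then the same data with one outlier added.
typical = [10, 12, 11, 13, 12]
with_outlier = typical + [100]

# How far each measure moves once the outlier is included.
mean_shift = statistics.mean(with_outlier) - statistics.mean(typical)
median_shift = statistics.median(with_outlier) - statistics.median(typical)

print(f"mean shifts by {mean_shift:.2f}, median shifts by {median_shift:.2f}")
```

In this example the mean moves by more than 14 units while the median does not move at all.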

3.               The median of an odd-numbered dataset is:

A.              Average of all values

B.              Middle value

C.              Most frequent value

D.              Smallest value → B

4.               Standard deviation measures:

A.              Central tendency

B.              Dispersion

C.              Skewness

D.              Probability → B

5.               Which distribution is symmetric?

A.              Normal

B.              Exponential

C.              Poisson

D.              Binomial (skewed) → A

6.               Mode is:

A.              Average

B.              Most frequent value

C.              Middle value

D.              Range → B

7.               Variance is the square of:

A.              Mean

B.              Median

C.              Standard deviation

D.              Mode → C

8.               Range is calculated as:

A.              Max − Min

B.              Mean − Median

C.              Median − Mode

D.              Sum of values → A
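
As a quick check of the definitions used so far, here is a minimal sketch that computes the mean, range, variance and standard deviation for the values from question 1. It assumes the population formulas (`pvariance`, `pstdev`); the sample versions divide by n − 1 instead.

```python
import statistics

data = [2, 4, 6, 8]  # the values from question 1

mean = statistics.mean(data)          # 5
data_range = max(data) - min(data)    # Max - Min = 6
pop_sd = statistics.pstdev(data)      # population standard deviation
pop_var = statistics.pvariance(data)  # population variance

# Variance is the square of the standard deviation (up to floating-point rounding).
print(mean, data_range, pop_var, pop_sd ** 2)
```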

9.               A dataset with no variability has standard deviation:

A.              0

B.              1

C.             

D.              -1 → A

10.            Skewness measures:

A.              Spread

B.              Shape of distribution

C.              Center

D.              Size → B

Probability

12.            Probability values lie between:

A.              -1 and 1

B.              0 and 1

C.              1 and 10

D.              0 and 10 → B

13.            Probability of a sure event is:

A.              0

B.              0.5

C.              1

D.              2 → C

14.            If events A and B are independent:

A.              P(A∩B)=P(A)+P(B)

B.              P(A∩B)=P(A)P(B)

C.              P(A|B)=0

D.              P(A)=P(B) → B
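
The product rule for independent events can be verified by brute-force enumeration. This is a sketch only: the two events A and B are chosen here for illustration and do not come from the question.

```python
from fractions import Fraction
from itertools import product

# Sample space: all 36 equally likely outcomes of rolling two fair dice.
outcomes = list(product(range(1, 7), repeat=2))

def prob(event):
    """Probability of an event, given as a predicate on (first, second)."""
    favourable = sum(1 for o in outcomes if event(o))
    return Fraction(favourable, len(outcomes))

# Illustrative events (assumptions): A = first die is even, B = second die shows 6.
p_a = prob(lambda o: o[0] % 2 == 0)
p_b = prob(lambda o: o[1] == 6)
p_a_and_b = prob(lambda o: o[0] % 2 == 0 and o[1] == 6)

print(p_a, p_b, p_a_and_b, p_a * p_b)
```

Because the two dice do not influence each other, P(A∩B) equals P(A)P(B).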

15.            The probability of the complement of event A is:

A.              A

B.              1 − P(A)

C.              P(A)²

D.              0 → B

16.            Conditional probability is:

A.              P(A)

B.              P(A|B)

C.              P(B|A)

D.              Both B and C → D

17.            Binomial distribution requires:

A.              Continuous data

B.              Fixed trials

C.              Infinite trials

D.              Negative values → B

18.            Expected value of a fair die:

A.              2.5

B.              3

C.              3.5

D.              4 → C
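
The expected value is the probability-weighted average of the outcomes. A short sketch with exact fractions, assuming a fair six-sided die:

```python
from fractions import Fraction

faces = range(1, 7)
# Each face of a fair die has probability 1/6.
expected_value = sum(Fraction(face, 6) for face in faces)

print(float(expected_value))  # 3.5
```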

19.            Law of large numbers states:

A.              Sample mean converges to population mean

B.              Sample decreases

C.              Variance increases

D.              Data disappears → A
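
The law of large numbers can be illustrated with a small simulation. This is a sketch under assumptions: a fair six-sided die (population mean 3.5) and a fixed random seed so the run is repeatable.

```python
import random

rng = random.Random(0)  # fixed seed so the run is repeatable

def mean_of_rolls(n):
    """Average of n simulated rolls of a fair six-sided die."""
    return sum(rng.randint(1, 6) for _ in range(n)) / n

# The sample mean tends to settle near 3.5 as n grows.
for n in (10, 1_000, 100_000):
    print(n, round(mean_of_rolls(n), 3))
```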

20.            Random variable can be:

A.              Only discrete

B.              Only continuous

C.              Both

D.              None → C

21.            Poisson distribution models:

A.              Continuous data

B.              Rare events

C.              Large values

D.              Negative values → B

Inferential Statistics

23.            Null hypothesis represents:

A.              Alternative claim

B.              No effect

C.              Strong effect

D.              Prediction → B

24.            p-value measures:

A.              Mean

B.              Evidence against H₀

C.              Sample size

D.              Variance → B

25.            If p-value < 0.05:

A.              Accept H₀

B.              Reject H₀

C.              Ignore

D.              Increase sample → B

26.            Type I error is:

A.              False negative

B.              False positive

C.              True positive

D.              True negative → B

27.            Type II error is:

A.              Reject true H₀

B.              Fail to reject false H₀

C.              Correct decision

D.              None → B

28.            Confidence interval provides:

A.              Exact value

B.              Range estimate

C.              Mean only

D.              Error only → B

29.            Larger sample size leads to:

A.              Larger error

B.              Smaller error

C.              No change

D.              Infinite error → B

30.            t-test is used when:

A.              Large sample

B.              Unknown variance

C.              Known variance

D.              Infinite sample → B

31.            Z-test requires:

A.              Small sample

B.              Known variance

C.              Unknown variance

D.              No data → B

32.            ANOVA compares:

A.              Two means

B.              Multiple means

C.              Variance only

D.              Probabilities → B

Regression & Data Analysis

34.            Regression analysis studies:

A.              Distribution

B.              Relationship between variables

C.              Mean

D.              Variance → B

35.            Dependent variable is:

A.              Output

B.              Input

C.              Constant

D.              Random → A

36.            Independent variable is:

A.              Output

B.              Predictor

C.              Result

D.              Error → B

37.            R² measures:

A.              Error

B.              Fit of model

C.              Mean

D.              Probability → B

38.            Correlation ranges between:

A.              0 and 1

B.              -1 and 1

C.              1 and 10

D.              -10 and 10 → B

39.            Perfect positive correlation is:

A.              0

B.              -1

C.              1

D.              2 → C

40.            Multicollinearity affects:

A.              Accuracy

B.              Predictor independence

C.              Mean

D.              Variance → B

41.            Residual is:

A.              Observed − Predicted

B.              Predicted − Observed

C.              Mean − Median

D.              Max − Min → A

42.            Linear regression assumes:

A.              Non-linearity

B.              Linearity

C.              Randomness only

D.              Discrete values → B

43.            Overfitting occurs when:

A.              Model too simple

B.              Model too complex

C.              No data

D.              Small variance → B

Scenario & Practical Questions

45.            Data has extreme outliers. Best measure of central tendency?

A.              Mean

B.              Median

C.              Variance

D.              Range → B

46.            Small sample size analysis uses:

A.              Z-test

B.              t-test

C.              ANOVA

D.              Regression → B

47.            Comparing 3 groups’ means:

A.              t-test

B.              Z-test

C.              ANOVA

D.              Chi-square → C

48.            Categorical data test:

A.              t-test

B.              Z-test

C.              Chi-square

D.              Regression → C

49.            Checking model accuracy:

A.              R²

B.              Mean

C.              Mode

D.              Range → A

50.            Missing data handling:

A.              Ignore

B.              Impute

C.              Delete always

D.              Random guess → B

51.            Time series data requires:

A.              Regression

B.              Trend analysis

C.              ANOVA

D.              Chi-square → B

52.            Sampling bias affects:

A.              Accuracy

B.              Mean only

C.              Variance only

D.              Nothing → A

53.            Large variance indicates:

A.              Low spread

B.              High spread

C.              No spread

D.              Constant data → B

54.            Best visualization for distribution:

A.              Pie chart

B.              Histogram

C.              Table

D.              Text → B

55.            What does a histogram display?

A.              Relationship between variables

B.              Frequency distribution

C.              Correlation

D.              Regression → B

56.            A scatter plot is used to show:

A.              Distribution

B.              Relationship between two variables

C.              Frequency

D.              Mean → B

57.            If two variables are perfectly negatively correlated, r =

A.              1

B.              0

C.              -1

D.              0.5 → C
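
Perfect positive and perfect negative correlation can be reproduced with a hand-written Pearson coefficient. The helper `pearson_r` and the data are illustrative assumptions, not part of the original quiz.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

x = [1, 2, 3, 4, 5]
r_positive = pearson_r(x, [2 * v + 1 for v in x])   # y rises exactly with x
r_negative = pearson_r(x, [10 - 3 * v for v in x])  # y falls exactly with x

print(r_positive, r_negative)
```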

58.            Sampling error occurs due to:

A.              Bias

B.              Random variation

C.              Calculation mistake

D.              Data entry error → B

59.            Non-sampling error includes:

A.              Random error

B.              Sampling fluctuation

C.              Measurement error

D.              Sample size → C

60.            A census studies:

A.              Sample

B.              Population

C.              Subset

D.              Variable → B

61.            Stratified sampling divides population into:

A.              Equal parts

B.              Random parts

C.              Homogeneous groups

D.              Heterogeneous groups → C

62.            Cluster sampling selects:

A.              Individuals

B.              Groups

C.              Variables

D.              Means → B

63.            Systematic sampling selects every:

A.              Random item

B.              nth item

C.              First item

D.              Last item → B

64.            A parameter is usually denoted by:

A.              Greek letters

B.              Numbers

C.              Roman letters

D.              Symbols only → A

65.            A statistic is usually denoted by:

A.              Greek letters

B.              Roman letters

C.              Symbols only

D.              Numbers only → B

66.            Degrees of freedom for variance:

A.              n

B.              n − 1

C.              n + 1

D.              n − 2 → B

67.            A two-tailed test checks:

A.              One direction

B.              Both directions

C.              No direction

D.              Mean only → B

68.            A one-tailed test checks:

A.              Both sides

B.              One direction

C.              Mean only

D.              Variance only → B

69.            Significance level is denoted by:

A.              β

B.              α

C.              μ

D.              σ → B

70.            If α = 0.01, confidence level is:

A.              90%

B.              95%

C.              99%

D.              100% → C

71.            Normal distribution has mean = median =

A.              Mode

B.              Variance

C.              Range

D.              Skewness → A

72.            In a normal distribution, about 68% of the data lies within:

A.              1 SD

B.              2 SD

C.              3 SD

D.              4 SD → A

73.            In a normal distribution, about 95% of the data lies within:

A.              1 SD

B.              2 SD

C.              3 SD

D.              4 SD → B
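
The 68% and 95% figures can be checked against the standard normal distribution. A minimal sketch using `statistics.NormalDist` from the Python standard library (Python 3.8 or later is assumed):

```python
from statistics import NormalDist

z = NormalDist()  # standard normal: mean 0, standard deviation 1

# Probability mass within 1, 2 and 3 standard deviations of the mean.
within_1_sd = z.cdf(1) - z.cdf(-1)
within_2_sd = z.cdf(2) - z.cdf(-2)
within_3_sd = z.cdf(3) - z.cdf(-3)

print(f"{within_1_sd:.4f} {within_2_sd:.4f} {within_3_sd:.4f}")
```

The values come out at roughly 0.68, 0.95 and 0.997, which is why this is often called the 68-95-99.7 rule.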

74.            Z-score measures:

A.              Raw value

B.              Standardized value

C.              Mean

D.              Variance → B

75.            Z-score formula includes:

A.              Mean and variance

B.              Mean and standard deviation

C.              Median and mode

D.              Range and IQR → B

76.            If Z = 0, value equals:

A.              Mean

B.              Median

C.              Mode

D.              Range → A
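
The Z-score formula is z = (x − mean) / standard deviation. A short sketch, with hypothetical exam scores chosen for illustration:

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / sd

# Hypothetical exam scores with mean 70 and standard deviation 10.
print(z_score(85, 70, 10))  # 1.5
print(z_score(70, 70, 10))  # 0.0, the value equals the mean
```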

77.            Sampling distribution refers to:

A.              Population data

B.              Distribution of sample statistic

C.              Raw data

D.              Mean only → B

78.            A large sample size reduces:

A.              Bias

B.              Standard error

C.              Mean

D.              Variance → B

79.            A biased estimator:

A.              Equals parameter

B.              Deviates systematically

C.              Is always correct

D.              Has zero variance → B

80.            Efficiency of estimator relates to:

A.              Bias

B.              Variance

C.              Mean

D.              Sample size → B

81.            Consistency means estimator:

A.              Changes

B.              Converges to true value

C.              Is biased

D.              Is random → B

82.            A likelihood ratio test compares:

A.              Means

B.              Models

C.              Variances

D.              Probabilities → B

83.            p-hacking refers to:

A.              Correct testing

B.              Manipulating results

C.              Data cleaning

D.              Sampling → B

84.            A control group is used to:

A.              Compare results

B.              Increase bias

C.              Reduce data

D.              Ignore treatment → A

85.            Experimental design aims to:

A.              Increase bias

B.              Reduce bias

C.              Ignore variables

D.              Increase variance → B

86.            Randomization helps:

A.              Increase error

B.              Reduce bias

C.              Increase bias

D.              Ignore data → B

87.            Blocking is used to:

A.              Ignore variables

B.              Control variability

C.              Increase randomness

D.              Reduce sample → B

88.            Factorial design studies:

A.              One factor

B.              Multiple factors

C.              No factors

D.              Random data → B

89.            Interaction effect means:

A.              No effect

B.              Combined effect

C.              Single effect

D.              Random effect → B

90.            Residual plot helps detect:

A.              Mean

B.              Model assumptions

C.              Variance only

D.              Sample size → B

91.            Leverage points influence:

A.              Mean

B.              Regression line

C.              Variance

D.              Median → B

92.            Influential points affect:

A.              Model strongly

B.              Mean only

C.              Variance only

D.              Nothing → A

93.            Data transformation helps:

A.              Normalize data

B.              Increase bias

C.              Reduce sample

D.              Ignore outliers → A

94.            Log transformation is used for:

A.              Skewed data

B.              Normal data

C.              Small data

D.              Categorical data → A

95.            Scaling data helps:

A.              Model performance

B.              Increase error

C.              Reduce variables

D.              Ignore data → A

96.            Standardization results in:

A.              Mean 1, SD 0

B.              Mean 0, SD 1

C.              Mean 1, SD 1

D.              Mean 0, SD 0 → B

97.            Normalization scales data between:

A.              -1 to 1

B.              0 to 1

C.              1 to 10

D.              -10 to 10 → B
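
Standardization and min-max normalization can be compared side by side. This sketch uses illustrative values and assumes the population standard deviation for standardization.

```python
import statistics

data = [2.0, 4.0, 6.0, 8.0]  # illustrative values

# Standardization: subtract the mean, divide by the standard deviation.
mu = statistics.mean(data)
sigma = statistics.pstdev(data)
standardized = [(x - mu) / sigma for x in data]    # mean 0, SD 1

# Min-max normalization: rescale so the smallest value is 0 and the largest is 1.
lo, hi = min(data), max(data)
normalized = [(x - lo) / (hi - lo) for x in data]  # values in [0, 1]

print(standardized)
print(normalized)
```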

98.            PCA reduces:

A.              Sample size

B.              Dimensions

C.              Mean

D.              Variance → B

99.            Eigenvalues measure:

A.              Variance explained

B.              Mean

C.              Probability

D.              Error → A

100.         Eigenvectors represent:

A.              Direction

B.              Magnitude

C.              Mean

D.              Variance → A

101.         K-means clustering requires:

A.              Labels

B.              No labels

C.              Mean only

D.              Variance only → B

102.         Number of clusters in K-means is:

A.              Fixed

B.              Unknown

C.              Predefined

D.              Infinite → C

103.         Elbow method is used to:

A.              Reduce bias

B.              Choose clusters

C.              Increase data

D.              Normalize data → B

104.         Silhouette score measures:

A.              Accuracy

B.              Cluster quality

C.              Mean

D.              Variance → B

105.         What is the harmonic mean mainly used for?

A.              Averaging ratios

B.              Averaging sums

C.              Categorical data

D.              Large samples

Answer: A
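
A typical case for the harmonic mean is averaging rates such as speeds. The trip below is a made-up example; `statistics.harmonic_mean` is part of the Python standard library.

```python
import statistics

# Hypothetical trip: the same distance driven at 40 km/h out and 60 km/h back.
speeds = [40, 60]

average_speed = statistics.harmonic_mean(speeds)  # correct average rate, about 48 km/h
naive_average = statistics.mean(speeds)           # 50, overstates the true average speed

print(average_speed, naive_average)
```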

107.         If all values in a dataset are equal, skewness is:

A.              Positive

B.              Negative

C.              Zero

D.              Infinite

Answer: C

109.         A platykurtic distribution is:

A.              Highly peaked

B.              Flat

C.              Skewed

D.              Symmetric

Answer: B

111.         A leptokurtic distribution is:

A.              Flat

B.              Highly peaked

C.              Uniform

D.              Random

Answer: B

113.         In simple linear regression, the slope represents:

A.              Intercept

B.              Change in Y per unit X

C.              Error

D.              Variance

Answer: B

115.         Intercept in regression is the value of Y when:

A.              X = 1

B.              X = 0

C.              X = mean

D.              X = variance

Answer: B
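
Slope and intercept can be seen in a hand-rolled least-squares fit. This is a minimal sketch: the helper `fit_line` and the data (lying exactly on y = 1 + 2x) are assumptions made for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx                    # change in Y per unit change in X
    intercept = mean_y - slope * mean_x  # value of Y when X = 0
    return slope, intercept

# Made-up data lying exactly on y = 1 + 2x.
xs = [0, 1, 2, 3]
ys = [1, 3, 5, 7]
slope, intercept = fit_line(xs, ys)

print(slope, intercept)
```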

117.         Residuals should ideally be:

A.              Patterned

B.              Random

C.              Increasing

D.              Decreasing

Answer: B

119.         Durbin-Watson test detects:

A.              Multicollinearity

B.              Autocorrelation

C.              Normality

D.              Heteroscedasticity

Answer: B

121.         Variance Inflation Factor (VIF) detects:

A.              Autocorrelation

B.              Multicollinearity

C.              Normality

D.              Skewness

Answer: B

123.         If VIF is high, it indicates:

A.              Good model

B.              Multicollinearity

C.              Low variance

D.              Independence

Answer: B

125.         A confounding variable:

A.              Has no effect

B.              Distorts relationship

C.              Is dependent variable

D.              Is constant

Answer: B

127.         Endogeneity refers to:

A.              External variables

B.              Internal bias

C.              Random error

D.              Constant data

Answer: B

129.         Panel data combines:

A.              Cross-sectional only

B.              Time series only

C.              Both cross-sectional & time series

D.              None

Answer: C

131.         Fixed effects model controls for:

A.              Time variation

B.              Individual differences

C.              Mean only

D.              Variance only

Answer: B

133.         Random effects model assumes:

A.              No variation

B.              Random variation

C.              Fixed variation

D.              No correlation

Answer: B

135.         Akaike Information Criterion (AIC) favors:

A.              Simpler models

B.              Complex models

C.              Balanced fit

D.              Random models

Answer: C

137.         Compared with AIC, the Bayesian Information Criterion (BIC) penalizes complexity:

A.              Less

B.              More

C.              Equal

D.              None

Answer: B

139.         A dummy variable takes values:

A.              0 or 1

B.              1 or 2

C.              -1 or 1

D.              Any number

Answer: A

141.         Interaction term in regression captures:

A.              Independent effect

B.              Combined effect

C.              Error

D.              Mean

Answer: B

143.         Elasticity measures:

A.              Absolute change

B.              Relative change

C.              Mean

D.              Variance

Answer: B

145.         In hypothesis testing, β represents:

A.              Type I error

B.              Type II error

C.              Mean

D.              Variance

Answer: B

147.         Power increases when:

A.              Sample size increases

B.              Sample size decreases

C.              Variance increases

D.              Bias increases

Answer: A

149.         A non-parametric test does not assume:

A.              Mean

B.              Distribution

C.              Variance

D.              Data

Answer: B

151.         Mann-Whitney test compares:

A.              Means

B.              Medians

C.              Variances

D.              Probabilities

Answer: B

153.         Wilcoxon signed-rank test is used for:

A.              Paired data

B.              Independent data

C.              Random data

D.              Large samples

Answer: A

155.         Kruskal-Wallis test compares:

A.              Two groups

B.              Multiple groups

C.              One group

D.              Variance only

Answer: B

157.         Spearman correlation measures:

A.              Linear relation

B.              Rank relation

C.              Variance

D.              Mean

Answer: B

159.         Pearson correlation measures:

A.              Rank relation

B.              Linear relation

C.              Nonlinear relation

D.              Variance

Answer: B

161.         A p-value close to 0 indicates:

A.              Weak evidence

B.              Strong evidence

C.              No evidence

D.              Infinite evidence

Answer: B

163.         Bonferroni correction is used for:

A.              Multiple testing

B.              Sampling

C.              Regression

D.              Clustering

Answer: A
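
The Bonferroni correction divides the significance level by the number of tests. A minimal sketch, with hypothetical p-values:

```python
alpha = 0.05
p_values = [0.003, 0.02, 0.04, 0.30]  # hypothetical p-values from four tests

# Each individual test is held to the stricter threshold alpha / m.
adjusted_alpha = alpha / len(p_values)
significant = [p < adjusted_alpha for p in p_values]

print(adjusted_alpha, significant)
```

Three of these p-values fall below 0.05, but only the first stays significant after the correction.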

165.         Missing completely at random (MCAR) means:

A.              Depends on data

B.              Independent of data

C.              Depends on outcome

D.              Systematic

Answer: B

167.         Missing at random (MAR) depends on:

A.              Observed data

B.              Unobserved data

C.              None

D.              Random guess

Answer: A

169.         Not missing at random (NMAR) depends on:

A.              Observed only

B.              Unobserved

C.              Mean

D.              Variance

Answer: B

171.         Imputation replaces:

A.              Outliers

B.              Missing values

C.              Means

D.              Variance

Answer: B

173.         Mean imputation may:

A.              Increase variance

B.              Reduce variance

C.              Increase bias

D.              Reduce bias

Answer: B
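
Mean imputation shrinks the variance because every filled-in value sits exactly at the mean. The observed values and the number of missing entries below are hypothetical.

```python
import statistics

observed = [4.0, 8.0, 6.0, 10.0, 2.0]  # hypothetical observed values
n_missing = 3                          # hypothetical number of missing entries

# Replace every missing entry with the mean of the observed values.
fill = statistics.mean(observed)
imputed = observed + [fill] * n_missing

var_before = statistics.variance(observed)
var_after = statistics.variance(imputed)

print(var_before, var_after)
```

The mean is unchanged, but the sample variance falls because the added points contribute nothing to the sum of squared deviations.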

175.         Weighted mean assigns:

A.              Equal weights

B.              Different weights

C.              No weights

D.              Random weights

Answer: B

177.         Survey weights correct:

A.              Bias

B.              Mean

C.              Variance

D.              Range

Answer: A

179.         Design effect measures:

A.              Efficiency of design

B.              Mean

C.              Variance

D.              Sample size

Answer: A

181.         ROC curve plots:

A.              Precision vs Recall

B.              TPR vs FPR

C.              Mean vs variance

D.              Error vs accuracy

Answer: B

183.         AUC measures:

A.              Accuracy

B.              Model performance

C.              Mean

D.              Variance

Answer: B

185.         Sensitivity is:

A.              True negative rate

B.              True positive rate

C.              False positive rate

D.              Error rate

Answer: B

187.         Specificity is:

A.              True negative rate

B.              True positive rate

C.              Error rate

D.              Mean

Answer: A

189.         False positive rate equals:

A.              1 − specificity

B.              Specificity

C.              Sensitivity

D.              Accuracy

Answer: A
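
Sensitivity, specificity and the false positive rate all come from the same confusion-matrix counts. The counts in this sketch are hypothetical.

```python
# Hypothetical confusion-matrix counts.
tp, fn, fp, tn = 40, 10, 5, 45

sensitivity = tp / (tp + fn)          # true positive rate
specificity = tn / (tn + fp)          # true negative rate
false_positive_rate = fp / (fp + tn)  # equals 1 - specificity

print(sensitivity, specificity, false_positive_rate)
```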

191.         Data leakage occurs when:

A.              Training uses future info

B.              Testing uses past info

C.              Data is missing

D.              Data is clean

Answer: A

193.         Train-test split is used to:

A.              Increase bias

B.              Evaluate model

C.              Reduce data

D.              Normalize data

Answer: B

195.         Underfitting occurs when:

A.              Model too simple

B.              Model too complex

C.              Too much data

D.              No data

Answer: A

197.         Bias-variance tradeoff balances:

A.              Accuracy & error

B.              Bias & variance

C.              Mean & median

D.              Range & IQR

Answer: B

199.         Gradient descent is used to:

A.              Maximize error

B.              Minimize loss

C.              Increase bias

D.              Reduce sample

Answer: B
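
Gradient descent repeatedly steps against the gradient of the loss. This is a minimal one-parameter sketch; the quadratic loss, starting point and learning rate are assumptions chosen for illustration.

```python
def loss(w):
    """Illustrative loss with its minimum at w = 3."""
    return (w - 3.0) ** 2

def gradient(w):
    """Derivative of the loss with respect to w."""
    return 2.0 * (w - 3.0)

w = 0.0              # starting guess
learning_rate = 0.1  # step size (hypothetical choice)

for _ in range(100):
    w -= learning_rate * gradient(w)  # move downhill

print(w, loss(w))
```

After 100 steps, `w` is very close to 3 and the loss is close to 0.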

201.         Loss function measures:

A.              Accuracy

B.              Error

C.              Mean

D.              Variance

Answer: B

203.         Regularization helps:

A.              Prevent overfitting

B.              Increase error

C.              Reduce data

D.              Ignore variables

Answer: A

205.         What is the main purpose of descriptive statistics?

A.              Make predictions

B.              Summarize data

C.              Test hypotheses

D.              Build models → B

206.         Inferential statistics is used to:

A.              Summarize data

B.              Describe sample

C.              Draw conclusions about population

D.              Organize data → C

207.         A frequency polygon is used to:

A.              Show relationship

B.              Show distribution

C.              Show mean

D.              Show variance → B

208.         Bar charts are best for:

A.              Continuous data

B.              Categorical data

C.              Time series

D.              Correlation → B

209.         Pie charts show:

A.              Trends

B.              Proportions

C.              Variance

D.              Mean → B

210.         Stem-and-leaf plot shows:

A.              Mean

B.              Raw data distribution

C.              Correlation

D.              Regression → B

211.         A bimodal distribution has:

A.              One mode

B.              Two modes

C.              No mode

D.              Many modes → B

212.         A uniform distribution has:

A.              Equal frequencies

B.              Skewness

C.              High variance

D.              Low variance → A

213.         A right-skewed distribution has:

A.              Tail on left

B.              Tail on right

C.              No tail

D.              Symmetric → B

214.         A left-skewed distribution has:

A.              Tail on left

B.              Tail on right

C.              Symmetric

D.              No tail → A

215.         Mean deviation is taken from:

A.              Mean or median

B.              Mode only

C.              Range

D.              Variance → A

216.         Quartile deviation equals:

A.              Q3 − Q1

B.              (Q3 − Q1)/2

C.              Q1 − Q3

D.              Mean − median → B
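
Quartile deviation is half the interquartile range. This sketch relies on `statistics.quantiles` with its default method; other quartile conventions can give slightly different values for the same data.

```python
import statistics

data = [1, 2, 3, 4, 5, 6, 7]  # illustrative values

# Cut points that split the data into four equal groups.
q1, q2, q3 = statistics.quantiles(data, n=4)

iqr = q3 - q1                 # interquartile range
quartile_deviation = iqr / 2  # (Q3 - Q1) / 2

print(q1, q3, iqr, quartile_deviation)
```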

217.         Standard error decreases when:

A.              Sample size increases

B.              Sample size decreases

C.              Variance increases

D.              Mean increases → A
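
The standard error of the mean is the standard deviation divided by the square root of the sample size. A brief sketch with a hypothetical standard deviation of 12:

```python
import math

def standard_error(sd, n):
    """Standard error of the mean for standard deviation sd and sample size n."""
    return sd / math.sqrt(n)

# Quadrupling the sample size halves the standard error.
for n in (25, 100, 400):
    print(n, standard_error(12.0, n))
```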

218.         A large p-value suggests:

A.              Reject H₀

B.              Weak evidence

C.              Strong evidence

D.              Significant result → B

219.         Hypothesis testing begins with:

A.              Data collection

B.              Null hypothesis

C.              Conclusion

D.              Graph → B

220.         A parameter is fixed but:

A.              Known

B.              Unknown

C.              Random

D.              Variable → B

221.         A statistic is:

A.              Fixed

B.              Known

C.              Random

D.              Constant → C

222.         Sampling distribution of mean is:

A.              Always normal

B.              Approximately normal

C.              Uniform

D.              Skewed → B

223.         Standard normal distribution has mean:

A.              1

B.              0

C.              -1

D.              100 → B

224.         Standard normal distribution has variance:

A.              0

B.              1

C.              2

D.              10 → B

225.         Z-table gives:

A.              Mean

B.              Probability

C.              Variance

D.              Mode → B

226.         t-distribution is used when:

A.              Large sample

B.              Small sample

C.              Infinite sample

D.              No sample → B

227.         t-distribution approaches normal when:

A.              Sample decreases

B.              Sample increases

C.              Variance decreases

D.              Mean increases → B

228.         Chi-square distribution is used for:

A.              Means

B.              Variances

C.              Categorical data

D.              Correlation → C

229.         F-distribution is used in:

A.              Regression

B.              ANOVA

C.              Probability

D.              Mean → B

230.         ANOVA tests equality of:

A.              Variances

B.              Means

C.              Probabilities

D.              Medians → B

231.         Degrees of freedom in ANOVA depend on:

A.              Mean

B.              Sample size

C.              Groups

D.              Both B and C → D

232.         Residual sum of squares measures:

A.              Explained variation

B.              Unexplained variation

C.              Total variation

D.              Mean → B

233.         Total sum of squares equals:

A.              Explained + residual

B.              Mean + variance

C.              Mode + median

D.              Range + IQR → A
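
For a least-squares line with an intercept, the total sum of squares splits into explained and residual parts. This sketch checks the identity on made-up data; all variable names are illustrative.

```python
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]  # made-up data

n = len(xs)
mean_x, mean_y = sum(xs) / n, sum(ys) / n

# Least-squares line y = intercept + slope * x.
slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
intercept = mean_y - slope * mean_x
fitted = [intercept + slope * x for x in xs]

tss = sum((y - mean_y) ** 2 for y in ys)               # total variation
ess = sum((f - mean_y) ** 2 for f in fitted)           # explained variation
rss = sum((y - f) ** 2 for y, f in zip(ys, fitted))    # unexplained variation
r_squared = ess / tss

print(tss, ess, rss, r_squared)
```

Here R² is about 0.6, so the line explains roughly 60% of the variation in Y.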

234.         Regression coefficient shows:

A.              Strength

B.              Direction

C.              Change

D.              All of the above → D

235.         Perfect fit means R² equals:

A.              0

B.              0.5

C.              1

D.              -1 → C

236.         If R² = 0, model explains:

A.              All variation

B.              No variation

C.              Half variation

D.              Negative variation → B

237.         Multicollinearity increases:

A.              Accuracy

B.              Variance of coefficients

C.              Mean

D.              Sample size → B

238.         Ridge regression adds:

A.              L1 penalty

B.              L2 penalty

C.              No penalty

D.              Random penalty → B

239.         Lasso regression adds:

A.              L1 penalty

B.              L2 penalty

C.              No penalty

D.              Random penalty → A

240.         Overfitting leads to:

A.              Poor training

B.              Poor generalization

C.              High bias

D.              Low variance → B

241.         Underfitting leads to:

A.              High bias

B.              High variance

C.              Perfect fit

D.              Low bias → A

242.         Bias is:

A.              Error from assumptions

B.              Random error

C.              Sampling error

D.              Measurement error → A

243.         Variance is:

A.              Error variability

B.              Mean

C.              Mode

D.              Range → A

244.         Bootstrap method uses:

A.              Replacement

B.              No replacement

C.              Fixed data

D.              Random guess → A

245.         Jackknife method uses:

A.              All data

B.              Leave-one-out

C.              Sampling

D.              Mean → B
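
Worked example for questions 244 and 245: a minimal sketch contrasting the two resampling schemes when estimating the standard error of a mean. The sample values, the seed and the number of bootstrap replicates (2000) are arbitrary choices for illustration.

```python
import random
import statistics

data = [4.1, 5.3, 6.0, 4.8, 5.9, 7.2, 5.5, 6.4]  # made-up sample
rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Bootstrap: draw n values WITH replacement, many times over.
boot_means = []
for _ in range(2000):
    resample = rng.choices(data, k=len(data))
    boot_means.append(statistics.mean(resample))
boot_se = statistics.stdev(boot_means)

# Jackknife: leave one observation out at a time (n estimates in total).
n = len(data)
jack_means = [statistics.mean(data[:i] + data[i + 1:]) for i in range(n)]
jack_avg = statistics.mean(jack_means)
jack_se = ((n - 1) / n * sum((m - jack_avg) ** 2 for m in jack_means)) ** 0.5

print(f"bootstrap SE ~ {boot_se:.3f}, jackknife SE = {jack_se:.3f}")
```

For the sample mean, the jackknife standard error reduces to the familiar s / √n, which is a handy sanity check.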

246.         Cross-validation splits data into:

A.              One set

B.              Two or more sets

C.              No sets

D.              Infinite sets → B

247.         K-fold cross-validation uses:

A.              One fold

B.              K partitions

C.              Two partitions

D.              Infinite folds → B
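
Worked example for questions 246 and 247: a minimal sketch of how K-fold cross-validation partitions row indices. The helper name `k_fold_indices` is hypothetical and no shuffling is done; real workflows usually shuffle first.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs for k roughly equal partitions."""
    indices = list(range(n_samples))
    # Spread any remainder over the first few folds.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size

folds = list(k_fold_indices(10, 3))
for train_idx, test_idx in folds:
    print("train:", train_idx, "test:", test_idx)
```

Each observation lands in the test set exactly once, and the model is fitted K times on the remaining K − 1 partitions.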

248.         Confusion matrix evaluates:

A.              Regression

B.              Classification

C.              Sampling

D.              Mean → B

249.         Accuracy measures:

A.              Correct predictions

B.              Errors

C.              Variance

D.              Mean → A

250.         Precision focuses on:

A.              True positives

B.              True negatives

C.              Errors

D.              Mean → A

251.         Recall focuses on:

A.              True positives

B.              True negatives

C.              Errors

D.              Variance → A

252.         F1 score balances:

A.              Accuracy

B.              Precision & recall

C.              Mean

D.              Variance → B
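
Worked example for questions 248 to 252: a minimal sketch computing the four metrics from confusion-matrix counts. The counts are hypothetical.

```python
# Hypothetical confusion-matrix counts for a binary classifier.
tp, fp, fn, tn = 40, 10, 20, 30

accuracy = (tp + tn) / (tp + fp + fn + tn)   # share of all predictions that are correct
precision = tp / (tp + fp)                   # of predicted positives, how many are real
recall = tp / (tp + fn)                      # of real positives, how many were found
f1 = 2 * precision * recall / (precision + recall)  # harmonic mean of the two

print(f"accuracy={accuracy:.3f} precision={precision:.3f} "
      f"recall={recall:.3f} f1={f1:.3f}")
```

Precision and recall share the same numerator (true positives) and differ only in the denominator, which is why both questions point to option A.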

253.         KNN is a:

A.              Regression model

B.              Classification method

C.              Clustering method

D.              Sampling method → B

254.         Decision tree is used for:

A.              Classification & regression

B.              Mean only

C.              Variance only

D.              Sampling → A

255.         Central Limit Theorem states that the sampling distribution of the mean is approximately normal if:

A.              Sample size is small

B.              Sample size is large

C.              Population is skewed

D.              Population is uniform → B

256.         Law of Large Numbers states that:

A.              Sample mean approaches population mean as sample size increases

B.              Sample mean decreases with sample size

C.              Variance increases with sample size

D.              Mean is always equal to median → A

257.         A Type I error occurs when:

A.              True null hypothesis is rejected

B.              False null hypothesis is accepted

C.              True null hypothesis is accepted

D.              False null hypothesis is rejected → A

258.         A Type II error occurs when:

A.              True null hypothesis is rejected

B.              False null hypothesis is accepted

C.              True null hypothesis is accepted

D.              False null hypothesis is rejected → B

259.         Confidence interval gives:

A.              Exact value of parameter

B.              Range of plausible values

C.              Variance only

D.              Mean only → B

260.         95% confidence level means:

A.              About 95% of intervals built this way contain the population mean

B.              5% chance population mean is in interval

C.              Mean = 0.95

D.              Variance = 0.95 → A

261.         Margin of error depends on:

A.              Sample size

B.              Confidence level

C.              Standard deviation

D.              All of the above → D

262.         Standard error is:

A.              Standard deviation of population

B.              Standard deviation of sample mean

C.              Mean of sample

D.              Range → B

263.         Normal approximation to binomial works if:

A.              n is large and p not too close to 0 or 1

B.              n is small

C.              p = 0

D.              p = 1 → A

264.         Poisson distribution approximates binomial if:

A.              n is large, p small

B.              n is small, p large

C.              n and p are large

D.              n and p small → A

265.         Chi-square test is used for:

A.              Mean comparison

B.              Categorical data

C.              Regression

D.              Correlation → B

266.         Degrees of freedom for chi-square =

A.              (r + c −1)

B.              (r −1)(c −1)

C.              r × c

D.              r − c → B
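
Worked example for questions 265 and 266: a minimal sketch of the chi-square statistic and its degrees of freedom for a contingency table. The 2 × 3 table of counts is hypothetical.

```python
# Hypothetical 2 x 3 contingency table (rows = groups, columns = categories).
observed = [
    [10, 20, 30],
    [20, 20, 20],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected count under independence: row total * column total / grand total.
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

chi_sq = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)
df = (len(observed) - 1) * (len(observed[0]) - 1)   # (r - 1)(c - 1)

print(f"chi-square = {chi_sq:.4f}, df = {df}")
```

The statistic is then compared against a chi-square distribution with (r − 1)(c − 1) degrees of freedom, here 2.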

267.         Goodness-of-fit test checks:

A.              Regression fit

B.              Observed vs expected frequencies

C.              Mean comparison

D.              Variance equality → B

268.         Homoscedasticity means:

A.              Equal variances across groups

B.              Unequal variances

C.              Mean = median

D.              Skewness = 0 → A

269.         Heteroscedasticity means:

A.              Equal variances

B.              Unequal variances

C.              Constant error

D.              Normality → B

270.         Bartlett’s test checks:

A.              Normality

B.              Equality of variances

C.              Mean equality

D.              Skewness → B

271.         Levene’s test is used for:

A.              Equality of means

B.              Equality of variances

C.              Regression coefficients

D.              Correlation → B

272.         Kolmogorov-Smirnov test checks:

A.              Variance

B.              Normality

C.              Correlation

D.              Regression → B

273.         Shapiro-Wilk test checks:

A.              Variance

B.              Normality

C.              Mean equality

D.              Skewness → B

274.         Q-Q plot helps assess:

A.              Variance

B.              Normality

C.              Correlation

D.              Regression → B

275.         Boxplot identifies:

A.              Mean

B.              Outliers

C.              Correlation

D.              Regression → B

276.         Interquartile range (IQR) =

A.              Q3 − Q1

B.              Q1 − Q3

C.              Max − Min

D.              Median → A

277.         Standard score (z) formula:

A.              (x − μ)/σ

B.              (μ − x)/σ

C.              x/σ

D.              x − μ → A

278.         Chebyshev’s inequality applies to:

A.              Any distribution

B.              Normal only

C.              Skewed only

D.              Uniform only → A

279.         Empirical rule applies to:

A.              Any distribution

B.              Normal distribution

C.              Uniform

D.              Skewed → B

280.         In hypothesis testing, power =

A.              1 − α

B.              1 − β

C.              α + β

D.              α × β → B

281.         ANOVA F-statistic =

A.              Variance between groups / variance within groups

B.              Mean between groups / mean within groups

C.              Sum of squares / mean

D.              Explained / total variance → A
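
Worked example for question 281: a minimal sketch of the one-way ANOVA F-statistic as the ratio of between-group to within-group mean squares. The three groups are made up so that the arithmetic is easy to follow.

```python
# Three made-up groups of equal size.
groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]

all_values = [v for g in groups for v in g]
grand_mean = sum(all_values) / len(all_values)
group_means = [sum(g) / len(g) for g in groups]

# Between-group sum of squares: group size times squared distance of each group mean.
ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
# Within-group sum of squares: squared distance of each value from its own group mean.
ss_within = sum((v - m) ** 2 for g, m in zip(groups, group_means) for v in g)

df_between = len(groups) - 1                 # k - 1
df_within = len(all_values) - len(groups)    # N - k

f_stat = (ss_between / df_between) / (ss_within / df_within)
print(f"F = {f_stat:.2f} on ({df_between}, {df_within}) degrees of freedom")
```

The degrees of freedom depend on both the number of groups and the total sample size, which also ties back to question 231.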

282.         One-way ANOVA compares:

A.              Two means

B.              Multiple means

C.              Variances only

D.              Regression coefficients → B

283.         Two-way ANOVA includes:

A.              One factor

B.              Two factors

C.              Multiple regression

D.              Correlation → B

284.         Post hoc tests are used after:

A.              Significant ANOVA

B.              Non-significant ANOVA

C.              Regression

D.              Chi-square → A

285.         Tukey’s test controls:

A.              Type I error

B.              Type II error

C.              Variance

D.              Mean → A

286.         Bonferroni correction controls:

A.              Type I error in multiple comparisons

B.              Type II error

C.              Regression error

D.              Correlation error → A

287.         Non-parametric tests are used when:

A.              Normality assumption fails

B.              Sample is large

C.              Population is known

D.              Regression needed → A

288.         Wilcoxon signed-rank test is for:

A.              Paired data

B.              Independent data

C.              Categorical data

D.              Multiple groups → A

289.         Mann-Whitney U test is for:

A.              Paired data

B.              Independent two samples

C.              Categorical data

D.              Multiple groups → B

290.         Kruskal-Wallis test is for:

A.              Two groups

B.              Multiple groups

C.              Paired data

D.              Categorical data → B

291.         Friedman test is for:

A.              One-way repeated measures

B.              Two-way repeated measures

C.              Regression

D.              Correlation → A

292.         Spearman rank correlation uses:

A.              Raw values

B.              Ranks

C.              Variances

D.              Means → B

293.         Kendall’s tau measures:

A.              Linear correlation

B.              Rank correlation

C.              Regression slope

D.              Variance → B

294.         Regression diagnostics detect:

A.              Outliers

B.              Leverage points

C.              Influential points

D.              All of the above → D

295.         Cook’s distance measures:

A.              Variance

B.              Influence of observation

C.              Mean

D.              Skewness → B

296.         Leverage points affect:

A.              Regression line slope

B.              Variance

C.              Mean only

D.              Mode only → A

297.         Influential points combine:

A.              Outlier + leverage

B.              Mean + variance

C.              Skewness + kurtosis

D.              Range + median → A

298.         Multivariate analysis deals with:

A.              One variable

B.              Two or more variables

C.              Means only

D.              Variances only → B

299.         Principal Component Analysis (PCA) reduces:

A.              Variance

B.              Dimensions

C.              Mean

D.              Skewness → B

300.         Factor analysis identifies:

A.              Observed variables

B.              Latent factors

C.              Mean

D.              Variance → B

301.         Cluster analysis groups:

A.              Variables

B.              Observations

C.              Mean only

D.              Variance only → B

302.         Hierarchical clustering can be:

A.              Agglomerative

B.              Divisive

C.              Both

D.              None → C

303.         K-means clustering minimizes:

A.              Between-cluster distance

B.              Within-cluster distance

C.              Mean

D.              Variance → B

304.         Silhouette score evaluates:

A.              Regression

B.              Classification

C.              Clustering quality

D.              Correlation → C

305.         Bayesian statistics updates:

A.              Prior probability using data → A

B.              Mean only

C.              Variance only

D.              Mode only

306.         Posterior probability combines:

A.              Likelihood × Prior → A

B.              Mean + Variance

C.              Standard deviation only

D.              Median only

307.         Likelihood function depends on:

A.              Parameter values → A

B.              Sample size only

C.              Variance only

D.              Mean only

308.         Maximum a posteriori (MAP) estimation maximizes:

A.              Likelihood

B.              Posterior → B

C.              Mean

D.              Variance

309.         Prior distribution can be:

A.              Informative

B.              Non-informative

C.              Both → C

D.              Neither

310.         Conjugate prior ensures:

A.              Posterior in same family as prior → A

B.              Posterior is uniform

C.              Posterior is normal

D.              Posterior is variance only
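
Worked example for questions 305 to 310: a minimal sketch using the Beta prior with a binomial likelihood, a standard conjugate pair. The prior parameters and the observed counts are hypothetical.

```python
# Beta(a, b) prior on a success probability; values chosen for illustration.
prior_a, prior_b = 2.0, 2.0

# Hypothetical data: 7 successes and 3 failures.
successes, failures = 7, 3

# Conjugacy: the posterior is again a Beta distribution with updated parameters.
post_a = prior_a + successes
post_b = prior_b + failures

posterior_mean = post_a / (post_a + post_b)
# Mode of Beta(a, b), valid when a > 1 and b > 1; this is the MAP estimate.
map_estimate = (post_a - 1) / (post_a + post_b - 2)
# Maximum likelihood estimate ignores the prior entirely.
mle = successes / (successes + failures)

print(f"posterior: Beta({post_a}, {post_b})")
print(f"posterior mean={posterior_mean:.3f}  MAP={map_estimate:.3f}  MLE={mle:.3f}")
```

The MAP estimate maximizes the posterior while the MLE maximizes the likelihood alone, so the two differ whenever the prior is informative.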

311.         Poisson process models:

A.              Continuous time events → A

B.              Categorical data

C.              Mean only

D.              Variance only

312.         Exponential interarrival times are:

A.              Memoryless → A

B.              Dependent

C.              Correlated

D.              Uniform

313.         Markov process satisfies:

A.              Future depends only on present → A

B.              Future depends on past

C.              Mean = median

D.              Variance = 0

314.         Stationary Markov chain has:

A.              Constant transition probabilities → A

B.              Varying probabilities

C.              Mean = 0

D.              Variance = 1

315.         Ergodic Markov chain:

A.              Can reach all states eventually → A

B.              Cannot reach all states

C.              Deterministic

D.              Non-stationary

316.         Transition matrix elements are:

A.              Probabilities → A

B.              Means

C.              Variances

D.              Modes

317.         Poisson regression models:

A.              Count data → A

B.              Continuous data

C.              Binary outcome

D.              Ordinal outcome

318.         Negative binomial regression handles:

A.              Overdispersed count data → A

B.              Binary data

C.              Continuous data

D.              Time series

319.         Ordinal regression models:

A.              Continuous outcome

B.              Ordered categorical outcome → B

C.              Binary outcome

D.              Count data

320.         Multinomial logistic regression handles:

A.              Multiple categories → A

B.              Two categories

C.              Continuous data

D.              Time series

321.         Survival analysis studies:

A.              Time until event → A

B.              Mean only

C.              Variance only

D.              Regression coefficients

322.         Censoring occurs when:

A.              Exact event time unknown → A

B.              Event never occurs

C.              Time series is stationary

D.              Mean = median

323.         Kaplan-Meier estimator estimates:

A.              Survival function → A

B.              Hazard function

C.              Mean

D.              Variance
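
Worked example for questions 321 to 323: a minimal sketch of the Kaplan-Meier product-limit estimate with right-censored observations. The records and the helper name `kaplan_meier` are hypothetical.

```python
# (time, event) pairs: event = 1 means the event was observed,
# event = 0 means the subject was censored at that time.
records = [(2, 1), (3, 0), (4, 1), (4, 1), (6, 0), (7, 1)]

def kaplan_meier(records):
    """Return [(time, survival probability)] at each observed event time."""
    event_times = sorted({t for t, e in records if e == 1})
    survival = 1.0
    curve = []
    for t in event_times:
        at_risk = sum(1 for ti, _ in records if ti >= t)            # still under observation
        events = sum(1 for ti, e in records if ti == t and e == 1)  # events exactly at t
        survival *= 1 - events / at_risk
        curve.append((t, survival))
    return curve

curve = kaplan_meier(records)
for t, s in curve:
    print(f"t={t}: S(t)={s:.4f}")
```

Censored subjects never count as events, but they stay in the at-risk set until their censoring time, which is how the estimator uses partial information.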

324.         Cox proportional hazards model assumes:

A.              Constant hazard ratios → A

B.              Increasing hazard

C.              Mean = median

D.              Variance = 1

325.         Hazard function measures:

A.              Instantaneous risk → A

B.              Mean risk

C.              Cumulative variance

D.              Median

326.         Log-rank test compares:

A.              Two survival curves → A

B.              Means

C.              Variances

D.              Regression coefficients

327.         Time-to-event data is:

A.              Continuous → A

B.              Categorical

C.              Binary

D.              Count

328.         Markov Chain Monte Carlo (MCMC) is used for:

A.              Bayesian estimation → A

B.              Mean estimation only

C.              Variance estimation only

D.              Correlation

329.         Gibbs sampling updates:

A.              One variable at a time → A

B.              All variables simultaneously

C.              Mean only

D.              Variance only

330.         Metropolis-Hastings algorithm:

A.              Accepts or rejects proposed sample → A

B.              Always accepts

C.              Rejects all

D.              Updates mean only
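
Worked example for questions 328 to 330: a minimal sketch of a random-walk Metropolis-Hastings sampler. The target (a standard normal), the step size, the seed and the function names are assumptions chosen for illustration.

```python
import math
import random

def log_target(x):
    # Log density of a standard normal, up to an additive constant.
    return -0.5 * x * x

def metropolis_hastings(n_samples, step=1.0, seed=0):
    rng = random.Random(seed)
    x = 0.0
    samples, accepted = [], 0
    for _ in range(n_samples):
        proposal = x + rng.gauss(0.0, step)   # symmetric random-walk proposal
        log_ratio = log_target(proposal) - log_target(x)
        # Accept with probability min(1, target(proposal) / target(x)).
        if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
            x = proposal
            accepted += 1
        samples.append(x)                     # a rejected step repeats the current x
    return samples, accepted / n_samples

samples, acceptance_rate = metropolis_hastings(20000)
sample_mean = sum(samples) / len(samples)
print(f"acceptance rate = {acceptance_rate:.2f}, sample mean = {sample_mean:.3f}")
```

Because the proposal is symmetric, the acceptance ratio only needs the target density, and that density only needs to be known up to a constant. That is why the method suits Bayesian posteriors.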

331.         Random effects model accounts for:

A.              Group-level variability → A

B.              Between-group only

C.              Mean only

D.              Variance only

332.         Fixed effects model assumes:

A.              Effects are constant → A

B.              Effects are random

C.              Effects vary by sample

D.              Effects unknown

333.         Mixed-effects model includes:

A.              Fixed + random effects → A

B.              Only fixed

C.              Only random

D.              None

334.         Hierarchical linear model handles:

A.              Nested data → A

B.              Time series only

C.              Regression only

D.              Categorical data only

335.         Multilevel modeling is used when:

A.              Data clustered → A

B.              Data independent

C.              Time series

D.              Continuous only

336.         Structural equation modeling (SEM) combines:

A.              Regression + factor analysis → A

B.              Only regression

C.              Only correlation

D.              Only variance

337.         Path analysis is part of:

A.              SEM → A

B.              ANOVA

C.              Regression

D.              Time series

338.         Confirmatory factor analysis (CFA) tests:

A.              Hypothesized factor structure → A

B.              Mean equality

C.              Regression coefficients

D.              Variance equality

339.         Exploratory factor analysis (EFA) discovers:

A.              Factor structure → A

B.              Mean

C.              Regression

D.              Variance

340.         Principal axis factoring is:

A.              Common factor method → A

B.              PCA method

C.              Regression

D.              Correlation

341.         Varimax rotation maximizes:

A.              Loading variance → A

B.              Mean

C.              Regression slope

D.              Covariance

342.         Oblique rotation allows:

A.              Factors correlated → A

B.              Factors uncorrelated

C.              Regression only

D.              Mean only

343.         Bartlett’s test in factor analysis checks:

A.              Sphericity → A

B.              Variance equality

C.              Regression

D.              Mean

344.         Kaiser-Meyer-Olkin (KMO) measure checks:

A.              Sampling adequacy → A

B.              Variance

C.              Regression slope

D.              Mean

345.         Communality indicates:

A.              Variance explained by factors → A

B.              Total variance

C.              Regression slope

D.              Mean

346.         Eigenvalue >1 rule selects:

A.              Number of factors → A

B.              Number of observations

C.              Number of predictors

D.              Regression coefficients

347.         Scree plot visualizes:

A.              Eigenvalues → A

B.              Mean

C.              Variance

D.              Regression

348.         Cluster validity indices include:

A.              Silhouette

B.              Rand index

C.              Both

D.              None → C

349.         Dendrogram helps in:

A.              Hierarchical clustering → A

B.              Regression

C.              ANOVA

D.              Time series

350.         Agglomerative clustering starts with:

A.              Each observation as a cluster → A

B.              Single cluster

C.              Random cluster

D.              Mean only

351.         Divisive clustering starts with:

A.              All data in one cluster → A

B.              Each observation as cluster

C.              Random clusters

D.              Mean only

352.         Outlier in clustering may:

A.              Form its own cluster

B.              Merge with nearest cluster

C.              Affect centroids

D.              All of the above → D

353.         Hierarchical vs K-means:

A.              Deterministic vs iterative → A

B.              Both deterministic

C.              Both iterative

D.              None

354.         DBSCAN clustering identifies:

A.              Density-based clusters → A

B.              Hierarchical clusters

C.              K-means clusters

D.              PCA clusters

355.         The mode is defined as:

A.              Most frequent value → A

B.              Average value

C.              Middle value

D.              Sum of values

356.         The median divides data into:

A.              Two equal halves → A

B.              Four equal parts

C.              Three equal parts

D.              Five equal parts

357.         Skewness measures:

A.              Symmetry of data → A

B.              Spread

C.              Central tendency

D.              Correlation

358.         Positive skew means:

A.              Tail on right → A

B.              Tail on left

C.              Symmetric

D.              Uniform

359.         Negative skew means:

A.              Tail on left → A

B.              Tail on right

C.              Symmetric

D.              Uniform

360.         Kurtosis measures:

A.              Tail weight (often loosely called peakedness) of distribution → A

B.              Spread

C.              Mean

D.              Median

361.         High kurtosis indicates:

A.              Heavy tails → A

B.              Light tails

C.              Symmetry

D.              Skewness = 0

362.         Low kurtosis indicates:

A.              Light tails → A

B.              Heavy tails

C.              Skewness

D.              Mean

363.         Variance formula (population) is:

A.              Σ(x−μ)² / N → A

B.              Σ(x−μ)² / (N−1)

C.              Σx / N

D.              Σx² / N

364.         Variance formula (sample) is:

A.              Σ(x−x̄)² / (n−1) → A

B.              Σ(x−x̄)² / n

C.              Σx / n

D.              Σx² / n

365.         Standard deviation is:

A.              Square root of variance → A

B.              Variance squared

C.              Mean

D.              Median

366.         Coefficient of variation (CV) =

A.              SD / Mean → A

B.              Mean / SD

C.              Variance / Mean

D.              Median / Mean
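
Worked example for questions 363 to 366: a minimal sketch using Python's standard `statistics` module on the values 2, 4, 6, 8 from question 1. The variable names are illustrative.

```python
import statistics

data = [2, 4, 6, 8]

pop_var = statistics.pvariance(data)   # divides by N
samp_var = statistics.variance(data)   # divides by n - 1
samp_sd = statistics.stdev(data)       # square root of the sample variance
cv = samp_sd / statistics.mean(data)   # coefficient of variation = SD / mean

print(f"population variance = {pop_var}")
print(f"sample variance     = {samp_var:.4f}")
print(f"sample SD           = {samp_sd:.4f}")
print(f"CV                  = {cv:.4f}")
```

The squared deviations sum to 20, so the population variance is 20 / 4 and the sample variance is 20 / 3.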

367.         Probability of mutually exclusive events:

A.              Sum of individual probabilities → A

B.              Product

C.              Difference

D.              Ratio

368.         Probability of independent events:

A.              Product of probabilities → A

B.              Sum

C.              Difference

D.              Ratio

369.         Conditional probability formula:

A.              P(A|B) = P(A∩B)/P(B) → A

B.              P(A∩B)/P(A)

C.              P(A)+P(B)

D.              P(A)-P(B)

370.         Bayes theorem updates:

A.              Prior probability → A

B.              Mean

C.              Variance

D.              Standard deviation

371.         Random variable X can be:

A.              Discrete

B.              Continuous

C.              Both → C

D.              Neither

372.         Probability mass function (PMF) applies to:

A.              Discrete → A

B.              Continuous

C.              Both

D.              Neither

373.         Probability density function (PDF) applies to:

A.              Continuous → A

B.              Discrete

C.              Both

D.              Neither

374.         Cumulative distribution function (CDF) gives:

A.              P(X ≤ x) → A

B.              P(X ≥ x)

C.              P(X = x)

D.              Mean

375.         Expected value of X =

A.              Σx·P(x) → A

B.              Mean only

C.              Variance only

D.              Median

376.         Law of total probability:

A.              P(A) = Σ P(A|Bi)P(Bi) → A

B.              P(A) = P(A∩B)

C.              P(A) = P(A)+P(B)

D.              P(A) = P(A)/P(B)
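
Worked example for questions 369, 370 and 376: a minimal sketch combining the law of total probability with Bayes' theorem. The three input probabilities are hypothetical (a screening-test style example).

```python
# Hypothetical inputs: B = "has the condition", A = "test is positive".
p_b = 0.01               # P(B), the prior probability
p_a_given_b = 0.95       # P(A | B)
p_a_given_not_b = 0.05   # P(A | not B)

# Law of total probability over the partition {B, not B}.
p_a = p_a_given_b * p_b + p_a_given_not_b * (1 - p_b)

# Bayes' theorem: P(B | A) = P(A | B) P(B) / P(A).
p_b_given_a = p_a_given_b * p_b / p_a

print(f"P(A) = {p_a:.4f}")
print(f"P(B | A) = {p_b_given_a:.4f}")
```

The prior of 1% is updated to a posterior of about 16% after a positive result, far below the 95% that intuition often suggests.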

377.         Standard normal distribution:

A.              Mean 0, SD 1 → A

B.              Mean 1, SD 0

C.              Mean 0, SD 0

D.              Mean 1, SD 1

378.         Z-score formula:

A.              (X−μ)/σ → A

B.              (μ−X)/σ

C.              X/σ

D.              X−μ

379.         T-distribution is used when:

A.              Population SD unknown → A

B.              Population mean unknown

C.              Sample size large

D.              Sample size infinite

380.         Degrees of freedom in t-test:

A.              n−1 → A

B.              n

C.              n+1

D.              n−2

381.         One-sample t-test compares:

A.              Sample mean vs population mean → A

B.              Two sample means

C.              Two variances

D.              Proportions

382.         Two-sample t-test compares:

A.              Means of two independent samples → A

B.              Paired samples

C.              Variances

D.              Proportions

383.         Paired t-test compares:

A.              Means of paired observations → A

B.              Independent samples

C.              Variances

D.              Proportions

384.         F-test compares:

A.              Variances → A

B.              Means

C.              Medians

D.              Correlations

385.         ANOVA is an extension of:

A.              t-test → A

B.              Z-test

C.              F-test

D.              Chi-square

386.         One-way ANOVA has:

A.              One factor → A

B.              Two factors

C.              Multiple factors

D.              None

387.         Two-way ANOVA has:

A.              Two factors → A

B.              One factor

C.              Multiple factors

D.              None

388.         Post-hoc tests are used after:

A.              Significant ANOVA → A

B.              Non-significant ANOVA

C.              Regression

D.              Chi-square

389.         Bonferroni correction adjusts for:

A.              Multiple comparisons → A

B.              Single test

C.              Regression

D.              Variance

390.         Chi-square test applies to:

A.              Categorical data → A

B.              Continuous data

C.              Regression

D.              Time series

391.         Chi-square goodness-of-fit compares:

A.              Observed vs expected frequencies → A

B.              Means

C.              Variances

D.              Regression coefficients

392.         Chi-square test for independence examines:

A.              Association between variables → A

B.              Mean difference

C.              Variance equality

D.              Regression

393.         Contingency table shows:

A.              Cross-tabulated counts → A

B.              Means only

C.              Variances only

D.              Regression

394.         Residual =

A.              Observed − Predicted → A

B.              Predicted − Observed

C.              Mean − Observed

D.              Variance − Predicted

395.         Homoscedasticity =

A.              Equal variance → A

B.              Unequal variance

C.              Normal distribution

D.              Independence

396.         Heteroscedasticity violates:

A.              Constant variance assumption → A

B.              Linearity

C.              Normality

D.              Independence

397.         Cook’s distance detects:

A.              Influential points → A

B.              Outliers only

C.              Leverage only

D.              Residuals

398.         Leverage measures:

A.              Distance in predictor space → A

B.              Residual size

C.              Mean

D.              Variance

399.         Multicollinearity affects:

A.              Standard errors → A

B.              Means

C.              Medians

D.              Mode

400.         VIF > 10 indicates:

A.              Severe multicollinearity → A

B.              Low correlation

C.              Independence

D.              Normality

401.         Random effects model accounts for:

A.              Group-level variation → A

B.              Fixed effect only

C.              Mean only

D.              Variance only

402.         Fixed effects model assumes:

A.              Constant effects → A

B.              Random effects

C.              Variable effects

D.              Unknown effects

403.         Mixed effects model combines:

A.              Fixed + random → A

B.              Fixed only

C.              Random only

D.              Neither

404.         Hierarchical modeling handles:

A.              Nested data → A

B.              Independent data

C.              Time series only

D.              Categorical only

405.         Poisson regression is suitable for:

A.              Count data → A

B.              Continuous data

C.              Binary outcome

D.              Ordinal outcome

406.         Overdispersion occurs when:

A.              Variance > Mean → A

B.              Variance < Mean

C.              Variance = Mean

D.              Mean = 0

407.         Negative binomial regression handles:

A.              Overdispersed count data → A

B.              Binary data

C.              Continuous data

D.              Ordinal data

408.         Time series data has:

A.              Temporal order → A

B.              Random order

C.              Categorical only

D.              Constant variance

409.         Stationary time series has:

A.              Constant mean & variance → A

B.              Changing mean

C.              Changing variance

D.              Trend only

410.         Differencing a series removes:

A.              Trend → A

B.              Seasonality

C.              Noise

D.              Skewness

411.         Seasonal differencing removes:

A.              Seasonality → A

B.              Trend

C.              Noise

D.              Mean
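
Worked example for questions 410 and 411: a minimal sketch showing first differencing on a series with a linear trend and seasonal differencing on a series with a repeating pattern. Both series are made up, and the season length of 4 is an assumption.

```python
# Series with a pure linear trend: 3, 5, 7, ...
trend_series = [3 + 2 * t for t in range(8)]
# First difference: x_t - x_{t-1}.
first_diff = [b - a for a, b in zip(trend_series, trend_series[1:])]

# Series with a pure seasonal pattern of period 4.
season_length = 4
seasonal_series = [10, 20, 30, 40] * 3
# Seasonal difference: x_t - x_{t-s}, where s is the season length.
seasonal_diff = [
    seasonal_series[i] - seasonal_series[i - season_length]
    for i in range(season_length, len(seasonal_series))
]

print("first difference:   ", first_diff)
print("seasonal difference:", seasonal_diff)
```

The first difference of the trending series is constant and the seasonal difference of the periodic series is zero, which is the sense in which each operation "removes" its component. This is also the d in ARIMA(p, d, q).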

412.         Autocorrelation measures:

A.              Correlation of series with lagged values → A

B.              Variance only

C.              Mean only

D.              Skewness

413.         Partial autocorrelation measures:

A.              Direct correlation between X_t and X_{t-k} controlling intermediate lags → A

B.              Total correlation

C.              Variance

D.              Mean

414.         AR(p) model uses:

A.              p lagged values → A

B.              p future values

C.              Moving average

D.              Trend

415.         MA(q) model uses:

A.              q lagged errors → A

B.              q lagged values

C.              Trend only

D.              Mean

416.         ARMA(p,q) combines:

A.              AR + MA → A

B.              AR only

C.              MA only

D.              AR + trend

417.         ARIMA(p,d,q) adds:

A.              Differencing → A

B.              Autocorrelation

C.              Variance

D.              Mean

418.         Exponential smoothing gives:

A.              Higher weight to recent observations → A

B.              Equal weight

C.              Lower weight to recent

D.              Random weight

419.         Holt-Winters method handles:

A.              Trend + seasonality → A

B.              Noise only

C.              Mean only

D.              Variance only

420.         White noise has:

A.              Zero mean, constant variance → A

B.              Non-zero mean

C.              Changing variance

D.              Trend

421.         ARCH/GARCH models address:

A.              Heteroscedasticity → A

B.              Mean

C.              Median

D.              Mode

422.         Monte Carlo simulation estimates:

A.              Probabilities & distributions → A

B.              Mean only

C.              Median only

D.              Mode

423.         MCMC (Markov Chain Monte Carlo) is used for:

A.              Bayesian estimation → A

B.              Frequentist estimation

C.              Mean only

D.              Variance only

424.         Gibbs sampling updates:

A.              One variable at a time → A

B.              All variables simultaneously

C.              Mean only

D.              Variance only

425.         Metropolis-Hastings algorithm:

A.              Accepts or rejects proposed sample → A

B.              Always accepts

C.              Always rejects

D.              Updates mean only

426.         Bayesian inference combines:

A.              Prior × Likelihood → A

B.              Mean × Variance

C.              Median × Mode

D.              Regression coefficients

427.         Posterior distribution =

A.              Updated belief after data → A

B.              Prior only

C.              Likelihood only

D.              Mean only

428.         Conjugate prior ensures:

A.              Posterior in same family → A

B.              Posterior uniform

C.              Posterior normal

D.              Posterior variance
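A standard example of conjugacy is the Beta prior with binomial data: the posterior is again a Beta distribution, with the counts added to the prior parameters. The prior Beta(2, 2) and the data (7 successes, 3 failures) below are made-up values for illustration.

```python
def beta_binomial_update(prior_a, prior_b, successes, failures):
    """Posterior Beta parameters after observing binomial data."""
    return prior_a + successes, prior_b + failures

post_a, post_b = beta_binomial_update(2, 2, successes=7, failures=3)
posterior_mean = post_a / (post_a + post_b)   # mean of Beta(a, b) is a / (a + b)
print(post_a, post_b, posterior_mean)
```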

429.         Maximum a posteriori (MAP) estimation maximizes:

A.              Posterior → A

B.              Likelihood

C.              Mean

D.              Variance
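The difference between MAP and maximum likelihood can be sketched for a proportion with a Beta prior. The numbers (7 successes in 10 trials, prior Beta(2, 2)) and the function names are assumptions for illustration.

```python
def mle_proportion(successes, trials):
    """Maximum likelihood estimate: the sample proportion."""
    return successes / trials

def map_proportion(successes, trials, prior_a, prior_b):
    """Mode of the Beta posterior; valid when both posterior parameters exceed 1."""
    a = prior_a + successes
    b = prior_b + (trials - successes)
    return (a - 1) / (a + b - 2)

mle = mle_proportion(7, 10)
map_estimate = map_proportion(7, 10, prior_a=2, prior_b=2)
flat_prior_map = map_proportion(7, 10, prior_a=1, prior_b=1)
print(mle, map_estimate, flat_prior_map)
```

With the uniform Beta(1, 1) prior the MAP estimate coincides with the MLE, while an informative prior pulls the estimate toward the prior mode.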

430.         Hierarchical Bayesian model accounts for:

A.              Group-level variation → A

B.              Individual only

C.              Mean only

D.              Variance only

431.         Random effects model includes:

A.              Group-level variation → A

B.              Fixed effect only

C.              Mean only

D.              Variance only

432.         Fixed effects model assumes:

A.              Effects are fixed (non-random) constants → A

B.              Random effects

C.              Variable effects

D.              Unknown effects

433.         Mixed effects model combines:

A.              Fixed + random → A

B.              Fixed only

C.              Random only

D.              Neither

434.         Multilevel modeling is used for:

A.              Nested data → A

B.              Independent data

C.              Continuous only

D.              Categorical only

435.         Structural equation modeling (SEM) combines:

A.              Regression + factor analysis → A

B.              Regression only

C.              Correlation only

D.              Variance only

436.         Path analysis is part of:

A.              SEM → A

B.              ANOVA

C.              Regression

D.              Time series

437.         Confirmatory factor analysis (CFA) tests:

A.              Hypothesized factor structure → A

B.              Mean equality

C.              Regression coefficients

D.              Variance equality

438.         Exploratory factor analysis (EFA) discovers:

A.              Factor structure → A

B.              Mean only

C.              Regression

D.              Variance

439.         Kaiser-Meyer-Olkin (KMO) measure checks:

A.              Sampling adequacy → A

B.              Variance

C.              Regression slope

D.              Mean

440.         Bartlett’s test checks:

A.              Sphericity → A

B.              Mean

C.              Variance

D.              Regression

441.         Scree plot visualizes:

A.              Eigenvalues → A

B.              Means

C.              Variances

D.              Regression

442.         Principal Component Analysis (PCA) reduces:

A.              Dimensionality → A

B.              Mean

C.              Variance

D.              Regression

443.         First principal component maximizes:

A.              Variance → A

B.              Mean

C.              Skewness

D.              Kurtosis
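For two variables, the variance captured by the first principal component can be worked out by hand from the covariance matrix. The sketch below uses the closed-form eigenvalues of a symmetric 2x2 matrix; the data values and function names are assumptions for illustration.

```python
import math

def covariance_2d(xs, ys):
    """Sample variances and covariance of two equally long lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs) / (n - 1)
    syy = sum((y - mean_y) ** 2 for y in ys) / (n - 1)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)
    return sxx, sxy, syy

def eigenvalues_2x2_symmetric(a, b, c):
    """Eigenvalues of [[a, b], [b, c]], largest first."""
    centre = (a + c) / 2
    half_gap = math.sqrt(((a - c) / 2) ** 2 + b ** 2)
    return centre + half_gap, centre - half_gap

xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]       # hypothetical, nearly proportional to xs
sxx, sxy, syy = covariance_2d(xs, ys)
first, second = eigenvalues_2x2_symmetric(sxx, sxy, syy)
explained_by_first = first / (first + second)
print(explained_by_first)
```

The eigenvalues add up to the total variance, so each one divided by the sum gives the share of variance explained by that component.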

444.         Varimax rotation achieves:

A.              Simple structure → A

B.              Maximum variance

C.              Regression

D.              Mean only

445.         K-means clustering minimizes:

A.              Within-cluster sum of squares → A

B.              Between-cluster variance

C.              Mean only

D.              Variance only
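The objective that K-means minimizes can be sketched with a tiny version of the usual assign-then-update loop (Lloyd's algorithm). The points, the starting centroids, and the function names are assumptions for illustration.

```python
def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def assign(points, centroids):
    """Index of the nearest centroid for each point."""
    return [min(range(len(centroids)),
                key=lambda k: squared_distance(p, centroids[k]))
            for p in points]

def update(points, labels, centroids):
    """Move each centroid to the mean of its members (unchanged if empty)."""
    moved = []
    for k, old in enumerate(centroids):
        members = [p for p, label in zip(points, labels) if label == k]
        if members:
            moved.append(tuple(sum(c) / len(members) for c in zip(*members)))
        else:
            moved.append(old)
    return moved

def wcss(points, labels, centroids):
    """Within-cluster sum of squares."""
    return sum(squared_distance(p, centroids[label])
               for p, label in zip(points, labels))

def k_means(points, centroids, n_iter=10):
    for _ in range(n_iter):
        labels = assign(points, centroids)
        centroids = update(points, labels, centroids)
    return assign(points, centroids), centroids

points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (8.0, 8.0), (9.0, 8.0), (8.0, 9.0)]
start = [(0.0, 0.0), (10.0, 10.0)]
initial_wcss = wcss(points, assign(points, start), start)
labels, centroids = k_means(points, start)
final_wcss = wcss(points, labels, centroids)
print(initial_wcss, final_wcss)
```

Neither the assignment step nor the update step can increase the within-cluster sum of squares, which is why the loop settles down.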

446.         Hierarchical clustering produces:

A.              Dendrogram → A

B.              Regression line

C.              Correlation matrix

D.              Factor loadings

447.         Agglomerative clustering starts with:

A.              Each observation as its own cluster → A

B.              One cluster

C.              Random clusters

D.              Mean only

448.         Divisive clustering starts with:

A.              All data in one cluster → A

B.              Each observation

C.              Random clusters

D.              Mean only

449.         DBSCAN identifies:

A.              Density-based clusters → A

B.              Hierarchical clusters

C.              K-means clusters

D.              Regression clusters

450.         Silhouette score measures:

A.              Cluster cohesion and separation → A

B.              Mean

C.              Variance

D.              Standard deviation

451.         High silhouette score indicates:

A.              Well-separated clusters → A

B.              Overlapping clusters

C.              Poor clustering

D.              Random clustering
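The silhouette value of a point is (b - a) / max(a, b), where a is its mean distance to the other members of its own cluster and b is its mean distance to the nearest other cluster. The sketch below computes the average for one-dimensional data; the values, the labelings, and the function name are assumptions for illustration, and it expects at least two clusters.

```python
def silhouette_score(values, labels):
    """Mean silhouette over all points (1-D data, absolute distance)."""
    clusters = {}
    for value, label in zip(values, labels):
        clusters.setdefault(label, []).append(value)
    scores = []
    for value, label in zip(values, labels):
        own = clusters[label]
        if len(own) == 1:
            scores.append(0.0)        # common convention for singleton clusters
            continue
        # The point's distance to itself is 0, so divide by len(own) - 1.
        a = sum(abs(value - other) for other in own) / (len(own) - 1)
        b = min(sum(abs(value - other) for other in members) / len(members)
                for key, members in clusters.items() if key != label)
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)

values = [1, 2, 3, 10, 11, 12]
good = silhouette_score(values, [0, 0, 0, 1, 1, 1])   # matches the two groups
poor = silhouette_score(values, [0, 1, 0, 1, 0, 1])   # mixes the two groups
print(good, poor)
```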

452.         Outlier detection uses:

A.              Z-score, IQR → A

B.              Mean only

C.              Variance only

D.              Median only
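Both rules can be sketched with the standard library. The thresholds used here (|z| > 3 and 1.5 times the IQR) are common conventions rather than fixed rules, and the data set is made up for illustration.

```python
import statistics

def z_score_outliers(values, threshold=3.0):
    """Values more than `threshold` sample standard deviations from the mean."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [v for v in values if abs(v - mean) / sd > threshold]

def iqr_outliers(values, k=1.5):
    """Values outside the fences Q1 - k*IQR and Q3 + k*IQR."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    return [v for v in values if v < q1 - k * iqr or v > q3 + k * iqr]

data = [10, 12, 11, 13, 12, 11, 10, 12, 13, 11] * 2 + [95]
print(z_score_outliers(data), iqr_outliers(data))
```

The IQR rule is the one a boxplot uses to draw its whiskers, and it is less affected by the outlier itself than the z-score, whose mean and standard deviation are both pulled by extreme values.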

453.         Boxplot identifies:

A.              Outliers → A

B.              Mean

C.              Variance

D.              Standard deviation

454.         Leverage points affect:

A.              Regression line → A

B.              Median only

C.              Variance only

D.              Mean only

465.         Cook’s distance identifies:

A.              Influential points → A

B.              Outliers only

C.              Median points

D.              Regular points

466.         Multicollinearity inflates:

A.              Standard errors → A

B.              Means

C.              Medians

D.              Modes

467.         Variance Inflation Factor (VIF) > 10 indicates:

A.              Severe multicollinearity → A

B.              No correlation

C.              Independence

D.              Normality
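With only two predictors, the VIF reduces to 1 / (1 - r squared), where r is their correlation. The sketch below uses that special case; with more predictors, r squared would come from regressing each predictor on all the others. The data values and function names are assumptions for illustration.

```python
import math

def pearson_r(xs, ys):
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def vif_two_predictors(x1, x2):
    """VIF for either predictor when the model has exactly two predictors."""
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r * r)

x1 = [1, 2, 3, 4, 5, 6]
nearly_collinear = [2.1, 3.9, 6.2, 7.8, 10.1, 12.0]   # roughly 2 * x1
unrelated = [3, 1, 4, 1, 5, 2]

vif_high = vif_two_predictors(x1, nearly_collinear)
vif_low = vif_two_predictors(x1, unrelated)
print(vif_high, vif_low)
```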

468.         Heteroscedasticity violates:

A.              Constant variance assumption → A

B.              Linearity

C.              Normality

D.              Independence

469.         Autocorrelation violates:

A.              Independence assumption → A

B.              Linearity

C.              Normality

D.              Variance

470.         Time series decomposition separates:

A.              Trend, seasonality, residual → A

B.              Mean only

C.              Variance only

D.              Median only

471.         ARIMA model includes:

A.              AR + I + MA → A

B.              AR only

C.              MA only

D.              Differencing only

472.         Stationarity is required for:

A.              ARMA (ARIMA reaches it by differencing) → A

B.              Regression

C.              ANOVA

D.              Chi-square

473.         Exponential smoothing is used for:

A.              Forecasting → A

B.              Regression

C.              Correlation

D.              Variance estimation

474.         Holt-Winters method models:

A.              Trend + seasonality → A

B.              Noise only

C.              Mean only

D.              Variance only

475.         Bootstrapping resamples:

A.              With replacement → A

B.              Without replacement

C.              Randomly once

D.              Deterministically
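Resampling with replacement can be sketched as follows: draw many samples of the original size from the data, compute the statistic on each, and use the spread of those values as a standard error. The data, the number of resamples, and the seed are assumptions for illustration.

```python
import random
import statistics

def bootstrap_means(data, n_resamples=2000, seed=0):
    """Means of resamples drawn with replacement, each the size of the data."""
    rng = random.Random(seed)
    n = len(data)
    return [statistics.fmean(rng.choices(data, k=n)) for _ in range(n_resamples)]

data = [2, 4, 4, 5, 7, 9, 10, 12]
means = bootstrap_means(data)
bootstrap_se = statistics.stdev(means)   # bootstrap standard error of the mean
print(bootstrap_se)
```

Because `random.choices` samples with replacement, some observations appear more than once in a resample and others not at all.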

476.         Jackknife resampling removes:

A.              One observation at a time → A

B.              Half the sample

C.              Entire sample

D.              Random subset
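The leave-one-out idea can be sketched in a few lines. For the sample mean, the jackknife standard error works out to exactly s / sqrt(n), which makes it a handy check. The data and the function name are assumptions for illustration.

```python
import math
import statistics

def jackknife(data, statistic):
    """Leave-one-out estimates and the jackknife standard error."""
    n = len(data)
    leave_one_out = [statistic(data[:i] + data[i + 1:]) for i in range(n)]
    centre = sum(leave_one_out) / n
    se = math.sqrt((n - 1) / n * sum((v - centre) ** 2 for v in leave_one_out))
    return leave_one_out, se

data = [2, 4, 4, 5, 7, 9, 10, 12]
estimates, jackknife_se = jackknife(data, statistics.fmean)
classical_se = statistics.stdev(data) / math.sqrt(len(data))
print(jackknife_se, classical_se)
```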

479.         Eigenvalues in PCA indicate:

A.              Variance explained → A

B.              Mean only

C.              Regression coefficient

D.              Skewness

480.         Factor analysis identifies:

A.              Latent variables → A

B.              Observed variables only

C.              Means

D.              Variances

488.         Bayesian statistics updates:

A.              Prior beliefs with data → A

B.              Only mean

C.              Only variance

D.              Only median

489.         Posterior distribution =

A.              Updated probability distribution after observing data → A

B.              Prior only

C.              Likelihood only

D.              Mean only

499.         Overdispersion in count (Poisson) data occurs when:

A.              Variance > mean → A

B.              Variance < mean

C.              Variance = mean

D.              Mean = 0

500.         Negative binomial regression handles:

A.              Overdispersed count data → A

B.              Binary data

C.              Continuous data

D.              Ordinal data
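A quick check for overdispersion is the ratio of the sample variance to the sample mean, which should be close to 1 for Poisson-like counts. The counts below are made up for illustration, and the function name is an assumption.

```python
import statistics

def dispersion_index(counts):
    """Sample variance divided by sample mean (about 1 for Poisson data)."""
    return statistics.variance(counts) / statistics.mean(counts)

counts = [0, 0, 1, 0, 12, 0, 3, 0, 0, 9]   # hypothetical counts with many zeros
ratio = dispersion_index(counts)
print(ratio)
```

A ratio well above 1 suggests that a Poisson model would understate the variability, which is the situation negative binomial regression is meant for.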


Contact Us

SALEHE NJOHOLE P.O.BOX 2428, DAR ES SALAAM, TANZANIA EAST AFRIKA. Call: 0692 127 931