
Saleh Njohole


Tuesday, June 9, 2026

500+ STATISTICIAN II INTERVIEW PREPARATION QUESTIONS AND ANSWERS


1.               What is the mean of 2, 4, 6, 8?

A.              4

B.              5

C.              6

D.              7 → B

2.               Which measure is most affected by outliers?

A.              Median

B.              Mode

C.              Mean

D.              Range → C
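
The two questions above can be checked with a minimal sketch using Python's standard `statistics` module. The extra value 100 is an assumed outlier added only for illustration; it is not part of the quiz.

```python
import statistics

values = [2, 4, 6, 8]
mean_clean = statistics.mean(values)        # (2 + 4 + 6 + 8) / 4
median_clean = statistics.median(values)    # average of the two middle values

# Append one extreme value (hypothetical) and recompute both measures.
with_outlier = values + [100]
mean_outlier = statistics.mean(with_outlier)
median_outlier = statistics.median(with_outlier)

print("without outlier:", mean_clean, median_clean)
print("with outlier:   ", mean_outlier, median_outlier)
```

The mean moves from 5 to 24 while the median only moves from 5 to 6, which is why the mean is the measure most affected by outliers.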

3.               The median of an odd-numbered dataset is:

A.              Average of all values

B.              Middle value

C.              Most frequent value

D.              Smallest value → B

4.               Standard deviation measures:

A.              Central tendency

B.              Dispersion

C.              Skewness

D.              Probability → B

5.               Which distribution is symmetric?

A.              Normal

B.              Exponential

C.              Poisson

D.              Binomial (skewed) → A

6.               Mode is:

A.              Average

B.              Most frequent value

C.              Middle value

D.              Range → B

7.               Variance is the square of:

A.              Mean

B.              Median

C.              Standard deviation

D.              Mode → C

8.               Range is calculated as:

A.              Max − Min

B.              Mean − Median

C.              Median − Mode

D.              Sum of values → A

9.               A dataset with no variability has standard deviation:

A.              0

B.              1

C.             

D.              -1 → A
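
As a small illustration of the variance, range, and zero-variability questions above, the sketch below uses the population formulas from Python's `statistics` module. The datasets are assumed examples.

```python
import math
import statistics

data = [2, 4, 6, 8]
pop_variance = statistics.pvariance(data)   # mean squared deviation from the mean
pop_sd = statistics.pstdev(data)            # square root of the variance
data_range = max(data) - min(data)          # Max - Min

# A dataset with no variability (hypothetical): every value is the same.
constant = [7, 7, 7, 7]
constant_sd = statistics.pstdev(constant)

print(pop_variance, pop_sd, data_range, constant_sd)
print(math.isclose(pop_sd ** 2, pop_variance))
```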

10.            Skewness measures:

A.              Spread

B.              Shape of distribution

C.              Center

D.              Size → B

Probability (questions 12–21)

12.            Probability values lie between:

A.              -1 and 1

B.              0 and 1

C.              1 and 10

D.              0 and 10 → B

13.            Probability of a sure event is:

A.              0

B.              0.5

C.              1

D.              2 → C

14.            If events A and B are independent:

A.              P(A∩B)=P(A)+P(B)

B.              P(A∩B)=P(A)P(B)

C.              P(A|B)=0

D.              P(A)=P(B) → B
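
The product rule for independent events can be verified by enumeration. This is a minimal sketch assuming two fair dice; the events A and B below are chosen only for illustration.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes of rolling two fair dice.
outcomes = list(product(range(1, 7), repeat=2))

def prob(event):
    """Probability of an event given as a predicate over (first, second)."""
    favourable = sum(1 for outcome in outcomes if event(outcome))
    return Fraction(favourable, len(outcomes))

p_a = prob(lambda o: o[0] % 2 == 0)                  # A: first die is even
p_b = prob(lambda o: o[1] == 6)                      # B: second die shows 6
p_ab = prob(lambda o: o[0] % 2 == 0 and o[1] == 6)   # A and B together

print(p_a, p_b, p_ab, p_ab == p_a * p_b)
```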

15.            Probability of the complement of event A is:

A.              A

B.              1 − P(A)

C.              P(A)²

D.              0 → B

16.            Conditional probability is:

A.              P(A)

B.              P(A|B)

C.              P(B|A)

D.              Both B and C → D

17.            Binomial distribution requires:

A.              Continuous data

B.              Fixed trials

C.              Infinite trials

D.              Negative values → B

18.            Expected value of a fair die:

A.              2.5

B.              3

C.              3.5

D.              4 → C
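
The expected value of a fair die is the probability-weighted sum of its faces. A short sketch with exact fractions:

```python
from fractions import Fraction

faces = range(1, 7)

# Each face has probability 1/6, so E[X] = sum(face * 1/6).
expected_value = sum(Fraction(face, 6) for face in faces)

print(expected_value, float(expected_value))
```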

19.            Law of large numbers states:

A.              Sample mean converges to population mean

B.              Sample decreases

C.              Variance increases

D.              Data disappears → A

20.            Random variable can be:

A.              Only discrete

B.              Only continuous

C.              Both

D.              None → C

21.            Poisson distribution models:

A.              Continuous data

B.              Rare events

C.              Large values

D.              Negative values → B

Inferential Statistics (questions 23–32)

23.            Null hypothesis represents:

A.              Alternative claim

B.              No effect

C.              Strong effect

D.              Prediction → B

24.            p-value measures:

A.              Mean

B.              Evidence against H₀

C.              Sample size

D.              Variance → B

25.            If p-value < 0.05:

A.              Accept H₀

B.              Reject H₀

C.              Ignore

D.              Increase sample → B

26.            Type I error is:

A.              False negative

B.              False positive

C.              True positive

D.              True negative → B

27.            Type II error is:

A.              Reject true H₀

B.              Fail to reject false H₀

C.              Correct decision

D.              None → B

28.            Confidence interval provides:

A.              Exact value

B.              Range estimate

C.              Mean only

D.              Error only → B

29.            Larger sample size leads to:

A.              Larger error

B.              Smaller error

C.              No change

D.              Infinite error → B

30.            t-test is used when:

A.              Large sample

B.              Unknown variance

C.              Known variance

D.              Infinite sample → B

31.            Z-test requires:

A.              Small sample

B.              Known variance

C.              Unknown variance

D.              No data → B

32.            ANOVA compares:

A.              Two means

B.              Multiple means

C.              Variance only

D.              Probabilities → B

Regression & Data Analysis (questions 34–43)

34.            Regression analysis studies:

A.              Distribution

B.              Relationship between variables

C.              Mean

D.              Variance → B

35.            Dependent variable is:

A.              Output

B.              Input

C.              Constant

D.              Random → A

36.            Independent variable is:

A.              Output

B.              Predictor

C.              Result

D.              Error → B

37.            R² measures:

A.              Error

B.              Fit of model

C.              Mean

D.              Probability → B

38.            Correlation ranges between:

A.              0 and 1

B.              -1 and 1

C.              1 and 10

D.              -10 and 10 → B

39.            Perfect positive correlation is:

A.              0

B.              -1

C.              1

D.              2 → C

40.            Multicollinearity affects:

A.              Accuracy

B.              Predictor independence

C.              Mean

D.              Variance → B

41.            Residual is:

A.              Observed − Predicted

B.              Predicted − Observed

C.              Mean − Median

D.              Max − Min → A
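
To make "residual = observed − predicted" concrete, here is a minimal sketch that fits a least-squares line by hand and computes the residuals. The data points and the helper name `fit_line` are assumptions made for illustration.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the ordinary least-squares line."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]   # hypothetical observations

slope, intercept = fit_line(xs, ys)
predicted = [intercept + slope * x for x in xs]

# Residual = observed value minus predicted value.
residuals = [y - p for y, p in zip(ys, predicted)]

print(slope, intercept)
print(residuals)
```

With an intercept in the model, the least-squares residuals sum to zero up to floating-point rounding.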

42.            Linear regression assumes:

A.              Non-linearity

B.              Linearity

C.              Randomness only

D.              Discrete values → B

43.            Overfitting occurs when:

A.              Model too simple

B.              Model too complex

C.              No data

D.              Small variance → B

Scenario & Practical Questions (questions 45–54)

45.            Data has extreme outliers. Best measure of central tendency?

A.              Mean

B.              Median

C.              Variance

D.              Range → B

46.            Small sample size analysis uses:

A.              Z-test

B.              t-test

C.              ANOVA

D.              Regression → B

47.            Comparing 3 groups’ means:

A.              t-test

B.              Z-test

C.              ANOVA

D.              Chi-square → C

48.            Categorical data test:

A.              t-test

B.              Z-test

C.              Chi-square

D.              Regression → C

49.            Checking model accuracy:

A.             

B.              Mean

C.              Mode

D.              Range → A

50.            Missing data handling:

A.              Ignore

B.              Impute

C.              Delete always

D.              Random guess → B

51.            Time series data requires:

A.              Regression

B.              Trend analysis

C.              ANOVA

D.              Chi-square → B

52.            Sampling bias affects:

A.              Accuracy

B.              Mean only

C.              Variance only

D.              Nothing → A

53.            Large variance indicates:

A.              Low spread

B.              High spread

C.              No spread

D.              Constant data → B

54.            Best visualization for distribution:

A.              Pie chart

B.              Histogram

C.              Table

D.              Text → B

55.            What does a histogram display?

A.              Relationship between variables

B.              Frequency distribution

C.              Correlation

D.              Regression → B

56.            A scatter plot is used to show:

A.              Distribution

B.              Relationship between two variables

C.              Frequency

D.              Mean → B

57.            If two variables are perfectly negatively correlated, r =

A.              1

B.              0

C.              -1

D.              0.5 → C

58.            Sampling error occurs due to:

A.              Bias

B.              Random variation

C.              Calculation mistake

D.              Data entry error → B

59.            Non-sampling error includes:

A.              Random error

B.              Sampling fluctuation

C.              Measurement error

D.              Sample size → C

60.            A census studies:

A.              Sample

B.              Population

C.              Subset

D.              Variable → B

61.            Stratified sampling divides population into:

A.              Equal parts

B.              Random parts

C.              Homogeneous groups

D.              Heterogeneous groups → C

62.            Cluster sampling selects:

A.              Individuals

B.              Groups

C.              Variables

D.              Means → B

63.            Systematic sampling selects every:

A.              Random item

B.              nth item

C.              First item

D.              Last item → B

64.            A parameter is usually denoted by:

A.              Greek letters

B.              Numbers

C.              Roman letters

D.              Symbols only → A

65.            A statistic is usually denoted by:

A.              Greek letters

B.              Roman letters

C.              Symbols only

D.              Numbers only → B

66.            Degrees of freedom for variance:

A.              n

B.              n − 1

C.              n + 1

D.              n − 2 → B

67.            A two-tailed test checks:

A.              One direction

B.              Both directions

C.              No direction

D.              Mean only → B

68.            A one-tailed test checks:

A.              Both sides

B.              One direction

C.              Mean only

D.              Variance only → B

69.            Significance level is denoted by:

A.              β

B.              α

C.              μ

D.              σ → B

70.            If α = 0.01, confidence level is:

A.              90%

B.              95%

C.              99%

D.              100% → C

71.            Normal distribution has mean = median =

A.              Mode

B.              Variance

C.              Range

D.              Skewness → A

72.            In a normal distribution, about 68% of the data lies within:

A.              1 SD

B.              2 SD

C.              3 SD

D.              4 SD → A

73.            In a normal distribution, about 95% of the data lies within:

A.              1 SD

B.              2 SD

C.              3 SD

D.              4 SD → B
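
The 68% and 95% figures are rounded values. They can be checked with `statistics.NormalDist` from the Python standard library; the helper name `share_within` is an assumption for this sketch.

```python
from statistics import NormalDist

std_normal = NormalDist(mu=0, sigma=1)

def share_within(k):
    """Share of a normal distribution within k standard deviations of the mean."""
    return std_normal.cdf(k) - std_normal.cdf(-k)

for k in (1, 2, 3):
    print(k, round(share_within(k), 4))
```

The exact 95% interval corresponds to about 1.96 standard deviations, so "2 SD" is a common rounding.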

74.            Z-score measures:

A.              Raw value

B.              Standardized value

C.              Mean

D.              Variance → B

75.            Z-score formula includes:

A.              Mean and variance

B.              Mean and standard deviation

C.              Median and mode

D.              Range and IQR → B

76.            If Z = 0, value equals:

A.              Mean

B.              Median

C.              Mode

D.              Range → A
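
The Z-score questions above rest on one formula, z = (x − mean) / standard deviation. A minimal sketch, with an assumed exam-score scale of mean 70 and SD 10:

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies above the mean."""
    return (x - mean) / sd

mean, sd = 70, 10            # hypothetical population values

z_high = z_score(85, mean, sd)
z_at_mean = z_score(70, mean, sd)   # a value equal to the mean gives Z = 0

print(z_high, z_at_mean)
```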

77.            Sampling distribution refers to:

A.              Population data

B.              Distribution of sample statistic

C.              Raw data

D.              Mean only → B

78.            A large sample size reduces:

A.              Bias

B.              Standard error

C.              Mean

D.              Variance → B

79.            A biased estimator:

A.              Equals parameter

B.              Deviates systematically

C.              Is always correct

D.              Has zero variance → B

80.            Efficiency of estimator relates to:

A.              Bias

B.              Variance

C.              Mean

D.              Sample size → B

81.            Consistency means estimator:

A.              Changes

B.              Converges to true value

C.              Is biased

D.              Is random → B

82.            A likelihood ratio test compares:

A.              Means

B.              Models

C.              Variances

D.              Probabilities → B

83.            p-hacking refers to:

A.              Correct testing

B.              Manipulating results

C.              Data cleaning

D.              Sampling → B

84.            A control group is used to:

A.              Compare results

B.              Increase bias

C.              Reduce data

D.              Ignore treatment → A

85.            Experimental design aims to:

A.              Increase bias

B.              Reduce bias

C.              Ignore variables

D.              Increase variance → B

86.            Randomization helps:

A.              Increase error

B.              Reduce bias

C.              Increase bias

D.              Ignore data → B

87.            Blocking is used to:

A.              Ignore variables

B.              Control variability

C.              Increase randomness

D.              Reduce sample → B

88.            Factorial design studies:

A.              One factor

B.              Multiple factors

C.              No factors

D.              Random data → B

89.            Interaction effect means:

A.              No effect

B.              Combined effect

C.              Single effect

D.              Random effect → B

90.            Residual plot helps detect:

A.              Mean

B.              Model assumptions

C.              Variance only

D.              Sample size → B

91.            Leverage points influence:

A.              Mean

B.              Regression line

C.              Variance

D.              Median → B

92.            Influential points affect:

A.              Model strongly

B.              Mean only

C.              Variance only

D.              Nothing → A

93.            Data transformation helps:

A.              Normalize data

B.              Increase bias

C.              Reduce sample

D.              Ignore outliers → A

94.            Log transformation is used for:

A.              Skewed data

B.              Normal data

C.              Small data

D.              Categorical data → A

95.            Scaling data helps:

A.              Model performance

B.              Increase error

C.              Reduce variables

D.              Ignore data → A

96.            Standardization results in:

A.              Mean 1, SD 0

B.              Mean 0, SD 1

C.              Mean 1, SD 1

D.              Mean 0, SD 0 → B

97.            Normalization scales data between:

A.              -1 to 1

B.              0 to 1

C.              1 to 10

D.              -10 to 10 → B
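
Standardization and min-max normalization, as asked about above, can be sketched as follows. The function names and the sample data are assumptions for illustration.

```python
import statistics

def standardize(values):
    """Z-score scaling: the result has mean 0 and (population) SD 1."""
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)
    return [(v - mean) / sd for v in values]

def min_max(values):
    """Min-max normalization: the result lies between 0 and 1."""
    low, high = min(values), max(values)
    return [(v - low) / (high - low) for v in values]

data = [10, 20, 30, 40, 50]
z_values = standardize(data)
scaled = min_max(data)

print(z_values)
print(scaled)
```

Both helpers assume the values are not all identical; otherwise the denominator would be zero.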

98.            PCA reduces:

A.              Sample size

B.              Dimensions

C.              Mean

D.              Variance → B

99.            Eigenvalues measure:

A.              Variance explained

B.              Mean

C.              Probability

D.              Error → A

100.         Eigenvectors represent:

A.              Direction

B.              Magnitude

C.              Mean

D.              Variance → A

101.         K-means clustering requires:

A.              Labels

B.              No labels

C.              Mean only

D.              Variance only → B

102.         Number of clusters in K-means is:

A.              Fixed

B.              Unknown

C.              Predefined

D.              Infinite → C

103.         Elbow method is used to:

A.              Reduce bias

B.              Choose clusters

C.              Increase data

D.              Normalize data → B

104.         Silhouette score measures:

A.              Accuracy

B.              Cluster quality

C.              Mean

D.              Variance → B

105.         What is the harmonic mean mainly used for?

A.              Averaging ratios

B.              Averaging sums

C.              Categorical data

D.              Large samples → A
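
A typical case for the harmonic mean is averaging rates. In this sketch a trip is assumed to cover the same distance at 40 km/h and then at 60 km/h; the figures are illustrative.

```python
import statistics

speeds = [40, 60]   # km/h over two equal distances (hypothetical)

harmonic = statistics.harmonic_mean(speeds)   # correct average speed
arithmetic = statistics.mean(speeds)          # overstates the average speed

print(harmonic, arithmetic)
```

The harmonic mean gives about 48 km/h, below the arithmetic mean of 50 km/h, because more time is spent at the slower speed.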

107.         If all values in a dataset are equal, skewness is:

A.              Positive

B.              Negative

C.              Zero

D.              Infinite → C

109.         A platykurtic distribution is:

A.              Highly peaked

B.              Flat

C.              Skewed

D.              Symmetric → B

111.         A leptokurtic distribution is:

A.              Flat

B.              Highly peaked

C.              Uniform

D.              Random → B

113.         In simple linear regression, the slope represents:

A.              Intercept

B.              Change in Y per unit X

C.              Error

D.              Variance → B

115.         Intercept in regression is value of Y when:

A.              X = 1

B.              X = 0

C.              X = mean

D.              X = variance → B

117.         Residuals should ideally be:

A.              Patterned

B.              Random

C.              Increasing

D.              Decreasing → B

119.         Durbin-Watson test detects:

A.              Multicollinearity

B.              Autocorrelation

C.              Normality

D.              Heteroscedasticity → B

121.         Variance Inflation Factor (VIF) detects:

A.              Autocorrelation

B.              Multicollinearity

C.              Normality

D.              Skewness → B

123.         If VIF is high, it indicates:

A.              Good model

B.              Multicollinearity

C.              Low variance

D.              Independence → B

125.         A confounding variable:

A.              Has no effect

B.              Distorts relationship

C.              Is dependent variable

D.              Is constant → B

127.         Endogeneity refers to:

A.              External variables

B.              Internal bias

C.              Random error

D.              Constant data → B

129.         Panel data combines:

A.              Cross-sectional only

B.              Time series only

C.              Both cross-sectional & time series

D.              None → C

131.         Fixed effects model controls for:

A.              Time variation

B.              Individual differences

C.              Mean only

D.              Variance only → B

133.         Random effects model assumes:

A.              No variation

B.              Random variation

C.              Fixed variation

D.              No correlation → B

135.         Akaike Information Criterion (AIC) favors:

A.              Simpler models

B.              Complex models

C.              Balanced fit

D.              Random models → C

137.         Bayesian Information Criterion (BIC) penalizes complexity:

A.              Less

B.              More

C.              Equal

D.              None → B

139.         A dummy variable takes values:

A.              0 or 1

B.              1 or 2

C.              -1 or 1

D.              Any number → A

141.         Interaction term in regression captures:

A.              Independent effect

B.              Combined effect

C.              Error

D.              Mean → B

143.         Elasticity measures:

A.              Absolute change

B.              Relative change

C.              Mean

D.              Variance → B

145.         In hypothesis testing, β represents:

A.              Type I error

B.              Type II error

C.              Mean

D.              Variance → B

147.         Power increases when:

A.              Sample size increases

B.              Sample size decreases

C.              Variance increases

D.              Bias increases → A

149.         A non-parametric test does not assume:

A.              Mean

B.              Distribution

C.              Variance

D.              Data → B

151.         Mann-Whitney test compares:

A.              Means

B.              Medians

C.              Variances

D.              Probabilities → B

153.         Wilcoxon test is used for:

A.              Paired data

B.              Independent data

C.              Random data

D.              Large samples → A

155.         Kruskal-Wallis test compares:

A.              Two groups

B.              Multiple groups

C.              One group

D.              Variance only → B

157.         Spearman correlation measures:

A.              Linear relation

B.              Rank relation

C.              Variance

D.              Mean → B

159.         Pearson correlation measures:

A.              Rank relation

B.              Linear relation

C.              Nonlinear relation

D.              Variance → B
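
The difference between Pearson (linear) and Spearman (rank) correlation shows up on data that is monotonic but not linear. The sketch below is written by hand with assumed helper names; the rank function assumes no tied values to keep it short.

```python
import math

def pearson(xs, ys):
    """Pearson correlation: strength of the linear relationship."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)

def ranks(values):
    """Ranks 1..n of the values; assumes there are no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result

def spearman(xs, ys):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(xs), ranks(ys))

xs = [1, 2, 3, 4, 5]
ys = [x ** 3 for x in xs]   # monotonic but non-linear (hypothetical data)

r_pearson = pearson(xs, ys)
r_spearman = spearman(xs, ys)
print(r_pearson, r_spearman)
```

Here the Spearman coefficient is 1 (up to rounding) because the ordering is perfectly preserved, while the Pearson coefficient is below 1 because the relationship is curved.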

161.         A p-value close to 0 indicates:

A.              Weak evidence

B.              Strong evidence

C.              No evidence

D.              Infinite evidence → B

163.         Bonferroni correction is used for:

A.              Multiple testing

B.              Sampling

C.              Regression

D.              Clustering → A
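
The Bonferroni correction divides the significance level by the number of tests. A minimal sketch, with assumed p-values and a hypothetical helper name:

```python
def bonferroni_reject(p_values, alpha=0.05):
    """Return one True/False decision per test at the corrected threshold."""
    threshold = alpha / len(p_values)   # corrected per-test significance level
    return [p <= threshold for p in p_values]

p_values = [0.001, 0.02, 0.04, 0.30]    # hypothetical results of four tests
decisions = bonferroni_reject(p_values)

print(decisions)
```

With four tests the per-test threshold becomes 0.0125, so only the first p-value remains significant even though three of them are below 0.05.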

165.         Missing completely at random (MCAR) means:

A.              Depends on data

B.              Independent of data

C.              Depends on outcome

D.              Systematic → B

167.         Missing at random (MAR) depends on:

A.              Observed data

B.              Unobserved data

C.              None

D.              Random guess → A

169.         Not missing at random (NMAR) depends on:

A.              Observed only

B.              Unobserved

C.              Mean

D.              Variance → B

171.         Imputation replaces:

A.              Outliers

B.              Missing values

C.              Means

D.              Variance → B

173.         Mean imputation may:

A.              Increase variance

B.              Reduce variance

C.              Increase bias

D.              Reduce bias → B

175.         Weighted mean assigns:

A.              Equal weights

B.              Different weights

C.              No weights

D.              Random weights → B

177.         Survey weights correct:

A.              Bias

B.              Mean

C.              Variance

D.              Range → A

179.         Design effect measures:

A.              Efficiency of design

B.              Mean

C.              Variance

D.              Sample size → A

181.         ROC curve plots:

A.              Precision vs Recall

B.              TPR vs FPR

C.              Mean vs variance

D.              Error vs accuracy → B

183.         AUC measures:

A.              Accuracy

B.              Model performance

C.              Mean

D.              Variance → B

185.         Sensitivity is:

A.              True negative rate

B.              True positive rate

C.              False positive rate

D.              Error rate → B

187.         Specificity is:

A.              True negative rate

B.              True positive rate

C.              Error rate

D.              Mean → A

189.         False positive rate equals:

A.              1 − specificity

B.              Specificity

C.              Sensitivity

D.              Accuracy → A
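
Sensitivity, specificity, and the false positive rate all come from the four cells of a confusion matrix. The counts below are assumed values for illustration.

```python
# Hypothetical confusion-matrix counts for a binary classifier.
true_pos, false_neg = 40, 10    # actual positives
false_pos, true_neg = 5, 45     # actual negatives

sensitivity = true_pos / (true_pos + false_neg)            # true positive rate
specificity = true_neg / (true_neg + false_pos)            # true negative rate
false_positive_rate = false_pos / (false_pos + true_neg)   # equals 1 - specificity

print(sensitivity, specificity, false_positive_rate)
```

An ROC curve plots the true positive rate against the false positive rate as the classification threshold varies.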

191.         Data leakage occurs when:

A.              Training uses future info

B.              Testing uses past info

C.              Data is missing

D.              Data is clean → A

193.         Train-test split is used to:

A.              Increase bias

B.              Evaluate model

C.              Reduce data

D.              Normalize data → B

195.         Underfitting occurs when:

A.              Model too simple

B.              Model too complex

C.              Too much data

D.              No data → A

197.         Bias-variance tradeoff balances:

A.              Accuracy & error

B.              Bias & variance

C.              Mean & median

D.              Range & IQR → B

199.         Gradient descent is used to:

A.              Maximize error

B.              Minimize loss

C.              Increase bias

D.              Reduce sample → B

201.         Loss function measures:

A.              Accuracy

B.              Error

C.              Mean

D.              Variance → B

203.         Regularization helps:

A.              Prevent overfitting

B.              Increase error

C.              Reduce data

D.              Ignore variables → A

205.         What is the main purpose of descriptive statistics?

A.              Make predictions

B.              Summarize data

C.              Test hypotheses

D.              Build models → B

206.         Inferential statistics is used to:

A.              Summarize data

B.              Describe sample

C.              Draw conclusions about population

D.              Organize data → C

207.         A frequency polygon is used to:

A.              Show relationship

B.              Show distribution

C.              Show mean

D.              Show variance → B

208.         Bar charts are best for:

A.              Continuous data

B.              Categorical data

C.              Time series

D.              Correlation → B

209.         Pie charts show:

A.              Trends

B.              Proportions

C.              Variance

D.              Mean → B

210.         Stem-and-leaf plot shows:

A.              Mean

B.              Raw data distribution

C.              Correlation

D.              Regression → B

211.         A bimodal distribution has:

A.              One mode

B.              Two modes

C.              No mode

D.              Many modes → B

212.         A uniform distribution has:

A.              Equal frequencies

B.              Skewness

C.              High variance

D.              Low variance → A

213.         A right-skewed distribution has:

A.              Tail on left

B.              Tail on right

C.              No tail

D.              Symmetric → B

214.         A left-skewed distribution has:

A.              Tail on left

B.              Tail on right

C.              Symmetric

D.              No tail → A

215.         Mean deviation is taken from:

A.              Mean or median

B.              Mode only

C.              Range

D.              Variance → A

216.         Quartile deviation equals:

A.              Q3 − Q1

B.              (Q3 − Q1)/2

C.              Q1 − Q3

D.              Mean − median → B

217.         Standard error decreases when:

A.              Sample size increases

B.              Sample size decreases

C.              Variance increases

D.              Mean increases → A
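
The standard error of the mean is the standard deviation divided by the square root of the sample size. A short sketch, assuming a standard deviation of 10:

```python
import math

def standard_error(sd, n):
    """Standard error of the sample mean for sample size n."""
    return sd / math.sqrt(n)

sd = 10                          # hypothetical standard deviation
se_small = standard_error(sd, 25)
se_large = standard_error(sd, 100)

print(se_small, se_large)
```

Quadrupling the sample size from 25 to 100 halves the standard error, from 2 to 1.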

218.         A large p-value suggests:

A.              Reject H₀

B.              Weak evidence

C.              Strong evidence

D.              Significant result → B

219.         Hypothesis testing begins with:

A.              Data collection

B.              Null hypothesis

C.              Conclusion

D.              Graph → B

220.         A parameter is fixed but:

A.              Known

B.              Unknown

C.              Random

D.              Variable → B

221.         A statistic is:

A.              Fixed

B.              Known

C.              Random

D.              Constant → C

222.         Sampling distribution of mean is:

A.              Always normal

B.              Approximately normal

C.              Uniform

D.              Skewed → B

223.         Standard normal distribution has mean:

A.              1

B.              0

C.              -1

D.              100 → B

224.         Standard normal distribution has variance:

A.              0

B.              1

C.              2

D.              10 → B

225.         Z-table gives:

A.              Mean

B.              Probability

C.              Variance

D.              Mode → B

226.         t-distribution is used when:

A.              Large sample

B.              Small sample

C.              Infinite sample

D.              No sample → B

227.         t-distribution approaches normal when:

A.              Sample decreases

B.              Sample increases

C.              Variance decreases

D.              Mean increases → B

228.         Chi-square distribution is used for:

A.              Means

B.              Variances

C.              Categorical data

D.              Correlation → C

229.         F-distribution is used in:

A.              Regression

B.              ANOVA

C.              Probability

D.              Mean → B

230.         ANOVA tests equality of:

A.              Variances

B.              Means

C.              Probabilities

D.              Medians → B

231.         Degrees of freedom in ANOVA depend on:

A.              Mean

B.              Sample size

C.              Groups

D.              Both B and C → D

232.         Residual sum of squares measures:

A.              Explained variation

B.              Unexplained variation

C.              Total variation

D.              Mean → B

233.         Total sum of squares equals:

A.              Explained + residual

B.              Mean + variance

C.              Mode + median

D.              Range + IQR → A
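
The identity "total = explained + residual" holds for a least-squares line with an intercept. The sketch below checks it numerically on assumed data; the helper name `least_squares` is hypothetical.

```python
import math

def least_squares(xs, ys):
    """Return (slope, intercept) of the ordinary least-squares line."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return slope, mean_y - slope * mean_x

xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]   # hypothetical observations

slope, intercept = least_squares(xs, ys)
fitted = [intercept + slope * x for x in xs]
mean_y = sum(ys) / len(ys)

total_ss = sum((y - mean_y) ** 2 for y in ys)                # total variation
explained_ss = sum((f - mean_y) ** 2 for f in fitted)        # explained by the line
residual_ss = sum((y - f) ** 2 for y, f in zip(ys, fitted))  # unexplained

r_squared = 1 - residual_ss / total_ss

print(math.isclose(total_ss, explained_ss + residual_ss))
print(r_squared)
```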

234.         Regression coefficient shows:

A.              Strength

B.              Direction

C.              Change

D.              All of the above → D

235.         Perfect fit means R² equals:

A.              0

B.              0.5

C.              1

D.              -1 → C

236.         If R² = 0, model explains:

A.              All variation

B.              No variation

C.              Half variation

D.              Negative variation → B

237.         Multicollinearity increases:

A.              Accuracy

B.              Variance of coefficients

C.              Mean

D.              Sample size → B

238.         Ridge regression adds:

A.              L1 penalty

B.              L2 penalty

C.              No penalty

D.              Random penalty → B

239.         Lasso regression adds:

A.              L1 penalty

B.              L2 penalty

C.              No penalty

D.              Random penalty → A

240.         Overfitting leads to:

A.              Poor training

B.              Poor generalization

C.              High bias

D.              Low variance → B

241.         Underfitting leads to:

A.              High bias

B.              High variance

C.              Perfect fit

D.              Low bias → A

242.         Bias is:

A.              Error from assumptions

B.              Random error

C.              Sampling error

D.              Measurement error → A

243.         Variance is:

A.              Error variability

B.              Mean

C.              Mode

D.              Range → A

244.         Bootstrap method uses:

A.              Replacement

B.              No replacement

C.              Fixed data

D.              Random guess → A

245.         Jackknife method uses:

A.              All data

B.              Leave-one-out

C.              Sampling

D.              Mean → B
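
Worked example for Questions 244 and 245: a minimal sketch contrasting the two resampling schemes for the sample mean. The helper names `bootstrap_means` and `jackknife_means`, the seed, and the number of resamples are assumptions made for illustration.

```python
import random
import statistics


def bootstrap_means(data, n_resamples=1000, seed=0):
    """Means of resamples drawn WITH replacement, each the size of the data."""
    rng = random.Random(seed)
    n = len(data)
    return [statistics.mean(rng.choices(data, k=n)) for _ in range(n_resamples)]


def jackknife_means(data):
    """Means of the n leave-one-out subsamples."""
    return [statistics.mean(data[:i] + data[i + 1:]) for i in range(len(data))]


data = [2.0, 4.0, 6.0, 8.0]
print(jackknife_means(data))
print(statistics.stdev(bootstrap_means(data)))  # rough bootstrap standard error
```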

246.         Cross-validation splits data into:

A.              One set

B.              Two or more sets

C.              No sets

D.              Infinite sets → B

247.         K-fold cross-validation uses:

A.              One fold

B.              K partitions

C.              Two partitions

D.              Infinite folds → B
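
Worked example for Questions 246 and 247: a minimal sketch of how K-fold cross-validation partitions row indices. The function name `k_fold_indices` is hypothetical, and the sketch uses contiguous folds without shuffling, which is a simplifying assumption.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for k contiguous folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    # Spread any remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size


for train_idx, test_idx in k_fold_indices(10, 3):
    print(len(train_idx), test_idx)
```

Each observation lands in exactly one test fold, so every row is used for validation once and for training K − 1 times.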

248.         Confusion matrix evaluates:

A.              Regression

B.              Classification

C.              Sampling

D.              Mean → B

249.         Accuracy measures:

A.              Correct predictions

B.              Errors

C.              Variance

D.              Mean → A

250.         Precision focuses on:

A.              True positives

B.              True negatives

C.              Errors

D.              Mean → A

251.         Recall focuses on:

A.              True positives

B.              True negatives

C.              Errors

D.              Variance → A

252.         F1 score balances:

A.              Accuracy

B.              Precision & recall

C.              Mean

D.              Variance → B
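
Worked example for Questions 248 to 252: a minimal sketch computing accuracy, precision, recall and F1 from the four cells of a binary confusion matrix. The counts and the function name `classification_metrics` are illustrative assumptions.

```python
def classification_metrics(tp, fp, fn, tn):
    """Metrics from true/false positive and negative counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)  # share of correct predictions
    precision = tp / (tp + fp)                  # of predicted positives, how many are real
    recall = tp / (tp + fn)                     # of real positives, how many were found
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean of the two
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


print(classification_metrics(tp=40, fp=10, fn=20, tn=30))
```

Precision and recall both have true positives in the numerator; they differ only in the denominator (false positives versus false negatives).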

253.         KNN is a:

A.              Regression model

B.              Classification method

C.              Clustering method

D.              Sampling method → B

254.         Decision tree is used for:

A.              Classification & regression

B.              Mean only

C.              Variance only

D.              Sampling → A

255.         Central Limit Theorem states that the sampling distribution of the mean is approximately normal if:

A.              Sample size is small

B.              Sample size is large

C.              Population is skewed

D.              Population is uniform → B

256.         Law of Large Numbers states that:

A.              Sample mean approaches population mean as sample size increases

B.              Sample mean decreases with sample size

C.              Variance increases with sample size

D.              Mean is always equal to median → A

257.         A Type I error occurs when:

A.              True null hypothesis is rejected

B.              False null hypothesis is not rejected

C.              True null hypothesis is not rejected

D.              False null hypothesis is rejected → A

258.         A Type II error occurs when:

A.              True null hypothesis is rejected

B.              False null hypothesis is not rejected

C.              True null hypothesis is not rejected

D.              False null hypothesis is rejected → B

259.         Confidence interval gives:

A.              Exact value of parameter

B.              Range of plausible values

C.              Variance only

D.              Mean only → B

260.         95% confidence level means:

A.              About 95% of intervals built this way from repeated samples would contain the population mean

B.              5% chance population mean is in interval

C.              Mean = 0.95

D.              Variance = 0.95 → A

261.         Margin of error depends on:

A.              Sample size

B.              Confidence level

C.              Standard deviation

D.              All of the above → D

262.         Standard error is:

A.              Standard deviation of population

B.              Standard deviation of sample mean

C.              Mean of sample

D.              Range → B
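
Worked example for Questions 261 and 262: a minimal sketch of the standard error of the mean and the resulting margin of error. It assumes a normal (z) critical value, which is a simplification; for small samples a t critical value is usually preferred. The function name `margin_of_error` and the sample values are illustrative.

```python
import math
import statistics
from statistics import NormalDist


def margin_of_error(sample, confidence=0.95):
    """Return (standard_error, margin_of_error) for the sample mean."""
    n = len(sample)
    se = statistics.stdev(sample) / math.sqrt(n)   # standard error of the mean
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # two-sided critical value
    return se, z * se


sample = [4.0, 5.0, 6.0, 7.0, 8.0]
print(margin_of_error(sample, 0.95))
print(margin_of_error(sample, 0.99))  # higher confidence gives a wider margin
```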

263.         Normal approximation to binomial works if:

A.              n is large and p not too close to 0 or 1

B.              n is small

C.              p = 0

D.              p = 1 → A

264.         Poisson distribution approximates binomial if:

A.              n is large, p small

B.              n is small, p large

C.              n and p are large

D.              n and p small → A

265.         Chi-square test is used for:

A.              Mean comparison

B.              Categorical data

C.              Regression

D.              Correlation → B

266.         Degrees of freedom for chi-square =

A.              (r + c −1)

B.              (r −1)(c −1)

C.              r × c

D.              r − c → B
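
Worked example for Questions 265 and 266: a minimal sketch of the chi-square statistic and its degrees of freedom for an r × c contingency table. The table values and the function name `chi_square_independence` are illustrative assumptions.

```python
def chi_square_independence(table):
    """Return (statistic, degrees_of_freedom) for a table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total  # count expected under independence
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)  # (r - 1)(c - 1)
    return statistic, dof


print(chi_square_independence([[10, 20, 30], [20, 20, 20]]))
```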

267.         Goodness-of-fit test checks:

A.              Regression fit

B.              Observed vs expected frequencies

C.              Mean comparison

D.              Variance equality → B

268.         Homoscedasticity means:

A.              Equal variances across groups

B.              Unequal variances

C.              Mean = median

D.              Skewness = 0 → A

269.         Heteroscedasticity means:

A.              Equal variances

B.              Unequal variances

C.              Constant error

D.              Normality → B

270.         Bartlett’s test checks:

A.              Normality

B.              Equality of variances

C.              Mean equality

D.              Skewness → B

271.         Levene’s test is used for:

A.              Equality of means

B.              Equality of variances

C.              Regression coefficients

D.              Correlation → B

272.         Kolmogorov-Smirnov test checks:

A.              Variance

B.              Normality

C.              Correlation

D.              Regression → B

273.         Shapiro-Wilk test checks:

A.              Variance

B.              Normality

C.              Mean equality

D.              Skewness → B

274.         Q-Q plot helps assess:

A.              Variance

B.              Normality

C.              Correlation

D.              Regression → B

275.         Boxplot identifies:

A.              Mean

B.              Outliers

C.              Correlation

D.              Regression → B

276.         Interquartile range (IQR) =

A.              Q3 − Q1

B.              Q1 − Q3

C.              Max − Min

D.              Median → A

277.         Standard score (z) formula:

A.              (x − μ)/σ

B.              (μ − x)/σ

C.              x/σ

D.              x − μ → A
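
Worked example for Questions 276 and 277: a minimal sketch of the IQR and the z-score. Quartile conventions differ between textbooks and software; this sketch relies on the default method of Python's `statistics.quantiles`, and the helper names `iqr` and `z_score` are illustrative.

```python
import statistics


def z_score(x, mu, sigma):
    """Number of standard deviations x lies from the mean."""
    return (x - mu) / sigma


def iqr(data):
    """Interquartile range Q3 - Q1 using the library's default quartile method."""
    q1, _, q3 = statistics.quantiles(data, n=4)
    return q3 - q1


print(z_score(85, 70, 10))
print(iqr(list(range(1, 12))))
```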

278.         Chebyshev’s inequality applies to:

A.              Any distribution

B.              Normal only

C.              Skewed only

D.              Uniform only → A

279.         Empirical rule applies to:

A.              Any distribution

B.              Normal distribution

C.              Uniform

D.              Skewed → B
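
Worked example for Questions 278 and 279: a minimal sketch comparing Chebyshev's bound, which holds for any distribution with finite variance, with the coverage of a normal distribution that underlies the empirical rule. The function names are illustrative assumptions.

```python
from statistics import NormalDist


def chebyshev_lower_bound(k):
    """Minimum share of any distribution within k standard deviations (k > 1)."""
    return 1 - 1 / k ** 2


def normal_coverage(k):
    """Share of a normal distribution within k standard deviations."""
    nd = NormalDist()
    return nd.cdf(k) - nd.cdf(-k)


for k in (1, 2, 3):
    print(k, round(normal_coverage(k), 4))  # roughly the 68-95-99.7 rule
print(chebyshev_lower_bound(2))
```

The normal coverage is always at least as large as Chebyshev's bound, since the bound must hold even for the least favourable distribution.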

280.         In hypothesis testing, power =

A.              1 − α

B.              1 − β

C.              α + β

D.              α × β → B

281.         ANOVA F-statistic =

A.              Variance between groups / variance within groups

B.              Mean between groups / mean within groups

C.              Sum of squares / mean

D.              Explained / total variance → A

282.         One-way ANOVA compares:

A.              Two means

B.              Multiple means

C.              Variances only

D.              Regression coefficients → B

283.         Two-way ANOVA includes:

A.              One factor

B.              Two factors

C.              Multiple regression

D.              Correlation → B

284.         Post hoc tests are used after:

A.              Significant ANOVA

B.              Non-significant ANOVA

C.              Regression

D.              Chi-square → A

285.         Tukey’s test controls:

A.              Type I error

B.              Type II error

C.              Variance

D.              Mean → A

286.         Bonferroni correction controls:

A.              Type I error in multiple comparisons

B.              Type II error

C.              Regression error

D.              Correlation error → A

287.         Non-parametric tests are used when:

A.              Normality assumption fails

B.              Sample is large

C.              Population is known

D.              Regression needed → A

288.         Wilcoxon signed-rank test is for:

A.              Paired data

B.              Independent data

C.              Categorical data

D.              Multiple groups → A

289.         Mann-Whitney U test is for:

A.              Paired data

B.              Independent two samples

C.              Categorical data

D.              Multiple groups → B

290.         Kruskal-Wallis test is for:

A.              Two groups

B.              Multiple groups

C.              Paired data

D.              Categorical data → B

291.         Friedman test is for:

A.              One-way repeated measures

B.              Two-way repeated measures

C.              Regression

D.              Correlation → A

292.         Spearman rank correlation uses:

A.              Raw values

B.              Ranks

C.              Variances

D.              Means → B

293.         Kendall’s tau measures:

A.              Linear correlation

B.              Rank correlation

C.              Regression slope

D.              Variance → B

294.         Regression diagnostics detect:

A.              Outliers

B.              Leverage points

C.              Influential points

D.              All of the above → D

295.         Cook’s distance measures:

A.              Variance

B.              Influence of observation

C.              Mean

D.              Skewness → B

296.         Leverage points affect:

A.              Regression line slope

B.              Variance

C.              Mean only

D.              Mode only → A

297.         Influential points combine:

A.              Outlier + leverage

B.              Mean + variance

C.              Skewness + kurtosis

D.              Range + median → A

298.         Multivariate analysis deals with:

A.              One variable

B.              Two or more variables

C.              Means only

D.              Variances only → B

299.         Principal Component Analysis (PCA) reduces:

A.              Variance

B.              Dimensions

C.              Mean

D.              Skewness → B

300.         Factor analysis identifies:

A.              Observed variables

B.              Latent factors

C.              Mean

D.              Variance → B

301.         Cluster analysis groups:

A.              Variables

B.              Observations

C.              Mean only

D.              Variance only → B

302.         Hierarchical clustering can be:

A.              Agglomerative

B.              Divisive

C.              Both

D.              None → C

303.         K-means clustering minimizes:

A.              Between-cluster distance

B.              Within-cluster distance

C.              Mean

D.              Variance → B

304.         Silhouette score evaluates:

A.              Regression

B.              Classification

C.              Clustering quality

D.              Correlation → C

305.         Bayesian statistics updates:

A.              Prior probability using data → A

B.              Mean only

C.              Variance only

D.              Mode only

306.         Posterior probability combines:

A.              Likelihood × Prior → A

B.              Mean + Variance

C.              Standard deviation only

D.              Median only

307.         Likelihood function depends on:

A.              Parameter values → A

B.              Sample size only

C.              Variance only

D.              Mean only

308.         Maximum a posteriori (MAP) estimation maximizes:

A.              Likelihood

B.              Posterior → B

C.              Mean

D.              Variance

309.         Prior distribution can be:

A.              Informative

B.              Non-informative

C.              Both → C

D.              Neither

310.         Conjugate prior ensures:

A.              Posterior in same family as prior → A

B.              Posterior is uniform

C.              Posterior is normal

D.              Posterior is variance only
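
Worked example for Questions 305 to 310: a minimal sketch of a conjugate update, assuming a Beta prior on a success probability with binomial data. The prior parameters, the counts, and the function names are illustrative assumptions.

```python
def beta_binomial_update(alpha, beta, successes, failures):
    """Beta(alpha, beta) prior + binomial data -> Beta posterior (same family)."""
    return alpha + successes, beta + failures


def beta_mean(alpha, beta):
    """Posterior mean of a Beta(alpha, beta) distribution."""
    return alpha / (alpha + beta)


def beta_mode(alpha, beta):
    """Mode of Beta(alpha, beta), i.e. the MAP estimate; needs alpha, beta > 1."""
    return (alpha - 1) / (alpha + beta - 2)


# Hypothetical prior Beta(2, 2) and data: 7 successes, 3 failures.
a, b = beta_binomial_update(2, 2, successes=7, failures=3)
print(a, b, beta_mean(a, b), beta_mode(a, b))
```

Here the MAP estimate (the posterior mode) is pulled from the raw proportion 7/10 toward the prior centre of 0.5.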

311.         Poisson process models:

A.              Continuous time events → A

B.              Categorical data

C.              Mean only

D.              Variance only

312.         Exponential interarrival times are:

A.              Memoryless → A

B.              Dependent

C.              Correlated

D.              Uniform

313.         Markov process satisfies:

A.              Future depends only on present → A

B.              Future depends on past

C.              Mean = median

D.              Variance = 0

314.         Stationary Markov chain has:

A.              Constant transition probabilities → A

B.              Varying probabilities

C.              Mean = 0

D.              Variance = 1

315.         Ergodic Markov chain:

A.              Can reach all states eventually → A

B.              Cannot reach all states

C.              Deterministic

D.              Non-stationary

316.         Transition matrix elements are:

A.              Probabilities → A

B.              Means

C.              Variances

D.              Modes
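
Worked example for Questions 313 to 316: a minimal sketch of a two-state Markov chain whose transition matrix rows are probabilities summing to 1, with the long-run distribution found by repeated multiplication. The matrix values and function names are illustrative assumptions.

```python
def step(dist, P):
    """One transition: new_dist[j] = sum_i dist[i] * P[i][j]."""
    n = len(P)
    return [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]


def stationary_distribution(P, iterations=1000):
    """Approximate the long-run distribution by iterating from a uniform start."""
    dist = [1.0 / len(P)] * len(P)
    for _ in range(iterations):
        dist = step(dist, P)
    return dist


# Hypothetical transition matrix: row i holds P(next state | current state i).
P = [[0.9, 0.1],
     [0.5, 0.5]]
print(stationary_distribution(P))
```

The next state depends only on the current row of P, which is the Markov property from Question 313.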

317.         Poisson regression models:

A.              Count data → A

B.              Continuous data

C.              Binary outcome

D.              Ordinal outcome

318.         Negative binomial regression handles:

A.              Overdispersed count data → A

B.              Binary data

C.              Continuous data

D.              Time series

319.         Ordinal regression models:

A.              Continuous outcome

B.              Ordered categorical outcome → B

C.              Binary outcome

D.              Count data

320.         Multinomial logistic regression handles:

A.              Multiple categories → A

B.              Two categories

C.              Continuous data

D.              Time series

321.         Survival analysis studies:

A.              Time until event → A

B.              Mean only

C.              Variance only

D.              Regression coefficients

322.         Censoring occurs when:

A.              Exact event time unknown → A

B.              Event never occurs

C.              Time series is stationary

D.              Mean = median

323.         Kaplan-Meier estimator estimates:

A.              Survival function → A

B.              Hazard function

C.              Mean

D.              Variance
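
Worked example for Questions 321 to 323: a minimal sketch of the Kaplan-Meier product-limit estimate with right-censored data. The durations, the event flags, and the function name `kaplan_meier` are illustrative assumptions; observations censored at an event time are treated as still at risk at that time.

```python
def kaplan_meier(times, events):
    """times: observed durations; events: 1 if the event occurred, 0 if censored.

    Returns a list of (time, survival_probability) at each distinct event time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Gather everyone whose observed time equals t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            survival *= 1 - deaths / n_at_risk
            curve.append((t, survival))
        n_at_risk -= removed  # events and censored subjects both leave the risk set
    return curve


print(kaplan_meier([2, 3, 3, 5, 8], [1, 1, 0, 1, 0]))
```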

324.         Cox proportional hazards model assumes:

A.              Constant hazard ratios → A

B.              Increasing hazard

C.              Mean = median

D.              Variance = 1

325.         Hazard function measures:

A.              Instantaneous risk → A

B.              Mean risk

C.              Cumulative variance

D.              Median

326.         Log-rank test compares:

A.              Two survival curves → A

B.              Means

C.              Variances

D.              Regression coefficients

327.         Time-to-event data is:

A.              Continuous → A

B.              Categorical

C.              Binary

D.              Count

328.         Markov Chain Monte Carlo (MCMC) is used for:

A.              Bayesian estimation → A

B.              Mean estimation only

C.              Variance estimation only

D.              Correlation

329.         Gibbs sampling updates:

A.              One variable at a time → A

B.              All variables simultaneously

C.              Mean only

D.              Variance only

330.         Metropolis-Hastings algorithm:

A.              Accepts or rejects proposed sample → A

B.              Always accepts

C.              Rejects all

D.              Updates mean only
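
Worked example for Questions 328 to 330: a minimal sketch of the accept-or-reject step, using a symmetric random-walk proposal (the Metropolis special case of Metropolis-Hastings) to sample from a standard normal target. The target, step size, seed, and function name are assumptions chosen for illustration.

```python
import math
import random


def metropolis_hastings(log_target, n_samples, start=0.0, step_size=1.0, seed=0):
    """Random-walk sampler; log_target is the log density up to a constant."""
    rng = random.Random(seed)
    x = start
    samples = []
    for _ in range(n_samples):
        proposal = x + rng.gauss(0.0, step_size)  # symmetric proposal
        log_ratio = log_target(proposal) - log_target(x)
        # Accept with probability min(1, target(proposal) / target(x)).
        if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
            x = proposal
        samples.append(x)  # on rejection the current value is repeated
    return samples


samples = metropolis_hastings(lambda x: -0.5 * x * x, n_samples=20000)
print(sum(samples) / len(samples))
```

Successive draws are correlated, so in practice early samples are often discarded as burn-in and convergence is checked before use.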

331.         Random effects model accounts for:

A.              Within-group correlation arising from group-level variation → A

B.              Between-group only

C.              Mean only

D.              Variance only

332.         Fixed effects model assumes:

A.              Effects are constant → A

B.              Effects are random

C.              Effects vary by sample

D.              Effects unknown

333.         Mixed-effects model includes:

A.              Fixed + random effects → A

B.              Only fixed

C.              Only random

D.              None

334.         Hierarchical linear model handles:

A.              Nested data → A

B.              Time series only

C.              Regression only

D.              Categorical data only

335.         Multilevel modeling is used when:

A.              Data clustered → A

B.              Data independent

C.              Time series

D.              Continuous only

336.         Structural equation modeling (SEM) combines:

A.              Regression + factor analysis → A

B.              Only regression

C.              Only correlation

D.              Only variance

337.         Path analysis is part of:

A.              SEM → A

B.              ANOVA

C.              Regression

D.              Time series

338.         Confirmatory factor analysis (CFA) tests:

A.              Hypothesized factor structure → A

B.              Mean equality

C.              Regression coefficients

D.              Variance equality

339.         Exploratory factor analysis (EFA) discovers:

A.              Factor structure → A

B.              Mean

C.              Regression

D.              Variance

340.         Principal axis factoring is:

A.              Common factor method → A

B.              PCA method

C.              Regression

D.              Correlation

341.         Varimax rotation maximizes:

A.              Loading variance → A

B.              Mean

C.              Regression slope

D.              Covariance

342.         Oblique rotation allows:

A.              Factors correlated → A

B.              Factors uncorrelated

C.              Regression only

D.              Mean only

343.         Bartlett’s test in factor analysis checks:

A.              Sphericity → A

B.              Variance equality

C.              Regression

D.              Mean

344.         Kaiser-Meyer-Olkin (KMO) measure checks:

A.              Sampling adequacy → A

B.              Variance

C.              Regression slope

D.              Mean

345.         Communality indicates:

A.              Variance explained by factors → A

B.              Total variance

C.              Regression slope

D.              Mean

346.         Eigenvalue >1 rule selects:

A.              Number of factors → A

B.              Number of observations

C.              Number of predictors

D.              Regression coefficients

347.         Scree plot visualizes:

A.              Eigenvalues → A

B.              Mean

C.              Variance

D.              Regression

348.         Cluster validity indices include:

A.              Silhouette

B.              Rand index

C.              Both → C

D.              None

349.         Dendrogram helps in:

A.              Hierarchical clustering → A

B.              Regression

C.              ANOVA

D.              Time series

350.         Agglomerative clustering starts with:

A.              Each observation as a cluster → A

B.              Single cluster

C.              Random cluster

D.              Mean only

351.         Divisive clustering starts with:

A.              All data in one cluster → A

B.              Each observation as cluster

C.              Random clusters

D.              Mean only

352.         Outlier in clustering may:

A.              Form its own cluster

B.              Merge with nearest cluster

C.              Affect centroids

D.              All of the above → D

353.         Hierarchical vs K-means:

A.              Deterministic vs iterative → A

B.              Both deterministic

C.              Both iterative

D.              None

354.         DBSCAN clustering identifies:

A.              Density-based clusters → A

B.              Hierarchical clusters

C.              K-means clusters

D.              PCA clusters

355.         The mode is defined as:

A.              Most frequent value → A

B.              Average value

C.              Middle value

D.              Sum of values

356.         The median divides data into:

A.              Two equal halves → A

B.              Four equal parts

C.              Three equal parts

D.              Five equal parts

357.         Skewness measures:

A.              Symmetry of data → A

B.              Spread

C.              Central tendency

D.              Correlation

358.         Positive skew means:

A.              Tail on right → A

B.              Tail on left

C.              Symmetric

D.              Uniform

359.         Negative skew means:

A.              Tail on left → A

B.              Tail on right

C.              Symmetric

D.              Uniform

360.         Kurtosis measures:

A.              Tail heaviness (often described as peakedness) of distribution → A

B.              Spread

C.              Mean

D.              Median

361.         High kurtosis indicates:

A.              Heavy tails → A

B.              Light tails

C.              Symmetry

D.              Skewness = 0

362.         Low kurtosis indicates:

A.              Light tails → A

B.              Heavy tails

C.              Skewness

D.              Mean

363.         Variance formula (population) is:

A.              Σ(x−μ)² / N → A

B.              Σ(x−μ)² / (N−1)

C.              Σx / N

D.              Σx² / N

364.         Variance formula (sample) is:

A.              Σ(x−x̄)² / (n−1) → A

B.              Σ(x−x̄)² / n

C.              Σx / n

D.              Σx² / n

365.         Standard deviation is:

A.              Square root of variance → A

B.              Variance squared

C.              Mean

D.              Median

366.         Coefficient of variation (CV) =

A.              SD / Mean → A

B.              Mean / SD

C.              Variance / Mean

D.              Median / Mean
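
Worked example for Questions 363 to 366: a minimal sketch contrasting the population and sample variance formulas and deriving the standard deviation and coefficient of variation. The data values and the helper name `coefficient_of_variation` are illustrative assumptions.

```python
import statistics


def coefficient_of_variation(data):
    """Sample standard deviation divided by the mean."""
    return statistics.stdev(data) / statistics.mean(data)


data = [2.0, 4.0, 6.0, 8.0]
print(statistics.pvariance(data))  # divides the squared deviations by N
print(statistics.variance(data))   # divides the squared deviations by n - 1
print(statistics.stdev(data))      # square root of the sample variance
print(coefficient_of_variation(data))
```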

367.         Probability of mutually exclusive events:

A.              Sum of individual probabilities → A

B.              Product

C.              Difference

D.              Ratio

368.         Probability of independent events:

A.              Product of probabilities → A

B.              Sum

C.              Difference

D.              Ratio

369.         Conditional probability formula:

A.              P(A|B) = P(A∩B)/P(B) → A

B.              P(A∩B)/P(A)

C.              P(A)+P(B)

D.              P(A)-P(B)

370.         Bayes theorem updates:

A.              Prior probability → A

B.              Mean

C.              Variance

D.              Standard deviation
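
Worked example for Questions 369 and 370: a minimal sketch of Bayes' theorem for a screening test, where the probability of a positive result is obtained by summing over the two possible states. The prevalence, sensitivity and false-positive rate are hypothetical numbers, and the function name `posterior` is illustrative.

```python
def posterior(prior, sensitivity, false_positive_rate):
    """P(condition | positive test) via Bayes' theorem."""
    # P(positive) = P(pos | condition) P(condition) + P(pos | no condition) P(no condition)
    p_positive = sensitivity * prior + false_positive_rate * (1 - prior)
    return sensitivity * prior / p_positive


print(posterior(prior=0.01, sensitivity=0.95, false_positive_rate=0.05))
```

With a rare condition, most positive results come from the large unaffected group, so the updated probability stays well below the test's sensitivity.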

371.         Random variable X can be:

A.              Discrete

B.              Continuous

C.              Both → C

D.              Neither

372.         Probability mass function (PMF) applies to:

A.              Discrete → A

B.              Continuous

C.              Both

D.              Neither

373.         Probability density function (PDF) applies to:

A.              Continuous → A

B.              Discrete

C.              Both

D.              Neither

374.         Cumulative distribution function (CDF) gives:

A.              P(X ≤ x) → A

B.              P(X ≥ x)

C.              P(X = x)

D.              Mean

375.         Expected value of X =

A.              Σx·P(x) → A

B.              Mean only

C.              Variance only

D.              Median

376.         Law of total probability:

A.              P(A) = Σ P(A|Bi)P(Bi) → A

B.              P(A) = P(A∩B)

C.              P(A) = P(A)+P(B)

D.              P(A) = P(A)/P(B)

377.         Standard normal distribution:

A.              Mean 0, SD 1 → A

B.              Mean 1, SD 0

C.              Mean 0, SD 0

D.              Mean 1, SD 1

378.         Z-score formula:

A.              (X−μ)/σ → A

B.              (μ−X)/σ

C.              X/σ

D.              X−μ

379.         T-distribution is used when:

A.              Population SD unknown → A

B.              Population mean unknown

C.              Sample size large

D.              Sample size infinite

380.         Degrees of freedom in one-sample t-test:

A.              n−1 → A

B.              n

C.              n+1

D.              n−2

381.         One-sample t-test compares:

A.              Sample mean vs population mean → A

B.              Two sample means

C.              Two variances

D.              Proportions

382.         Two-sample t-test compares:

A.              Means of two independent samples → A

B.              Paired samples

C.              Variances

D.              Proportions

383.         Paired t-test compares:

A.              Means of paired observations → A

B.              Independent samples

C.              Variances

D.              Proportions

384.         F-test compares:

A.              Variances → A

B.              Means

C.              Medians

D.              Correlations

385.         ANOVA is an extension of:

A.              t-test → A

B.              Z-test

C.              F-test

D.              Chi-square

386.         One-way ANOVA has:

A.              One factor → A

B.              Two factors

C.              Multiple factors

D.              None

387.         Two-way ANOVA has:

A.              Two factors → A

B.              One factor

C.              Multiple factors

D.              None

388.         Post-hoc tests are used after:

A.              Significant ANOVA → A

B.              Non-significant ANOVA

C.              Regression

D.              Chi-square

389.         Bonferroni correction adjusts for:

A.              Multiple comparisons → A

B.              Single test

C.              Regression

D.              Variance

390.         Chi-square test applies to:

A.              Categorical data → A

B.              Continuous data

C.              Regression

D.              Time series

391.         Chi-square goodness-of-fit compares:

A.              Observed vs expected frequencies → A

B.              Means

C.              Variances

D.              Regression coefficients

392.         Chi-square test for independence examines:

A.              Association between variables → A

B.              Mean difference

C.              Variance equality

D.              Regression

393.         Contingency table shows:

A.              Cross-tabulated counts → A

B.              Means only

C.              Variances only

D.              Regression

394.         Residual =

A.              Observed − Predicted → A

B.              Predicted − Observed

C.              Mean − Observed

D.              Variance − Predicted

395.         Homoscedasticity =

A.              Equal variance → A

B.              Unequal variance

C.              Normal distribution

D.              Independence

396.         Heteroscedasticity violates:

A.              Constant variance assumption → A

B.              Linearity

C.              Normality

D.              Independence

397.         Cook’s distance detects:

A.              Influential points → A

B.              Outliers only

C.              Leverage only

D.              Residuals

398.         Leverage measures:

A.              Distance in predictor space → A

B.              Residual size

C.              Mean

D.              Variance

399.         Multicollinearity affects:

A.              Standard errors → A

B.              Means

C.              Medians

D.              Mode

400.         VIF > 10 indicates:

A.              Severe multicollinearity → A

B.              Low correlation

C.              Independence

D.              Normality

401.         Random effects model accounts for:

A.              Group-level variation → A

B.              Fixed effect only

C.              Mean only

D.              Variance only

402.         Fixed effects model assumes:

A.              Constant effects → A

B.              Random effects

C.              Variable effects

D.              Unknown effects

403.         Mixed effects model combines:

A.              Fixed + random → A

B.              Fixed only

C.              Random only

D.              Neither

404.         Hierarchical modeling handles:

A.              Nested data → A

B.              Independent data

C.              Time series only

D.              Categorical only

405.         Poisson regression is suitable for:

A.              Count data → A

B.              Continuous data

C.              Binary outcome

D.              Ordinal outcome

406.         Overdispersion occurs when:

A.              Variance > Mean → A

B.              Variance < Mean

C.              Variance = Mean

D.              Mean = 0

407.         Negative binomial regression handles:

A.              Overdispersed count data → A

B.              Binary data

C.              Continuous data

D.              Ordinal data

408.         Time series data has:

A.              Temporal order → A

B.              Random order

C.              Categorical only

D.              Constant variance

409.         Stationary time series has:

A.              Constant mean & variance → A

B.              Changing mean

C.              Changing variance

D.              Trend only

410.         Differencing a series removes:

A.              Trend → A

B.              Seasonality

C.              Noise

D.              Skewness

411.         Seasonal differencing removes:

A.              Seasonality → A

B.              Trend

C.              Noise

D.              Mean
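
Worked example for Questions 410 and 411: a minimal sketch of ordinary and seasonal differencing. The series values and the function name `difference` are illustrative assumptions.

```python
def difference(series, lag=1):
    """Return series[t] - series[t - lag] for t = lag, ..., len(series) - 1."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


trend = [1, 3, 5, 7, 9]             # linear trend
print(difference(trend))            # first differences are constant

seasonal = [10, 20, 30, 40] * 3     # pattern repeating every 4 periods
print(difference(seasonal, lag=4))  # lag-4 differences remove the pattern
```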

412.         Autocorrelation measures:

A.              Correlation of series with lagged values → A

B.              Variance only

C.              Mean only

D.              Skewness
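
Worked example for Question 412: a minimal sketch of the sample autocorrelation at a given lag, using the common estimator that divides by the lag-0 sum of squares. The function name `autocorrelation` and the test series are illustrative assumptions.

```python
def autocorrelation(series, lag):
    """Sample autocorrelation of the series with itself shifted by `lag`."""
    n = len(series)
    mean = sum(series) / n
    denom = sum((v - mean) ** 2 for v in series)
    num = sum((series[t] - mean) * (series[t - lag] - mean) for t in range(lag, n))
    return num / denom


alternating = [1.0, -1.0] * 10
print(autocorrelation(alternating, 0))
print(autocorrelation(alternating, 1))  # strongly negative for an alternating series
```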

413.         Partial autocorrelation measures:

A.              Direct correlation between X_t and X_{t-k} controlling intermediate lags → A

B.              Total correlation

C.              Variance

D.              Mean

414.         AR(p) model uses:

A.              p lagged values → A

B.              p future values

C.              Moving average

D.              Trend

415.         MA(q) model uses:

A.              q lagged errors → A

B.              q lagged values

C.              Trend only

D.              Mean

416.         ARMA(p,q) combines:

A.              AR + MA → A

B.              AR only

C.              MA only

D.              AR + trend

417.         ARIMA(p,d,q) adds:

A.              Differencing → A

B.              Autocorrelation

C.              Variance

D.              Mean

418.         Exponential smoothing gives:

A.              Higher weight to recent observations → A

B.              Equal weight

C.              Lower weight to recent

D.              Random weight
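
Worked example for Question 418: a minimal sketch of simple exponential smoothing, assuming the first smoothed value is initialised to the first observation. The function name `exponential_smoothing` and the series are illustrative.

```python
def exponential_smoothing(series, alpha):
    """s[0] = x[0]; s[t] = alpha * x[t] + (1 - alpha) * s[t - 1], with 0 < alpha <= 1."""
    smoothed = [series[0]]
    for value in series[1:]:
        smoothed.append(alpha * value + (1 - alpha) * smoothed[-1])
    return smoothed


print(exponential_smoothing([10.0, 12.0, 14.0], alpha=0.5))
```

Unrolling the recursion shows that an observation k steps back carries weight alpha × (1 − alpha)^k, so recent observations count the most.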

419.         Holt-Winters method handles:

A.              Trend + seasonality → A

B.              Noise only

C.              Mean only

D.              Variance only

420.         White noise has:

A.              Zero mean, constant variance → A

B.              Non-zero mean

C.              Changing variance

D.              Trend

421.         ARCH/GARCH models address:

A.              Heteroscedasticity → A

B.              Mean

C.              Median

D.              Mode

422.         Monte Carlo simulation estimates:

A.              Probabilities & distributions → A

B.              Mean only

C.              Median only

D.              Mode

423.         MCMC (Markov Chain Monte Carlo) is used for:

A.              Bayesian estimation → A

B.              Frequentist estimation

C.              Mean only

D.              Variance only

424.         Gibbs sampling updates:

A.              One variable at a time → A

B.              All variables simultaneously

C.              Mean only

D.              Variance only

425.         Metropolis-Hastings algorithm:

A.              Accepts or rejects proposed sample → A

B.              Always accepts

C.              Always rejects

D.              Updates mean only
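
Worked example (a sketch, not a production sampler): the accept-or-reject step can be seen in random-walk Metropolis, the symmetric-proposal special case of Metropolis-Hastings. A proposal is accepted with probability min(1, target(proposal) / target(current)); otherwise the chain repeats its current value. The standard normal target and the step size of 2.0 are assumptions for illustration.

```python
import math
import random


def metropolis_hastings(log_target, n_samples, start=0.0, step=2.0, seed=0):
    """Random-walk Metropolis sampler; returns (samples, acceptance rate)."""
    rng = random.Random(seed)
    x = start
    samples, accepted = [], 0
    for _ in range(n_samples):
        proposal = x + rng.gauss(0.0, step)  # symmetric proposal
        log_ratio = log_target(proposal) - log_target(x)
        # Accept with probability min(1, ratio); otherwise keep the current x.
        if rng.random() < math.exp(min(0.0, log_ratio)):
            x = proposal
            accepted += 1
        samples.append(x)
    return samples, accepted / n_samples


def log_standard_normal(x):
    """Log density of N(0, 1) up to an additive constant."""
    return -0.5 * x * x


samples, acceptance_rate = metropolis_hastings(log_standard_normal, 50_000)
mean = sum(samples) / len(samples)
variance = sum((s - mean) ** 2 for s in samples) / len(samples)
print(round(mean, 2), round(variance, 2), round(acceptance_rate, 2))
```

Only the ratio of target densities is needed, so the normalising constant of the posterior never has to be computed.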

426.         Bayesian inference combines:

A.              Prior × Likelihood → A

B.              Mean × Variance

C.              Median × Mode

D.              Regression coefficients

427.         Posterior distribution =

A.              Updated belief after data → A

B.              Prior only

C.              Likelihood only

D.              Mean only

428.         Conjugate prior ensures:

A.              Posterior in same family as the prior → A

B.              Posterior uniform

C.              Posterior normal

D.              Posterior variance

429.         Maximum a posteriori (MAP) estimation maximizes:

A.              Posterior → A

B.              Likelihood

C.              Mean

D.              Variance
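
Worked example (a minimal sketch for questions 426 to 429): with a Beta(alpha, beta) prior on a success probability and binomial data, the posterior is again a Beta distribution, Beta(alpha + successes, beta + failures). The MAP estimate is the posterior mode, (alpha - 1) / (alpha + beta - 2) when both parameters exceed 1. The prior Beta(2, 2) and the data (7 successes, 3 failures) are made-up illustrative values.

```python
def beta_binomial_update(alpha, beta, successes, failures):
    """Conjugate update: a Beta prior plus binomial data gives a Beta posterior."""
    return alpha + successes, beta + failures


def beta_mode(alpha, beta):
    """Mode of Beta(alpha, beta); defined here only for alpha, beta > 1."""
    if alpha <= 1 or beta <= 1:
        raise ValueError("mode formula requires alpha > 1 and beta > 1")
    return (alpha - 1) / (alpha + beta - 2)


prior = (2, 2)
posterior = beta_binomial_update(*prior, successes=7, failures=3)
map_estimate = beta_mode(*posterior)
mle = 7 / 10

print(posterior)     # (9, 5)
print(map_estimate)  # 8/12, about 0.667
print(mle)           # 0.7
```

The MAP estimate sits between the prior mode (0.5) and the maximum likelihood estimate (0.7), showing how the prior pulls the estimate.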

430.         Hierarchical Bayesian model accounts for:

A.              Group-level variation → A

B.              Individual only

C.              Mean only

D.              Variance only

431.         Random effects model includes:

A.              Group-level variation → A

B.              Fixed effect only

C.              Mean only

D.              Variance only

432.         Fixed effects model assumes:

A.              Constant (non-random) effects → A

B.              Random effects

C.              Variable effects

D.              Unknown effects

433.         Mixed effects model combines:

A.              Fixed + random → A

B.              Fixed only

C.              Random only

D.              Neither

434.         Multilevel modeling is used for:

A.              Nested data → A

B.              Independent data

C.              Continuous only

D.              Categorical only

435.         Structural equation modeling (SEM) combines:

A.              Regression + factor analysis → A

B.              Regression only

C.              Correlation only

D.              Variance only

436.         Path analysis is part of:

A.              SEM → A

B.              ANOVA

C.              Regression

D.              Time series

437.         Confirmatory factor analysis (CFA) tests:

A.              Hypothesized factor structure → A

B.              Mean equality

C.              Regression coefficients

D.              Variance equality

438.         Exploratory factor analysis (EFA) discovers:

A.              Factor structure → A

B.              Mean only

C.              Regression

D.              Variance

439.         Kaiser-Meyer-Olkin (KMO) measure checks:

A.              Sampling adequacy → A

B.              Variance

C.              Regression slope

D.              Mean

440.         Bartlett’s test checks:

A.              Sphericity → A

B.              Mean

C.              Variance

D.              Regression

441.         Scree plot visualizes:

A.              Eigenvalues → A

B.              Means

C.              Variances

D.              Regression

442.         Principal Component Analysis (PCA) reduces:

A.              Dimensionality → A

B.              Mean

C.              Variance

D.              Regression

443.         First principal component maximizes:

A.              Variance → A

B.              Mean

C.              Skewness

D.              Kurtosis
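
Worked example (a sketch for two variables only): the first principal component's variance is the largest eigenvalue of the covariance matrix, and each eigenvalue's share of the total is the proportion of variance explained. For a 2 x 2 symmetric matrix the eigenvalues have a closed form, used below. The data values and function names are illustrative assumptions.

```python
import math


def covariance_2d(xs, ys):
    """Sample variances and covariance (n - 1 denominator) of two variables."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs) / (n - 1)
    syy = sum((y - my) ** 2 for y in ys) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)
    return sxx, sxy, syy


def eigenvalues_2x2_symmetric(sxx, sxy, syy):
    """Eigenvalues of [[sxx, sxy], [sxy, syy]], largest first."""
    centre = (sxx + syy) / 2
    radius = math.sqrt(((sxx - syy) / 2) ** 2 + sxy ** 2)
    return centre + radius, centre - radius


xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
sxx, sxy, syy = covariance_2d(xs, ys)
first, second = eigenvalues_2x2_symmetric(sxx, sxy, syy)
explained = first / (first + second)
print(round(explained, 3))  # close to 1 because the two variables are nearly collinear
```

The first eigenvalue is never smaller than the variance of either original variable, which is what "maximizes variance" means here.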

444.         Varimax rotation achieves:

A.              Simple structure → A

B.              Maximum total variance explained

C.              Regression

D.              Mean only

445.         K-means clustering minimizes:

A.              Within-cluster sum of squares → A

B.              Between-cluster variance

C.              Mean only

D.              Variance only
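
Worked example (a minimal sketch of Lloyd's algorithm, the usual k-means heuristic): each iteration assigns points to the nearest centroid and then moves each centroid to the mean of its points. Neither step can increase the within-cluster sum of squares (WCSS). The points, starting centroids, and function names are illustrative assumptions.

```python
def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))


def assign(points, centroids):
    """Index of the nearest centroid for every point."""
    return [min(range(len(centroids)), key=lambda j: squared_distance(p, centroids[j]))
            for p in points]


def update(points, labels, centroids):
    """Move each centroid to the mean of its assigned points."""
    new_centroids = []
    for j, old in enumerate(centroids):
        members = [p for p, label in zip(points, labels) if label == j]
        if not members:
            new_centroids.append(old)  # keep the centroid of an empty cluster
            continue
        new_centroids.append(tuple(sum(p[d] for p in members) / len(members)
                                   for d in range(len(old))))
    return new_centroids


def wcss(points, labels, centroids):
    return sum(squared_distance(p, centroids[label]) for p, label in zip(points, labels))


def kmeans(points, centroids, max_iter=100):
    history = []
    for _ in range(max_iter):
        labels = assign(points, centroids)
        history.append(wcss(points, labels, centroids))
        new_centroids = update(points, labels, centroids)
        if new_centroids == centroids:  # converged
            break
        centroids = new_centroids
    return labels, centroids, history


points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
labels, centroids, history = kmeans(points, [(1, 1), (2, 1)])
print(labels)   # [0, 0, 0, 1, 1, 1]
print(history)  # WCSS after each assignment step; it never increases
```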

446.         Hierarchical clustering produces:

A.              Dendrogram → A

B.              Regression line

C.              Correlation matrix

D.              Factor loadings

447.         Agglomerative clustering starts with:

A.              Each observation as its own cluster → A

B.              One cluster

C.              Random clusters

D.              Mean only

448.         Divisive clustering starts with:

A.              All data in one cluster → A

B.              Each observation

C.              Random clusters

D.              Mean only

449.         DBSCAN identifies:

A.              Density-based clusters → A

B.              Hierarchical clusters

C.              K-means clusters

D.              Regression clusters

450.         Silhouette score measures:

A.              Cluster cohesion and separation → A

B.              Mean

C.              Variance

D.              Standard deviation

451.         High silhouette score indicates:

A.              Well-separated clusters → A

B.              Overlapping clusters

C.              Poor clustering

D.              Random clustering
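
Worked example (a sketch for one-dimensional data): for each point, a is its mean distance to the other points in its own cluster and b is the smallest mean distance to the points of any other cluster. The silhouette is (b - a) / max(a, b), and the score is the average over all points. Scoring a singleton cluster as 0 is a common convention adopted here as an assumption.

```python
def mean_distance(p, others):
    return sum(abs(p - q) for q in others) / len(others)


def silhouette_score(points, labels):
    """Mean silhouette over 1-D points; needs at least two clusters."""
    clusters = {}
    for p, label in zip(points, labels):
        clusters.setdefault(label, []).append(p)
    scores = []
    for p, label in zip(points, labels):
        own = list(clusters[label])
        own.remove(p)  # exclude the point itself (removes one copy)
        if not own:
            scores.append(0.0)  # convention: a singleton cluster scores 0
            continue
        a = mean_distance(p, own)
        b = min(mean_distance(p, members)
                for other, members in clusters.items() if other != label)
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)


points = [1, 2, 3, 10, 11, 12]
good = silhouette_score(points, [0, 0, 0, 1, 1, 1])
mixed = silhouette_score(points, [0, 1, 0, 1, 0, 1])
print(round(good, 2))  # 0.85
print(good > mixed)    # True
```

Values near 1 mean tight, well-separated clusters; values near 0 or below suggest overlap or misassigned points.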

452.         Outlier detection uses:

A.              Z-score, IQR → A

B.              Mean only

C.              Variance only

D.              Median only
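
Worked example (a minimal sketch): the z-score rule flags values far from the mean in standard-deviation units, and the IQR rule flags values beyond Q1 - 1.5 * IQR or Q3 + 1.5 * IQR. The data are made up, and the z threshold of 2.5 is an assumption chosen because one extreme value inflates the standard deviation in a sample this small.

```python
import statistics


def zscore_outliers(data, threshold=3.0):
    """Values whose absolute z-score exceeds the threshold."""
    mean = statistics.mean(data)
    sd = statistics.stdev(data)
    return [x for x in data if abs(x - mean) / sd > threshold]


def iqr_outliers(data, k=1.5):
    """Values outside [Q1 - k * IQR, Q3 + k * IQR]."""
    q1, _, q3 = statistics.quantiles(data, n=4)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [x for x in data if x < lower or x > upper]


data = [10, 12, 11, 13, 12, 11, 10, 12, 13, 11, 50]
print(zscore_outliers(data, threshold=2.5))  # [50]
print(iqr_outliers(data))                    # [50]
```

The IQR rule relies on quartiles, so it is less affected by the outlier it is trying to detect than the z-score rule is.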

453.         Boxplot identifies:

A.              Outliers → A

B.              Mean

C.              Variance

D.              Standard deviation

454.         Leverage points affect:

A.              Regression line → A

B.              Median only

C.              Variance only

D.              Mean only

455.         Hierarchical clustering produces:

A.              Dendrogram → A

B.              Regression line

C.              Correlation matrix

D.              Factor loadings

456.         Agglomerative clustering starts with:

A.              Each observation as a cluster → A

B.              One cluster

C.              Random clusters

D.              Mean only

457.         Divisive clustering starts with:

A.              All data in one cluster → A

B.              Each observation

C.              Random clusters

D.              Mean only

458.         K-means clustering minimizes:

A.              Within-cluster sum of squares → A

B.              Between-cluster variance

C.              Mean only

D.              Variance only

459.         DBSCAN clustering identifies:

A.              Density-based clusters → A

B.              Hierarchical clusters

C.              K-means clusters

D.              Regression clusters

460.         Silhouette score measures:

A.              Cluster cohesion and separation → A

B.              Mean

C.              Variance

D.              Standard deviation

461.         High silhouette score indicates:

A.              Well-separated clusters → A

B.              Overlapping clusters

C.              Poor clustering

D.              Random clustering

462.         Outlier detection methods include:

A.              Z-score, IQR → A

B.              Mean only

C.              Variance only

D.              Median only

463.         Boxplot shows:

A.              Outliers → A

B.              Mean only

C.              Variance only

D.              Standard deviation

464.         Leverage points affect:

A.              Regression line → A

B.              Median only

C.              Variance only

D.              Mean only

465.         Cook’s distance identifies:

A.              Influential points → A

B.              Outliers only

C.              Median points

D.              Regular points

466.         Multicollinearity inflates:

A.              Standard errors → A

B.              Means

C.              Medians

D.              Modes

467.         Variance Inflation Factor (VIF) > 10 is commonly taken to indicate:

A.              Severe multicollinearity → A

B.              No correlation

C.              Independence

D.              Normality
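
Worked example (a sketch for the two-predictor case only): the VIF of a predictor is 1 / (1 - R^2), where R^2 comes from regressing that predictor on the others. With exactly two predictors, R^2 is the squared Pearson correlation between them. The data and function names are illustrative assumptions.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def vif_two_predictors(x1, x2):
    """VIF when the model has exactly two predictors: 1 / (1 - r^2)."""
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r ** 2)


x1 = [1, 2, 3, 4, 5, 6]
nearly_collinear = [1.1, 1.9, 3.2, 3.9, 5.1, 5.8]
weakly_related = [2, 1, 2, 1, 2, 1]

print(vif_two_predictors(x1, nearly_collinear) > 10)  # True
print(round(vif_two_predictors(x1, weakly_related), 2))  # close to 1
```

A VIF of 10 corresponds to R^2 = 0.9, meaning 90% of the predictor's variance is explained by the other predictors.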

468.         Heteroscedasticity violates:

A.              Constant variance assumption → A

B.              Linearity

C.              Normality

D.              Independence

469.         Autocorrelation violates:

A.              Independence assumption → A

B.              Linearity

C.              Normality

D.              Variance

470.         Time series decomposition separates:

A.              Trend, seasonality, residual → A

B.              Mean only

C.              Variance only

D.              Median only

471.         ARIMA model includes:

A.              AR + I + MA → A

B.              AR only

C.              MA only

D.              Differencing only

472.         Stationarity is required for:

A.              ARMA-type models (ARIMA achieves it by differencing) → A

B.              Regression

C.              ANOVA

D.              Chi-square

473.         Exponential smoothing is used for:

A.              Forecasting → A

B.              Regression

C.              Correlation

D.              Variance estimation

474.         Holt-Winters method models:

A.              Trend + seasonality → A

B.              Noise only

C.              Mean only

D.              Variance only

475.         Bootstrapping resamples:

A.              With replacement → A

B.              Without replacement

C.              Randomly once

D.              Deterministically
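
Worked example (a minimal sketch): the bootstrap draws many resamples of the same size as the data, with replacement, and recomputes the statistic on each. The spread of those values approximates the sampling variability. The percentile interval shown is one of several bootstrap interval methods; the data and the 2000 resamples are illustrative assumptions.

```python
import random
import statistics


def bootstrap_means(data, n_resamples=2000, seed=0):
    """Means of resamples drawn with replacement, each the size of the data."""
    rng = random.Random(seed)
    n = len(data)
    # rng.choices samples WITH replacement, so values can repeat in a resample.
    return [statistics.fmean(rng.choices(data, k=n)) for _ in range(n_resamples)]


def percentile_interval(values, level=0.95):
    """Simple percentile interval from the sorted bootstrap values."""
    ordered = sorted(values)
    lower = ordered[int((1 - level) / 2 * len(ordered))]
    upper = ordered[int((1 + level) / 2 * len(ordered)) - 1]
    return lower, upper


data = [4, 8, 15, 16, 23, 42]
means = bootstrap_means(data)
low, high = percentile_interval(means)
print(low, high)  # an approximate 95% interval around the sample mean of 18
```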

476.         Jackknife resampling removes:

A.              One observation at a time → A

B.              Half the sample

C.              Entire sample

D.              Random subset
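
Worked example (a minimal sketch): the jackknife recomputes a statistic n times, leaving out one observation each time, and estimates the standard error as sqrt((n - 1) / n * sum of squared deviations of the leave-one-out values). For the sample mean this reproduces the familiar s / sqrt(n). The data and function name are illustrative assumptions.

```python
import math
import statistics


def jackknife_se(data, statistic):
    """Jackknife standard error of a statistic."""
    n = len(data)
    # Leave out observation i, recompute the statistic on the remaining n - 1.
    leave_one_out = [statistic(data[:i] + data[i + 1:]) for i in range(n)]
    centre = sum(leave_one_out) / n
    return math.sqrt((n - 1) / n * sum((t - centre) ** 2 for t in leave_one_out))


data = [4, 8, 15, 16, 23, 42]
se_jackknife = jackknife_se(data, statistics.fmean)
se_classic = statistics.stdev(data) / math.sqrt(len(data))
print(round(se_jackknife, 4), round(se_classic, 4))  # the two values agree
```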

477.         Principal Component Analysis (PCA) reduces:

A.              Dimensionality → A

B.              Mean

C.              Variance

D.              Regression

478.         First principal component maximizes:

A.              Variance → A

B.              Mean

C.              Skewness

D.              Kurtosis

479.         Eigenvalues in PCA indicate:

A.              Variance explained → A

B.              Mean only

C.              Regression coefficient

D.              Skewness

480.         Factor analysis identifies:

A.              Latent variables → A

B.              Observed variables only

C.              Means

D.              Variances

481.         Kaiser-Meyer-Olkin (KMO) measure checks:

A.              Sampling adequacy → A

B.              Variance

C.              Regression slope

D.              Mean

482.         Bartlett’s test checks:

A.              Sphericity → A

B.              Mean

C.              Variance

D.              Regression

483.         Scree plot visualizes:

A.              Eigenvalues → A

B.              Means

C.              Variances

D.              Regression

484.         Confirmatory Factor Analysis (CFA) tests:

A.              Hypothesized factor structure → A

B.              Mean equality

C.              Regression coefficients

D.              Variance equality

485.         Exploratory Factor Analysis (EFA) discovers:

A.              Factor structure → A

B.              Mean only

C.              Regression

D.              Variance

486.         Structural Equation Modeling (SEM) combines:

A.              Regression + factor analysis → A

B.              Regression only

C.              Correlation only

D.              Variance only

487.         Path analysis is part of:

A.              SEM → A

B.              ANOVA

C.              Regression

D.              Time series

488.         Bayesian statistics updates:

A.              Prior beliefs with data → A

B.              Only mean

C.              Only variance

D.              Only median

489.         Posterior distribution =

A.              Updated belief (probability distribution) after observing data → A

B.              Prior only

C.              Likelihood only

D.              Mean only

490.         Maximum a posteriori (MAP) estimation maximizes:

A.              Posterior → A

B.              Likelihood

C.              Mean

D.              Variance

491.         Markov Chain Monte Carlo (MCMC) is used for:

A.              Bayesian estimation → A

B.              Frequentist estimation

C.              Mean only

D.              Variance only

492.         Gibbs sampling updates:

A.              One variable at a time → A

B.              All variables simultaneously

C.              Mean only

D.              Variance only

493.         Metropolis-Hastings algorithm:

A.              Accepts/rejects proposed sample → A

B.              Always accepts

C.              Always rejects

D.              Updates mean only

494.         Monte Carlo simulation estimates:

A.              Probabilities & distributions → A

B.              Mean only

C.              Median only

D.              Mode

495.         Random effects model accounts for:

A.              Group-level variation → A

B.              Fixed effect only

C.              Mean only

D.              Variance only

496.         Fixed effects model assumes:

A.              Constant effects → A

B.              Random effects

C.              Variable effects

D.              Unknown effects

497.         Mixed effects model combines:

A.              Fixed + random → A

B.              Fixed only

C.              Random only

D.              Neither

498.         Multilevel modeling is used for:

A.              Nested data → A

B.              Independent data

C.              Continuous only

D.              Categorical only

499.         In count data modeled as Poisson, overdispersion occurs when:

A.              Variance > mean → A

B.              Variance < mean

C.              Variance = mean

D.              Mean = 0
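
Worked example (a minimal sketch): a Poisson model implies variance equal to the mean, so the ratio of sample variance to sample mean is a quick first check. A ratio well above 1 suggests overdispersion. The count data below are made up for illustration, and the ratio is a rough diagnostic, not a formal test.

```python
import statistics


def dispersion_index(counts):
    """Sample variance divided by sample mean (about 1 under a Poisson model)."""
    return statistics.variance(counts) / statistics.fmean(counts)


clumped = [0, 1, 0, 2, 9, 0, 1, 12, 0, 3]  # many zeros plus a few large counts
even = [2, 3, 2, 4, 3, 2, 3, 3, 2, 4]      # counts tightly packed around the mean

print(round(dispersion_index(clumped), 2))  # well above 1: overdispersed
print(round(dispersion_index(even), 2))     # below 1: underdispersed
```

Both samples have the same mean (2.8), so the difference in the index comes entirely from the variance.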

500.         Negative binomial regression handles:

A.              Overdispersed count data → A

B.              Binary data

C.              Continuous data

D.              Ordinal data


