Aquileo | Type II Error in Hypothesis Testing with R Programming

A Type II error occurs in hypothesis testing when we fail to reject the null hypothesis even though it is false. In other words, we miss a real effect or difference that actually exists. It is more likely when the sample size is small, when the true effect is small relative to the variability in the data, or when the significance level is very strict (a small alpha).

Mathematical Definition of Type II Error:

\beta = P(\text{fail to reject } H_0 \mid H_0 \text{ is false})

Note that:

P(X) is the probability of the event X happening.
\beta is the conventional symbol for the probability of a Type II error.
H_0 = Null hypothesis
H_a = Alternative hypothesis
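
The complement of the Type II error probability is the power of the test. As a brief restatement in symbols, writing \beta for the Type II error probability (the conventional symbol, used here for convenience):

```latex
\begin{aligned}
\beta &= P(\text{fail to reject } H_0 \mid H_0 \text{ is false}) \\
\text{Power} &= 1 - \beta = P(\text{reject } H_0 \mid H_0 \text{ is false})
\end{aligned}
```

A test with a low Type II error rate therefore has high power, and the two always sum to 1 for a given true alternative.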

Implementation of Type II Error in Hypothesis Testing using R

We simulate Type II error in hypothesis testing by repeatedly drawing samples from a population where the null hypothesis is false and measuring how often we fail to reject the null hypothesis. We use the R programming language for the implementation.

1. Defining the Function to Simulate Type II Error

We define a function that performs repeated sampling, calculates p-values, and estimates the proportion of times the null hypothesis is not rejected when it should be.

typeII.test: Custom function to estimate Type II error through simulations.
mu0: Assumed mean under the null hypothesis.
TRUEmu: Actual true mean of the population.
sigma: Standard deviation of the population.
n: Number of observations in each sample.
alpha: Significance level of the one-sided (upper-tailed) t-test.
iterations: Number of samples simulated.
rnorm: Generates random values from a normal distribution.
sd: Calculates sample standard deviation.
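pt: Returns the cumulative probability of the t-distribution at a given t-value, so 1 - pt(...) is the upper-tailed p-value.
mean(pvals >= alpha): Calculates the proportion of p-values greater than or equal to alpha, that is, the proportion of samples in which we fail to reject the null hypothesis.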

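typeII.test <- function(mu0, TRUEmu, sigma, n, alpha, iterations = 10000){
  # One p-value per simulated sample
  pvals <- numeric(iterations)
  for(i in seq_len(iterations)){
    # Draw a sample from the true population, where the null hypothesis is false
    temporary.sample <- rnorm(n = n, mean = TRUEmu, sd = sigma)
    temporary.mean <- mean(temporary.sample)
    temporary.sd <- sd(temporary.sample)
    # One-sample t statistic and its upper-tailed p-value
    t.stat <- (temporary.mean - mu0) / (temporary.sd / sqrt(n))
    pvals[i] <- pt(t.stat, df = n - 1, lower.tail = FALSE)
  }
  # Proportion of samples in which the null hypothesis is not rejected
  return(mean(pvals >= alpha))
}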
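
As a cross-check, the same simulation can be sketched with R's built-in t.test() instead of computing the t statistic by hand. This is a minimal sketch rather than the article's implementation: the function name typeII.sim is hypothetical, and alternative = "greater" is assumed to match the upper-tailed p-value computed in typeII.test.

```r
# Hypothetical helper: estimates the Type II error rate using t.test().
typeII.sim <- function(mu0, TRUEmu, sigma, n, alpha, iterations = 10000) {
  pvals <- replicate(iterations, {
    # Sample from the true population, where the null hypothesis is false
    x <- rnorm(n = n, mean = TRUEmu, sd = sigma)
    # Upper-tailed one-sample t-test against the null mean
    t.test(x, mu = mu0, alternative = "greater")$p.value
  })
  # Proportion of samples in which the null hypothesis is not rejected
  mean(pvals >= alpha)
}

set.seed(123)  # fix the random seed so the run is reproducible
estimate <- typeII.sim(mu0 = 4, TRUEmu = 10, sigma = 5, n = 10,
                       alpha = 0.03, iterations = 2000)
print(estimate)
```

Setting a seed makes the estimate repeatable; without it, each run gives a slightly different value.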

2. Estimating Type II Error for sigma = 3

We first run the simulation with a population standard deviation of sigma = 3, so the true mean (10) lies two standard deviations above the null mean (4).

n <- 10
sigma <- 3
alpha <- 0.03
mu0 <- 4
TRUEmu <- 10
typeII.test(mu0, TRUEmu, sigma, n, alpha, iterations = 10000)

Output:

0

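An estimate of 0 means the null hypothesis was rejected in all 10,000 simulated samples. The true Type II error rate is not exactly zero, but it is too small for a simulation of this size to resolve.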
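
For normally distributed data, the Type II error rate of this upper-tailed one-sample t-test can also be computed directly from the noncentral t-distribution, which gives a value to compare against the simulated estimate. The following is a minimal sketch under that normality assumption; the helper name exact.typeII is hypothetical, and it relies on the ncp argument of pt().

```r
# Hypothetical helper: Type II error of an upper-tailed one-sample t-test,
# assuming normally distributed data.
exact.typeII <- function(mu0, TRUEmu, sigma, n, alpha) {
  df <- n - 1
  t.crit <- qt(1 - alpha, df = df)           # rejection threshold for the t statistic
  ncp <- (TRUEmu - mu0) / (sigma / sqrt(n))  # noncentrality parameter under the true mean
  pt(t.crit, df = df, ncp = ncp)             # P(t statistic <= threshold) = P(fail to reject)
}

beta.sigma3 <- exact.typeII(mu0 = 4, TRUEmu = 10, sigma = 3, n = 10, alpha = 0.03)
print(beta.sigma3)
```

The result should be a small positive number, which is consistent with observing no failures to reject in 10,000 simulated samples.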

3. Estimating Type II Error for sigma = 5

We increase the standard deviation to sigma = 5, keeping all other settings fixed, and re-run the function to see how the error rate changes.

n <- 10
sigma <- 5
alpha <- 0.03
mu0 <- 4
TRUEmu <- 10
typeII.test(mu0, TRUEmu, sigma, n, alpha, iterations = 10000)

Output:

0.0622

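The estimated Type II error rises from 0 to about 6.2% as sigma increases from 3 to 5. With the difference between the true and null means fixed at 6, a larger standard deviation shrinks the standardized effect (TRUEmu - mu0) / sigma from 2 to 1.2, so the test is less likely to detect it. Because the simulation is random and no seed is set, the exact value will vary slightly between runs.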
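
To see the trend across more values of sigma without re-running the simulation, one option is the built-in power.t.test() function from the stats package, which returns the power of a t-test; the Type II error is one minus that power. This is a sketch under the same settings as above (n = 10, alpha = 0.03, a true difference of 10 - 4 = 6), and the variable names and the extra sigma values 7 and 9 are illustrative.

```r
sigmas <- c(3, 5, 7, 9)

# Type II error = 1 - power for an upper-tailed one-sample t-test
betas <- sapply(sigmas, function(s) {
  1 - power.t.test(n = 10, delta = 10 - 4, sd = s, sig.level = 0.03,
                   type = "one.sample", alternative = "one.sided")$power
})

print(data.frame(sigma = sigmas, typeII = round(betas, 4)))
```

The Type II error should increase steadily with sigma, matching the direction of the change seen in the two simulations.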
