2025-10-06 13:32 Tags:Bordeaux

Two-sample t-test Example

What is a t-test?

A t-test is a statistical test that compares means (averages).
It asks:

“Are the observed differences in means big enough that they’re unlikely to be due to random chance?”

Question: Do mothers of low-birth-weight babies have a different average weight compared to mothers of normal-birth-weight babies?


Step 1: Hypotheses

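  • Null hypothesis: $H_0: \mu_1 = \mu_2$ (mean maternal weight is the same in both groups)
  • Alternative hypothesis: $H_1: \mu_1 \neq \mu_2$ (mean maternal weight differs between the groups)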

Step 2: Sample data (illustration)

  • Group 1 (low BW mothers):

  • Group 2 (normal BW mothers):


Step 3: Formula for two-sample t-test (equal variances)

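  1. Pooled variance: $s_p^2 = \dfrac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$
  2. Standard error (SE): $SE = s_p \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}$
  3. t-statistic: $t = \dfrac{\bar{x}_1 - \bar{x}_2}{SE}$

Here $n_i$, $\bar{x}_i$, and $s_i$ are the sample size, sample mean, and sample standard deviation of group $i$.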
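The three formulas in this step can be sketched in a few lines of Python. This is a minimal illustration, not the note's own calculation: the function name `pooled_t_statistic` and the summary statistics passed to it are hypothetical, chosen only to exercise the arithmetic.

```python
import math

def pooled_t_statistic(n1, mean1, sd1, n2, mean2, sd2):
    """Equal-variance two-sample t-test from summary statistics.

    Returns (pooled_variance, standard_error, t, degrees_of_freedom).
    """
    df = n1 + n2 - 2
    # Pooled variance: weighted average of the two sample variances.
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    # Standard error of the difference in means.
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    t = (mean1 - mean2) / se
    return pooled_var, se, t, df

# Hypothetical inputs (NOT the data from this note).
pooled_var, se, t, df = pooled_t_statistic(
    n1=10, mean1=100.0, sd1=10.0,
    n2=10, mean2=110.0, sd2=10.0,
)
print(pooled_var, se, t, df)
```

The resulting `t` would then be compared with the critical value of the t distribution with `df` degrees of freedom, which the standard library does not provide; a statistics table or a package such as SciPy is one way to look it up.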

Step 4: Plug in numbers

  1. Pooled variance:

So pooled SD:

  2. Standard error:
  3. t-statistic:

Step 5: Decision

  • Degrees of freedom: $df = n_1 + n_2 - 2$
  • Critical value (two-sided, level $\alpha$): $t_{1-\alpha/2,\,df}$

  • Our test statistic:

Since $|t| > t_{1-\alpha/2,\,df}$, we reject $H_0$.


✅ Conclusion

There is a significant difference in maternal weights between mothers of low vs. normal birth weight babies (p < 0.01).

Chi-square Test of Independence Example

1. What is a chi-square test?

A chi-square test is used for categorical variables (yes/no, male/female, race groups, etc.).

It answers questions like:

“Are these two categorical variables related, or are they independent?”
“Do proportions differ across groups?”


Step 1: Hypotheses

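  • Null hypothesis: $H_0$: race and birth weight category are independent (the proportion of low birth weight is the same in every racial group)
  • Alternative hypothesis: $H_1$: race and birth weight category are associated (the proportion differs in at least one group)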

Step 2: Sample data (illustration)

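| Race  | Low BW (1) | Normal BW (0) | Total |
|-------|------------|---------------|-------|
| White | 20         | 180           | 200   |
| Black | 15         | 85            | 100   |
| Other | 10         | 90            | 100   |
| Total | 45         | 355           | 400   |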

Step 3: Expected counts

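Formula:

$$E_{ij} = \frac{(\text{row } i \text{ total}) \times (\text{column } j \text{ total})}{\text{grand total}}$$

Example for White–Low BW: $E = \dfrac{200 \times 45}{400} = 22.5$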

Do this for each cell.
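A short sketch of "do this for each cell", using the counts from the sample table in Step 2. The function name `expected_counts` and the dictionary layout are illustrative choices, not part of the original note.

```python
# Observed counts from the Step 2 table: [Low BW, Normal BW] per race.
observed = {
    "White": [20, 180],
    "Black": [15, 85],
    "Other": [10, 90],
}

def expected_counts(table):
    """Expected count per cell = row total * column total / grand total."""
    row_totals = {race: sum(counts) for race, counts in table.items()}
    col_totals = [sum(col) for col in zip(*table.values())]
    grand_total = sum(row_totals.values())
    return {
        race: [row_totals[race] * col / grand_total for col in col_totals]
        for race in table
    }

expected = expected_counts(observed)
for race, counts in expected.items():
    print(race, counts)
```

Each row of expected counts adds up to the same row total as the observed counts, which is a quick sanity check on the arithmetic.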


Step 4: Compute chi-square statistic

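Formula:

$$\chi^2 = \sum_{\text{cells}} \frac{(O - E)^2}{E}$$

Let’s calculate a few cells:

  • White–Low BW: $\dfrac{(20 - 22.5)^2}{22.5} \approx 0.278$
  • Black–Low BW: $\dfrac{(15 - 11.25)^2}{11.25} = 1.250$
  • Other–Low BW: $\dfrac{(10 - 11.25)^2}{11.25} \approx 0.139$

… and similarly for the “Normal BW” cells.

Add them all up:

$$\chi^2 \approx 0.278 + 1.250 + 0.139 + 0.035 + 0.158 + 0.018 \approx 1.88$$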


Step 5: Decision

  • Degrees of freedom: $df = (r - 1)(c - 1) = (3 - 1)(2 - 1) = 2$
  • Critical value (α = 0.05, df = 2): 5.99

  • Our test statistic: $\chi^2 \approx 1.88$

Since $1.88 < 5.99$, we fail to reject $H_0$.
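The whole chi-square calculation for the Step 2 table can be checked with a small standard-library sketch. The function name `chi_square_independence` is an illustrative choice. The p-value line relies on a special case: for exactly 2 degrees of freedom the chi-square upper tail equals exp(-x/2), so it would not carry over to tables of other sizes.

```python
import math

# Rows: White, Black, Other. Columns: Low BW, Normal BW (Step 2 table).
observed = [[20, 180], [15, 85], [10, 90]]

def chi_square_independence(table):
    """Return (chi-square statistic, degrees of freedom) for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            exp = row_totals[i] * col_totals[j] / grand_total
            stat += (obs - exp) ** 2 / exp
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df

chi2, df = chi_square_independence(observed)
critical_value = 5.99  # alpha = 0.05, df = 2, as quoted above
reject_h0 = chi2 > critical_value

# Closed-form upper tail, valid only because df == 2 here.
p_value = math.exp(-chi2 / 2)

print(round(chi2, 2), df, reject_h0, round(p_value, 2))
```

With these counts the statistic comes out at about 1.88 and the p-value at about 0.39, in line with failing to reject the null hypothesis.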


✅ Conclusion

There is no significant evidence that the proportion of low birth weight infants differs across racial groups (p > 0.05).


Conclusion

🧩 Step 1: What are we comparing?

The table is all about bivariate comparisons — comparing an outcome (dependent variable) across one or more groups (independent variable).

Two key questions:

  1. Is the outcome variable continuous (a number like weight, height, blood pressure) or categorical (like yes/no, race, low vs normal)?

  2. How many groups are we comparing? (1, 2, or more?)


🧩 Step 2: Continuous outcomes

If your dependent variable is continuous:

a) One group vs. a fixed value

  • Example: Is the average maternal weight = 120 pounds?

  • Test: One-sample t-test (parametric) or Wilcoxon signed-rank test (nonparametric).

  • Why? Because you’re comparing the sample mean/median to a known constant.


b) Two independent groups

  • Example: Is mean maternal weight different for low vs. normal birth weight infants?

  • Test: Two-sample t-test (parametric) or Mann–Whitney U (nonparametric).

  • Why? Because you want to compare averages of two groups.


c) More than two groups

  • Example: Is mean infant birth weight different across 3 race categories?

  • Test: ANOVA (parametric) or Kruskal–Wallis test (nonparametric).

  • Why? ANOVA extends the t-test idea to more than 2 groups.


d) Two matched/paired groups

  • Example: Compare a mother’s blood pressure before pregnancy vs. during pregnancy (same person, measured twice).

  • Test: Paired t-test (parametric) or Wilcoxon signed-rank test (nonparametric).

  • Why? Because the data points are linked (paired).


🧩 Step 3: Categorical outcomes

If your dependent variable is categorical (yes/no, or categories like race):

a) Two independent groups

  • Example: Compare proportion of low birth weight in smokers vs. non-smokers.

  • Test: z-test for proportions (parametric) or Chi-square test (nonparametric). If expected cell counts are small (a common rule of thumb is below 5), use Fisher’s exact test.

  • Why? Because you’re comparing proportions between 2 groups.


b) More than two groups

  • Example: Does the proportion of low birth weight differ across 3 races?

  • Test: Chi-square test of independence.

  • Why? Because chi-square is the standard tool for comparing categorical distributions across groups.


c) Paired/matched binary data

  • Example: Did the same baby get classified as “low birth weight” by two doctors? (paired yes/no outcomes).

  • Test: McNemar’s test.

  • Why? Because it’s for paired categorical data (think 2x2 table with matched pairs).


🧩 Step 4: Continuous vs Continuous

  • Example: Is maternal weight correlated with infant birth weight?

  • Test: Pearson’s correlation (parametric) or Spearman’s correlation (nonparametric).

  • Why? Because both are continuous, and you’re looking for association.
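To make the Pearson versus Spearman distinction concrete, here is a minimal standard-library sketch. The helper names are illustrative, the data are hypothetical (a perfectly monotonic but non-linear relationship), and the ranking helper assumes there are no tied values.

```python
import math

def pearson(x, y):
    """Pearson correlation: strength of the linear association."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def ranks(values):
    """Ranks 1..n in ascending order; assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        result[idx] = rank
    return result

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

# Hypothetical data: y grows with x, but not in a straight line.
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]
r_pearson = pearson(x, y)
r_spearman = spearman(x, y)
print(r_pearson, r_spearman)
```

Because Spearman only looks at ranks, it reaches 1 for any strictly increasing relationship, while Pearson stays below 1 unless the points fall on a straight line.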


🧩 Step 5: Parametric vs. Nonparametric

  • Parametric tests (t-test, ANOVA, z-test, Pearson) assume the data follow certain distributions (usually normal).

  • Nonparametric tests (Wilcoxon, Mann–Whitney, Kruskal–Wallis, Spearman, Fisher’s exact) don’t rely on those assumptions.

  • Rule of thumb: if the sample is small, or the data are strongly skewed or contain outliers → go nonparametric.


✅ In plain words:

  • If your outcome is continuous → you’re comparing means → use t-test (2 groups), ANOVA (≥3 groups), or paired t-test.

  • If your outcome is categorical → you’re comparing proportions → use chi-square, z-test, or Fisher’s exact.

  • If both are continuous → use correlation (Pearson or Spearman).
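The group-comparison part of this decision guide can be written down as a small lookup function. This is only a sketch of the rules listed in Steps 2 and 3: the function name `choose_test` and its parameters are hypothetical, and it deliberately leaves out the continuous-vs-continuous case and any design the note does not cover.

```python
def choose_test(outcome, groups, paired=False, parametric=True):
    """Suggest a test from outcome type ('continuous' or 'categorical'),
    number of groups, pairing, and whether parametric assumptions hold."""
    if paired and groups != 2:
        raise ValueError("paired designs are only covered for two groups")

    if outcome == "continuous":
        if groups == 1:
            return "one-sample t-test" if parametric else "Wilcoxon signed-rank test"
        if groups == 2 and paired:
            return "paired t-test" if parametric else "Wilcoxon signed-rank test"
        if groups == 2:
            return "two-sample t-test" if parametric else "Mann-Whitney U test"
        if groups >= 3:
            return "ANOVA" if parametric else "Kruskal-Wallis test"
    elif outcome == "categorical":
        if groups == 2 and paired:
            return "McNemar's test"
        if groups == 2:
            return "z-test for proportions" if parametric else "chi-square test"
        if groups >= 3:
            return "chi-square test of independence"

    raise ValueError(f"no rule for outcome={outcome!r}, groups={groups!r}")

# Examples taken from the scenarios described above.
print(choose_test("continuous", 2))                    # low vs. normal BW mothers
print(choose_test("continuous", 3, parametric=False))  # birth weight across 3 races
print(choose_test("categorical", 2, paired=True))      # two doctors, same babies
```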


Two-sample z-test (with known variances)

Question: Do two independent populations have the same mean, assuming their variances are known?


Step 1: Hypotheses

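  • Null hypothesis: $H_0: \mu_1 = \mu_2$
  • Alternative hypothesis (depends on question): $H_1: \mu_1 \neq \mu_2$ (two-sided)

or $H_1: \mu_1 > \mu_2$ / $H_1: \mu_1 < \mu_2$ (one-sided)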


Step 2: Sampling distribution of the difference

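The sample means are:

$$\bar{X}_1 = \frac{1}{n_1}\sum_{i=1}^{n_1} X_{1i}, \qquad \bar{X}_2 = \frac{1}{n_2}\sum_{i=1}^{n_2} X_{2i}$$

The difference $\bar{X}_1 - \bar{X}_2$ has expectation:

$$E[\bar{X}_1 - \bar{X}_2] = \mu_1 - \mu_2$$

And variance:

$$\mathrm{Var}(\bar{X}_1 - \bar{X}_2) = \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}$$

So the standard error (SE) is:

$$SE = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$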

Step 3: Test statistic

Under $H_0$:

$$Z = \frac{\bar{X}_1 - \bar{X}_2}{SE} \sim N(0, 1)$$

Step 4: Decision rule

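  • Choose significance level $\alpha$ (e.g., 0.05).
  • Two-sided test: reject $H_0$ if $|Z| > z_{\alpha/2}$.
  • One-sided test: reject $H_0$ if $Z > z_{\alpha}$ (for $H_1: \mu_1 > \mu_2$) or $Z < -z_{\alpha}$ (for $H_1: \mu_1 < \mu_2$).
  • Or compute the p-value and compare with $\alpha$.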
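The decision rule can be sketched with Python's standard-library `statistics.NormalDist`. The function name `two_sample_z_test` and the inputs below are hypothetical, chosen only to show the mechanics; they are not the values of the numerical example in this note.

```python
import math
from statistics import NormalDist

def two_sample_z_test(mean1, sigma1, n1, mean2, sigma2, n2, alpha=0.05):
    """Two-sided two-sample z-test with known population SDs.

    Returns (z, two_sided_p_value, critical_value, reject_h0).
    """
    se = math.sqrt(sigma1 ** 2 / n1 + sigma2 ** 2 / n2)
    z = (mean1 - mean2) / se
    std_normal = NormalDist()  # mean 0, standard deviation 1
    p_value = 2 * (1 - std_normal.cdf(abs(z)))
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    return z, p_value, z_crit, abs(z) > z_crit

# Hypothetical inputs (NOT the data from this note).
z, p_value, z_crit, reject_h0 = two_sample_z_test(
    mean1=52.0, sigma1=10.0, n1=100,
    mean2=50.0, sigma2=10.0, n2=100,
)
print(z, p_value, z_crit, reject_h0)
```

Comparing `abs(z)` with the critical value and comparing the p-value with `alpha` always lead to the same decision; the function returns both so either form of the rule can be read off.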

Step 5: Numerical Example

Suppose:

  • Population 1: , ,
  • Population 2: , ,

A. Compute SE

B. Test statistic

C. Decision

  • Degrees of freedom: not needed (Z test uses normal distribution).
  • Critical value for two-sided $\alpha = 0.05$: $z_{0.025} = 1.96$.
  • Our .

✅ Reject $H_0$.


✅ Conclusion

There is a significant difference between the two population means (p < 0.01).