Facts about the F Distribution

Lumen Learning; OpenStax

Learning Outcomes

Discuss two uses for the F distribution: one-way ANOVA and the test of two variances

Here are some facts about the F distribution.

The curve is not symmetrical but skewed to the right.
There is a different curve for each set of dfs.
The F statistic is greater than or equal to zero.
As the degrees of freedom for the numerator and for the denominator get larger, the curve approximates the normal distribution.
Other uses for the F distribution include comparing two variances and two-way analysis of variance (ANOVA). Two-way ANOVA is beyond the scope of this chapter.
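The facts above can be checked numerically. The following is a minimal sketch, not part of the original text: the function names `f_pdf`, `f_mode`, and `f_mean` are illustrative, and the formulas used are the standard density, mode (valid for numerator df > 2), and mean (valid for denominator df > 2) of the F distribution. For each pair of dfs the peak of the curve lies to the left of the mean, which is what right skew looks like, and the gap shrinks as both dfs grow.

```python
import math

def f_pdf(x, d1, d2):
    """Density of the F distribution with (d1, d2) degrees of freedom at x > 0."""
    log_beta = math.lgamma(d1 / 2) + math.lgamma(d2 / 2) - math.lgamma((d1 + d2) / 2)
    log_pdf = (
        (d1 / 2) * math.log(d1 / d2)
        + (d1 / 2 - 1) * math.log(x)
        - ((d1 + d2) / 2) * math.log(1 + d1 * x / d2)
        - log_beta
    )
    return math.exp(log_pdf)

def f_mode(d1, d2):
    """Location of the peak of the curve (assumes d1 > 2)."""
    return (d1 - 2) / d1 * d2 / (d2 + 2)

def f_mean(d2):
    """Mean of the distribution (assumes d2 > 2)."""
    return d2 / (d2 - 2)

# A different curve for each set of dfs: the peak location and height change.
for d1, d2 in [(4, 10), (3, 16), (40, 100)]:
    mode = f_mode(d1, d2)
    print(f"df = ({d1}, {d2}): mode = {mode:.3f}, mean = {f_mean(d2):.3f}, "
          f"peak height = {f_pdf(mode, d1, d2):.3f}")
```

The pairs (4, 10) and (3, 16) are the dfs used in the two problems that follow; (40, 100) is an arbitrary larger pair chosen only for contrast.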

Try It

MRSA, or methicillin-resistant Staphylococcus aureus, can cause serious bacterial infections in hospital patients. This table shows various colony counts from different patients who may or may not have MRSA, grouped by tryptone concentration (Conc).

Conc = 0.6	Conc = 0.8	Conc = 1.0	Conc = 1.2	Conc = 1.4
9	16	22	30	27
66	93	147	199	168
98	82	120	148	132

Plot of the data for the different concentrations:

This graph is a scatterplot for the data provided. The horizontal axis is labeled 'Colony counts' and extends from 0 - 200. The vertical axis is labeled 'Tryptone concentrations' and extends from 0.6 - 1.4.

Test whether the mean numbers of colonies are the same or different across the concentrations. Construct the ANOVA table (by hand or by using a TI-83, 83+, or 84+ calculator), find the p-value, and state your conclusion. Use a 5% significance level.

While there are differences in the spreads between the groups, the differences do not appear to be big enough to cause concern.

We test for the equality of mean number of colonies:

H₀: μ₁ = μ₂ = μ₃ = μ₄ = μ₅

H_a: μ_i ≠ μ_j for some i ≠ j

The one-way ANOVA table results are shown below.

Source of Variation	Sum of Squares (SS)	Degrees of Freedom (df)	Mean Square (MS)	F
Factor (Between)	10,233	5 – 1 = 4	[latex]\displaystyle\frac{10{,}233}{4}=2{,}558.25[/latex]	[latex]\displaystyle\frac{2{,}558.25}{4{,}194.9}=0.6099[/latex]
Error (Within)	41,949	15 – 5 = 10	[latex]\displaystyle\frac{41{,}949}{10}=4{,}194.9[/latex]
Total	52,182	15 – 1 = 14
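The entries of the ANOVA table can be reproduced from the raw colony counts. The following is a minimal sketch using only the Python standard library; the function name `one_way_anova` and the variable names are illustrative and are not part of the original text.

```python
# Colony counts grouped by tryptone concentration (the columns of the data table).
groups = {
    0.6: [9, 66, 98],
    0.8: [16, 93, 82],
    1.0: [22, 147, 120],
    1.2: [30, 199, 148],
    1.4: [27, 168, 132],
}

def one_way_anova(samples):
    """Return (ss_between, ss_within, df_between, df_within, f_stat)."""
    all_values = [x for s in samples for x in s]
    n, k = len(all_values), len(samples)
    grand_mean = sum(all_values) / n
    # Between-group variation: group size times squared distance of each
    # group mean from the grand mean.
    ss_between = sum(len(s) * (sum(s) / len(s) - grand_mean) ** 2 for s in samples)
    # Within-group variation: squared distance of each value from its own group mean.
    ss_within = sum((x - sum(s) / len(s)) ** 2 for s in samples for x in s)
    df_between, df_within = k - 1, n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return ss_between, ss_within, df_between, df_within, f_stat

ss_b, ss_w, df_b, df_w, f_stat = one_way_anova(list(groups.values()))
print(f"Factor: SS = {ss_b:.0f}, df = {df_b}, MS = {ss_b / df_b:.2f}")
print(f"Error:  SS = {ss_w:.0f}, df = {df_w}, MS = {ss_w / df_w:.2f}")
print(f"Total:  SS = {ss_b + ss_w:.0f}, df = {df_b + df_w}")
print(f"F = {f_stat:.4f}")
```

Because the code works with unrounded sums of squares, its mean squares can differ from the table entries in the last displayed digit.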

This graph shows a nonsymmetrical F distribution curve. The curve is skewed to the right. A vertical upward line extends from 0.6099 to the curve. This line is just to the right of the graph's peak, and the region to the right of the line is shaded to represent the p-value.

Distribution for the test: F_4,10

Probability statement: p-value = P(F > 0.6099) = 0.6649.
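When both degrees of freedom are even, as they are here, the right-tail area can be written as a finite binomial sum, so the p-value can be checked without statistical tables. The sketch below relies on the standard identity between the F right tail and the regularized incomplete beta function; the function name `f_upper_tail_even_df` is illustrative, and the approach is offered as a check rather than as the method used to produce the value in the text.

```python
from math import comb

def f_upper_tail_even_df(f_stat, d1, d2):
    """P(F > f_stat) for an F distribution whose d1 and d2 are both even."""
    if d1 % 2 or d2 % 2:
        raise ValueError("this closed form needs even degrees of freedom")
    a, b = d2 // 2, d1 // 2
    y = d2 / (d2 + d1 * f_stat)
    n = a + b - 1
    # For integer a and b, the regularized incomplete beta I_y(a, b) equals
    # the probability that a Binomial(n, y) variable is at least a.
    return sum(comb(n, j) * y ** j * (1 - y) ** (n - j) for j in range(a, n + 1))

p_value = f_upper_tail_even_df(0.6099, 4, 10)
print(f"p-value = {p_value:.4f}")
```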

Compare α and the p-value: α = 0.05, p-value = 0.6649, α < p-value

Make a decision: Since α < p-value, we do not reject H₀.

Conclusion: At the 5% significance level, there is insufficient evidence from these data that different levels of tryptone will cause a significant difference in the mean number of bacterial colonies formed.

Example

Four sororities each took a random sample of sisters and recorded their mean grades for the past term. The results are shown in the table.

Mean Grades for Four Sororities

Sorority 1	Sorority 2	Sorority 3	Sorority 4
2.17	2.63	2.63	3.79
1.85	1.77	3.78	3.45
2.83	3.25	4.00	3.08
1.69	1.86	2.55	2.26
3.33	2.21	2.45	3.18

Using a significance level of 1%, is there a difference in mean grades among the sororities?

Solution:

Let μ₁, μ₂, μ₃, μ₄ be the population means of the sororities. Remember that the null hypothesis claims that the sorority groups are from the same normal distribution. The alternate hypothesis says that at least two of the sorority groups come from populations with different normal distributions. Notice that the four sample sizes are each five.

Note

This is an example of a balanced design, because each level of the factor (i.e., each sorority) has the same number of observations.

H₀: μ₁ = μ₂ = μ₃ = μ₄

H_a: Not all of the means μ₁, μ₂, μ₃, μ₄ are equal.

Distribution for the test: F_3,16

where k = 4 groups and n = 20 observations in total

df(num) = k – 1 = 4 – 1 = 3

df(denom) = n – k = 20 – 4 = 16

Calculate the test statistic: F = 2.23
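Because the design is balanced, the test statistic can be computed with a shortcut: the between-groups mean square is the common group size times the sample variance of the group means, and the within-groups mean square is the average of the group sample variances. The following is a minimal sketch of that shortcut with Python's `statistics` module; the variable names are illustrative and are not part of the original text.

```python
from statistics import mean, variance

# Mean grades, one list per sorority (the columns of the table).
sororities = [
    [2.17, 1.85, 2.83, 1.69, 3.33],
    [2.63, 1.77, 3.25, 1.86, 2.21],
    [2.63, 3.78, 4.00, 2.55, 2.45],
    [3.79, 3.45, 3.08, 2.26, 3.18],
]

k = len(sororities)              # number of groups
group_size = len(sororities[0])  # same for every group in a balanced design
n = k * group_size               # total number of observations

df_num = k - 1
df_denom = n - k

# Balanced-design shortcut: valid only when all groups have the same size.
ms_between = group_size * variance([mean(g) for g in sororities])
ms_within = mean([variance(g) for g in sororities])
f_stat = ms_between / ms_within

print(f"df(num) = {df_num}, df(denom) = {df_denom}")
print(f"MS between = {ms_between:.5f}, MS within = {ms_within:.6f}")
print(f"F = {f_stat:.4f}")
```

With unequal group sizes this shortcut does not apply, and the sums of squares have to be computed directly.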

Graph:

This graph shows a nonsymmetrical F distribution curve with values of 0 and 2.23 on the x-axis representing the test statistic of sorority grade averages. The curve is slightly skewed to the right, but is approximately normal. A vertical upward line extends from 2.23 to the curve and the area to the right of this is shaded to represent the p-value.

Probability statement: p-value = P(F > 2.23) = 0.1241

Compare α and the p-value: α = 0.01

p-value = 0.1241

α < p-value

Make a decision: Since α < p-value, you cannot reject H₀.

Conclusion: There is not sufficient evidence to conclude that there is a difference among the mean grades for the sororities.

Using a Calculator

Put the data into lists L1, L2, L3, and L4. Press STAT and arrow over to TESTS. Arrow down to F:ANOVA. Press ENTER and enter (L1,L2,L3,L4).

The calculator displays the F statistic, the p-value and the values for the one-way ANOVA table:

F = 2.2303

p = 0.1241 (p-value)

Factor df = 3

SS = 2.88732

MS = 0.96244

Error df = 16

SS = 6.9044

MS = 0.431525
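The calculator's F statistic and p-value can be checked from the two mean squares it reports. The sketch below is one possible check, not a description of how the calculator works: it writes the right-tail area as a regularized incomplete beta integral and evaluates that integral with Simpson's rule. The function name `f_upper_tail` and the `steps` parameter are illustrative, and the code assumes a positive F statistic and at least 2 denominator degrees of freedom so that the integrand stays bounded.

```python
import math

def f_upper_tail(f_stat, d1, d2, steps=2000):
    """Approximate P(F > f_stat) for (d1, d2) degrees of freedom.

    Assumes f_stat > 0, d2 >= 2, and an even number of steps.
    """
    a, b = d2 / 2, d1 / 2
    y = d2 / (d2 + d1 * f_stat)
    log_beta = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)

    def integrand(t):
        return t ** (a - 1) * (1 - t) ** (b - 1)

    # Composite Simpson's rule on [0, y].
    h = y / steps
    total = integrand(0.0) + integrand(y)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(i * h)
    return total * h / 3 / math.exp(log_beta)

# Mean squares and degrees of freedom as reported by the calculator.
ms_factor, df_factor = 0.96244, 3
ms_error, df_error = 0.431525, 16

f_stat = ms_factor / ms_error
p_value = f_upper_tail(f_stat, df_factor, df_error)
alpha = 0.01
reject_null = p_value < alpha

print(f"F = {f_stat:.4f}, p = {p_value:.4f}, reject H0: {reject_null}")
```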