Data Science Day 21: F -test and t-test
From last time we know t-test is used for comparing the mean of 2-level categorical variable and ANOVA is used for comparing the mean value of a 3-level categorical variable or more.
Question:
However, there is a question bugs me, why both T-test and ANOVA are comparing the mean value, but** one P-value comes from the t-test and the other P-value is derived from the F-test**?
[caption id=“attachment_1249” align=“alignnone” width=“300”]
Pexels / Pixabay[/caption]
I did a bit research into this and discussed with little Rain, then we found out the key relation to answer is the equivalence of F and t-test.
Answer:
KaTeX parse error: Expected 'EOF', got ' ' at position 9: F= t^{2} ̲
The hidden reason is when pair of the sample are normally distributed then the ratios of variance of sample in each pair will always follow the same distribution. Therefore, the t-test and F-test generate the same p-values.
Example : F-test vs t-test in Blood pressure decrease dataset
We want to know if the blood pressure medication has changed the blood pressure for 15 patients after 6 months.
test=pd.DataFrame({"score_decrease": [ -5, -8, 0, 0, 0 ,2,4,6,8, 10,10, 10,18,26,32] })
center=pd.DataFrame({"score_remained": [ 0, 0, 0, 0, 0 ,0,0,0,0, 0,0, 0,0,0,0] })
F-test results:
scipy.stats.f_oneway(score_decrease,score_remained)
F_onewayResult(statistic=array([ 7.08657734]), pvalue=array([ 0.01272079]))
t-test results:
scipy.stats.ttest_ind(score_decrease, score_remained)
Ttest_indResult(statistic=array([ 2.66206261]), pvalue=array([ 0.01272079]))
As we can see the F-test and t-test have the same P-value= 0.0127.
I used SAS to generate a graph:
ods graphics on;
proc ttest h0=0 plots(showh0) sides=u alpha=0.05;
var decrease;
run;
ods graphics off;
Summary:
Except for F=t2F=t^2F=t2, I summarized a table for F-test and t-test.
##t- test & F-test Assumption ##
- Observations are Independent and Random
- The population are Normally distributed
- No outliers
####t-test Null-hypothesis:
The mean value of the two groups are the same.
The mean value = n0.
F-test Null hypothesis:
The mean value of three or more groups are the same.
N1=N2=N3…
t-test Features
The Standard deviation is not known and Sample size is small.
F-test Features:
The variance of the normal populations is not known.
t-test Application:
1.Compare mean value of two groups.
2.Compare mean value of a group with a particular number.
F-test Application:
- comparing the variances of two or more populations.
- ANOVA comparing the mean value of 3 or more groups.