Data Science Day 4:
Chi-Square test application 1:
Goodness-of-fit test.
We use the goodness-of-fit test to check whether observed categorical data follow a hypothesized (expected) distribution.
Example 1: P-value Interpretation
Suppose f_exp holds the expected number of boys in each of the grade 1 classes, and f_obs holds the observed number of boys in those same classes. We want to test whether f_obs follows the distribution given by f_exp.
H0 (null hypothesis): the observed distribution of boys is consistent with the expected distribution.
We use the following Python code (the chisquare function from scipy.stats) to obtain the p-value:
from scipy.stats import chisquare
chisquare(f_obs=[18,15,5,8,4,3], f_exp=[10,5,7,18,10,11])
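For this particular example, the p-value is 6.02e-08, which is far smaller than 0.05. So we reject H0 and conclude that the observed distribution of boys differs from the expected distribution.
Note: to avoid a biased result, the counts in every category should be at least 5. Two of the observed counts here (4 and 3) fall below that, so the p-value should be read with some caution. Note also that the observed counts total 53 while the expected counts total 61; older SciPy versions accept this, but recent ones raise a ValueError unless the two totals agree.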
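To make the test less of a black box, here is a minimal sketch that computes the same statistic by hand. It assumes the usual Pearson formula (sum of (observed - expected)^2 / expected) and degrees of freedom equal to the number of categories minus one; the variable names are my own. Because it does not call chisquare, it runs even where SciPy insists that the two totals match, and it should give a p-value of the same order as the one quoted above.

```python
import numpy as np
from scipy.stats import chi2

# Counts from Example 1.
f_obs = np.array([18, 15, 5, 8, 4, 3], dtype=float)
f_exp = np.array([10, 5, 7, 18, 10, 11], dtype=float)

# Pearson statistic: sum over categories of (observed - expected)^2 / expected.
statistic = np.sum((f_obs - f_exp) ** 2 / f_exp)

# Degrees of freedom: number of categories minus one (assumed, no fitted parameters).
dof = len(f_obs) - 1

# p-value: upper-tail probability of the chi-square distribution at the statistic.
p_value = chi2.sf(statistic, dof)

print(f"statistic = {statistic:.3f}, dof = {dof}, p-value = {p_value:.3g}")
```

The second class contributes the most to the statistic (15 observed against 5 expected), which is a quick way to see where the mismatch comes from.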
Example 2: Data visualization Interpretation
We will plot a sample of size 1000 drawn from a chi-square distribution with 5 degrees of freedom, and use kernel density estimation (KDE) to fit a smooth curve to it. We can see this is a pretty good fit.
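One possible way to reproduce this experiment is sketched below. The original plotting code is not shown, so everything here is an assumption: the random seed, the use of scipy.stats.gaussian_kde with its default bandwidth, and the helper name plot_fit are all my own choices. The sketch draws the sample, fits the KDE, and measures how far the KDE curve is from the true chi-square density.

```python
import numpy as np
from scipy.stats import chi2, gaussian_kde

DOF = 5             # degrees of freedom, from the text
SAMPLE_SIZE = 1000  # sample size, from the text

# Fixed seed (an assumption) so the run is repeatable.
rng = np.random.default_rng(0)
sample = rng.chisquare(DOF, size=SAMPLE_SIZE)

# Gaussian-kernel density estimate with SciPy's default bandwidth.
kde = gaussian_kde(sample)

# Evaluate the KDE and the true density on a common grid.
grid = np.linspace(0, sample.max(), 200)
kde_density = kde(grid)
true_density = chi2.pdf(grid, DOF)

# Largest vertical distance between the two curves.
max_gap = np.max(np.abs(kde_density - true_density))
print(f"largest gap between KDE and true density: {max_gap:.4f}")


def plot_fit():
    """Draw the histogram, the KDE curve, and the true density (needs matplotlib)."""
    import matplotlib.pyplot as plt

    plt.hist(sample, bins=30, density=True, alpha=0.4, label="sample")
    plt.plot(grid, kde_density, label="KDE")
    plt.plot(grid, true_density, "--", label="chi-square pdf, 5 dof")
    plt.legend()
    plt.show()
```

Calling plot_fit() shows the picture; the printed gap gives a rough numeric check that does not depend on reading the graph by eye.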
To be continued...
Author: 乌然娅措
Link: https://www.jianshu.com/p/a352952a7899