饮歌长啸
You can put them into a DataFrame and select the columns (the output also looks nicer that way):

    import statsmodels.api as sm
    import pandas as pd
    import numpy as np

    Height = np.random.uniform(0, 1, 100)
    Weight = np.random.uniform(0, 1, 100)
    Age = np.random.uniform(0, 30, 100)
    df = pd.DataFrame({'Height': Height, 'Weight': Weight, 'Age': Age})

    # No constant column is added, so this model is fit without an intercept
    res = sm.OLS(df['Height'], df[['Weight', 'Age']]).fit()

    In [10]: res.summary()
    Out[10]:
    <class 'statsmodels.iolib.summary.Summary'>
    """
                                     OLS Regression Results
    =======================================================================================
    Dep. Variable:                 Height   R-squared (uncentered):                   0.700
    Model:                            OLS   Adj. R-squared (uncentered):              0.694
    Method:                 Least Squares   F-statistic:                              114.3
    Date:                Mon, 24 Aug 2020   Prob (F-statistic):                    2.43e-26
    Time:                        15:54:30   Log-Likelihood:                         -28.374
    No. Observations:                 100   AIC:                                      60.75
    Df Residuals:                      98   BIC:                                      65.96
    Df Model:                           2
    Covariance Type:            nonrobust
    ==============================================================================
                     coef    std err          t      P>|t|      [0.025      0.975]
    ------------------------------------------------------------------------------
    Weight         0.1787      0.090      1.988      0.050       0.000       0.357
    Age            0.0229      0.003      8.235      0.000       0.017       0.028
    ==============================================================================
    Omnibus:                        2.938   Durbin-Watson:                   1.813
    Prob(Omnibus):                  0.230   Jarque-Bera (JB):                2.223
    Skew:                          -0.211   Prob(JB):                        0.329
    Kurtosis:                       2.404   Cond. No.                         49.7
    ==============================================================================
POPMUISE
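I used a second-order polynomial to predict how height and age affect a soldier's weight. You can get ansur_2_m.csv from my GitHub.

    import numpy as np
    import pandas as pd
    import matplotlib.pyplot as plt
    import statsmodels.formula.api as smf

    df = pd.read_csv('ANSUR_2_M.csv', encoding="ISO-8859-1",
                     usecols=['Weightlbs', 'Heightin', 'Age'],
                     # np.integer is an abstract type; use a concrete integer dtype
                     dtype={'Weightlbs': int, 'Heightin': int, 'Age': int})
    # reset_index returns a new frame, so the result has to be assigned
    df = df.dropna().reset_index(drop=True)

    # Squared terms for the second-order polynomial
    df['Heightin2'] = df['Heightin']**2
    df['Age2'] = df['Age']**2

    formula = "Weightlbs ~ Heightin+Heightin2+Age+Age2"
    model_ols = smf.ols(formula, data=df).fit()

    minHeight = df['Heightin'].min()
    maxHeight = df['Heightin'].max()
    medianAge = df['Age'].median()
    print(minHeight, maxHeight, medianAge)

    # Prediction grid for a 28-year-old
    df2 = pd.DataFrame()
    df2['Heightin'] = np.linspace(60, 100, 50)
    df2['Heightin2'] = df2['Heightin']**2
    df2['Age'] = 28
    df2['Age2'] = df2['Age']**2   # square the grid's own Age, not the training data's

    # Prediction grid for a 45-year-old
    df3 = pd.DataFrame()
    df3['Heightin'] = np.linspace(60, 100, 50)
    df3['Heightin2'] = df3['Heightin']**2
    df3['Age'] = 45
    df3['Age2'] = df3['Age']**2

    prediction28 = model_ols.predict(df2)
    prediction45 = model_ols.predict(df3)

    plt.clf()
    plt.plot(df2['Heightin'], prediction28, label="Age 28")
    plt.plot(df3['Heightin'], prediction45, label="Age 45")
    plt.ylabel("Weight lbs")   # call the function; assigning to it would overwrite it
    plt.xlabel("Height in")
    plt.legend()
    plt.show()
    print('A 45-year-old soldier is likely to weigh more than a 28-year-old soldier')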
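As a variation on the approach above, the squared terms can be written inside the formula with `I(...)`, so the prediction frames only need the raw `Heightin` and `Age` columns and there is no way to pair a grid with the wrong squared column. This is a sketch on synthetic data, not the ANSUR file: the frame `train`, its coefficients, the seed, and the 60 to 80 inch range are all assumptions for illustration.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)  # fixed seed (assumption)
n = 500
# Synthetic stand-in for the survey data (made-up relationship, illustration only)
train = pd.DataFrame({
    "Heightin": rng.uniform(60, 80, n),
    "Age": rng.uniform(18, 55, n),
})
train["Weightlbs"] = (-100 + 4.0 * train["Heightin"]
                      + 1.0 * train["Age"] + rng.normal(0, 10, n))

# I(...) makes the formula engine evaluate the expression as plain arithmetic
formula = "Weightlbs ~ Heightin + I(Heightin**2) + Age + I(Age**2)"
model = smf.ols(formula, data=train).fit()

# Only the raw columns are needed; the squares are rebuilt from the formula
grid = pd.DataFrame({"Heightin": np.linspace(60, 80, 50), "Age": 28})
pred28 = model.predict(grid)
pred45 = model.predict(grid.assign(Age=45))

print(pred28.head())
print(pred45.head())
```

The trade-off is that the coefficient names in `model.params` become the formula expressions rather than short column names such as `Heightin2`.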