
Conclusions are recorded here.

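With logistic regression you do not need to choose a function form by hand (linear, polynomial of some degree, exponential, and so on).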
code: https://github.com/hanbingjiao/ML-Python3-LogisticRegression-Demo/
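CSV download: https://github.com/weiwanling/TensorFlow_Experiment/tree/1241a064d780cdcf278e4f8edec0f7bbd9bc0aea
Judge the quality of the model from several angles, not just one metric.
Plot charts to visualize the data.
pandas: basic data analysis toolkit
scikit-learn: a powerful library for data analysis and modeling
keras: artificial neural networks
Problem encountered while downloading the dataset: the verification (CAPTCHA) page on Kaggle's sign-up form would not load. The fix is to install 谷歌访问助手 (Google Access Helper); a Baidu search turns up plenty of instructions.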
import pandas as pd
path = 'data/pima-indians-diabetes.csv'
pima=pd.read_csv(path)
pima.head()
# Assign X (features) and y (target)
feature_names = ['pregnant', 'insulin', 'bmi', 'age']
X = pima[feature_names]
y = pima.label
# Check the dimensions
print(X.shape)
print(y.shape)
# Split into training and test sets
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
# Train the model
from sklearn.linear_model import LogisticRegression
logReg = LogisticRegression()
logReg.fit(X_train, y_train)
# Predict on the test set
y_pred = logReg.predict(X_test)
# Accuracy on the test set
from sklearn import metrics
print(metrics.accuracy_score(y_test, y_pred))
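# Check the positive/negative class counts and the null accuracy
print(y_test.value_counts())
print(y_test.mean())        # share of positive samples (label 1)
print(1 - y_test.mean())    # share of negative samples (label 0)
print(max(y_test.mean(), 1 - y_test.mean()))  # null accuracy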
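Null accuracy is the accuracy you would get by always predicting the most frequent class, so it is the baseline the model's accuracy should beat. A minimal pure-Python sketch of that idea follows; the helper name `null_accuracy` and the toy labels are made up for illustration and are not taken from the Pima data.

```python
from collections import Counter

def null_accuracy(labels):
    """Accuracy of always predicting the most frequent class."""
    counts = Counter(labels)
    # most_common(1) returns [(label, count)] for the largest class
    most_common_count = counts.most_common(1)[0][1]
    return most_common_count / len(labels)

# Toy labels (hypothetical): six negatives, four positives.
toy_labels = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1]
print(null_accuracy(toy_labels))
```

Unlike `max(y.mean(), 1 - y.mean())`, which assumes 0/1 labels, counting classes also works for string labels or for more than two classes.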
# Show the first 25 actual values alongside the predictions
print(y_test.values[0:25])
print(y_pred[0:25])
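# Compute and display the confusion matrix
from sklearn import metrics
confusion = metrics.confusion_matrix(y_test, y_pred)
print(confusion)
# Unpack the four cells (rows = actual class, columns = predicted class)
TN = confusion[0][0]
FP = confusion[0][1]
FN = confusion[1][0]
TP = confusion[1][1]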
# Compute the metrics
accuracy = (TP + TN) / (TP + TN + FP + FN)
mis_rate = (FP + FN) / (TP + TN + FP + FN)
recall = TP / (TP + FN)
specificity = TN / (TN + FP)
precision = TP / (TP + FP)
f1_score = 2 * precision * recall / (precision + recall)
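As a sanity check on the metric formulas above, here is a small pure-Python sketch that counts the four confusion-matrix cells by hand and derives the same metrics from them. The function names and the toy labels are hypothetical, chosen only for illustration, and zero denominators are not handled.

```python
def confusion_counts(y_true, y_pred):
    """Return (TN, FP, FN, TP) for binary labels, with 1 as the positive class."""
    tn = fp = fn = tp = 0
    for actual, predicted in zip(y_true, y_pred):
        if actual == 1 and predicted == 1:
            tp += 1
        elif actual == 1:
            fn += 1      # positive sample predicted as negative
        elif predicted == 1:
            fp += 1      # negative sample predicted as positive
        else:
            tn += 1
    return tn, fp, fn, tp

def binary_metrics(tn, fp, fn, tp):
    """Derive the usual metrics from the four cell counts."""
    total = tn + fp + fn + tp
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / total,
        "mis_rate": (fp + fn) / total,
        "recall": recall,
        "specificity": tn / (tn + fp),
        "precision": precision,
        "f1_score": 2 * precision * recall / (precision + recall),
    }

# Toy data (hypothetical), not taken from the Pima dataset.
toy_true = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1]
toy_pred = [0, 0, 0, 0, 1, 0, 1, 1, 0, 0]
counts = confusion_counts(toy_true, toy_pred)
scores = binary_metrics(*counts)
print(counts)
for name, value in scores.items():
    print(f"{name}: {value:.3f}")
```

Accuracy and misclassification rate always sum to 1, which is a quick way to catch a miscounted cell.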
Local download: https://nico.cc/softs/pima-indians-diabetes-database.zip