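Looping over instances keeps appending a string to the column names

I am trying to write an object-oriented table that returns statistical metrics for several algorithms and displays them in a pandas DataFrame in Python.

The problem is that with every instance, the column names in the resulting DataFrame get an extra "predicted" string prepended (see the example at the end).

My code: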


import pandas as pd
from sklearn.metrics import roc_auc_score, accuracy_score, cohen_kappa_score, recall_score, precision_score, f1_score
from sklearn import metrics

#----------------------------------------------------------------#
####################### ROC metrics table ########################
#----------------------------------------------------------------#

class roc_table:
    def __init__(self, data):
        self.data = data

    def viewer():
        #count columns in dataframe
        count_algo = len(data.columns)

        for i in data.iloc[:,1:]:
            data['predicted_{}'.format(i)] = (data[i] >= threshold).astype('int')

        rock_table = {
            "AUC":[round(roc_auc_score(data.actual_label, data[i]),2) for i in data.iloc[:,count_algo:]],
            "Accuracy":[round(accuracy_score(data.actual_label, data[i]),2) for i in data.iloc[:,count_algo:]],
            "Kappa":[round(cohen_kappa_score(data.actual_label, data[i]),2)for i in data.iloc[:,count_algo:]],
            "Sensitivity (Recall)": [round(recall_score(data.actual_label, data[i]),2) for i in data.iloc[:,count_algo:]],
            "Specificity": [round(accuracy_score(data.actual_label, data[i]),2) for i in data.iloc[:,count_algo:]],
            "Precision": [round(precision_score(data.actual_label, data[i]),2) for i in data.iloc[:,count_algo:]],
            "F1": [round(f1_score(data.actual_label, data[i]),2) for i in data.iloc[:,count_algo:]]
        }

        rock_table = pd.DataFrame.from_dict(rock_table, orient = 'index').reset_index()
        col = ['metrics']
        col.extend([x for x in data.iloc[:,count_algo:]])
        rock_table.columns = col

        return rock_table

These are the lines that give me trouble:


for i in data.iloc[:,1:]:
    data['predicted_{}'.format(i)] = (data[i] >= threshold).astype('int')
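To make the symptom concrete, here is a minimal reproduction of what this loop does when it runs twice on the same DataFrame. The tiny DataFrame, the column name `model_a`, and the helper `add_predictions` are made up for illustration and are not part of the original code.

```python
import pandas as pd

# Hypothetical stand-in for the real data: the label column first,
# then one score column per algorithm.
data = pd.DataFrame({
    "actual_label": [0, 1, 1, 0],
    "model_a": [0.2, 0.8, 0.6, 0.4],
})
threshold = 0.5


def add_predictions(frame, threshold):
    # Same loop as in the question; it writes new columns into `frame` itself.
    for i in frame.iloc[:, 1:]:
        frame['predicted_{}'.format(i)] = (frame[i] >= threshold).astype('int')


add_predictions(data, threshold)
first_call = list(data.columns)

# The second call also loops over the column added by the first call,
# so that column gets prefixed again.
add_predictions(data, threshold)
second_call = list(data.columns)

print(first_call)
print(second_call)
```

After the second call the frame contains a `predicted_predicted_model_a` column, and every further call adds one more level of prefix.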

Example of the output I get when running it:

http://img1.mukewang.com/62cd45ab0001524c06280785.jpg

阿晨1998
Views: 139 · Answers: 1
1 Answer

有只小跳蛙

The problem is in your OOP implementation: you are mutating the original data passed to the roc_table class. Every call to viewer adds predicted_ columns to that same DataFrame, so the next call loops over those columns as well and prefixes them again. Try the following, with the same imports as in the question:

class roc_table:
    def __init__(self, data):
        self.org_data = data

    def viewer(self, threshold):
        #make a copy of initial data
        data = self.org_data.copy()

        #count columns in dataframe
        count_algo = len(data.columns)

        for i in data.iloc[:,1:]:
            data['predicted_{}'.format(i)] = (data[i] >= threshold).astype('int')

        rock_table = {
            "AUC":[round(roc_auc_score(data.actual_label, data[i]),2) for i in data.iloc[:,count_algo:]],
            "Accuracy":[round(accuracy_score(data.actual_label, data[i]),2) for i in data.iloc[:,count_algo:]],
            "Kappa":[round(cohen_kappa_score(data.actual_label, data[i]),2) for i in data.iloc[:,count_algo:]],
            "Sensitivity (Recall)": [round(recall_score(data.actual_label, data[i]),2) for i in data.iloc[:,count_algo:]],
            #specificity = recall of the negative class (assumes labels are 0/1)
            "Specificity": [round(recall_score(data.actual_label, data[i], pos_label=0),2) for i in data.iloc[:,count_algo:]],
            "Precision": [round(precision_score(data.actual_label, data[i]),2) for i in data.iloc[:,count_algo:]],
            "F1": [round(f1_score(data.actual_label, data[i]),2) for i in data.iloc[:,count_algo:]]
        }

        rock_table = pd.DataFrame.from_dict(rock_table, orient = 'index').reset_index()
        col = ['metrics']
        col.extend([x for x in data.iloc[:,count_algo:]])
        rock_table.columns = col

        return rock_table

Then instantiate the class and use it like this:

rt = roc_table(data)
threshold=0.5
rt.viewer(threshold)
threshold=0.75
rt.viewer(threshold)

This way the original data is never mutated. Hope this helps.
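As a quick check of the copy-based approach, the sketch below strips the class down to the thresholding step and confirms that repeated calls leave the caller's DataFrame untouched. The class name `ThresholdViewer`, the sample DataFrame, and the column name `model_a` are hypothetical and only serve the illustration.

```python
import pandas as pd


class ThresholdViewer:
    """Hypothetical, reduced stand-in for roc_table (thresholding step only)."""

    def __init__(self, data):
        self.org_data = data

    def viewer(self, threshold):
        # Work on a copy so the frame passed in by the caller is never modified.
        data = self.org_data.copy()
        for i in data.iloc[:, 1:]:
            data['predicted_{}'.format(i)] = (data[i] >= threshold).astype('int')
        return data


# Made-up sample data: label column first, then one score column.
scores = pd.DataFrame({
    "actual_label": [0, 1, 1, 0],
    "model_a": [0.2, 0.8, 0.6, 0.4],
})
columns_before = list(scores.columns)

tv = ThresholdViewer(scores)
at_050 = tv.viewer(0.5)
at_075 = tv.viewer(0.75)

print(list(scores.columns))   # unchanged after both calls
print(list(at_075.columns))   # exactly one predicted_ column per score column
```

Because each call starts from a fresh copy, the threshold can be changed as often as needed without the column names growing.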
