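Can we add another column, "filestatus", and use pandas to fill it with logic based on the status column?

In the output below, samlefile1 passed in all 3 rows, so the new column is {"filestatus": "passed"} for "InputFile": "samlefile1".

Because example has one pass and one fail, the new column is {"filestatus": "failed"} for {"InputFile": "example"}.

Here is my Python code:
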
import json
import pandas as pd

df = pd.DataFrame(
    [
        ['samlefile1', 'user1@gmail.com', 'xyz', 'pass'],
        ['samlefile1', 'user5@gmail.com', 'xyz', 'pass'],
        ['samlefile1', 'user6@gmail.com', 'xyz', 'pass'],
        ['testfile', 'user2@gmail.com', 'abc', 'pass'],
        ['example', 'user3@gmail.com', 'xyz', 'pass'],
        ['example', 'user3@gmail.com', 'xyz', 'fail'],
    ],
    columns=['InputFile', 'UserId', 'UserGroup', 'status'],
)

# Count the number of 'pass' and 'fail' rows per file name
Input_status_count = df.groupby(['InputFile']).agg(
    success_count=('status', lambda x: x[x == 'pass'].count()),
    fail_count=('status', lambda x: x[x == 'fail'].count()),
)

# Merge the per-file counts back onto the original rows on the 'InputFile' column
FinalDF = pd.merge(df, Input_status_count, on="InputFile")

JSON_String = FinalDF.to_json(orient='records')
JSON_String



Desired output:

[
    {"InputFile":"samlefile1","UserId":"user1@gmail.com","UserGroup":"xyz","status":"pass","success_count":3,"fail_count":0,"filestatus":"passed"},
    {"InputFile":"samlefile1","UserId":"user5@gmail.com","UserGroup":"xyz","status":"pass","success_count":3,"fail_count":0,"filestatus":"passed"},
    {"InputFile":"samlefile1","UserId":"user6@gmail.com","UserGroup":"xyz","status":"pass","success_count":3,"fail_count":0,"filestatus":"passed"},
    {"InputFile":"testfile","UserId":"user2@gmail.com","UserGroup":"abc","status":"","success_count":1,"fail_count":0,"filestatus":"not ran"},
    {"InputFile":"example","UserId":"user3@gmail.com","UserGroup":"xyz","status":"pass","success_count":1,"fail_count":1,"filestatus":"failed"},
    {"InputFile":"example","UserId":"user3@gmail.com","UserGroup":"xyz","status":"fail","success_count":1,"fail_count":1,"filestatus":"failed"}
]


哔哔one
2 answers

红糖糍粑

Yes, you can do it like this:

# Derive one status per file name from the 'status' column
Input_status_count = df.groupby(['InputFile']).agg(
    filestatus=('status', lambda x: 'passed' if (x == 'fail').sum() == 0 else 'failed')
)

The lambda checks whether the number of rows with status 'fail' is zero. If it is, the file status is 'passed'; otherwise it is 'failed'.
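If the goal is to have filestatus on every row of the original frame, as in the desired output, one possible variation is to broadcast the per-file check back to the rows with a grouped `transform`, which avoids a separate merge step. This is only a sketch on the question's sample data; the variable name `has_fail` is made up for illustration.

```python
import pandas as pd

df = pd.DataFrame(
    [
        ['samlefile1', 'user1@gmail.com', 'xyz', 'pass'],
        ['samlefile1', 'user5@gmail.com', 'xyz', 'pass'],
        ['samlefile1', 'user6@gmail.com', 'xyz', 'pass'],
        ['testfile', 'user2@gmail.com', 'abc', 'pass'],
        ['example', 'user3@gmail.com', 'xyz', 'pass'],
        ['example', 'user3@gmail.com', 'xyz', 'fail'],
    ],
    columns=['InputFile', 'UserId', 'UserGroup', 'status'],
)

# has_fail (hypothetical name): True on every row whose file has at least one 'fail' row
has_fail = df['status'].eq('fail').groupby(df['InputFile']).transform('any')

# Turn the boolean flag into the label requested in the question
df['filestatus'] = has_fail.map({True: 'failed', False: 'passed'})

print(df.to_json(orient='records'))
```

Because `transform` returns a result aligned with the original index, the new column can be assigned directly without merging.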

白衣非少年

You can use pd.crosstab (instead of a complicated agg) to tabulate the passes and fails; the filestatus column is then "passed" whenever a file has no fails:

Input_status_count = pd.crosstab(df['InputFile'], df['status']).reset_index()
Input_status_count['filestatus'] = ["passed" if i == 0 else "failed" for i in Input_status_count['fail']]
Input_status_count

status   InputFile  fail  pass filestatus
0          example     1     1     failed
1       samlefile1     0     3     passed
2         testfile     0     1     passed
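The question's desired output also shows a "not ran" status for testfile, whose status is empty there. A rough sketch of how the crosstab approach might be extended to cover that case and merged back onto the rows follows. It rests on two assumptions that are not in the original post: the sample data is changed so that testfile has an empty status, and "not ran" is taken to mean that a file has neither pass nor fail rows. Under that reading testfile gets a success_count of 0 rather than the 1 shown in the question.

```python
import numpy as np
import pandas as pd

# Assumption: testfile has an empty status, as in the question's desired output
df = pd.DataFrame(
    [
        ['samlefile1', 'user1@gmail.com', 'xyz', 'pass'],
        ['samlefile1', 'user5@gmail.com', 'xyz', 'pass'],
        ['samlefile1', 'user6@gmail.com', 'xyz', 'pass'],
        ['testfile', 'user2@gmail.com', 'abc', ''],
        ['example', 'user3@gmail.com', 'xyz', 'pass'],
        ['example', 'user3@gmail.com', 'xyz', 'fail'],
    ],
    columns=['InputFile', 'UserId', 'UserGroup', 'status'],
)

# Tabulate statuses per file; keep only pass/fail and create them if absent
counts = (
    pd.crosstab(df['InputFile'], df['status'])
    .reindex(columns=['pass', 'fail'], fill_value=0)
    .rename_axis(columns=None)
)

# Assumed rule: any fail -> failed; else any pass -> passed; else not ran
counts['filestatus'] = np.select(
    [counts['fail'] > 0, counts['pass'] > 0],
    ['failed', 'passed'],
    default='not ran',
)

# Use the question's column names and attach the result to every row
counts = counts.rename(columns={'pass': 'success_count', 'fail': 'fail_count'})
result = df.merge(counts.reset_index(), on='InputFile', how='left')

print(result.to_json(orient='records'))
```

The conditions in `np.select` are checked in order, so a file with both passes and fails is labelled failed.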