I have this DataFrame:
import pandas as pd
from datetime import datetime
df = pd.DataFrame([
{"_id": "1", "date": datetime.strptime("2020-09-29 07:00:00", '%Y-%m-%d %H:%M:%S'), "status": "started"},
{"_id": "2", "date": datetime.strptime("2020-09-29 14:00:00", '%Y-%m-%d %H:%M:%S'), "status": "end"},
{"_id": "3", "date": datetime.strptime("2020-09-25 17:00:00", '%Y-%m-%d %H:%M:%S'), "status": "started"},
{"_id": "4", "date": datetime.strptime("2020-09-17 09:00:00", '%Y-%m-%d %H:%M:%S'), "status": "end"},
{"_id": "5", "date": datetime.strptime("2020-09-19 07:00:00", '%Y-%m-%d %H:%M:%S'), "status": "end"},
{"_id": "6", "date": datetime.strptime("2020-09-19 08:00:00", '%Y-%m-%d %H:%M:%S'), "status": "end"},
]).set_index('date')
It looks like this:
                    _id   status
date
2020-09-29 07:00:00   1  started
2020-09-29 14:00:00   2      end
2020-09-25 17:00:00   3  started
2020-09-17 09:00:00   4      end
2020-09-19 07:00:00   5      end
2020-09-19 08:00:00   6      end
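I'm trying to group by day and count the occurrences of each status, but I want the status name to be included in the column names.
This is the desired output: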
                     status_started  status_end
date
2020-09-29 07:00:00               1           1
2020-09-25 17:00:00               1           0
2020-09-17 09:00:00               0           1
2020-09-19 07:00:00               0           2
I tried this:
df = df.groupby([pd.Grouper(freq='d'), 'status']).agg({'status': "count"})
df = df.reset_index(level="status")
Output:
                    status
date       status
2020-09-17 end           1
2020-09-19 end           2
2020-09-25 started       1
2020-09-29 end           1
2020-09-29 started       1
But I haven't managed to reshape the df into the desired form.
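One possible way to get there, offered as a sketch rather than the definitive answer: count the rows per (day, status) pair with `size()`, pivot the `status` level into columns with `unstack`, and prefix the column names. The name `counts` and the final column ordering are my own choices for illustration. Note that the resulting index holds the day (midnight timestamps), not the original times shown in the desired output.

```python
import pandas as pd

# Rebuild the sample data from the question.
df = pd.DataFrame(
    {
        "_id": ["1", "2", "3", "4", "5", "6"],
        "date": pd.to_datetime(
            [
                "2020-09-29 07:00:00",
                "2020-09-29 14:00:00",
                "2020-09-25 17:00:00",
                "2020-09-17 09:00:00",
                "2020-09-19 07:00:00",
                "2020-09-19 08:00:00",
            ]
        ),
        "status": ["started", "end", "started", "end", "end", "end"],
    }
).set_index("date")

counts = (
    df.groupby([pd.Grouper(freq="D"), "status"])  # one group per (day, status)
    .size()                                       # number of rows in each group
    .unstack("status", fill_value=0)              # statuses become columns, gaps become 0
    .add_prefix("status_")                        # "end" -> "status_end", etc.
)

# Optional: put the columns in the order used in the desired output.
counts = counts[["status_started", "status_end"]]

print(counts)
```

With the sample data this yields one row per day, for example `status_end` is 2 on 2020-09-19. Under the same assumptions, `pd.crosstab(df.index.normalize(), df["status"])` followed by `add_prefix("status_")` should give an equivalent table.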