Adding a column back to a DataFrame



I am trying to compute the probability of each value in testers_time and add it back to df. Here is what I have so far:

import numpy as np
import pandas as pd

# Named `data` rather than `dict` to avoid shadowing the built-in
data = {'id': ['a','b','c','d'], 'testers_time': [10, 30, 15, None], 'stage_1_to_2_time': [30, None, 30, None], 'activated_time' : [40, None, 45, None],'stage_2_to_3_time' : [30, None, None, None],'engaged_time' : [70, None, None, None]}

df = pd.DataFrame(data, columns=['id', 'testers_time', 'stage_1_to_2_time', 'activated_time', 'stage_2_to_3_time', 'engaged_time'])

# np.unique already returns sorted values, so no explicit sort is needed
unique, counts = np.unique(df['testers_time'].dropna(), return_counts=True)

# Divide by the total number of observations, not the number of distinct values
print(pd.DataFrame(counts / counts.sum()))

Expected output (see the last column, prob):

  id  testers_time  stage_1_to_2_time  activated_time  stage_2_to_3_time  \
0  a          10.0               30.0            40.0               30.0
1  b          30.0                NaN             NaN                NaN
2  c          15.0               30.0            45.0                NaN
3  d           NaN                NaN             NaN                NaN

   engaged_time      prob
0          70.0  0.333333
1           NaN  0.333333
2           NaN  0.333333
3           NaN       NaN

But I am stuck on how to add the result back to df. Can you help?

临摹微笑


1 Answer

梵蒂冈之花

You probably want to map the output of a normalized value_counts back onto the column, like this:

df['prob'] = df['testers_time'].map(
    df.testers_time.value_counts(normalize=True))
df

  id  testers_time  stage_1_to_2_time  activated_time  stage_2_to_3_time  engaged_time      prob
0  a          10.0               30.0            40.0               30.0          70.0  0.333333
1  b          30.0                NaN             NaN                NaN           NaN  0.333333
2  c          15.0               30.0            45.0                NaN           NaN  0.333333
3  d           NaN                NaN             NaN                NaN           NaN       NaN
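To see why this works, here is a minimal self-contained sketch of the same map + value_counts idea. The sample data is hypothetical (it is not the frame from the question): the value 10 is repeated so that the probabilities are not all equal, and the intermediate name freq is only for illustration.

```python
import pandas as pd

# Hypothetical sample data (not from the question): 10 appears twice.
df = pd.DataFrame({
    "id": list("abcde"),
    "testers_time": [10, 30, 10, 15, None],
})

# Relative frequency of each distinct non-null value.
# value_counts drops NaN by default, so the denominator is 4, not 5.
freq = df["testers_time"].value_counts(normalize=True)

# Series.map looks each row's value up in freq's index.
# The NaN row has no matching key, so its prob stays NaN.
df["prob"] = df["testers_time"].map(freq)

print(df)
```

Because value_counts(normalize=True) divides by the number of non-null observations, repeated values get a proportionally larger share, and rows with a missing testers_time are left as NaN rather than being counted.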
