I have a DataFrame df that contains job and age information for the residents of several cities:
df
  User City             Job  Age
0    A    x      Unemployed   33
1    B    x         Student   18
2    C    x      Unemployed   27
3    D    y  Data Scientist   28
4    E    y      Unemployed   45
5    F    y         Student   18
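For anyone who wants to reproduce the example, the table above can be rebuilt roughly as follows. This is only a sketch: the column dtypes (strings and integers) are assumed from the printed output, not stated in the question.

```python
import pandas as pd

# Rebuild the sample data shown above; dtypes are assumed from the printout.
df = pd.DataFrame({
    "User": ["A", "B", "C", "D", "E", "F"],
    "City": ["x", "x", "x", "y", "y", "y"],
    "Job": ["Unemployed", "Student", "Unemployed",
            "Data Scientist", "Unemployed", "Student"],
    "Age": [33, 18, 27, 28, 45, 18],
})

print(df)
```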
For each city, I want to compute the unemployment rate and the median age.
For the unemployment rate, I did the following:
import pandas as pd
## Count the people in each city
cust = df.groupby(['City'])['User'].count() ## Number of people for each city
cust = pd.DataFrame(cust)
cust.columns=['nCust']
cust['City']=cust.index
cust=cust.reset_index(drop=True)
## Count the people unemployed in each city
unempl = df[df['Job'] == 'Unemployed']
unempl = unempl.groupby(['City'])['Job'].count()
unempl = pd.DataFrame(unempl)
unempl.columns=['unempl']
unempl['City']=unempl.index
unempl=unempl.reset_index(drop=True)
# 1. Fraction of Unemployment
unRate = pd.merge(unempl, cust, on = 'City')
unRate['rate'] =(unRate['unempl']/unRate['nCust'])*100
Is there a more elegant solution? And how can I compute the median age for each city?
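One possible more compact approach, offered as a sketch rather than the definitive answer: mark each row as unemployed or not, then let a single groupby compute both statistics. It assumes pandas 0.25 or later (for named aggregation), and the names `summary`, `unemployed`, `rate` and `median_age` are illustrative choices, not part of the original code.

```python
import pandas as pd

# Sample data from the question, rebuilt so the snippet runs on its own.
df = pd.DataFrame({
    "User": ["A", "B", "C", "D", "E", "F"],
    "City": ["x", "x", "x", "y", "y", "y"],
    "Job": ["Unemployed", "Student", "Unemployed",
            "Data Scientist", "Unemployed", "Student"],
    "Age": [33, 18, 27, 28, 45, 18],
})

summary = (
    # Boolean helper column: True where the person is unemployed.
    df.assign(unemployed=df["Job"].eq("Unemployed"))
      .groupby("City")
      # The mean of a boolean column is the fraction of True values.
      .agg(rate=("unemployed", "mean"), median_age=("Age", "median"))
)
summary["rate"] = summary["rate"] * 100  # express the fraction as a percentage

print(summary)
```

With the sample data this gives a rate of about 66.7 and a median age of 27 for city x, and about 33.3 and 28 for city y. Unlike the inner merge in the code above, a city with no unemployed people still appears in the result, with a rate of 0.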