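I'm working with a pandas DataFrame of 9000 rows and 6 columns. At the moment I'm trying to classify 4 jobs (Commercial Manager, Business Developer, Web Marketer, Traffic Manager) by experience level.
Since the range of years of experience is not the same for each job, I used "qcut" to split the data into four groups, as follows:
(You can run the code below to get a sample DataFrame.)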
import pandas as pd

df = pd.DataFrame({'Job': ['Commercial Manager', 'Traffic Manager', 'Web Marketer', 'Commercial Manager', 'Commercial Manager', 'Web Marketer', 'Commercial Manager', 'Commercial Manager', 'Traffic Manager', 'Business Developer', 'Business Developer', 'Web Marketer', 'Traffic Manager', 'Traffic Manager', 'Commercial Manager', 'Business Developer', 'Traffic Manager', 'Commercial Manager', 'Business Developer', 'Business Developer', 'Web Marketer'],
                   'Experience': [1.00000, 3.00000, 3.00000, 1.50000, 2.00000, 6.00000, 0.00000, 4.00000, 8.00000, 5.00000, 0.50000, 3.00000, 3.00000, 0.00000, 2.00000, 3.00000, 0.50000, 3.00000, 3.00000, 8.00000, 3.50000]})

levels = ["beginner", "intermediate", "advanced", "expert"]
jobs = ["Commercial Manager", "Business Developer", "Web Marketer", "Traffic Manager"]

def convert(levels, jobs):
    for j in jobs:
        df["Level"] = pd.qcut(df.loc[df["Job"] == j, "Experience"].rank(method="first"), q=4, labels=levels, duplicates="drop")
    return df

convert(levels, jobs)
Here is the output after using "qcut":
                   Job  Experience         Level
0   Commercial Manager     1.00000           NaN
1      Traffic Manager     3.00000  intermediate
2         Web Marketer     3.00000           NaN
3   Commercial Manager     1.50000           NaN
4   Commercial Manager     2.00000           NaN
5         Web Marketer     6.00000           NaN
6   Commercial Manager     0.00000           NaN
7   Commercial Manager     4.00000           NaN
8      Traffic Manager     8.00000        expert
9   Business Developer     5.00000           NaN
10  Business Developer     0.50000           NaN
11        Web Marketer     3.00000           NaN
It seems to work only for "Traffic Manager": the Level column is NaN for every other job. I'm really lost. Any help?
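
A likely explanation: each pass through the loop assigns a Series that covers only one job to the whole "Level" column, so pandas aligns it on the index, fills every other row with NaN, and overwrites whatever the previous pass produced. Only the last job in the list ("Traffic Manager") survives. One possible fix, sketched below, is to bin each job separately and combine the pieces before assigning. It assumes every job has at least four rows so that four quantile bins can be formed; `parts` is just an illustrative name, and `duplicates="drop"` is left out because ranking with method="first" already makes the values unique.

```python
import pandas as pd

df = pd.DataFrame({
    'Job': ['Commercial Manager', 'Traffic Manager', 'Web Marketer', 'Commercial Manager',
            'Commercial Manager', 'Web Marketer', 'Commercial Manager', 'Commercial Manager',
            'Traffic Manager', 'Business Developer', 'Business Developer', 'Web Marketer',
            'Traffic Manager', 'Traffic Manager', 'Commercial Manager', 'Business Developer',
            'Traffic Manager', 'Commercial Manager', 'Business Developer', 'Business Developer',
            'Web Marketer'],
    'Experience': [1.0, 3.0, 3.0, 1.5, 2.0, 6.0, 0.0, 4.0, 8.0, 5.0, 0.5,
                   3.0, 3.0, 0.0, 2.0, 3.0, 0.5, 3.0, 3.0, 8.0, 3.5],
})

levels = ["beginner", "intermediate", "advanced", "expert"]

# Bin each job on its own, keeping the original row index of every piece.
parts = []
for job, experience in df.groupby("Job")["Experience"]:
    # Ranking with method="first" breaks ties, so the quantile edges are unique.
    ranks = experience.rank(method="first")
    parts.append(pd.qcut(ranks, q=4, labels=levels))

# Concatenate the per-job results; assignment aligns them back on the index.
df["Level"] = pd.concat(parts)

print(df)
```

With this version every row receives a level, and the levels are relative to the other rows of the same job rather than to the whole DataFrame.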