Python pandas: relabel values by their value counts

Given the following example:

example = pd.DataFrame({'y':[1,1,1,1,1,1,1,1,1,1,2,2,2,2,0,0,-1,-1,-1]})

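I want to relabel these values by frequency count, in descending order. The value with the most cases (here 1) should be replaced with 0, the value with the next-largest bin with 1, and so on for all values. The caveat is that I want to ignore the value -1. If I run value_counts(), I see this: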

 1    10
 2     4
-1     3
 0     2
dtype: int64

But I would like a Pythonic, clean, non-hacky solution that produces the following:

0     0
1     0
2     0
3     0
4     0
5     0
6     0
7     0
8     0
9     0
10    1
11    1
12    1
13    1
14    2
15    2
16   -1
17   -1
18   -1

 0    10
 1     4
-1     3
 2     2
dtype: int64

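(Ideally I would also keep the old column, for good record keeping.) I could loop over each value, check that it is not -1, and replace it with its position in the value_counts() output, but that feels expensive to maintain. Is there a clean way to do this?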

守着一只汪


1 answer

米脂

Use Series.map with a dictionary built from the index of the Series returned by Series.value_counts, after dropping -1:

s = example['y'].value_counts().drop(-1)
d = {v:k for k, v in dict(enumerate(s.index)).items()}

Or:

s = example['y'].value_counts().drop(-1)
d = dict(zip(s.index, range(len(s))))

Then map only the rows that are not -1:

m = example['y'].ne(-1)
example.loc[m, 'y'] = example.loc[m, 'y'].map(d)
print (example)

    y
0   0
1   0
2   0
3   0
4   0
5   0
6   0
7   0
8   0
9   0
10  1
11  1
12  1
13  1
14  2
15  2
16 -1
17 -1
18 -1

Another idea is to add the -1 value to the dictionary, so that the whole column can be mapped in one step:

s = example['y'].value_counts().drop(-1)
d = {**{-1:-1}, **dict(zip(s.index, range(len(s))))}
example['y'] = example['y'].map(d)
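The question also asks to keep the old column. Below is a minimal sketch of the same mapping idea that writes the result to a new column instead of overwriting y. The column name y_relabelled and the variable names counts and rank are made up for this illustration and do not come from the answer above.

```python
import pandas as pd

# Same data as the question: ten 1s, four 2s, two 0s, three -1s.
example = pd.DataFrame({'y': [1] * 10 + [2] * 4 + [0] * 2 + [-1] * 3})

# Count every value except the -1 sentinel.
# value_counts() sorts by descending frequency by default.
counts = example.loc[example['y'].ne(-1), 'y'].value_counts()

# Map each original value to its frequency rank (0 = most frequent).
rank = {value: i for i, value in enumerate(counts.index)}
rank[-1] = -1  # leave the sentinel unchanged

# Hypothetical column name; the original 'y' column is left as it was.
example['y_relabelled'] = example['y'].map(rank)

print(example['y_relabelled'].value_counts())
```

Filtering out -1 before counting, rather than calling drop(-1) afterwards, means the sketch should also run when the column contains no -1 at all. How values with equal counts are ordered relative to each other is left to value_counts().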
