Python: How do I change equal numbers in a Series/column to other values?

I am trying to change the values in a very long column (about 1 million entries) of a DataFrame. I have something like this:


#### ID_Orig

3452  
3452  
3452  
6543  
6543  
...

and I would like something like this:


#### ID_new

0  
0  
0  
1  
1  
...

At the moment I am doing this:


```python
j=0
for i in range(0,1199531):
    if data.ID_orig[i]==data.ID_orig[i+1]:
        data.ID_orig[i] = j
    else:
        data.ID_orig[i] = j
        j=j+1
```

This takes a very long time... Is there a faster way to do this? I don't know in advance which values ID_orig contains or how often each value occurs.


料青山看我应如是
3 Answers

明月笑刀无情

Use `factorize`, but note that if a group of values repeats later in the column, it gets the same number as before. Another solution, which compares the `shift`ed values with `ne` (`!=`) and then takes the `cumsum`, is more general: it always creates a new value, even when a group of values repeats:

```python
df['ID_new1'] = pd.factorize(df['ID_Orig'])[0]
df['ID_new2'] = df['ID_Orig'].ne(df['ID_Orig'].shift()).cumsum() - 1
print (df)
```

```
   ID_Orig  ID_new1  ID_new2
0     3452        0        0
1     3452        0        0
2     3452        0        0
3     6543        1        1
4     6543        1        1
5      100        2        2
6      100        2        2
7     6543        1        3 <-repeating group
8     6543        1        3 <-repeating group
```
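To see why the second expression starts a new label at every change of value, it can help to look at the intermediate Series one step at a time. The following is only a minimal sketch, assuming the same sample data as in the output above; the variable names `prev` and `changed` are made up for illustration and are not part of the answer.

```python
import pandas as pd

# Sample data assumed from the printed output in the answer.
df = pd.DataFrame(
    {"ID_Orig": [3452, 3452, 3452, 6543, 6543, 100, 100, 6543, 6543]}
)

# Value of the previous row (NaN for the first row).
prev = df["ID_Orig"].shift()

# True wherever a row differs from the row before it, i.e. a new run starts.
# The first row is compared against NaN, so it is always True.
changed = df["ID_Orig"].ne(prev)

# The cumulative sum of the booleans numbers the runs 1, 2, 3, ...;
# subtracting 1 makes the labels start at 0.
df["ID_new2"] = changed.cumsum() - 1

# factorize, by contrast, labels each distinct value by order of first appearance,
# so the second 6543 group reuses label 1.
df["ID_new1"] = pd.factorize(df["ID_Orig"])[0]

print(df)
```

Both expressions work on whole columns at once rather than row by row, which is what should make them much faster than the explicit loop in the question on a column of about a million entries.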

桃花长相依

You could do it like this ...

```python
import collections

l1 = [3452, 3452, 3452, 6543, 6543]
c = collections.Counter(l1)
l2 = list(c.items())
l3 = []
for i, t in enumerate(l2):
    for x in range(t[1]):
        l3.append(i)

for x in l3:
    print(x)
```

This is the output:

```
0
0
0
1
1
```
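One caveat with the `Counter` approach: it adds up all occurrences of a value, so the labels only line up with the rows when equal values sit next to each other. For input such as `[3452, 6543, 3452]` it would emit `0, 0, 1`. As a hedged alternative in plain Python, the sketch below labels either consecutive runs or distinct values; the function names `label_runs` and `label_distinct` are hypothetical helpers, not from the answer.

```python
from itertools import groupby


def label_runs(values):
    """Give each run of consecutive equal values its own label (0, 1, 2, ...)."""
    labels = []
    # groupby yields one (key, iterator) pair per run of adjacent equal items.
    for run_id, (_, run) in enumerate(groupby(values)):
        labels.extend(run_id for _ in run)
    return labels


def label_distinct(values):
    """Give each distinct value one label, numbered by first appearance."""
    seen = {}
    # setdefault stores the next free label the first time a value is seen
    # and returns the stored label on every later occurrence.
    return [seen.setdefault(v, len(seen)) for v in values]


ids = [3452, 3452, 3452, 6543, 6543, 100, 100, 6543, 6543]
print(label_runs(ids))      # a repeated group gets a new label
print(label_distinct(ids))  # a repeated group reuses its old label
```

This stays row-aligned for unsorted input, but it still loops in Python, so for a column of around a million entries the vectorized pandas expressions are likely the faster choice.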