How do I replace duplicate values with multiple unique strings in Pandas?

import pandas as pd

import numpy as np

data = {'Name':['Tom', 'Tom', 'Jack', 'Terry'], 'Age':[20, 21, 19, 18]} 

df = pd.DataFrame(data)

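Suppose I have a DataFrame that looks like this. I'm trying to figure out how to check the Name column for the value "Tom": the first time I find it, I want to replace it with "FirstTom", and the second time it appears, with "SecondTom". How would you do this? I have used the replace method before, but only to replace all Toms with a single value. I don't want to append a 1 to the end of the value; I want to change the string to something else entirely.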


Edit:


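If df looked more like the one below, how would we check for Tom in both the first and the second column, then replace the first instance with FirstTom and the second instance with SecondTom?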


data = {'Name':['Tom', 'Jerry', 'Jack', 'Terry'], 'OtherName':['Tom', 'John', 'Bob', 'Steve']}


开满天机
4 Answers

白猪掌柜的

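Just adding to the existing solutions: you can use inflect to build the labels dynamically.

    import inflect

    p = inflect.engine()

    # cumcount numbers each occurrence within its Name group, starting at 0
    df['Name'] += df.groupby('Name').cumcount().add(1).map(p.ordinal).radd('_')
    print(df)

            Name  Age
    0    Tom_1st   20
    1    Tom_2nd   21
    2   Jack_1st   19
    3  Terry_1st   18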

茅侃侃

We can do this with cumcount:

    df.Name = df.Name + df.groupby('Name').cumcount().astype(str)
    df

         Name  Age
    0    Tom0   20
    1    Tom1   21
    2   Jack0   19
    3  Terry0   18

Update:

    # turn 1, 2, 3, ... into "1st", "2nd", "3rd", ...
    suf = lambda n: "%d%s" % (n, {1: "st", 2: "nd", 3: "rd"}.get(n if n < 20 else n % 10, "th"))
    g = df.groupby('Name')
    # prefix each name with its ordinal, but blank the prefix for names that occur once
    df.Name = df.Name.radd(g.cumcount().add(1).map(suf).mask(g.Name.transform('count') == 1, ''))
    df

         Name  Age
    0  1stTom   20
    1  2ndTom   21
    2    Jack   19
    3   Terry   18

Update for 2 columns:

    suf = lambda n: "%d%s" % (n, {1: "st", 2: "nd", 3: "rd"}.get(n if n < 20 else n % 10, "th"))
    # assumed setup: the original snippet uses s without defining it
    s = df.stack()
    # group by row label and value, so duplicates are counted within each row
    g = s.groupby([s.index.get_level_values(0), s])
    s = s.radd(g.cumcount().add(1).map(suf).mask(g.transform('count') == 1, ''))
    s = s.unstack()

         Name OtherName
    0  1stTom    2ndTom
    1   Jerry      John
    2    Jack       Bob
    3   Terry     Steve
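The suf lambda above handles 11 through 13 only for numbers below 20, so a count such as 111 would come out as "111st". If groups could ever get that large, a slightly more general helper might look like the sketch below. The name ordinal_suffix is hypothetical and not part of the original answer.

```python
def ordinal_suffix(n: int) -> str:
    """Return n followed by its English ordinal suffix, e.g. 1 -> '1st'."""
    # 11, 12 and 13 take "th" in every hundred (11th, 112th, 1013th).
    if 11 <= n % 100 <= 13:
        return f"{n}th"
    # Otherwise the last digit decides; anything but 1, 2, 3 falls back to "th".
    suffix = {1: "st", 2: "nd", 3: "rd"}.get(n % 10, "th")
    return f"{n}{suffix}"


print([ordinal_suffix(n) for n in (1, 2, 3, 4, 11, 22, 111)])
# ['1st', '2nd', '3rd', '4th', '11th', '22nd', '111th']
```

It should work as a drop-in for suf in the snippets above, for example g.cumcount().add(1).map(ordinal_suffix).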

红颜莎娜

Edit: To count duplicates per row, use:

    df = pd.DataFrame(data={'Name': ['Tom', 'Jerry', 'Jack', 'Terry'],
                            'OtherName': ['Tom', 'John', 'Bob', 'Steve'],
                            'Age': [20, 21, 19, 18]})
    print(df)

        Name OtherName  Age
    0    Tom       Tom   20
    1  Jerry      John   21
    2   Jack       Bob   19
    3  Terry     Steve   18

    import inflect

    p = inflect.engine()

    # map by function for dynamic counter
    f = lambda i: p.number_to_words(p.ordinal(i))
    # columns filled by names
    cols = ['Name', 'OtherName']
    # reshaped to MultiIndex Series
    s = df[cols].stack()
    # counter per groups
    count = s.groupby([s.index.get_level_values(0), s]).cumcount().add(1)
    # mask for filter duplicates
    mask = s.reset_index().duplicated(['level_0', 0], keep=False).values
    # filter only duplicates and map, reshape back and add to original data
    df[cols] = count[mask].map(f).unstack().add(df[cols], fill_value='')
    print(df)

           Name  OtherName  Age
    0  firstTom  secondTom   20
    1     Jerry       John   21
    2      Jack        Bob   19
    3     Terry      Steve   18

For the single-column case, use GroupBy.cumcount with Series.map, but apply it only to the duplicated values selected by Series.duplicated:

    data = {'Name': ['Tom', 'Tom', 'Jack', 'Terry'], 'Age': [20, 21, 19, 18]}
    df = pd.DataFrame(data)

    nth = {0: "First", 1: "Second", 2: "Third", 3: "Fourth"}

    # True for every row whose Name occurs more than once
    mask = df.Name.duplicated(keep=False)
    df.loc[mask, 'Name'] = df[mask].groupby('Name').cumcount().map(nth) + df.loc[mask, 'Name']
    print(df)

            Name  Age
    0   FirstTom   20
    1  SecondTom   21
    2       Jack   19
    3      Terry   18

With a dynamic dictionary it would look like this:

    import inflect

    p = inflect.engine()

    mask = df.Name.duplicated(keep=False)
    f = lambda i: p.number_to_words(p.ordinal(i))
    df.loc[mask, 'Name'] = df[mask].groupby('Name').cumcount().add(1).map(f) + df.loc[mask, 'Name']
    print(df)

            Name  Age
    0   firstTom   20
    1  secondTom   21
    2       Jack   19
    3      Terry   18
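The question asks for capitalized labels ("FirstTom", "SecondTom") across two columns, while the inflect version yields lowercase words. As a rough sketch, the same stack/cumcount/unstack idea can be combined with the fixed nth dictionary instead. The variable names row, position and size are illustrative, and the dictionary only covers four repeats per row, so anything beyond that would map to NaN.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Tom", "Jerry", "Jack", "Terry"],
    "OtherName": ["Tom", "John", "Bob", "Steve"],
    "Age": [20, 21, 19, 18],
})

nth = {0: "First", 1: "Second", 2: "Third", 3: "Fourth"}
cols = ["Name", "OtherName"]

# Long format: one entry per (row label, column name) pair.
s = df[cols].stack()
row = s.index.get_level_values(0)

# Group equal values that sit in the same row.
g = s.groupby([row, s])
position = g.cumcount()       # 0 for the first occurrence, 1 for the second, ...
size = g.transform("count")   # how many times the value occurs in its row

# Build the prefix, keeping it only where the value repeats within the row.
prefix = position.map(nth).where(size > 1, "")

# Reshape back to the two original columns and write them into df.
df[cols] = (prefix + s).unstack()
print(df)
```

Under these assumptions, row 0 should end up as FirstTom / SecondTom while the other rows keep their original names.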

牧羊人nacy

Using transform:

    nth = ['First', 'Second', 'Third', 'Fourth']

    def prefix(d):
        # d holds the Name values of one group
        n = len(d)
        if n > 1:
            # prepend First, Second, ... to each repeated name
            return d.radd([nth[i] for i in range(n)])
        else:
            return d

    df.assign(Name=df.groupby('Name').Name.transform(prefix))

              Name  Age
    0     FirstTom   20
    1    SecondTom   21
    2         Jack   19
    3        Terry   18
    4   FirstSteve   17
    5  SecondSteve   16
    6   ThirdSteve   15
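The output above includes three Steve rows that are not in the question's data. The sketch below reconstructs such a frame from that output (the extra rows are an assumption based on the printed result) so the transform approach can be run end to end. Note that nth has four entries, so a name repeated more than four times would raise an IndexError in prefix.

```python
import pandas as pd

# Question data plus three 'Steve' rows, reconstructed from the printed output.
df = pd.DataFrame({
    "Name": ["Tom", "Tom", "Jack", "Terry", "Steve", "Steve", "Steve"],
    "Age": [20, 21, 19, 18, 17, 16, 15],
})

nth = ["First", "Second", "Third", "Fourth"]


def prefix(d):
    """Prefix each value of a group with its ordinal word if the group repeats."""
    n = len(d)
    if n > 1:
        # radd computes other + d, so the ordinal word comes first.
        return d.radd([nth[i] for i in range(n)])
    return d


# assign returns a new frame; the original df is left unchanged.
result = df.assign(Name=df.groupby("Name").Name.transform(prefix))
print(result)
```

Because assign returns a copy, the result has to be bound to a name (or reassigned to df) for the new labels to persist.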