守着一只汪
Use set_index with cumcount to build a MultiIndex, then reshape with unstack:

    df1 = (df.set_index(['ID', df.groupby('ID').cumcount()])['Value']
             .unstack()
             .rename(columns=lambda x: 'Value{}'.format(x + 1))
             .reset_index())

On Python 3.6+ you can use f-strings to rename the columns:

    df1 = (df.set_index(['ID', df.groupby('ID').cumcount()])['Value']
             .unstack()
             .rename(columns=lambda x: f'Value{x+1}')
             .reset_index())

Another idea is to collect the values of each group into lists and build a new DataFrame with the constructor:

    s = df.groupby('ID')['Value'].apply(list)
    df1 = (pd.DataFrame(s.values.tolist(), index=s.index)
             .rename(columns=lambda x: 'Value{}'.format(x + 1))
             .reset_index())

    print (df1)
       ID Value1 Value2 Value3
    0   1    ABC    BCD    AKB
    1   2    CAB    AIK    NaN
    2   3    KIB    NaN    NaN

Performance: it depends on the number of rows and on the number of unique values in the ID column:

    import numpy as np
    import pandas as pd

    np.random.seed(45)
    a = np.sort(np.random.randint(1000, size=10000))
    b = np.random.choice(list('abcde'), size=10000)
    df = pd.DataFrame({'ID':a, 'Value':b})
    #print (df)

    In [26]: %%timeit
        ...: (df.set_index(['ID',df.groupby('ID').cumcount()])['Value']
        ...:    .unstack()
        ...:    .rename(columns=lambda x: f'Value{x+1}')
        ...:    .reset_index())
        ...:
    8.96 ms ± 628 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

    In [27]: %%timeit
        ...: s = df.groupby('ID')['Value'].apply(list)
        ...: (pd.DataFrame(s.values.tolist(), index=s.index)
        ...:    .rename(columns=lambda x: 'Value{}'.format(x + 1))
        ...:    .reset_index())
        ...:
    105 ms ± 7.39 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

    #jpp solution
    In [28]: %%timeit
        ...: def group_gen(df):
        ...:     for key, x in df.groupby('ID'):
        ...:         x = x.set_index('ID').T
        ...:         x.index = pd.Index([key], name='ID')
        ...:         x.columns = [f'Value{i}' for i in range(1, x.shape[1]+1)]
        ...:         yield x
        ...:
        ...: pd.concat(group_gen(df)).reset_index()
        ...:
    3.23 s ± 20.1 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
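The answer never shows the input frame, so here is a minimal, self-contained sketch of the first approach. It assumes the input looked like the rows implied by the printed result above (one ID/Value pair per row). The sample `df` and the helper name `widen` are hypothetical and used only for illustration. The point is that cumcount numbers the rows inside each ID group, and unstack turns those numbers into columns.

```python
import pandas as pd

# Hypothetical input, reconstructed from the printed result (an assumption,
# not data shown in the original answer).
df = pd.DataFrame({
    'ID': [1, 1, 1, 2, 2, 3],
    'Value': ['ABC', 'BCD', 'AKB', 'CAB', 'AIK', 'KIB'],
})

# Position of each row within its ID group: 0, 1, 2, 0, 1, 0
pos = df.groupby('ID').cumcount()


def widen(frame):
    """Reshape long ID/Value rows into one row per ID (illustrative helper)."""
    counter = frame.groupby('ID').cumcount()
    return (frame.set_index(['ID', counter])['Value']  # MultiIndex: (ID, position)
                 .unstack()                            # positions become columns 0, 1, 2
                 .rename(columns=lambda x: f'Value{x + 1}')
                 .reset_index())


df1 = widen(df)
print(df1)
```

IDs with fewer values than the largest group get missing values in the trailing columns, which is why the result has one column per position in the biggest group.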