-
湖上湖
You can do this with dot:

    import pandas as pd

    df = pd.DataFrame(
        {
            'A': [0, 0, 1],
            'B': [1, 0, 0],
            'C': [0, 0, 0],
            'D': [1, 0, 1],
            'F': [1, 0, 1],
        }
    )
    df['new_column'] = df.dot(df.columns).str.join(",")

       A  B  C  D  F new_column
    0  0  1  0  1  1      B,D,F
    1  0  0  0  0  0
    2  1  0  0  1  1      A,D,F

Update: for column names with more than one letter, @BEN_YO suggested a very good solution:

    df.dot(df.columns + ',').str[:-1]
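To see why a dot product yields column names, it may help to mimic it for a single row in plain Python. This is only an illustrative sketch of what, as far as I understand, the dot product over strings boils down to (multiply each 0/1 value by the column name, then concatenate the pieces); the helper `row_dot` is hypothetical and not part of pandas.

```python
def row_dot(values, names):
    """Hypothetical helper: mimic one row of df.dot(names) for string names."""
    # int * str repeats the string, so 0 * 'B' == '' and 1 * 'B' == 'B'.
    return "".join(v * n for v, n in zip(values, names))


row = [0, 1, 0, 1, 1]
long_names = ["col1", "col2", "col3", "col4", "col5"]

# Single-letter names: every character of the result is one column name,
# so joining the characters with "," is safe.
single = ",".join(row_dot(row, ["A", "B", "C", "D", "F"]))
print(single)  # B,D,F

# Multi-letter names: joining per character tears the names apart.
broken = ",".join(row_dot(row, long_names))
print(broken)  # c,o,l,2,c,o,l,4,c,o,l,5

# Appending the separator to each name first keeps the names intact;
# only the trailing separator has to be removed afterwards.
fixed = row_dot(row, [n + "," for n in long_names])[:-1]
print(fixed)  # col2,col4,col5
```

This is why the update appends `','` to the column names before calling `dot` instead of joining afterwards.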
-
米脂
If column names can be longer than one character, use DataFrame.dot with a separator appended to each column name, then remove the trailing separator with Series.str.rstrip:

    df['new_column'] = df.dot(df.columns + ',').str.rstrip(",")

    # alternative
    # df['new_column'] = (df @ (df.columns + ',')).str.rstrip(",")

    print(df)
       A  B  C  D  F new_column
    0  0  1  0  1  1      B,D,F
    1  0  0  0  0  0
    2  1  0  0  1  1      A,D,F

    import pandas as pd

    df = pd.DataFrame({
        'col1': [0, 0, 1],
        'col2': [1, 0, 0],
        'col3': [0, 0, 0],
        'col4': [1, 0, 1],
        'col5': [1, 0, 1],
    })

    df['new_column'] = df.dot(df.columns + ',').str.rstrip(",")

    # alternative
    # df['new_column'] = (df @ (df.columns + ',')).str.rstrip(",")

    print(df)
       col1  col2  col3  col4  col5      new_column
    0     0     1     0     1     1  col2,col4,col5
    1     0     0     0     0     0
    2     1     0     0     1     1  col1,col4,col5

Alternative solution:

    cols = df.columns.to_numpy()
    df["new_column"] = [', '.join(cols[x]) for x in df.to_numpy().astype(bool)]

Performance:

The first solution by sammywemmy cannot be used here: with 50 columns, some column names have 2 or more letters. The solution by footfalcon creates lists, so it is not tested either.

    df = pd.DataFrame({
        'A': [0, 0, 1],
        'B': [1, 0, 0],
        'C': [0, 0, 0],
        'D': [1, 0, 1],
        'E': [1, 0, 1],
    })

    df = pd.concat([df] * 10, ignore_index=True, axis=1)
    df = pd.concat([df] * 10000, ignore_index=True).add_prefix('col')
    # [30000 rows x 50 columns]

The list comprehension solution is the fastest, but only by about 10 ms on this sample data. The dot solution is nearly as fast, and the apply solutions are the slowest:

    In [70]: %%timeit
        ...: cols = df.columns.to_numpy()
        ...: df["new_column"] = [', '.join(cols[x]) for x in df.to_numpy().astype(bool)]
        ...:
    128 ms ± 443 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

    # for testing, values are converted to boolean (otherwise the test fails)
    In [72]: %timeit df['new_column'] = df.astype(bool).dot(df.columns + ',').str.rstrip(",")
    138 ms ± 1.95 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

    # Dishin H Goyani
    In [73]: %timeit df["New_column"] = df.apply(lambda x: ','.join(df.columns[x==1]), axis=1)
    3.98 s ± 129 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

    # Akshay Sehgal
    In [75]: %timeit df['new_column'] = df.apply(lambda x: ', '.join(list(x[x!=0].index)), axis=1)
    11 s ± 349 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

    # Rajith Thennakoon
    In [78]: %%timeit
        ...: df["new_column"] = df.apply(lambda x: (pd.DataFrame(x[x==1]).index.values),axis=1)
        ...: df["new_column"] = df["new_column"].apply(lambda x: ','.join(map(str, x)))
        ...:
    25.9 s ± 709 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
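Before comparing speed, it can be worth checking that the two fastest approaches agree on a small frame. The sketch below does that on the three-row sample; the names `via_dot` and `via_listcomp` are mine, and I use `','` as the separator in both (the list comprehension in the answer uses `', '`, so its output differs by a space).

```python
import pandas as pd

df = pd.DataFrame({
    'col1': [0, 0, 1],
    'col2': [1, 0, 0],
    'col3': [0, 0, 0],
    'col4': [1, 0, 1],
    'col5': [1, 0, 1],
})

# dot-based solution: append the separator, then strip the trailing one.
via_dot = df.dot(df.columns + ',').str.rstrip(',')

# List comprehension: each row becomes a boolean mask over the column names.
cols = df.columns.to_numpy()
via_listcomp = pd.Series(
    [','.join(cols[mask]) for mask in df.to_numpy().astype(bool)],
    index=df.index,
)

print(via_dot.tolist())
print(via_dot.tolist() == via_listcomp.tolist())
```

Both results are computed before any new column is assigned, so neither calculation sees the other's output as an extra input column.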
-
噜噜哒
Not sure if this is the best solution, but it gets the job done:

    import pandas as pd

    df = pd.DataFrame(
        {
            'A': [0, 0, 1],
            'B': [1, 0, 0],
            'C': [0, 0, 0],
            'D': [1, 0, 1],
            'F': [1, 0, 1],
        }
    )

    # After transposing, each original row is a column of df1.
    df1 = df.T
    new_cells = []
    for c in df1.columns:
        new_cells.append(df1[df1[c] == 1].index.tolist())
    df['New_column'] = new_cells

Output:

       A  B  C  D  F New_column
    0  0  1  0  1  1  [B, D, F]
    1  0  0  0  0  0         []
    2  1  0  0  1  1  [A, D, F]
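This approach stores Python lists rather than comma-separated strings. If strings are wanted, one option is `Series.str.join`, which also joins list elements. A minimal sketch, assuming the same sample frame; the column name `New_column_str` is only for illustration:

```python
import pandas as pd

df = pd.DataFrame({
    'A': [0, 0, 1],
    'B': [1, 0, 0],
    'C': [0, 0, 0],
    'D': [1, 0, 1],
    'F': [1, 0, 1],
})

# Same loop as in the answer: per original row, collect the columns equal to 1.
df1 = df.T
new_cells = []
for c in df1.columns:
    new_cells.append(df1[df1[c] == 1].index.tolist())
df['New_column'] = new_cells

# Join each list into one string; an empty list becomes an empty string.
df['New_column_str'] = df['New_column'].str.join(',')
print(df['New_column_str'].tolist())
```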
-
莫回无
If you have Python >= 3.5, you can use the matmul operator (@) to compute the dot product:

    df['new_column'] = (df @ df.columns).str.join(', ')

       A  B  C  D  E new_column
    0  0  1  0  1  1    B, D, E
    1  0  0  0  0  0
    2  1  0  0  1  1    A, D, E

Or you can solve this with apply and axis=1, as follows:

    df['new_column'] = df.apply(lambda x: ', '.join(list(x[x!=0].index)), axis=1)

       A  B  C  D  E new_column
    0  0  1  0  1  1    B, D, E
    1  0  0  0  0  0
    2  1  0  0  1  1    A, D, E
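One difference between the two variants is easy to miss: the dot product multiplies each name by the cell value, while the apply version only tests `x != 0`. The sketch below uses a made-up frame with values other than 0 and 1 to show the effect. It uses the separator-appending form of the dot product so the repetition is visible; the variable names are mine.

```python
import pandas as pd

# Hypothetical data with counts instead of strict 0/1 flags.
df = pd.DataFrame({'A': [2, 0], 'B': [1, 0], 'C': [0, 3]})

# Raw dot product: a value of 2 repeats the column name twice.
raw = df.dot(df.columns + ',').str.rstrip(',')

# Converting to boolean first turns every non-zero value into a single hit.
as_bool = df.astype(bool).dot(df.columns + ',').str.rstrip(',')

# The apply variant already behaves like the boolean version.
via_apply = df.apply(lambda x: ','.join(x[x != 0].index), axis=1)

print(raw.tolist())
print(as_bool.tolist())
print(via_apply.tolist())
```

If the data might contain values other than 0 and 1, converting with `astype(bool)` before the dot product seems the safer choice.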
-
慕哥6287543
You can use apply with a lambda function on axis=1:

    df["New_column"] = df.apply(lambda x: ','.join(df.columns[x==1]), axis=1)
    df

       A  B  C  D  F New_column
    0  0  1  0  1  1      B,D,F
    1  0  0  0  0  0
    2  1  0  0  1  1      A,D,F
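In a real frame there are often other columns besides the 0/1 flags, and the result column itself becomes part of `df` after the first run. A hedged sketch of one way to handle both, assuming a hypothetical extra column `name`, an explicit list `flag_cols`, and a helper `add_label_column` that I introduce only for illustration:

```python
import pandas as pd

df = pd.DataFrame({
    'name': ['x', 'y', 'z'],  # hypothetical non-flag column
    'A': [0, 0, 1],
    'B': [1, 0, 0],
    'C': [0, 0, 0],
    'D': [1, 0, 1],
    'F': [1, 0, 1],
})

# Only these columns are treated as 0/1 flags (an assumption of this sketch).
flag_cols = ['A', 'B', 'C', 'D', 'F']


def add_label_column(frame):
    """Hypothetical helper: join the names of flag columns equal to 1."""
    frame['New_column'] = frame[flag_cols].apply(
        lambda x: ','.join(x.index[x == 1]), axis=1
    )
    return frame


add_label_column(df)
add_label_column(df)  # running it again leaves the result unchanged
print(df['New_column'].tolist())
```

Selecting `frame[flag_cols]` first means neither `name` nor an existing `New_column` is ever compared against 1.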
-
拉莫斯之舞
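Try this approach.

    import pandas as pd

    df = pd.DataFrame({"A":[0,0,1],"B":[1,0,0],"C":[0,0,0],"D":[1,0,1],"F":[1,0,1]})

    # First collect, per row, the names of the columns equal to 1 ...
    df["new_column"] = df.apply(lambda x: (pd.DataFrame(x[x==1]).index.values),axis=1)
    # ... then join each array of names into one string.
    df["new_column"] = df["new_column"].apply(lambda x: ','.join(map(str, x)))

Output:

       A  B  C  D  F new_column
    0  0  1  0  1  1      B,D,F
    1  0  0  0  0  0
    2  1  0  0  1  1      A,D,F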