A somewhat involved solution that does the merging with NumPy alone, but it runs very fast on large data:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([
    ['Train', 'Superfast', 10, 20],
    [np.nan, 'Convernient', np.nan, np.nan],
    [np.nan, 'Newest model', np.nan, np.nan],
    [np.nan, 'Year 2002/099', np.nan, np.nan],
    ['Car', 'Fastest', 20, 30],
    [np.nan, 'Can be more fast', np.nan, np.nan],
    [np.nan, 'Year/2020/AYD', np.nan, np.nan],
], columns=['A', 'B', 'C', 'D'])

a = df.values
# Rows where column A is not NaN start a new group (NaN is the only value
# that is not equal to itself). Append the row count as a final boundary.
i = np.append(np.flatnonzero(~(a[:, 0] != a[:, 0])), a.shape[0])
# One output row per group, seeded with the group's first row.
b = a[i[:-1], :]
diffs = np.diff(i)      # number of rows in each group
maxs = np.amax(diffs)   # size of the largest group
begs, ends = i[:-1], i[1:]
# Append the j-th continuation row of every group that has one.
for j in range(1, maxs):
    chosen = begs + j < ends
    b[chosen, 1] += ' ' + a[begs[chosen] + j, 1]
df = pd.DataFrame(b, columns=df.columns.values.tolist())
print(df)
```

Output:

```
       A                                                 B   C   D
0  Train  Superfast Convernient Newest model Year 2002/099  10  20
1    Car            Fastest Can be more fast Year/2020/AYD  20  30
```
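To make the index arithmetic easier to follow, here is a minimal sketch of the same idea on two plain arrays instead of a DataFrame. The helper name `merge_continuation_rows` and the variable names `keys` and `texts` are made up for this illustration and are not part of the answer's code. It assumes object-dtype arrays in which a non-NaN key marks the first row of each group.

```python
import numpy as np

def merge_continuation_rows(keys, texts, sep=' '):
    """Join each group's texts; a group starts wherever `keys` is not NaN.

    Hypothetical helper for illustration; expects object-dtype arrays.
    """
    # NaN is the only value that differs from itself, so this marks group starts.
    starts = np.flatnonzero(~(keys != keys))
    # Group k spans rows bounds[k] .. bounds[k + 1] - 1.
    bounds = np.append(starts, len(keys))
    sizes = np.diff(bounds)
    ends = bounds[1:]
    # Fancy indexing copies, so `texts` itself is left unchanged.
    merged = texts[starts]
    # The loop runs over the largest group size, not over the rows.
    for j in range(1, sizes.max()):
        chosen = starts + j < ends  # groups that still have a j-th row
        merged[chosen] += sep + texts[starts[chosen] + j]
    return starts, merged

keys = np.array(['Train', np.nan, np.nan, np.nan, 'Car', np.nan, np.nan],
                dtype=object)
texts = np.array(['Superfast', 'Convernient', 'Newest model', 'Year 2002/099',
                  'Fastest', 'Can be more fast', 'Year/2020/AYD'], dtype=object)

starts, merged = merge_continuation_rows(keys, texts)
print(starts.tolist())  # [0, 4]
for line in merged:
    print(line)
```

The Python-level loop runs once per position inside a group, so its length is the size of the largest group rather than the number of rows. That is presumably why the approach stays fast when there are many short groups.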