My question is similar to this one, but the answers there don't seem to fully work for my case!
Given the following pandas DataFrame:
+---------+-----------------+-----------------+
| SECTION | TEXT | NUMBER_OF_WORDS |
+---------+-----------------+-----------------+
| ONE | lots of text… | 55 |
+---------+-----------------+-----------------+
| ONE | word1 | 1 |
+---------+-----------------+-----------------+
| ONE | lots of text… | 151 |
+---------+-----------------+-----------------+
| ONE | word2 | 1 |
+---------+-----------------+-----------------+
| ONE | word3 | 1 |
+---------+-----------------+-----------------+
| ONE | word4 | 1 |
+---------+-----------------+-----------------+
| TWO | lots of text… | 523 |
+---------+-----------------+-----------------+
| TWO | lots of text… | 123 |
+---------+-----------------+-----------------+
| TWO | word4 | 1 |
+---------+-----------------+-----------------+
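If the value in the NUMBER_OF_WORDS column is 1, the row must be merged into the row above it, provided both rows have the same SECTION value.
Here is my code. It runs without errors, but the result is not what I want: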
df.groupby(['SECTION', (df.NUMBER_OF_WORDS.shift(1) == 1)], as_index=False, sort=False).agg({'TEXT': lambda x: ', '.join(x), 'NUMBER_OF_WORDS': lambda x: sum(x)})
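A likely reason the attempt above falls short: grouping on a boolean key collects every row with the same (SECTION, True/False) pair into one group, whether or not the rows are adjacent, so it cannot tell separate runs apart. One possible workaround is to build a run id with `cumsum` and group on that. The sketch below is only an illustration. It assumes that consecutive single-word rows should all be folded into the nearest preceding multi-word row of the same SECTION (the question does not show the expected output), and that a single-word row at the very start of a section stays on its own. The names `starts_group`, `group_id` and `merged` are made up for this example.

```python
import pandas as pd

# Sample data copied from the table in the question.
df = pd.DataFrame({
    "SECTION": ["ONE"] * 6 + ["TWO"] * 3,
    "TEXT": ["lots of text…", "word1", "lots of text…", "word2", "word3",
             "word4", "lots of text…", "lots of text…", "word4"],
    "NUMBER_OF_WORDS": [55, 1, 151, 1, 1, 1, 523, 123, 1],
})

# A row opens a new group if it is not a single-word row,
# or if its SECTION differs from the SECTION of the row above.
starts_group = (df["NUMBER_OF_WORDS"] != 1) | (df["SECTION"] != df["SECTION"].shift())

# The cumulative sum of the flags gives every run its own integer id.
group_id = starts_group.cumsum()

merged = (
    df.groupby(group_id, sort=False)
      .agg({"SECTION": "first", "TEXT": ", ".join, "NUMBER_OF_WORDS": "sum"})
      .reset_index(drop=True)
)

print(merged)
```

With the sample data this leaves four rows: each single-word row is appended to the multi-word row above it, and the 523-word row in section TWO stays unchanged because no single-word row follows it directly.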
桃花长相依