I have a Pandas DataFrame:
    text                                               is_from_me
0   Happy birthday bud!!!                              1
1   Thanks man!                                        0
2   Definitely would've come back had I thought ab...  1
3   Your good                                          0
4   Okay haha                                          1
5   Have a good one                                    1
6   Yea you too. What are you up to?                   0
7   No hw like I'm doing all day                       1
8   Just got up                                        1
9   Same here. I went to the football game last...     0
10  I think I saw that in your story                   1
11  Win?                                               1
12  Lost in last second                                0
13  Aw, that sucks                                     1
14  Means it was a good game tho?                      1
15  Really good game. They were on the 1/2 yard li...  0
16  Dang                                               1
I am trying to produce the following:
    input                                              output
0   Happy birthday bud!!!                              Thanks man!
2   Thanks man!                                        Definitely would've come back had I thought ab...
3   Definitely would've come back had I thought ab...  Your good
4   Your good                                          Okay haha\nHave a good one
I can get something close with this code:
pd.concat([df['text'].reset_index(drop=True), df['text'].shift(-1).reset_index(drop=True)], axis=1)
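However, this does not merge consecutive rows that share the same is_from_me value into a single string, with the original texts separated by newline characters. This is a simple example; more than 2 rows may need to be combined into one.
I have tried to come up with a simple way to define this grouping, but all I could manage was a convoluted for loop that sort of does the job in a hacky way. Is there an aggregation function I can write that would do this for me?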
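For reference, here is a minimal sketch of the kind of grouping described above, not a definitive solution. It assumes a "run" is a stretch of consecutive rows with the same is_from_me value, and it uses only the first six rows of the DataFrame as sample data. The names run_id, merged and pairs are introduced here for illustration.

```python
import pandas as pd

# Sample data: the first six rows of the DataFrame shown above.
df = pd.DataFrame({
    "text": [
        "Happy birthday bud!!!",
        "Thanks man!",
        "Definitely would've come back had I thought ab...",
        "Your good",
        "Okay haha",
        "Have a good one",
    ],
    "is_from_me": [1, 0, 1, 0, 1, 1],
})

# A new run starts wherever is_from_me differs from the previous row;
# the cumulative sum of those change points gives each run its own label.
run_id = (df["is_from_me"] != df["is_from_me"].shift()).cumsum()

# Join the texts within each run, separated by newline characters.
merged = df.groupby(run_id)["text"].agg("\n".join).reset_index(drop=True)

# Pair each merged message with the merged message that follows it.
# The last message has no reply, so its row is dropped.
pairs = pd.DataFrame({"input": merged, "output": merged.shift(-1)}).dropna()

print(pairs)
```

The shift/cumsum comparison is a common pandas idiom for labelling consecutive runs, so the for loop is replaced by a single groupby. Note that the resulting index is a plain 0..n-1 range rather than the original row labels.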