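I am trying to capture elements from a list-valued column in a pandas DataFrame. The code below keeps the entire list whenever the string is present. How can I keep, row by row, only the elements that contain a specific string and ignore the rest?
Here is what I have tried: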
import numpy as np
import pandas as pd

l1 = [1, 2, 3, 4, 5, 6]
l2 = ['hello world \n my world', 'world is a great place \n we live in it', 'planet earth', np.nan, '\n save the water', '']
df = pd.DataFrame(list(zip(l1, l2)), columns=['id', 'sentence'])
# Split each sentence on newlines into a list of fragments
df['sentence_split'] = df['sentence'].str.split('\n')
print(df)
Result of filtering with the code below:
df[df.sentence_split.str.join(' ').str.contains('world', na=False)] # does the trick but still not exactly what I am looking for.
id  sentence                                 sentence_split
1   hello world \n my world                  [hello world , my world]
2   world is a great place \n we live in it  [world is a great place , we live in it]
But I am looking for:
id  sentence                                 sentence_split
1   hello world \n my world                  hello world; my world
2   world is a great place \n we live in it  world is a great place
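
One possible way to get the output above is to filter the fragments inside each row instead of filtering whole rows. The sketch below is only an assumption about what is wanted: the helper name `keep_matching` and its `keyword` parameter are made up for illustration, fragments are stripped of surrounding whitespace, and rows with no matching fragment are dropped.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "id": [1, 2, 3, 4, 5, 6],
    "sentence": [
        "hello world \n my world",
        "world is a great place \n we live in it",
        "planet earth",
        np.nan,
        "\n save the water",
        "",
    ],
})


def keep_matching(text, keyword="world"):
    """Hypothetical helper: join the newline-separated fragments containing keyword."""
    # Missing values (NaN) are not strings, so pass them through as NaN
    if not isinstance(text, str):
        return np.nan
    # Split on newlines and trim the whitespace around each fragment
    parts = [part.strip() for part in text.split("\n")]
    # Keep only the fragments that contain the keyword
    hits = [part for part in parts if keyword in part]
    # Rows without any matching fragment become NaN so they can be dropped
    return "; ".join(hits) if hits else np.nan


df["sentence_split"] = df["sentence"].apply(keep_matching)
# Drop the rows where no fragment matched
result = df.dropna(subset=["sentence_split"])
print(result)
```

A plain Python function with `apply` is used here because the filtering happens per fragment, which the vectorized `.str.contains` on the joined string cannot express. If a list is preferred over a `;`-joined string, returning `hits` instead of the joined string should work the same way.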