How can I check whether the first word of a DataFrame string column is in a list in Python?


I have a DataFrame df_sentences and a list question_words, as follows:

df_sentences:

sentence                        label
you will not forget this movie  0
will the novel ever die         1
why we drink alcohol            1
did trump win the election      1
ambiance is perfect             0

question_words = ['what', 'why', 'when', 'where', 'whose', 'which', 'whom', 'who', 'how',
                  'do', 'are', 'will', 'did', 'will', 'am', 'are', 'was', 'were', 'can', 'has', 'have']

I want to check whether the first word of each entry in the sentence column is in the question_words list, and return the result in a new column, ques_word.

Expected output:

sentence                        label  ques_word
you will not forget this movie  0      0
will the novel ever die         1      1
why we drink alcohol            1      1
did trump win the election      1      1
ambiance is perfect             0      0

So far I have been trying .str.contains('|'.join(question_words)).astype(int), but, as expected, it flags every sentence that contains any of the question_words as a substring anywhere, not only as the first word.
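To make the substring problem concrete, here is a minimal sketch that rebuilds the sample data from the question and compares the unanchored pattern with an anchored one. The anchored pattern (`^` plus a trailing `\b`) is an illustrative assumption about how the regex route could be repaired, not something proposed in the thread; it also assumes the question words contain no regex metacharacters.

```python
import pandas as pd

df_sentences = pd.DataFrame({
    "sentence": [
        "you will not forget this movie",
        "will the novel ever die",
        "why we drink alcohol",
        "did trump win the election",
        "ambiance is perfect",
    ],
    "label": [0, 1, 1, 1, 0],
})
question_words = ['what', 'why', 'when', 'where', 'whose', 'which', 'whom', 'who', 'how',
                  'do', 'are', 'will', 'did', 'will', 'am', 'are', 'was', 'were', 'can', 'has', 'have']

# Unanchored alternation: matches a question word anywhere in the sentence,
# including inside another word ("am" inside "ambiance").
anywhere = df_sentences["sentence"].str.contains("|".join(question_words)).astype(int)
print(anywhere.tolist())  # [1, 1, 1, 1, 1]

# Anchored alternation (assumed fix): the match must start at the beginning
# of the string and end on a word boundary.
anchored = "^(?:" + "|".join(question_words) + r")\b"
at_start = df_sentences["sentence"].str.contains(anchored).astype(int)
print(at_start.tolist())  # [0, 1, 1, 1, 0]
```

The non-capturing group `(?:...)` is used so that pandas does not warn about match groups in the pattern.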

手掌心


2 Answers

慕村9548890

If you want a fast solution, use a list comprehension.

    q_set = set(question_words)  # set membership tests are O(1) on average
    df['ques_word'] = [
        1 if w.split(None, 1)[0] in q_set else 0
        for w in df.sentence
    ]
    df

                             sentence  label  ques_word
    0  you will not forget this movie      0          0
    1         will the novel ever die      1          1
    2            why we drink alcohol      1          1
    3      did trump win the election      1          1
    4             ambiance is perfect      0          0
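One caveat with the list comprehension: `w.split(None, 1)[0]` raises IndexError on an empty or whitespace-only string and AttributeError on a missing value. The following is a hedged sketch of a more defensive variant. The helper name `starts_with_question_word` is hypothetical, and lowercasing the first word is an extra assumption that the question does not ask for.

```python
import pandas as pd

def starts_with_question_word(sentence, q_set):
    """Return 1 if the first whitespace-separated token is in q_set, else 0."""
    if not isinstance(sentence, str):   # None / NaN rows count as "no match"
        return 0
    tokens = sentence.split(None, 1)    # [] for empty or whitespace-only text
    # Lowercasing is an assumption, so that "Why ..." also matches 'why'.
    return int(bool(tokens) and tokens[0].lower() in q_set)

q_set = {"why", "will", "did"}
df = pd.DataFrame({"sentence": ["Why not", "", None, "ambiance is perfect"]})
df["ques_word"] = [starts_with_question_word(s, q_set) for s in df["sentence"]]
print(df["ques_word"].tolist())  # [1, 0, 0, 0]
```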


扬帆大鱼

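df['sentence'].str.split().str[0].isin(question_words).astype(int) should do the job. Note that .str.split(" ")[0] on its own selects the first row rather than the first word of each row, so the first word has to be taken with .str[0]; isin then tests for an exact match instead of a substring match.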
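A vectorized take on this idea can be sketched as follows, using the sample data from the question: `.str.split()` tokenizes each row, `.str[0]` picks the first token of each row, and `isin` tests exact membership in the list. This is one possible formulation under those assumptions, not necessarily what the answerer had in mind.

```python
import pandas as pd

df = pd.DataFrame({
    "sentence": [
        "you will not forget this movie",
        "will the novel ever die",
        "why we drink alcohol",
        "did trump win the election",
        "ambiance is perfect",
    ],
    "label": [0, 1, 1, 1, 0],
})
question_words = ['what', 'why', 'when', 'where', 'whose', 'which', 'whom', 'who', 'how',
                  'do', 'are', 'will', 'did', 'will', 'am', 'are', 'was', 'were', 'can', 'has', 'have']

# First whitespace-separated token of every row (element-wise via the .str accessor).
first_word = df["sentence"].str.split().str[0]

# Exact membership test, so "ambiance" does not match 'am'.
df["ques_word"] = first_word.isin(question_words).astype(int)
print(df["ques_word"].tolist())  # [0, 1, 1, 1, 0]
```

Rows that are empty or missing yield NaN for the first word, which `isin` treats as not found, so they end up as 0 rather than raising an error.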

