Pandas: find the position of a keyword within a row, using another Series as the lookup value

5 Answers

杨魅力

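You can also use series.str.split with expand=True to turn the titles into a DataFrame, then use df.eq to check where that DataFrame matches the other Series. Pass axis=0 so that the query Series is aligned with the rows rather than with the column labels:

    example_data['position'] = (example_data['Title'].str.split(expand=True)
                                .eq(example_data['query'], axis=0)
                                .idxmax(axis=1) + 1)
    print(example_data)

          query                       Title  position
    0  keyword1  keyword1 keyword2 keyword3         1
    1  keyword1  keyword2 keyword1 keyword3         2

If a match may be missing, you can use:

    import numpy as np

    m = example_data['Title'].str.split(expand=True)
    c = m.eq(example_data['query'], axis=0)
    example_data['position'] = np.where(c.any(axis=1), c.idxmax(axis=1) + 1, np.nan)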
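
For anyone who wants to try this end to end, here is a minimal sketch. The frame below is an assumption modeled on the printed output above, with a hypothetical third row (query keyword9) added so the no-match branch is exercised; the names words and hits are illustrative only.

```python
import numpy as np
import pandas as pd

# Hypothetical sample frame: the first two rows mirror the output shown above,
# the third row has a query that does not occur in its title.
example_data = pd.DataFrame({
    "query": ["keyword1", "keyword1", "keyword9"],
    "Title": [
        "keyword1 keyword2 keyword3",
        "keyword2 keyword1 keyword3",
        "keyword1 keyword2 keyword3",
    ],
})

# One column per word of the title.
words = example_data["Title"].str.split(expand=True)

# axis=0 compares every word column against the query of the same row.
hits = words.eq(example_data["query"], axis=0)

# 1-based position of the first matching word, NaN where no word matches.
example_data["position"] = np.where(
    hits.any(axis=1), hits.idxmax(axis=1) + 1, np.nan
)
positions = example_data["position"].tolist()
print(example_data)
```

Because NaN forces a float column, the positions come back as 1.0 and 2.0 rather than integers; a sentinel such as -1 would keep the column integer-typed.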

慕的地10843

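Use .index, but check for a match first and return -1 if there is none. Testing membership against the split word list, rather than against the raw string, avoids a ValueError when the query is only a substring of one of the words:

    out = [b.split().index(a) + 1
           if a in b.split()
           else -1
           for a, b in zip(example_data['query'], example_data['Title'])]
    print(out)
    [1, 2]

    example_data['query_position'] = out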
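
The same logic can be pulled into a small function, which makes the substring edge case easy to check. This is only a sketch; word_position and its missing parameter are hypothetical names, not part of pandas.

```python
def word_position(query: str, title: str, missing: int = -1) -> int:
    """Return the 1-based position of `query` among the whitespace-separated
    words of `title`, or `missing` if it is not one of those words."""
    words = title.split()
    return words.index(query) + 1 if query in words else missing


# Whole-word match in second place.
second = word_position("keyword1", "keyword2 keyword1 keyword3")

# "keyword" is a substring of every word but not a word itself.
partial = word_position("keyword", "keyword1 keyword2 keyword3")

print(second, partial)
```

It can be applied row by row with a list comprehension over zip(example_data['query'], example_data['Title']), exactly as in the answer above.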

慕田峪4524236

The solution I found is arguably more Pythonic, and it works. str.find does not help here, because it returns the index as a character offset rather than a word count.

    example_data['query_position'] = [
        len(t.split(q)[0].split(' ')) if len(t.split(q)) > 1 else 0
        for t, q in zip(example_data['Title'].str.lower(),
                        example_data['query'].str.lower())
    ]
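
To make the character-versus-word distinction concrete, here is a small sketch on a single title string; the sample values are assumptions taken from the example frame used elsewhere in this thread.

```python
title = "keyword2 keyword1 keyword3"
query = "keyword1"

# str.find counts characters from the start of the string.
char_offset = title.find(query)

# Splitting first and indexing the word list counts words instead (1-based here).
word_position = title.split().index(query) + 1

print(char_offset, word_position)  # prints: 9 2
```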

浮云间

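If I understand correctly, you are trying to create a new column, query_position, that checks whether the string in query appears in Title and then gives its position. The str.find() method returns -1 if the searched string is not present in the other string. You said you want 0 when the string is absent, but that can be confusing when the string you are searching for is present and sits at index 0. Note also that str.find() returns a character offset, not a word position. If you really want zero, I would solve it with str.find() like this:

    # Quick custom function
    def match_string(Title, query):
        s = Title.find(query)
        if s == -1:
            return 0
        else:
            return s

    # Use .apply() to create a new column with the custom function
    example_data['query_position'] = example_data.apply(
        lambda x: match_string(x['Title'], x['query']), axis=1)

If you would rather keep -1 as is, this is how to apply str.find() to your DataFrame:

    example_data['query_position'] = example_data.apply(
        lambda x: str.find(x['Title'], x['query']), axis=1)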
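
The ambiguity mentioned above can be seen directly. In this sketch, to_zero is a hypothetical helper that mirrors what match_string does with the result of str.find(); the title string is an assumed sample value.

```python
title = "keyword1 keyword2 keyword3"

found_at_start = title.find("keyword1")  # match at the very beginning
not_found = title.find("keyword9")       # no match anywhere


def to_zero(offset: int) -> int:
    """Hypothetical helper: map the -1 'not found' sentinel to 0."""
    return 0 if offset == -1 else offset


# After the mapping, a hit at offset 0 and a miss are indistinguishable.
collapsed = to_zero(found_at_start) == to_zero(not_found)
print(found_at_start, not_found, collapsed)  # prints: 0 -1 True
```

Keeping -1 (or using NaN) for the missing case avoids this collision.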

30秒到达战场

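I think you want a column that simply enumerates the rows, like this:

    example_data['enum'] = range(len(example_data))

Then, if the query string is found in the title string, just update the row id like this:

    example_data['query_position'] = example_data.apply(
        lambda x: x['enum'] if x['query'] in x['Title'] else 0, axis=1)

Let me know if this helps!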
