IndexError: positional indexers are out-of-bounds when running the following code on a DataFrame that has had rows removed, but not on a brand-new DataFrame
I am using the following methods to clean my data:
import pandas as pd

def get_list_of_corresponding_projects(row: pd.Series, df: pd.DataFrame) -> list:
    """Returns a list of indexes indicating the 'other' (not the current one) records that are for the same year, topic and being a project.
    """
    current_index = row.name
    current_year = row['year']
    current_topic = row['topic']
    if row['Teaching Type'] == "Class":
        mask = (df.index != current_index) & (df['year'] == current_year) & (df['topic'] == current_topic) & (df['Teaching Type'] == "Project")
        return df[mask].index.values.tolist()
    else:
        return list()

def fix_classes_with_corresponding_projects(df: pd.DataFrame) -> pd.DataFrame:
    """Change the Teaching Type of projects having a corresponding class from 'Project' to 'Practical Work'
    """
    # find the projects corresponding to that class
    df['matching_lines'] = df.apply(lambda row: get_list_of_corresponding_projects(row, df), axis=1)
    # Turn the series of lists into a single list without duplicates
    indexes_to_fix = list(set(sum(df['matching_lines'].values.tolist(), [])))
    # Update the records
    df.iloc[indexes_to_fix, df.columns.get_loc('Teaching Type')] = "Practical Work"
    # Remove the column that was used for tagging
    df.drop(['matching_lines'], axis=1, inplace=True)
    # return the data
    return df
These methods work fine when run on a brand-new DataFrame:
df = pd.DataFrame({'year': ['2015', '2015', '2015', '2016', '2016', '2017', '2017', '2017', '2017'],
                   'Teaching Type': ['Class', 'Project', 'Class', 'Class', 'Project', 'Class', 'Class', 'Class', 'Project'],
                   'topic': ['a', 'a', 'b', 'a', 'c', 'a', 'b', 'a', 'a']})
display(df)
df = fix_classes_with_corresponding_projects(df)
display(df)
When the DataFrame has had rows removed, the error is raised on the following line:
df.iloc[indexes_to_fix, df.columns.get_loc('Teaching Type')] = "Practical Work"
What am I missing here? I thought that by using the index values I would avoid this type of error.
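For illustration, here is a minimal sketch of what may be going on, assuming the rows were removed without resetting the index. The small frame and the name `labels_to_fix` are made up for this example and are not from my real data. `df[mask].index` returns index labels, whereas `.iloc` expects integer positions; once rows are dropped the two no longer line up, so a label can point past the end of the frame (or, worse, silently hit the wrong row).

```python
import pandas as pd

# Hypothetical 4-row frame; dropping the first row leaves labels 1, 2, 3
# at positions 0, 1, 2.
df = pd.DataFrame({
    'year': ['2014', '2015', '2015', '2015'],
    'Teaching Type': ['Class', 'Class', 'Class', 'Project'],
    'topic': ['z', 'a', 'b', 'a'],
})
df = df.drop(index=0)

# This is what df[mask].index.values.tolist() would yield: a list of LABELS.
labels_to_fix = [3]

# .iloc treats 3 as a position, but only positions 0..2 exist now.
try:
    df.iloc[labels_to_fix]
    iloc_failed = False
except IndexError:
    iloc_failed = True

# .loc looks the rows up by label, so the same list works as intended.
df.loc[labels_to_fix, 'Teaching Type'] = "Practical Work"
```

If this is indeed the cause, replacing the `.iloc[...]` assignment with `df.loc[indexes_to_fix, 'Teaching Type'] = "Practical Work"` should avoid the error. Calling `df.reset_index(drop=True)` after removing rows would be another option, since labels and positions then coincide again.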