
Adding rows to a pandas DataFrame at the end of a loop

I am trying to add rows to a DataFrame as part of a loop.


The program loops over a list of URLs and extracts the data from each one into DataFrames:


for id in game_ids:
    df_team_final = []
    df_player_final = []
    url = 'https://www.fibalivestats.com/data/' + id + '/data.json'
    content = requests.get(url)
    data = json.loads(content.content)

At the end of the loop body, I use concat to combine the two DataFrames for the home and away teams (and likewise for the players):


    team_full = pd.concat([df_home_team, df_away_team])
    player_full = pd.concat([df_home_player_merge, df_away_player_merge])

Outside the loop, I save the result to Excel:


# #if cant find it, create new spread sheet
writer = pd.ExcelWriter('Box Data.xlsx', engine='openpyxl')
team_full.to_excel(writer, sheet_name='Team Stats', index=False)
player_full.to_excel(writer, sheet_name='Player Stats', index=False)
writer.save()
writer.close()

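Because I loop over several pages, I need to keep adding to the DataFrame as I go. In its current form, the second iteration simply overwrites the data from the first URL.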


What is the best way to append or add to the DataFrame at the end of each loop iteration?


qq_遁去的一_1
1 Answer

喵喔喔

Since we can't see the full code, I can only give a rough outline here. I assume you are not appending the scraped data to any kind of container, so it is lost on the next iteration.

# empty lists outside of loop to store data
df_team_final = []
df_player_final = []

for id in game_ids:
    url = 'https://www.fibalivestats.com/data/' + id + '/data.json'
    content = requests.get(url)
    data = json.loads(content.content)

    # create dataframes that you need
    # df_home_team, df_away_team etc
    # and append data to containers
    team_full = pd.concat([df_home_team, df_away_team])
    player_full = pd.concat([df_home_player_merge, df_away_player_merge])
    df_team_final.append(team_full)
    df_player_final.append(player_full)

Now that your DataFrames are stored in lists, you can combine them with pandas.concat:

# outside of the loop
team_full = pd.concat(df_team_final)
player_full = pd.concat(df_player_final)

And save everything in one go:

writer = pd.ExcelWriter('Box Data.xlsx', engine='openpyxl')
team_full.to_excel(writer, sheet_name='Team Stats', index=False)
player_full.to_excel(writer, sheet_name='Player Stats', index=False)
# close() writes the file to disk; the separate writer.save() call was removed in pandas 2.0
writer.close()

Edit

From the file you shared, I see that you created the containers inside the loop, so they are reset to empty lists on every iteration. You should create them before the loop starts:

# initialize them here
df_team_final = []
df_player_final = []

for id in game_ids:
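To make the pattern concrete, here is a minimal self-contained sketch that runs without network access. It rests on some assumptions: the dictionary FAKE_GAMES and the helper team_frames are hypothetical stand-ins for the real requests call and the DataFrame-building code, neither of which is shown in the question, and the column names are invented for illustration. The loop variable is called game_id only to avoid shadowing the built-in id.

```python
import pandas as pd

# Hypothetical stand-in for the scraped JSON: one entry per game id.
FAKE_GAMES = {
    "1001": {"home": {"team": "A", "pts": 80}, "away": {"team": "B", "pts": 75}},
    "1002": {"home": {"team": "C", "pts": 66}, "away": {"team": "D", "pts": 70}},
}


def team_frames(game_id):
    """Build the per-game home/away frames (hypothetical helper, no network)."""
    game = FAKE_GAMES[game_id]
    df_home_team = pd.DataFrame([{"game_id": game_id, "side": "home", **game["home"]}])
    df_away_team = pd.DataFrame([{"game_id": game_id, "side": "away", **game["away"]}])
    return df_home_team, df_away_team


# Container created once, before the loop, so it survives every iteration.
df_team_final = []

for game_id in FAKE_GAMES:
    df_home_team, df_away_team = team_frames(game_id)
    # Collect this game's combined frame instead of overwriting a variable.
    df_team_final.append(pd.concat([df_home_team, df_away_team]))

# A single concat after the loop; ignore_index=True renumbers the rows 0..n-1.
team_full = pd.concat(df_team_final, ignore_index=True)
print(team_full)
```

Collecting the frames in a list and calling pd.concat once after the loop avoids copying the accumulated data on every iteration, which is what would happen if the DataFrame were grown inside the loop.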