Uploading a DataFrame to S3 with Python

I am trying to split a DataFrame into groups, as shown below:


from io import StringIO

import pandas as pd

data = """
A,B,C
87jg,28,3012
h372,28,3011
kj87,27,3011
2yh8,54,3010
802h,53,3010
5d8b,52,3010
"""

df = pd.read_csv(StringIO(data), sep=',')

for key, group in df.groupby(['C','B']):
    group.to_csv(f'df_{key}.csv', index=False)

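This writes each group of the DataFrame to a CSV file on the local machine. Is there a way to do the same thing but upload the resulting CSV files to S3 (something like boto3's put_object)?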


蝴蝶不菲
Views: 131, Answers: 2
2 answers

人到中年有点甜

You can also use s3fs, which has to be installed first. It can be installed with pip, for example:

pip install s3fs

An example based on your code, which I have verified:

import os
from io import StringIO

import pandas as pd
import s3fs

# I did not use my default aws profile
# so had to provide key and secret. If you use
# the default aws profile, providing `key`
# and `secret` should not be required
fs = s3fs.S3FileSystem(
        anon=False,
        key='<access_key>',
        secret='<secret_key>')

data = """
A,B,C
87jg,28,3012
h372,28,3011
kj87,27,3011
2yh8,54,3010
802h,53,3010
5d8b,52,3010
"""

df = pd.read_csv(StringIO(data), sep=',')

for key, group in df.groupby(['C','B']):
    # The with block closes the file, which flushes the data to S3
    with fs.open(f's3://<bucket-name>/df_{key[0]}-M{key[1]}.csv', 'w') as f:
        group.to_csv(f, index=False)

The code uploads the files correctly.

一只萌萌小番薯

from io import StringIO

import pandas as pd
import boto3

data = """
A,B,C
87jg,28,3012
h372,28,3011
kj87,27,3011
2yh8,54,3010
802h,53,3010
5d8b,52,3010
"""

df = pd.read_csv(StringIO(data), sep=',')

client = boto3.client('s3')

for key, group in df.groupby(['C', 'B']):
    # Write the group to a local CSV file first
    group.to_csv(f'df_{key}.csv', index=False)
    # Then upload that file: local path, bucket name, object key
    client.upload_file(f'df_{key}.csv', 'my-another-test-bucket-2',
                       f'df_{key[0]}-M{key[1]}.csv')
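Since the question mentions put_object, here is a minimal sketch of a variant that skips the local file: each group is written to an in-memory buffer and the buffer's contents are passed as the object body. The helper name upload_groups and the '<bucket-name>' placeholder are assumptions for illustration, not part of either answer; the object key format follows the one used above.

```python
from io import StringIO

import pandas as pd


def upload_groups(df, client, bucket):
    """Upload one CSV per (C, B) group with put_object; return the object keys.

    `client` is assumed to be a boto3 S3 client (boto3.client('s3')).
    `upload_groups` itself is a hypothetical helper, not a boto3 function.
    """
    uploaded = []
    for key, group in df.groupby(['C', 'B']):
        # Serialize the group to CSV in memory instead of on disk
        buffer = StringIO()
        group.to_csv(buffer, index=False)
        object_key = f'df_{key[0]}-M{key[1]}.csv'
        client.put_object(
            Bucket=bucket,
            Key=object_key,
            Body=buffer.getvalue().encode('utf-8'),
        )
        uploaded.append(object_key)
    return uploaded


# Usage (requires boto3 and AWS credentials; the bucket name is a placeholder):
# import boto3
# upload_groups(df, boto3.client('s3'), '<bucket-name>')
```

Compared with upload_file, this avoids leaving temporary CSV files on the local machine, at the cost of holding each group's CSV text in memory.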