Adding a column to a pandas DataFrame by unpacking a zipped list

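I need to read a list of HTML files into pandas DataFrames.

Each HTML file contains several tables, which I combine with pd.concat.
Each HTML file name contains a string that I want to add as a column.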

import glob
import pandas as pd

# Read all files into a list
files = glob.glob('monthly_*.html')

# Zip the dfs with the desired string segment
zipped_dfs = [zip(pd.concat(pd.read_html(file)), file.split('_')[1]) for file in files]

I'm having trouble unpacking the zipped list of (df, product) pairs.

dfs = []

# Loop through the list of zips
for _zip in zipped_dfs:
    # Unpack the zip
    for _df, product in _zip:
        # Add the product string as a new column
        _df['Product'] = product
        dfs.append(_df)

However, I get the error: 'str' object does not support item assignment

Can someone explain the best way to add the new column?

杨__羊羊


1 Answer

繁华开满天机

You should remove zip from the list comprehension line. If you want tuples of the concatenated DataFrame and the product name, write:

zipped_dfs = [(pd.concat(pd.read_html(file)), file.split('_')[1])
              for file in files]

However, the intermediate step of building a list of tuples isn't needed. The whole approach can be simplified as follows:

dfs = []
for file in glob.glob('monthly_*.html'):
    # NOTE: your code seemingly keeps .html in the product name
    # so I modified the split operation
    df = pd.concat(pd.read_html(file))
    df['Product'] = file.split('.html')[0].split('_')[1]
    dfs.append(df)
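To see where the error likely comes from, here is a minimal sketch that needs no HTML files. The small DataFrame and the file name monthly_widgets.html are made up for illustration, standing in for pd.concat(pd.read_html(file)) and a real file. Iterating over a DataFrame yields its column labels, so zip pairs each label with one character of the string, and the loop variable _df ends up being a str. The sketch also uses Path.stem and DataFrame.assign as one alternative way to strip the extension and add the column, which is a substitution on my part and not what the code above does.

```python
from pathlib import Path

import pandas as pd

# Hypothetical stand-in for pd.concat(pd.read_html(file)).
df = pd.DataFrame({"month": ["Jan", "Feb"], "sales": [10, 20]})
file = "monthly_widgets.html"  # hypothetical file name

# Iterating a DataFrame yields its column labels, so zip() pairs each
# label with one character of the string instead of pairing the frame
# with the product name.
pairs = list(zip(df, file.split("_")[1]))
print(pairs)

# Path.stem drops the extension before the split on the underscore.
product = Path(file).stem.split("_")[1]

# assign() returns a new frame with the extra column; df is left unchanged.
tagged = df.assign(Product=product)
print(tagged)
```

Each element of pairs is a (column label, character) tuple, which is why assigning to _df['Product'] fails on a str. Once every file has been tagged this way, pd.concat(dfs, ignore_index=True) would combine the frames into a single DataFrame.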
