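Extracting specific columns from multiple CSV files in a directory in Python

I have about 200 CSV files in a directory. They contain different columns, but some of them contain data I want to extract. One column I want to pull is called "Programme" (the ordering differs between files, but the name is the same), and the other is a column whose name contains "recommend" (the wording is not identical in every file, but they all contain that word). Ultimately, for each CSV I want to extract all the rows under these two columns and append them to a data frame that contains only those two columns. I tried doing this with a single CSV first, but could not get it to work. Here is what I tried: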

import pandas as pd
from io import StringIO

df = pd.read_csv("test.csv")

dfout = pd.DataFrame(columns=['Programme', 'Recommends'])

for file in [df]:
    dfn = pd.read_csv(file)
    matching = [s for s in dfn.columns if "would recommend" in s]
    if matching:
        dfn = dfn.rename(columns={matching[0]:'Recommends'})
        dfout = pd.concat([dfout, dfn], join="inner")

print(dfout)

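I get the following error message, so I think it is a format problem (it doesn't like the pandas df?):

ValueError(msg.format(_type=type(filepath_or_buffer)))
ValueError: Invalid file path or buffer object type: <class 'pandas.core.frame.DataFrame'>

When I try this instead: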

csv1 = StringIO("""Programme,"Overall, I am satisfied with the quality of the programme",I would recommend the company to a friend or colleague,Please comment on any positive aspects of your experience of this programme
Nursing,4,4,IMAGE
Nursing,1,3,very good
Nursing,4,5,I enjoyed studying tis programme""")

csv2 = StringIO("""Programme,I would recommend the company to a friend,The programme was well organised and running smoothly,It is clear how students' feedback on the programme has been acted on
IT,4,2,4
IT,5,5,5
IT,5,4,5""")

dfout = pd.DataFrame(columns=['Programme', 'Recommends'])

for file in [csv1,csv2]:
    dfn = pd.read_csv(file)
    matching = [s for s in dfn.columns if "would recommend" in s]
    if matching:
        dfn = dfn.rename(columns={matching[0]:'Recommends'})
        dfout = pd.concat([dfout, dfn], join="inner")

print(dfout)

This works fine, but I need to read from CSV files. Any ideas?

Expected output for the example above:

牛魔王的故事


1 Answer

撒科打诨

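The following works:

import glob

import pandas as pd

dfOut = []
for myfile in glob.glob("*.csv"):
    tmp = pd.read_csv(myfile, encoding='latin-1')
    # Find the column whose header contains the phrase "would recommend"
    matching = [s for s in tmp.columns if "would recommend" in s]
    if len(matching) > 0:
        tmp.rename(columns={matching[0]: 'Recommend'}, inplace=True)
        # Keep only the two columns of interest ('Programme' in the question's data)
        tmp = tmp[['Programme', 'Recommend']]
        dfOut.append(tmp)

# Concatenate once, after every matching file has been collected
df = pd.concat(dfOut)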
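For what it's worth, the ValueError in the question arises because pd.read_csv is handed a DataFrame that has already been read; it expects a file path or a file-like object such as StringIO. If it helps, the same glob-style loop can be wrapped in a function. This is only a sketch under some assumptions: the function name collect_columns and its parameters are made up for illustration, the match on the phrase is made case-insensitive, and the latin-1 encoding is carried over from the answer above rather than known to be right for your files.

```python
from pathlib import Path

import pandas as pd


def collect_columns(directory, keep="Programme", phrase="would recommend",
                    new_name="Recommends", encoding="latin-1"):
    """Gather `keep` and the first column containing `phrase` from each CSV.

    Hypothetical helper: names and defaults are illustrative assumptions.
    """
    frames = []
    for path in sorted(Path(directory).glob("*.csv")):
        # Read only the header row first to see which columns this file has.
        header = pd.read_csv(path, nrows=0, encoding=encoding).columns
        matching = [c for c in header if phrase.lower() in c.lower()]
        if keep not in header or not matching:
            continue  # this file lacks one of the two columns, so skip it
        # Pass the path (not a DataFrame) and load just the two columns.
        tmp = pd.read_csv(path, usecols=[keep, matching[0]], encoding=encoding)
        tmp = tmp.rename(columns={matching[0]: new_name})
        frames.append(tmp[[keep, new_name]])
    if not frames:
        # Nothing matched: return an empty frame instead of letting concat fail.
        return pd.DataFrame(columns=[keep, new_name])
    return pd.concat(frames, ignore_index=True)
```

Calling collect_columns(".") would scan the current directory. Reading the header with nrows=0 before loading the data avoids parsing every column of all 200 files, and ignore_index=True gives the combined frame a fresh 0..n-1 index rather than repeating each file's row numbers.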
