Python: turning a text file into rows and columns

I have been working on this for a while, seem to have hit a wall, and would appreciate some help.


I have several text files. Without writing them all out, here is an example:


2020

Grum Grum

Stamina: 20

Agility: 23

Strength: 20.5%

Resistances: 20-21-30


2020

Mondo Silo

Stamina: 23

Agility: 13

Strength: 10.5%

Resistances: 20-21-20

And so on. Some files look like this, with a new stat sheet starting every 6 lines; in other text files a new stat sheet starts every 10 lines.


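My goal is that each time a stat sheet ends, it is placed into a row with one value per column. I think this is called transposing in spreadsheet terms, but I am not sure that is even the right term, and I don't know what I am doing wrong.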


For example, I would like the file to look like this when it is done:


Year | Name | Stamina | Agility | Str | Res

2020 | Grum Grum | Stamina: 20 | Agility: 23 | Strength: 20.5% | Resistances: 20-21-30

I have tried NumPy and pandas and I don't know what I am doing wrong. Honestly, I don't even know what to search for to find the right answer.


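Any help would be greatly appreciated. These files are very large, and I would like to be able to specify the number of columns the stat sheets should fill.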


Thanks in advance if you can help.


跃然一笑
4 Answers

幕布斯7119047

You can try this to get the desired DataFrame:

    import pandas as pd

    with open(r'test1.txt', 'r') as file:
        # records are separated by blank lines; strip() drops a trailing newline
        data = file.read().strip().split('\n\n')
    data = [i.split('\n') for i in data]   # one field per line within a record
    df = pd.DataFrame(data, columns=['Year', 'Name', 'Stamina', 'Agility', 'Str', 'Res'])
    print(df)

Output:

       Year        Name  ...              Str                    Res
    0  2020   Grum Grum  ...  Strength: 20.5%  Resistances: 20-21-30
    1  2020  Mondo Silo  ...  Strength: 10.5%  Resistances: 20-21-20
    2  2020   Grum Grum  ...  Strength: 20.5%  Resistances: 20-21-30
    3  2020  Mondo Silo  ...  Strength: 10.5%  Resistances: 20-21-20

To build a DataFrame from a list of .txt files that have different numbers of lines but the same structure, you can try:

Option 1

    import pandas as pd

    files = ['test1.txt', 'test2.txt']                    # list of files
    df = pd.DataFrame(columns=['Year', 'Name', 'Stamina', 'Agility', 'Str', 'Res'])  # create the dataframe
    for file in files:                                    # we open each file
        with open(r'path_of_files' + file, 'r') as file_r:
            data = file_r.read().strip().split('\n\n')
            data = [i.split('\n') for i in data if i != '']   # get the rows
            print(data)
            s = pd.DataFrame(data, columns=df.columns)
            df = pd.concat([df, s], ignore_index=True)    # we append the new rows to the dataframe
            print(df)
    # write the final dataframe to the output file ('test3.txt'), with '|' as separator
    df.to_csv(r'test3.txt', sep='|', index=False)

Option 2

    import pandas as pd

    files = ['test1.txt', 'test2.txt']                    # list of files
    for file in files:                                    # we open each file
        # 'a' appends to test3.txt, so delete the file before running this again
        with open(r'path_of_files' + file, 'r') as file_r, open(r'test3.txt', 'a') as fout:
            data = file_r.read().strip().split('\n\n')
            data = [i.split('\n') for i in data if i != '']
            # create a dataframe with the data of the current file
            df = pd.DataFrame(data, columns=['Year', 'Name', 'Stamina', 'Agility', 'Str', 'Res'])
            if files.index(file) == 0:
                # first file: keep the default header=True so the column names are written once
                fout.write(df.to_string(index=False))
            else:
                # later files: write the rows without the index and the column names
                fout.write(df.to_string(header=False, index=False))
            fout.write('\n')                              # newline so the next rows start on their own line

Example

For some dummy files such as the ones below (test1.txt, test2.txt), you can see the resulting test3.txt for both options.

test1.txt

    2020
    Grum Grum
    Stamina: 20
    Agility: 23
    Strength: 20.5%
    Resistances: 20-21-30

    2020
    Mondo Silo
    Stamina: 23
    Agility: 13
    Strength: 10.5%
    Resistances: 20-21-20

test2.txt

    2020
    Grum Grum
    Stamina: 20
    Agility: 23
    Strength: 20.5%
    Resistances: 20-21-30

    2020
    Mondo Silo
    Stamina: 23
    Agility: 13
    Strength: 10.5%
    Resistances: 20-21-20

    2020
    Mondo Silo
    Stamina: 23
    Agility: 13
    Strength: 10.5%
    Resistances: 20-21-20

    2020
    Mondo Silo
    Stamina: 23
    Agility: 13
    Strength: 10.5%
    Resistances: 20-21-20

test3.txt (output file) with Option 1

    Year|Name|Stamina|Agility|Str|Res
    2020|Grum Grum|Stamina: 20|Agility: 23|Strength: 20.5%|Resistances: 20-21-30
    2020|Mondo Silo|Stamina: 23|Agility: 13|Strength: 10.5%|Resistances: 20-21-20
    2020|Grum Grum|Stamina: 20|Agility: 23|Strength: 20.5%|Resistances: 20-21-30
    2020|Mondo Silo|Stamina: 23|Agility: 13|Strength: 10.5%|Resistances: 20-21-20
    2020|Mondo Silo|Stamina: 23|Agility: 13|Strength: 10.5%|Resistances: 20-21-20
    2020|Mondo Silo|Stamina: 23|Agility: 13|Strength: 10.5%|Resistances: 20-21-20

test3.txt (output file) with Option 2

     Year        Name      Stamina      Agility              Str                    Res
     2020   Grum Grum  Stamina: 20  Agility: 23  Strength: 20.5%  Resistances: 20-21-30
     2020  Mondo Silo  Stamina: 23  Agility: 13  Strength: 10.5%  Resistances: 20-21-20
     2020   Grum Grum  Stamina: 20  Agility: 23  Strength: 20.5%  Resistances: 20-21-30
     2020  Mondo Silo  Stamina: 23  Agility: 13  Strength: 10.5%  Resistances: 20-21-20
     2020  Mondo Silo  Stamina: 23  Agility: 13  Strength: 10.5%  Resistances: 20-21-20
     2020  Mondo Silo  Stamina: 23  Agility: 13  Strength: 10.5%  Resistances: 20-21-20
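The question also asks for a way to specify how many columns each stat sheet fills (6 lines in some files, 10 in others). The following is a minimal sketch of that idea using only the standard library, not part of the answer above. The helper names `chunk_records` and `to_pipe_table` are hypothetical, and the sketch assumes every record in a given file has the same number of non-empty lines.

```python
import csv
import io


def chunk_records(lines, n_fields):
    """Group the non-empty lines into rows of n_fields values each."""
    cleaned = [line.strip() for line in lines if line.strip()]
    if len(cleaned) % n_fields != 0:
        raise ValueError(
            f"{len(cleaned)} non-empty lines is not a multiple of {n_fields}"
        )
    return [cleaned[i:i + n_fields] for i in range(0, len(cleaned), n_fields)]


def to_pipe_table(rows, header):
    """Render a header and rows as pipe-separated text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="|", lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


# Sample input copied from the question (blank lines are ignored).
sample = """2020

Grum Grum

Stamina: 20

Agility: 23

Strength: 20.5%

Resistances: 20-21-30


2020

Mondo Silo

Stamina: 23

Agility: 13

Strength: 10.5%

Resistances: 20-21-20
"""

rows = chunk_records(sample.splitlines(), n_fields=6)
table = to_pipe_table(rows, ["Year", "Name", "Stamina", "Agility", "Str", "Res"])
print(table)
```

For a file with 10 lines per stat sheet, the same call would take `n_fields=10` and a 10-item header. Because blank lines are dropped before chunking, the sketch does not depend on whether records are separated by one blank line or several.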

POPMUISE

This option fixes the data format before loading it into a DataFrame, so that each column has a header at the top and the data sits in the rows beneath it. It is offered as an alternative that presents the data in a standard tabular format, since there are already other good answers that convert the data into the requested format. From an information storage and retrieval standpoint, this is a standard way to present and store data, and storing it this way makes it easier to retrieve and to visualize with other tools.

[0::6]: list slice that takes every 6th value in the list, starting at index 0
[1::6]: list slice that takes every 6th value in the list, starting at index 1
collections.defaultdict is used to collect the list elements into a dict.
Save the DataFrame to CSV with sep=',' or sep='|'.
Read the file back with df = pd.read_csv('characters.csv', sep='|').

    import pandas as pd
    from collections import defaultdict as dd

    # read the file
    with open('test.txt', 'r') as f:
        # read the text in; results in a list of strings
        text_list = [r.strip() for r in f.readlines() if r.strip()]  # remove all new lines and empty rows

    # add Year: in front of each year number
    years = text_list[0::6]  # create a list of each year
    text_list[0::6] = [f'Year: {f}' for f in years]

    # add Name: in front of each name
    names = text_list[1::6]  # create a list of each name
    text_list[1::6] = [f'Name: {f}' for f in names]

    # split each string at ': '
    text_list = [x.split(': ') for x in text_list]

    # create a dict for each value
    data = dd(list)
    for text in text_list:
        data[text[0]].append(text[1])

    # load data into a dataframe
    df = pd.DataFrame(data)

    # save
    df.to_csv('characters.csv', sep='|', index=False)

Displaying df:

       Year        Name Stamina Agility Strength Resistances
    0  2020   Grum Grum      20      23    20.5%    20-21-30
    1  2020  Mondo Silo      23      13    10.5%    20-21-20

File output:

    Year|Name|Stamina|Agility|Strength|Resistances
    2020|Grum Grum|20|23|20.5%|20-21-30
    2020|Mondo Silo|23|13|10.5%|20-21-20
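To make the stepped slices concrete, here is a small standalone illustration of `[0::6]` and `[1::6]` on the two sample records. The variable `record_size` is introduced here only for illustration; it stands for the number of lines per stat sheet and would be 10 for the longer files mentioned in the question.

```python
# The twelve non-empty lines of the two sample records, in file order.
lines = [
    "2020", "Grum Grum", "Stamina: 20", "Agility: 23",
    "Strength: 20.5%", "Resistances: 20-21-30",
    "2020", "Mondo Silo", "Stamina: 23", "Agility: 13",
    "Strength: 10.5%", "Resistances: 20-21-20",
]

record_size = 6  # assumption: every record has exactly this many lines

years = lines[0::record_size]   # items at index 0, 6, 12, ...
names = lines[1::record_size]   # items at index 1, 7, 13, ...
print(years)  # ['2020', '2020']
print(names)  # ['Grum Grum', 'Mondo Silo']

# Slice assignment writes back to the same positions, which is how the
# answer prefixes each year with a "Year: " label.
lines[0::record_size] = [f"Year: {y}" for y in years]
print(lines[0], "|", lines[6])  # Year: 2020 | Year: 2020
```

Slice assignment with a step requires the replacement list to have exactly as many items as the slice selects, which holds here because the new list is built from `years` itself.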

慕尼黑5688855

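Try this. You can read your txt file as a CSV:

    import pandas as pd

    # error_bad_lines was deprecated in pandas 1.3 and removed in 2.0;
    # on newer versions use on_bad_lines='skip' instead
    file = pd.read_csv('filename.txt', sep=" ", header=None, error_bad_lines=False)

or

    file = pd.read_fwf('filename.txt')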

红颜莎娜

If you keep your text files in the same format, with a blank line between groups, this should work for you:

    import xlsxwriter

    items = []

    # parse through .txt file
    with open('file.txt', 'r') as r:
        text = list(r.read().splitlines())
        # drop the blank lines
        while text.count('') != 0:
            text.remove('')
        x = 0
        # move the lines into items, 6 per record
        while True:
            items.append([])
            for num in range(0, 6):
                items[x].append(text[0])
                text.remove(text[0])
            x += 1
            if len(text) == 0:
                break
        print(items)

    # Starting worksheet
    workbook = xlsxwriter.Workbook('example.xlsx')
    worksheet = workbook.add_worksheet()
    row = 0

    # Writing column titles
    titles = ['Year', 'Name', 'Stamina', 'Agility', 'Str', 'Res']
    for i in range(0, 6):
        worksheet.write(row, i, titles[i])

    # fills in data from parsed .txt file
    x, row = 0, 1
    while True:
        for i in range(0, 6):
            cur = items[x][0]
            worksheet.write(row, i, cur)
            items[x].remove(cur)
        print(items)
        row += 1
        x += 1
        if len(items) == x:
            break

    # Closes workbook
    workbook.close()
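Since the question mentions that the files are very large, a streaming variant of the parsing step may be worth considering. The sketch below is an alternative to the loop above rather than the answerer's method. It reads records lazily with `itertools.islice`, so only one record is held in memory at a time. The generator name `iter_records` is hypothetical, and an in-memory `io.StringIO` stands in for an opened file.

```python
import io
from itertools import islice


def iter_records(handle, n_fields=6):
    """Yield one list of n_fields values per record, reading lazily."""
    non_empty = (line.strip() for line in handle if line.strip())
    while True:
        record = list(islice(non_empty, n_fields))
        if not record:
            return  # input exhausted
        if len(record) < n_fields:
            raise ValueError(f"incomplete record at end of input: {record}")
        yield record


# io.StringIO stands in for open('file.txt'); any iterable of lines works.
text = (
    "2020\nGrum Grum\nStamina: 20\nAgility: 23\n"
    "Strength: 20.5%\nResistances: 20-21-30\n"
    "\n"
    "2020\nMondo Silo\nStamina: 23\nAgility: 13\n"
    "Strength: 10.5%\nResistances: 20-21-20\n"
)
records = list(iter_records(io.StringIO(text), n_fields=6))
for record in records:
    print(" | ".join(record))
```

Each yielded record could be passed straight to a row writer, for example one `worksheet.write` call per field as in the answer above, instead of first building the full `items` list.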