
Match text between two text files, take information from the first file, and append it to the other

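I have two text files. The first contains the text together with its metadata (font size and so on); the second contains only the text. I need to match the text between the two files, take the metadata from the first file, and add it to the second file. For example: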


File A data:


[Base Font : PSJEPX+Muli-Light, Font Size : 7.5, Font Weight : 300.0]We are not satisfied with our 2018 results. We have the global footprint, assets and team to 

[Base Font : SVTVFR+Muli-Light, Font Size : 7.5, Font Weight : 300.0] 


[Base Font : PSJEPX+Muli-Light, Font Size : 7.5, Font Weight : 300.0]perform better. We have made a number of changes to position for sustainable growth.

New line that does not start with square brackets.

[Base Font : SVTVFR+Muli-SemiBold, Font Size : 8.1, Font Weight : 600.0]Innovation

File B data:


We are not satisfied with our 2018 results. We have the global footprint, assets and team to perform better. We have made a number of changes to position for sustainable growth.

New line that does not start with square brackets.


Innovation

Expected output:


[Base Font : PSJEPX+Muli-Light, Font Size : 7.5, Font Weight : 300.0]We are not satisfied with our 2018 results. We have the global footprint, assets and team to perform better. We have made a number of changes to position for sustainable growth.

New line that does not start with square brackets.


[Base Font : SVTVFR+Muli-SemiBold, Font Size : 8.1, Font Weight : 600.0]Innovation

So, basically, the metadata from "File A" must be attached to "File B" only where the metadata changes.


My approach:


def readB(x):
    # Print x if it occurs in any line of file B.
    with open("fileb.txt") as resultFile:
        for line in resultFile:
            if x in line:
                print(x)


def readA():
    # Try every line of file A against file B.
    with open("filea.txt") as bondNumberFile:
        for line in bondNumberFile:
            readB(line)


readA()

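My problem is that I am not sure how to take the metadata from File A and attach it to File B. In addition, my code cannot cope with the metadata (the part inside the square brackets) when matching the text.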


温温酱
Views: 118, Answers: 1

1 answer

摇曳的蔷薇

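Try the program below. It first reads filea and builds a dictionary that maps each style (the bracketed metadata) to the text of the lines carrying it. It then reads fileb line by line, picks the matching style from the dictionary, and writes the result to filec.

import re

table = {}
with open("filea.txt", "r") as f:
    for line in f:
        if line.strip():
            # Split the line into an optional [metadata] prefix and the remaining text.
            parts = re.findall(r"^(\[.*?\])?(.*)$", line)[0]
            # Concatenate all text that shares the same metadata.
            if parts[0] in table:
                table[parts[0]] += parts[1]
            else:
                table[parts[0]] = parts[1]

with open("fileb.txt", "r") as f, open("filec.txt", "w") as f1:
    for line in f:
        if line.strip():
            # Prefix the line with the first style whose text contains it.
            for i in table:
                if line.strip() in table[i]:
                    f1.write(i + line)
                    break
        else:
            # Keep blank lines as they are.
            f1.write(line)

Output

[Base Font : PSJEPX+Muli-Light, Font Size : 7.5, Font Weight : 300.0]We are not satisfied with our 2018 results. We have the global footprint, assets and team to perform better. We have made a number of changes to position for sustainable growth.

New line that does not start with square brackets.

[Base Font : SVTVFR+Muli-SemiBold, Font Size : 8.1, Font Weight : 600.0]Innovation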
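If the requirement "only where the metadata changes" is read strictly, consecutive lines of File B that share a style should carry the prefix only once. The following is a minimal sketch of that reading, not a definitive implementation. It works on lists of lines instead of files so it is easy to try out, and the names `LINE_RE`, `build_style_table` and `annotate` are made up for this illustration. It also assumes that a line of File B which matches nothing in File A should be kept without a prefix.

```python
import re

# Optional leading "[...]" metadata block, followed by the rest of the line.
LINE_RE = re.compile(r"^(\[.*?\])?(.*)$")


def build_style_table(meta_lines):
    """Map each metadata prefix to the concatenated text that carries it."""
    table = {}
    for raw in meta_lines:
        line = raw.rstrip("\n")
        if not line.strip():
            continue  # skip blank lines
        style, text = LINE_RE.match(line).groups()
        style = style or ""  # lines without brackets get the empty style
        table[style] = table.get(style, "") + text
    return table


def annotate(meta_lines, text_lines):
    """Prefix each text line with its style, but only when the style changes."""
    table = build_style_table(meta_lines)
    previous = None  # last non-empty style that was seen
    result = []
    for raw in text_lines:
        line = raw.rstrip("\n")
        needle = line.strip()
        if not needle:
            result.append(line)  # keep blank lines as they are
            continue
        # First style whose text contains this line; "" if nothing matches.
        style = next((s for s, t in table.items() if needle in t), "")
        prefix = style if style and style != previous else ""
        if style:
            previous = style
        result.append(prefix + line)
    return result


if __name__ == "__main__":
    file_a = [
        "[Font : Light]Hello wor\n",
        "[Font : Light]ld again.\n",
        "Plain line.\n",
        "[Font : Bold]Title\n",
    ]
    file_b = ["Hello world again.\n", "Plain line.\n", "Title\n", "Title\n"]
    for out_line in annotate(file_a, file_b):
        print(out_line)
```

To use it with the files from the question, the two lists can be read with `open("filea.txt").readlines()` and `open("fileb.txt").readlines()`, and the returned lines written to filec.txt joined by newlines. Like the program above, it relies on substring matching, so a short line of File B may match the wrong style if the same words occur in an earlier style's text.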