Extracting sound annotations from a PDF

I have a script that lists the annotations in a PDF file, taken from "Parse annotations from a pdf":


import popplerqt5
import argparse


def extract(fn):
    doc = popplerqt5.Poppler.Document.load(fn)
    annotations = []
    for i in range(doc.numPages()):
        page = doc.page(i)
        for annot in page.annotations():
            contents = annot.contents()
            if contents:
                annotations.append(contents)
                print(f'page={i + 1} {contents}')

    print(f'{len(annotations)} annotation(s) found')
    return annotations


if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument('fn')
    args = parser.parse_args()
    extract(args.fn)

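But it only works for text annotations. There are many Python libraries, such as Poppler, PyPDF2, and PyMuPDF. I have searched their documentation and source code extensively and, as far as I can tell, none of them can extract the binary data of sound annotations. Do you know of a library that can do this? I need to extract the binary data of these sound annotations and convert it to MP3.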


阿晨1998
Views: 120, Answers: 1

1 Answer

qq_花开花谢_0

The next release of PyMuPDF will support extracting sound annotations. The script below uses PyMuPDF to extract the sound annotations from a PDF. It is easy to use: run the script and pass the PDF file as the first argument:

python script.py myfile.pdf

Note: Windows only.

import os
import subprocess
import sys

import fitz  # PyMuPDF

assert len(sys.argv) == 2, "need filename as parameter"
ifile = sys.argv[1]
doc = fitz.open(ifile)

# Output goes into a new subfolder named after the PDF, next to the PDF.
ofolder = os.path.dirname(ifile)
if ofolder == "":
    ofolder = os.getcwd()
flnm = os.path.splitext(os.path.basename(ifile))[0]
defolder = ofolder + "\\" + flnm
os.mkdir(defolder)
defolder = defolder + "\\" + flnm  # common prefix of every output file

for page in doc:
    print(page)
    annotNumber = 1
    for annot in page.annots(types=[fitz.PDF_ANNOT_SOUND]):
        try:
            # Name as used in this answer; later PyMuPDF releases
            # may expose the same call as get_sound().
            sound = annot.soundGet()
        except Exception as e:
            print(e)
            continue
        # Show the sound properties (only the length of the stream itself).
        for k, v in sound.items():
            print(k, "=", v if k != "stream" else len(v))

        # Dump the raw audio stream of this annotation.
        ofile = defolder + ".page." + str(page.number) + ".annot." + str(annotNumber) + ".raw"
        with open(ofile, "wb") as fout:
            fout.write(sound["stream"])
        ofileffmpeg = defolder + ".page." + str(page.number) + ".annot." + str(annotNumber) + ".mp3"
        annotNumber += 1

        # Describe the raw samples to ffmpeg: channels, signedness, sample size.
        channels = str(sound["channels"]) if "channels" in sound else "1"
        encoding = "s" if sound.get("encoding") == "Signed" else "u"
        if "bps" in sound and int(sound["bps"]) > 8:
            fmt = encoding + str(sound["bps"]) + "be"
        else:
            # ffmpeg's 8-bit raw formats (u8, s8) take no byte-order suffix.
            fmt = encoding + "8"
        subprocess.call(['ffmpeg', '-hide_banner', '-f', fmt, '-ar', str(sound["rate"]), '-ac', channels, '-i', str(ofile), str(ofileffmpeg)], shell=True)
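The part of the script above that turns the sound properties into ffmpeg arguments can be isolated into two small helpers, which makes it easier to check without a PDF or ffmpeg at hand. This is only a sketch and not part of the original answer: the names raw_format and ffmpeg_command are hypothetical, and it assumes the sound dictionary carries the same keys the script reads ("rate", "channels", "bps", "encoding").

```python
def raw_format(sound):
    """Return an ffmpeg raw PCM format name such as 's16be' or 'u8'.

    Assumes `sound` is a dict shaped like the one the script above reads.
    """
    sign = "s" if sound.get("encoding") == "Signed" else "u"
    bps = int(sound.get("bps", 8))
    # 8-bit samples have no byte order; wider samples are treated as
    # big-endian, following the script above.
    return f"{sign}{bps}" if bps <= 8 else f"{sign}{bps}be"


def ffmpeg_command(sound, raw_path, mp3_path):
    """Build the ffmpeg argument list that converts the raw dump to MP3."""
    return [
        "ffmpeg", "-hide_banner",
        "-f", raw_format(sound),
        "-ar", str(sound["rate"]),
        "-ac", str(sound.get("channels", 1)),
        "-i", str(raw_path),
        str(mp3_path),
    ]
```

The returned list could be handed to subprocess.run(cmd, check=True) without shell=True, which should remove one of the Windows-specific parts of the script. That has not been tested in the original answer's environment.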