Question details
From: 5-5 Reading PDF documents with Python (Part 2)

LAParams not defined

#!/usr/bin/python
# -*-coding:utf-8 -*-
from pdfminer.pdfparser import PDFParser, PDFDocument
from pdfminer.pdfinterp import PDFResourceManager, PDFPageInterpreter
from pdfminer.pdfdevice import PDFDevice

fp = open("naacl06-shinyama.pdf",'rb');

parser = PDFParser(fp);
doc = PDFDocument();
parser.set_document(doc);

doc.initialize("");

resource = PDFResourceManager();

laparms = LAParams()

device = PDFPageAggregator(resource,laparms=laparms);
interpreter = PDFPageInterpreter(resource,device);
for page in doc.get_pages():
   interpreter.process_page(page);
   layout = device.get_result();
   for out in layout:
       print(out.get_text())

Asked by: 慕粉4254989 2017-04-10 23:14

Answers

  • qq_起风了_40
    2017-08-25 22:37:45

    Have you got it working now?


  • yaoliguo1990
    2017-07-10 18:22:44

    device = PDFPageAggregator(resource,laparms=laparms);


    In this line of your code, the keyword argument laparms should be spelled laparams.
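
For reference, here is a rough sketch that combines the two fixes suggested in this thread: importing LAParams and PDFPageAggregator, and spelling the keyword laparams. It assumes the older pdfminer / pdfminer3k module layout used in the question. The function names extract_text and iter_text are made up for illustration, and the doc.set_parser(parser) call is an assumption based on that older API (the question's code does not have it). Not every layout object has get_text(), so the helper skips those that do not.

```python
def iter_text(layout):
    """Yield the text of layout items that expose get_text(); skip the rest."""
    for item in layout:
        if hasattr(item, "get_text"):
            yield item.get_text()


def extract_text(path):
    """Return a list of text chunks from the PDF at `path` (hypothetical helper)."""
    # Imports are local so this sketch can be loaded without pdfminer installed.
    # Module paths follow the imports used in this thread (older pdfminer layout).
    from pdfminer.pdfparser import PDFParser, PDFDocument
    from pdfminer.pdfinterp import PDFResourceManager, PDFPageInterpreter
    from pdfminer.layout import LAParams              # fixes "LAParams not defined"
    from pdfminer.converter import PDFPageAggregator  # also missing in the question

    chunks = []
    with open(path, "rb") as fp:
        parser = PDFParser(fp)
        doc = PDFDocument()
        parser.set_document(doc)
        doc.set_parser(parser)  # assumption: needed by the older API, check your version
        doc.initialize("")      # empty password

        resource = PDFResourceManager()
        # The keyword is laparams, not laparms.
        device = PDFPageAggregator(resource, laparams=LAParams())
        interpreter = PDFPageInterpreter(resource, device)

        for page in doc.get_pages():
            interpreter.process_page(page)
            chunks.extend(iter_text(device.get_result()))
    return chunks
```

Calling extract_text("naacl06-shinyama.pdf") should then give the text of each text box in page order, provided the installed pdfminer version matches the imports above.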

  • lapetus
    2017-04-11 12:24:13

    #!/usr/bin/python
    # -*- coding: utf-8 -*-
    from pdfminer.pdfparser import PDFParser, PDFDocument
    from pdfminer.pdfinterp import PDFResourceManager, PDFPageInterpreter
    from pdfminer.pdfdevice import PDFDevice
    from pdfminer.layout import LAParams
    from pdfminer.converter import PDFPageAggregator