The output looks like this:
WARNING:pdfminer.converter:undefined: <PDFCIDFont: basefont='VIKMFH+MSungHK-Light', cidcoding='Adobe-CNS1'>, 2096
WARNING:pdfminer.converter:undefined: <PDFCIDFont: basefont='VIKMFH+MSungHK-Light', cidcoding='Adobe-CNS1'>, 3237
WARNING:pdfminer.converter:undefined: <PDFCIDFont: basefont='VIKMFH+MSungHK-Light', cidcoding='Adobe-CNS1'>, 884
WARNING:pdfminer.converter:undefined: <PDFCIDFont: basefont='VIKMFH+MSungHK-Light', cidcoding='Adobe-CNS1'>, 1528
WARNING:pdfminer.converter:undefined: <PDFCIDFont: basefont='VIKMFH+MSungHK-Light', cidcoding='Adobe-CNS1'>, 703
WARNING:pdfminer.converter:undefined: <PDFCIDFont: basefont='VIKMFH+MSungHK-Light', cidcoding='Adobe-CNS1'>, 3344
WARNING:pdfminer.converter:undefined: <PDFCIDFont: basefont='VIKMFH+MSungHK-Light', cidcoding='Adobe-CNS1'>, 4177
WARNING:pdfminer.converter:undefined: <PDFCIDFont: basefont='VIKMFH+MSungHK-Light', cidcoding='Adobe-CNS1'>, 1492
WARNING:pdfminer.converter:undefined: <PDFCIDFont: basefont='VIKMFH+MSungHK-Light', cidcoding='Adobe-CNS1'>, 990
WARNING:pdfminer.converter:undefined: <PDFCIDFont: basefont='VIKMFH+MSungHK-Light', cidcoding='Adobe-CNS1'>, 2082
WARNING:pdfminer.converter:undefined: <PDFCIDFont: basefont='VIKMFH+MSungHK-Light', cidcoding='Adobe-CNS1'>, 686
WARNING:pdfminer.converter:undefined: <PDFCIDFont: basefont='VIKMFH+MSungHK-Light', cidcoding='Adobe-CNS1'>, 801
WARNING:pdfminer.converter:undefined: <PDFCIDFont: basefont='VIKMFH+MSungHK-Light', cidcoding='Adobe-CNS1'>, 703
WARNING:pdfminer.converter:undefined: <PDFCIDFont: basefont='VIKMFH+MSungHK-Light', cidcoding='Adobe-CNS1'>, 2096
WARNING:pdfminer.converter:undefined: <PDFCIDFont: basefont='VIKMFH+MSungHK-Light', cidcoding='Adobe-CNS1'>, 3237
WARNING:pdfminer.converter:undefined: <PDFCIDFont: basefont='VIKMFH+MSungHK-Light', cidcoding='Adobe-CNS1'>, 5196
WARNING:pdfminer.converter:undefined: <PDFCIDFont: basefont='VIKMFH+MSungHK-Light', cidcoding='Adobe-CNS1'>, 933
WARNING:pdfminer.converter:undefined: <PDFCIDFont: basefont='VIKMFH+MSungHK-Light', cidcoding='Adobe-CNS1'>, 884
WARNING:pdfminer.converter:undefined: <PDFCIDFont: basefont='VIKMFH+MSungHK-Light', cidcoding='Adobe-CNS1'>, 1528
WARNING:pdfminer.converter:undefined: <PDFCIDFont: basefont='VIKMFH+MSungHK-Light', cidcoding='Adobe-CNS1'>, 1492
WARNING:pdfminer.converter:undefined: <PDFCIDFont: basefont='VIKMFH+MSungHK-Light', cidcoding='Adobe-CNS1'>, 990
WARNING:pdfminer.converter:undefined: <PDFCIDFont: basefont='VIKMFH+MSungHK-Light', cidcoding='Adobe-CNS1'>, 2082
WARNING:pdfminer.converter:undefined: <PDFCIDFont: basefont='VIKMFH+MSungHK-Light', cidcoding='Adobe-CNS1'>, 686
WARNING:pdfminer.converter:undefined: <PDFCIDFont: basefont='VIKMFH+MSungHK-Light', cidcoding='Adobe-CNS1'>, 801
WARNING:pdfminer.converter:undefined: <PDFCIDFont: basefont='VIKMFH+MSungHK-Light', cidcoding='Adobe-CNS1'>, 4033
WARNING:pdfminer.converter:undefined: <PDFCIDFont: basefont='VIKMFH+MSungHK-Light', cidcoding='Adobe-CNS1'>, 841
WARNING:pdfminer.converter:undefined: <PDFCIDFont: basefont='VIKMFH+MSungHK-Light', cidcoding='Adobe-CNS1'>, 686
WARNING:pdfminer.converter:undefined: <PDFCIDFont: basefont='VIKMFH+MSungHK-Light', cidcoding='Adobe-CNS1'>, 1107
WARNING:pdfminer.converter:undefined: <PDFCIDFont: basefont='VIKMFH+MSungHK-Light', cidcoding='Adobe-CNS1'>, 1625
WARNING:pdfminer.converter:undefined: <PDFCIDFont: basefont='VIKMFH+MSungHK-Light', cidcoding='Adobe-CNS1'>, 683
WARNING:pdfminer.converter:undefined: <PDFCIDFont: basefont='VIKMFH+MSungHK-Light', cidcoding='Adobe-CNS1'>, 2201
WARNING:pdfminer.converter:undefined: <PDFCIDFont: basefont='VIKMFH+MSungHK-Light', cidcoding='Adobe-CNS1'>, 3647
WARNING:pdfminer.converter:undefined: <PDFCIDFont: basefont='VIKMFH+MSungHK-Light', cidcoding='Adobe-CNS1'>, 660
WARNING:pdfminer.converter:undefined: <PDFCIDFont: basefont='VIKMFH+MSungHK-Light', cidcoding='Adobe-CNS1'>, 2059
WARNING:pdfminer.converter:undefined: <PDFCIDFont: basefont='VIKMFH+MSungHK-Light', cidcoding='Adobe-CNS1'>, 2986
...
...
WARNING:pdfminer.converter:undefined:
I tried this, and it works.
import logging
logging.Logger.propagate = False
logging.getLogger().setLevel(logging.ERROR)
However, I don't know why!
-------------------------------------------------------------------------------------------------------------------------------------------
It sets the root logger's level to ERROR. This stops PDFMiner's warning logging, since PDFMiner logs to the root logger, but does not stop your own logging.
I needed to set propagation to False, because after PDFMiner usage, I had duplicate logging entries. This was caused by the root logger.
from: http://stackoverflow.com/questions/29762706/warnings-on-pdfminer
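A narrower alternative, as a minimal sketch: instead of raising the root logger's level, raise the level only on the `pdfminer` logger. The `WARNING:pdfminer.converter:` prefix in the output above suggests the messages come from a logger named `pdfminer.converter`, so setting the level on its parent `pdfminer` should cover it. That is an assumption drawn from the log prefix, not from PDFMiner's documentation. The in-memory handler and the logger name `myapp` are for illustration only.

```python
import io
import logging

# Capture log output in memory so the effect is easy to inspect.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(levelname)s:%(name)s:%(message)s"))

root = logging.getLogger()
root.addHandler(handler)
root.setLevel(logging.WARNING)

# Assumption: PDFMiner's loggers live under the "pdfminer" namespace,
# as the "WARNING:pdfminer.converter:" prefix suggests.
logging.getLogger("pdfminer").setLevel(logging.ERROR)

# Simulate a PDFMiner warning: the child logger inherits the ERROR threshold.
logging.getLogger("pdfminer.converter").warning("undefined: <PDFCIDFont ...>, 2096")
# A warning from your own code ("myapp" is a hypothetical name) still gets through.
logging.getLogger("myapp").warning("something worth seeing")

captured = stream.getvalue()
print(captured)
```

With this approach the root logger keeps its normal level, so warnings from other libraries and from your own code are not hidden along with PDFMiner's.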
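Hmm, right, but getting rid of the warnings isn't the goal; the goal is to get the Chinese text to show up. The warnings are gone, but the Chinese text still isn't displayed, so what's the point?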
Reply to 原来我叫小土慕课网给我改了名字:
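I kept working on this and found that PDFs come in two kinds:
1. PDFs generated from text => process with pdfminer3k to convert back to txt
2. PDFs generated from images => process with Tesseract (an OCR library) to convert back to txt
So if the approach in the post above still produces no text,
you can try Tesseract (an OCR library) instead.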
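The fallback described above can be sketched roughly as follows. The two callables are placeholders for whatever text extractor (for example a PDFMiner-based one) and OCR routine (for example a Tesseract-based one) you actually use. Their names, and the "empty text means image PDF" heuristic, are assumptions for illustration and not part of either library.

```python
from typing import Callable


def pdf_to_text(
    path: str,
    extract_text: Callable[[str], str],
    run_ocr: Callable[[str], str],
) -> str:
    """Try normal text extraction first; fall back to OCR if it yields nothing."""
    text = extract_text(path)
    # Heuristic (assumption): an image-only PDF yields no extractable characters.
    if text.strip():
        return text
    return run_ocr(path)


# Stand-in extractors so the sketch runs without any PDF libraries installed.
def fake_text_extractor(path: str) -> str:
    return "hello" if path == "text.pdf" else "  \n"


def fake_ocr(path: str) -> str:
    return "ocr result"


print(pdf_to_text("text.pdf", fake_text_extractor, fake_ocr))
print(pdf_to_text("scan.pdf", fake_text_extractor, fake_ocr))
```

Passing the extractors in as arguments keeps the decision logic separate from the libraries, so either side can be swapped without touching the fallback.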
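In the end I used the following libraries to handle the case where the PDF consists of images:
tesseract (OCR library; the command runs outside Python)
pyocr (Python interface to tesseract)
pillow (the Python 3 fork of the Python Imaging Library, PIL)
imagemagick
wand (Python interface to imagemagick)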