Error when running OCR with pytesseract

FileNotFoundError: [WinError 2] The system cannot find the file specified.

During handling of the above exception, another exception occurred:

pytesseract.pytesseract.TesseractNotFoundError: tesseract is not installed or it's not in your path.

I am using PyCharm Community and trying to set up Tesseract for OCR. My code is below:


import cv2
import numpy as np
import pytesseract
from PIL import Image
from pytesseract import image_to_string

# Path of working folder on Disk
src_path = "C:/Users/fsipl/Desktop/"


def get_string(img_path):
    # Read image with opencv
    img = cv2.imread(img_path)

    # Convert to gray
    img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

    # Apply dilation and erosion to remove some noise
    kernel = np.ones((1, 1), np.uint8)
    img = cv2.dilate(img, kernel, iterations=1)
    img = cv2.erode(img, kernel, iterations=1)

    # Write image after removed noise
    cv2.imwrite(src_path + "removed_noise.png", img)

    # Apply threshold to get image with only black and white
    #img = cv2.adaptiveThreshold(img, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY, 31, 2)

    # Write the image after apply opencv to do some ...
    cv2.imwrite(src_path + "thres.png", img)

    # Recognize text with tesseract for python
    result = pytesseract.image_to_string(Image.open(src_path + "thres.png"))

    # Remove template file
    #os.remove(temp)

    return result


print('--- Start recognize text from image ---')
print(get_string(src_path+"word_text.jpg"))

print("------ Done -------")


繁花不似锦
1 answer

Smart猫小萌

Yes, I solved the problem with a one-line change. pytesseract has to be told where the Tesseract executable is:

pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files (x86)\Tesseract-OCR\tesseract.exe'

Write the path as a raw string or with doubled backslashes. In an ordinary string literal, the \t in \tesseract.exe is read as a tab character, so the path no longer points at the executable. Here is the full function:

def get_string(img_path):
    # Read image with opencv
    img = cv2.imread(img_path)

    # Convert to gray
    img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

    # Apply dilation and erosion to remove some noise
    kernel = np.ones((1, 1), np.uint8)
    img = cv2.dilate(img, kernel, iterations=1)
    img = cv2.erode(img, kernel, iterations=1)

    # Write image after removed noise
    cv2.imwrite(src_path + "removed_noise.png", img)

    # Apply threshold to get image with only black and white
    # img = cv2.adaptiveThreshold(img, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY, 31, 2)

    # Write the image after apply opencv to do some ...
    cv2.imwrite(src_path + "thres.png", img)

    # Recognize text with tesseract for python
    pytesseract.pytesseract.tesseract_cmd = 'C:\\Program Files (x86)\\Tesseract-OCR\\tesseract.exe'
    result = pytesseract.image_to_string(Image.open(src_path + "thres.png"))

    # Remove template file
    # os.remove(temp)

    return result
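If you would rather not hard-code the path, something like the sketch below might help. It checks PATH first and then falls back to a short list of install locations. The helper names first_existing and find_tesseract are made up for this example, and the first candidate path is only an assumption about where a 64-bit install may end up; the second is the path from the answer above.

```python
import os
import shutil

# Assumed install locations; adjust to wherever Tesseract actually lives.
CANDIDATE_PATHS = [
    r"C:\Program Files\Tesseract-OCR\tesseract.exe",        # assumption
    r"C:\Program Files (x86)\Tesseract-OCR\tesseract.exe",  # from the answer
]


def first_existing(paths):
    """Return the first path that is an existing file, or None."""
    for path in paths:
        if os.path.isfile(path):
            return path
    return None


def find_tesseract(candidates=CANDIDATE_PATHS):
    """Look for tesseract on PATH first, then in the candidate locations."""
    return shutil.which("tesseract") or first_existing(candidates)


# Usage (needs pytesseract installed, so it is left commented out here):
# import pytesseract
# cmd = find_tesseract()
# if cmd is None:
#     raise RuntimeError("Tesseract executable not found; install it first")
# pytesseract.pytesseract.tesseract_cmd = cmd
```

If find_tesseract() returns None, Tesseract itself is probably not installed; pip install pytesseract only installs the Python wrapper, not the OCR engine.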