Introduction: OCR With OpenCV and Python

In this Instructable I'm going to show you how to perform Optical Character Recognition (OCR) with Python and OpenCV, using Google's Tesseract engine.

For OpenCV/Python installation, see the link below:

https://www.instructables.com/id/Opencv-and-Python...

For Tesseract, see the link below:

https://github.com/tesseract-ocr/tesseract
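Before moving on, it can help to confirm that the tesseract executable is reachable, because pytesseract is a wrapper that calls it behind the scenes. Here is a minimal sketch using only the standard library. The helper name find_tesseract is hypothetical, and the sketch assumes the binary is called "tesseract" and is on your PATH:

```python
import shutil


def find_tesseract(binary="tesseract"):
    # Hypothetical helper: returns the full path of the executable,
    # or None when nothing with that name is on PATH.
    return shutil.which(binary)


if __name__ == "__main__":
    path = find_tesseract()
    if path is None:
        print("tesseract was not found on PATH; install it from the link above")
    else:
        print("tesseract found at", path)
```

If Tesseract is installed somewhere that is not on PATH, pytesseract lets you point to it by setting pytesseract.pytesseract.tesseract_cmd to the full path of the executable.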

Step 1: Python Code

Save the following script as ocr.py. It combines Python and OpenCV with the Tesseract engine, which is called through the pytesseract wrapper.

from PIL import Image
import pytesseract
import numpy as np
import argparse
import cv2
import os

# parse the arguments
parser = argparse.ArgumentParser()
parser.add_argument("-i", "--image", required=True)
parser.add_argument("-p", "--preprocess", type=str, default="thresh")
args = vars(parser.parse_args())

# load the example image and convert it to grayscale
image = cv2.imread(args["image"])
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

# check which preprocessing step to apply to the image
if args["preprocess"] == "thresh":
    # Otsu thresholding: separate the text from the background
    gray = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY | cv2.THRESH_OTSU)[1]
elif args["preprocess"] == "blur":
    # median blur: remove small specks of noise
    gray = cv2.medianBlur(gray, 3)

# write the grayscale image to disk as a temporary file
filename = "{}.png".format(os.getpid())
cv2.imwrite(filename, gray)

# load the image as a PIL/Pillow image, apply OCR, then delete the temporary file
text = pytesseract.image_to_string(Image.open(filename))
os.remove(filename)

# TODO: additional processing such as spellchecking for OCR errors or NLP
print(text)

# show the original and the preprocessed images
cv2.imshow("Image", image)
cv2.imshow("Output", gray)
cv2.waitKey(0)
cv2.destroyAllWindows()
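In the "thresh" branch, the threshold value passed to cv2.threshold is 0 because the cv2.THRESH_OTSU flag tells OpenCV to choose the threshold automatically from the image histogram. The pure-Python sketch below illustrates the idea behind Otsu's method: pick the threshold that maximizes the variance between the dark and the bright group of pixels. It is only an illustration, not OpenCV's implementation. The function names otsu_threshold and binarize are hypothetical and the sample pixel values are made up:

```python
def otsu_threshold(pixels):
    """Return the threshold (0..255) that maximizes between-class variance.

    Illustration only: `pixels` is a flat list of integer gray values.
    """
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(value * count for value, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w_bg = 0      # number of pixels at or below the candidate threshold
    sum_bg = 0    # sum of their gray values
    for t in range(256):
        w_bg += hist[t]
        sum_bg += t * hist[t]
        if w_bg == 0:
            continue          # no dark pixels yet
        w_fg = total - w_bg
        if w_fg == 0:
            break             # no bright pixels left
        mean_bg = sum_bg / w_bg
        mean_fg = (sum_all - sum_bg) / w_fg
        var_between = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def binarize(pixels, t, max_value=255):
    # Same rule as a binary threshold: values above t become max_value, the rest 0.
    return [max_value if p > t else 0 for p in pixels]


if __name__ == "__main__":
    sample = [10] * 50 + [200] * 50   # made-up dark text on a bright background
    t = otsu_threshold(sample)
    print("threshold:", t)
    print("distinct output values:", sorted(set(binarize(sample, t))))
```

The exact threshold OpenCV reports for a real image may differ from this sketch, but the resulting split into dark and bright pixels follows the same principle.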

Step 2: How to Run the Python Code

Open a terminal in the folder that contains ocr.py.

Run the commands in this sequence:

$ python ocr.py --image image.jpg

(Applying median blur instead of the default Otsu thresholding)

$ python ocr.py --image image.jpg --preprocess blur

The recognized text is printed in the terminal, and the original and preprocessed images open in two windows. Press any key to close them.
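The command-line flags come from the two add_argument calls in Step 1: each option has a short form with one dash (-i, -p) and a long form with two dashes (--image, --preprocess). As a small sketch, the same parser can be exercised on its own, without loading any image, to see how the options are read. The build_parser helper name is hypothetical and introduced only for this illustration:

```python
import argparse


def build_parser():
    # Hypothetical helper: wraps the same two arguments the script defines.
    parser = argparse.ArgumentParser()
    parser.add_argument("-i", "--image", required=True)
    parser.add_argument("-p", "--preprocess", type=str, default="thresh")
    return parser


if __name__ == "__main__":
    parser = build_parser()
    # Long form only: --preprocess falls back to its default value.
    print(vars(parser.parse_args(["--image", "image.jpg"])))
    # Short forms are equivalent to the long ones.
    print(vars(parser.parse_args(["-i", "image.jpg", "-p", "blur"])))
```

Any value other than "thresh" or "blur" for --preprocess is accepted by the parser, but the script then skips both preprocessing branches and runs OCR on the plain grayscale image.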