In this article, we will install Tesseract OCR on our system, verify the installation, and try Tesseract on some sample images.
TL;DR
Time needed: 45 minutes.
In order to get Tesseract OCR up and running, you will need to perform the following steps.
- Install Tesseract OCR on your computer
macOS users, run brew install tesseract.
Linux users, run sudo apt-get install tesseract-ocr
Windows users, consult the Tesseract documentation to install the binary. For detailed steps, continue reading the blog.
- Verify the installation of Tesseract on your machine
Run tesseract -v to verify the installation. If the command prints the version properly, then we are good to go!
- Create a new file named ocr_main.py
Create a new file called ocr_main.py and copy the contents from the detailed blog.
- Run the Python script
Run the script using python ocr_main.py
Detailed Steps
Step One – Installing Tesseract OCR
For macOS users, we’ll be using Homebrew to install Tesseract:
brew install tesseract
If you’re using the Ubuntu operating system, simply use apt-get to install Tesseract OCR:
sudo apt-get install tesseract-ocr
For Windows, please consult the Tesseract documentation.
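The Python script later in this article also needs the pytesseract, OpenCV, and Pillow packages. Assuming you use pip, something along these lines should install them:
pip install pytesseract opencv-python Pillow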
Step Two – Verifying the Installation of Tesseract OCR
To validate that Tesseract has been successfully installed on your machine, execute the following command:
tesseract -v
You should see the Tesseract version printed on your screen, along with a list of image file format libraries Tesseract is compatible with. For example,
tesseract 3.05.01
 leptonica-1.74.1
  libgif 4.1.6(?) : libjpeg 8d (libjpeg-turbo 1.5.0) : libpng 1.6.20 : libtiff 4.0.6 : zlib 1.2.8 : libwebp 0.4.3 : libopenjp2 2.1.0
If the Tesseract version is not displayed (for example, a blank window opens and closes automatically, or you get errors instead), re-install Tesseract, make sure your PATH variable is updated, and try opening the console or IDE you are using with administrative privileges.
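Once the Python packages from Step One are installed, you can also do a quick sanity check from Python. The snippet below is a small sketch using pytesseract's get_tesseract_version() helper; if it raises an error, pytesseract cannot find the Tesseract binary on your PATH:

import pytesseract

# Prints the version of the Tesseract binary that pytesseract found on the PATH
print(pytesseract.get_tesseract_version())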
Step Three – Testing out Tesseract OCR
In order to obtain reasonable results, you need to supply images that are cleanly pre-processed and crisp.
Recommendations:
- Use images with the highest resolution and DPI possible.
- Make sure that the text is clearly visible, with no pixelation or deformation.
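If your source image is small or has a low DPI, upscaling it before running OCR often helps. The snippet below is only a sketch; the 2x factor and the sample1.png / sample1_big.png file names are illustrative, not something the article prescribes:

import cv2

# Enlarge the image 2x with cubic interpolation before handing it to Tesseract
image = cv2.imread("sample1.png")
big = cv2.resize(image, None, fx=2, fy=2, interpolation=cv2.INTER_CUBIC)
cv2.imwrite("sample1_big.png", big)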
The GitHub repository for this tutorial will be available here.
Let’s start coding now:
Create a file named ocr_main.py
(I chose this name; you can call it whatever you want.)
1. Import necessary libraries
import cv2
import pytesseract
from PIL import Image
2. Get the path of the image file we are working on. I’m going to store the path to the file in a variable called path
# Get the image file path from the user
path = input("Enter the file path : ").strip()
3. Load the image data and store it in the variable image
# Load the image
image = cv2.imread(path)
4. Convert the image to grayscale for better recognition of text and store the data in gray
# Convert the image to grayscale
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
5. Optionally pre-process the image: entering 1 applies Otsu thresholding, 2 applies a median blur, and 0 skips pre-processing.
temp = input("Do you want to pre-process the image ?\nThreshold : 1\nGrey : 2\nNone : 0\nEnter your choice : ").strip()

# If the user enters 1, apply Otsu thresholding; if 2, apply a median blur; otherwise do nothing
if temp == "1":
    gray = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY | cv2.THRESH_OTSU)[1]
elif temp == "2":
    gray = cv2.medianBlur(gray, 3)
6. Save the pre-processed temporary file as temp.png
filename = "{}.png".format("temp")
cv2.imwrite(filename, gray)
7. Apply OCR and print the output string.
text = pytesseract.image_to_string(Image.open(filename))
print(text)
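As a side note, pytesseract can usually work on the OpenCV/NumPy array directly, so the temporary file from step 6 is not strictly required. A minimal sketch, assuming your pytesseract version accepts NumPy arrays and reusing the gray array from the steps above:

# Pass the pre-processed NumPy array straight to pytesseract instead of re-reading temp.png
text = pytesseract.image_to_string(gray)
print(text)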
And the final code will be :
import cv2
import pytesseract
from PIL import Image

def main():
    # Get the image file path from the user
    path = input("Enter the file path : ").strip()

    # Load the image
    image = cv2.imread(path)

    # Convert the image to grayscale
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

    temp = input("Do you want to pre-process the image ?\nThreshold : 1\nGrey : 2\nNone : 0\nEnter your choice : ").strip()

    # If the user enters 1, apply Otsu thresholding; if 2, apply a median blur; otherwise do nothing
    if temp == "1":
        gray = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY | cv2.THRESH_OTSU)[1]
    elif temp == "2":
        gray = cv2.medianBlur(gray, 3)

    # Store the grayscale image as a temporary file to apply OCR
    filename = "{}.png".format("temp")
    cv2.imwrite(filename, gray)

    # Load the temporary file as a PIL/Pillow image and apply OCR
    text = pytesseract.image_to_string(Image.open(filename))
    print(text)

try:
    main()
except Exception as e:
    print(e.args)
    print(e.__cause__)
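If the defaults do not work well for a particular image, Tesseract's page segmentation mode can be adjusted through the config argument of image_to_string. The line below, continuing from the script above, is only a sketch; --psm 6 ("assume a single uniform block of text") is one of several modes worth experimenting with, and on older 3.x builds like the one in the sample output the flag is -psm rather than --psm:

# Assume a single uniform block of text (PSM 6); use -psm on Tesseract 3.x
text = pytesseract.image_to_string(Image.open(filename), config="--psm 6")
print(text)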
Step Four – Putting our Code to the Test
Here are some of the sample pictures to test Tesseract.
Before testing out Tesseract, I recommend downloading the GitHub repository from here.
Text in bold represents output and the italic text indicates input.
Let’s try it on the first sample.
Sample 1
python ocr_main.py
Enter the file path : sample1.png
Do you want to pre-process the image ?
Threshold : 1
Grey : 2
None : 0
Enter your choice : 1
You are awesome.
It works well on Sample Image 1, let’s try it on Sample Image 2.
Sample 2
python ocr_main.py
Enter the file path : sample2.png
Do you want to pre-process the image ?
Threshold : 1
Grey : 2
None : 0
Enter your choice : 1
Some italic text.
And finally on the last sample.
Sample 3
python ocr_main.py
Enter the file path : sample3.png
Do you want to pre-process the image ?
Threshold : 1
Grey : 2
None : 0
Enter your choice : 1
Hawdwriting
Thanks for taking the time to read this article. A big thumbs up to you!
If you have any queries regarding this article, I would be glad to help you out. Please let me know in the comments section below 🙂
Comments
I would like to retrieve data from a structured form into an Excel sheet which has 2 columns. The 1st column indicates the name of the field, and the 2nd column indicates the value of the field. How can I do it?
Just split the image containing the data into two parts vertically.
Run OCR on each of the two images and store the results in two different lists, say names and values.
As every record will be separated by a newline character, i.e. '\n', you can split them using names.split("\n") and values.split("\n"). This will give you an array of strings.
Create a new string, say output = "".
Then write some code to take each record simultaneously from both arrays and append it to the output string as output += str(name)+","+str(value).
Create a file buffer. For ease, I recommend using CSV, which you can later open in Excel and save as a new Excel file.
f = open("file.csv", "w+")
Write the output string to the file.
f.write(output)
Close the output stream.
f.close()
Then open this file in Excel and save it as a new Excel file.
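Putting those steps together, a rough sketch of the whole thing could look like this (left.png and right.png are just placeholders for the two halves of your form):

import pytesseract
from PIL import Image

# OCR each half of the form (left half = field names, right half = values)
names = pytesseract.image_to_string(Image.open("left.png")).split("\n")
values = pytesseract.image_to_string(Image.open("right.png")).split("\n")

# Pair the records line by line and build a CSV string
output = ""
for name, value in zip(names, values):
    if name.strip() and value.strip():
        output += str(name) + "," + str(value) + "\n"

# Write the CSV, which can then be opened in Excel and saved as an Excel file
f = open("file.csv", "w+")
f.write(output)
f.close()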
Hope it helps!
Thanks,
Anirudh
In my image I have got a value like 60-70 mg, but OCR converts this as 607is70 mg. Is there a fix for this kind of issue?
The results depend on the quality of the image. Kindly use an image with better resolution and use the pre-processing methods to clean the clutter from the image. Hope this solves the issue.
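If the stray characters mostly corrupt numeric values, another thing worth experimenting with (it is not covered in the article, and support varies with the Tesseract version) is restricting the characters Tesseract may emit via the tessedit_char_whitelist option. A rough sketch, with dose.png as a placeholder file name:

import pytesseract
from PIL import Image

# Only allow digits, the dash, and the letters m/g in the recognised text
text = pytesseract.image_to_string(
    Image.open("dose.png"),
    config="-c tessedit_char_whitelist=0123456789-mg",
)
print(text)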
Nice information bro. i saw few posts…….keep rocking.
Thanks Abraham! This means a lot! ❤️
Just a quick question: how can I use the above model on mobile? Apart from using an API, is there a way to use it on iOS/Android devices?
You may consider using these repositories for more details:
Android: https://github.com/rmtheis/tess-two
iOS: https://github.com/gali8/Tesseract-OCR-iOS
Hope it helps!
What if I want to convert the image to some other colour space, or don't want to convert the image to gray at all? What should I do?
You can try removing this statement from the code:
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
It generates an error:
error : ("OpenCV(4.1.0) /io/opencv/modules/imgproc/src/thresh.cpp:1509: error: (-215:Assertion failed) src.type() == CV_8UC1 in function 'threshold'\n",)
None
Apart from this, I kept that statement as it is and tried changing the value of the threshold, but there is no change. The grayscale makes the image very dark: https://uploads.disquscdn.com/images/3a27939653465fd88149ca20b9bcb59a2e3c45376194637f29362b41a25cd236.png
Hello, the program starts smoothly, but after selecting the pre-process option the following error appears:
(“module ‘pytesseract’ has no attribute ‘image_to_string'”,)
None
can you help ?
getting error as (“module ‘cv2’ has no attribute ‘imread'”,)
cool stuff nice job