Optical Character Recognition (OCR) is a technology that identifies text within images, such as scanned documents and pictures. OCR is used to turn virtually any image of written text (typed, handwritten, or printed) into machine-readable text data.
OCR technology has proved remarkably useful in digitizing historic newspapers by transforming them into completely searchable formats.
OCR makes it simpler and quicker to access those older texts, and it lets users edit them in word processors like Microsoft Word or Google Docs.
How it works
How is this really accomplished? Text in a picture is easy for us to perceive: we can detect the characters and read the text. To a computer, though, it’s just a set of dots on a screen.
The picture is first scanned and the elements of text and graphics are transformed into a bitmap, which is basically a black and white dot matrix. To improve the accuracy of the operation, the image is then pre-processed by adjusting its brightness and contrast.
The image is then divided into zones that identify areas of interest, such as where the images or text are, which helps to start the extraction process. The text-containing areas are further broken down into lines, words, and characters, and the software matches the characters using comparison and detection algorithms. The result is the extracted text, which can later be converted into other formats such as Word documents, PDFs, or even audio content through text-to-speech technologies.
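The first two stages described above, turning pixels into a bitmap and splitting the text area into lines, can be sketched in a few lines of plain Python. This is only an illustration under simplifying assumptions: the image is a small list of grayscale values, the threshold of 128 is arbitrary, and the function names are made up for this example. Real OCR engines use far more robust methods.

```python
from typing import List, Tuple

Gray = List[List[int]]  # rows of 0-255 grayscale pixel values


def binarize(image: Gray, threshold: int = 128) -> List[List[int]]:
    """Turn grayscale pixels into a bitmap: 1 = ink (dark), 0 = background."""
    return [[1 if px < threshold else 0 for px in row] for row in image]


def find_text_lines(bitmap: List[List[int]]) -> List[Tuple[int, int]]:
    """Return (first_row, last_row) for each run of consecutive rows containing ink."""
    lines = []
    start = None
    for y, row in enumerate(bitmap):
        has_ink = any(row)
        if has_ink and start is None:
            start = y  # a new text line begins
        elif not has_ink and start is not None:
            lines.append((start, y - 1))  # a blank row ends the line
            start = None
    if start is not None:
        lines.append((start, len(bitmap) - 1))
    return lines


# A tiny 6x4 "page": two dark strokes separated by a blank row.
page = [
    [255, 255, 255, 255],
    [255, 20, 30, 255],
    [255, 25, 255, 255],
    [255, 255, 255, 255],
    [255, 10, 10, 10],
    [255, 255, 255, 255],
]
bitmap = binarize(page)
text_lines = find_text_lines(bitmap)
print(text_lines)  # prints [(1, 2), (4, 4)]
```

The same row-by-row scan, applied column-wise inside each line, is one simple way to separate words and characters before they are matched.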
The system may not be 100% accurate and may require human intervention to correct any elements that have not been correctly scanned. You can also fix errors using a dictionary or even Natural Language Processing (NLP).
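As a rough illustration of dictionary-based correction, the sketch below replaces each unrecognized token with its closest dictionary word using Python’s standard difflib module. The vocabulary, the sample tokens, and the 0.75 similarity cutoff are assumptions chosen for this example, not values from any particular OCR product.

```python
import difflib
from typing import Iterable, List


def correct_tokens(tokens: Iterable[str], vocabulary: List[str],
                   cutoff: float = 0.75) -> List[str]:
    """Replace tokens that are not in the vocabulary with their closest match."""
    corrected = []
    for token in tokens:
        word = token.lower()
        if word in vocabulary:
            corrected.append(token)  # already a known word
            continue
        # get_close_matches returns the best candidates at or above the cutoff.
        matches = difflib.get_close_matches(word, vocabulary, n=1, cutoff=cutoff)
        corrected.append(matches[0] if matches else token)
    return corrected


# Hypothetical vocabulary and OCR output ("l" read for "i", "1" read for "l").
vocabulary = ["invoice", "total", "amount", "date"]
ocr_output = ["lnvoice", "tota1", "amount", "2021"]
fixed = correct_tokens(ocr_output, vocabulary)
print(fixed)  # prints ['invoice', 'total', 'amount', '2021']
```

Tokens with no close match, such as numbers, are left untouched so that the correction step does not invent text.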
What is the need for businesses to use OCR software?
Today, many sectors use OCR software to fix processes that were previously held back by problems such as unusable data, inaccuracy, and failure. By revolutionizing data and storage systems, OCR technology has helped industries such as healthcare, human resources, banking, and insurance in many ways, including digitizing and sharing files and avoiding common user errors.
The modern office worker also takes for granted the process of making a scanned file searchable, since OCR today is embedded in many applications, websites, and content management systems. Back in the early 2000s, OCR was a costly form of software reserved for imaging service bureaus and Fortune 500 businesses.
For almost any company, storing information is important, but can you imagine how much it would benefit public services and government agencies?
It’s much quicker to retrieve invoices when the technology is at hand. And the benefits don’t stop there: almost any industry you can think of can benefit greatly from OCR.
5 Benefits of using OCR
- Better Search, accessibility, and usability of data
- Time and Storage Saving
- Improves Customer Satisfaction
- Top-grade Translation
- Enhancing Security
OCR in Flutter
In this blog we will explore how OCR in Flutter recognizes the text.
There are two ways we can interpret text depending on whether we want to perform the character recognition using a live camera or from an image.
In the previous blog, we looked at doing OCR from the device camera using the flutter_mobile_vision plugin.
In this blog, we will look at performing character recognition on images and extracting the characters from the image.
We will use Python packages to achieve this by defining a Django API and calling it from the Flutter mobile app.
There are a few options in Python that can be used for OCR, including pytesseract and pyocr, often alongside cv2 (OpenCV) for image processing. In this blog, we will show the different options we tried.
Implementation
Let’s see how we implement the OCR functionality on an image using Flutter and Python.
Setting up Python Django API
Download the prerequisite Tesseract engine, which is used by all the OCR packages.
Download the tesseract-ocr engine from
https://github.com/tesseract-ocr/tesseract
(OR)
pip install tesseract-ocr
(OR)
https://github.com/EisenVault/install-tesseract-redhat-centos/blob/master/install-tesseract.sh
Setting up this engine requires a number of prerequisite steps and dependencies which are listed out based on the Operating systems.
Follow the link here to set up Tesseract based on your OS.
This is a very important step; without it, your OCR will not work as expected.
You need to set up the environment variable for the path pointing to your Tesseract engine location.
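One quick way to confirm that the engine is reachable is to look it up on the PATH from Python. The sketch below uses only the standard library; the helper name find_tesseract is made up for this example, and it assumes the executable is called tesseract, which is the usual name.

```python
import shutil
from typing import Optional


def find_tesseract(binary_name: str = "tesseract") -> Optional[str]:
    """Return the full path of the Tesseract executable, or None if it is not on PATH."""
    return shutil.which(binary_name)


engine_path = find_tesseract()
if engine_path is None:
    print("Tesseract was not found on PATH; check the installation and the environment variable.")
else:
    print(f"Tesseract found at {engine_path}")
```

If the engine is installed somewhere that is not on the PATH, pytesseract also lets you point at it directly by assigning the full path to pytesseract.pytesseract.tesseract_cmd.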
Installation of Dependencies
Let’s create and activate the Python virtual environment using the below commands:
    python3 -m venv env
    source env/bin/activate
Next we will install the django related dependencies:
    pip install django
    pip install djangorestframework
Now we will install the two Python packages that we want to try out in this blog.
    pip install pytesseract
    pip install pyocr
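Next we will create a project called restapi for our example, with an app called restapp inside it:
    django-admin startproject restapi
    cd restapi
    python manage.py startapp restapp
    # startapp does not create the api subfolder, so create it before entering it
    mkdir -p restapp/api
    cd restapp/api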
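In the settings.py file, add '10.0.2.2' to the ALLOWED_HOSTS list if you are using an Android emulator to test your Flutter app. The emulator reaches the host machine at http://10.0.2.2:8000, and ALLOWED_HOSTS takes host names only, without the scheme or port. This ensures your API is accessible from the emulator app.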
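For reference, the relevant part of settings.py might look like the fragment below. Treat it as a sketch: the localhost entries are an assumption for local testing, and registering rest_framework and restapp in INSTALLED_APPS is the usual Django REST framework setup rather than something shown in the original project.

```python
# settings.py (fragment): a sketch, adjust to your own project.

# Hosts the development server will answer to. Django expects bare host
# names here, without the scheme or port. 10.0.2.2 is the address the
# Android emulator uses to reach the host machine.
ALLOWED_HOSTS = ["localhost", "127.0.0.1", "10.0.2.2"]

INSTALLED_APPS = [
    # Django defaults
    "django.contrib.admin",
    "django.contrib.auth",
    "django.contrib.contenttypes",
    "django.contrib.sessions",
    "django.contrib.messages",
    "django.contrib.staticfiles",
    # Added for this example (assumed, not shown in the original project)
    "rest_framework",
    "restapp",
]
```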
In the project-level urls.py file, add a URL that redirects to the app-level urls.py file, as shown:
    path('api/', include('restapp.api.urls')),
In the app-level urls.py, add a URL that maps to a view in the views.py file.
    path('upload/', FileView.as_view(), name='file-upload')
In the views.py file within the /restapi/restapp/api/ folder, import the following packages, which cover permissions, CSRF exemption, and the API view classes:
    from rest_framework import permissions  # permission classes
    from django.views.decorators.csrf import csrf_exempt  # exempts a view from CSRF checks
    from PIL import Image  # Pillow's Image module, used to open the uploaded file
    import sys

    from rest_framework.views import APIView  # generic API view base class
    from rest_framework.response import Response  # response object returned by the view

    from restapp.models import File
    from restapp.api.serializers import FileSerializer
The following code extracts the text from a given image and returns it in the response so it can be displayed on the screen. Below are two variations of the code, one for each package.
Using Pytesseract
    import pytesseract  # Python wrapper around the Tesseract engine

    # Inside the view method that handles the upload:
    img = Image.open(attached_file)  # load the uploaded image
    result = pytesseract.image_to_string(img, lang='eng')  # extract English text
    return Response({"result": result}, status=200)
Using Pyocr
    import pyocr  # wrapper that discovers the OCR tools installed on the system
    import pyocr.builders

    # Inside the view method that handles the upload:
    tools = pyocr.get_available_tools()
    if len(tools) == 0:
        print('no tool found')
        # Stop here: without an OCR tool, tools[0] below would raise an IndexError.
        return Response({"error": "no OCR tool found"}, status=500)
    tool = tools[0]
    langs = tool.get_available_languages()
    lang = langs[0]  # first installed language; set 'eng' explicitly if needed
    img = Image.open(attached_file)  # load the uploaded image
    txt = tool.image_to_string(img, lang=lang,
                               builder=pyocr.builders.TextBuilder())
    return Response({"result": txt}, status=200)
While trying out the 2 options, we found pyocr to be faster and more accurate with respect to extraction.
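For readers who want to repeat this kind of comparison on their own images, here is one possible way to measure it. This is a sketch, not the procedure used for the result above: the evaluate helper, the stand-in engines, and the use of difflib’s similarity ratio as an accuracy score are all assumptions made for illustration. In practice you would pass functions that call pytesseract or pyocr on a real image with known text.

```python
import difflib
import time
from typing import Callable, Dict


def evaluate(ocr: Callable[[str], str], image_path: str,
             expected_text: str) -> Dict[str, float]:
    """Time one OCR call and score its output against the known text (1.0 = identical)."""
    start = time.perf_counter()
    text = ocr(image_path)
    elapsed = time.perf_counter() - start
    similarity = difflib.SequenceMatcher(None, expected_text, text).ratio()
    return {"seconds": elapsed, "similarity": similarity}


# Stand-in engines so the sketch runs without Tesseract installed.
def fake_engine_a(image_path: str) -> str:
    return "Hello World"


def fake_engine_b(image_path: str) -> str:
    return "He1lo Wor1d"  # simulates "l" misread as "1"


expected = "Hello World"
results = {
    name: evaluate(engine, "sample.png", expected)
    for name, engine in [("engine_a", fake_engine_a), ("engine_b", fake_engine_b)]
}
for name, scores in results.items():
    print(name, scores)
```

Running the same images through both packages several times and averaging the numbers gives a fairer picture than a single call.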
To start the Django application, run:
    python manage.py makemigrations
    python manage.py migrate
    python manage.py runserver
Now the application will be running on http://localhost:8000, the default port for the Django development server.
Flutter Application:
Create a new flutter app and include a camera widget to pick images using image picker.
Step:1
Add the following dependencies to your pubspec.yaml file
    image_picker: ^0.6.7+11
    http: ^0.12.2
Next our app will have the following code to use the image_picker to pick the image from the device gallery.
    final pickedFile = await picker.getImage(source: ImageSource.gallery);
    setState(() {
      if (pickedFile != null) {
        file = File(pickedFile.path);
      } else {
        print('No image selected');
      }
    });
Note: If no image is selected the application will prompt you to choose an image to perform OCR.
Step:2
Then, send the picked image to the Python API URL with an HTTP POST request and read the response from that call.
As mentioned earlier, to run on an emulator the app should use the base URL http://10.0.2.2:8000, and the host 10.0.2.2 must be listed in ALLOWED_HOSTS in the Django settings.
Below is the code to submit a post request to the api in the flutter app and receive the response text which is displayed finally.
    // uploadImage is a placeholder name; the original function signature is not shown.
    Future<void> uploadImage() async {
      try {
        // Must match the Django routes defined earlier: 'api/' + 'upload/'.
        var postUri = Uri.parse('$baseUrl/api/upload/');
        http.MultipartRequest request = http.MultipartRequest("POST", postUri);
        http.MultipartFile multipartFile =
            await http.MultipartFile.fromPath("attached_file", file.path);
        request.files.add(multipartFile);
        http.StreamedResponse response = await request.send();
        print(response.statusCode);
        response.stream.bytesToString().then((value) => print(value));
      } catch (e) {
        // Catch exceptions (such as network failures) as well as errors.
        print(e);
      }
    }
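To check the endpoint without the Flutter app, the same multipart request can be reproduced from Python using only the standard library. The sketch below is an illustration under a few assumptions: the server is running locally on port 8000, the route is api/upload/ as defined in the Django URLs above, the form field is named attached_file as in the Flutter code, and the helper names are made up for this example.

```python
import os
import urllib.request
import uuid
from typing import Tuple


def build_multipart(field: str, filename: str, data: bytes,
                    content_type: str = "image/png") -> Tuple[bytes, str]:
    """Build a multipart/form-data body with one file part; return (body, Content-Type header)."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + data + tail, f"multipart/form-data; boundary={boundary}"


def upload(url: str, path: str) -> str:
    """POST the image at `path` to `url` and return the response body as text."""
    with open(path, "rb") as fh:
        body, content_type = build_multipart(
            "attached_file", os.path.basename(path), fh.read())
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": content_type}, method="POST")
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")


# Building the body needs no network access:
body, header = build_multipart("attached_file", "sample.png", b"\x89PNG")
print(header.startswith("multipart/form-data; boundary="))  # prints True

# Example call, assuming the server is running and sample.png exists:
# print(upload("http://localhost:8000/api/upload/", "sample.png"))
```

If this call returns the extracted text but the app does not, the problem is more likely in the emulator networking or the request code than in the OCR step.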
The following screenshots show how OCR works and identifies text from the image you upload.
Select the image on which you want to perform the OCR by clicking the Choose Image button. Once you choose the image click on the Upload Image button.
Clicking on the Upload Image button will send the image to undergo the process of OCR. Once the OCR is performed and the text is extracted from the image uploaded, the extracted text is displayed on your screen as shown below.
You can find the complete code for this example here.
Conclusion
In this blog, we saw how we could extract text from images in a Flutter application using the pytesseract and pyocr Python packages.
Today, as the need for online documentation rises, the use of OCR is rising too, and it is widely used across many sectors. The need for ID verification services is one of the main reasons for the rising demand for OCR technology, as it saves a lot of time by reading credentials from an official document. By using technologies like these, you can harness the power of OCR to perform the conversion process; keep in mind that the quality and accuracy of the output depend on the quality of the image. Over the last few years, the global OCR systems market has seen tremendous growth, and market analysts and studies say that this market will hit new heights in the coming years.