If you’re like a lot of developers, you may not have much programming experience, but that doesn’t mean you can’t learn to build an automation your followers will love. You just need the right process!
Here, I share my experience developing an automation that scrapes data from an Aadhar card image using Python together with UiPath.
OBJECTIVE:
The main objective of this development is to extract all the details from the Aadhar card image and then filter out only the data that is required.
TOOLS USED:
· Python 3.6
· UiPath
LIBRARIES USED IN PYTHON:
pytesseract
I used Python-tesseract, an optical character recognition (OCR) tool, to recognize and read the text in the image.
(The image can be digitized, scanned, or captured with a camera.)
openCV
I used OpenCV to load the image as an array structure (a NumPy array).
pillow
Pillow helps me open, manipulate, and save images.
numpy
NumPy is a highly optimized library for numerical operations.
To pull a particular piece of information out of the image data, I used NumPy to calculate its exact location.
PACKAGES USED IN UIPATH:
- UiPath.Python.Activities
STEPS:
-> Take image
-> crop to the box (which has text in it)
-> convert into grayscale (monochrome)
-> give to tesseract
-> text (output of tesseract)
Once this text is available, obtain the meaningful information from it, such as:
-> find the name using name databases
-> find the year of birth
-> find the Aadhar ID (UID)
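The last two lookups can be sketched with regular expressions. This is only a minimal sketch under stated assumptions, not the exact code from the scripts: the function names `find_uid` and `find_year_of_birth` and the sample text are hypothetical, the UID is assumed to be printed as 12 digits in groups of four, and the year of birth is assumed to be a four-digit year that appears before the UID in the OCR text.

```python
import re

# Assumption: the UID is 12 digits on one line, optionally split into
# three groups of four by single spaces.
UID_PATTERN = re.compile(r"\b(\d{4})[ ]?(\d{4})[ ]?(\d{4})\b")
# Assumption: the year of birth is a four-digit year starting with 19 or 20.
YEAR_PATTERN = re.compile(r"\b(?:19|20)\d{2}\b")


def find_uid(text):
    """Return the first 12-digit number in the OCR text, or None."""
    match = UID_PATTERN.search(text)
    return "".join(match.groups()) if match else None


def find_year_of_birth(text):
    """Return the first four-digit year in the OCR text, or None."""
    match = YEAR_PATTERN.search(text)
    return match.group(0) if match else None


# Made-up OCR output, for illustration only.
sample_text = "Name: Test Person\nYear of Birth: 1990\n1234 5678 9012"
print(find_uid(sample_text))
print(find_year_of_birth(sample_text))
```

Real OCR output is noisier than this sample, so the patterns would likely need tuning against actual card images.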
Steps followed in converting the image:
Original Image
The following is a sample image containing some random data.
Grayscale Image
Since the color of the image is not an important feature here, it’s better to convert the image from three color channels to a single grayscale channel.
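The crop, grayscale, and OCR steps could look roughly like the sketch below. It assumes the `opencv-python` and `pytesseract` packages (plus the Tesseract binary) are installed; the helper names `clamp_box` and `image_to_text` and the `(x, y, w, h)` box layout are hypothetical choices for illustration, not the original script.

```python
def clamp_box(box, width, height):
    """Clip an (x, y, w, h) crop box so it stays inside the image."""
    x, y, w, h = box
    x0, y0 = max(0, x), max(0, y)
    x1, y1 = min(width, x + w), min(height, y + h)
    return x0, y0, x1, y1


def image_to_text(image_path, box):
    """Crop the text box, convert it to grayscale, and run Tesseract on it."""
    # Imported here so clamp_box stays usable without the OCR stack installed.
    import cv2
    import pytesseract

    image = cv2.imread(image_path)  # BGR NumPy array, or None if unreadable
    if image is None:
        raise FileNotFoundError(image_path)
    height, width = image.shape[:2]
    x0, y0, x1, y1 = clamp_box(box, width, height)
    cropped = image[y0:y1, x0:x1]  # NumPy slicing: rows first, then columns
    gray = cv2.cvtColor(cropped, cv2.COLOR_BGR2GRAY)  # three channels -> one
    return pytesseract.image_to_string(gray)
```

A call might look like `image_to_text("aadhar_sample.png", (50, 120, 400, 200))`, where both the file name and the box coordinates are placeholders.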
Invoking a Python script from UiPath requires structuring the script a little differently.
For that, I split my Python script into four scripts, “extract text from image”, “extract name”, “extract DOB”, and “extract Aadhar number”, each with a single function and a single return value. UiPath invokes the first file and gets back a Python object, then passes that object as the input when invoking the second file. The remaining files are handled the same way.
Every function takes its input as a parameter and returns its result as the output from Python to UiPath.
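A minimal sketch of what one of these single-function scripts could look like is shown here for the name lookup. The file name, the function name `extract_name`, and the tiny `KNOWN_NAMES` set are all hypothetical; a real name database would be far larger. On the UiPath side, the activities in UiPath.Python.Activities (Python Scope, Load Python Script, Invoke Python Method, Get Python Object) would load the file, call the function by name with the OCR text as its input parameter, and convert the returned object to a .NET string.

```python
# extract_name.py (hypothetical file name): one function, one return value.

# Assumption: a tiny stand-in for a real name database.
KNOWN_NAMES = {"ravi", "priya", "kumar", "anita"}


def extract_name(ocr_text):
    """Return the first OCR line made up only of known names, or ""."""
    for line in ocr_text.splitlines():
        words = line.split()
        if words and all(word.lower() in KNOWN_NAMES for word in words):
            return " ".join(words)
    return ""  # a plain string is simple to convert on the UiPath side
```

Keeping each function to one string in and one string out means the output of one invocation can be passed straight into the next.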
Finally, the Name, DOB, and Aadhar number extracted from the given image are written to a text file.
Here is the Processing Video:

