Automation World: January 2020

If you’re like a lot of developers, you might not have much programming experience, but that doesn’t mean you can’t learn to build an automation your followers will love. You just need the right process!

Here, I share my experience developing an automation that scrapes data from an Aadhar card image using Python and UiPath together.

OBJECTIVE:

The main objective of this development is to extract all details from the Aadhar card and filter the required data.

TOOLS USED:

· Python 3.6

· UiPath

LIBRARIES USED IN PYTHON:

pytesseract

I used Python-tesseract, an OCR tool, to recognize and “read” the text in the image.

(The image may be digitized, scanned, or captured with a camera.)

OpenCV

I used OpenCV to load the image as an array structure; in Python, OpenCV represents images as NumPy arrays.

pillow

Pillow helps me to open, manipulate, and save the images.

numpy

NumPy is a highly optimized library for numerical operations. To pull a particular piece of information out of the data, I used NumPy to calculate its exact location.

PACKAGES USED IN UIPATH:

UiPath.Python.Activities

STEPS:

-> Take image

-> crop to the box (which has text in it)

-> convert into grayscale (monochrome)

-> give to tesseract

-> text (output of tesseract)
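The steps above can be sketched in a few lines of Python. This is a minimal sketch, not the exact script used in this project: the function names `clamp_box` and `image_to_text` and the `(x, y, w, h)` box layout are assumptions for illustration, and it assumes the Tesseract engine is installed alongside the `pytesseract` and `opencv-python` packages.

```python
def clamp_box(box, width, height):
    """Clip an (x, y, w, h) crop box so it stays inside the image."""
    x, y, w, h = box
    x0, y0 = max(0, x), max(0, y)
    x1, y1 = min(width, x + w), min(height, y + h)
    if x1 <= x0 or y1 <= y0:
        raise ValueError("crop box lies outside the image")
    return x0, y0, x1, y1


def image_to_text(image_path, box=None):
    """Take image -> crop -> grayscale -> Tesseract -> text."""
    # Imported here so clamp_box can be used without the OCR stack installed.
    import cv2
    import pytesseract

    image = cv2.imread(image_path)  # returns None if the file cannot be read
    if image is None:
        raise FileNotFoundError(image_path)

    if box is not None:
        height, width = image.shape[:2]
        x0, y0, x1, y1 = clamp_box(box, width, height)
        image = image[y0:y1, x0:x1]  # NumPy slicing: rows first, then columns

    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    return pytesseract.image_to_string(gray)
```

For example, `image_to_text("card.jpg", box=(40, 120, 600, 300))` (a hypothetical file name and box) would return the raw OCR text for that region of the image.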

Once this text has been processed, meaningful information can be obtained from it, such as:

-> find name using name databases

-> find the year of birth

-> find the Aadhar ID (UID)
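A minimal sketch of these three lookups, using only regular expressions and a small set of names, might look like this. It relies on the fact that an Aadhar number has 12 digits, usually printed as three groups of four. Everything else is an assumption for illustration: the function names, the sample text, and `KNOWN_NAMES`, which is a hypothetical stand-in for a real name database.

```python
import re

# 12 digits, optionally printed as three space-separated groups of four.
UID_PATTERN = re.compile(r"\b\d{4} ?\d{4} ?\d{4}\b")
# A plausible four-digit year (1900-2099).
YEAR_PATTERN = re.compile(r"\b(?:19|20)\d{2}\b")


def find_uid(text):
    """Return the first 12-digit number in the text as a plain digit string."""
    match = UID_PATTERN.search(text)
    return match.group(0).replace(" ", "") if match else None


def find_year_of_birth(text):
    """Return the first four-digit year that is not part of the UID."""
    # Blank out the UID first so one of its groups is not mistaken for a year.
    without_uid = UID_PATTERN.sub(" ", text)
    match = YEAR_PATTERN.search(without_uid)
    return int(match.group(0)) if match else None


def find_name(text, known_names):
    """Return the first line containing a word from the name database."""
    for line in text.splitlines():
        words = line.split()
        if any(word.lower() in known_names for word in words):
            return " ".join(words)
    return None


# Made-up sample text standing in for Tesseract output.
SAMPLE_TEXT = """Government of India
Asha Kumar
Year of Birth : 1990
Female
1234 5678 9012"""

# Hypothetical stand-in for a real name database (lowercase entries).
KNOWN_NAMES = {"asha", "kumar"}

print(find_name(SAMPLE_TEXT, KNOWN_NAMES))
print(find_year_of_birth(SAMPLE_TEXT))
print(find_uid(SAMPLE_TEXT))
```

Real OCR output is noisier than this sample, so patterns like these usually need tuning against actual scans.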

Steps followed in converting the image:

Original Image

The following is a sample image containing some random data.

Grayscale Image

Since the color of the image is not an important feature, it’s better to convert the image from three color channels to one color channel, preferably grayscale.
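To make the three-channels-to-one idea concrete, here is a small sketch of the weighted sum commonly used for grayscale conversion, with the 0.299/0.587/0.114 weights that OpenCV documents for its RGB-to-gray conversion. In practice a single call such as `cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)` does this work; the pure-Python function `bgr_to_gray` below is a hypothetical illustration, not the code used in this project.

```python
def bgr_to_gray(pixels):
    """Collapse rows of (blue, green, red) pixels into single gray values.

    Channel order is blue, green, red because that is how OpenCV stores
    color images.
    """
    return [
        [round(0.114 * b + 0.587 * g + 0.299 * r) for (b, g, r) in row]
        for row in pixels
    ]


# A tiny 2x2 "image": white, black, pure red, pure blue.
tiny_image = [
    [(255, 255, 255), (0, 0, 0)],
    [(0, 0, 255), (255, 0, 0)],
]
gray_image = bgr_to_gray(tiny_image)
```

Each pixel now carries one brightness value instead of three color values, which is all the OCR step needs.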

To be invoked from UiPath, the Python script needs a slightly different structure.

For that, I split my Python script into four scripts, “extract text from image”, “extract name”, “extract DOB”, and “extract Aadhar number”, each with a single function and a single return value. UiPath invokes the first file and gets back a Python object, then passes that object as input when invoking the second file. The remaining scripts are handled the same way.

Every script takes its input as a parameter of its function, and the function’s return value is the output passed from Python back to UiPath.
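As a rough sketch of that shape, one of the four scripts might look like the following. The file name `extract_dob.py`, the function name, and the dd/mm/yyyy pattern are assumptions for illustration; the original scripts are not shown in this post. Returning an empty string rather than `None` when nothing is found is also just one possible choice, made here so the caller always gets text back.

```python
# extract_dob.py (hypothetical file name)
# One function, one input parameter, one return value, so the caller
# (here UiPath) can invoke it by name and read back a single result.
import re

DOB_PATTERN = re.compile(r"\b\d{2}/\d{2}/\d{4}\b")


def extract_dob(text):
    """Return the first dd/mm/yyyy date in the OCR text, or "" if none."""
    match = DOB_PATTERN.search(text)
    return match.group(0) if match else ""


if __name__ == "__main__":
    # Quick local check with made-up text before wiring the script into UiPath.
    print(extract_dob("DOB: 15/08/1990"))
```

Keeping each script this small makes it easy to test on its own in plain Python before it is invoked from the workflow.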

Finally, the name, DOB, and Aadhar number extracted from the given image are written to a text file.
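If the writing is done on the Python side, it could be as simple as the sketch below. This post does not say which side writes the file (UiPath could do it just as well), so the function names, the field labels, and the output layout here are all assumptions.

```python
def format_details(name, dob, aadhar_number):
    """Build the text that goes into the output file, one field per line."""
    return (
        f"Name: {name}\n"
        f"DOB: {dob}\n"
        f"Aadhar number: {aadhar_number}\n"
    )


def write_details(path, name, dob, aadhar_number):
    """Write the three extracted fields to a UTF-8 text file."""
    with open(path, "w", encoding="utf-8") as output_file:
        output_file.write(format_details(name, dob, aadhar_number))
```

For example, `write_details("details.txt", "Asha Kumar", "15/08/1990", "123456789012")` would create a three-line file using made-up values.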

Here is the Processing Video:

Monday, January 27, 2020

Aadhar card scraping using Python and UiPath