Top 38 pre-processing must-haves for Intelligent Data Capture
by Rajesh Agarwal, on Dec 16, 2019 11:00:00 AM
Paper-based processing still exists, and it is going to stay for quite some time. It is not confined to small business pockets; it accounts for a good 25-30% of business operation scenarios. In monetary terms, this part of business processing amounts to double-digit millions in revenue annually. It recurs most often in BFSI and in the Finance & Accounts and Procurement functions of almost all Manufacturing, Telecom, Supply Chain, and Research & Analytics companies. Yet when it comes to business, you cannot compromise on speed, efficiency, and quality.
Automation cannot take place without digitization. Simply put, digitization is the stepping stone to your Digital Transformation journey.
Document Processing, or Optical Character Recognition (OCR) as it is popularly known, helps digitize paper-based enterprise assets, which in turn makes complex business use cases achievable.
However, OCR has inherent quality issues. Poor quality in the digitized asset renders even advanced technologies such as Robotic Process Automation (RPA) and Intelligent Automation (IA) ineffective. Here, Intelligent Document Processing, more popularly known as Intelligent Data Capture, is the way ahead. It enables you to read and ingest text from an image, so that use cases such as Tab Banking, On-mobile Onboarding, and faster claim processing take a few minutes rather than the hours or days they required in the past.
Intelligent Data Capture is an integrated solution that combines Optical Character Recognition (OCR), Optical Mark Recognition (OMR), and Intelligent Character Recognition (ICR). In health claims processing, for example, it reads printed characters, tick marks, and handwritten characters.
What is Intelligent Data Capture?
Intelligent Data Capture is the process of capturing data from all types of documents, including unstructured ones such as email, text, PDF, and scanned documents, classifying it into categories, and extracting relevant information for further processing. Software solutions for Intelligent Data Capture use Artificial Intelligence algorithms to extract the data in a template-free mode, process it, and then feed it into different applications, databases, and downstream systems.
However, at times the image itself is unclear, has carbon smudges, is skewed, or is not properly oriented. At other times, it could be a dot matrix print or have high noise and contrast. All of this results in poor data capture output, in line with the familiar principle of "Garbage in, Garbage out" or "GIGO".
The reliability and authenticity of the captured data depend on the clarity of the captured image. This calls for pre-processing the image before data capture, in order to enhance image quality and improve the capture process. It also calls for some post-processing to improve the quality of the captured data.
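The pre-process, capture, post-process flow described above can be sketched roughly as follows. This is a minimal illustration only: the names `run_capture`, `binarize`, and `fake_recognize` are hypothetical, the image is assumed to be a plain list of grayscale rows, and the recognizer is a stand-in rather than a real OCR engine.

```python
from typing import Callable, Sequence

Image = list[list[int]]  # grayscale rows, 0 = black, 255 = white (assumed format)
ImageStep = Callable[[Image], Image]
TextStep = Callable[[str], str]


def run_capture(
    img: Image,
    pre_steps: Sequence[ImageStep],
    recognize: Callable[[Image], str],
    post_steps: Sequence[TextStep],
) -> str:
    """Apply image clean-up, then recognition, then text clean-up, in order."""
    for step in pre_steps:
        img = step(img)
    text = recognize(img)
    for step in post_steps:
        text = step(text)
    return text


def binarize(img: Image, threshold: int = 128) -> Image:
    """Hypothetical pre-processing step: force each pixel to pure black or white."""
    return [[0 if p < threshold else 255 for p in row] for row in img]


def fake_recognize(img: Image) -> str:
    """Stand-in for a real OCR engine: just reports the number of black pixels."""
    black = sum(p == 0 for row in img for p in row)
    return f"  black pixels: {black}  "


# One pre-processing step, the stand-in recognizer, one post-processing step.
result = run_capture([[10, 200], [90, 250]], [binarize], fake_recognize, [str.strip])
print(result)
```

Keeping the steps as plain functions in a list makes it easy to reorder them or switch one off when a particular document type does not need it.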
Top 38 pre-processing features for accurate and efficient OCR:
OCR issues negate the benefits reaped through automation. The following 38 functionalities work in tandem and help you achieve 99.0% accurate Intelligent Data Capture.
- De-Skew: Straightens skewed images
- Sub-Image: Separates out an area from the original document image prior to processing
- Noise Removal: Removes isolated specks and machine dot shading
- Lines: Offers settings for horizontal and vertical line removal and reporting
- Vertical Registration: Registers to a particular point using vertical lines
- Resize: Stretches or shrinks an image to a new size
- Smoothing & completion: Smooths and completes characters for better OCR reading
- Inverse Text Correction: Converts white text on black background to normal black-on-white text and makes OCR reading of such text possible
- Horizontal registration: Registers to a particular point using horizontal lines
- Auto-rotate: Performs automatic image rotation
- Intelligent crop: Automatically removes thick black or white borders from an image
- Manual rotate: Offers manual rotation to get correct orientation
- Manual crop and pad: Crops or pads the image manually to remove or add pixels
- Contrast: Increases or decreases contrast
- Brightness: Increases or decreases brightness
- Hue: Improves color depth
- RGB separation: Removes the red, green, and blue channels one at a time
- Dotted line: Removes dotted lines for better OCR
- Text registration: Aligns all images at a particular piece of text
- Inpainting: Removes watermarks incorporated as a separate layer
- Stamp removal: Removes stamp marks that are in a specific, pre-defined color
- Edge smoothing: Smooths the edges of lines
- Character smoothing: Smooths the outlines of characters
- Character thinning: Makes characters thin
- Character separation: Separates machine print words for better readability
- Background cleaning: Removes the background
- Perimeter recognition: Allows boundary recognition for box type shapes
- Contouring: Allows boundary recognition for non-standard shapes
- Remove handwritten noise: Removes handwritten characters
- Page recognition: Recognizes the page
- Form bursting: Splits a page into multiple sub-sections
- Color drop-out: Removes color that is redundant (RGB/CMYK, etc.)
- Remove grey: Removes grey shaded background
- Carbon cleaning: Removes carbon marks and smudges to the maximum extent possible
- Grow: Makes the lighter text dark
- Filter: Offers Blur, Dilate, and Median filters
- Gamma: Sets the relation between the black and white pixels
- Mirror: Flips the image so that the text becomes readable
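To make a few of the features above more concrete, here is a rough pixel-level sketch of Inverse Text Correction, Mirror, Gamma, Noise Removal (using a median filter), and Intelligent crop. The function names, the list-of-rows grayscale image format, and the 3x3 filter window are assumptions made for this illustration; they do not describe how any particular product implements these features.

```python
from statistics import median_low

Image = list[list[int]]  # grayscale rows, 0 = black, 255 = white (assumed format)


def invert(img: Image) -> Image:
    """Inverse text correction: white-on-black becomes black-on-white."""
    return [[255 - p for p in row] for row in img]


def mirror(img: Image) -> Image:
    """Mirror: flip each row left to right."""
    return [row[::-1] for row in img]


def gamma(img: Image, g: float) -> Image:
    """Gamma: remap intensities with a power curve; g > 1 darkens mid-tones."""
    return [[round(255 * (p / 255) ** g) for p in row] for row in img]


def median_filter(img: Image) -> Image:
    """Noise removal: replace each pixel with the median of its 3x3 neighbourhood."""
    h, w = len(img), len(img[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            # Near the edges the window is clipped to the pixels that exist.
            window = [
                img[j][i]
                for j in range(max(0, y - 1), min(h, y + 2))
                for i in range(max(0, x - 1), min(w, x + 2))
            ]
            row.append(median_low(window))
        out.append(row)
    return out


def crop_border(img: Image, border: int = 0) -> Image:
    """Intelligent crop: drop outer rows and columns made up only of `border` pixels."""
    rows = [y for y, row in enumerate(img) if any(p != border for p in row)]
    cols = [x for x in range(len(img[0])) if any(row[x] != border for row in img)]
    if not rows or not cols:
        return []
    return [row[cols[0]:cols[-1] + 1] for row in img[rows[0]:rows[-1] + 1]]


# A single black speck on a white page disappears after median filtering.
speck = [[255, 255, 255], [255, 0, 255], [255, 255, 255]]
print(median_filter(speck))
```

Real documents would normally be handled with an imaging library rather than nested lists; the plain-Python version is used here only to keep each operation visible.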
These 38 pre-processing functionalities are often the deciding factor between bad and good OCR output, and therefore between the failure and success of the overall automation effort. They are instrumental not only in enhancing image quality but also in making total automation and a paperless office a business reality.
Intelligent Data Capture, along with RPA and IA, delivers phenomenal success in many use cases that were simply impossible until a few years ago. The very idea that information from unstructured data sources such as a PDF, a printout, or even an image could be read and captured to update databases and downstream systems was hard to believe. Today, Intelligent Data Capture is a strong business enabler. It makes 3-minute on-boarding a digital reality, not only saving revenue in the order of millions but also allowing you to do more with the same number of resources. That said, Intelligent Data Capture is just a milestone in the RPA and IA journey, leaving scope for further high-tech advancement in the near future.