Complement RPA with Intelligent Data Capture for total automation
by Rajesh Agarwal, on Dec 16, 2019 11:00:00 AM
Estimated reading time: 3 mins
Robotic Process Automation (RPA) software has matured over the years into the realm of Intelligent Automation (IA) and resolved the trade-offs related to efficiency and productivity. Yet neither RPA nor IA quite achieves total automation in a paper-based environment. In that case you need document processing technologies, i.e. Optical Character Recognition (OCR) and Optical Mark Recognition (OMR), sometimes also referred to as electronic data capture or OCR automation, to achieve total automation.
Today, many enterprises have moved beyond paper. However, paper-based processes still exist, and they reduce the business impact of automation. OCR digitizes a paper document and extracts information from it, bridging the gap towards total automation and a paperless office. Many RPA and IA use cases originate from a document, in either physical or electronic format; electronic healthcare records capture is one example. The same holds for every domain where paper prevails, such as contract management and document processing related to loans, shipping, logistics, invoices, and orders.
Point of handshake between RPA bots & OCR
RPA bots can fetch the document from a system for OCR. After digitization and extraction of details, the useful information is either posted directly to a downstream system or handed over to an RPA bot. Depending on the process, the RPA bot posts the data to an ERP, triggers a workflow, stores the digitized asset in a Document Management System (DMS), or feeds the data into a core enterprise system.
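The handover described above can be sketched as a small routing step. This is only an illustration: the `CaptureResult` type, the `ROUTES` table, the document types, and the action names are all hypothetical, and a real RPA platform would call its own connectors instead of appending strings to a log.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class CaptureResult:
    """Hypothetical container for what the OCR step hands to the bot."""
    document_id: str
    doc_type: str
    fields: Dict[str, str]


# Assumed mapping from document type to downstream action.
# The four actions mirror the options named in the text.
ROUTES = {
    "invoice": "post_to_erp",
    "loan_application": "trigger_workflow",
    "contract": "store_in_dms",
}
DEFAULT_ROUTE = "feed_core_system"


def hand_over(result: CaptureResult, actions: List[str]) -> str:
    """Pick the downstream action for a captured document and record it."""
    route = ROUTES.get(result.doc_type, DEFAULT_ROUTE)
    entry = f"{route}:{result.document_id}"
    actions.append(entry)
    return entry


if __name__ == "__main__":
    log: List[str] = []
    hand_over(CaptureResult("INV-001", "invoice", {"total": "120.00"}), log)
    hand_over(CaptureResult("DOC-002", "contract", {}), log)
    print(log)
```

Keeping the routing in a plain table makes it easy to add a document type without touching the bot logic.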
Challenges of regular OCR
The success of RPA depends on the accuracy of the OCR, its ability to read unstructured data, and its integration with the RPA technology platform. OCR is thus an important element of automation. However, some of the most enduring OCR challenges relate to image quality and content.
Image Quality challenges
- Skewness: Most of the time, the document image is not properly aligned or straight. It is too skewed to be read correctly in an x,y coordinate frame of reference.
- Blurring: The image is hazy and not clear. The degraded copy reduces readability and makes it difficult for the OCR engine to read it.
- Artistic fonts: A decorative font used for the heading of a document such as an invoice, purchase order, or loan document, which is bigger than and different from the rest of the document text, hinders machine reading.
- Camera images: When a document is photographed with a faulty instrument or a camera phone instead of being scanned, mismatched brightness and contrast impede OCR reading.
- Background noise & watermarks: A document with carbon smudges, rubber stamps, signatures across text, heavy watermarks, etc. affects OCR and results in inaccurate data capture.
- Low-resolution image: With low-resolution scanned images or stretched fax documents, the accuracy of the OCR output is low.
Content challenges
- Scattered information: In unstructured documents, the location of form headings or keywords differs from copy to copy.
- Foreign language: When the document's language is not available in the OCR engine, the engine struggles to read the document.
- Dense documents: When the document comprises multiple fonts, font types, and font sizes, the quality of the OCR output drops.
- Interference of lines & logos: When pictures, logos, or lines are embedded within text, they affect OCR readability.
- Handwriting: Handwritten documents are difficult to machine-read or to process with OCR.
Intelligent data capture software mitigates regular OCR issues
OCR issues related to image quality and content usually nullify the benefits of process automation. Intelligent Data Capture solutions available in the market not only mitigate these issues but also provide 99% accurate output. The solutions have pre-processing features to improve image quality and post-processing features to improve data quality.
The pre-processing features include brightness & contrast optimization, despeckle, noise removal, auto-rotate, deskew, and resize, among many others.
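To make one of these steps concrete, here is a minimal sketch of contrast optimization as a min-max contrast stretch. It assumes the image is a plain list of rows of grayscale values from 0 to 255; the function name `stretch_contrast` is made up for illustration, and commercial products use their own, more elaborate pipelines.

```python
from typing import List

Image = List[List[int]]  # rows of grayscale pixel values, 0 (black) to 255 (white)


def stretch_contrast(pixels: Image) -> Image:
    """Rescale pixel values so the darkest becomes 0 and the brightest 255."""
    flat = [p for row in pixels for p in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:
        # A completely flat image has no contrast to stretch; return a copy.
        return [row[:] for row in pixels]
    span = hi - lo
    # Integer arithmetic keeps the result deterministic.
    return [[(p - lo) * 255 // span for p in row] for row in pixels]


if __name__ == "__main__":
    washed_out = [[100, 150], [125, 200]]
    print(stretch_contrast(washed_out))
```

A washed-out scan whose values sit in a narrow band ends up using the full range, which gives the OCR engine a clearer separation between text and background.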
The post-processing features, built using Artificial Intelligence and fuzzy logic, include but are not limited to template-free data capture, auto-micro zone, auto-determine, auto-correction of common OCR confusions such as s/5, i/1, B/8, O/0, and Z/2, auto-format, and auto-validate to ensure accuracy. Intelligent data capture software also validates data against pre-defined business rules; e.g. in an alphanumeric string such as an individual's PAN, the fifth character is always the first letter of the surname.
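The auto-correction and rule-based validation just described can be sketched for the PAN example. The sketch assumes the standard PAN layout of five letters, four digits, and one letter, and it uses that layout to decide whether a confusable character should be a letter or a digit. The function names and the sample value are made up for illustration; this is not how any particular product implements the feature.

```python
import re

# Confusion pairs from the text: s/5, i/1, B/8, O/0, Z/2.
TO_LETTER = {"5": "S", "1": "I", "8": "B", "0": "O", "2": "Z"}
TO_DIGIT = {letter: digit for digit, letter in TO_LETTER.items()}

# Assumed PAN layout: five letters, four digits, one letter.
PAN_PATTERN = re.compile(r"[A-Z]{5}[0-9]{4}[A-Z]")


def correct_pan(raw: str) -> str:
    """Swap confusable characters based on the position they appear in."""
    chars = list(raw.strip().upper())
    if len(chars) != 10:
        return "".join(chars)  # wrong length: leave it for validation to reject
    for i, ch in enumerate(chars):
        if 5 <= i <= 8:
            chars[i] = TO_DIGIT.get(ch, ch)   # digit positions
        else:
            chars[i] = TO_LETTER.get(ch, ch)  # letter positions
    return "".join(chars)


def validate_pan(pan: str, surname: str) -> bool:
    """Check the layout and the rule that the fifth character matches the surname."""
    if not PAN_PATTERN.fullmatch(pan):
        return False
    return pan[4] == surname.strip().upper()[:1]


if __name__ == "__main__":
    ocr_text = "ABCP51Z34F"  # made-up OCR output with 5-for-S and Z-for-2 errors
    fixed = correct_pan(ocr_text)
    print(fixed, validate_pan(fixed, "Sharma"))
```

Correcting by position avoids blindly replacing every 5 with an S, which would damage the numeric part of the string.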
The net effect of the pre-processing and post-processing functions is intelligent data capture with much higher throughput and accuracy. This Intelligent OCR, or OCR automation, extracts all the necessary information including metadata, indexes it, and converts it into the format required by the next downstream system. RPA then either parks it or posts it in a downstream system (ERP, DMS, or database).
Intelligent Data Capture is more advanced than regular OCR. It enables enterprises to deploy end-to-end automation even in paper-driven environments.
Business impact
Augmenting RPA and IA with intelligent data capture software makes a paperless office possible through total automation. It bridges the gap left by regular automation. The solution also increases accuracy, sharply raises the number of straight-through passes, and reduces turnaround time. It further decreases overall process time and improves customer experience, which helps in winning better business opportunities.