Fact: around 80 percent of the data in an organisation is completely unstructured. That ‘data chasm’ can include images, web pages, handwritten documents, signatures, and even mobile content.

Of course, most businesses don’t usually worry about the chasm – until they want to use unstructured data to drive their robotic process automation (RPA) initiatives.

Then it quickly dawns on them that they need to digitise and understand all data types – and fill the ‘80 percent’ chasm.

A yawning gap that needs to be filled

OCR (Optical Character Recognition) is usually the ‘go-to’ technology for capturing data and digitising documents. But in documents that contain signatures, handwriting or images, digitisation is almost impossible because OCR relies on zone-based or template-dependent data extraction methods.

That’s where the yawning gap needs to be filled.

In its Intelligent Document Processing (IDP) Playbook, research firm Everest Group highlights that RPA’s limitations can be overcome by complementing RPA with AI-based technologies to enable true end-to-end automation. IDP solutions, it says, process documents with greater accuracy and are more resilient to changes in document templates than traditional OCR.

According to market trends outlined by Everest Group, IDP solutions will replace traditional OCR-based solutions for document extraction because they offer users highly accurate straight-through processing.

Specifically, IDP may use OCR to convert images of documents to digital format, but then extracts specific information using machine learning and/or deep learning. With IDP, the extraction does not depend on the template, but on the content.
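To make the difference between template-dependent and content-dependent extraction concrete, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the two sample invoices are invented, the function names are hypothetical, and simple regular expressions stand in for the machine learning or deep learning model a real IDP product would use. It shows only why position-based rules break when a layout changes, while content-based rules keep working.

```python
import re

# Two OCR outputs carrying the same invoice data in different layouts.
# Both documents are invented for illustration.
LAYOUT_A = [
    "ACME Ltd",
    "Invoice No: INV-1001",
    "Total: 250.00",
]
LAYOUT_B = [
    "ACME Ltd",
    "Date: 2020-03-01",
    "Amount due 250.00",
    "Invoice number INV-1001",
]


def extract_by_template(lines):
    """Zone-style extraction: fixed line positions tuned to LAYOUT_A."""
    def after_colon(index):
        # Read whatever follows the colon on the expected line, if any.
        if index < len(lines) and ":" in lines[index]:
            return lines[index].split(":", 1)[1].strip()
        return None

    return {"invoice_number": after_colon(1), "total": after_colon(2)}


# Stand-in for a learned model: patterns keyed on content, not position.
FIELD_PATTERNS = {
    "invoice_number": re.compile(
        r"invoice\s*(?:no|number)\W*([A-Z]+-\d+)", re.IGNORECASE
    ),
    "total": re.compile(r"(?:total|amount due)\W*(\d+\.\d{2})", re.IGNORECASE),
}


def extract_by_content(lines):
    """Content-style extraction: find each field wherever it appears."""
    text = "\n".join(lines)
    result = {}
    for field, pattern in FIELD_PATTERNS.items():
        match = pattern.search(text)
        result[field] = match.group(1) if match else None
    return result


if __name__ == "__main__":
    for name, doc in (("A", LAYOUT_A), ("B", LAYOUT_B)):
        print(name, "template:", extract_by_template(doc))
        print(name, "content: ", extract_by_content(doc))
```

On the first layout both approaches agree. On the second, the position-based extractor picks up the date as the invoice number and finds no total, while the content-based extractor still returns the right values.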

Key capabilities of IDP

OCR alone cannot power IDP. IDP is only possible because of other technologies such as computer vision, machine learning and deep learning. Some key applications for IDP solutions include:

  • Claims processing - IDP solutions can process semi-structured and unstructured data and convert it to a structured format for further processing by RPA

  • Invoice processing - IDP solutions can be used to extract invoice and purchase order details, and machine learning capabilities can be leveraged to handle exceptions

  • Contact centre - AI technologies can understand and predict customer behaviour and provide personalised recommendations

Use cases

One of India’s leading providers of credit ratings, industry research and analytics needed to extract huge volumes of financial information from a wide variety of documents, including annual reports and quarterly statements. The firm was able to position IDP in combination with its analytics solution to increase the value proposition of its offering. 

It began its IDP journey in 2018 and is currently working with AntWorks to extract over 200 data points per document from over 12,000 documents annually. Staff productivity increased by 15 percent in the initial stages, which freed the firm to explore new business areas.

Consultants at a human resources consulting and services firm had to extract and analyse data from proposals sent as PDFs, memos, emails and other formats. By adopting an IDP solution, the firm achieved productivity gains by reducing manual data capture, and consequently made greater use of its resources in analytical activities.

Future trends

As enterprises mature in their IDP journeys, they are expected to consolidate broader transformation projects, such as RPA and other AI initiatives, under a single umbrella.

There are many reasons why IDP adoption is growing, including:

  • The increasing need for enterprises to process large volumes of semi-structured and unstructured documents with greater accuracy and speed

  • The demand among enterprises to enable end-to-end process automation with integrated RPA and IDP capabilities

  • Improved sophistication of AI technologies powering IDP solutions, which significantly increases their accuracy rates in processing documents compared to traditional OCR solutions

Digitising and understanding all data types is crucial. The ability to extract clean and complete data will help to effectively fill the 80 percent unstructured data chasm – and help organisations navigate their way to automation success.

Join our webinar, ‘Data: The Critical Fuel for Intelligent Automation’ on March 25, 2020.