Data is at the heart of digital transformation – in fact, it’s the lifeblood of digitisation. To stay ahead of the curve, companies are trying to leverage data to make employees and processes more productive, improve customer experience, open new markets and generally find new ways of securing competitive advantage.

Analyst firm IDC estimates that more than 5 billion consumers interact with data every day–and that number will top 6 billion by 2025, when each connected person will have at least one data interaction every 18 seconds.

Given those stats, companies must recognise that data has real business worth, and that its value should never be underestimated. Organisations that are first through the gateway of digital transformation will be the first to find out just how valuable their data really is.

But there’s one pressing problem. Not all data is nice, neat and structured. In fact, a large percentage of it is unstructured. And it’s a challenge that every organisation faces when trying to extract, understand and process the data that drives most digital transformation initiatives.

Understanding Unstructured Data

So what do we know about unstructured data? And why is it such a challenge? Here’s a quick Q&A:

  • What is unstructured data? Unstructured data is data that isn’t stored in a fixed record length format. Examples include emails, social media feeds, digital images, videos, handwritten text and signatures.

  • Why does unstructured data matter? According to projections from analysts IDC, 80 percent of worldwide data will be unstructured by 2025. For many large companies, it has reached that critical mass already.

  • When do businesses use unstructured data? They use it every day to underpin and drive virtually all processes in their organisations, and particularly those processes they want or need to automate.

  • Who is affected by unstructured data? Internally, almost every corporate department uses unstructured data in some form or other. Externally, unstructured data is used to monitor and report on company operations, communicate with customers and stakeholders, and support a wide range of business-critical functions.

  • How important is it for companies to manage unstructured data? The preparation and processing of unstructured data–and the ability to extract, understand and manage it efficiently–can mean the difference between business success and failure.
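To make the ‘fixed record length’ definition above concrete, here is a minimal Python sketch. The record layout, field names and sample email are all hypothetical and chosen purely for illustration. It shows why a fixed-length record is trivial to parse, while the same facts written in an email are not.

```python
import struct

# Hypothetical fixed-length layout: 4-byte order id, 10-byte customer code,
# 8-byte amount in cents. Every record is exactly 22 bytes long.
RECORD = struct.Struct(">I10sq")


def parse_record(raw: bytes) -> dict:
    """Decode one fixed-length record; every field sits at a known offset."""
    order_id, customer, cents = RECORD.unpack(raw)
    return {
        "order_id": order_id,
        "customer": customer.decode("ascii").rstrip(),
        "amount": cents / 100,
    }


# Structured: the layout tells us exactly where each field lives.
structured = RECORD.pack(1042, b"ACME".ljust(10), 129900)
print(parse_record(structured))

# Unstructured: an email carries the same facts at no fixed offset or length.
unstructured = "Hi, please invoice ACME for order 1042, total 1,299.00. Thanks!"
try:
    parse_record(unstructured.encode("ascii"))
except struct.error as exc:
    print("Cannot parse as a fixed-length record:", exc)
```

The point of the sketch is the contrast: the structured record needs no interpretation, whereas the email needs something that can work out where the order number and total are before any process can use them.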

Real-World Examples

Unstructured data can affect everyone at a company, from the entry-level staffer to the CEO. The term ‘unstructured data’ in itself may sound slightly ethereal–so what does it look like in the real world? Here are some examples:

  • A paper-based purchase order with a signature that comes into your accounts payable department

  • A product photo that needs to be associated with an order

  • A bar code that is assigned to an item in your warehouse

  • Medical records that may require supporting X-rays or MRI images

  • Motor insurance documents that require a signature or photo

The Failure of OCR

Traditionally, being able to capture and digitise data to underpin and drive process automation–which is at the heart of digital transformation–has largely been down to OCR (Optical Character Recognition) technology. 

The problem is, OCR falls down when you present it with anything other than neat, nicely structured data. In other words, OCR doesn’t like unstructured data.

With OCR, your business can automate only some of its processes, or parts of them, but not all of them. That in turn can lead to automation project roadblocks, or even failure.

Take a Different Approach

So, for an organisation deep into a digital transformation cycle, what’s the answer? Taking a cognitive machine reading (CMR) approach to automation means you can capture and curate all data – structured or unstructured. 

CMR has undoubtedly emerged as a key disruptor of OCR and is filling what many are calling ‘the unstructured data chasm’–the data gap that businesses can, and must, fill to execute on their digital transformation goals.

OCR is primarily zone-based and template-dependent: for every type of form or document, a unique template must be created and/or a zone must be defined. A CMR approach takes you out of that restricted zone. In fact, you can be zone-, template- and mode-independent, read any structured, semi-structured or unstructured data, and gain greater efficiency in end-to-end automation.
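The template dependency described above can be illustrated with a toy Python sketch. It is not how any particular OCR product is implemented: the page is modelled as lines of already-recognised text, and the template, field names and sample pages are hypothetical. It simply shows that a zone only works for the layout it was defined for.

```python
# A template maps each field to a fixed zone: (row, first column, end column).
# The layout below is a made-up example, not a real product format.
INVOICE_TEMPLATE = {
    "invoice_no": (0, 12, 20),
    "total": (2, 12, 20),
}


def extract_by_zone(page, template):
    """Read each field from the fixed zone that the template assigns to it."""
    return {
        field: page[row][start:end].strip()
        for field, (row, start, end) in template.items()
    }


# A page that matches the layout the template was built for.
page_a = [
    "Invoice No: INV-0042",
    "Customer:   ACME",
    "Total:      1299.00",
]

# The same information laid out differently, as in a free-form letter.
page_b = [
    "ACME Ltd",
    "Ref INV-0042, amount due 1299.00",
    "Thank you for your business",
]

result_a = extract_by_zone(page_a, INVOICE_TEMPLATE)
result_b = extract_by_zone(page_b, INVOICE_TEMPLATE)
print(result_a)  # fields land inside their zones
print(result_b)  # the zones now point at the wrong text
```

Every new layout would need its own template in this model, which is the maintenance burden the zone-based approach carries.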

4 Key Benefits of a CMR Approach


  • One of the key drawbacks of OCR is the sheer amount of time it takes to process documents. OCR goes through a recursive process to find all content, irrespective of whether you want it or not. In stark contrast, due to CMR’s underlying architecture, it does not capture what is not needed. Documents can therefore be processed faster, with less CPU utilisation.

  • CMR not only enables you to generate a higher capture rate, but also ensures higher accuracy of the data captured. It is the only data ingestion engine that meets the complex Machine Vision requirements of highly unstructured data and disparate document formats, such as handwritten text, forms and images.

  • CMR offers high flexibility in processing multi-format documents, which in turn leads to a reduction in processing time. CMR’s multi-format data ingestion means you can move beyond file type limitations–it can ingest TIFF, JPEG, PDF or any image file format. You can also export data for downstream consumption in varied formats, including CSV, JSON, XML and DB schema.

  • CMR is a Unified Data Platform for meeting the needs of true process automation. It gives you a real competitive edge because it can capture and curate all data. In particular, it empowers you to overcome the challenges of digitising unstructured data, including printed text (structured and unstructured), handwritten text, cursive writing, image and object recognition, signature verification and Natural Language Understanding.
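As a rough idea of what exporting captured data ‘for downstream consumption’ in CSV, JSON and XML can look like, here is a minimal sketch using only the Python standard library. The record, its field names and the `document` root element are assumptions made for illustration; they do not describe any CMR product’s actual output.

```python
import csv
import io
import json
import xml.etree.ElementTree as ET

# Hypothetical fields captured from a single document.
record = {"invoice_no": "INV-0042", "customer": "ACME", "total": "1299.00"}


def to_json(rec: dict) -> str:
    """Serialise the record as a JSON object."""
    return json.dumps(rec, indent=2)


def to_csv(rec: dict) -> str:
    """Serialise the record as a header row plus one data row."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(rec), lineterminator="\n")
    writer.writeheader()
    writer.writerow(rec)
    return buf.getvalue()


def to_xml(rec: dict) -> str:
    """Serialise the record as one XML element per field."""
    root = ET.Element("document")
    for name, value in rec.items():
        ET.SubElement(root, name).text = value
    return ET.tostring(root, encoding="unicode")


for export in (to_json, to_csv, to_xml):
    print(export(record))
```

Once the fields have been captured reliably, each export format is only a thin serialisation step, so downstream systems can take whichever one suits them.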

Disruptive Technology

There’s no doubt that with CMR you can not only expand your automation scope, but also get faster ROI, increased data certainty and continuous improvement when it comes to optimising the automation of business processes.

CMR is fundamentally a disruptive technology, and you can adopt it today to manage all your content capture and curation demands, fast-track your business automation and power your digital transformation success, without unstructured data blocking your path.
