
Article, Mortgage

How to know whether your IDP solution for mortgage document processing is accurate?

June 30, 2021
12:00 am

Intelligent Document Processing (IDP) is a technology that helps companies in various sectors unlock the data hidden in the documents involved in their processes. It combines several components, such as OCR, business rules, business process management, and analytics. Its primary functions, however, are understanding document types and extracting the data inside those documents.

Because the underlying technology is complex and several layers of output lead to the final result, it is difficult to understand and measure the effectiveness of an IDP system. Today, various vendors offer different flavors of IDP solution. Besides the pure-play IDP products, RPA vendors and system integrators also offer IDP solutions. All of them promise accuracy as one of their key performance metrics.

Because document processing involves multiple stages, vendors may not all have the same definition in mind when they talk about accuracy. This blog tries to dissect and demystify the meaning of accuracy as used in the context of IDP.

Steps in IDP and components of accuracy

[Figure: Steps involved in an IDP]

As one can see, the output of each step feeds into the next, so the accuracy of one step inevitably affects the accuracy of the next. But if we focus on each step and define accuracy at that stage, we end up with the following definitions of accuracy.

OCR accuracy – Percentage of the original document digitized correctly. This is influenced by the OCR engine employed, and heavily by the preprocessing steps applied before OCR. Post-processing steps on the OCR’ed data can further improve the accuracy of the data that is then fed into the classification process.
Document identification accuracy – Percentage of document types identified correctly. This depends mainly on the method used to identify the document, and different products use different methods. The most common approach is to look for characteristics such as the document title, headings, and other typical phrases that suggest the beginning and end of a specific document.
Indexing accuracy – Of the document types identified, the percentage where the pages are correctly identified. This depends on the logic used to attribute a given page to a certain document; page-begin and page-end parameters, for example, can influence this figure. Vendors usually handle this stage with an algorithm such as bag of words. The presence of blank pages and similar noise can reduce accuracy at this stage.
Data value accuracy – Of the fields identified, the percentage where the data value matches the original. This depends mainly on whether the value of the field was read correctly, so the output of this stage is influenced by OCR accuracy.
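To make these definitions concrete, here is a minimal Python sketch of how each stage-level figure could be computed against a human-verified ground truth. The function names, the use of character edit distance for OCR accuracy, and all sample data are illustrative assumptions, not the method of any particular IDP product.

```python
from typing import Sequence


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum inserts, deletes, substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,           # delete a character from a
                            curr[j - 1] + 1,       # insert a character into a
                            prev[j - 1] + cost))   # substitute or match
        prev = curr
    return prev[-1]


def ocr_accuracy(truth: str, ocr_text: str) -> float:
    """Character-level accuracy: 1 - edit_distance / len(truth), floored at 0."""
    if not truth:
        return 1.0 if not ocr_text else 0.0
    return max(0.0, 1.0 - edit_distance(truth, ocr_text) / len(truth))


def match_rate(truth: Sequence, predicted: Sequence) -> float:
    """Share of items where the system output equals the ground truth."""
    if len(truth) != len(predicted):
        raise ValueError("truth and predicted must have the same length")
    if not truth:
        return 0.0
    return sum(t == p for t, p in zip(truth, predicted)) / len(truth)


# Hypothetical ground truth vs. system output for a tiny test set.
true_types = ["Note", "Deed of Trust", "W-2", "Form 1003"]
pred_types = ["Note", "Deed of Trust", "Paystub", "Form 1003"]

# (first page, last page), only for documents whose type was identified correctly.
true_pages = [(1, 3), (4, 18), (21, 25)]
pred_pages = [(1, 3), (4, 19), (21, 25)]

# Values of the fields that were identified.
true_values = ["Jane Doe", "250000", "3.25"]
pred_values = ["Jane Doe", "250000", "3.26"]

ocr_acc = ocr_accuracy("Promissory Note", "Prornissory Note")  # "m" read as "rn"
doc_id_acc = match_rate(true_types, pred_types)
indexing_acc = match_rate(true_pages, pred_pages)
value_acc = match_rate(true_values, pred_values)

print(f"OCR {ocr_acc:.3f}, identification {doc_id_acc:.2f}, "
      f"indexing {indexing_acc:.2f}, data value {value_acc:.2f}")
```

Note that each figure uses a different denominator (characters, documents, correctly identified documents, identified fields), which is one reason two vendors can quote different "accuracy" numbers for the same output.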

These levels of accuracy are not independent of one another. OCR accuracy, for one, influences both indexing and data extraction accuracy. After all, if the characters are not identified correctly, the downstream logic that identifies page starts, applies bag of words, and so on may not work well.
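A rough way to see how stage errors add up is to multiply the stage accuracies. The sketch below does that under a deliberately simplified assumption: that each stage succeeds independently at its own rate. Since real stages are coupled, treat the result as an illustration only; the stage names and percentages are hypothetical.

```python
import math


def chained_accuracy(stage_accuracies: dict) -> float:
    """Product of stage accuracies, assuming (simplistically) independent stages."""
    return math.prod(stage_accuracies.values())


# Hypothetical per-stage figures, each of which looks high on its own.
stages = {
    "ocr": 0.98,
    "document_identification": 0.97,
    "indexing": 0.96,
    "data_value": 0.95,
}

end_to_end = chained_accuracy(stages)
print(f"Estimated end-to-end accuracy: {end_to_end:.3f}")
```

With these made-up inputs the end-to-end estimate falls to roughly 0.87, even though no single stage is below 0.95.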

As an overall measure, we can think of an overarching accuracy promised by the entire system.

Overall accuracy (accuracy of straight-through processing) – The effective accuracy of processing a document package correctly in its entirety. It should be measured as the percentage of documents processed correctly, that is, with 100% correct indexing and 100% correct data extraction.
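The all-or-nothing nature of this measure is easiest to see in code. The following Python sketch counts a package as correct only when every page is indexed correctly and every field is extracted correctly, and contrasts that with a pooled field-level figure. The `PackageResult` structure and the numbers are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PackageResult:
    package_id: str
    pages_total: int
    pages_indexed_correctly: int
    fields_total: int
    fields_extracted_correctly: int

    def is_straight_through(self) -> bool:
        """True only with 100% correct indexing and 100% correct extraction."""
        return (self.pages_indexed_correctly == self.pages_total
                and self.fields_extracted_correctly == self.fields_total)


def stp_rate(results: List[PackageResult]) -> float:
    """Share of packages that needed no correction at all."""
    if not results:
        return 0.0
    return sum(r.is_straight_through() for r in results) / len(results)


def field_level_accuracy(results: List[PackageResult]) -> float:
    """Share of all fields, pooled across packages, extracted correctly."""
    total = sum(r.fields_total for r in results)
    if total == 0:
        return 0.0
    return sum(r.fields_extracted_correctly for r in results) / total


# Hypothetical audit results for four loan packages.
results = [
    PackageResult("pkg-1", 40, 40, 120, 120),
    PackageResult("pkg-2", 35, 35, 100, 99),   # one wrong field value
    PackageResult("pkg-3", 50, 49, 150, 150),  # one misplaced page
    PackageResult("pkg-4", 20, 20, 60, 60),
]

overall = stp_rate(results)
per_field = field_level_accuracy(results)
print(f"Straight-through rate: {overall:.2f}, field-level accuracy: {per_field:.4f}")
```

In this made-up sample, 429 of 430 fields are correct, yet only half of the packages pass straight through, which is why it matters which figure a vendor is quoting.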

Planned vs Observed Accuracy – About confidence levels

Let’s now look at accuracy another way: planned vs observed.

When an IDP is set up for a particular document or set of data fields, various parameters are configured. Some parameters help identify the document type and the page numbers correctly; others are phrases and logic used to identify a data field.

A provider can set up confidence levels based on these parameters. For example, if 8 parameters are used to identify a document, how confident can we be that the document has been correctly identified when all 8 match? Can we be 98% confident, or 100% confident? This is ascertained through human intervention: the machine output is compared against human scrutiny to determine the confidence levels.
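One simple way to derive such confidence levels is to group human-reviewed documents by how many parameters matched and measure how often the reviewer agreed with the machine. The sketch below assumes a hypothetical review log of (parameters matched, human confirmed) pairs; the counts are invented for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def confidence_by_matches(reviews: Iterable[Tuple[int, bool]]) -> Dict[int, float]:
    """Map 'number of parameters matched' to the share a human confirmed as correct."""
    totals = defaultdict(int)
    confirmed = defaultdict(int)
    for matched, human_confirmed in reviews:
        totals[matched] += 1
        confirmed[matched] += int(human_confirmed)
    return {m: confirmed[m] / totals[m] for m in sorted(totals)}


# Hypothetical review log: 50 documents where all 8 parameters matched
# (49 confirmed by a reviewer) and 20 where only 6 matched (15 confirmed).
reviews = (
    [(8, True)] * 49 + [(8, False)] * 1
    + [(6, True)] * 15 + [(6, False)] * 5
)

confidence = confidence_by_matches(reviews)
for matched, level in confidence.items():
    print(f"{matched} parameters matched -> {level:.0%} confirmed by reviewers")
```

The resulting table is the planned confidence: a statement about how the system is expected to behave when a given number of parameters match.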

Most of the time, this planned figure is what vendors quote as accuracy. The observed accuracy, by contrast, has to be determined by periodic sampling.
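Periodic sampling can be sketched as drawing a random audit sample, having reviewers check it, and reporting the observed accuracy with an interval that reflects the sample size. The example below uses the Wilson score interval, which is a choice made here for illustration rather than something the article prescribes; the sample sizes and counts are hypothetical.

```python
import math
import random
from typing import List, Optional, Sequence, Tuple


def sample_for_audit(doc_ids: Sequence[str], sample_size: int,
                     seed: Optional[int] = None) -> List[str]:
    """Pick a simple random sample of processed documents for human review."""
    rng = random.Random(seed)
    return rng.sample(list(doc_ids), min(sample_size, len(doc_ids)))


def wilson_interval(correct: int, n: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a proportion (z = 1.96 gives about 95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = correct / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half, centre + half


# Hypothetical month: 5,000 processed documents, 200 sampled for audit,
# of which reviewers found 194 fully correct.
processed = [f"doc-{i}" for i in range(5000)]
audit_sample = sample_for_audit(processed, 200, seed=42)

observed = 194 / len(audit_sample)
low, high = wilson_interval(194, len(audit_sample))
print(f"Observed accuracy {observed:.3f}, interval {low:.3f} to {high:.3f}")
```

If the planned figure quoted by a vendor falls outside the interval observed over repeated samples, that is a signal to revisit the parameters or the confidence levels.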

White gloved Accuracy

As mentioned at the beginning, various kinds of vendors offer solutions of varying effectiveness. It is very common for BPO providers and system integrators to offer a human-assisted solution, in which the output of the system is reviewed and modified by human associates. They may re-index the documents and capture the data that the system could not extract correctly. This results in near-perfect accuracy, which can reach 99% or more.
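When comparing vendors, it helps to separate what the machine achieved from what the reviewers fixed. The following sketch, with hypothetical field names and values, measures field accuracy before and after human corrections are applied.

```python
from typing import Dict


def field_accuracy(truth: Dict[str, str], extracted: Dict[str, str]) -> float:
    """Share of ground-truth fields whose extracted value matches exactly."""
    if not truth:
        return 0.0
    return sum(extracted.get(name) == value for name, value in truth.items()) / len(truth)


def apply_corrections(machine_output: Dict[str, str],
                      corrections: Dict[str, str]) -> Dict[str, str]:
    """Overlay reviewer corrections on the machine output without mutating it."""
    merged = dict(machine_output)
    merged.update(corrections)
    return merged


# Hypothetical loan package: ground truth, raw machine output, reviewer fixes.
truth = {
    "borrower_name": "Jane Doe",
    "loan_amount": "250000",
    "interest_rate": "3.25",
    "property_zip": "08512",
}
machine_output = {
    "borrower_name": "Jane Doe",
    "loan_amount": "25000O",   # letter O read in place of a zero
    "interest_rate": "3.25",
    # property_zip was not extracted at all
}
corrections = {"loan_amount": "250000", "property_zip": "08512"}

machine_only = field_accuracy(truth, machine_output)
human_assisted = field_accuracy(truth, apply_corrections(machine_output, corrections))
print(f"Machine only: {machine_only:.2f}, human-assisted: {human_assisted:.2f}")
```

Asking a vendor for both numbers, along with the share of fields that reviewers touched, shows how much of a quoted accuracy comes from automation alone.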

Machine learning and Accuracy

Modern IDPs like DocVu.AI loop in machine learning to improve the accuracy of the system even before humans get involved. The idea is to reduce, and ultimately eliminate, manual involvement in the processing of images and files. The focus is then on attaining a high straight-through processing rate. This accuracy improves as the learning in the system improves. The learning and improvement loop consists of identifying which new parameters can be employed to correct a system error once it is identified. For example, if the system identifies a document wrongly, it can be trained through machine learning to look for additional parameters that strengthen the business rules engine, thereby improving accuracy the next time a similar situation is encountered.

It should be noted that machine learning accuracy again depends on the underlying OCR accuracy. Another very important factor is the initial learning the system has already had. DocVu.AI specializes in the mortgage domain, so the system you get is already well trained on the document formats and data intricacies of that domain. The advantage is that, for mortgage-specific use cases, you start off with a higher accuracy than with a generic IDP.

We hope this demystifies how accuracy can be viewed for automation involving document processing. DocVu.AI achieves higher accuracies




4 Cedarbrook Drive, Bldg. B Cranbury,
NJ 08512, United States
Phone: 609 452 0700

Sunil Nehru

Sundareswaran Krishnamoorthy
