As a wave of automation sweeps into our world, computers will play a very important part in enterprise and consumer industries.
In enterprises, digital automation can play an important role in:
- reducing costs
- improving turnaround time
- offering better products and solutions
- improving customer experience
Digital automation has reached a point where we need to understand structured, semi-structured and unstructured data to initiate automated processes. Without the ability to understand this data, we cannot take our digital automation drives forward.
Intelligent document processing (IDP) has made huge strides in the past few years, to the point where IDPs are now being explored and evaluated in real use cases. Before adopting one, there are a few parameters and perils that we need to discuss.
Accuracy – Accuracy is the ability of an IDP to correctly extract data from a document. IDP vendors make bold claims of processing with 99% accuracy, so it is important that you are able to evaluate these claims yourself, on your own documents. The time taken to complete the process is another factor that you should consider. You can use an evaluation tool such as <Docvu data > to evaluate the IDP.
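One simple way to check an accuracy claim is to compare the IDP's output against values that a person has already verified. The sketch below is a minimal illustration of that idea, assuming the extracted data and the verified data are both available as dictionaries of field names to values. The function name `field_accuracy` and the sample invoice fields are hypothetical and not taken from any particular IDP.

```python
def field_accuracy(extracted, ground_truth):
    """Return the fraction of verified fields that the IDP extracted exactly."""
    if not ground_truth:
        raise ValueError("ground_truth must contain at least one field")
    correct = sum(
        1
        for name, expected in ground_truth.items()
        if extracted.get(name) == expected
    )
    return correct / len(ground_truth)


# Hypothetical sample: values verified by a person vs. values returned by the IDP.
truth = {
    "invoice_no": "INV-1001",
    "total": "250.00",
    "date": "2023-01-15",
    "vendor": "Acme",
}
output = {
    "invoice_no": "INV-1001",
    "total": "260.00",  # wrong value extracted
    "date": "2023-01-15",
    "vendor": "Acme",
}

accuracy = field_accuracy(output, truth)  # 3 of 4 fields match
print(f"Field-level accuracy: {accuracy:.0%}")
```

Exact string matching is the strictest possible comparison. A real evaluation might normalize dates and amounts before comparing, and would run over a representative sample of documents rather than a single one.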
Confidence level – Confidence level is the ability of an IDP to process a specific document format reliably. You need to evaluate the coverage that the IDP can provide for the data that you deal with, because some IDPs perform better than others in some scenarios. Generic IDPs might be able to process a lot of data formats but miss some of the formats that matter to you. Specialized IDPs might be able to process 100% of the data formats in a specific process but not do so well in another process.
Both accuracy and confidence level are key parameters for evaluating an IDP. An IDP with a low confidence level and high accuracy might not be able to process a specific document type most of the time, but when it does, the accuracy is very high. Generally speaking, both parameters should be very high, but this is where the problem of false positives creeps in.
The perils of false positives
False positives – A false positive occurs when an IDP reports that it has accurately extracted data from a document but the data is wrong. Vendors tune their IDPs aggressively to reach 100% coverage. Doing so relaxes the threshold for accuracy or reduces the confidence level required, and that leads to false positives.
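The trade-off between coverage and false positives can be made concrete with a small calculation. The sketch below assumes that the IDP reports a numeric confidence score between 0 and 1 for each document and that a person has checked whether each extraction was actually correct. Both the score and the sample data are illustrative assumptions, and the function name is hypothetical.

```python
def coverage_and_false_positive_rate(results, threshold):
    """results: list of (confidence, was_correct) pairs for checked documents.

    A document is accepted when its confidence meets the threshold.
    Returns (coverage, false_positive_rate), where coverage is the share of
    documents accepted and the false positive rate is the share of accepted
    documents whose extracted data was wrong.
    """
    accepted = [correct for confidence, correct in results if confidence >= threshold]
    coverage = len(accepted) / len(results)
    wrong = sum(1 for correct in accepted if not correct)
    false_positive_rate = wrong / len(accepted) if accepted else 0.0
    return coverage, false_positive_rate


# Hypothetical checked sample of ten documents.
results = [
    (0.99, True), (0.97, True), (0.95, True), (0.90, True), (0.85, False),
    (0.80, True), (0.70, False), (0.60, False), (0.55, True), (0.40, False),
]

strict = coverage_and_false_positive_rate(results, 0.9)
relaxed = coverage_and_false_positive_rate(results, 0.5)
print("strict threshold :", strict)
print("relaxed threshold:", relaxed)
```

In this made-up sample, lowering the threshold from 0.9 to 0.5 raises coverage from 4 documents to 9, but 3 of those 9 accepted documents contain wrong data that the system reports as correct.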
Solution to false positives
The problem with false positives is that the IDP reports that it has correctly extracted the data when there are in fact errors in it. This could lead to a faulty analysis of a customer's credibility or to non-compliance with regulations, which in turn leads to costs and mistrust in the system. To address this, the process would have to be changed so that a human reviews every document processed, and the gains from automation would be reduced dramatically.
A better way is to not compromise on accuracy and confidence levels, and to let the IDP flag the documents that it could not process so that they can be passed on to a human operator. The documents that the IDP could process would then have a very high level of accuracy and can move forward in the automated process.
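The routing described above can be sketched in a few lines. This is only an illustration under stated assumptions: each document is a dictionary with a hypothetical `confidence` value (or `None` when the IDP could not process it), and the 0.9 threshold is an arbitrary example, not a recommended setting.

```python
def route_documents(documents, threshold):
    """Split documents into an automated queue and a human review queue.

    A document goes to human review when the IDP produced no confidence
    value for it or when the value is below the threshold.
    """
    automated, human_review = [], []
    for doc in documents:
        confidence = doc.get("confidence")
        if confidence is not None and confidence >= threshold:
            automated.append(doc)
        else:
            human_review.append(doc)
    return automated, human_review


# Hypothetical batch of processed documents.
batch = [
    {"id": "doc-1", "confidence": 0.98},
    {"id": "doc-2", "confidence": 0.62},
    {"id": "doc-3", "confidence": 0.95},
    {"id": "doc-4", "confidence": None},  # the IDP could not process this one
]

automated, human_review = route_documents(batch, threshold=0.9)
print("automated   :", [doc["id"] for doc in automated])
print("human review:", [doc["id"] for doc in human_review])
```

Keeping the threshold strict means a human operator only sees the documents the IDP was unsure about, instead of having to re-check every document.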