Game Changer: Google Document AI - Solution for Complex Document Parsing

Exploring the possibilities of 'DocumentAI'

Documents frequently remain the lifeblood of business processes. They get created, circulated, adjusted, forwarded, and in many cases, lost within giant directories.

Solving the Document Pipeline Problem

Even a well-designed document pipeline suffers from attrition. A channel may suffer from dark spots, where documents are stored within the system but, for whatever reason, cannot be searched for and located when needed. Worst of all is a lack of a document processing pipeline altogether. Without processes and functional systems for ingesting, storing, and discovering documents, your organization is effectively operating in the dark.

There is room in this process for an AI/ML categorization system to add value by making sense of large volumes of unstructured data. Before getting to that point, it is crucial to understand the elements that make up a modern document discovery solution.

The Document Ingestion Process

Whether documents are scanned in or are entirely digital, the fundamental issue remains the same: the data is usually semi-structured or unstructured. Even if some standardized fields are relatively consistent, upfront processing is still required to make the data usable.

This can be done with an ELT (extract, load, transform) process, which aims to comprehensively transform the data into a structured, relational format. This data can then be stored in a data warehouse and used for analytics in a presentation layer. An alternative, however, is to retain the documents in their original formats and instead build a rich layer of metadata around them.

We at Periscope recommend the second approach for building a document search and discovery system, because it is easier and less disruptive to establish within an existing document pipeline. It is well suited to documents that are preserved in their original formats. During ingestion, we use Google's DocumentAI to perform Optical Character Recognition (OCR), which lets the pipeline handle documents with imperfections and handwriting. Additionally, speech audio can be transcribed with Speech-to-Text (automatic speech recognition) and handled within the same pipeline as more conventional documents.
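To make the "metadata around the original" idea concrete, here is a minimal sketch in Python. It assumes the OCR or transcription step has already produced text for each file; the `DocumentRecord` and `MetadataIndex` names, the in-memory store, and the example URIs are all hypothetical and stand in for whatever storage and search backend a real pipeline would use.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class DocumentRecord:
    """Metadata kept alongside a document that stays in its original format."""
    uri: str        # location of the untouched original file (hypothetical)
    ocr_text: str   # text produced upstream by OCR or speech transcription
    tags: set[str] = field(default_factory=set)  # manual or model-assigned tags


class MetadataIndex:
    """In-memory stand-in for a searchable metadata store (illustrative only)."""

    def __init__(self) -> None:
        self._records: dict[str, DocumentRecord] = {}

    def ingest(self, uri: str, ocr_text: str, tags=()) -> DocumentRecord:
        # The original file is never modified; only metadata is recorded.
        record = DocumentRecord(uri=uri, ocr_text=ocr_text, tags=set(tags))
        self._records[uri] = record
        return record

    def search(self, term: str) -> list[str]:
        """Return URIs whose text or tags contain the term, case-insensitively."""
        needle = term.lower()
        return sorted(
            record.uri
            for record in self._records.values()
            if needle in record.ocr_text.lower()
            or needle in {tag.lower() for tag in record.tags}
        )


index = MetadataIndex()
index.ingest("gs://example-bucket/a.pdf", "Invoice for ACME", tags=["invoice"])
index.ingest("gs://example-bucket/b.wav", "Meeting transcript about loans")
print(index.search("loans"))
```

Because scanned documents and transcribed audio both end up as a record with text and tags, the same search path serves both, which is the point of handling them in one pipeline.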

Document AI Solution

There are three critical components in a Google Cloud Document AI solution:

Figure 1 depicts a sample architecture using Document AI to do document processing.

Sample Document AI Components

Figure 1. Sample Document AI Architecture

The High-Level Process

Figure 2. The high-level process for performing machine learning on a document, leveraging Google Cloud AutoML for custom-trained models

Sample Implementation of Document AI

DocumentAI can be used to meet numerous unstructured document processing requirements. One clear example is to ingest, OCR, extract, understand, and store intelligence reports that would otherwise live in unstructured document repositories. Figure 3 shows the life cycle of a message and how Google Cloud DocumentAI would process and analyze the document using AI/ML technology.

Figure 3. Sample Document AI Workflow

Keyword Tagging for Machine Learning

Although manual tagging through a central portal can already be a potent document pipeline tool for many organizations, automating this process opens up new horizons. The manual tagging performed on documents is the same kind of labeling required to build a machine learning model. Because of this, a modern document discovery system is ideally suited for implementing machine learning classification functionality.

Once a sufficient volume of documents has been manually tagged, DocumentAI allows that tagged data to be rolled into a model for Google's Machine Learning Services. This model can then be used to automate the process of categorizing documents, providing companies with a solution that can break through the most congested document bottlenecks. The document pipeline can then be augmented further and made even more accessible with tools that use natural language queries to search for documents or critical statements.
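The step from manually tagged documents to automatic categorization can be illustrated with a toy example. The sketch below is not how DocumentAI or Google's managed training works internally; it swaps in a small naive Bayes word-count classifier, written with the standard library only, purely to show how tagged text becomes a model that tags new text. The function names and sample documents are made up for illustration.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase whitespace tokenizer, deliberately simplistic."""
    return text.lower().split()


def train(tagged_docs):
    """Build per-tag word counts and document counts from (text, tag) pairs."""
    word_counts = defaultdict(Counter)
    doc_counts = Counter()
    for text, tag in tagged_docs:
        doc_counts[tag] += 1
        word_counts[tag].update(tokenize(text))
    return word_counts, doc_counts


def classify(text, model):
    """Return the most likely tag using naive Bayes with add-one smoothing."""
    word_counts, doc_counts = model
    vocabulary = {word for counts in word_counts.values() for word in counts}
    total_docs = sum(doc_counts.values())
    best_tag, best_score = None, -math.inf
    for tag in sorted(doc_counts):
        total_words = sum(word_counts[tag].values())
        score = math.log(doc_counts[tag] / total_docs)  # prior for this tag
        for word in tokenize(text):
            # Add-one smoothing keeps unseen words from zeroing the score.
            score += math.log(
                (word_counts[tag][word] + 1) / (total_words + len(vocabulary))
            )
        if score > best_score:
            best_tag, best_score = tag, score
    return best_tag


manually_tagged = [
    ("invoice total amount due", "invoice"),
    ("payment invoice number", "invoice"),
    ("loan application borrower income", "loan"),
    ("mortgage loan interest rate", "loan"),
]
model = train(manually_tagged)
print(classify("borrower loan rate", model))
```

In a real deployment the manually tagged set would be far larger, and the training would be delegated to the managed service rather than hand-rolled like this.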

This is just one example of how Google Cloud DocumentAI can efficiently bring structure to unstructured data and use that data to draw new insights that inform analysts and commanders. One of the most significant benefits of DocumentAI is that both pre-trained and custom-trained AI/ML models are available, depending on the type of document processing required.

The wrong term, good concept: Defining HITL

AI models don't make predictions with 100% confidence, as their "understanding" of data is primarily based on statistics, which lacks the concept of absolute certainty as humans use it in practice. To account for this inherent uncertainty, some AI systems allow humans to interact with them directly.

Through this interaction (feedback), the machine adjusts its "view of the world." This works much like teaching a child who points at a cat and says "woof woof": through repeated feedback ("No, that's a cat"), the child learns to connect the pieces.

With these two key terms, confidence and feedback, in place, we can formally define the concept:

HITL refers to systems that allow humans to give direct feedback to a model for predictions below a certain confidence level.

In practice, you need to determine what level of confidence is acceptable for the process. If it is acceptable for some wrong predictions to slip through, you can set the threshold relatively low, which in turn requires much less manual intervention. In other cases, you want to be sure that the system only records correct predictions, which calls for a higher threshold and more human review.
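The threshold trade-off is easy to see in a few lines of Python. This is a minimal sketch under stated assumptions: the `Prediction` class, the `route` function, and the sample values are hypothetical and are not part of any Google API. Predictions at or above the threshold are accepted automatically, and everything below it goes to a human review queue.

```python
from dataclasses import dataclass


@dataclass
class Prediction:
    field_name: str
    value: str
    confidence: float  # model confidence between 0.0 and 1.0


def route(predictions, threshold):
    """Split predictions into auto-accepted ones and ones needing human review."""
    accepted, needs_review = [], []
    for prediction in predictions:
        if prediction.confidence >= threshold:
            accepted.append(prediction)
        else:
            needs_review.append(prediction)
    return accepted, needs_review


predictions = [
    Prediction("premium", "1,200.00", 0.97),
    Prediction("insurer_name", "Acme Insurance", 0.88),
    Prediction("policy_number", "PX-1O23", 0.62),
]

for threshold in (0.5, 0.9):
    accepted, needs_review = route(predictions, threshold)
    print(threshold, len(accepted), "accepted,", len(needs_review), "for review")
```

With the lower threshold nothing reaches a human, including the doubtful policy number; with the higher one, two of the three fields are reviewed. Picking the threshold is therefore a business decision about how costly a wrong value is compared with a reviewer's time.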

This solution offers a workflow and a user interface (UI) that people can use to label, review, and edit the data extracted from documents. The tool has UI features that streamline the labelers' workflow and filter the output based on the confidence threshold. It also allows companies to manage their pool of labelers: a client company can use its own employees as labelers or hire a Google HITL workforce to accomplish the task.

Humans and machines, hand in hand

Human-in-the-Loop aims to achieve what neither a human being nor a machine can achieve alone. Humans step in and intervene when a machine can't solve a problem, which creates a continuous feedback loop. With constant feedback, the algorithm learns and produces better results over time.

Image: the importance of human intervention in different types of data

Typically, there are two types of machine learning where you can integrate HITL approaches: supervised and unsupervised learning.

In supervised learning, experts use labeled datasets to train algorithms to learn a function that maps inputs to outputs. The trained algorithm can then apply that mapping to new examples, allowing it to make correct predictions for unlabeled data.

In unsupervised learning, unlabeled datasets are fed to the algorithms, which must learn to find structure in the unlabeled data and retain it. This falls under the Human-in-the-Loop Deep Learning approach.

Business Use Cases Where Document AI Is a Game Changer

Insurance Quotes - Data Extraction

Insurance quotes arrive from multiple vendors in different document formats, which makes extracting the following data fields a challenging task: Premium, Insurer Name, Insurer Address, Insurer Phone, Agency Name, Insurance Agent Name, Insurance Agent Address, Insurance Agent Email, Customer Name, Customer Address, Customer Phone, Agency Agent Name, Agent Address, Agency Agent Email, Agency Agent Phone, Current Policy Number, Policy Type, Policy Start Date, Policy End Date, and Commission.

A custom processor in DocumentAI helps solve this problem.
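Once a processor has returned candidate values, the downstream step is usually to collapse them into one record per quote. The sketch below shows one way to do that in Python. The entity shape (plain dicts with `type`, `mention_text`, and `confidence` keys) is an assumption for illustration and is not the DocumentAI response schema; the field names are a hypothetical subset of the list above.

```python
# Hypothetical subset of the quote fields, written as snake_case keys.
QUOTE_FIELDS = (
    "premium",
    "insurer_name",
    "current_policy_number",
    "policy_start_date",
    "policy_end_date",
)


def build_quote_record(entities, fields=QUOTE_FIELDS):
    """Keep the highest-confidence value for each expected field.

    Fields with no extracted value stay None so gaps remain visible
    to whoever reviews the record.
    """
    record = {name: None for name in fields}
    best_confidence = {name: -1.0 for name in fields}
    for entity in entities:
        name = entity["type"]
        if name in record and entity["confidence"] > best_confidence[name]:
            record[name] = entity["mention_text"].strip()
            best_confidence[name] = entity["confidence"]
    return record


# Made-up extraction output from one vendor's quote.
entities = [
    {"type": "premium", "mention_text": " 1,200.00 ", "confidence": 0.95},
    {"type": "premium", "mention_text": "1,280.00", "confidence": 0.41},
    {"type": "insurer_name", "mention_text": "Acme Insurance", "confidence": 0.90},
    {"type": "page_footer", "mention_text": "Page 1 of 3", "confidence": 0.99},
]
print(build_quote_record(entities))
```

Normalizing every vendor's output into the same record shape is what lets quotes in different formats be compared side by side.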

The architecture of the implementation with DocumentAI

Lending AI

Challenges faced by lending companies during loan origination

Manual loan origination can take 35 to 40 days.

A Document AI-integrated lending experience takes 80% less time.

With Lending DocumentAI, we can automate the processing of mortgage loan application documents, which helps speed up the process for both borrowers and lenders.


Helped Docusign train a model on 1,000 contracts to identify the blank fields that need user input.

Without DocumentAI, this was a manual process.

Procurement Doc AI

A specialized model validates procurement data, such as invoice numbers and IDs, which eliminates the need for human intervention in that step.

  • Digitizing books for e-readers
  • Filling out medical intake forms at doctor's offices
  • Submitting expense reports based on receipts and invoices
  • Authenticating identity based on ID cards
  • Approving loans based on income information from tax forms
  • Understanding contracts for key business agreements

How Periscope can Help

Reach out to us to discuss, and we can help simplify your complex document data extraction problems by implementing an automated document extraction pipeline using Google Cloud DocumentAI.