Dec 6

How to Extract Data From Insurance Declaration Pages Using AI

Jon Corrin
Chief Executive Officer

A few months ago, our team started working on a feature to streamline the agent workflow of extracting data from an insurance declaration page (dec page) and integrating it into the various systems agents use. This feature quickly captured the attention of our customer base, who already work with these documents, because it gave them hope that life could be easier and deal flow could move faster. We partnered with a third party, Sensible, to solve this problem, but a lot of my customers are now asking me whether this is something AI can solve on its own. In this article, I'm going to talk about exactly that: how AI can extract data from dec pages, ACORD forms, and supplemental applications.

💡 Overcoming the Challenges of PDF Data Integration Today

The short answer is that AI can't do it well today without extensive training and work on your model, but there's hope for the future. There are a few challenges when trying to extract data from PDFs like the ones listed above:

  1. For electronic PDFs, the field names can vary from one PDF to another, which makes the data inconsistent and hard to standardize.
  2. For image-like PDFs, you first have to run Optical Character Recognition (OCR) to get the text out of the PDF, and a poor OCR result lowers the quality of everything extracted from it.
  3. For integrating that data into various systems, you still need to reshape the extracted data (assuming it's perfect) into the format required by the vendor you're trying to integrate with, and that assumes you have access to their APIs and can write code against them.
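To make the first challenge concrete, here is a minimal sketch of how varying field labels could be mapped onto one standard set of names. The alias table, the canonical field names, and the function names are all hypothetical and chosen for illustration; a real mapping would need to be built from the PDFs you actually receive.

```python
import re

# Hypothetical alias table: canonical field name -> labels seen on different PDFs.
FIELD_ALIASES = {
    "policy_number": ["policy number", "policy no", "policy #", "pol num"],
    "named_insured": ["named insured", "insured name", "policyholder"],
    "effective_date": ["effective date", "policy effective date", "eff date"],
}


def normalize_label(label: str) -> str:
    """Lowercase the label, turn punctuation (except '#') into spaces, collapse spaces."""
    cleaned = re.sub(r"[^a-z0-9# ]+", " ", label.lower())
    return " ".join(cleaned.split())


# Reverse lookup built once: normalized alias -> canonical field name.
ALIAS_TO_CANONICAL = {
    normalize_label(alias): canonical
    for canonical, aliases in FIELD_ALIASES.items()
    for alias in aliases
}


def standardize_fields(raw_fields: dict) -> tuple:
    """Split raw PDF fields into (standardized, unmatched) dictionaries."""
    standardized = {}
    unmatched = {}
    for label, value in raw_fields.items():
        canonical = ALIAS_TO_CANONICAL.get(normalize_label(label))
        if canonical is None:
            # Keep unknown labels so a person can review them later.
            unmatched[label] = value
        else:
            standardized[canonical] = value.strip()
    return standardized, unmatched
```

Keeping the unmatched labels instead of dropping them is a deliberate choice in this sketch: every new carrier layout tends to introduce labels the table has not seen yet, and those are exactly the ones worth reviewing.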

When introducing AI, the ideal would be a "multi-modal" model (this simply means the AI can take in and produce media types beyond text, like images and audio) that extracts the data itself, in exactly the format you need, so you can push it into your systems and automate your processes. This is where current technology falls short.

🚀 The Future of PDF Data Extraction Using AI

Large AI companies like OpenAI, Google, Facebook, Amazon, and more are already working on making it possible to extract standardized data from PDFs and overcome the challenges above. Over the summer, OpenAI came out with a feature called "functions" that lets developers like the ones we have here at XILO describe the format we want the data returned in when extracting data from PDFs. Zapier uses technology like this in their new conversational Zap creation feature, where you can simply tell the bot what type of Zap trigger and action you want and it'll generate it for you (although this doesn't work extremely well just yet). I have found that the technology for this task has improved over the last year, and it will continue to get better as these large companies become more competitive in the AI space.
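As a rough illustration of the "functions" idea, the sketch below defines the kind of description a developer might hand to the model (a name, a description, and a JSON Schema for the parameters) and then checks the JSON string that comes back. The function name `record_dec_page` and its fields are hypothetical, and no API call is made here; the returned string is simulated.

```python
import json

# Hypothetical function definition: a name, a description, and a JSON Schema
# describing the fields we would like the model to fill in.
EXTRACT_DEC_PAGE_FUNCTION = {
    "name": "record_dec_page",
    "description": "Record the fields found on an insurance declaration page.",
    "parameters": {
        "type": "object",
        "properties": {
            "policy_number": {"type": "string"},
            "named_insured": {"type": "string"},
            "effective_date": {"type": "string", "description": "YYYY-MM-DD"},
            "total_premium": {"type": "number"},
        },
        "required": ["policy_number", "named_insured"],
    },
}


def parse_function_arguments(arguments_json: str, function_def: dict) -> dict:
    """Parse the JSON string returned by the model and check required fields."""
    try:
        data = json.loads(arguments_json)
    except json.JSONDecodeError as exc:
        raise ValueError(f"model returned invalid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("model returned JSON that is not an object")

    schema = function_def["parameters"]
    missing = [key for key in schema["required"] if key not in data]
    if missing:
        raise ValueError(f"missing required fields: {missing}")

    # Drop any keys the schema does not describe.
    return {key: value for key, value in data.items() if key in schema["properties"]}
```

The schema only describes the format you are asking for. The model can still return malformed or incomplete JSON, which is why the parsing step validates the result rather than trusting it.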

If you're interested in seeing how it works today, go ahead and submit the chat below, which asks OpenAI to extract insurance data from the uploaded dec page. This is just an example dec page and the results are inconsistent, but you can see how this approach may have a promising future! Here's how to try it out:

Step 1: Review the PDF

Step 2: Click Send

Step 3: Review the results sent back by OpenAI

I can't know exactly what your result was, but when I tested it myself, sometimes I would get a perfect result and sometimes an imperfect one. And because the response comes back as plain text, you'd still have to structure it to fit the needs of the vendor in order to use it for integrations.
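That last structuring step might look something like the sketch below. The vendor payload shape (`PolicyNumber`, `InsuredName`, `EffectiveDate`, `PremiumCents`) is invented for illustration and does not describe any real vendor's API; the input is assumed to be a dictionary of strings already pulled out of the response.

```python
from datetime import datetime
from decimal import Decimal


def to_vendor_payload(record: dict) -> dict:
    """Reshape an extracted record into a hypothetical vendor's required format."""
    # Assumption: the dec page prints dates as MM/DD/YYYY and the vendor wants ISO dates.
    effective = datetime.strptime(record["effective_date"], "%m/%d/%Y")

    # Assumption: the vendor wants the premium as an integer number of cents.
    premium_text = record["total_premium"].replace("$", "").replace(",", "")
    premium_cents = int(Decimal(premium_text) * 100)

    return {
        "PolicyNumber": record["policy_number"].replace(" ", "").upper(),
        "InsuredName": record["named_insured"].strip(),
        "EffectiveDate": effective.strftime("%Y-%m-%d"),
        "PremiumCents": premium_cents,
    }
```

If a date or amount is not in the expected form, `strptime` or `Decimal` raises an error instead of silently passing bad data along, which is usually what you want before pushing anything into another system.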

🤖 Can You Extract Data From Dec Pages Using ChatGPT?

ChatGPT is a conversational bot built by OpenAI that uses OpenAI's APIs, just as you did above. Your results in ChatGPT will vary slightly from the API's because of the data the AI is trained on, but they should not be vastly different. That said, do not upload PDFs with client data into ChatGPT! OpenAI's API is much more secure but still shouldn't be trusted; ChatGPT is openly not secure and, in its current state, will even reject such requests.

🌟 The Future of AI in Insurance: Introducing AI Agents

Looking towards the future, we anticipate the emergence of what are called "AI Agents": sophisticated AI systems that can handle multi-step tasks. These AI virtual assistants could revolutionize the industry by managing the entire process of data extraction, from uploading a PDF to cloud services and scanning it using OCR to structuring the data and integrating it into various systems. This advanced level of AI operation is an area of active development among leading AI companies. I've tried the technology myself and it's still in its infancy, but the opportunities are limitless when AI can assess its own efficacy in a specific task and use that to decide whether it needs to do a better job or not. Imagine syncing this up to a platform like Zapier that offers thousands of integrations at your fingertips. The world would be our oyster when it comes to automating processes and eliminating data entry from an insurance agent's workflows.
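The "assess its own efficacy and decide whether to try again" idea can be sketched as a simple loop. Everything here is an assumption made for illustration: the required field list, the completeness score used as a stand-in for self-assessment, and the `extract` callable, which represents whatever OCR and model steps a real agent would run.

```python
from typing import Callable

# Hypothetical fields an extraction must contain to count as complete.
REQUIRED_FIELDS = ("policy_number", "named_insured", "effective_date")


def completeness_score(record: dict) -> float:
    """Fraction of required fields that came back non-empty (0.0 to 1.0)."""
    filled = sum(1 for field in REQUIRED_FIELDS if record.get(field))
    return filled / len(REQUIRED_FIELDS)


def run_extraction_agent(
    document_text: str,
    extract: Callable[[str, int], dict],
    threshold: float = 1.0,
    max_attempts: int = 3,
) -> tuple:
    """Retry extraction until the score meets the threshold or attempts run out.

    Returns (best_record, best_score, attempts_used).
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")

    best_record, best_score = {}, 0.0
    attempts_used = 0
    for attempt in range(1, max_attempts + 1):
        attempts_used = attempt
        record = extract(document_text, attempt)
        score = completeness_score(record)
        if score > best_score:
            best_record, best_score = record, score
        if best_score >= threshold:
            break  # Good enough: stop retrying.
    return best_record, best_score, attempts_used
```

A completeness check is a crude proxy for quality, since a field can be present and still wrong. A real agent would need a stronger self-check, but the control flow of scoring the result and retrying would look similar.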

In conclusion, AI holds immense potential for automating tasks such as extracting data from PDFs and integrating it into the systems that agents use to service their customers, but it's not fully ready yet. As AI technology evolves, agents who leverage these advancements will likely experience significant gains in efficiency and speed.

👥 Building a Community: AI For Insurance Agents

As I continue to explore the role of AI in insurance, I am dedicated to building a community around this topic. If you are interested in AI's potential in the insurance industry, feel free to connect with me on LinkedIn. Together, we can discuss and shape the future of AI in our field.
