From Templates to Prompts: How Document Capture Is Embracing GenAI


Generative AI is making a big impact on enterprise applications, and the image capture space is starting to see these changes as well. GenAI brings new ways to handle tasks that could reshape how this field works, and most intelligent capture platforms now integrate generative AI to enhance document capture and understanding. In this article we will look at how Tungsten (formerly Kofax) TotalAgility has brought LLMs into its low-code process orchestration through AI-driven extraction and conversational interfaces. These capabilities let users interact with the system in natural language, build solutions much faster, and draw insights from documents.

Sharing a few thoughts on how GenAI could reshape image capture work.

  • Building capture projects will take less time because GenAI makes it faster and easier to set up data extraction. 

  • Complex documents can be handled without needing a lot of custom rules or templates. 

  • We can still use rule-based methods to double-check important data fields that GenAI pulls out, to make sure they are correct. 

  • Existing capture tools may shift their role: instead of doing all the extraction work, they could be used to catch errors or “hallucinations” from GenAI within document processing workflows. 

  • Processing large numbers of documents will still need applications that can track and audit everything that happens. 

  • User interfaces where people can review and tag documents will still matter, especially when the AI isn’t confident. Straight-through processing is risky at this point in time.

  • The real value will come from writing smart prompts that help GenAI get the right data out of documents. 

  • Prompts used for classifying and extracting data can be reused across different tools, making them flexible and easy to transfer between platforms. 
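As an illustration of the rule-based validation idea above, here is a minimal Python sketch. The field names and checks (invoice_date, total, vat_number) are hypothetical examples of my own, not TotalAgility's actual schema or API:

```python
import re
from datetime import datetime

def validate_invoice_fields(fields: dict) -> list[str]:
    """Run deterministic checks on fields an LLM extracted from an invoice.

    Returns a list of human-readable problems; an empty list means all
    checks passed and the document can skip manual review.
    """
    problems = []

    # Dates must parse in a known format.
    try:
        datetime.strptime(fields.get("invoice_date", ""), "%Y-%m-%d")
    except ValueError:
        problems.append("invoice_date is not a valid YYYY-MM-DD date")

    # Totals must be numeric, and line items must sum to the total.
    try:
        total = float(fields.get("total", "x"))
        line_sum = sum(float(v) for v in fields.get("line_amounts", []))
        if abs(total - line_sum) > 0.01:
            problems.append("line amounts do not sum to the stated total")
    except ValueError:
        problems.append("total is not numeric")

    # VAT numbers follow a fixed pattern (simplified EU-style check).
    if not re.fullmatch(r"[A-Z]{2}\d{8,12}", fields.get("vat_number", "")):
        problems.append("vat_number does not match the expected pattern")

    return problems
```

Any document with a non-empty problem list would be routed to a human review queue rather than processed straight through.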

In this article we will cover some of the recent additions to the TotalAgility platform, focusing on features with GenAI capabilities.  

[Conversational Copilot Interfaces] 

  1. Copilot for Development - A user can type a request or upload a hand-drawn design sketch for a workflow, and Copilot generates workflows, data models, forms, and business rules from that description in real time using LLMs. This speeds up solution development by allowing citizen developers to build automation using everyday language. 

  2. Copilot for Extraction - Users describe in natural language the fields or information they want from a document, and the platform generates the extraction setup. Users can describe what they want to extract in multiple languages, cutting model development time by up to 80%. It works on unstructured or highly variable documents without lengthy training, by intelligently breaking down documents, extracting text, and applying the user’s prompt to retrieve the specified data points. Copilot for Extraction eliminates manual field definition and training, handles high variability in layouts, and reduces maintenance overhead. 

  3. Copilot for Insights - Provides a conversational AI assistant for data analysis and document understanding. Users can converse with their data naturally, asking questions and getting instant answers with annotations pointing back to the source, even across large collections of documents in a repository. This is still a work in progress; I believe more use cases and integrations will come in the near future.
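The describe-fields-in-natural-language pattern behind a feature like Copilot for Extraction can be sketched as a prompt builder. The message structure and instructions below are my own assumption of how such a prompt might look; the platform's internal prompts are not public:

```python
def build_extraction_prompt(field_descriptions: dict[str, str],
                            document_text: str) -> list[dict]:
    """Build a chat-completion message list asking an LLM to extract fields.

    field_descriptions maps a field name to the user's natural-language
    description of it; document_text is the document's text layer.
    """
    field_spec = "\n".join(f"- {name}: {desc}"
                           for name, desc in field_descriptions.items())
    system = ("You extract fields from business documents. "
              "Return a JSON object with exactly the requested keys; "
              "use null when a field is not present in the document.")
    user = f"Fields to extract:\n{field_spec}\n\nDocument text:\n{document_text}"
    return [{"role": "system", "content": system},
            {"role": "user", "content": user}]
```

Because the field descriptions are plain language, the same prompt can be handed to any chat-completion-style model, which is what makes this approach portable across tools.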

[Intelligent Document Understanding with LLMs] 

A feature called Auto-Extract uses an LLM to automatically pull key-value data from documents with no prior template or training. This is essentially zero-shot extraction: given a document, the AI identifies important fields (like dates, totals, and names) without any extraction training, reducing build time by up to 90%.  
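Zero-shot extraction replies still need defensive handling before they reach a workflow. A minimal sketch, assuming the LLM was asked to reply with JSON; the helper name is mine, not a platform API:

```python
import json

def parse_auto_extract_reply(reply_text: str,
                             expected_keys: list[str]) -> tuple[dict, list[str]]:
    """Parse an LLM's JSON reply from zero-shot extraction and flag gaps.

    Keys the model omitted or set to null are returned separately, so the
    workflow can route those fields to manual review instead of failing.
    """
    try:
        data = json.loads(reply_text)
    except json.JSONDecodeError:
        # Unparseable reply: send everything to review.
        return {}, list(expected_keys)
    extracted = {k: data[k] for k in expected_keys if data.get(k) is not None}
    missing = [k for k in expected_keys if k not in extracted]
    return extracted, missing
```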

TotalAgility also integrates with Azure AI Document Intelligence (or Google Document OCR) for improved text-layer extraction from documents (especially handwritten data), and then feeds that text to the generative Auto-Extract. The idea is to get the best text layer available and send it as context to an advanced LLM along with the prompt (with the document's text layer supplied as context, RAG-style), leading to the best extraction results.
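The "best text layer available" step can be as simple as falling back to an OCR engine's output when a document's embedded text is sparse. This is a hypothetical sketch of that decision, and the character threshold is an illustrative assumption:

```python
def choose_text_layer(embedded_text: str, ocr_text: str,
                      min_chars: int = 50) -> str:
    """Pick the better text layer to send to the LLM as context.

    Scanned or handwritten pages often yield little or no embedded text,
    so fall back to the OCR engine's output when the embedded layer
    looks effectively empty.
    """
    if len(embedded_text.strip()) >= min_chars:
        return embedded_text
    return ocr_text
```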

[Knowledge Integration with LLMs] 

In the roadmap, generative AI is coupled with knowledge sources to produce field results. It uses techniques like retrieval-augmented generation (RAG) to feed corporate (document/image) data to the LLMs. This means the tool can connect to enterprise content (documents, databases, CRM records) so that the AI’s answers reference up-to-date, relevant information, mitigating hallucinations. For example, a Copilot might pull in relevant policy documents or past cases from a knowledge base when a user asks a question, ensuring the answer is grounded in actual data.

Intelligent Search & Retrieval allows users to search across documents and get summaries of key points or direct Q&A with cited references. This feature effectively uses an LLM to read complex documents and pinpoint information, boosting knowledge workers in content-heavy tasks. The idea is to intelligently chunk long-form content and provide a curated context to the LLM, which reduces the risk of hallucinations. By combining IDP extraction, search, and the LLM, TotalAgility can deliver summaries of lengthy documents and answer queries with evidence.  
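The chunk-and-curate step can be sketched with a deliberately simple keyword-overlap ranking. Production RAG systems typically use embedding similarity instead; this helper is my illustration of the principle, not the platform's implementation:

```python
def top_chunks(question: str, chunks: list[str], k: int = 3) -> list[str]:
    """Rank document chunks by word overlap with the question, keep top k.

    Only the best-matching chunks are then sent to the LLM as context,
    which keeps prompts small and reduces hallucination risk.
    """
    q_words = set(question.lower().split())
    scored = sorted(
        chunks,
        key=lambda c: len(q_words & set(c.lower().split())),
        reverse=True,
    )
    return scored[:k]
```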

Coming to the technical details, you can connect LLMs to the TotalAgility platform in one of three ways.  

[1] OpenAI ChatGPT - Integration with OpenAI’s models; most of the platform’s GenAI features are available with this option. 

[2] Azure OpenAI - Integration with Azure OpenAI generative AI models; likewise, most of the platform’s GenAI features are available with this option. 

[3] Custom LLM - Allows us to integrate other language models by defining a REST API or a custom workflow as the intermediary. Here you can invoke a chat completion API from other providers. This means an enterprise could use an on-prem model or another third-party LLM (e.g. via AWS SageMaker or Google Vertex AI) by writing a small adapter process.  

Note - The Copilot for Extraction and Auto-Extract features work only with OpenAI ChatGPT or Microsoft Azure OpenAI, not with custom LLMs. In other words, TotalAgility’s integrations with OpenAI and Azure OpenAI are tighter.
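For the Custom LLM option, the adapter essentially boils down to assembling a chat-completion HTTP request. A sketch assuming an OpenAI-compatible endpoint; the base URL and model name are placeholders:

```python
def build_chat_request(base_url: str, api_key: str, prompt: str,
                       model: str = "my-model") -> dict:
    """Assemble keyword arguments for requests.post() targeting an
    OpenAI-compatible chat-completions endpoint.

    The URL path and payload shape follow the widely adopted
    chat-completions convention; adjust both for your provider.
    """
    return {
        "url": f"{base_url.rstrip('/')}/v1/chat/completions",
        "headers": {"Authorization": f"Bearer {api_key}",
                    "Content-Type": "application/json"},
        "json": {"model": model,
                 "messages": [{"role": "user", "content": prompt}]},
        "timeout": 60,
    }
```

Sending it is then `requests.post(**build_chat_request(...))`, and the reply text is read from `choices[0].message.content` in the JSON response.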

[Summary] 

Generative AI is transforming how document capture and automation projects are delivered. These capabilities simplify the handling of complex or unstructured documents, which previously required time-consuming rule- or template-based configuration. While GenAI offers powerful automation, it is equally important to validate its outputs: rule-based checks and human-in-the-loop interfaces remain critical, especially in high-volume environments where accuracy matters. These advances bring speed and cost savings, especially as LLM APIs become more affordable (today, cost is still a big question), but they also introduce challenges. Generative AI models can behave unpredictably, which makes them risky for straight-through processing without oversight.

These new enhancements offer a blueprint for doing this within a governed, enterprise workflow context, combining the conversational power of modern LLMs with the structure of enterprise data, business rules, and audit trails. Other platforms are moving in this direction as well, bringing together document intelligence, BPM, RPA, and generative AI into one solution. To summarize, this approach pairs the strengths of generative AI with the reliability of traditional capture, creating an efficient way to automate document-heavy business processes. 
