[DSC Europe 23] Djordje Grozdic - Transforming Business Process Automation with Retrieval-Augmented Generation and LLMs

Transforming Business Process Automation with Retrieval-Augmented Generation and LLMs
Đorđe Grozdić | November 2023

About Myself
PhD in Machine Learning and Artificial Intelligence
10+ years of hands-on experience in Data Science
Senior Staff Data Scientist & Senior Specialization Lead
Đorđe Grozdić

About Grid Dynamics
Grid Dynamics, a global digital engineering company, co-innovates with the most respected brands in the world to solve complex problems, optimize business operations and better serve customers.
Grid Dynamics is a leading provider of technology consulting, agile custom software development, and data analytics for Fortune 1000 and Global 2000 enterprises undergoing digital transformation.

Grid Dynamics: Prepare to Grow
Grid Dynamics was founded in Silicon Valley in 2006 with the mission to bring emerging technology to large enterprises. With a proven ability to scale globally, we became a trusted tech partner for tier-1 firms.
・4,000 engineers, architects and tech managers
・GDYN: Nasdaq-listed since 2020
・18 countries, including USA, Mexico, UK, Netherlands, Spain, Poland, Serbia, Romania, Moldova, Ukraine, Armenia, Jamaica, India
[Map: headquarters, existing locations, and areas of focused growth. Labeled locations: San Francisco Bay Area (HQ), New York, Portland, Chicago, Dallas, Atlanta, Tampa, Guadalajara, Jamaica, London, Amsterdam, Madrid, Kraków, Belgrade, Kyiv, Lviv, Yerevan, Hyderabad]

Digital Innovation Partner for Fortune 1000
Sectors: Tech, CPG, Finance, Retail, Other, and many more.

Introduction
Business Process Automation (BPA) and the impact of AI

・Necessity of automation in today's competitive business landscape.
・Limitations of traditional document processing:
  ・Inability to manage high volume and complexity.
  ・Slower processes with higher error rates.
・Emergence of Large Language Models (LLMs):
  ・Advancements in complex, human-like text generation.
  ・Challenges with domain-specific tasks.
・Introduction of Retrieval-Augmented Generation (RAG):
  ・Integrates domain-specific data at query time.
  ・Reduces the need for continuous model retraining.
・Advantages of RAG:
  ・Cost-effective and secure.
  ・Provides greater explainability.
  ・Reduces errors and "hallucinations" compared to general-purpose LLMs.
Business Process Automation and LLMs

What is Retrieval-Augmented Generation?
Brief overview of how RAG works.

Retrieval-Augmented Generation (RAG) is a machine learning approach that combines the strengths of information retrieval methods with the generative capabilities of language models: relevant documents are retrieved first and then passed to the model as context for generation.
Architecture of Retrieval-Augmented Generation

RAG in Practice
High-level view of how RAG is applied across different industries.

RAG in Supply Chain

Deep Dive: RFP Processing with RAG
Case study with specifics on how RAG can optimize RFP processing.

・RFP (Request for Proposal) is a document issued by a business or organization when seeking proposals or bids from potential suppliers or service providers.
・Intelligent Document Processing (IDP) tool:
  ・Perform ad hoc analysis of large documents such as contracts and RFPs: ask questions and request summaries.
  ・Automatically fill forms such as RFP responses by generating answers based on your knowledge base.
  ・Control the style of the generated answers and adjust details using natural language instructions.
  ・Automatically validate that generated or manually created documents are consistent with your knowledge base.
  ・Combine the above blocks into complex workflows.
RFP Processing - Use case
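The form-filling and validation blocks listed above could be chained roughly as follows. This is a minimal sketch, not the tool's actual design: `fill_rfp_form`, `FilledField`, `keyword_retrieve`, `draft_generate`, and the two-sentence knowledge base are all hypothetical, and "validation" is reduced to checking that at least one supporting passage was found.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical stand-ins: the real tool's interfaces are not described in the slides.
KNOWLEDGE_BASE = [
    "Support is available 24/7 via email and phone.",
    "All data is encrypted at rest and in transit.",
]
STOPWORDS = {"is", "what", "your", "and", "the", "a"}


def _tokens(text: str) -> set:
    """Lowercased words without surrounding punctuation or stopwords."""
    return {w.strip("?.,").lower() for w in text.split()} - STOPWORDS


def keyword_retrieve(question: str) -> List[str]:
    """Toy retriever: passages sharing at least one content word with the question."""
    q = _tokens(question)
    return [p for p in KNOWLEDGE_BASE if q & _tokens(p)]


def draft_generate(prompt: str) -> str:
    """Placeholder for an LLM call."""
    return "DRAFT ANSWER"


@dataclass
class FilledField:
    question: str
    answer: str
    sources: List[str]
    supported: bool  # True if at least one passage backs the answer


def fill_rfp_form(questions: List[str],
                  retrieve: Callable[[str], List[str]],
                  generate: Callable[[str], str],
                  style: str = "Answer concisely.") -> List[FilledField]:
    """Answer each question from the knowledge base and flag unsupported ones."""
    filled = []
    for question in questions:
        passages = retrieve(question)
        prompt = (
            f"{style}\nUse only the context below.\n\nContext:\n"
            + "\n".join(f"- {p}" for p in passages)
            + f"\n\nQuestion: {question}\nAnswer:"
        )
        # Leave the field empty rather than let the model answer without evidence.
        answer = generate(prompt) if passages else ""
        filled.append(FilledField(question, answer, passages, bool(passages)))
    return filled
```

The `style` argument is where a natural-language instruction such as "Answer formally, in two sentences" would be passed; fields with `supported == False` are candidates for manual review.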

Intelligent Document Processing - Workflow

Intelligent Document Processing

Architecture of RAG
Overview of the architecture with focus on the Retriever, Generator, and Orchestrator.

The Process Flow
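One way to picture the three components named above is as small interfaces wired together. The sketch below is an illustration under stated assumptions: the class names, the word-overlap retriever, and the `EchoGenerator` stub (which simply returns its prompt) are hypothetical and stand in for a vector-based retriever and a real LLM client.

```python
from typing import List, Protocol


class Retriever(Protocol):
    def retrieve(self, query: str, k: int) -> List[str]: ...


class Generator(Protocol):
    def generate(self, prompt: str) -> str: ...


class KeywordRetriever:
    """Toy retriever: ranks passages by the number of words shared with the query."""

    def __init__(self, passages: List[str]) -> None:
        self.passages = passages

    def retrieve(self, query: str, k: int) -> List[str]:
        q = set(query.lower().split())
        scored = [(len(q & set(p.lower().split())), p) for p in self.passages]
        scored = [item for item in scored if item[0] > 0]
        scored.sort(key=lambda item: item[0], reverse=True)
        return [p for _, p in scored[:k]]


class EchoGenerator:
    """Stand-in for an LLM: returns the prompt so the flow can be inspected."""

    def generate(self, prompt: str) -> str:
        return prompt


class Orchestrator:
    """Coordinates the flow: retrieve, build the prompt, generate."""

    def __init__(self, retriever: Retriever, generator: Generator, k: int = 2) -> None:
        self.retriever = retriever
        self.generator = generator
        self.k = k

    def answer(self, query: str) -> str:
        passages = self.retriever.retrieve(query, self.k)
        prompt = "Context:\n" + "\n".join(passages) + f"\n\nQuestion: {query}\nAnswer:"
        return self.generator.generate(prompt)


rag = Orchestrator(
    KeywordRetriever([
        "Invoices are paid within 30 days",
        "Shipping takes 5 business days",
    ]),
    EchoGenerator(),
    k=1,
)
output = rag.answer("when are invoices paid")
```

Because the retriever and generator are only referenced through their interfaces, either can be swapped (for example, a vector store lookup or a hosted model) without changing the orchestrator.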

Building a RAG Pipeline
Key steps from document loading to answer generation.

・Diversity of Text Data Sources:
  ・Handles various document types: .txt, .pdf, .docx, .xlsx, .csv, .json, .html, .md, code files…
  ・Ensures compatibility across a wide range of data formats.
・Preparation and Loading Processes:
  ・Involves extraction, parsing, cleaning, formatting, and text conversion.
  ・Essential for feeding clean and structured data to LLMs.
・LangChain:
  ・A tool for data loading: recognized for its capability to process over 80 document types.
  ・Offers versatility in handling diverse data inputs.
1. Document Loading
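The loading step above amounts to dispatching on file type and converting each format to clean text plus metadata. The following is a standard-library sketch of that idea only; it is not LangChain's loader API, the function names are hypothetical, and formats that need third-party parsers (.pdf, .docx, .xlsx) are deliberately left out.

```python
import csv
import json
from html.parser import HTMLParser
from pathlib import Path


class _TextExtractor(HTMLParser):
    """Collects visible text, skipping <script> and <style> content."""

    def __init__(self) -> None:
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.parts.append(data.strip())


def load_plain(path: Path) -> str:
    return path.read_text(encoding="utf-8")


def load_html(path: Path) -> str:
    parser = _TextExtractor()
    parser.feed(path.read_text(encoding="utf-8"))
    parser.close()
    return " ".join(parser.parts)


def load_csv(path: Path) -> str:
    with path.open(newline="", encoding="utf-8") as handle:
        return "\n".join(", ".join(row) for row in csv.reader(handle))


def load_json(path: Path) -> str:
    # Re-serialize so the text has consistent formatting.
    return json.dumps(json.loads(path.read_text(encoding="utf-8")), ensure_ascii=False)


LOADERS = {
    ".txt": load_plain,
    ".md": load_plain,
    ".html": load_html,
    ".csv": load_csv,
    ".json": load_json,
}


def load_document(path) -> dict:
    """Return {'text': ..., 'metadata': {'source': ...}} for a supported file."""
    path = Path(path)
    loader = LOADERS.get(path.suffix.lower())
    if loader is None:
        raise ValueError(f"No loader registered for {path.suffix!r}")
    return {"text": loader(path).strip(), "metadata": {"source": str(path)}}
```

Keeping the source path in the metadata is what later allows an answer to be traced back to the document it came from.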

・Document Splitting
  ・Essential for managing extensive documents within LLM token limits.
  ・Process: Load → Parse → Convert → Chunk.
・Challenges in Context Preservation
  ・Example of context loss shown in the slide figure.
  ・Importance of semantic consideration in splitting.
・Principles of Text Splitting
  ・Chunk Size: Based on character, word, or token count.
  ・Overlap: Ensures continuity of context between chunks (see the slide figure).
・Chunking Techniques
  ・Fixed-size with overlap: Simple but potentially context-disrupting.
  ・Sentence Splitting: Utilizes NLP tools for coherent segmentation.
  ・Recursive Chunking: Hierarchical and iterative approach.
  ・Specialized Techniques: Adapt to structured formats like Markdown.
・Optimizing Chunk Size
  ・Preprocess data for quality enhancement.
  ・Experiment with a range of chunk sizes for optimal balance.
  ・Iteratively evaluate performance to refine chunking strategy.
・Conclusion
  ・Tailor document splitting approach to specific application needs.
2. Document Splitting
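Two of the chunking techniques above, fixed-size with overlap and recursive chunking, can be sketched in a few lines. This is an illustrative character-based version, assuming sizes are measured in characters rather than tokens; the function names and the separator order are choices made for this example, not a library API.

```python
from typing import List, Sequence


def split_fixed(text: str, chunk_size: int, overlap: int) -> List[str]:
    """Fixed-size character chunks; consecutive chunks share `overlap` characters."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last chunk already reaches the end of the text
    return chunks


def split_recursive(text: str, chunk_size: int,
                    separators: Sequence[str] = ("\n\n", "\n", ". ", " ")) -> List[str]:
    """Split on the coarsest separator first; recurse with finer ones if a piece is too big."""
    if len(text) <= chunk_size:
        return [text] if text.strip() else []
    if not separators:
        # No separators left: fall back to a hard cut.
        return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]
    sep, finer = separators[0], separators[1:]
    chunks, current = [], ""
    for piece in text.split(sep):
        candidate = piece if not current else current + sep + piece
        if len(candidate) <= chunk_size:
            current = candidate  # keep merging small pieces into one chunk
            continue
        if current:
            chunks.append(current)
        if len(piece) <= chunk_size:
            current = piece
        else:
            chunks.extend(split_recursive(piece, chunk_size, finer))
            current = ""
    if current:
        chunks.append(current)
    return chunks


fixed = split_fixed("abcdefghij", chunk_size=4, overlap=2)
recursive = split_recursive(
    "Alpha beta gamma.\n\nDelta epsilon zeta eta theta.", chunk_size=20
)
```

Here `fixed` cuts through the text regardless of content, while `recursive` keeps the first paragraph whole and only breaks the second one at word boundaries, which is the context-preservation benefit the slide refers to.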

・Text Embedding
  ・Post-splitting: Text chunks are transformed into vector representations.
  ・Purpose: Facilitate semantic similarity comparisons.
・Vector Embeddings Role
  ・Fundamental in ML for mapping complex data into vector space.
  ・Captures semantic information in text data.
・Semantic Relationships
  ・Example: Different sentences with similar meanings are close in vector space.
  ・Visualization: Clustering in embeddings indicates semantic proximity.
・Evolution of Embedding Models
  ・Word2Vec and GloVe: Static word-level embeddings learned from co-occurrence.
  ・Transformers (BERT, RoBERTa, GPT): Context-aware embeddings.
・Context-Aware Embedding
  ・Considers the entire sentence context, enriching semantic capture.
  ・Critical for ambiguity resolution and NLP advances.
・Use Cases in NLP
  ・Example: Distinct meanings of 'bank' in different contexts.
  ・RAG (Retrieval-Augmented Generation): Uses transformer-based embeddings to match queries with document chunks.
3. Text Embedding
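The mechanics of "text in, vector out, compare by cosine similarity" can be shown with a deliberately simple stand-in. The `embed` function below is a hashed bag-of-words toy, an assumption made so the example runs without any model: it captures word overlap only, not meaning, and would not separate the two senses of 'bank'. A real pipeline would replace it with a transformer embedding model.

```python
import hashlib
import math
import re
from typing import List, Sequence


def embed(text: str, dim: int = 256) -> List[float]:
    """Toy embedding: hash each word into one of `dim` buckets, then L2-normalize."""
    vector = [0.0] * dim
    for word in re.findall(r"[a-z0-9]+", text.lower()):
        digest = hashlib.sha256(word.encode("utf-8")).hexdigest()
        vector[int(digest, 16) % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vector))
    return [v / norm for v in vector] if norm else vector


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


query = embed("the contract renewal date")
related = embed("renewal date of the contract")
unrelated = embed("weather forecast for tomorrow")

score_related = cosine_similarity(query, related)
score_unrelated = cosine_similarity(query, unrelated)
```

The interface is the part that carries over to a real system: every chunk and every query is mapped to a fixed-length vector, and closeness in that space is what retrieval ranks on.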

・Vector Stores
  ・Storage: Houses document chunk embeddings and associated IDs.
  ・Function: Facilitates efficient vector lookups for similar content.
・Notable Vector Stores
  ・FAISS: Specializes in handling massive vector collections.
  ・SPTAG: Offers customizable search algorithms for precision and speed.
  ・Milvus: Open-source database compatible with major ML frameworks.
  ・Chroma: In-memory database suited to cloud and on-premises deployment.
  ・Weaviate: Stores both vectors and objects, supports various search methods.
  ・Elasticsearch: Scales well for large-scale vector data applications.
  ・Pinecone: Managed service, suited to real-time analysis and ML applications.
・Considerations for Choice
  ・Scale of data and computational resources.
  ・Integration with existing frameworks and infrastructure.
  ・Balancing between precision, speed, and storage efficiency.
・Implications for RAG
  ・The correct pairing of text embedding and vector store is critical.
  ・Enables rapid retrieval of relevant document chunks.
4. Vector Store
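The storage and lookup roles described above can be illustrated with a tiny in-memory store. This is a brute-force sketch with a hypothetical `InMemoryVectorStore` class; the products listed on the slide use approximate nearest-neighbor indexes and persistence that are not modeled here.

```python
import math
from typing import Dict, List, Sequence, Tuple


class InMemoryVectorStore:
    """Keeps (id, vector, text) records and answers similarity queries by brute force."""

    def __init__(self) -> None:
        self._vectors: Dict[str, List[float]] = {}
        self._texts: Dict[str, str] = {}

    def add(self, chunk_id: str, vector: Sequence[float], text: str) -> None:
        """Store or overwrite the embedding and text for a chunk ID."""
        self._vectors[chunk_id] = list(vector)
        self._texts[chunk_id] = text

    @staticmethod
    def _cosine(a: Sequence[float], b: Sequence[float]) -> float:
        dot = sum(x * y for x, y in zip(a, b))
        norm_a = math.sqrt(sum(x * x for x in a))
        norm_b = math.sqrt(sum(y * y for y in b))
        return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

    def search(self, query_vector: Sequence[float], k: int = 3) -> List[Tuple[str, float, str]]:
        """Return up to k (chunk_id, score, text) tuples, most similar first."""
        scored = [
            (chunk_id, self._cosine(query_vector, vector), self._texts[chunk_id])
            for chunk_id, vector in self._vectors.items()
        ]
        scored.sort(key=lambda record: record[1], reverse=True)
        return scored[:k]


store = InMemoryVectorStore()
store.add("a", [1.0, 0.0], "chunk about pricing")
store.add("b", [0.0, 1.0], "chunk about shipping")
store.add("c", [0.9, 0.1], "chunk about discounts")
hits = store.search([1.0, 0.0], k=2)
```

The two-dimensional vectors are hand-made for readability; in practice they would come from the same embedding model that encodes the query, which is why the pairing of embedding and store matters.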

・Retrieval Process Overview
  ・Begins with query transformation into vector form.
  ・Comparison with document chunk vectors in vector store.
  ・Objective: Retrieve relevant document chunks corresponding to the query.
・Retrieval Mechanisms
  ・Similarity Search: Uses cosine similarity to find related documents.
  ・Maximal Marginal Relevance (MMR): Promotes diversity and reduces redundancy.
  ・Similarity Score Threshold: Keeps only documents above a given similarity score.
  ・Top 'k' Documents: Retrieves a set number of documents based on ranking.
・Advanced Retrieval Methods
  ・Self-Query/LLM-Aided Retrieval:
    ⎯ Splits the query into search and filter terms.
    ⎯ Utilizes metadata filters for more precise retrieval.
  ・Compression Retrieval:
    ⎯ Compression LLM condenses information to focus on key aspects.
    ⎯ Shortens the context passed to the LLM, at the cost of extra LLM calls.
・Traditional vs. Modern Techniques
  ・Vector-Based Retrieval: Preferred for RAG due to semantic matching capabilities.
  ・Traditional NLP Techniques: SVM, TF-IDF, etc., less common in RAG systems.
5. Document Retrieval
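The basic retrieval mechanisms above (top-k, score threshold, MMR) differ only in how they select from the same similarity scores. The sketch below works on plain lists of floats; the function names and the `lambda_mult` parameter name are choices for this example, and the two-dimensional vectors are hand-made rather than produced by an embedding model.

```python
import math
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def cosine(a: Vector, b: Vector) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(query: Vector, docs: List[Vector], k: int,
          score_threshold: float = 0.0) -> List[Tuple[int, float]]:
    """(index, score) of the k most similar vectors with score >= threshold."""
    scored = [(i, cosine(query, doc)) for i, doc in enumerate(docs)]
    scored = [(i, score) for i, score in scored if score >= score_threshold]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


def mmr(query: Vector, docs: List[Vector], k: int, lambda_mult: float = 0.5) -> List[int]:
    """Maximal Marginal Relevance: relevance to the query minus similarity
    to documents that were already selected."""
    relevance = [cosine(query, doc) for doc in docs]
    selected: List[int] = []
    candidates = list(range(len(docs)))
    while candidates and len(selected) < k:
        def score(i: int) -> float:
            redundancy = max((cosine(docs[i], docs[j]) for j in selected), default=0.0)
            return lambda_mult * relevance[i] - (1 - lambda_mult) * redundancy
        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected


query = [1.0, 1.0]
docs = [
    [1.0, 0.10],  # 0: relevant
    [1.0, 0.05],  # 1: near-duplicate of document 0
    [0.0, 1.00],  # 2: relevant, but covers a different direction
]
plain = [index for index, _ in top_k(query, docs, k=2)]
strict = [index for index, _ in top_k(query, docs, k=3, score_threshold=0.75)]
diverse = mmr(query, docs, k=2)
```

Plain top-k returns the two near-duplicates, while MMR keeps the best match and then prefers the document that adds something new; lowering `lambda_mult` pushes further toward diversity.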

・Answer Generation
  ・Involves creating a prompt from relevant document chunks and the user query.
  ・The prompt guides the LLM to generate relevant and insightful responses.
・Standard Method: The “Stuff” Approach
  ・Simplest form of generating answers.
  ・Places all retrieved chunks in a single prompt and generates the answer in one call.
  ・Limited by context window size; less effective for complex, multi-document queries.
・Advanced Methods for Complex Queries
  ・Map-reduce Method:
    ⎯ Processes each document chunk individually.
    ⎯ Combines separate answers into one final response.
    ⎯ Advantage: Handles an arbitrary number of chunks; effective for comprehensive answers.
    ⎯ Drawback: Slower and may miss context spread across multiple chunks.
  ・Refine Method:
    ⎯ Iteratively updates the answer as each additional chunk is added to the prompt.
    ⎯ Useful for dynamic contexts where initial answers can be refined.
  ・Map-rerank Method:
    ⎯ Generates an answer per chunk with a relevance score, then ranks the answers.
    ⎯ Ideal for scenarios with multiple plausible answers.
・Choice of Method
  ・Dependent on the complexity of the query and desired answer abstraction level.
  ・The right choice improves the accuracy and relevance of LLM responses.
6. Answer Generation
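The stuff, map-reduce, and refine methods above differ in how many LLM calls they make and what each prompt contains. The sketch below shows that structure only: `llm` is assumed to be any callable that maps a prompt string to a completion string, and the prompt wording is illustrative rather than taken from a specific library.

```python
from typing import Callable, List

LLM = Callable[[str], str]  # prompt -> completion; stands in for any model client


def _qa_prompt(context: str, question: str) -> str:
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"


def stuff(llm: LLM, question: str, chunks: List[str]) -> str:
    """One call: every retrieved chunk goes into a single prompt."""
    return llm(_qa_prompt("\n\n".join(chunks), question))


def map_reduce(llm: LLM, question: str, chunks: List[str]) -> str:
    """One call per chunk (map), then one call to combine the partial answers (reduce)."""
    partials = [llm(_qa_prompt(chunk, question)) for chunk in chunks]
    joined = "\n".join(f"- {partial}" for partial in partials)
    return llm(f"Partial answers:\n{joined}\n\nCombine them into one answer to: {question}")


def refine(llm: LLM, question: str, chunks: List[str]) -> str:
    """One call per chunk: answer from the first chunk, then revise with each later one."""
    answer = ""
    for index, chunk in enumerate(chunks):
        if index == 0:
            prompt = _qa_prompt(chunk, question)
        else:
            prompt = (
                f"Existing answer: {answer}\nNew context:\n{chunk}\n\n"
                f"Refine the existing answer to: {question}"
            )
        answer = llm(prompt)
    return answer
```

With n chunks, `stuff` makes 1 call, `refine` makes n sequential calls, and `map_reduce` makes n + 1 calls, of which the first n are independent and could run in parallel. That cost profile is the practical side of the trade-offs listed on the slide.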

Benefits of RAG and Conclusions
Advantages of RAG over general-purpose LLMs.

'Real-Time' Data Integration:
・Immediate inclusion of new data into the system's knowledge base.
・Eliminates the need for constant model retraining.
Reduced Costs:
・Indexing and retrieval reduce computational expenses.
・Saves time by avoiding frequent retraining cycles.
Enhanced Security:
・Sensitive data remains in the document store, not in the model's weights.
・Real-time access restrictions improve data protection.
Greater Explainability:
・Responses can be traced back to source documents.
・Increases transparency and accountability in automated processes.
Reduction in Hallucination:
・Relies on actual documents to generate responses, decreasing false information.
・Improves information reliability by referencing the existing knowledge base.
Overcoming Context Size Limitations:
・Retrieves only relevant documents, tackling the token limitation of LLMs.
・Facilitates handling of extensive data sets beyond the usual LLM capacity.
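Two of the points above, traceability to sources and staying within the context window, meet in the step that assembles the prompt context. The sketch below is one possible way to do it, under loud assumptions: token counts are approximated by whitespace-separated words, `pack_context` is a hypothetical helper, and the source labels are invented examples.

```python
from typing import List, Tuple


def pack_context(ranked_chunks: List[Tuple[str, str]],
                 max_tokens: int) -> Tuple[str, List[str]]:
    """Add (source, text) chunks in rank order while they fit the budget.

    Returns the numbered context and the list of sources, so that a marker
    like [2] in the answer can be traced back to sources[1].
    Token cost is approximated by word count (an assumption; a real system
    would use the model's tokenizer).
    """
    lines: List[str] = []
    sources: List[str] = []
    used = 0
    for source, text in ranked_chunks:
        cost = len(text.split())
        if used + cost > max_tokens:
            continue  # skip chunks that do not fit; smaller ones may still fit
        lines.append(f"[{len(sources) + 1}] {text}")
        sources.append(source)
        used += cost
    return "\n".join(lines), sources


# Hypothetical retrieved chunks, already ranked by relevance.
ranked = [
    ("policy.pdf#p3", "a b c d"),
    ("faq.md", "e f g h i j"),
    ("contract.docx#s2", "k l"),
]
context, cited = pack_context(ranked, max_tokens=6)
```

Only the chunks that fit are sent to the model, and each one carries a number that maps back to its source document.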

Thank you for your attention!
Grid Dynamics Holdings, Inc.
5000 Executive Parkway, Suite 520 / San Ramon, CA
650-523-5000
info@griddynamics.com
www.griddynamics.com
