Zagreb, Croatia October 2024
Vector search and multimodal embeddings
Márton Kodok
Software Architect at REEA.net
Agenda
1. Intro to BigQuery
2. What are multimodal embeddings?
3. Understanding vector search concepts
4. Vector search in BigQuery
5. Demo 1: SQL syntax
6. Demo 2: Example app about multimodal use cases
Vector search and multimodal embeddings BigQuery @martonkodok
About me
● Google Developer Expert on Cloud technologies (2016→)
● Champion of Google Cloud Innovators program (2021→)
● Among the top 3 Romanians on Stack Overflow, with 207k reputation
● Crafting Cloud Architecture + ML backends at REEA.net
Articles: martonkodok.medium.com
Twitter: @martonkodok
SlideShare: martonkodok
Stack Overflow: pentium10
GitHub: pentium10
Part #1: Multimodal embeddings
The keyword search limitations on traditional databases
ID   Name  City
001  Foo   NYC
002  Bar   LDN
Diagram labels: User, App, tabular data and keywords, search term/keyword
The keyword search limitations on traditional databases
1. Exact match limitations:
Traditional databases rely on keyword matching, missing nuances like synonyms or different phrasing.
2. Struggle with multimedia data:
They are poorly equipped to handle unstructured data types like images, audio, and video.
3. Scalability challenges:
Processing massive datasets of diverse formats becomes increasingly inefficient and slow.
Vector embeddings capture the underlying meaning of your data
User
[0.1, 0.002, -0.56, 0.98...]
App + AI Embeddings
Capture the meaning of your data
Multimodal embeddings are semantic, not just surface-level keywords
1. Semantic understanding
Embeddings capture the meaning and context of data, enabling searches beyond exact matches.
2. Handling diverse data types
Embedding search efficiently handles various data formats, including unstructured data such as images and video.
3. Results are more relevant and comprehensive, providing a richer user experience.
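To make "searching by meaning" concrete: closeness between embeddings is commonly scored with a measure such as cosine similarity. The Python sketch below is only an illustration. The four-dimensional vectors and their labels are invented, and real embedding models return far longer vectors.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical toy embeddings, hand-written so that related items point the same way.
EMBEDDINGS = {
    "sofa": [0.90, 0.10, 0.00, 0.20],
    "couch": [0.85, 0.15, 0.05, 0.25],
    "invoice": [0.00, 0.90, 0.40, 0.10],
}

def rank_by_similarity(query_vector, embeddings):
    """Return (name, score) pairs, most similar first."""
    scored = [(name, cosine_similarity(query_vector, vec))
              for name, vec in embeddings.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

if __name__ == "__main__":
    for name, score in rank_by_similarity(EMBEDDINGS["sofa"], EMBEDDINGS):
        print(f"{name}: {score:.3f}")
```

A keyword match would never connect "sofa" and "couch". Here they rank together because their vectors point in nearly the same direction, while "invoice" scores much lower.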
We’re no longer confined to text. Generate embeddings on multimodal data.
Input → Multimodal Embedding Model → Embeddings
[0.1, 0.002, 0.56, 0.98...]
[0.93, 0.133, 0.142, 0.03...]
[0.22, 0.092, 0.391, 0.78...]
Multimodal embeddings space
Joint Embedding Vector Space (axes: size, color, living)
Example inputs (image and text): “gray tabby cat laying in front of a Christmas tree”
● Embed text, image, and video in the same semantic space with the same dimensionality
● Applications: image/video content search, classification, recommendation
This opens up new possibilities.
For example, you could ask questions like…
Multimodal embeddings use cases
1. Find me a picture similar to some input text.
2. Find images of cats similar to this one, but in a snowy landscape.
3. Find customer support issues that resemble this one, even if the words differ.
4. Find items with two people
5. or two exact items…
6. Find product recommendations that reflect not just your purchase history, but also the styles and aesthetics of the images you've been browsing.
How do we obtain the embeddings?
Google Cloud - Foundation Models via Vertex AI
● Gemini 1.5 Flash: fastest, most cost-efficient model yet for high-volume tasks
● Gemini 1.5 Pro: multimodal reasoning for longer prompts, 1 million token context window
● Imagen 3: generate images from text prompts
● Multimodal Embeddings: extract semantic information
● Chirp for Speech to Text: build voice-enabled applications
Part #2: Vector Search
How do we search?
Joint Embedding Vector Space (axes: size, color, living)
Approximate Nearest Neighbor Search
Diagram: a query_item and candidate_items in the embedding_space; Query → Search → Rank (1, 2, …, N)
Summary:
● Faster search times
● Scalable to larger datasets
● Slightly reduced recall
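The speed-versus-recall trade-off can be sketched in a few lines of Python. This is not how any particular product implements approximate nearest neighbor search. It is a toy inverted-file style index with assumptions chosen for brevity: random 2-D points stand in for embeddings, and randomly sampled points stand in for trained cluster centers.

```python
import math
import random

def distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def exact_search(query, items, k):
    """Brute force: compare the query with every candidate."""
    return sorted(range(len(items)), key=lambda i: distance(query, items[i]))[:k]

def build_buckets(items, centroids):
    """Assign every item index to the bucket of its nearest centroid."""
    buckets = [[] for _ in centroids]
    for i, item in enumerate(items):
        nearest = min(range(len(centroids)), key=lambda c: distance(item, centroids[c]))
        buckets[nearest].append(i)
    return buckets

def approximate_search(query, items, centroids, buckets, k, n_probe):
    """Scan only the n_probe buckets whose centroids are closest to the query."""
    probed = sorted(range(len(centroids)),
                    key=lambda c: distance(query, centroids[c]))[:n_probe]
    candidates = [i for c in probed for i in buckets[c]]
    return sorted(candidates, key=lambda i: distance(query, items[i]))[:k]

random.seed(0)
ITEMS = [[random.random(), random.random()] for _ in range(1000)]
CENTROIDS = random.sample(ITEMS, 16)  # crude stand-in for trained cluster centers
BUCKETS = build_buckets(ITEMS, CENTROIDS)

if __name__ == "__main__":
    query = [0.5, 0.5]
    exact = exact_search(query, ITEMS, k=10)
    approx = approximate_search(query, ITEMS, CENTROIDS, BUCKETS, k=10, n_probe=2)
    recall = len(set(exact) & set(approx)) / len(exact)
    print(f"recall@10 when probing 2 of 16 buckets: {recall:.2f}")
```

Probing fewer buckets means fewer distance computations but a higher chance of missing a true neighbor that landed in an unprobed bucket. Probing every bucket reproduces the exact result.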
Part #3: Vector search in BigQuery
SQL syntax to generate multimodal embeddings in BigQuery
● Generate embeddings
● ML.GENERATE_EMBEDDING (table): creates embeddings on top of GCS objects defined in an Object Table
● ML.GENERATE_EMBEDDING (sql): creates embeddings with SQL syntax on text, images, video
● Store embeddings in BigQuery alongside business data for joins and search
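The two call forms listed above might look roughly like the statements below, kept as Python strings so they can be handed to any BigQuery client. This is a sketch under assumptions: a remote model over a Vertex AI multimodal embedding endpoint already exists, and every project, dataset, model, and table name is a hypothetical placeholder.

```python
# Hypothetical names; replace with your own project, dataset, model, and tables.
MODEL = "my_project.my_dataset.multimodal_embedding_model"
OBJECT_TABLE = "my_project.my_dataset.product_images"
EMBEDDINGS_TABLE = "my_project.my_dataset.product_image_embeddings"

def embed_objects_sql(model, object_table, target_table):
    """Table form: embed every GCS object listed in an object table and store the rows."""
    return (
        f"CREATE OR REPLACE TABLE `{target_table}` AS\n"
        f"SELECT *\n"
        f"FROM ML.GENERATE_EMBEDDING(\n"
        f"  MODEL `{model}`,\n"
        f"  TABLE `{object_table}`,\n"
        f"  STRUCT(TRUE AS flatten_json_output)\n"
        f");"
    )

def embed_text_sql(model, text):
    """Query form: embed one text value; the input column is aliased as `content`."""
    # Escape backslashes and single quotes so the text is a valid SQL string literal.
    literal = text.replace("\\", "\\\\").replace("'", "\\'")
    return (
        f"SELECT ml_generate_embedding_result\n"
        f"FROM ML.GENERATE_EMBEDDING(\n"
        f"  MODEL `{model}`,\n"
        f"  (SELECT '{literal}' AS content),\n"
        f"  STRUCT(TRUE AS flatten_json_output)\n"
        f");"
    )

if __name__ == "__main__":
    print(embed_objects_sql(MODEL, OBJECT_TABLE, EMBEDDINGS_TABLE))
    print()
    print(embed_text_sql(MODEL, "gray tabby cat"))
```

Writing the result to a regular table is what lets the embeddings sit alongside business data for joins and search. For user-supplied text in a real application, query parameters are generally a safer choice than string interpolation.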
1. Embeddings: Generate embeddings with AI models.
2. Index: Build vector search index.
3. Search: Search the vector space to find information that is similar to the question.
Data types: Text, Image, Audio, Video, Code
Example query: Show me rainbow sweaters!
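Steps 2 and 3 could be expressed in BigQuery SQL along the following lines, again wrapped in Python strings. Treat this as a sketch under assumptions: the embeddings table was produced by ML.GENERATE_EMBEDDING over an object table (so it has a `uri` column and an `ml_generate_embedding_result` column), and all index, table, and model names are hypothetical.

```python
EMBEDDING_COLUMN = "ml_generate_embedding_result"

def create_vector_index_sql(index_name, table):
    """Step 2: build an IVF vector index over the stored embedding column."""
    return (
        f"CREATE OR REPLACE VECTOR INDEX `{index_name}`\n"
        f"ON `{table}`({EMBEDDING_COLUMN})\n"
        f"OPTIONS(index_type = 'IVF', distance_type = 'COSINE');"
    )

def vector_search_sql(table, model, question, top_k=5):
    """Step 3: embed the question, then return the top_k closest stored rows."""
    # Escape backslashes and single quotes so the question is a valid SQL string literal.
    literal = question.replace("\\", "\\\\").replace("'", "\\'")
    return (
        f"SELECT base.uri, distance\n"
        f"FROM VECTOR_SEARCH(\n"
        f"  TABLE `{table}`,\n"
        f"  '{EMBEDDING_COLUMN}',\n"
        f"  (\n"
        f"    SELECT {EMBEDDING_COLUMN}\n"
        f"    FROM ML.GENERATE_EMBEDDING(\n"
        f"      MODEL `{model}`,\n"
        f"      (SELECT '{literal}' AS content),\n"
        f"      STRUCT(TRUE AS flatten_json_output)\n"
        f"    )\n"
        f"  ),\n"
        f"  top_k => {int(top_k)},\n"
        f"  distance_type => 'COSINE'\n"
        f")\n"
        f"ORDER BY distance;"
    )

if __name__ == "__main__":
    table = "my_project.my_dataset.product_image_embeddings"  # hypothetical
    model = "my_project.my_dataset.multimodal_embedding_model"  # hypothetical
    print(create_vector_index_sql("product_image_index", table))
    print()
    print(vector_search_sql(table, model, "Show me rainbow sweaters!"))
```

The same model has to embed both the stored items and the question; otherwise the vectors do not live in the same space and the distances are meaningless.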
Diagram labels: Data, Multimodal Embeddings (Vertex AI), Build Embeddings, BigQuery Dataset, Table, Search Input, Vector Search, Results
Part #4: Demo
Architecture options
● Raw Data: in GCS, BigQuery, or elsewhere
● Generate embeddings: using BigQuery, Vertex AI, or other services
● Store embeddings: in BigQuery
● Create Vector Index: in BigQuery
● Batch Predictions: using BigQuery SQL
● Auto-Sync Embeddings: to Vertex AI Feature Store 2.0
● Online Predictions: using Vertex AI Feature Store 2.0
Gen AI … repository
Code repository GenAI for Google Cloud
goo.gle/gen-ai-github
Extend search
● Recommendation: “Recommend a product for this customer (based on CRM data)”
● Classification: Extract & group entities, e.g. names or places, from a piece of text based on context
● Info Management: Employees can access information across documents, reports, emails with simple search
● Outlier Detection: Identify anomalies or fraudulent activities by comparing clickstream data with historical patterns
Takeaways
● New ML.GENERATE_EMBEDDING function to create multimodal vector embeddings
● New VECTOR_SEARCH function for powerful, managed vector search capabilities
● Start quickly, with the features you expect from BigQuery
● Create governed RAG applications with LangChain integration
Resources
BQ Embedding Generation: Documentation | Blog | Video
BQ Vector Search: Documentation | Blog | LangChain
Demo Assets: BQ LangChain Notebook | Gen AI Repo
goo.gle/next24-ana302
Continue your learning journey!
But there is more…
Article about Imagen 3
Function Calling in Gemini
Remote Functions in BQ
LinkedIn: @martonkodok
Thank you. Q&A.
Reea.net - Integrated web solutions driven by creativity
to deliver projects.
Follow for articles:
martonkodok.medium.com
Slides available on:
slideshare.net/martonkodok

