SlideShare a Scribd company logo
Keeping Up with AI:
Gemma’s Here So
You Don’t Left Behind
Agadir
Mohammed Buallay
Mohammed Buallay
- Google Developers Experts for AI/ML
- Lead Data Architect and Reporting
- I love to travel and meet new people
󰎓
Mohammed Buallay
- I started working on AI in 2017
- As senior project
AI Intelligence: Exploring the Future of Artificial Intelligence
Let’s meet with LLMs
What is Gemma?
Automate News Summary
Outlines
Let’s meet with
LLMs
What is
an LLM?
[...] [...] [...]
[...]
0.02
0.03
0.9 0.01 0.0 …
Dogs Rain Drops Fish Wind …
and
cats
raining
It’s
,‫ﻧﺘﻌﻠﻢ‬ ‫ﻣﻨﻚ‬ ‫اﺣﻨﺎ‬ ‫و‬ ‫ﻣﻌﻠﻢ‬ ‫أﻧﺖ‬
Violets are blue,
Sugar is sweet,
‫ﻧﺘﻌﻠﻢ‬ ‫ﻣﻨﻚ‬ ‫اﺣﻨﺎ‬ ‫و‬ ‫ﻣﻌﻠﻢ‬ ‫أﻧﺖ‬
‫اﻟﻄﻴﺐ‬ ‫ﻓﻲ‬ ‫اﻟﻤﺜﻞ‬ ‫ﺑﻚ‬ ‫ﻧﻀﺮب‬
‫واﻟﻬﻤﺔ‬
‫اﻟﻘﻠﺐ‬ ‫ﻣﻦ‬ ‫وﻧﻘﻮﻟﻬﺎ‬ ‫ﺑﻚ‬ ‫ﻧﻔﺨﺮ‬ ‫ﻛﻠﻨﺎ‬
‫ﻓﻲ‬ ‫ﺑﺲ‬ ‫ﻣﻮ‬ ،‫اﻟﻔﻌﻞ‬ ‫ﻓﻲ‬ ‫ﻗﺪوﺗﻨﺎ‬
‫اﻟﻜﻠﻤﺔ‬
*
Modern LLMs can
be very large.
LLMs Explained
Classic Natural
Language Problems
LLMs Explained
Entity extraction Classification Summarization
Sentiment Analysis Translation Code Generation
…and more!
LLMs let us
prototype fast.
LLMs Explained
Explain this meme: Model: The meme plays of the
double meaning of “sprite”:
Model:
User:
Therefore, a game developer’s favorite drink is
Sprite because of their love for sprites in game
development, thus the drink’s name is a pun.
Sprite (the drink): popular lemon-lime soda
Sprite (2D bitmap image): often used to represent
characters, objects, etc. in game environments.
LLMs Explained
Model:
1. **Build a volcano.**
This is a classic science fair project that is both fun and educational. You
can learn about the science of volcanoes, earthquakes, and plate tectonics.
2. **Make a solar oven.**
This is a great project to learn about renewable energy and how to harness
the power of the sun. You can use your solar oven to cook food, dry fruit, or
even melt chocolate.
3. **Create a 3D printer.**
This is a more challenging project, but it is also very rewarding. You can
learn about 3D printing technology and design your own custom objects.
4. **Build a robot.**
This is a great project for those who are interested in engineering and
robotics. You can learn about basic electronics, mechanics, and
programming.
5. **Design a water filtration system.**
This is a project that can help you learn about the importance of clean
water and how to filter it. You can use your water filtration system to clean
water from a stream or river.
LLMs Explained
Give me five neat ideas for
science fair projects.
User:
Why are large language models
different?
LLMs are characterized by emergent
abilities, or the ability to perform tasks that
were not present in smaller models.
LLMs contextual understanding of human
language changes how we interact with
data and intelligent systems.
LLMs can find patterns and connections in
massive, disparate data corpora.
Search
Conversation
Content generation
What are some everyday tasks
that can be improved by AI?
Google is the industry pioneer in AI
Responsible AI
3,000
Researchers
7,000
Publications
Built & Tested
for Safety
Privacy in design
Upholds high scientific
standards
Accountable to People
Socially Beneficial
Avoid creating unfair
bias
2015
Google DeepMind
AlphaGo defeats Go
champion
2016
Google’s
DeepMind
helps detect
eye disease
2017
Google invents
Transformer
kickstarting LLM
revolution
2018
Google’s
groundbreaking
large language
model, BERT
2019
Text-to-Text
Transfer
Transformer
LLM 10B P
Model Open
Sourced
2020
Google LaMDA
Model Trained
to converse
2022
AlphaFold
predicts
structures of all
known proteins
2023
A conversational
AI Service
powered by
PaLM2
2024
Family of
multimodal LLMs &
products
AI Intelligence: Exploring the Future of Artificial Intelligence
Gemini is designed to be
Multimodality
GEMINI CAN
Understand
text, images,
audio and more
The Gemini Ecosystem
The most advanced AI from Google
For Developers
For Consumers
For Business and Enterprise
Models
Gemini API
(in Google AI Studio + ai.google.dev)
Gemini for Google Workspace
Gemini for Google Cloud
Gemini in Vertex AI
Gemini | app and web
Gemini in the Google App
Gemini in Gmails, Docs…
NotebookLM this deck
is about
But also Imagen,
Veo….
Unified SDK
Getting started with the
Gemini API
Train the Trainer in AI
AI Studio
aistudio.google.com
● Generate API Keys
● Create, test, and save prompts
● Customize models in minutes
● Generate starter code
AI Studio
aistudio.google.com
AI Studio
aistudio.google.com
AI Studio
aistudio.google.com
AI Studio
aistudio.google.com
AI Studio
ai.google.dev
REST API + new Client libraries for Python, Node, Java, and Go
Libraries
Libraries
Unified SDKs
…and more!
Vertex AI
Enterprise grade support.
Full MLOps (Examples: Model
evaluation, monitoring, registry)
Vertex AI
Check it out when you're
ready for enterprise-scale
production workloads.
Gemini API and Vertex AI
Both give access Gemini family
models.
Vertex AI
Cloud Production
ai.google.dev/docs/migrate_to_cloud
PlatfoEndpoints
rms
Platforms
Platforms
Now way easier with
the unified SDKs
Usage patterns
Train the Trainer in AI
We’ll cover
● Basic prompting
● Code generation and execution
● Grounding with Google Search
● Function calling and structured outputs
● Vision (images, video) and audio understanding
● Image and video generation
● Tuning
Usage Patterns
Basic prompting
Train the Trainer in AI
Write a poem about a magic backpack.
In a world of wonder, behold,
A backpack with tales yet untold.
Its fabric shimmers, alive with might,
A portal to realms, day and night.
aistudio.google.com
Write a poem about a magic backpack.
In a world of wonder, behold,
A backpack with tales yet untold.
Its fabric shimmers, alive with might,
A portal to realms, day and night.
...
aistudio.google.com
Generate a short tip about JavaScript. Good tips are short and
contain a code example showing the tip.
**Tip: Use Array.prototype.flat() to flatten an array of
arrays.**
```javascript
const array = [[1, 2], [3, 4], [5, 6]];
console.log(array.flat());
// Output: [1, 2, 3, 4, 5, 6]
```
Generate a short tip about JavaScript. Good tips are short and
contain a code example showing the tip.
**Tip: Use Array.prototype.flat() to flatten an array of
arrays.**
```javascript
const array = [[1, 2], [3, 4], [5, 6]];
console.log(array.flat());
// Output: [1, 2, 3, 4, 5, 6]
```
Generate a short tip about JavaScript. Good tips are short and
contain a code example showing the tip.
topic: let and const
tip: Prefer using `let` and `const` to initialise variables,
instead of `var`, as `var` has no block scope or protection
against mutability.
e.g. const myVar = "EXAMPLE";
topic: arrow syntax
tip:
from google import genai
response = client.models.generate_content(
model="gemini-2.0-flash",
contents="Explain how AI works",
)
print(response.text)
# Okay, let's break down how AI (Artificial Intelligence) works.
It's a broad and complex field, so I’ll focus on the core
concepts and provide examples to make...
chat = client.chats.create(
model="gemini-2.0-flash"
)
response = chat.send_message(
"Hello, what should I have for dinner?")
print(response.text)
# 'Here are some suggestions...'
response = chat.send_message(
"How do I cook the first one?")
Contents
Settings/Configs Text
Text
Text
Tools
Tools
Safety Settings
Safety Settings
Generation Config
Content of role: "user"
containing multimodal Parts:
Content of role: "model"
"user" content: a text Part
Tools specified by the caller
Safety setting configured by caller
Generation Config includes: temperature, Top P, Top K,
stop seq, max output tokens etc.
Generate Content Request
Candidate(s)
Feedback
Text
Block Reason
Safety Ratings
Candidate "Content"
note: you can ask for more than one content now
why the model stopped generating
feedback on the prompt
Finish Reason
Finish Message set if finish reason is present
Safety Ratings
Safety Ratings
how safe is the response
Generate Content Response
…
AI Studio
Endpoints
Prompting
https://ai.google.dev/docs/prompt_best_practices
● Chained prompts - make a plan, then execute it
● Context - Few shot prompts
● Generation parameters - Temperature, Safety settings, Top-P, TopK
🤳
Code generation
Train the Trainer in AI
Code Generation
Code generation
Convert human language
requests to machine requests
Rapid prototyping for tools,
extensions, and applications
Generate code based on a
user prompt
Examples
Code generation
● Generate data
AI Studio example
AI Intelligence: Exploring the Future of Artificial Intelligence
Examples
Code generation
● Generate data
● Generate a SQL query
AI Studio example
AI Intelligence: Exploring the Future of Artificial Intelligence
Examples
Code generation
● Generate data
● Generate a SQL query
● Simulate Execution
AI Studio example
AI Intelligence: Exploring the Future of Artificial Intelligence
Examples
Code generation
● Generate data
● Generate a SQL query
● Simulate Execution
● Quick coding help - Smart pair programer.
AI Studio example
AI Intelligence: Exploring the Future of Artificial Intelligence
Write a unit test for this Python function:
def is_diver_neutrally_buoyant(density_water, density_diver):
"""Check if a diver is neutrally buoyant."""
if density_water == density_diver:
return True
AI Studio example
import unittest
class IsDiverNeutrallyBuoyantTest(unittest.TestCase):
def test_diver_is_neutrally_buoyant(self):
self.assertTrue(is_diver_neutrally_buoyant(1000, 1000))
def test_diver_is_not_neutrally_buoyant(self):
self.assertFalse(is_diver_neutrally_buoyant(1000, 1010))
def test_invalid_input_types(self):
with self.assertRaises(TypeError):
is_diver_neutrally_buoyant("1000", 1000)
with self.assertRaises(TypeError):
is_diver_neutrally_buoyant(1000, "1000")
Examples
Code generation
● Generate data
● Generate a SQL query
● Simulate Execution
● Quick coding help - Smart pair programer.
● Generating a full web application
● Integration in IDEs (ex: Copilot, Cursor, Continue, etc.)
Code Execution
Train the Trainer in AI
Code execution
● Available via AI Studio and the Gemini APIs
● Gives Gemini the ability to write, execute, and fix Python code
● Sandboxed environment (no internet access)
● Can also output images!
Tuning goo.gle/gemini-api-cookbook-code-execution
AI Studio example
response = client.models.generate_content(
model = "gemini-2.0-flash",
contents = "Can you draw a graph of the first 50 values of the
fibonacci serie?",
config = types.GenerateContentConfig(
tools=[types.Tool(
code_execution=types.ToolCodeExecution
)]
)
)
response = client.models.generate_content(
model = "gemini-2.0-flash",
contents = "Can you draw a graph of the first 50 values of the
fibonacci serie?",
config = types.GenerateContentConfig(
tools=[types.Tool(
code_execution=types.ToolCodeExecution
)]
)
)
# Okay, I can help you with that. I'll generate the first 50 Fibonacci
numbers and then create a graph plotting the index (from 1 to 50)
against the Fibonacci number. I will use python to generate the numbers
and then display a graph.
import matplotlib.pyplot as plt
def fibonacci(n):
fib_list = []
a, b = 0, 1
for _ in range(n):
fib_list.append(b)
a, b = b, a + b
return fib_list
fib_numbers = fibonacci(50)
plt.figure(figsize=(12, 6))
plt.plot(range(1, 51), fib_numbers, marker='o', linestyle='-')
plt.title('First 50 Fibonacci Numbers')
plt.xlabel('Index')
plt.ylabel('Fibonacci Number')
plt.grid(True)
plt.yscale('linear') # Or 'log' if you want to see the initial values better
plt.show()
Search Grounding
Train the Trainer in AI
● Models have knowledge cut-offs
● Grounding with Google Search provides up-to-date
information
● Gives Gemini the ability to browse the web and cite sources
● 500 free RPD with 2.0 models
Grounding on Google Search
Search & IR goo.gle/gemini-api-cookbook-grounding
AI Studio example
● Instructions + Context + Question all in the prompt
● No need for RAG or fine-tuning
● Easy to implement
○ No extra code, just upload and ask.
Use the prompt's context window
Search & IR goo.gle/gemini-api-cookbook-pdf
response = client.models.generate_content(
model="gemini-2.0-flash",
contents="Who is Guillaume Vernade?",
config = {
'tools': [{
'google_search': {}
}]
}
)
print(response.text)
# Guillaume Vernade, also known as Giom, is a tech enthusiast who
has worked in various roles, from neural networks and AI to Agile
coaching and video game production at companies like Ubisoft and
Google Stadia. Currently, he works at Google Deepmind as a
Developer Advocate, focusing on Generative AI with Gemini models.
pdf_file = client.files.upload(file="invoice.pdf")
response = client.models.generate_content(
model="gemini-2.0-flash",
contents=[
"Summarize this document",
pdf_file,
])
print(response.text)
# This doc is an GCP invoice related to your Vertex AI usage…
Use the prompt's context window
Learning more
● gemini-2.0-flash: 1M tokens.
○ Remember: Everything is tokens.
○ Up to 2M+ tokens for Pro series
● Using context may be more flexible than Retrieval Augmented
Generation (RAG), or fine-tuning.
● More input context means higher latency
○ Use context caching to help reduce load times
https://blog.google/technology/ai/google-gemini-next-generation-model-february-2024
Search & IR
AI Intelligence: Exploring the Future of Artificial Intelligence
Search & IR
Deep-research
● Gemini agent built using grounding and function calling
● Plan first
● Then search, reason and report
Try deep research in the Gemini app
Search & IR
NotebookLM
● Understand complex info using your uploaded sources.
● Ask questions and get answers with citations.
● Generate summaries, study guides, and audio overviews.
● Works with Google Docs, PDFs, web pages.
NotebookLM
Function Calling
Train the Trainer in AI
Function Calling goo.gle/gemini-fn-call-sql-github
Function Calling
https://github.com/GoogleCloudPlatform/generative-ai/tree/main/gemini/function-calling/sql-talk-app
SELECT ROUND((
COUNT(DISTINCT IF(returned_at IS NOT NULL, order_id, NULL)) /
COUNT(DISTINCT order_id)) * 100, 2)
AS return_rate
FROM thelook_ecommerce.orders
goo.gle/gemini-fn-call-sql-github
Function Calling
Function Calling
● Describe external functions to the model.
● The model may ask you to call the function to help it respond to
your queries (if not using automatic function calling).
● Endless possibilities for integrating external tools.
goo.gle/gemini-api-cookbook-function-callin
g
Function calling - Basics
Function Calling
● How?
● The google.genai SDK will inspect the function's type hints
to determine the schema.
● Allowed types are limited:
AllowedTypes = (
int | float | str | bool | list | dict )
goo.gle/gemini-api-cookbook-function-callin
g
def dim_lights(brightness: float) -> bool:
"""Dim the lights."""
print(f”Lights are now set to {brightness:.0%}”)
return True
response = client.models.generate_content(
model='gemini-2.0-flash',
contents="Start the movie party"
config = types.GenerateContentConfig(
tools=[dim_lights])
Function calling - Basics
Function Calling
● Because function calling requires alternating turns, it's easiest
to use through chat.
● Enable "automatic function calling" when you start a chat, and
the Chat session will call the function(s) for you.
○ You don't have to use automatic function calling, it just
makes simple cases easier.
goo.gle/gemini-api-cookbook-function-callin
g
chat = client.chats.create(
model = “gemini-2.0-flash”,
config = {
'tools': [set_light_values]
}
)
response = chat.send_message(
'Please turn down the lights about halfway.')
response = chat.send_message(
'Please turn down the lights about halfway.')
print(response.text)
# Lights are now set to 50%.
Function calling - Basics
Function Calling
● What happened? Use the chat history to find out.
● The chat history collects all the function calls and responses
that took place.
goo.gle/gemini-api-cookbook-function-callin
g
for content in chat.get_history():
part = content.parts[0]
print(content.role,":"))
if part.text:
print(part.text)
if part.function_call:
print("Function call: {", part.function_call, "}")
if part.function_response:
print("Function response: {", part.function_response, "}")
Contents
Tools
Text
Function Declaration
Function Declaration
Text
Function Calling interaction
Function Call
Function Response
Text
model may predict a function call
based on user content
model can understand the function response
and generate text
OR another function call
if one or more function declarations are
provided, function calling feature will turn on
Function calling - More Examples
Function Calling
● Wikipedia research aid
○ Integrates a search tool.
○ Uses the Gemini API inside the function call to summarize pages.
def wikipedia_search(queries:list[str]) -> list[str]:
...
https://ai.google.dev/docs/search_reranking_using_embeddings/
chat = model.start_chat(enable_automatic_function_calling=True)
chat = client.chats.create(
model = 'gemini-2.0-flash',
config = {
"tools": [wikipedia_search],
"automatic_function_calling": {"disable": False}
# automatic_function_calling is enabled by default
}
)
res = chat.send_message("Explain how deep-sea life survives.")
chat = model.start_chat(enable_automatic_function_calling=True)
chat = client.chats.create(
model = 'gemini-2.0-flash',
config = {"tools": [wikipedia_search]}
)
res = chat.send_message("Explain how deep-sea life survives.")
# Searching for "How do deep-sea creatures survive the extreme pressure?"
# Related search terms: ['Deep sea', 'Deep-sea community', 'Deep-sea fish']
# Fetching page: "Deep sea"
# Information Source: https://en.wikipedia.org/wiki/Deep_sea
# Fetching page: "Deep-sea community"
# Information Source: https://en.wikipedia.org/wiki/Deep-sea_community
# Fetching page: "Deep-sea fish"
# Information Source: https://en.wikipedia.org/wiki/Deep-sea_fish
# Searching for "How do deep-sea creatures survive the cold temperatures?"
# Related search terms: ['Deep-sea community', 'Deep sea', 'Deep-water coral']
# Fetching page: "Deep-water coral"
Function calling - Schema
Function Calling
● It's an OpenAPI schema, written as a protobuf.
● The protobuf-classes are available in the google.genai
client library.
● Functions can be written as JSON-compatible objects.
More examples:
https://ai.google.dev/gemini-api/docs/function-calling
https://ai.google.dev/tutorials/function_calling_python_quickstart
calculator = {
'function_declarations': [{
'name': 'multiply',
'description': 'Returns the product of two numbers.',
'parameters': {
'type': 'OBJECT',
'properties': {
'a': {'type': 'NUMBER'},
'b': {'type': 'NUMBER'}},
'required': ['a', 'b']}}]}
config = {
'tools': [set_light_values],
'automatic_function_calling': {'disable': True},
}
Structured output
Train the Trainer in AI
Structured data
Structured output
● Structured data extraction.
● You can just ask the model to do it and return JSON.
goo.gle/gemini-api-cookbook-json-output
AI Studio example
response = client.models.generate_content(
model='gemini-2.0-flash',
contents=textwrap.dedent("""
Please return JSON describing the the people, places, things and relationships from this
story using the following schema:
{"people": list[PERSON], "places":list[PLACE], "things":list[THING], "relationships": list[RELATION
PERSON = {"name": str, "description": str, "start_place_name": str, "end_place_name": str}
PLACE = {"name": str, "description": str}
THING = {"name": str, "description": str, "start_place_name": str, "end_place_name": str}
RELATIONSHIP = {"person_1_name": str, "person_2_name": str, "relationship": str}
Here is the story:
""") + story)
AI Studio example
class Recipe(BaseModel):
recipe_name: str
recipe_description: str
recipe_ingredients: list[str]
class Cookbook(BaseModel):
cookbook_name: str
cookbook_description: str
cookbook_recipes: list[Recipe]
response = client.models.generate_content(
model='gemini-2.0-flash',
contents=”Create a cooking book about cookies, macarons and crème brulées.”,
config=types.GenerateContentConfig(
response_mime_type="application/json",
response_schema=Cookbook,
),
)
goo.gle/gemini-api-cookbook-json-output
Function calling - Structured data
Structured output
● Asking for JSON often works.
● Function calling lets you strictly describe the schema.
● With a strict description, we can strictly enforce that that's what
gets returned.
get_people = types.FunctionDeclaration(
name="get_people",
description="Get information about characters",
parameters=Person,
)
get_relationships = types.FunctionDeclaration(
name="get_relationships",
description="Get information about relationships between people",
parameters=Relationships
)
get_places = types.FunctionDeclaration(
name="get_places",
description="Get information about places",
parameters=Places
)
story_tools = types.Tool(
function_declarations=[get_people, get_relationships, get_places],
)
Function calling - Structured data
Function Calling
● Why?
○ Function calling doesn't return a string.
○ Function calling returns data-structures: You don't parse
text.
https://ai.google.dev/tutorials/structured_data_extraction
response = client.models.generate_content(f"""
Please add the people, places, things and relationships
from this story to the database:
{story}
""")
fc = response.candidates[0].content.parts[0].function_call
pprint.pprint(type(fc).to_dict(fc))
{'name': 'add_to_database',
'args': {'people': [{'description': 'A curious girl with wild hair of '
'midnight blue and eyes that shone like...,
'name': 'Luna',
...
'places': [{'description': 'Nestled between rolling hills and '
'sparkling rivers, where Luna and her '
'family lived.',
'name': 'Small town'}],
'relationships': [{'person_1_name': 'Luna', 'person_2_name': 'Elara',
'relationship': 'Mother-daughter'},
{'person_1_name': 'Luna', 'person_2_name': 'Orion',
'relationship': 'Father-daughter'}],
'things': [{'description': 'A dusty, forgotten backpack that Luna '
'found in the attic, which possessed a...',
'end_place_name': None,
'name': 'Backpack',
'start_place_name': 'Attic'},
...
Train the Trainer in AI
Vision understanding
AI Intelligence: Exploring the Future of Artificial Intelligence
AI Intelligence: Exploring the Future of Artificial Intelligence
AI Intelligence: Exploring the Future of Artificial Intelligence
AI Intelligence: Exploring the Future of Artificial Intelligence
AI Intelligence: Exploring the Future of Artificial Intelligence
AI Intelligence: Exploring the Future of Artificial Intelligence
AI Intelligence: Exploring the Future of Artificial Intelligence
AI Intelligence: Exploring the Future of Artificial Intelligence
AI Intelligence: Exploring the Future of Artificial Intelligence
AI Intelligence: Exploring the Future of Artificial Intelligence
Spatial understanding
Multimodality
● Images and videos are just tokens in the input
● Can be used for instructions, context or query subject
● Also supports structured data outputs for visual content
goo.gle/gemini-2-spatial
AI Studio example
from google import genai
from google.genai import types
import PIL.Image
image = PIL.Image.open('/path/to/image.png')
client = genai.Client(api_key=GEMINI_API_KEY)
response = client.models.generate_content(
model="gemini-2.0-flash",
contents=["What is this image?", image])
print(response.text)
Multimodality goo.gle/gemini-2-spatial
Image → structured data
Multimodality
● Generate text or structured data from images
goo.gle/gemini-api-cookbook-pdf
AI Studio
Endpoints
Images
Structured Data goo.gle/gemini-api-cookbook-pdf
AI Studio
Endpoints
Images
Structured Data goo.gle/gemini-api-cookbook-pdf
Experimental spatial understanding
Multimodality
● Pointing at specific items
● Able to find them again in other pictures
● 3d understanding
goo.gle/gemini-2-spatial-3d
Multimodality goo.gle/gemini-2-spatial-3d
Video understanding
Multimodality
● Search within videos
● Extract information and text
● Structure information
● Analyse key moments
● Soon: Youtube integration
goo.gle/gemini-api-cookbook-video
AI Studio example
Train the Trainer in AI
Audio understanding
goo.gle/gemini-api-cookbook-audio
AI Studio example
Train the Trainer in AI
Live API
Live API
Live API
● Low-latency bidirectional voice and video interactions
● Outputs natural, human-like voice (English only)
● Multi-tools integration (but no automatic function calling)
● Experimental, limited to 10mn
● Uses websockets instead of REST
goo.gle/gemini-api-cookbook-liveapi
AI Studio example
AI Intelligence: Exploring the Future of Artificial Intelligence
Train the Trainer in AI
Multimodal generation
Imagen
Multimodal output
● Our highest quality text-to-image model
● Generates stunning, detailed, artifact-free images.
● Understands complex prompts and diverse styles.
● Superior text rendering for creative uses.
● Built with robust safety and watermarking (SynthID).
● Some extra options when using Vertex AI (negative prompt,
image2image)
● Paid tier only (0.03$/image)
goo.gle/gemini-api-cookbook-imagen
Prompting guide
Image-out
Multimodal output
● Currently in early-access
● Gemini can generate images as well as text
● Optimized for multi-turn conversations
Audio-out
Multimodal output
● Currently in early-access
● Gemini can generate audio directly (like Live API)
● Optimized for multi-turn conversations
Veo
Multimodal output
● Currently in early-access
● State-of-the-art realistic video generation.
● Produces high-quality 4K video output.
● Advanced motion and physics simulation.
● Precise camera control and shot styling.
● Superior realism and artifact reduction.
Tuning
Train the Trainer in AI
Tuning
● If you aren’t able to get good results with:
○ Prompting
○ Retrieval (RAG)
○ Tools and function calling
○ Search grounding
…then try fine-tuning.
Tuning
https://ai.google.dev/tutorials/tuning_quickstart_python
AI Intelligence: Exploring the Future of Artificial Intelligence
AI Intelligence: Exploring the Future of Artificial Intelligence
AI Intelligence: Exploring the Future of Artificial Intelligence
AI Intelligence: Exploring the Future of Artificial Intelligence
Gemma
Let the fun begin! 😁
Colab: Inference images with Gemma3
AI Intelligence: Exploring the Future of Artificial Intelligence
Agadir

More Related Content

PPTX
[GEMINI EXTERNAL DECK] Introduction to Gemini.pptx
PPTX
Intro To Gemini API - Build with AI.pptx
PDF
GDG Los Angeles - Introduction to Gemini
PPTX
Building Your Own AI Instance (TBLC AI )
PDF
Gemini 2.0 and Vertex AI for Innovation Workshop
PPTX
GDG MIT Generative AI : Getting started with the study jam
PDF
[BuildWithAI] Introduction to Gemini.pdf
PPTX
BUILD WITH AI GDG on Campus PEC Tiruvallur.
[GEMINI EXTERNAL DECK] Introduction to Gemini.pptx
Intro To Gemini API - Build with AI.pptx
GDG Los Angeles - Introduction to Gemini
Building Your Own AI Instance (TBLC AI )
Gemini 2.0 and Vertex AI for Innovation Workshop
GDG MIT Generative AI : Getting started with the study jam
[BuildWithAI] Introduction to Gemini.pdf
BUILD WITH AI GDG on Campus PEC Tiruvallur.

Similar to AI Intelligence: Exploring the Future of Artificial Intelligence (20)

PDF
build-with-ai-sydney AI for web devs Tamas Piros
PDF
Enterprise Trends for Gen AI - Berkeley LLM AI Agents MOOC
PPTX
Google-AI-Studio which takes input and tries
PPTX
Build with AI Event master deck final final
PPTX
Artificial Intelligence (AI) – Powering Data and Conversations.pptx
PPTX
English (US)_Classroom Toolkit - Unlocking Generative AI Safely and Responsib...
PPTX
Orientation session
PDF
Understanding GenAI/LLM and What is Google Offering - Felix Goh
PDF
Google Cloud Presentation GraphSummit Melbourne 2024: Building Generative AI ...
PDF
Leveraging Google Generative AI (Google Gemini) and Ethical Coding Practices ...
PPTX
[KZ] Web Ecosystem with Multimodality of Gemini.pptx
PDF
Easy path to machine learning (Spring 2021)
PPTX
AI_Prompt_engineering_and_LLMs- Data Science digest.pptx
PDF
Easy path to machine learning (2023-2024)
PDF
Build with AI on Google Cloud Session #1
PDF
Easy path to machine learning (2022)
PDF
How to program DL & AI applications
PPTX
English-US-Classroom-Toolkit-Unlocking-Generative-AI-Safely-and-Responsibly-M...
PDF
Event Build With AI - 21 October 2024.pdf
PDF
generative ai is very powerful and generative ai is very powerful
build-with-ai-sydney AI for web devs Tamas Piros
Enterprise Trends for Gen AI - Berkeley LLM AI Agents MOOC
Google-AI-Studio which takes input and tries
Build with AI Event master deck final final
Artificial Intelligence (AI) – Powering Data and Conversations.pptx
English (US)_Classroom Toolkit - Unlocking Generative AI Safely and Responsib...
Orientation session
Understanding GenAI/LLM and What is Google Offering - Felix Goh
Google Cloud Presentation GraphSummit Melbourne 2024: Building Generative AI ...
Leveraging Google Generative AI (Google Gemini) and Ethical Coding Practices ...
[KZ] Web Ecosystem with Multimodality of Gemini.pptx
Easy path to machine learning (Spring 2021)
AI_Prompt_engineering_and_LLMs- Data Science digest.pptx
Easy path to machine learning (2023-2024)
Build with AI on Google Cloud Session #1
Easy path to machine learning (2022)
How to program DL & AI applications
English-US-Classroom-Toolkit-Unlocking-Generative-AI-Safely-and-Responsibly-M...
Event Build With AI - 21 October 2024.pdf
generative ai is very powerful and generative ai is very powerful
Ad

Recently uploaded (20)

DOCX
ENGLISH PROJECT FOR BINOD BIHARI MAHTO KOYLANCHAL UNIVERSITY
DOCX
"Project Management: Ultimate Guide to Tools, Techniques, and Strategies (2025)"
PPTX
The spiral of silence is a theory in communication and political science that...
PPTX
worship songs, in any order, compilation
PDF
Swiggy’s Playbook: UX, Logistics & Monetization
PPTX
PHIL.-ASTRONOMY-AND-NAVIGATION of ..pptx
PDF
Presentation1 [Autosaved].pdf diagnosiss
PPTX
BIOLOGY TISSUE PPT CLASS 9 PROJECT PUBLIC
PPTX
Learning-Plan-5-Policies-and-Practices.pptx
PDF
Nykaa-Strategy-Case-Fixing-Retention-UX-and-D2C-Engagement (1).pdf
PPT
The Effect of Human Resource Management Practice on Organizational Performanc...
DOC
学位双硕士UTAS毕业证,墨尔本理工学院毕业证留学硕士毕业证
PPTX
S. Anis Al Habsyi & Nada Shobah - Klasifikasi Hambatan Depresi.pptx
PPTX
Tour Presentation Educational Activity.pptx
PDF
natwest.pdf company description and business model
PPTX
Primary and secondary sources, and history
PPTX
Relationship Management Presentation In Banking.pptx
PDF
Parts of Speech Prepositions Presentation in Colorful Cute Style_20250724_230...
PPTX
Emphasizing It's Not The End 08 06 2025.pptx
PPTX
AcademyNaturalLanguageProcessing-EN-ILT-M02-Introduction.pptx
ENGLISH PROJECT FOR BINOD BIHARI MAHTO KOYLANCHAL UNIVERSITY
"Project Management: Ultimate Guide to Tools, Techniques, and Strategies (2025)"
The spiral of silence is a theory in communication and political science that...
worship songs, in any order, compilation
Swiggy’s Playbook: UX, Logistics & Monetization
PHIL.-ASTRONOMY-AND-NAVIGATION of ..pptx
Presentation1 [Autosaved].pdf diagnosiss
BIOLOGY TISSUE PPT CLASS 9 PROJECT PUBLIC
Learning-Plan-5-Policies-and-Practices.pptx
Nykaa-Strategy-Case-Fixing-Retention-UX-and-D2C-Engagement (1).pdf
The Effect of Human Resource Management Practice on Organizational Performanc...
学位双硕士UTAS毕业证,墨尔本理工学院毕业证留学硕士毕业证
S. Anis Al Habsyi & Nada Shobah - Klasifikasi Hambatan Depresi.pptx
Tour Presentation Educational Activity.pptx
natwest.pdf company description and business model
Primary and secondary sources, and history
Relationship Management Presentation In Banking.pptx
Parts of Speech Prepositions Presentation in Colorful Cute Style_20250724_230...
Emphasizing It's Not The End 08 06 2025.pptx
AcademyNaturalLanguageProcessing-EN-ILT-M02-Introduction.pptx
Ad

AI Intelligence: Exploring the Future of Artificial Intelligence

  • 1. Keeping Up with AI: Gemma’s Here So You Don’t Left Behind Agadir Mohammed Buallay
  • 2. Mohammed Buallay - Google Developers Experts for AI/ML - Lead Data Architect and Reporting - I love to travel and meet new people
  • 4. Mohammed Buallay - I started working on AI in 2017 - As senior project
  • 6. Let’s meet with LLMs What is Gemma? Automate News Summary Outlines
  • 8. What is an LLM? [...] [...] [...] [...] 0.02 0.03 0.9 0.01 0.0 … Dogs Rain Drops Fish Wind … and cats raining It’s
  • 9. ,‫ﻧﺘﻌﻠﻢ‬ ‫ﻣﻨﻚ‬ ‫اﺣﻨﺎ‬ ‫و‬ ‫ﻣﻌﻠﻢ‬ ‫أﻧﺖ‬ Violets are blue, Sugar is sweet,
  • 10. ‫ﻧﺘﻌﻠﻢ‬ ‫ﻣﻨﻚ‬ ‫اﺣﻨﺎ‬ ‫و‬ ‫ﻣﻌﻠﻢ‬ ‫أﻧﺖ‬ ‫اﻟﻄﻴﺐ‬ ‫ﻓﻲ‬ ‫اﻟﻤﺜﻞ‬ ‫ﺑﻚ‬ ‫ﻧﻀﺮب‬ ‫واﻟﻬﻤﺔ‬ ‫اﻟﻘﻠﺐ‬ ‫ﻣﻦ‬ ‫وﻧﻘﻮﻟﻬﺎ‬ ‫ﺑﻚ‬ ‫ﻧﻔﺨﺮ‬ ‫ﻛﻠﻨﺎ‬ ‫ﻓﻲ‬ ‫ﺑﺲ‬ ‫ﻣﻮ‬ ،‫اﻟﻔﻌﻞ‬ ‫ﻓﻲ‬ ‫ﻗﺪوﺗﻨﺎ‬ ‫اﻟﻜﻠﻤﺔ‬ *
  • 11. Modern LLMs can be very large. LLMs Explained
  • 12. Classic Natural Language Problems LLMs Explained Entity extraction Classification Summarization Sentiment Analysis Translation Code Generation …and more!
  • 13. LLMs let us prototype fast. LLMs Explained
  • 14. Explain this meme: Model: The meme plays of the double meaning of “sprite”: Model: User: Therefore, a game developer’s favorite drink is Sprite because of their love for sprites in game development, thus the drink’s name is a pun. Sprite (the drink): popular lemon-lime soda Sprite (2D bitmap image): often used to represent characters, objects, etc. in game environments. LLMs Explained
  • 15. Model: 1. **Build a volcano.** This is a classic science fair project that is both fun and educational. You can learn about the science of volcanoes, earthquakes, and plate tectonics. 2. **Make a solar oven.** This is a great project to learn about renewable energy and how to harness the power of the sun. You can use your solar oven to cook food, dry fruit, or even melt chocolate. 3. **Create a 3D printer.** This is a more challenging project, but it is also very rewarding. You can learn about 3D printing technology and design your own custom objects. 4. **Build a robot.** This is a great project for those who are interested in engineering and robotics. You can learn about basic electronics, mechanics, and programming. 5. **Design a water filtration system.** This is a project that can help you learn about the importance of clean water and how to filter it. You can use your water filtration system to clean water from a stream or river. LLMs Explained Give me five neat ideas for science fair projects. User:
  • 16. Why are large language models different? LLMs are characterized by emergent abilities, or the ability to perform tasks that were not present in smaller models. LLMs contextual understanding of human language changes how we interact with data and intelligent systems. LLMs can find patterns and connections in massive, disparate data corpora. Search Conversation Content generation
  • 17. What are some everyday tasks that can be improved by AI?
  • 18. Google is the industry pioneer in AI Responsible AI 3,000 Researchers 7,000 Publications Built & Tested for Safety Privacy in design Upholds high scientific standards Accountable to People Socially Beneficial Avoid creating unfair bias 2015 Google DeepMind AlphaGo defeats Go champion 2016 Google’s DeepMind helps detect eye disease 2017 Google invents Transformer kickstarting LLM revolution 2018 Google’s groundbreaking large language model, BERT 2019 Text-to-Text Transfer Transformer LLM 10B P Model Open Sourced 2020 Google LaMDA Model Trained to converse 2022 AlphaFold predicts structures of all known proteins 2023 A conversational AI Service powered by PaLM2 2024 Family of multimodal LLMs & products
  • 23. The Gemini Ecosystem The most advanced AI from Google For Developers For Consumers For Business and Enterprise Models Gemini API (in Google AI Studio + ai.google.dev) Gemini for Google Workspace Gemini for Google Cloud Gemini in Vertex AI Gemini | app and web Gemini in the Google App Gemini in Gmails, Docs… NotebookLM this deck is about But also Imagen, Veo…. Unified SDK
  • 24. Getting started with the Gemini API Train the Trainer in AI
  • 25. AI Studio aistudio.google.com ● Generate API Keys ● Create, test, and save prompts ● Customize models in minutes ● Generate starter code
  • 31. ai.google.dev REST API + new Client libraries for Python, Node, Java, and Go Libraries Libraries Unified SDKs …and more!
  • 32. Vertex AI Enterprise grade support. Full MLOps (Examples: Model evaluation, monitoring, registry) Vertex AI Check it out when you're ready for enterprise-scale production workloads. Gemini API and Vertex AI Both give access Gemini family models. Vertex AI Cloud Production ai.google.dev/docs/migrate_to_cloud PlatfoEndpoints rms Platforms Platforms Now way easier with the unified SDKs
  • 33. Usage patterns Train the Trainer in AI
  • 34. We’ll cover ● Basic prompting ● Code generation and execution ● Grounding with Google Search ● Function calling and structured outputs ● Vision (images, video) and audio understanding ● Image and video generation ● Tuning Usage Patterns
  • 35. Basic prompting Train the Trainer in AI
  • 36. Write a poem about a magic backpack. In a world of wonder, behold, A backpack with tales yet untold. Its fabric shimmers, alive with might, A portal to realms, day and night. aistudio.google.com
  • 37. Write a poem about a magic backpack. In a world of wonder, behold, A backpack with tales yet untold. Its fabric shimmers, alive with might, A portal to realms, day and night. ... aistudio.google.com
  • 38. Generate a short tip about JavaScript. Good tips are short and contain a code example showing the tip. **Tip: Use Array.prototype.flat() to flatten an array of arrays.** ```javascript const array = [[1, 2], [3, 4], [5, 6]]; console.log(array.flat()); // Output: [1, 2, 3, 4, 5, 6] ```
  • 39. Generate a short tip about JavaScript. Good tips are short and contain a code example showing the tip. **Tip: Use Array.prototype.flat() to flatten an array of arrays.** ```javascript const array = [[1, 2], [3, 4], [5, 6]]; console.log(array.flat()); // Output: [1, 2, 3, 4, 5, 6] ```
  • 40. Generate a short tip about JavaScript. Good tips are short and contain a code example showing the tip. topic: let and const tip: Prefer using `let` and `const` to initialise variables, instead of `var`, as `var` has no block scope or protection against mutability. e.g. const myVar = "EXAMPLE"; topic: arrow syntax tip:
  • 41. from google import genai response = client.models.generate_content( model="gemini-2.0-flash", contents="Explain how AI works", ) print(response.text) # Okay, let's break down how AI (Artificial Intelligence) works. It's a broad and complex field, so I’ll focus on the core concepts and provide examples to make...
  • 42. chat = client.chats.create( model="gemini-2.0-flash" ) response = chat.send_message( "Hello, what should I have for dinner?") print(response.text) # 'Here are some suggestions...' response = chat.send_message( "How do I cook the first one?")
  • 43. Contents Settings/Configs Text Text Text Tools Tools Safety Settings Safety Settings Generation Config Content of role: "user" containing multimodal Parts: Content of role: "model" "user" content: a text Part Tools specified by the caller Safety setting configured by caller Generation Config includes: temperature, Top P, Top K, stop seq, max output tokens etc. Generate Content Request
  • 44. Candidate(s) Feedback Text Block Reason Safety Ratings Candidate "Content" note: you can ask for more than one content now why the model stopped generating feedback on the prompt Finish Reason Finish Message set if finish reason is present Safety Ratings Safety Ratings how safe is the response Generate Content Response …
  • 45. AI Studio Endpoints Prompting https://ai.google.dev/docs/prompt_best_practices ● Chained prompts - make a plan, then execute it ● Context - Few shot prompts ● Generation parameters - Temperature, Safety settings, Top-P, TopK
  • 46. 🤳
  • 47. Code generation Train the Trainer in AI
  • 48. Code Generation Code generation Convert human language requests to machine requests Rapid prototyping for tools, extensions, and applications Generate code based on a user prompt
  • 49. Examples Code generation ● Generate data AI Studio example
  • 51. Examples Code generation ● Generate data ● Generate a SQL query AI Studio example
  • 53. Examples Code generation ● Generate data ● Generate a SQL query ● Simulate Execution AI Studio example
  • 55. Examples Code generation ● Generate data ● Generate a SQL query ● Simulate Execution ● Quick coding help - Smart pair programer. AI Studio example
  • 57. Write a unit test for this Python function: def is_diver_neutrally_buoyant(density_water, density_diver): """Check if a diver is neutrally buoyant.""" if density_water == density_diver: return True AI Studio example
  • 58. import unittest class IsDiverNeutrallyBuoyantTest(unittest.TestCase): def test_diver_is_neutrally_buoyant(self): self.assertTrue(is_diver_neutrally_buoyant(1000, 1000)) def test_diver_is_not_neutrally_buoyant(self): self.assertFalse(is_diver_neutrally_buoyant(1000, 1010)) def test_invalid_input_types(self): with self.assertRaises(TypeError): is_diver_neutrally_buoyant("1000", 1000) with self.assertRaises(TypeError): is_diver_neutrally_buoyant(1000, "1000")
  • 59. Examples Code generation ● Generate data ● Generate a SQL query ● Simulate Execution ● Quick coding help - Smart pair programer. ● Generating a full web application ● Integration in IDEs (ex: Copilot, Cursor, Continue, etc.)
  • 60. Code Execution Train the Trainer in AI
  • 61. Code execution ● Available via AI Studio and the Gemini APIs ● Gives Gemini the ability to write, execute, and fix Python code ● Sandboxed environment (no internet access) ● Can also output images! Tuning goo.gle/gemini-api-cookbook-code-execution AI Studio example
  • 62. response = client.models.generate_content( model = "gemini-2.0-flash", contents = "Can you draw a graph of the first 50 values of the fibonacci serie?", config = types.GenerateContentConfig( tools=[types.Tool( code_execution=types.ToolCodeExecution )] ) )
  • 63. response = client.models.generate_content( model = "gemini-2.0-flash", contents = "Can you draw a graph of the first 50 values of the fibonacci serie?", config = types.GenerateContentConfig( tools=[types.Tool( code_execution=types.ToolCodeExecution )] ) ) # Okay, I can help you with that. I'll generate the first 50 Fibonacci numbers and then create a graph plotting the index (from 1 to 50) against the Fibonacci number. I will use python to generate the numbers and then display a graph.
  • 64. import matplotlib.pyplot as plt def fibonacci(n): fib_list = [] a, b = 0, 1 for _ in range(n): fib_list.append(b) a, b = b, a + b return fib_list fib_numbers = fibonacci(50) plt.figure(figsize=(12, 6)) plt.plot(range(1, 51), fib_numbers, marker='o', linestyle='-') plt.title('First 50 Fibonacci Numbers') plt.xlabel('Index') plt.ylabel('Fibonacci Number') plt.grid(True) plt.yscale('linear') # Or 'log' if you want to see the initial values better plt.show()
  • 65. Search Grounding Train the Trainer in AI
  • 66. ● Models have knowledge cut-offs ● Grounding with Google Search provides up-to-date information ● Gives Gemini the ability to browse the web and cite sources ● 500 free RPD with 2.0 models Grounding on Google Search Search & IR goo.gle/gemini-api-cookbook-grounding AI Studio example
  • 67. ● Instructions + Context + Question all in the prompt ● No need for RAG or fine-tuning ● Easy to implement ○ No extra code, just upload and ask. Use the prompt's context window Search & IR goo.gle/gemini-api-cookbook-pdf
  • 68. response = client.models.generate_content( model="gemini-2.0-flash", contents="Who is Guillaume Vernade?", config = { 'tools': [{ 'google_search': {} }] } ) print(response.text) # Guillaume Vernade, also known as Giom, is a tech enthusiast who has worked in various roles, from neural networks and AI to Agile coaching and video game production at companies like Ubisoft and Google Stadia. Currently, he works at Google Deepmind as a Developer Advocate, focusing on Generative AI with Gemini models.
  • 69. pdf_file = client.files.upload(file="invoice.pdf") response = client.models.generate_content( model="gemini-2.0-flash", contents=[ "Summarize this document", pdf_file, ]) print(response.text) # This doc is an GCP invoice related to your Vertex AI usage…
  • 70. Use the prompt's context window Learning more ● gemini-2.0-flash: 1M tokens. ○ Remember: Everything is tokens. ○ Up to 2M+ tokens for Pro series ● Using context may be more flexible than Retrieval Augmented Generation (RAG), or fine-tuning. ● More input context means higher latency ○ Use context caching to help reduce load times https://blog.google/technology/ai/google-gemini-next-generation-model-february-2024 Search & IR
  • 72. Search & IR Deep-research ● Gemini agent built using grounding and function calling ● Plan first ● Then search, reason and report Try deep research in the Gemini app
  • 73. Search & IR NotebookLM ● Understand complex info using your uploaded sources. ● Ask questions and get answers with citations. ● Generate summaries, study guides, and audio overviews. ● Works with Google Docs, PDFs, web pages. NotebookLM
  • 74. Function Calling Train the Trainer in AI
  • 76. Function Calling https://github.com/GoogleCloudPlatform/generative-ai/tree/main/gemini/function-calling/sql-talk-app SELECT ROUND(( COUNT(DISTINCT IF(returned_at IS NOT NULL, order_id, NULL)) / COUNT(DISTINCT order_id)) * 100, 2) AS return_rate FROM thelook_ecommerce.orders goo.gle/gemini-fn-call-sql-github
  • 77. Function Calling Function Calling ● Describe external functions to the model. ● The model may ask you to call the function to help it respond to your queries (if not using automatic function calling). ● Endless possibilities for integrating external tools. goo.gle/gemini-api-cookbook-function-callin g
  • 78. Function calling - Basics Function Calling ● How? ● The google.genai SDK will inspect the function's type hints to determine the schema. ● Allowed types are limited: AllowedTypes = ( int | float | str | bool | list | dict ) goo.gle/gemini-api-cookbook-function-callin g
  • 79. def dim_lights(brightness: float) -> bool: """Dim the lights.""" print(f”Lights are now set to {brightness:.0%}”) return True response = client.models.generate_content( model='gemini-2.0-flash', contents="Start the movie party" config = types.GenerateContentConfig( tools=[dim_lights])
  • 80. Function calling - Basics Function Calling ● Because function calling requires alternating turns, it's easiest to use through chat. ● Enable "automatic function calling" when you start a chat, and the Chat session will call the function(s) for you. ○ You don't have to use automatic function calling, it just makes simple cases easier. goo.gle/gemini-api-cookbook-function-callin g
  • 81. chat = client.chats.create( model = “gemini-2.0-flash”, config = { 'tools': [set_light_values] } ) response = chat.send_message( 'Please turn down the lights about halfway.')
  • 82. response = chat.send_message( 'Please turn down the lights about halfway.') print(response.text) # Lights are now set to 50%.
  • 83. Function calling - Basics Function Calling ● What happened? Use the chat history to find out. ● The chat history collects all the function calls and responses that took place. goo.gle/gemini-api-cookbook-function-callin g
  • 84. for content in chat.get_history(): part = content.parts[0] print(content.role,":")) if part.text: print(part.text) if part.function_call: print("Function call: {", part.function_call, "}") if part.function_response: print("Function response: {", part.function_response, "}")
  • 85. Contents Tools Text Function Declaration Function Declaration Text Function Calling interaction Function Call Function Response Text model may predict a function call based on user content model can understand the function response and generate text OR another function call if one or more function declarations are provided, function calling feature will turn on
  • 86. Function calling - More Examples Function Calling ● Wikipedia research aid ○ Integrates a search tool. ○ Uses the Gemini API inside the function call to summarize pages. def wikipedia_search(queries:list[str]) -> list[str]: ... https://ai.google.dev/docs/search_reranking_using_embeddings/
  • 87. chat = model.start_chat(enable_automatic_function_calling=True) chat = client.chats.create( model = 'gemini-2.0-flash', config = { "tools": [wikipedia_search], "automatic_function_calling": {"disable": False} # automatic_function_calling is enabled by default } ) res = chat.send_message("Explain how deep-sea life survives.")
  • 88. chat = model.start_chat(enable_automatic_function_calling=True) chat = client.chats.create( model = 'gemini-2.0-flash', config = {"tools": [wikipedia_search]} ) res = chat.send_message("Explain how deep-sea life survives.") # Searching for "How do deep-sea creatures survive the extreme pressure?" # Related search terms: ['Deep sea', 'Deep-sea community', 'Deep-sea fish'] # Fetching page: "Deep sea" # Information Source: https://en.wikipedia.org/wiki/Deep_sea # Fetching page: "Deep-sea community" # Information Source: https://en.wikipedia.org/wiki/Deep-sea_community # Fetching page: "Deep-sea fish" # Information Source: https://en.wikipedia.org/wiki/Deep-sea_fish # Searching for "How do deep-sea creatures survive the cold temperatures?" # Related search terms: ['Deep-sea community', 'Deep sea', 'Deep-water coral'] # Fetching page: "Deep-water coral"
  • 89. Function calling - Schema Function Calling ● It's an OpenAPI schema, written as a protobuf. ● The protobuf-classes are available in the google.genai client library. ● Functions can be written as JSON-compatible objects. More examples: https://ai.google.dev/gemini-api/docs/function-calling https://ai.google.dev/tutorials/function_calling_python_quickstart
  • 90. calculator = { 'function_declarations': [{ 'name': 'multiply', 'description': 'Returns the product of two numbers.', 'parameters': { 'type': 'OBJECT', 'properties': { 'a': {'type': 'NUMBER'}, 'b': {'type': 'NUMBER'}}, 'required': ['a', 'b']}}]}
  • 91. config = { 'tools': [set_light_values], 'automatic_function_calling': {'disable': True}, }
  • 93. Structured data Structured output ● Structured data extraction. ● You can just ask the model to do it and return JSON. goo.gle/gemini-api-cookbook-json-output AI Studio example
  • 94. response = client.models.generate_content( model='gemini-2.0-flash', contents=textwrap.dedent(""" Please return JSON describing the the people, places, things and relationships from this story using the following schema: {"people": list[PERSON], "places":list[PLACE], "things":list[THING], "relationships": list[RELATION PERSON = {"name": str, "description": str, "start_place_name": str, "end_place_name": str} PLACE = {"name": str, "description": str} THING = {"name": str, "description": str, "start_place_name": str, "end_place_name": str} RELATIONSHIP = {"person_1_name": str, "person_2_name": str, "relationship": str} Here is the story: """) + story) AI Studio example
  • 95. class Recipe(BaseModel): recipe_name: str recipe_description: str recipe_ingredients: list[str] class Cookbook(BaseModel): cookbook_name: str cookbook_description: str cookbook_recipes: list[Recipe] response = client.models.generate_content( model='gemini-2.0-flash', contents=”Create a cooking book about cookies, macarons and crème brulées.”, config=types.GenerateContentConfig( response_mime_type="application/json", response_schema=Cookbook, ), ) goo.gle/gemini-api-cookbook-json-output
  • 96. Function calling - Structured data Structured output ● Asking for JSON often works. ● Function calling lets you strictly describe the schema. ● With a strict description, we can strictly enforce that that's what gets returned.
  • 97. get_people = types.FunctionDeclaration( name="get_people", description="Get information about characters", parameters=Person, ) get_relationships = types.FunctionDeclaration( name="get_relationships", description="Get information about relationships between people", parameters=Relationships ) get_places = types.FunctionDeclaration( name="get_places", description="Get information about places", parameters=Places )
  • 99. Function calling - Structured data Function Calling ● Why? ○ Function calling doesn't return a string. ○ Function calling returns data-structures: You don't parse text. https://ai.google.dev/tutorials/structured_data_extraction
  • 100. response = client.models.generate_content(f""" Please add the people, places, things and relationships from this story to the database: {story} """) fc = response.candidates[0].content.parts[0].function_call pprint.pprint(type(fc).to_dict(fc))
  • 101. {'name': 'add_to_database', 'args': {'people': [{'description': 'A curious girl with wild hair of ' 'midnight blue and eyes that shone like..., 'name': 'Luna', ... 'places': [{'description': 'Nestled between rolling hills and ' 'sparkling rivers, where Luna and her ' 'family lived.', 'name': 'Small town'}], 'relationships': [{'person_1_name': 'Luna', 'person_2_name': 'Elara', 'relationship': 'Mother-daughter'}, {'person_1_name': 'Luna', 'person_2_name': 'Orion', 'relationship': 'Father-daughter'}], 'things': [{'description': 'A dusty, forgotten backpack that Luna ' 'found in the attic, which possessed a...', 'end_place_name': None, 'name': 'Backpack', 'start_place_name': 'Attic'}, ...
  • 102. Train the Trainer in AI Vision understanding
  • 113. Spatial understanding Multimodality ● Images and videos are just tokens in the input ● Can be used for instructions, context or query subject ● Also supports structured data outputs for visual content goo.gle/gemini-2-spatial AI Studio example
  • 114. from google import genai from google.genai import types import PIL.Image image = PIL.Image.open('/path/to/image.png') client = genai.Client(api_key=GEMINI_API_KEY) response = client.models.generate_content( model="gemini-2.0-flash", contents=["What is this image?", image]) print(response.text)
  • 116. Image → structured data Multimodality ● Generate text or structured data from images goo.gle/gemini-api-cookbook-pdf
  • 117. AI Studio Endpoints Images Structured Data goo.gle/gemini-api-cookbook-pdf
  • 118. AI Studio Endpoints Images Structured Data goo.gle/gemini-api-cookbook-pdf
  • 119. Experimental spatial understanding Multimodality ● Pointing at specific items ● Able to find them again in other pictures ● 3d understanding goo.gle/gemini-2-spatial-3d
  • 121. Video understanding Multimodality ● Search within videos ● Extract information and text ● Structure information ● Analyse key moments ● Soon: Youtube integration goo.gle/gemini-api-cookbook-video AI Studio example
  • 122. Train the Trainer in AI Audio understanding goo.gle/gemini-api-cookbook-audio AI Studio example
  • 123. Train the Trainer in AI Live API
  • 124. Live API Live API ● Low-latency bidirectional voice and video interactions ● Outputs natural, human-like voice (English only) ● Multi-tools integration (but no automatic function calling) ● Experimental, limited to 10mn ● Uses websockets instead of REST goo.gle/gemini-api-cookbook-liveapi AI Studio example
  • 126. Train the Trainer in AI Multimodal generation
  • 127. Imagen Multimodal output ● Our highest quality text-to-image model ● Generates stunning, detailed, artifact-free images. ● Understands complex prompts and diverse styles. ● Superior text rendering for creative uses. ● Built with robust safety and watermarking (SynthID). ● Some extra options when using Vertex AI (negative prompt, image2image) ● Paid tier only (0.03$/image) goo.gle/gemini-api-cookbook-imagen Prompting guide
  • 128. Image-out Multimodal output ● Currently in early-access ● Gemini can generate images as well as text ● Optimized for multi-turn conversations
  • 129. Audio-out Multimodal output ● Currently in early-access ● Gemini can generate audio directly (like Live API) ● Optimized for multi-turn conversations
  • 130. Veo Multimodal output ● Currently in early-access ● State-of-the-art realistic video generation. ● Produces high-quality 4K video output. ● Advanced motion and physics simulation. ● Precise camera control and shot styling. ● Superior realism and artifact reduction.
  • 132. Tuning ● If you aren’t able to get good results with: ○ Prompting ○ Retrieval (RAG) ○ Tools and function calling ○ Search grounding …then try fine-tuning. Tuning https://ai.google.dev/tutorials/tuning_quickstart_python
  • 137. Gemma Let the fun begin! 😁
  • 138. Colab: Inference images with Gemma3
  • 140. Agadir