Chatbots and AI Agents
11-667: Large Language Models: Methods and Applications
What to expect on the midterm
• Conceptual questions about the content of the lecture and readings
• Topics you should prepare to be assessed on
• Transformer architecture
• Pretraining (data collection and learning objectives)
• Finetuning techniques and data (alignment, RLHF, PETM)
• Evaluation (human and automatic)
• In-context learning
• Interpretability
• Applications (search, dialog agents)
When you think of “chatbot,” what comes to mind?
• ChatGPT
• Bard
• character.ai
Implementation
1. Take a pre-trained LLM
2. Finetune it on appropriate data
3. Apply clever prompting
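For concreteness, a minimal sketch of steps 1 and 3 using the Hugging Face transformers library; the model name here is a small stand-in, and step 2 (finetuning on dialog data) is omitted:

```python
# Minimal chatbot sketch: pretrained LM + prompting. "gpt2" is a stand-in;
# a real chatbot would start from a much larger, instruction-tuned LLM.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

# "Clever prompting": frame the input so the LM continues as an assistant.
prompt = "User: Recommend a fantasy novel.\nAssistant:"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(
    **inputs, max_new_tokens=64, do_sample=True, top_p=0.9,
    pad_token_id=tokenizer.eos_token_id,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```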
What distinguishes an AI agent from a chatbot?
• An agent…
• exists within an environment
• can take actions that change its environment
• can converse with other agents within the environment
• has a persona
• has a goal
• has memories of what has previously transpired
General-purpose chatbots (ChatGPT, Bard, etc.) do not exist in an
environment they can alter, and they do not have specific goals. All
memory is implicit in the conversational history.
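A hypothetical sketch of this distinction in code; all field names are illustrative, not drawn from any of the systems discussed later:

```python
# An agent couples an LLM with explicit state; a chatbot is (roughly)
# just a function of the conversation history.
from dataclasses import dataclass, field

@dataclass
class Agent:
    persona: str                      # natural-language self-description
    goal: str                         # what the agent is trying to accomplish
    memories: list[str] = field(default_factory=list)  # explicit memory store
    location: str = ""                # position within an environment

    def act(self, observation: str) -> str:
        """Record the observation, then pick an environment-changing action."""
        self.memories.append(observation)
        return "take action"          # a real agent would query an LLM here

def chatbot_reply(history: list[str]) -> str:
    return "reply"                    # all "memory" is implicit in history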
Why care about building AI agents?
• Entertainment / video games
• Modeling real-user behavior
• For example, testing a new application with “mock” users could be less expensive than hiring
real users to test it out.
• Prerequisite for embodied agents.
• We can use agents acting in a virtual environment to measure progress
toward agents acting in a real one.
• Challenging evaluation platform for natural language understanding
and generation
Case Studies in this Lecture
• Agents in a fantasy text adventure game
• “Learning to Speak and Act in a Fantasy Text Adventure Game.” Urbanek et al., 2019.
• Diplomacy-playing agent
• “Human-level play in the game of Diplomacy by combining language models with strategic reasoning.” Bakhtin et al., 2022.
• Simulated town
• “Generative Agents: Interactive Simulacra of Human Behavior.” Park et al., 2023.
Agents in a fantasy text adventure game
• Environment:
• Locations, randomly glued together
• Each location also has some number of items
• Agents:
• Each agent is situated in the environment.
• Each agent possesses some number of items
• Agent actions:
• Emote: {applaud, cringe, cry, etc.}
• Chat with other agents
• Perform a physical action (e.g. “put robes in
closet” or “eat salmon”)
• Agents, locations, and items have natural-language descriptions.
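A hypothetical encoding of such an environment; the names and descriptions below are illustrative, not taken from the paper's actual crowdsourced data:

```python
# Toy data structures for the fantasy text adventure environment.
from dataclasses import dataclass, field

@dataclass
class Location:
    description: str                   # natural-language description
    items: list[str] = field(default_factory=list)
    neighbors: list[str] = field(default_factory=list)  # randomly glued together

@dataclass
class GameAgent:
    persona: str
    location: str
    inventory: list[str] = field(default_factory=list)

EMOTES = ["applaud", "cringe", "cry"]  # subset shown on the slide

graveyard = Location("A mist-shrouded graveyard.", ["shovel"], ["crypt"])
thief = GameAgent("I am a thief who lurks among the graves.", "graveyard", ["lockpick"])
```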
Agents in a text adventure game
Task goal: Can we generate a conversation between the thief and the gravedigger and
predict which actions/emotes they will take after each conversational utterance?
Agents in Diplomacy, a negotiation-based board game
• Seven players, each controlling a country, compete to capture supply centers (SCs) on a map.
• At each turn, players chat with each other
to decide on their actions.
• Any promises, agreements, threats, etc. are
non-binding.
• Once chatting is over, players may choose to:
• Move their units, waging war if they move into an already-occupied region
• Use their units to support other units (which
could include the units of a different player)
Task goal: An AI agent that follows the same rules and norms as the human agents, and
has as good a win-rate as skilled human players.
Agents in a simulated town
• Modeled after the video game The Sims
• 25 agents
• Each begins the simulation with a pre-
defined set of “seed memories”
• Agents do not have explicit goals
• At each step:
• Each agent outputs a natural-language statement of its action
• “write in journal”
• “walk to pharmacy”
• “talk to Joe”
• Actions and environment state are
parsed into memories, reflections,
and observations.
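A hypothetical sketch of this per-step loop, where llm() stands in for a call to an instruction-tuned model and memory parsing is reduced to storing the raw action string:

```python
# One step of a Town-Sim-style loop: each agent emits a natural-language
# action, which is then stored back as a memory.
def llm(prompt: str) -> str:
    return "write in journal"   # stub: a real system would query GPT-3 here

def simulation_step(agents: list[dict]) -> None:
    for agent in agents:
        prompt = (
            f"{agent['persona']}\n"
            f"Recent memories: {'; '.join(agent['memories'][-5:])}\n"
            "What does this agent do next? Answer with one short action."
        )
        action = llm(prompt)                       # e.g. "talk to Joe"
        # Actions (and resulting observations) become new memories.
        agent["memories"].append(f"I decided to: {action}")

agents = [{"persona": "John Lin, a pharmacist.", "memories": ["seed memory"]}]
simulation_step(agents)
```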
Where can LLMs be used in these systems?
• Dialog with other agents (who may be either human agents or other AI
agents)
• Deciding on agent intents
• Choosing what information (from the environment and from the agent’s
internal state) to condition the conversation and decision-making on.
Challenges:
• How can we convert world and agent state into natural language?
• How can we convert natural language into agent actions and environment
changes?
• Can all these tasks be accomplished with a general-purpose LM or do we
need finetuned models?
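A toy sketch of the first two conversion directions, under an assumed state schema and action grammar (neither is taken from the papers):

```python
# Both directions of the state <-> natural language bridge, simplified.
import re

def state_to_text(state: dict) -> str:
    """Serialize structured world state into a natural-language prompt."""
    return (
        f"You are in the {state['location']}. "
        f"You can see: {', '.join(state['items'])}. "
        f"Also here: {', '.join(state['agents'])}."
    )

def text_to_action(utterance: str) -> tuple[str, str] | None:
    """Parse a model's output into a (verb, object) action, if it matches."""
    match = re.match(r"(put|eat|take) (.+)", utterance.strip().lower())
    return (match.group(1), match.group(2)) if match else None

print(state_to_text({"location": "graveyard", "items": ["shovel"], "agents": ["gravedigger"]}))
print(text_to_action("eat salmon"))  # -> ('eat', 'salmon')
```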
Choosing information to condition the conversation and decision-making on.
• In many cases, there will be more information than can fit into an LM
context window. Most of this won’t be relevant.
• The Town Sim maintains a database of memories. Memories are scored by their recency, importance, and relevance to the query memory.
• Relevance is computed by taking an LM sequence embedding of the query memory and of each memory in the database, then scoring database memories by dot product with the query embedding (see the retrieval sketch at the end of this section).
• In Diplomacy, the dialog model and intent model see as input:
• dialogue history (all messages exchanged between player A and the six other players up
to time t)
• game state, action history, and metadata (current game state, recent action history,
game settings, A’s Elo rating, etc.)
• For the dialog model: A’s intended actions, and the actions A wants its conversational
partner to complete.
• In the Fantasy Text Adventure, dialog rounds were short enough that all environment information and history fit within the model's maximum sequence length.
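A minimal sketch of the Town Sim's retrieval scoring, assuming memory embeddings are precomputed by an LM encoder; the exponential recency decay and the equal weighting of the three components are simplifying assumptions (the paper combines normalized scores, with weights that could be tuned):

```python
# Score memories by relevance (dot product), recency, and importance,
# then return the top-k to place in the LM's context window.
import numpy as np

def normalize(x: np.ndarray) -> np.ndarray:
    return (x - x.min()) / (x.max() - x.min() + 1e-8)

def retrieve(query_emb: np.ndarray, memories: list[dict], now: int, k: int = 3) -> list[dict]:
    """memories: dicts with 'emb' (vector), 'importance' (float), 'last_access' (int)."""
    relevance = np.array([m["emb"] @ query_emb for m in memories])
    recency = np.array([0.99 ** (now - m["last_access"]) for m in memories])
    importance = np.array([m["importance"] for m in memories])
    score = normalize(relevance) + normalize(recency) + normalize(importance)
    return [memories[i] for i in np.argsort(-score)[:k]]
```

Normalizing each component before summing keeps any single signal (e.g. a few very important memories) from dominating the score.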
Deciding on agent intent
• Can we trust an LLM to choose reasonable intents?
• Fantasy Text Adventure Game
• Yes, via a finetuned BERT-based ranker
• Simulated Town
• Yes, through prompting GPT-3 with an agent’s description and memories
• Hierarchical generation: generate a broad plan first, and then generate smaller steps in
the plan
• Diplomacy
• No; instead, a reinforcement learning agent trained through self-play outputs the action intent
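A hypothetical sketch of the hierarchical generation used in the Simulated Town case, with llm() standing in for a GPT-3 call conditioned on the agent's description and memories; the prompts and the parsing are illustrative:

```python
# Hierarchical generation: ask for a broad plan, then expand each step.
import re

def llm(prompt: str) -> str:
    return "1) wake up 2) open the pharmacy 3) have dinner"  # stub response

def plan_day(persona: str, memories: list[str]) -> list[str]:
    broad = llm(f"{persona}\nMemories: {memories}\nOutline today's plan as numbered steps.")
    steps = [s.strip() for s in re.split(r"\d+\)", broad) if s.strip()]
    # Second pass: decompose each broad step into finer-grained actions.
    return [llm(f"{persona}\nBreak '{step}' into 5-15 minute actions.") for step in steps]

print(plan_day("John Lin, a pharmacist.", ["woke up at 7am"]))
```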
Dialog with other agents
• All three examples in our case study use LLMs to generate dialog.
• Diplomacy and Fantasy Text Adventure used finetuned models
• Simulated Town used instruction-tuned GPT-3 without further finetuning
• When is finetuning especially helpful?
• If the world state cannot be effectively represented in natural language.
• When bad dialog can lead to poor outcomes
• The Simulated Town paper notes that its generated dialogs tend to be very formal and stilted, likely due to GPT-3’s instruction tuning.
• An LLM is not always the right tool for the job:
• Example: a Settlers of Catan AI agent can do well with just templated text generation
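An illustration of the templated alternative; the templates and slot names are invented for this example, not taken from an actual Catan agent:

```python
# Templated generation: for a constrained game, a handful of slot-filled
# templates can cover most of the dialog an agent needs to produce.
TEMPLATES = {
    "offer": "I'll give you {give} {give_res} for {get} {get_res}.",
    "accept": "Deal! Sending you {give} {give_res}.",
    "reject": "No thanks, I need my {give_res} right now.",
}

def render(intent: str, **slots) -> str:
    return TEMPLATES[intent].format(**slots)

print(render("offer", give=2, give_res="wheat", get=1, get_res="ore"))
# -> "I'll give you 2 wheat for 1 ore."
```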
Takeaways
Quiz Question
In what kinds of scenarios would a pre-trained LLM without finetuning
not be a good choice for outputting agent intents?
