Outsmarting LLMs:
5 Strategies for Founders & Technologists
Andrew Filev
CEO & Founder
Zencoder AI
Introduction to LLMs
Scale and pre-training have been the core drivers of LLM performance up until 2024.
[Chart: Performance vs. Time]
And then “the frontier” slowed down a bit.
[Chart: Performance vs. Time, marked “We are here”]
NOT TODAY
Let’s talk about limitations of
LLMs. They are trained to predict
the most probable next symbol,
even if it sometimes leads to
inconsistencies or mistakes.
Failure modes
1. “Fuzzy” (probabilistic) memorization, somewhat “associative” memory.
2. Not reliable in following an algorithm, unless extensively trained on it.
3. Struggle with “multi-hop” reasoning, unless extensively trained with similar examples.
4. No differentiation of epistemic vs. aleatoric uncertainty, no estimates for the cost of error, no explicit evaluation/feedback.
5. They work well “in distribution”, and they are mostly trained on open internet data, which limits that distribution.
There are ways to combat these failure modes. And it’s not what the frontier labs would have you believe.
Performance that doesn’t rely on access to vast computing resources
Source: Epoch AI
“AI Capabilities Can Be Significantly Improved Without Expensive Retraining”
“While scaling compute for training is key to improving LLM performance, some post-training enhancements can offer gains equivalent to training with 5 to 20x more compute at less than 1% of the cost.”
The Power of Scaling for Reasoning: Exp or Log?
[Chart: Activity vs. Time, comparing pre-training and new techniques, marked “We are here”]
The Future is Still Amazing, and it’s yours!
We are still on the exponential curve, but the direction has shifted: there have been recent breakthroughs in context length, inference speed, and cost.
Example:
- Mar’23: GPT-4 announced
- Nov’23: GPT-4 Turbo is 3x cheaper for input and 2x cheaper for output
- May’24: GPT-4o is 2x cheaper than GPT-4 Turbo
5 Ways
1. “Fuzzy” (probabilistic) memorization, somewhat “associative” memory → Context is key.
2. Not reliable in following an algorithm, unless extensively trained on it → Leverage tools.
3. Struggle with “multi-hop” reasoning, unless extensively trained with similar examples → Use various ways to decompose the problem.
4. No differentiation of epistemic vs. aleatoric uncertainty, no estimates for the cost of error, no explicit evaluation/feedback → Incorporate feedback in your workflows.
5. They work well “in distribution” and are mostly trained on open internet data, which limits that distribution → Experiment with fine-tuning on your proprietary data.
The gap between commercial and open-source LLMs is closing very fast.
[Chart: general comparison of commercial (proprietary) and open-source LLMs. MMLU-Pro: the more recent version of the Massive Multitask Language Understanding benchmark.]
Your Secret Weapon: Agentic design patterns
Planning
The LLM comes up with, and executes, a
multistep plan to achieve a goal (for example,
writing an outline for an essay, then doing online
research, then writing a draft, and so on).
Tool Use
The LLM is given tools such as web search, code
execution, or any other function to help it gather
information, take action, or process data.
Reflection
The LLM examines its own work to come up with
ways to improve it.
Multi-agent collaboration
Multiple AI agents work together, splitting up tasks and discussing and debating ideas, to come up with better solutions than a single agent would.
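As one illustration, the Reflection pattern above can be sketched as a generate-critique-revise loop. The `call_llm` function here is a hypothetical stand-in for any chat-completion API, not a real client:

```python
def call_llm(prompt: str) -> str:
    """Hypothetical stand-in for a real chat-completion API call."""
    if prompt.startswith("Critique"):
        return "The draft misses edge cases."
    if prompt.startswith("Revise"):
        return "final answer (revised)"
    return "first draft"

def reflect(task: str, max_rounds: int = 2) -> str:
    """Reflection pattern: the model critiques and then revises its own work."""
    draft = call_llm(f"Solve: {task}")
    for _ in range(max_rounds):
        critique = call_llm(f"Critique this answer to '{task}': {draft}")
        if "no issues" in critique.lower():
            break  # the model is satisfied with its own work
        draft = call_llm(f"Revise. Task: {task}. Answer: {draft}. Critique: {critique}")
    return draft
```

The same loop shape works with any real model client swapped in for `call_llm`; the pattern is the loop, not the stub.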
Case Study
Let’s leverage embedded AI agents to improve code generation. By embedding agentic AI into code assistants, we offload routine tasks, improve code coverage, and integrate seamlessly into your development workflows.
A robust AI pipeline can improve automatic code generation.
[Diagram: Current solution, a typical AI coding assistant: Engineer → Assistant → LLM, suffering from missing or irrelevant context, hallucinations, and incorrect code. Agentic improvement: Engineer → AI Agent → Assistant → LLM, with intelligent context (“Repo Grokking™”) that repairs and improves code.]
1. Context is key
The recent rapid expansion of context length allows us to significantly improve reasoning capabilities in complicated scenarios.
[Chart: hypothetical performance in your specific domain, stacking Open Source + Improved Context]
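A minimal sketch of putting context to work: retrieve the most relevant snippets from your own corpus and place them in the prompt, rather than relying on what the model memorized. The keyword-overlap scorer below is a deliberately naive placeholder for a real embedding-based retriever:

```python
def retrieve(query: str, corpus: list[str], k: int = 2) -> list[str]:
    """Rank corpus snippets by naive keyword overlap with the query."""
    q = set(query.lower().split())
    scored = sorted(corpus, key=lambda s: -len(q & set(s.lower().split())))
    return scored[:k]

def build_prompt(query: str, corpus: list[str]) -> str:
    """Assemble a prompt whose facts come from your data, not the model's memory."""
    context = "\n".join(retrieve(query, corpus))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer using only the context."
```

With long context windows, `k` can grow from a handful of snippets to whole documents, which is what makes this approach increasingly viable.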
2. Leverage tools
[Chart: hypothetical performance in your specific domain, stacking Open Source + Improved Context + Tooling]
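Tool use can be sketched as a dispatch loop: the model emits a structured tool call, the runtime executes it, and the result goes back into the conversation. The `pick_tool` stand-in and the tool names are illustrative, not a specific vendor API:

```python
import json

# A registry of tools the model is allowed to call.
TOOLS = {
    "calculator": lambda expr: str(eval(expr, {"__builtins__": {}})),  # demo only; never eval untrusted input
    "search": lambda q: f"(stub) top result for: {q}",
}

def pick_tool(question: str) -> str:
    """Hypothetical model output: a JSON tool call chosen for the question.
    A real model would emit this JSON itself and extract the expression."""
    if any(ch.isdigit() for ch in question):
        return json.dumps({"tool": "calculator", "input": "17 * 23"})
    return json.dumps({"tool": "search", "input": question})

def answer_with_tools(question: str) -> str:
    call = json.loads(pick_tool(question))
    result = TOOLS[call["tool"]](call["input"])
    return result  # in a real agent, this would be fed back to the model
```

The point is the contract: the model chooses *what* to do, the runtime does it reliably, sidestepping the model's weakness at following algorithms on its own.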
3. Use various ways to decompose the problem (Fixed Chain, CoT, Agentic Planning)
[Chart: hypothetical performance in your specific domain, stacking Open Source + Improved Context + Tooling + Workflow]
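A fixed chain, the simplest of the three decomposition styles, can be sketched as a pipeline of single-purpose prompts, each step consuming the previous step's output. `call_llm` is again a hypothetical stand-in for a real model call:

```python
def call_llm(prompt: str) -> str:
    """Hypothetical stand-in: echoes which step ran on which input."""
    step, _, payload = prompt.partition(": ")
    return f"[{step} done for: {payload}]"

def fixed_chain(task: str) -> str:
    """Decompose one hard request into small, reliable single-hop steps."""
    outline = call_llm(f"Outline: {task}")
    research = call_llm(f"Research: {outline}")
    draft = call_llm(f"Draft: {research}")
    return draft
```

Each hop is a task the model handles well in isolation, which is exactly how this pattern works around the multi-hop reasoning weakness.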
4. Incorporate feedback in your workflows
[Chart: hypothetical performance in your specific domain, stacking Open Source + Improved Context + Tooling + Workflow + Feedback]
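Feedback can be sketched as closing the loop with an objective checker. The hypothetical `generate` stub and its canned outputs are illustrative; in practice the checker would be your compiler, test suite, or linter:

```python
def generate(task, error=None):
    """Hypothetical code-generating model: fixes itself once shown an error."""
    if error is None:
        return "def add(a, b): return a - b"   # first attempt has a bug
    return "def add(a, b): return a + b"       # corrected after feedback

def check(code):
    """Objective feedback: run the candidate against a known test case."""
    ns = {}
    exec(code, ns)
    return None if ns["add"](2, 3) == 5 else "add(2, 3) != 5"

def generate_with_feedback(task, max_tries=3):
    error = None
    for _ in range(max_tries):
        code = generate(task, error)
        error = check(code)
        if error is None:
            return code  # passed the checks
    raise RuntimeError(f"still failing: {error}")
```

The checker supplies the explicit evaluation signal the model lacks on its own, bounded by a retry budget.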
5. Experiment with fine-tuning on your proprietary data
[Chart: hypothetical performance in your specific domain, stacking Open Source + Improved Context + Tooling + Workflow + Feedback + Fine-tuning]
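Fine-tuning usually starts with formatting your proprietary examples as prompt/completion pairs. The JSONL shape below is a common convention, but the exact field names vary by provider, so treat them as an assumption:

```python
import json

def to_jsonl(examples: list[tuple[str, str]]) -> str:
    """Serialize (prompt, completion) pairs as one JSON object per line."""
    lines = [
        json.dumps({"prompt": p, "completion": c}, ensure_ascii=False)
        for p, c in examples
    ]
    return "\n".join(lines)
```

The resulting file is what a typical fine-tuning job consumes; the hard part is curating pairs that actually represent your domain.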
Key Takeaways
LLMs are incredibly powerful, but that
power has its limitations.
Open-source capabilities are rapidly approaching those of frontier models, giving you a strong foundation to work with.
You can address those limitations with your proprietary data, tooling, and know-how, building powerful agentic systems.
Increases in context length, together with improvements in inference cost and latency, make “in-context learning” a viable alternative to training the models; that’s especially important during initial prototyping.
Q&A
SaaStr Annual 2024: Outsmarting LLMs: 5 Strategies for Founders & Technologists with Zencoder
