Outsmarting LLMs:
5 Strategies for Founders & Technologists
Andrew Filev
CEO & Founder
Zencoder AI
Introduction to LLMs
Scale and pre-training have been the core drivers of LLM performance up until 2024.
[Chart: Performance vs. Time]
And then “the frontier” slowed down a bit.
[Chart: Performance vs. Time, marked “We are here”]
NOT TODAY
Let’s talk about limitations of
LLMs. They are trained to predict
the most probable next symbol,
even if it sometimes leads to
inconsistencies or mistakes.
Failure modes
1. “Fuzzy” (probabilistic) memorization, somewhat “associative” memory.
2. Not reliable in following an algorithm, unless extensively trained on it.
3. Struggle with “multi-hop” reasoning, unless extensively trained with similar examples.
4. No differentiation of epistemic vs. aleatoric uncertainty, no estimates for the cost of error, no explicit evaluation/feedback.
5. They work well “in distribution”, and they are mostly trained on open internet data, which limits that distribution.
There are ways to combat these failure modes. And it’s not what the frontier labs would have you believe.
Performance that doesn’t rely on access to vast computing resources
Source: Epoch AI
“AI Capabilities Can Be Significantly Improved Without Expensive Retraining”
“While scaling compute for training is key to improving LLM performance, some post-training enhancements can offer gains equivalent to training with 5 to 20x more compute at less than 1% of the cost.”
The Power of Scaling for Reasoning: Exp or Log?
[Chart: Activity vs. Time, comparing pre-training and new techniques, marked “We are here”]
The Future is Still Amazing, and it’s yours!
We are still on the exponential curve, but the direction has shifted: there have been recent breakthroughs in context length, inference speed, and cost.
Example:
- Mar’23: GPT-4 announced
- Nov’23: GPT-4 Turbo is 3x cheaper for input and 2x cheaper for output
- May’24: GPT-4o is 2x cheaper than GPT-4 Turbo
5 Ways
1. “Fuzzy” (probabilistic) memorization, somewhat “associative” memory → Context is key.
2. Not reliable in following an algorithm, unless extensively trained on it → Leverage tools.
3. Struggle with “multi-hop” reasoning, unless extensively trained with similar examples → Use various ways to decompose the problem.
4. No differentiation of epistemic vs. aleatoric uncertainty, no estimates for the cost of error, no explicit evaluation/feedback → Incorporate feedback in your workflows.
5. They work well “in distribution” and are mostly trained on open internet data, which limits that distribution → Experiment with fine-tuning on your proprietary data.
The gap between commercial and open-source LLMs is closing very fast.
[Chart: general comparison of commercial (proprietary) and open-source LLMs. MMLU-Pro: the more recent version of the Massive Multitask Language Understanding benchmark.]
Your Secret Weapon: Agentic design patterns
Planning
The LLM comes up with, and executes, a
multistep plan to achieve a goal (for example,
writing an outline for an essay, then doing online
research, then writing a draft, and so on).
Tool Use
The LLM is given tools such as web search, code
execution, or any other function to help it gather
information, take action, or process data.
Reflection
The LLM examines its own work to come up with
ways to improve it.
Multi-agent collaboration
Multiple AI agents work together, splitting up tasks and discussing and debating ideas, to come up with better solutions than a single agent would.
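As one illustration, the Reflection pattern above can be sketched as a generate-critique-revise loop. The `call_llm` function here is a hypothetical stand-in for any chat-completion API, not a real client:

```python
def call_llm(prompt: str) -> str:
    """Hypothetical stand-in for a real chat-completion API call."""
    if prompt.startswith("Critique"):
        return "The draft misses edge cases."
    if prompt.startswith("Revise"):
        return "final answer (revised)"
    return "first draft"

def reflect(task: str, max_rounds: int = 2) -> str:
    """Reflection pattern: the model critiques and then revises its own work."""
    draft = call_llm(f"Solve: {task}")
    for _ in range(max_rounds):
        critique = call_llm(f"Critique this answer to '{task}': {draft}")
        if "no issues" in critique.lower():
            break  # the model is satisfied with its own work
        draft = call_llm(f"Revise. Task: {task}. Answer: {draft}. Critique: {critique}")
    return draft
```

The same loop shape works with any real model client swapped in for `call_llm`; the pattern is the loop, not the stub.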
Case Study
Let’s leverage embedded AI agents to improve code generation. By embedding agentic AI into code assistants, we offload routine tasks, improve code coverage, and integrate seamlessly into your development workflows.
A robust AI pipeline can improve automatic code generation.
[Diagram: Current solution, a typical AI coding assistant: Engineer → Assistant → LLM, suffering from missing or irrelevant context, hallucinations, and incorrect code. Agentic improvement: Engineer → AI Agent → Assistant → LLM, with intelligent context (“Repo Grokking™”) that repairs and improves code.]
1. Context is key
The recent rapid expansion of context length allows us to significantly improve reasoning capabilities in complicated scenarios.
[Chart: hypothetical performance in your specific domain, stacking Open Source + Improved Context]
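A minimal sketch of putting context to work: retrieve the most relevant snippets from your own corpus and place them in the prompt, rather than relying on what the model memorized. The keyword-overlap scorer below is a deliberately naive placeholder for a real embedding-based retriever:

```python
def retrieve(query: str, corpus: list[str], k: int = 2) -> list[str]:
    """Rank corpus snippets by naive keyword overlap with the query."""
    q = set(query.lower().split())
    scored = sorted(corpus, key=lambda s: -len(q & set(s.lower().split())))
    return scored[:k]

def build_prompt(query: str, corpus: list[str]) -> str:
    """Assemble a prompt whose facts come from your data, not the model's memory."""
    context = "\n".join(retrieve(query, corpus))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer using only the context."
```

With long context windows, `k` can grow from a handful of snippets to whole documents, which is what makes this approach increasingly viable.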
2. Leverage tools
[Chart: hypothetical performance in your specific domain, stacking Open Source + Improved Context + Tooling]
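Tool use can be sketched as a dispatch loop: the model emits a structured tool call, the runtime executes it, and the result goes back into the conversation. The `pick_tool` stand-in and the tool names are illustrative, not a specific vendor API:

```python
import json

# A registry of tools the model is allowed to call.
TOOLS = {
    "calculator": lambda expr: str(eval(expr, {"__builtins__": {}})),  # demo only; never eval untrusted input
    "search": lambda q: f"(stub) top result for: {q}",
}

def pick_tool(question: str) -> str:
    """Hypothetical model output: a JSON tool call chosen for the question.
    A real model would emit this JSON itself and extract the expression."""
    if any(ch.isdigit() for ch in question):
        return json.dumps({"tool": "calculator", "input": "17 * 23"})
    return json.dumps({"tool": "search", "input": question})

def answer_with_tools(question: str) -> str:
    call = json.loads(pick_tool(question))
    result = TOOLS[call["tool"]](call["input"])
    return result  # in a real agent, this would be fed back to the model
```

The point is the contract: the model chooses *what* to do, the runtime does it reliably, sidestepping the model's weakness at following algorithms on its own.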
3. Use various ways to decompose the problem (Fixed Chain, CoT, Agentic Planning)
[Chart: hypothetical performance in your specific domain, stacking Open Source + Improved Context + Tooling + Workflow]
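A fixed chain, the simplest of the three decomposition styles, can be sketched as a pipeline of single-purpose prompts, each step consuming the previous step's output. `call_llm` is again a hypothetical stand-in for a real model call:

```python
def call_llm(prompt: str) -> str:
    """Hypothetical stand-in: echoes which step ran on which input."""
    step, _, payload = prompt.partition(": ")
    return f"[{step} done for: {payload}]"

def fixed_chain(task: str) -> str:
    """Decompose one hard request into small, reliable single-hop steps."""
    outline = call_llm(f"Outline: {task}")
    research = call_llm(f"Research: {outline}")
    draft = call_llm(f"Draft: {research}")
    return draft
```

Each hop is a task the model handles well in isolation, which is exactly how this pattern works around the multi-hop reasoning weakness.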
4. Incorporate feedback in your workflows
[Chart: hypothetical performance in your specific domain, stacking Open Source + Improved Context + Tooling + Workflow + Feedback]
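Feedback can be sketched as closing the loop with an objective checker. The hypothetical `generate` stub and its canned outputs are illustrative; in practice the checker would be your compiler, test suite, or linter:

```python
def generate(task, error=None):
    """Hypothetical code-generating model: fixes itself once shown an error."""
    if error is None:
        return "def add(a, b): return a - b"   # first attempt has a bug
    return "def add(a, b): return a + b"       # corrected after feedback

def check(code):
    """Objective feedback: run the candidate against a known test case."""
    ns = {}
    exec(code, ns)
    return None if ns["add"](2, 3) == 5 else "add(2, 3) != 5"

def generate_with_feedback(task, max_tries=3):
    error = None
    for _ in range(max_tries):
        code = generate(task, error)
        error = check(code)
        if error is None:
            return code  # passed the checks
    raise RuntimeError(f"still failing: {error}")
```

The checker supplies the explicit evaluation signal the model lacks on its own, bounded by a retry budget.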
5. Experiment with fine-tuning on your proprietary data
[Chart: hypothetical performance in your specific domain, stacking Open Source + Improved Context + Tooling + Workflow + Feedback + Fine-tuning]
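Fine-tuning usually starts with formatting your proprietary examples as prompt/completion pairs. The JSONL shape below is a common convention, but the exact field names vary by provider, so treat them as an assumption:

```python
import json

def to_jsonl(examples: list[tuple[str, str]]) -> str:
    """Serialize (prompt, completion) pairs as one JSON object per line."""
    lines = [
        json.dumps({"prompt": p, "completion": c}, ensure_ascii=False)
        for p, c in examples
    ]
    return "\n".join(lines)
```

The resulting file is what a typical fine-tuning job consumes; the hard part is curating pairs that actually represent your domain.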
Key Takeaways
LLMs are incredibly powerful, but that
power has its limitations.
Open-source capabilities are rapidly approaching those of frontier models, giving you a strong foundation to work with.
You can address those limitations with your proprietary data, tooling, and know-how, building powerful agentic systems.
Increases in context length, together with improvements in inference cost and latency, make “in-context learning” a viable alternative to training the models; that’s especially important during initial prototyping.
Q&A
SaaStr Annual 2024: Outsmarting LLMs: 5 Strategies for Founders & Technologists with Zencoder
