mindcraft.ai
Few Shot Learning
History:
1995 - Internet
2000 - Software
2005 - Web
2010 - Machine Learning
2016 - Deep Learning
2018 - Transformers
2020 - Large Models
2022 - Transfer Learning
sudeep.co
Transfer Learning in LLM
- what had an impact on few shot learning
- no-code: text instructions
- text embeddings
- data translation
- federated learning
medium.com
Fine Tuning
- image classification
- object detection
- semantic and instance segmentation
- style transfer
- text classification, NER
COCO dataset
Dataset Generation
- generate or augment problem-specific data using an LLM, a diffusion model, etc.
- select a model from HuggingFace
- fine tune it with the dataset
Zero Shot Learning
- it is not unsupervised
- train on some classes and then predict on a new one
- multimodal (image and text embeddings)
- human language prompt in a Large Model
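
A minimal sketch of the embedding-based variant of zero shot learning: an item is assigned to the class whose label embedding is closest, even though no labelled examples of that class were seen. The three-dimensional vectors and the label names are made up for illustration; a real system would take both from a shared text or image encoder.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def zero_shot_classify(item_vec, label_vecs):
    """Return the label whose embedding is most similar to the item embedding."""
    return max(label_vecs, key=lambda label: cosine(item_vec, label_vecs[label]))

# Hypothetical toy embeddings (assumption: produced by one shared encoder).
label_vecs = {
    "invoice": [0.9, 0.1, 0.0],
    "contract": [0.1, 0.9, 0.1],
    "resume": [0.0, 0.2, 0.9],
}
document_vec = [0.2, 0.8, 0.2]

print(zero_shot_classify(document_vec, label_vecs))  # contract
```

Adding a new class only requires a new label embedding, which is why no retraining is needed in this setup.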
One Shot Learning
- template matching
- clustering and finding the closest centroid
- triplet loss and face recognition
- human language prompt + example
medium.com/@crimy
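
As a rough illustration of the triplet loss mentioned above: the loss pulls an anchor embedding towards a positive example (same identity) and pushes it away from a negative one (different identity) by at least a margin. The squared Euclidean distance, the margin of 0.2 and the two-dimensional vectors are assumptions chosen for the sketch, not values from the talk.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def triplet_loss(anchor, positive, negative, margin=0.2):
    """max(d(anchor, positive) - d(anchor, negative) + margin, 0)."""
    gap = squared_distance(anchor, positive) - squared_distance(anchor, negative)
    return max(gap + margin, 0.0)

anchor = [0.0, 0.0]
positive = [0.1, 0.0]       # same identity, close to the anchor
easy_negative = [1.0, 0.0]  # different identity, already far away
hard_negative = [0.3, 0.0]  # different identity, still too close

print(triplet_loss(anchor, positive, easy_negative))  # zero: margin already satisfied
print(triplet_loss(anchor, positive, hard_negative))  # positive: triplet still violates the margin
```

Once such an embedding is trained, a single reference image per person is enough: a new face is matched to the nearest stored embedding.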
Few Shot Learning
- recommendation systems
- prototypical networks
- chat models - prompt + examples
- LLM fine tuning
towardsdatascience.com
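
The classification step of a prototypical network can be sketched as follows: each class prototype is the mean of its few support embeddings, and a query goes to the nearest prototype. In a real prototypical network the embeddings come from a trained encoder; here the vectors, labels and function names are hypothetical stand-ins.

```python
def mean_vector(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]

def build_prototypes(support_set):
    """Map each label to the mean of its support embeddings."""
    return {label: mean_vector(vectors) for label, vectors in support_set.items()}

def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def classify(query, prototypes):
    """Return the label of the prototype closest to the query embedding."""
    return min(prototypes, key=lambda label: squared_distance(query, prototypes[label]))

# Two support examples per class (a "2-shot" setting); values are made up.
support_set = {
    "cat": [[1.0, 0.0], [0.8, 0.2]],
    "dog": [[0.0, 1.0], [0.2, 0.8]],
}
prototypes = build_prototypes(support_set)

print(classify([0.7, 0.3], prototypes))  # cat
```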
Reinforcement Learning
- translate images into language tokens
- using an autoregressive Transformer to learn a world model
- trained on the equivalent of 2 hours of gameplay
- outperformed humans in 10 of 26 games
ICLR 2023, Transformers are Sample-Efficient World Models
Document Classification
- text classification task in ~80 categories
- multiple languages
- used BERT and Ada embeddings
- dataset augmentation with GPT3.5
- simple NN for the classification task
- planning to add fine tuning
Anomaly detection
- check financial declarations
- zero shot learning approach
- catches only obvious things
- requires historical and snapshot clustering
open.ai
Custom Assistant
- replacing a categorization bot with a natural-language one
- collected a dataset of ~2k Q&A pairs
- fine tuned OpenAI Davinci
- spent $300 on OpenAI
- such a system can only gently suggest, not decide
NER
- NEs are collected by pattern search (RegEx)
- validated if possible
- used GPT3 for context matching
- alternatively, using GPT3.5 for direct search
open.ai
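
The first two steps above (pattern search, then validation where possible) might look like the sketch below. The choice of entity (dates in DD.MM.YYYY form), the pattern, and the sample text are assumptions made for illustration; the talk does not say which entities or patterns were used.

```python
import re
from datetime import datetime

# Hypothetical pattern: candidate dates written as DD.MM.YYYY.
DATE_PATTERN = re.compile(r"\b\d{2}\.\d{2}\.\d{4}\b")

def is_valid_date(text):
    """Validate a candidate by parsing it as a real calendar date."""
    try:
        datetime.strptime(text, "%d.%m.%Y")
        return True
    except ValueError:
        return False

def extract_dates(text):
    """Collect candidates by regex, then keep only those that validate."""
    candidates = DATE_PATTERN.findall(text)
    return [candidate for candidate in candidates if is_valid_date(candidate)]

sample = "Signed on 12.03.2023, amended on 31.02.2023, expires 01.01.2025."
print(extract_dates(sample))  # ['12.03.2023', '01.01.2025']
```

The regex accepts 31.02.2023 as a candidate, and the validation step rejects it because that date does not exist.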
Document Scope
- factorize the document using few shot learning on GPT3.5
- create a template by clustering Ada embeddings
- assign names with zero-shot learning on GPT3.5
- check document portions against the template
open.ai
Fixing UNSPSC
- old rule-based system
- collecting embeddings (BERT, Ada)
- XGBoost for classification
- 400+ classes, almost 80% accuracy
Address Normalization and Deduplication
- few shot learning on GPT3.5
- similarity check with Levenshtein distance
- moving to a model from HuggingFace
- saved 2 months of work
open.ai
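
A minimal sketch of the similarity check with Levenshtein distance, applied to addresses that are assumed to be already normalized. The length-based similarity score, the 0.85 threshold and the sample addresses are illustrative assumptions, not values from the talk.

```python
def levenshtein(a, b):
    """Edit distance (insertions, deletions, substitutions) between two strings."""
    if len(a) < len(b):
        a, b = b, a  # keep the rows as short as possible
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]

def similarity(a, b):
    """Score in [0, 1]: 1.0 means identical strings."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest

def find_duplicates(addresses, threshold=0.85):
    """Return pairs of addresses whose similarity reaches the threshold."""
    pairs = []
    for i in range(len(addresses)):
        for j in range(i + 1, len(addresses)):
            if similarity(addresses[i].lower(), addresses[j].lower()) >= threshold:
                pairs.append((addresses[i], addresses[j]))
    return pairs

addresses = ["12 Main Street, Lviv", "12 Main Stret, Lviv", "5 Shevchenko Ave, Kyiv"]
print(find_duplicates(addresses))
```

The pairwise loop is quadratic in the number of addresses, so for large lists some form of blocking (for example, comparing only addresses in the same city) would be needed first.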
Future of Our Job
- Data preparation
- Fine Tuning
- Edge Models
- Prompt Engineering
promptbase.com
This is MindCraft
Decision-making Engines for Data-driven Businesses, especially:
- Document and Web pages Classification, Capturing (NLP, CNN, CV, NER)
- Price Prediction (DNN, Regression, Prognosis)
- Command Centers for IoT systems (RNN, Time Series, Anomaly Detection)
- Computer Vision and Object Detection
- Data Analysis and Generation

Andy Bosyi: Few-shot learning as a trade-off between software development and data science
