TnT-LLM: Text Mining at Scale
with Large Language Models
 Authors: Mengting Wan, Tara Safavi, Sujay Kumar Jauhar,
Yujin Kim, Scott Counts, Jennifer Neville, Siddharth Suri,
Chirag Shah, Ryen W. White, Longqi Yang, Reid Andersen,
Georg Buscher, Dhruv Joshi, Nagu Rangan
 Affiliation: Microsoft Corporation & University of
Washington
 Presented by: Jayanth Kalyanam
Introduction
 The fundamental challenge in text mining is balancing
quality with scalability.
 Manual taxonomies require expensive domain expertise
and time-consuming curation,
 while automated clustering methods produce results
that are difficult to interpret.
 TnT-LLM addresses this by leveraging LLMs' ability to
understand context like humans while processing data
at machine speed.
System Architecture
 The system is designed to balance labeling quality against
computational cost. It has four core components:
 - Text Summarization Module
 - Taxonomy Generator
 - Data Labeling System
 - Lightweight Classifier
 Each component is modular, allowing for independent
optimization and updates.
 This architecture enables the system to maintain high
quality while scaling efficiently.
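The four components above can be wired together roughly as follows. This is a minimal sketch, not the authors' implementation: the `Pipeline` class and every `toy_*` function are hypothetical stand-ins (keyword rules instead of real LLM calls), used only to show how the modules hand data to one another.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Pipeline:
    """Hypothetical container; each field is one swappable module."""
    summarize: Callable[[str], str]
    generate_taxonomy: Callable[[List[str]], List[str]]
    label: Callable[[str, List[str]], str]
    train_classifier: Callable[[List[str], List[str]], Callable[[str], str]]

    def run(self, texts: List[str]) -> Tuple[List[str], Callable[[str], str]]:
        summaries = [self.summarize(t) for t in texts]
        taxonomy = self.generate_taxonomy(summaries)
        labels = [self.label(s, taxonomy) for s in summaries]
        return taxonomy, self.train_classifier(texts, labels)


# Toy stand-ins (assumptions): a real system would call an LLM here.
def toy_summarize(text: str) -> str:
    return text.strip().lower()[:80]


def toy_taxonomy(summaries: List[str]) -> List[str]:
    return ["coding", "travel", "other"]


def toy_label(summary: str, taxonomy: List[str]) -> str:
    if "python" in summary or "bug" in summary:
        return "coding"
    if "flight" in summary or "hotel" in summary:
        return "travel"
    return "other"


def toy_train(texts: List[str], labels: List[str]) -> Callable[[str], str]:
    # "Lightweight classifier": each word votes for the labels it co-occurred with.
    votes = defaultdict(Counter)
    for text, label in zip(texts, labels):
        for word in text.lower().split():
            votes[word][label] += 1

    def classify(text: str) -> str:
        tally = Counter()
        for word in text.lower().split():
            tally.update(votes.get(word, {}))
        return tally.most_common(1)[0][0] if tally else "other"

    return classify


pipeline = Pipeline(toy_summarize, toy_taxonomy, toy_label, toy_train)
texts = ["Fix this Python bug", "Book a flight to Rome", "Find a hotel in Rome"]
taxonomy, classifier = pipeline.run(texts)
print(taxonomy, classifier("python bug in loop"))
```

Because each module is passed in as a plain callable, any one of them can be replaced without touching the others, which is the modularity the slide describes.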
Technical Implementation
 Taxonomy Generation Pipeline:
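 The taxonomy is built with an approach modeled on stochastic
optimization, similar to how neural networks are trained: it is
generated from one batch of conversations and then updated batch by batch.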
 This approach allows the taxonomy to continuously
improve as it sees more data,
 much like how human experts refine their
understanding over time.
 Technical Specifications:
 - Batch size: 200 conversations
 - Parameters: {μ, Σ} for labels
 - Temperature: 0.5 (generation), 0.2 (update)
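The batch-wise refinement described above might look like the sketch below. The batch size and the two temperatures come from the slide; the function names `minibatches`, `refine_taxonomy`, `toy_propose`, and `toy_update` are hypothetical, and the toy functions stand in for LLM prompts (they ignore the temperature argument, which a real LLM call would use).

```python
from typing import Callable, Iterator, List

BATCH_SIZE = 200               # from the slide
GENERATION_TEMPERATURE = 0.5   # from the slide
UPDATE_TEMPERATURE = 0.2       # from the slide


def minibatches(items: List[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def refine_taxonomy(
    summaries: List[str],
    propose: Callable[[List[str], float], List[str]],
    update: Callable[[List[str], List[str], float], List[str]],
    batch_size: int = BATCH_SIZE,
) -> List[str]:
    """Propose a taxonomy from the first batch, then revise it on each later batch."""
    batches = minibatches(summaries, batch_size)
    first = next(batches, None)
    if first is None:
        return []
    taxonomy = propose(first, GENERATION_TEMPERATURE)
    for batch in batches:
        taxonomy = update(taxonomy, batch, UPDATE_TEMPERATURE)
    return taxonomy


# Toy stand-ins (assumptions): the first word of each summary acts as its label.
def toy_update(taxonomy: List[str], batch: List[str], temperature: float) -> List[str]:
    updated = list(taxonomy)
    for summary in batch:
        label = summary.split()[0]
        if label not in updated:
            updated.append(label)
    return updated


def toy_propose(batch: List[str], temperature: float) -> List[str]:
    return toy_update([], batch, temperature)


summaries = ["code question"] * 200 + ["travel question"] * 200 + ["health question"] * 50
final_taxonomy = refine_taxonomy(summaries, toy_propose, toy_update)
print(final_taxonomy)
```

In this toy run the taxonomy grows as later batches introduce topics the first batch never saw, which is the "improves as it sees more data" behavior the slide refers to.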
Experimental Setup
 Dataset Design:
 Bing Copilot conversations were chosen because they provide
natural, diverse interactions across multiple languages and
domains.
 The dataset split (Phase 1: 9,592 conversations; Phase 2:
48,160 conversations) allows for thorough testing of both
taxonomy generation and scaling capabilities.
 Classification Tasks:
 - Intent detection (10 categories)
 - Domain classification (25 categories)
Evaluation Metrics
 The evaluation strategy combines technical metrics with
practical assessments:
 Technical Metrics:
 - Coverage and accuracy measurements
 - Inter-rater reliability scores
 - Classification performance metrics
 Together, these metrics validate both the system's
technical performance and its practical utility,
 which matters given the dual challenges of scale
and quality.
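As an illustration of two of the metrics listed above, the sketch below computes plain accuracy and Cohen's kappa for a pair of annotators. The slide does not say which inter-rater statistic was used, so Cohen's kappa is an assumption here, and the label lists are made-up toy data.

```python
from collections import Counter
from typing import List


def accuracy(predicted: List[str], gold: List[str]) -> float:
    """Fraction of positions where the two label lists agree."""
    return sum(p == g for p, g in zip(predicted, gold)) / len(gold)


def cohens_kappa(rater_a: List[str], rater_b: List[str]) -> float:
    """Agreement between two raters, corrected for chance agreement."""
    n = len(rater_a)
    observed = accuracy(rater_a, rater_b)
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement: probability both raters pick the same label independently.
    expected = sum(
        counts_a[label] * counts_b[label] for label in set(rater_a) | set(rater_b)
    ) / (n * n)
    if expected == 1.0:
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Toy annotations (assumption): e.g. a human judge versus an LLM labeler.
human = ["info", "info", "create", "create", "chat", "chat"]
llm = ["info", "info", "create", "chat", "chat", "chat"]
kappa = cohens_kappa(human, llm)
print(round(accuracy(llm, human), 3), round(kappa, 3))
```

Here the raters agree on 5 of 6 items (observed agreement 0.833), chance agreement is 1/3, and kappa works out to 0.75.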
Ablation Studies
 The ablation studies reveal interesting insights about
component contributions:
 - LLM Choice: GPT-4 significantly outperforms
GPT-3.5-Turbo for intent understanding because intent
detection requires deeper reasoning capabilities. However,
for simpler domain classification, the performance gap
narrows.
 - Embedding Models: ada2 shows surprisingly robust
multilingual performance, suggesting it captures
universal semantic features better than alternatives.
Performance Analysis
 The system achieves remarkable coverage (>99.5%)
while maintaining high agreement with human judges.
 This indicates that TnT-LLM isn't just sorting texts
efficiently - it's creating categories that make intuitive
sense to humans.
 Key Metrics:
 - Intent accuracy: 0.746 (English), 0.725 (Non-English)
 - Domain accuracy: 0.733 (English), 0.673 (Non-English)
 - Latency: <100ms per classification
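To make the coverage figure concrete, the sketch below treats coverage as the share of texts that receive a substantive label rather than a catch-all one. That definition, the catch-all names "Other" and "Undefined", and the toy label list are assumptions; the accuracy values are copied from the slide.

```python
from typing import Dict, List, Tuple

CATCH_ALL = {"Other", "Undefined"}  # assumed names for non-substantive labels


def coverage(labels: List[str]) -> float:
    """Share of texts assigned a label outside the catch-all set."""
    return sum(label not in CATCH_ALL for label in labels) / len(labels)


# Toy labels (assumption): 996 of 1,000 texts get a substantive label.
labels = ["Coding"] * 996 + ["Other"] * 3 + ["Undefined"] * 1
toy_coverage = coverage(labels)

# Accuracy figures from the slide: (English, Non-English).
reported: Dict[str, Tuple[float, float]] = {
    "intent": (0.746, 0.725),
    "domain": (0.733, 0.673),
}
gaps = {task: round(en - non_en, 3) for task, (en, non_en) in reported.items()}
print(toy_coverage, gaps)
```

The gap calculation shows that the English versus non-English drop is larger for domain classification (0.060) than for intent detection (0.021).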
Scaling Characteristics
 The system's scaling properties show a clever trade-off
between computational cost and performance.
 By using LLMs for training data generation rather than
direct classification,
 it achieves an 85% reduction in resource utilization
compared to full LLM deployment.
 Processing Capabilities:
 - Batch: 10k texts/minute
 - Real-time: <100ms/text
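The idea of using LLM output as training data for a cheaper model can be sketched as below. The slides do not specify the classifier, so a nearest-centroid model over embedding vectors is swapped in purely for illustration; `fit_centroids`, `predict`, the two-dimensional "embeddings", and the labels are all hypothetical.

```python
from collections import defaultdict
from math import dist
from typing import Dict, List


def fit_centroids(
    embeddings: List[List[float]], llm_labels: List[str]
) -> Dict[str, List[float]]:
    """Average the embedding vectors of each LLM-assigned label."""
    sums: Dict[str, List[float]] = {}
    counts: Dict[str, int] = defaultdict(int)
    for vector, label in zip(embeddings, llm_labels):
        if label not in sums:
            sums[label] = [0.0] * len(vector)
        sums[label] = [s + v for s, v in zip(sums[label], vector)]
        counts[label] += 1
    return {
        label: [s / counts[label] for s in total] for label, total in sums.items()
    }


def predict(centroids: Dict[str, List[float]], vector: List[float]) -> str:
    """Return the label whose centroid is closest in Euclidean distance."""
    return min(centroids, key=lambda label: dist(centroids[label], vector))


# Toy training set (assumption): labels would come from the LLM labeling step.
train_embeddings = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8]]
train_labels = ["coding", "coding", "travel", "travel"]
centroids = fit_centroids(train_embeddings, train_labels)

# Inference needs no LLM call: only an embedding and a distance computation.
print(predict(centroids, [0.7, 0.3]), predict(centroids, [0.3, 0.6]))
```

The expensive model is used once per training example, while every later prediction runs through the small model, which is where the cost reduction described above would come from.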


