TnT-LLM : Text Mining at Scale with Large Language Models

 Authors: Mengting Wan, Tara Safavi, Sujay Kumar Jauhar,
Yujin Kim, Scott Counts, Jennifer Neville, Siddharth Suri,
Chirag Shah, Ryen W. White, Longqi Yang, Reid Andersen,
Georg Buscher, Dhruv Joshi, Nagu Rangan
 Affiliation: Microsoft Corporation & University of
Washington
 Presented by: Jayanth Kalyanam

Introduction
 The fundamental challenge in text mining is balancing
quality with scalability.
 Manual taxonomies require expensive domain expertise
and time-consuming curation,
 while automated clustering methods produce results
that are difficult to interpret.
 TnT-LLM addresses this by leveraging LLMs' ability to
understand context like humans while processing data
at machine speed.

System Architecture
 The system's design reflects a careful balance between
power and practicality. Core components include:
 - Text Summarization Module
 - Taxonomy Generator
 - Data Labeling System
 - Lightweight Classifier
 Each component is modular, allowing for independent
optimization and updates.
 This architecture enables the system to maintain high
quality while scaling efficiently.
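 A minimal sketch of how the four components could be chained, assuming each one is exposed as a plain function. The `Pipeline` class and every `toy_*` function are hypothetical stand-ins introduced for illustration, not the system's actual interfaces; in a real deployment the first three stages would call an LLM.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Pipeline:
    """Hypothetical wiring of the four modules; each stage is swappable."""
    summarize: Callable[[str], str]
    generate_taxonomy: Callable[[List[str]], List[str]]
    label: Callable[[str, List[str]], str]
    train_classifier: Callable[[List[str], List[str]], Callable[[str], str]]

    def run(self, texts: List[str]) -> Tuple[List[str], Callable[[str], str]]:
        summaries = [self.summarize(t) for t in texts]          # 1. summarize
        taxonomy = self.generate_taxonomy(summaries)            # 2. build taxonomy
        labels = [self.label(s, taxonomy) for s in summaries]   # 3. label data
        classifier = self.train_classifier(texts, labels)       # 4. train classifier
        return taxonomy, classifier


# Toy stand-ins (assumptions): they only look at the first word of each text.
def toy_summarize(text: str) -> str:
    return " ".join(text.split()[:8])


def toy_generate_taxonomy(summaries: List[str]) -> List[str]:
    return sorted({s.lower().split()[0] for s in summaries if s.split()})


def toy_label(summary: str, taxonomy: List[str]) -> str:
    words = summary.lower().split()
    return words[0] if words and words[0] in taxonomy else "other"


def toy_train_classifier(texts: List[str], labels: List[str]) -> Callable[[str], str]:
    known = set(labels)
    # Most frequent label, ties broken alphabetically for determinism.
    fallback = max(sorted(known), key=labels.count)

    def classify(text: str) -> str:
        words = text.lower().split()
        return words[0] if words and words[0] in known else fallback

    return classify


toy_pipeline = Pipeline(toy_summarize, toy_generate_taxonomy,
                        toy_label, toy_train_classifier)
```

 Because each stage is passed in as a function, any one of them can be replaced (for example, a different summarizer) without touching the others, which is the modularity the slide describes.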

Technical Implementation
 Taxonomy Generation Pipeline:
 The taxonomy is refined using stochastic optimization
principles: it is updated one mini-batch at a time, much as
stochastic gradient descent updates a neural network.
 This approach allows the taxonomy to keep improving
as it sees more data,
 much like how human experts refine their
understanding over time.
 Technical Specifications:
 - Batch size: 200 conversations
 - Parameters: {μ, Σ} for labels
 - Temperature: 0.5 (generation), 0.2 (update)
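 The batch-wise refinement can be sketched as the loop below, using the batch size and temperatures listed above. The function names, the `temperature` keyword, and the toy `generate`/`update` callables are assumptions made for illustration; in the real system both steps would be LLM prompts whose wording is not given on these slides.

```python
from typing import Callable, Iterator, List, Sequence

BATCH_SIZE = 200               # conversations per batch (from the slide)
GENERATION_TEMPERATURE = 0.5   # used for the initial taxonomy (from the slide)
UPDATE_TEMPERATURE = 0.2       # used for later updates (from the slide)


def minibatches(items: Sequence[str], size: int) -> Iterator[Sequence[str]]:
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def refine_taxonomy(
    summaries: Sequence[str],
    generate: Callable[..., List[str]],
    update: Callable[..., List[str]],
    batch_size: int = BATCH_SIZE,
) -> List[str]:
    """Build a taxonomy from the first batch, then revise it batch by batch."""
    batches = minibatches(summaries, batch_size)
    first = next(batches, None)
    if first is None:
        return []
    taxonomy = generate(first, temperature=GENERATION_TEMPERATURE)
    for batch in batches:
        taxonomy = update(taxonomy, batch, temperature=UPDATE_TEMPERATURE)
    return taxonomy


# Toy stand-ins for the LLM calls (assumptions): the label is the first word.
def toy_generate(batch: Sequence[str], temperature: float) -> List[str]:
    return sorted({s.split()[0] for s in batch if s.split()})


def toy_update(taxonomy: List[str], batch: Sequence[str], temperature: float) -> List[str]:
    return sorted(set(taxonomy) | {s.split()[0] for s in batch if s.split()})
```

 Each update sees only the current taxonomy and one batch, so memory use stays bounded by the batch size however large the corpus grows.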

Experimental Setup
 Dataset Design:
 The choice of Bing Copilot conversations was strategic:
they provide natural, diverse interactions across multiple
languages and domains.
 The dataset split (Phase 1: 9,592, Phase 2: 48,160)
allows for thorough testing of both taxonomy generation
and scaling capabilities.
 Classification Tasks:
 - Intent detection (10 categories)
 - Domain classification (25 categories)

Evaluation Metrics
 The evaluation strategy combines technical metrics with
practical assessments:
 Technical Metrics:
 - Coverage and accuracy measurements
 - Inter-rater reliability scores
 - Classification performance metrics
 Together, these metrics validate both the system's
technical performance and its practical utility,
 which matters given the dual challenges of scale
and quality.
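 As an illustration of an inter-rater reliability score, the sketch below computes Cohen's kappa between two annotators, alongside plain accuracy. The slides do not say which statistic was used, so the choice of Cohen's kappa here is an assumption, and the function names are illustrative.

```python
from collections import Counter
from typing import Sequence


def accuracy(predicted: Sequence[str], reference: Sequence[str]) -> float:
    """Fraction of items where the two label sequences agree."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("need two non-empty sequences of equal length")
    return sum(p == r for p, r in zip(predicted, reference)) / len(predicted)


def cohens_kappa(rater_a: Sequence[str], rater_b: Sequence[str]) -> float:
    """Agreement between two raters, corrected for chance agreement."""
    n = len(rater_a)
    observed = accuracy(rater_a, rater_b)
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement: probability both raters pick the same label at random,
    # given each rater's own label frequencies.
    expected = sum(
        counts_a[label] * counts_b[label]
        for label in counts_a.keys() | counts_b.keys()
    ) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1.0 - expected)
```

 Kappa is 1 for perfect agreement and 0 when agreement is no better than chance, which makes it more informative than raw agreement when a few categories dominate.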

Ablation Studies
 The ablation studies show how much each component
contributes:
 - LLM Choice: GPT-4 significantly outperforms GPT-3.5-
Turbo for intent understanding because intent detection
requires deeper reasoning capabilities. However, for
simpler domain classification, the performance gap
narrows.
 - Embedding Models: ada2 shows surprisingly robust
multilingual performance, suggesting it captures
universal semantic features better than alternatives.

Performance Analysis
 The system achieves coverage above 99.5%
while maintaining high agreement with human judges.
 This indicates that TnT-LLM is not just sorting texts
efficiently; it is creating categories that make intuitive
sense to humans.
 Key Metrics:
 - Intent accuracy: 0.746 (English), 0.725 (Non-English)
 - Domain accuracy: 0.733 (English), 0.673 (Non-English)
 - Latency: <100ms per classification

Scaling Characteristics
 The system's scaling properties reflect a trade-off
between computational cost and performance.
 By using LLMs for training data generation rather than
direct classification,
 it achieves an 85% reduction in resource utilization
compared to full LLM deployment.
 Processing Capabilities:
 - Batch: 10k texts/minute
 - Real-time: <100ms/text
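 The idea of using LLM output as training data for a cheaper model can be sketched as follows. This is a toy under stated assumptions: the nearest-centroid classifier over bag-of-words counts is a stand-in chosen to keep the example dependency-free, not the classifier used in the system, and the pseudo-labels would come from an LLM rather than being typed in by hand.

```python
import math
from collections import Counter, defaultdict
from typing import Dict, Sequence


def bag_of_words(text: str) -> Counter:
    """Lowercased whitespace-token counts; a stand-in for a real embedding."""
    return Counter(text.lower().split())


def cosine(u: Counter, v: Counter) -> float:
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def train_centroids(texts: Sequence[str], pseudo_labels: Sequence[str]) -> Dict[str, Counter]:
    """Sum the word counts per label; cosine similarity ignores the scale."""
    centroids: Dict[str, Counter] = defaultdict(Counter)
    for text, label in zip(texts, pseudo_labels):
        centroids[label].update(bag_of_words(text))
    return dict(centroids)


def classify(text: str, centroids: Dict[str, Counter]) -> str:
    """Return the label whose centroid is most similar; no LLM call needed."""
    vec = bag_of_words(text)
    # sorted() makes tie-breaking deterministic (alphabetical).
    return max(sorted(centroids), key=lambda label: cosine(vec, centroids[label]))


# Pseudo-labels here are hand-written placeholders for LLM annotations.
train_texts = ["write python code", "fix python bug",
               "book a flight", "find cheap flight"]
train_labels = ["coding", "coding", "travel", "travel"]
centroids = train_centroids(train_texts, train_labels)
```

 The LLM is consulted once per training example; every later prediction runs only the small model, which is where the cost saving comes from.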

