Discovering Textual Structures:
Generative Grammar Induction
using Template Trees
Thomas Winters, Luc De Raedt
KU Leuven, Belgium
@thomas_wint
firstname.lastname@cs.kuleuven.be
Motivation
Generative model designers often
write generative grammars by hand.
Can we bootstrap this by providing some exemplars?
S  <Hero> is going to <Action>
Hero  <Name> the <Job>
Name  Alice | Bob | Cathy
Job  hunter | knight | cook
Action  fish | slay a dragon | relax
Alice the hunter is going to fish
Cathy the cook is going to slay a dragon
Bob the cook is going to fish
Alice the knight is going to relax
?
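The left-to-right direction (grammar → generations) is easy to sketch; a minimal Python expander for the slide's toy grammar (slot syntax and names taken from the slide, the expander itself is illustrative):

```python
import random
import re

# Toy grammar from the slide: non-terminal -> list of alternatives.
grammar = {
    "S": ["<Hero> is going to <Action>"],
    "Hero": ["<Name> the <Job>"],
    "Name": ["Alice", "Bob", "Cathy"],
    "Job": ["hunter", "knight", "cook"],
    "Action": ["fish", "slay a dragon", "relax"],
}

def generate(symbol, grammar, rng):
    """Pick one alternative for `symbol` and recursively expand
    every <Slot> it contains."""
    template = rng.choice(grammar[symbol])
    return re.sub(r"<(\w+)>",
                  lambda m: generate(m.group(1), grammar, rng),
                  template)

print(generate("S", grammar, random.Random(0)))
```

GITTA tackles the inverse problem: recovering such a grammar from the generations alone.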
Motivation
Automatically inducing generative context-free
grammars from text with latent templates
Problem:
Most grammar induction algorithms do not capture
templates, but favour part-of-speech-like structures.
Generative grammars often use template-like structures!
Solution:
Introducing the notion of “Template Trees”
Template Tree
= connected, acyclic, directed graph where each node represents a template
that is more general than the templates of all its child nodes
Created by iteratively merging closest templates
Leaves: hello world | hi world | hi universe | howdy universe
Merged templates: <B> world | hi <C> | <D> universe
Root: <A>
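The pairwise merge that builds the tree can be sketched with a word-level alignment: shared words are kept, differing stretches become a slot (a simplified sketch using `difflib`; GITTA's actual merge procedure may differ):

```python
from difflib import SequenceMatcher

def merge_templates(a, b, slot="<X>"):
    """Merge two word sequences into one template: aligned words are
    kept, differing stretches are replaced by a slot."""
    a, b = a.split(), b.split()
    sm = SequenceMatcher(a=a, b=b, autojunk=False)
    out = []
    for op, i1, i2, j1, j2 in sm.get_opcodes():
        if op == "equal":
            out.extend(a[i1:i2])
        else:
            out.append(slot)
    return " ".join(out)

print(merge_templates("hello world", "hi world"))   # -> "<X> world"
print(merge_templates("hi world", "hi universe"))   # -> "hi <X>"
```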
GITTA
Grammar Induction using a Template Tree Approach
1. Creates an initial template tree
2. Prunes redundant children of each node (children whose
descendant leaves are all covered by other children)
3. Merges slots (when their value sets are similar)
4. Simplifies the tree using slot content
5. Repeats steps 3 & 4 until convergence
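Step 2 can be sketched as follows (a simplified sketch; GITTA's actual pruning order and heuristics may differ):

```python
def prune_redundant_children(children):
    """children maps each child template to the set of example strings
    (descendant leaves) it covers.  A child is dropped when its leaves
    are all covered by the remaining children together."""
    kept = dict(children)
    for child in list(children):
        if child not in kept or len(kept) == 1:
            continue
        covered_by_others = set().union(
            *(leaves for c, leaves in kept.items() if c != child))
        if kept[child] <= covered_by_others:
            del kept[child]
    return kept
```

On the "hello world" tree above, the child `hi <C>` covers only leaves already covered by `<B> world` and `<D> universe`, so it would be pruned.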
Example input
1. "I like my cat and my dog"
2. "I like my dog and my chicken"
3. "Alice the cat is jumping"
4. "Bob the dog is walking"
5. "Cathy the cat is walking"
Iteratively join the closest strings into templates
with slots to create the Template Tree
A: I like my <B> and my <C>
| <D> the <E> is <F>
B: cat | dog
C: chicken | dog
D: <G> | <I>
E: <H> | cat
F: walking | <J>
G: Bob | Cathy
H: cat | dog
I: Alice | Cathy
J: walking | jumping
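The slot-merging step needs a similarity test on value sets; one illustrative criterion is overlap relative to the smaller set (the measure and the 0.5 threshold are assumptions for the sketch, not GITTA's exact rule):

```python
def similar_slots(values_a, values_b, threshold=0.5):
    """Two slots are candidates for merging when their value sets
    overlap sufficiently (overlap ratio is an illustrative heuristic)."""
    overlap = len(values_a & values_b)
    return overlap / min(len(values_a), len(values_b)) >= threshold

# Slots B and C from the tree above share "dog", so they merge:
print(similar_slots({"cat", "dog"}, {"chicken", "dog"}))  # -> True
```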
Merge similar slots and
rename slots in Template Tree
A: I like my <B> and my <B>
| <G> the <B> is <F>
B: chicken | cat | dog
C: <B>
D: <G>
E: <B>
F: walking | jumping
G: Alice | Bob | Cathy
H: <B>
I: <G>
J: <F>
Iteratively merge and simplify tree &
recalculate templates until convergence
A: I like my <B> and my <B>
| <G> the <B> is <F>
B: chicken | cat | dog
F: walking | jumping
G: Alice | Bob | Cathy
Output
S → I like my <B> and my <B>
| <G> the <B> is <F>
B → chicken | cat | dog
F → walking | jumping
G → Alice | Bob | Cathy
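The induced grammar is small enough to enumerate exhaustively, which makes it easy to check that it covers all five input sentences (a quick sanity check on the output above, not part of GITTA itself):

```python
import re
from itertools import product

# The grammar GITTA induced from the five example sentences.
grammar = {
    "S": ["I like my <B> and my <B>", "<G> the <B> is <F>"],
    "B": ["chicken", "cat", "dog"],
    "F": ["walking", "jumping"],
    "G": ["Alice", "Bob", "Cathy"],
}

def expansions(template, grammar):
    """Yield every full expansion of a template; each slot occurrence
    expands independently (all slots here expand to terminals)."""
    slots = re.findall(r"<(\w+)>", template)
    if not slots:
        yield template
        return
    for choice in product(*(grammar[s] for s in slots)):
        values = iter(choice)
        yield re.sub(r"<\w+>", lambda m: next(values), template)

language = {s for rhs in grammar["S"] for s in expansions(rhs, grammar)}
```

The grammar generates 27 sentences in total, including all five inputs, so it generalises well beyond the examples.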
Evaluation: Reverse-engineering Twitterbots
Given 25/50/100 example generations of a Tracery grammar, GITTA tries to reconstruct that grammar.
GITTA finds decent generalisations, but has difficulties with subsequent slots,
especially when slot values vary widely in their number of words.
Conclusion
GITTA is a promising approach for learning
generative grammars from texts with latent
templates
Code: https://github.com/twinters/gitta
