Building a Phonics Engine for
Automated Text Guidance
Dominik Lukeš
Dyslexia Action
Chris Litsas
NTUA
www.ilearnrw.eu
Outline
• Struggling readers' needs
• Linguistic background
• Phonics engine need
• Phonics engine specification
• Phonics engine implementation
• Phonics engine applications
• Next steps
Needs of dyslexic people
• Identifying the syllables in a word
• Recognising the structure of words (stem,
prefix, suffix)
• Highlighting typical or repeated patterns of
English orthography
• Identifying phoneme/grapheme
correspondence
• Learning the pronunciation of a word
• Learning the meaning of a word
Linguistic background
Dearest creature in creation
Studying English pronunciation,
I will teach you in my verse
Sounds like corpse, corps, horse and worse.
Though the difference seems little,
We say actual, but victual,
Seat, sweat, chaste, caste, Leigh, eight, height,
Put, nut, granite, and unite.
Gerard Nolst Trenité - The Chaos (1922)
Linguistic background
• tough, though, through, bough, thought,
cough, hiccough
• hosp.i.tal vs. hos.pit.al
• kitt.en vs. kit.ten
• walked, stopped, faked, tried
• exgirlfriend vs. exigent vs. exit
• English vs. Greek
Phonics engine need
• Finding all examples of ‘a’ spelled to rhyme
with ‘hay’ in a text or a corpus.
• Sorting words by their phoneme/grapheme
ratio.
• Identifying appropriate syllable boundaries in
the written form of a multi-syllable word
based on knowledge of the syllable
boundaries in pronunciation
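A minimal sketch of the first two tasks, assuming a small hand-made lexicon in which each word carries a comma-separated list of grapheme-phoneme pairs (the notation used on the phonic dictionary slide). The `LEXICON` entries, the function names, and the reading of "phoneme/grapheme ratio" as phonemes divided by letters are assumptions made for illustration only.

```python
import re

# Hypothetical mini-lexicon: word -> "grapheme-phoneme" pairs.
LEXICON = {
    "cat": "c-k,a-æ,t-t",
    "paper": "p-p,a-eɪ,p-p,er-ə",
    "rain": "r-r,ai-eɪ,n-n",
    "though": "th-ð,ough-əʊ",
    "at": "a-æ,t-t",
}

def pairs(mapping):
    """Split 'c-k,a-æ,t-t' into [('c', 'k'), ('a', 'æ'), ('t', 't')]."""
    return [tuple(p.split("-", 1)) for p in mapping.split(",")]

def find_words(text, grapheme, phoneme):
    """Return words in text where the grapheme is pronounced as the phoneme."""
    hits = []
    for word in re.findall(r"[a-z]+", text.lower()):
        entry = LEXICON.get(word)
        if entry and (grapheme, phoneme) in pairs(entry):
            hits.append(word)
    return hits

def phoneme_grapheme_ratio(word):
    """Assumed definition: number of phonemes divided by number of letters."""
    return len(pairs(LEXICON[word])) / len(word)

text = "The cat sat on the paper in the rain."
# 'a' spelled to rhyme with 'hay', i.e. the letter a pronounced eɪ
print(find_words(text, "a", "eɪ"))                   # ['paper']
print(sorted(LEXICON, key=phoneme_grapheme_ratio))   # lowest ratio first
```

Words missing from the lexicon are silently skipped here; a real corpus search would need a fallback for unknown words.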
Phonics engine specification
• provide automated guidance to students and
teachers reading texts (using highlighting as
well as explicit information)
• generate more extensive word lists for
practice activities within the serious games
• provide information about word structure to
the game engine
Phonics engine implementation
• Profile of phonic difficulties
• Annotated phonic dictionary
• Lookup routines
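As a rough illustration of how these three pieces could fit together, the sketch below checks a word's phoneme/grapheme mapping against one profile entry. The field names come from the profile and dictionary slides; the `matches` function, and the assumption that a profile description such as `a-æ` uses the same grapheme-phoneme notation as the dictionary mapping, are guesses made for this example and not the engine's actual lookup routine.

```python
import json

# One profile entry, with the fields shown on the "Phonics profile (JSON)" slide.
PROFILE_ENTRY = json.loads("""
{"descriptions": ["a-æ"],
 "problemType": "LETTER_EQUALS_PHONEME",
 "humanReadableDescription": "a=æ (at) <> Pronounce a as æ. For example: at, as, and",
 "cluster": 3,
 "character": "Short vowel"}
""")

def matches(entry, mapping):
    """True if any grapheme-phoneme pair of the word is listed in the entry.

    Only the LETTER_EQUALS_PHONEME problem type is handled in this sketch.
    """
    if entry["problemType"] != "LETTER_EQUALS_PHONEME":
        return False
    word_pairs = set(mapping.split(","))
    return any(description in word_pairs for description in entry["descriptions"])

print(matches(PROFILE_ENTRY, "a-æ,t-t"))                      # True  (at)
print(matches(PROFILE_ENTRY, "f-f,ee-iː,l-l,i-ɪ,ng-ŋ,s-z"))   # False (feelings)
```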
Phonics profile - categories
Based on a modified and expanded version of
Dyslexia Action Literacy Programme
• Consonants (49)
• Vowels (71)
• Blends and letter patterns (131)
• Syllables (13)
• Suffixes (92)
• Prefixes (42)
• Confusing letters (15)
Phonics profile (JSON)
{"descriptions": ["a-æ"],
 "problemType": "LETTER_EQUALS_PHONEME",
 "humanReadableDescription": "a=æ (at) <> Pronounce a as æ. For example: at, as, and",
 "cluster": 3,
 "character": "Short vowel"}
Phonic dictionary
Word form: feelings
Related stem: feeling
Pronunciation: ˈfiː.lɪŋz
Phoneme/Grapheme Mapping: f-f,ee-iː,l-l,i-ɪ,ng-ŋ,s-z
Orthographic syllabification: fee.lings
Number of letters: 8
Number of phonemes: 6
Number of syllables: 2
Frequency band: 4
Suffix type: SUFFIX_ADD
Suffix form: s
Prefix type: PREFIX_NONE
Prefix form: NULL
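Several fields of this entry can be derived from the others. The sketch below recomputes the letter, phoneme and syllable counts and rebuilds the orthographic syllabification by walking the phoneme/grapheme mapping alongside the syllable boundaries in the pronunciation. The function names are hypothetical, and the approach assumes that no single grapheme spans a spoken syllable boundary, which will not hold for every English word.

```python
def parse_mapping(mapping):
    """Split 'f-f,ee-iː' into [('f', 'f'), ('ee', 'iː')]."""
    return [tuple(p.split("-", 1)) for p in mapping.split(",")]

def orthographic_syllables(pronunciation, mapping):
    """Project spoken syllable boundaries onto the written form."""
    # Drop primary/secondary stress marks, then split on syllable dots.
    spoken = pronunciation.replace("ˈ", "").replace("ˌ", "").split(".")
    written, graphemes, phonemes = [], "", ""
    for grapheme, phoneme in parse_mapping(mapping):
        if len(written) == len(spoken):
            raise ValueError("mapping has more phonemes than the pronunciation")
        graphemes += grapheme
        phonemes += phoneme
        if phonemes == spoken[len(written)]:   # current spoken syllable complete
            written.append(graphemes)
            graphemes, phonemes = "", ""
    if graphemes:
        raise ValueError("mapping does not line up with syllable boundaries")
    return ".".join(written)

word = "feelings"
pronunciation = "ˈfiː.lɪŋz"
mapping = "f-f,ee-iː,l-l,i-ɪ,ng-ŋ,s-z"

print(len(word))                                        # 8 letters
print(len(parse_mapping(mapping)))                      # 6 phonemes
print(pronunciation.count(".") + 1)                     # 2 syllables
print(orthographic_syllables(pronunciation, mapping))   # fee.lings
```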
Building the phonic dictionary
• Took the 5,000 most frequent words based on COCA
• Generated derived forms by reversing Hunspell
• Used an online tool to generate pronunciations
• Created rules for matching pronunciation with spelling patterns
• Created rules for displaying
• Marked suffixes and prefixes and their types
• Adjusted frequencies
• Manual fine-tuning (lots of regex)
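One way to picture the rules for matching pronunciation with spelling patterns is a small backtracking aligner. This is a sketch under stated assumptions, not the rule set used for the dictionary: the `RULES` table is a tiny hypothetical sample, graphemes are limited to one or two letters, and longer graphemes are tried first.

```python
# Hypothetical rule table: grapheme -> phonemes it may spell.
RULES = {
    "f": ["f"], "ee": ["iː"], "e": ["e", "iː"], "l": ["l"],
    "i": ["ɪ", "aɪ"], "ng": ["ŋ"], "n": ["n"], "s": ["s", "z"],
}

def align(word, phonemes):
    """Return (grapheme, phoneme) pairs covering both strings, or None."""
    if not word and not phonemes:
        return []
    for size in (2, 1):                      # prefer two-letter graphemes
        grapheme = word[:size]
        if len(grapheme) < size:
            continue
        for phoneme in RULES.get(grapheme, []):
            if phonemes.startswith(phoneme):
                rest = align(word[size:], phonemes[len(phoneme):])
                if rest is not None:         # otherwise backtrack
                    return [(grapheme, phoneme)] + rest
    return None

result = align("feelings", "fiːlɪŋz")
print(",".join(f"{g}-{p}" for g, p in result))   # f-f,ee-iː,l-l,i-ɪ,ng-ŋ,s-z
```

Words the table cannot cover return `None`, which is where manual rules and regex fixes would come in.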
Phonics engine applications
• Phonics aware reader
• Game support – generating word lists
• Game support – provide word structure
• Game support – link word structures to profile
• Text classification tool
• Online text annotation tool
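For the reader and annotation tools, highlighting can be thought of as marking the graphemes in a word that realise a chosen phoneme. The sketch below uses square brackets as a stand-in for visual highlighting; the `highlight` function and the bracket convention are illustrative assumptions, and the mapping string follows the notation of the phonic dictionary slide.

```python
def highlight(mapping, grapheme, phoneme):
    """Rebuild the word from its mapping, bracketing the target grapheme."""
    out = []
    for pair in mapping.split(","):
        g, p = pair.split("-", 1)
        out.append(f"[{g}]" if (g, p) == (grapheme, phoneme) else g)
    return "".join(out)

print(highlight("f-f,ee-iː,l-l,i-ɪ,ng-ŋ,s-z", "ee", "iː"))   # f[ee]lings
```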
Phonics aware reader
Game support
Online text tools
Next steps
• Bigger dictionary with more information on
words
• Fine-tuning of lookup routines
• More sophisticated highlighting routines
• More sophisticated NLP
– PoS
– Sentence structure
– Semantics
• WordNet, FrameNet
• Named Entities
• Collocations
www.ilearnrw.eu
