Role of Language Engineering
to
Preserve Endangered Languages
Amit Kumar Jha
Ph.D. (Informatics and Language Engineering)
School of Language, MGAHV, Wardha
Sumit Kumar Gupta
MILE, School of Language,
MGAHV, Wardha
National Conference on the Approaches & the Methodologies on the Study of Indigenous & Endangered Language
Dr. Piyush Pratap Singh
Asst. Professor
School of Language
MGAHV, Wardha
Endangered Language
• An endangered language (EL) is a language whose community has only a small
number of speakers.
• An EL is likely to become extinct in the near future. Many languages are falling
out of use and being replaced by others that are more widely used in the region
or nation.
Language Engineering
• Language Engineering (LE) is the subfield of computer science that
deals with the development of language-related software and the hardware
needed to support it.
Goal of Language Engineering
• The ultimate goal of LE is to develop a machine which is able to understand
and generate natural language.
• If the approaches of LE are applied to ELs, those languages may be preserved.
Language Endangerment
• The loss of speakers in one language is the gain of speakers of another
language, except for cases of genocide. Languages are generally replaced
when an entire speech community shifts to another language. Replacing
languages are very often official state languages.
• The world is experiencing an unprecedented wave of language extinctions.
There are between 6,000 and 7,000 languages currently spoken, and by some
estimates between 50 and 90 per cent of those will be extinct by the year 2100.
Language Extinction Results
• Language extinction results in loss of cultural identities, knowledge systems,
and the variety of data needed to understand the structure of language in the
mind.
• Documenting endangered languages preserves data and stimulates language
maintenance and revitalisation.
Language Documentation
• Many of these languages have no written tradition, so written data may be sparse or completely
unavailable; the languages are not used in the media, or their speakers do not use the Internet
(and if they do, they often use another language). In such cases, linguists must start from scratch and
collect as much data as possible by recording speakers of a given language.
• Ideally, language documentation contains representative samples from different speakers (of
different age groups, professions, and origins, and of both sexes), but in the case of
endangered languages this may not be possible, because the number of speakers is too small and/or
only elderly speakers remain. Apart from the number of speakers and the amount of
data, an important issue is the communication between the language community and the linguists
or other researchers who want to document its language.
Language Documentation
• In the case of endangered or minority languages, the documenters often are outsiders, not members of
the community. They may not be fluent speakers of the language in question and may be able to communicate
with the speakers only in a second or third language. This often leads to an unnatural use of the language
that is to be documented.
Digitalization
• Digitization is the process of storing data in digital form.
Digital data is generally more durable than other types of data. To preserve
an EL through digitization, we convert and store its data in digital form, e.g. text,
sound, and images. Researchers should also create study material for ELs in digital
form.
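As a minimal sketch of what storing EL data "in digital form" could look like, the snippet below builds a small metadata record for one archived item and attaches a SHA-256 checksum so that later corruption of the stored file can be detected. The record fields, the `ArchiveRecord` and `make_record` names, and the sample values are all assumptions made for illustration; they are not taken from the presentation or from any archive standard.

```python
import hashlib
import json
from dataclasses import dataclass, asdict


@dataclass
class ArchiveRecord:
    language: str    # language name (illustrative value below)
    media_type: str  # "text", "sound", or "image"
    speaker_id: str  # anonymised speaker label (hypothetical)
    sha256: str      # checksum of the stored bytes


def make_record(language: str, media_type: str, speaker_id: str,
                payload: bytes) -> ArchiveRecord:
    """Describe one digitized item and fingerprint its content."""
    digest = hashlib.sha256(payload).hexdigest()
    return ArchiveRecord(language, media_type, speaker_id, digest)


def to_json(record: ArchiveRecord) -> str:
    """Serialize the record; ensure_ascii=False keeps non-Latin scripts readable."""
    return json.dumps(asdict(record), ensure_ascii=False, sort_keys=True)


record = make_record("ExampleLanguage", "text", "S01",
                     "sample sentence".encode("utf-8"))
print(to_json(record))
```

Recomputing the checksum when a file is read back and comparing it with the stored value is one simple way to check that a digital copy is still intact.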
Application of Language Engineering
• Speech Generation
• Language Translator
• Speech-to-Text
• Text-to-Speech
• Language Teaching
• Transliteration Tool
Application of Language Engineering...
• Speaker Identification and Verification
• Speech Recognition
• Character and Document Image Recognition
• Question-Answering System
• Word sense Disambiguation
• Information retrieval and Information Extraction
• Film Production and Dialogue Dubbing
Speech Generation
• With the help of language engineering, a machine can generate speech in an
endangered language. If a machine is able to generate speech in an EL,
then we can help preserve that language.
Language Translator
• A language translator, or machine translation system, is a machine that
translates from one language into another. The first language is called the source
language and the second the target language. If either the source
language or the target language is an EL, the translation system helps preserve
that EL.
Speech-to-Text
• Speech-to-text is the process of converting speech into text. It is a
documentation task: if we convert speech recordings of an EL into text files, we
help preserve that language.
Transliteration Tool
• Transliteration is the process of converting text from one script into another.
• For a person who does not know a specific language, its script, or its
pronunciation, a transliteration tool plays an important role.
• If a transliteration tool is developed for an EL, the number of people
who can understand that language increases.
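The script-to-script conversion described above can be sketched with a small rule-based mapper. The example below converts a few Devanagari characters into Latin letters; Devanagari is used only as an illustration, and the character tables are a tiny assumed subset, not a complete romanization scheme. The one rule it encodes is that a Devanagari consonant carries an inherent "a" unless a vowel sign or a virama follows it. Pronunciation details such as final-vowel deletion in spoken Hindi are not handled.

```python
# Illustrative subset only: a real tool would need the full character inventory.
CONSONANTS = {"क": "k", "म": "m", "ल": "l", "न": "n", "र": "r"}
VOWEL_SIGNS = {"ा": "ā", "ि": "i", "ी": "ī", "ु": "u", "े": "e", "ो": "o"}
INDEPENDENT_VOWELS = {"अ": "a", "आ": "ā", "इ": "i"}
VIRAMA = "्"  # suppresses the inherent vowel of the preceding consonant


def transliterate(text: str) -> str:
    """Map Devanagari characters to Latin ones, character by character."""
    out = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch in CONSONANTS:
            out.append(CONSONANTS[ch])
            nxt = text[i + 1] if i + 1 < len(text) else ""
            if nxt in VOWEL_SIGNS:
                out.append(VOWEL_SIGNS[nxt])  # explicit vowel replaces inherent "a"
                i += 1
            elif nxt == VIRAMA:
                i += 1                        # bare consonant, no vowel
            else:
                out.append("a")               # inherent vowel
        elif ch in INDEPENDENT_VOWELS:
            out.append(INDEPENDENT_VOWELS[ch])
        else:
            out.append(ch)                    # pass unknown characters through
        i += 1
    return "".join(out)


print(transliterate("कमल"))
print(transliterate("नाम"))
```

Characters outside the tables are passed through unchanged, so gaps in the mapping remain visible in the output instead of being silently dropped.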
Text-to-Speech
• A text-to-speech system takes text data as input and returns
speech data as output. It plays an important role in man-machine interaction.
Language Teaching
• Language teaching is the process of teaching a language. With the help of
LE we can create a system for teaching a language. If a teaching system is
created for an EL, the EL may be preserved. Some languages are spoken
only by elderly people and are not passed on to the
next generation; after some time, such a language dies. A teaching system
is important for preserving such languages.
Question Answering System
• A question-answering system is a natural language processing system. When a
person asks the system a question, it returns the answer to that
question.
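The question-in, answer-out behaviour can be sketched with a very simple retrieval approach: pick the stored passage that shares the most words with the question. This word-overlap scoring is an assumption chosen for brevity, not the method of any particular system, and the `answer` function and sample passages are hypothetical.

```python
import re


def tokens(text: str) -> set:
    """Lowercase the text and split it into a set of word tokens."""
    return set(re.findall(r"\w+", text.lower()))


def answer(question: str, passages: list):
    """Return the passage sharing the most words with the question, or None."""
    q = tokens(question)
    best, best_score = None, 0
    for passage in passages:
        score = len(q & tokens(passage))
        if score > best_score:
            best, best_score = passage, score
    return best


PASSAGES = [
    "UNESCO defines four levels of language endangerment.",
    "SPPEL is a scheme for protection and preservation of endangered languages.",
]

print(answer("What is SPPEL?", PASSAGES))
```

Because `\w` matches Unicode letters in Python 3, the same tokenizer works for text in non-Latin scripts, although languages written without spaces between words would need a different segmentation step.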
Extinct Language
• An endangered language is a language that is at a risk of falling out of use,
generally because it has few surviving speakers. If it loses all of its native
speakers, it becomes an extinct language.
Levels of Endangerment
• UNESCO defines four levels of language endangerment between "safe" (not
endangered) and "extinct":
1. Vulnerable
2. Definitely endangered
3. Severely endangered
4. Critically endangered
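For software that tags languages by status, the scale above could be represented as an ordered enumeration. This is only a sketch: the numeric values and the `is_endangered` helper are assumptions for illustration and are not part of the UNESCO classification itself.

```python
from enum import IntEnum


class Endangerment(IntEnum):
    # Numeric values are assumed here; only their ordering matters.
    SAFE = 0
    VULNERABLE = 1
    DEFINITELY_ENDANGERED = 2
    SEVERELY_ENDANGERED = 3
    CRITICALLY_ENDANGERED = 4
    EXTINCT = 5


def is_endangered(level: Endangerment) -> bool:
    """True for the four levels lying between 'safe' and 'extinct'."""
    return Endangerment.VULNERABLE <= level <= Endangerment.CRITICALLY_ENDANGERED


print(is_endangered(Endangerment.SEVERELY_ENDANGERED))
```

Using an ordered type lets a program compare levels directly, for example to select every language at or beyond "severely endangered".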
EL in India
• The Government of India has started a scheme to preserve ELs, called
SPPEL (Scheme for Protection and Preservation of Endangered
Languages).
• The SPPEL has listed 117 languages to be documented in its current phase.
These are lesser-known Indian languages spoken
by fewer than 10,000 speakers.
Thanks A Lot......
