CHALLENGES IN BUILDING
NATURAL LANGUAGE PROCESSING
APPLICATIONS FOR
नेपाली (NEPALI) LANGUAGE
- Chandan Goopta
Unicode number: U+0915
HTML-code: &#2325; (renders as क)
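The code point and HTML code for क can be checked from Python. This is a minimal sketch using only the standard library; the variable names are illustrative and not from the slides.

```python
# Inspect the Devanagari letter KA shown on the title slide.
import unicodedata

ka = "\u0915"  # the character क

code_point = f"U+{ord(ka):04X}"   # hexadecimal code point
html_code = f"&#{ord(ka)};"       # decimal HTML character reference
name = unicodedata.name(ka)       # official Unicode character name

print(code_point, html_code, name)
```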
NATURAL LANGUAGE PROCESSING
What So Far?
NLP Task                      | English   | Indic Languages | Nepali
Machine Translation           | Very Good | Good            | Very Poor (Google/Microsoft)
Named Entity Recognition      | Very Good | Fair            | None (little groundwork)
Optical Character Recognition | Very Good | Poor            | Very Poor
POS Tagging                   | Good      | Poor            | Very Poor
Sentiment Analysis            | Very Good | Fair            | Poor (work ongoing)
Speech Recognition            | Good      | Poor            | None (Google's work ongoing)
SENTIMENT ANALYSIS
• Chunking | Sentence Chunker
• Tagging | POS Tagger
• Resources | SentiWordNet, Subjectivity WordList
• Machine Learning | Corpus, Tagged Samples
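How the components listed above fit together can be sketched roughly as follows. This is a toy illustration, not the author's system: the lexicon is a hypothetical stand-in for a resource such as SentiWordNet, its two entries and scores are made up, the POS tagging and machine learning steps are left out, and the function names are assumptions.

```python
# Toy lexicon-based sentiment pipeline: chunk sentences, tokenize, look up, sum.
import re

GOOD = "राम्रो"    # "good"
BAD = "नराम्रो"    # "bad"

# Hypothetical polarity lexicon; a real one would have thousands of entries.
TOY_LEXICON = {GOOD: 1.0, BAD: -1.0}


def split_sentences(text):
    """Split on the Devanagari danda and on ? or !, dropping empty pieces."""
    return [s.strip() for s in re.split(r"[।?!]", text) if s.strip()]


def score_sentence(sentence, lexicon):
    """Sum the polarity of every whitespace-separated token found in the lexicon."""
    return sum(lexicon.get(token, 0.0) for token in sentence.split())


# "This film is good. That food was bad."
text = f"यो फिल्म {GOOD} छ। त्यो खाना {BAD} थियो।"
sentences = split_sentences(text)
scores = [score_sentence(s, TOY_LEXICON) for s in sentences]
print(list(zip(sentences, scores)))
```

Whitespace tokenization ignores Nepali inflection and postpositions attached to words, which is one reason a proper chunker and tagger are listed as requirements.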
Build Everything from Scratch
OR
I CAN USE ENGLISH LANGUAGE RESOURCES FOR NEPALI
SENTIMENT ANALYSIS
• Chunking | Sentence Chunker
• Tagging | POS Tagger
• Resources | SentiWordNet, Subjectivity WordList
• Machine Learning | Corpus, Tagged Samples
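One way to reuse English resources for Nepali is to project an English polarity lexicon through a bilingual word list. The sketch below assumes such a word list exists; both dictionaries are tiny hypothetical samples, and `project_lexicon` is an illustrative name rather than anything from the slides.

```python
# Project English polarity scores onto Nepali words via a bilingual glossary.

# Hypothetical English polarity scores (stand-in for an English resource).
ENGLISH_POLARITY = {"good": 1.0, "bad": -1.0, "happy": 0.8}

# Hypothetical Nepali-to-English glossary.
NEPALI_TO_ENGLISH = {
    "राम्रो": "good",
    "नराम्रो": "bad",
    "खुसी": "happy",
    "किताब": "book",   # no polarity entry, so it is dropped below
}


def project_lexicon(bilingual, polarity):
    """Keep only source words whose English gloss has a polarity score."""
    return {
        source: polarity[english]
        for source, english in bilingual.items()
        if english in polarity
    }


nepali_lexicon = project_lexicon(NEPALI_TO_ENGLISH, ENGLISH_POLARITY)
print(nepali_lexicon)
```

A one-to-one projection like this loses word senses and context, so the result is better treated as a seed list to be corrected by hand than as a finished resource.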
[Image slide with three captions: "I am like", "Others are like", "Professors are like"]
BACK TO CHALLENGES
• Unicode Rendering in Dev Tools
• Lack of Resources
• Very Little Previous Work/Research
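Part of the Unicode difficulty is that a Devanagari word is stored as base letters plus combining vowel signs, so tools that count or render code points one by one can misbehave. A small sketch, assuming only the Python standard library, with the word नेपाली written as explicit escapes:

```python
# Show that one Devanagari word is more code points than visible letters.
import unicodedata

word = "\u0928\u0947\u092a\u093e\u0932\u0940"  # नेपाली

code_points = len(word)

# Unicode categories starting with "M" are combining marks (vowel signs here).
base_letters = sum(
    1 for ch in word if not unicodedata.category(ch).startswith("M")
)

for ch in word:
    print(f"U+{ord(ch):04X}", unicodedata.category(ch), unicodedata.name(ch))

print(code_points, "code points,", base_letters, "base letters")
```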
WHY PYTHON?
“I have the students learn Python in our undergraduate and graduate Semantic Web courses. Why? Because basically there's nothing else with the flexibility and as many web libraries.”
- Prof. James A. Hendler, University of Maryland
WHY PYTHON?
• NLTK, although not the most efficient
implementation, provides a lot of awesome tools
to quickly prototype a hypothesis
Source: Quora
WHY PYTHON?
• Scipy + Numpy: Everything that isn't in NLTK is
denitely in these libraries. If you want to use more
advanced algorithms like Latent Semantic
Indexing or Latent Dirichlet Allocation, Python has
libraries to do that.
Source: Quora
WHY PYTHON?
• Python has really great XML/HTML parsing
libraries such as Beautiful Soup and Scrape.py.
You can use these libraries to quickly scrape the web and generate large
data sets to improve the performance of your models (because let's face
it, big data trumps complexity)
Source: Quora
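The scraping idea can be sketched as below. To stay dependency-free, this uses the standard library's `html.parser` as a stand-in for Beautiful Soup or Scrape.py, and it parses an inline sample string instead of fetching a page; `TextCollector` and `SAMPLE_HTML` are hypothetical names.

```python
# Collect visible text from HTML as raw material for a corpus.
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Gather text nodes, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and not self._skip_depth:
            self.chunks.append(text)


# Inline sample standing in for a downloaded page.
SAMPLE_HTML = (
    "<html><body><h1>समाचार</h1>"
    "<script>var x = 1;</script>"
    "<p>नेपाली पाठ</p></body></html>"
)

parser = TextCollector()
parser.feed(SAMPLE_HTML)
parser.close()
corpus_lines = parser.chunks
print(corpus_lines)
```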
WHY PYTHON?
• Python has great web frameworks like Django/Pylons/Tornado.
If you invent a revolutionary sarcasm detector that can predict trends in
the stock market, you can quickly integrate it into a web service, make
millions, and buy a large island in a third-world country.
Source: Quora
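Wrapping a model in a web service can be sketched without any framework. The example below is a plain WSGI callable from the standard library's point of view, used here in place of Django, Pylons, or Tornado; the scoring function and its two-word lexicon are hypothetical placeholders for a real model.

```python
# Minimal WSGI app that returns a sentiment score as JSON.
import json
from urllib.parse import parse_qs

# Hypothetical placeholder model: a two-word polarity lexicon.
TOY_LEXICON = {"good": 1.0, "bad": -1.0}


def score(text):
    """Sum the polarity of each known whitespace-separated token."""
    return sum(TOY_LEXICON.get(token, 0.0) for token in text.lower().split())


def app(environ, start_response):
    """Read ?text=... from the query string and respond with its score."""
    query = parse_qs(environ.get("QUERY_STRING", ""))
    text = query.get("text", [""])[0]
    body = json.dumps({"text": text, "score": score(text)}).encode("utf-8")
    start_response(
        "200 OK",
        [
            ("Content-Type", "application/json; charset=utf-8"),
            ("Content-Length", str(len(body))),
        ],
    )
    return [body]
```

Any WSGI server can host `app`; for local experiments the standard library's `wsgiref.simple_server.make_server` is enough.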
WHY PYTHON?
• Consider your other options: It would not make
sense to use a compiled language like C++/Java
for this type of work unless you needed to increase
performance (computational speed, not model
accuracy).
As far as I can tell, Ruby is completely useless for any Machine Learning,
Data Mining, or Natural Language Processing task. Maybe you could use
Lisp, but at this point, Python has a larger ecosystem.
Source: Quora
THANK YOU
