Reducing Language Barriers for Tourists
Using Handwriting Recognition Enabled
          Mobile Application

        Edgard Chammas, Chafic Mokbel
            University of Balamand
              Balamand, Lebanon

             Rami Al Hajj Mohamad
    American University of the Middle East
                Egaila, Kuwait

Cristina Oprean, Laurence Likforman Sulem, Gérard Chollet
                   TELECOM ParisTech
                       Paris, France

                     ACTEA 2012
INTRODUCTION
• The goal is to reduce the language barriers for tourists
  who do not speak the Arabic language
  – Take a photo of the text using your smart phone and get the
    corresponding information in your own language
• Handwriting Recognition enabled mobile application
  has been developed for this purpose
  – Robust recognition engine to recognize both handwritten and
    printed noisy texts
  – Two specific vocabularies:
     • Villages and Town Names
     • Restaurant Menu entries
HANDWRITING RECOGNITION SYSTEM


    Based on Hidden Markov Model (HMM)

    Feature vectors are extracted from the input image
    by a sliding window characterized by its size,
    overlap and sliding direction
    
        20 parameters and their derivatives are computed
        by automatically dividing the window into 21 cells
        and defining the baseline
    
        The sequence of feature vectors is modeled by
        HMMs
    
        An HMM is dedicated to each variant of a letter
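
The sliding-window step above can be sketched as follows. This is a minimal illustration, not the feature extractor of the system described here: the function names, the binary-image representation and the ink-density feature are assumptions made for the example, and it does not reproduce the 20 parameters, their derivatives, the 21-cell layout or the baseline detection.

```python
from typing import List

Image = List[List[int]]  # rows of 0/1 pixels (1 = ink); an assumed representation


def sliding_windows(image: Image, width: int, overlap: int,
                    right_to_left: bool = True) -> List[Image]:
    """Cut the image into vertical windows of `width` columns.

    Consecutive windows share `overlap` columns. Arabic is written
    right to left, so that direction is the default here.
    """
    if not 0 <= overlap < width:
        raise ValueError("overlap must satisfy 0 <= overlap < width")
    n_cols = len(image[0])
    step = width - overlap
    starts = range(0, max(n_cols - width, 0) + 1, step)
    windows = [[row[s:s + width] for row in image] for s in starts]
    if right_to_left:
        # Reverse the order so the first frame is the rightmost window.
        windows.reverse()
    return windows


def ink_density_features(window: Image, n_cells: int) -> List[float]:
    """Toy feature vector: fraction of ink pixels in each horizontal band."""
    n_rows = len(window)
    feats = []
    for c in range(n_cells):
        lo = c * n_rows // n_cells
        hi = (c + 1) * n_rows // n_cells
        pixels = [p for row in window[lo:hi] for p in row]
        feats.append(sum(pixels) / len(pixels) if pixels else 0.0)
    return feats
```

Each window yields one feature vector, so a word image becomes a sequence of vectors whose length depends on the image width, the window size and the overlap.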
HANDWRITING RECOGNITION SYSTEM


    Training of the HMMs is done with the Expectation-Maximization
    (EM) algorithm, which iteratively estimates their parameters
    and guarantees that the likelihood function does not decrease
    from one iteration to the next


    The Viterbi algorithm identifies the written text by
    determining the most likely state sequence.
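
For readers unfamiliar with Viterbi decoding, a minimal log-domain sketch is given here. It assumes discrete observation symbols and dictionary-based probability tables purely for illustration; the emission model actually used by the system is not specified on these slides, and the names `log` and `viterbi` are chosen for this example only.

```python
import math
from typing import Dict, List, Sequence, Tuple


def log(p: float) -> float:
    """Natural log that maps probability 0 to -infinity instead of failing."""
    return math.log(p) if p > 0 else float("-inf")


def viterbi(obs: Sequence[str],
            states: Sequence[str],
            log_start: Dict[str, float],
            log_trans: Dict[str, Dict[str, float]],
            log_emit: Dict[str, Dict[str, float]]) -> Tuple[List[str], float]:
    """Return the most likely state path and its log-probability."""
    # best[t][s] is the best log-score of any path ending in state s at time t.
    best = [{s: log_start[s] + log_emit[s][obs[0]] for s in states}]
    back: List[Dict[str, str]] = []
    for o in obs[1:]:
        scores: Dict[str, float] = {}
        pointers: Dict[str, str] = {}
        for s in states:
            prev = max(states, key=lambda p: best[-1][p] + log_trans[p][s])
            scores[s] = best[-1][prev] + log_trans[prev][s] + log_emit[s][o]
            pointers[s] = prev
        best.append(scores)
        back.append(pointers)
    # Trace the back-pointers from the best final state to recover the path.
    last = max(states, key=lambda s: best[-1][s])
    path = [last]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path, best[-1][last]
```

In a recognizer of this kind, the decoded state path is mapped back to the letter models it passes through, which gives the recognized word.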
RECOGNITION OF PRINTED TEXTS
          USING THE HANDWRITING
           RECOGNITION SYSTEM

    Problem 1: Need for an efficient recognition system to
    recognize both handwritten and printed texts

    Hypothesis to be tested: If sufficient data covering an
    extended set of variability in handwritten texts is used to
    train the handwriting recognition system, then this
    system would be useful in recognizing both handwritten
    and printed texts
       
           The printed text can be considered as the median
           form of the corresponding handwritten text
       
           The handwritten texts are variations of the
           corresponding printed texts
RECOGNITION OF PRINTED TEXTS
      USING THE HANDWRITING
       RECOGNITION SYSTEM
• Problem 2: The system should be easily
  adaptable to any vocabulary
  – No additional data collection or retraining should be
    needed when the vocabulary changes


• Solution to be tested: Word models are
  concatenations of letter models
  – Any new word model can be automatically built by
    concatenating the models of its letters
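
The concatenation idea can be sketched as below. This is an illustration under stated assumptions, not the toolkit's implementation: `LetterHMM` and `build_word_model` are hypothetical names, each letter model is assumed to be a strict left-to-right chain described only by its self-loop probabilities, and emission parameters are left out.

```python
from dataclasses import dataclass
from typing import Dict, List, Sequence, Tuple


@dataclass
class LetterHMM:
    """Left-to-right letter model: each state loops or moves to the next."""
    name: str
    self_loop: List[float]  # self-loop probability of each state


def build_word_model(word: Sequence[str],
                     letters: Dict[str, LetterHMM]
                     ) -> Tuple[List[str], List[List[float]]]:
    """Chain letter models into one left-to-right transition matrix.

    Returns the state labels and an n x (n + 1) matrix whose last
    column is the exit from the final state of the word.
    """
    labels: List[str] = []
    loops: List[float] = []
    for symbol in word:
        model = letters[symbol]  # KeyError if no model exists for this letter
        for i, p in enumerate(model.self_loop):
            labels.append(f"{model.name}.{i}")
            loops.append(p)
    n = len(loops)
    trans = [[0.0] * (n + 1) for _ in range(n)]
    for i, p in enumerate(loops):
        trans[i][i] = p            # stay in the same state
        trans[i][i + 1] = 1.0 - p  # advance to the next state or exit
    return labels, trans
```

Because only the letter models carry trained parameters, adding a word to the vocabulary reduces to a lookup and a concatenation, with no new training data.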
EXPERIMENTS AND RESULTS

    Balamand HCM toolkit used for both training and
    recognition using HMMs

    State-of-the-art UOB-ENST system trained on the
    IFN/ENIT database of Tunisian village names
    
        Vocabulary set of 946 names, written by more than 400
        writers. 26000 images are used for training

    The current performance of the system on handwritten
    words is 90.9%




             Sample image from the IFN/ENIT training set
EXPERIMENTS AND RESULTS

    Need to evaluate performance on printed texts.

    Target vocabulary: Lebanese village names and restaurant
    menu entries.

    The HMM models of words are obtained by concatenating
    the character HMMs trained on the IFN/ENIT database

    No printed texts have been included in the training sets

    A small test database has been collected:
       
           40 menu entries corresponding to Lebanese
           specialties
       
           40 Lebanese towns and villages
EXPERIMENTS AND RESULTS

    A test set of 240 images is constructed by
    typing these menu entries and town and village
    names in 3 different fonts




    Sample images from the locally collected test databases
EXPERIMENTS AND RESULTS

    Only 2 errors on the 120 menu entry images
    tested and 3 errors on the 120 town and village
    name images
    
        2% error rate (5 errors out of 240 images)

    If the top 5 solutions are considered, the
    system makes no errors on this printed database,
    even though it deals with a completely different
    vocabulary
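
The figures above can be checked with a few lines of arithmetic. The sketch below also shows one plausible way to score a top-N result; `top_n_error_rate` and its input layout are assumptions for illustration, not the evaluation script used in these experiments.

```python
from typing import List, Sequence


def error_rate(errors: int, total: int) -> float:
    """Fraction of test images whose best hypothesis is wrong."""
    return errors / total


def top_n_error_rate(ranked: Sequence[Sequence[str]],
                     truths: Sequence[str], n: int) -> float:
    """Fraction of images whose true label is absent from the n best hypotheses."""
    misses = sum(1 for cands, truth in zip(ranked, truths)
                 if truth not in cands[:n])
    return misses / len(truths)


# Numbers reported on the slides: (40 menu entries + 40 place names) x 3 fonts.
total_images = (40 + 40) * 3
total_errors = 2 + 3
print(f"{total_images} images, "
      f"{100 * error_rate(total_errors, total_images):.1f}% error")
```

With the reported counts this gives 5 errors out of 240 images, about 2.1%, which the slides round to 2%.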
MOBILE APPLICATION
MOBILE APPLICATION
CONCLUSIONS


    A novel approach combining both handwritten and
    printed text recognition in a mobile application

    HMM-based handwriting recognition system achieved
    only 2% error rate on typed texts

    High accuracy also maintained with handwritten words
    belonging to a vocabulary different from the one used
    in training the HMM models

    N-best solutions returned by the server to the user
    make the application robust to recognition errors
FUTURE WORK


    The simultaneous recognition of handwritten and
    printed texts will be further studied

    The effect on performance of introducing printed
    texts into the training set will be measured and
    interpreted

    The geographical position provided by the device
    could be sent to the server to guide recognition
    by limiting the vocabulary in use
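
One way the position-based vocabulary restriction might look is sketched here. Everything in it is hypothetical: the `Place` record, the radius parameter and the great-circle distance filter are assumptions for illustration, since the slides only state the idea and do not describe a design.

```python
import math
from typing import List, NamedTuple


class Place(NamedTuple):
    name: str
    lat: float  # degrees
    lon: float  # degrees


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometres on a sphere of radius 6371 km."""
    r = 6371.0
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def nearby_vocabulary(places: List[Place], lat: float, lon: float,
                      radius_km: float) -> List[str]:
    """Keep only the place names within radius_km of the device position."""
    return [p.name for p in places
            if haversine_km(lat, lon, p.lat, p.lon) <= radius_km]
```

The server would then build word models only for the returned names, which shrinks the search space the recognizer has to consider.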
Thank You
Questions?
