International Journal on Natural Language Computing (IJNLC) Vol.7, No.3, June 2018
DOI: 10.5121/ijnlc.2018.7302
ANN Based POS Tagging For Nepali Text
Archit Yajnik
Department of Mathematics, Sikkim Manipal University, Sikkim, India
ABSTRACT
This article presents Part of Speech tagging for Nepali text using three Artificial Neural Network (ANN) techniques. A novel algorithm for POS tagging is introduced in which features are extracted from the marginal probabilities of a Hidden Markov Model. The extracted features are supplied, as an input vector for each word, to three different ANN architectures, viz. the Radial Basis Function (RBF) network, the General Regression Neural Network (GRNN) and the Feed forward Neural Network. Two different annotated tagged sets are constructed for training and testing purposes. Results of all three techniques are compared on both sets. The GRNN based POS tagging technique is found to be the best, as it produces 100% and 98.32% accuracy on the training and testing sets respectively.
KEY WORDS
Radial Basis Function, General Regression Neural Networks, Feed forward neural network, Hidden
Markov Model, POS Tagging
1. INTRODUCTION
Natural Language Processing (NLP) is a diversified field of computational linguistics which is in high demand among researchers worldwide due to its large number of applications, such as language translation, parsing, POS tagging etc. Among these, POS tagging is one of the core parts of NLP and is used in other applications of computational linguistics. Various techniques are available in the literature for POS tagging, such as the HMM based Viterbi algorithm, SVM etc. Artificial neural networks play a vital role in various fields like medical imaging and image recognition, as covered in [1, 2, 3], and over the last decade they have become popular in the field of computational linguistics as well. Due to their computational complexity, however, they are sometimes not preferred for big data analysis. The General Regression Neural Network, which is based on Probabilistic Neural Networks, is a type of supervised neural network that is computationally less expensive than standard algorithms, viz. back propagation, radial basis function networks, support vector machines etc., as exhibited in [4]. That is the reason GRNN is considered for the Part of Speech tagging experiment on Nepali text in this article. The following sentence illustrates an annotated tagged Nepali sentence (gloss: "My Name is Archit").
Several statistical methods have been implemented for POS tagging [5] as far as Indian languages are concerned. Nepali is a widely spoken language in Sikkim and neighbouring countries like Nepal and Bhutan. ANN architectures have seldom been used for tagging [6]. POS tagging plays a pivotal role in developing parsers and morphological analysers for natural languages.
This article presents a neural network architecture based on the statistical learning theory described in [4]. This neural network is usually much faster to train than the traditional multilayer perceptron network. The article is divided into five sections. After this introduction, the second section briefly describes the traditional ANN architectures, viz. the feed forward MLP, Radial Basis Function (RBF) networks and the General Regression Neural Network (GRNN). The experimental set-up is highlighted in the third section, followed by results and discussion in the fourth section. The article is concluded in the fifth section.
2. ANN ARCHITECTURES
The feedforward Multilayer Perceptron with a single hidden layer, trained with the backpropagation algorithm and a sigmoidal transfer function, has been successfully employed in various multiclass classification problems. Due to the existence of centres, the kernel based RBF technique is widely used for classification; such networks can be trained for the centres as well as the synaptic weights. Detailed information about these networks is available in [4].
GRNN is also a kernel based technique, in which the Mahalanobis distance of each pattern from the corresponding centre is calculated. Detailed information about Probabilistic and General Regression Neural Networks is available in [4]. GRNN can briefly be introduced for a training set $\{(\mathbf{x}_i, y_i)\}_{i=1}^{N}$. To estimate the joint probability distribution of the vectors $\mathbf{x}$ and $y$, say $f_{\mathbf{X},Y}(\mathbf{x}, y)$, and therefrom $f_{\mathbf{X}}(\mathbf{x})$, we may use a nonparametric estimator known as the Parzen-Rosenblatt density estimator. Basic to the formulation of this estimator is a kernel, denoted by $K(\mathbf{x})$, which has properties similar to those associated with a probability density function.
Assuming that $\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_N$ are independent and identically distributed vectors (each of the random variables has the same probability distribution as the others), we may formally define the Parzen-Rosenblatt density estimate of $f_{\mathbf{X}}(\mathbf{x})$ as

$$\hat{f}_{\mathbf{X}}(\mathbf{x}) = \frac{1}{N h^{m_0}} \sum_{i=1}^{N} K\!\left(\frac{\mathbf{x} - \mathbf{x}_i}{h}\right) \qquad (1)$$

where the smoothing parameter $h$ is a positive number called the bandwidth or simply the width; $h$ controls the size of the kernel, and $m_0$ is the dimension of $\mathbf{x}$. Applying the same estimator to $f_{\mathbf{X},Y}(\mathbf{x}, y)$, the approximated value for a given vector $\mathbf{x}$ is given by

$$\hat{f}(\mathbf{x}) = \frac{\sum_{i=1}^{N} y_i\, K\!\left(\frac{\mathbf{x} - \mathbf{x}_i}{h}\right)}{\sum_{i=1}^{N} K\!\left(\frac{\mathbf{x} - \mathbf{x}_i}{h}\right)} \qquad (2)$$
If we take the Gaussian kernel, i.e. $K(\mathbf{x}) = e^{-\|\mathbf{x}\|^{2}/2}$, we obtain

$$\hat{f}(\mathbf{x}) = \frac{\sum_{i=1}^{N} y_i \exp\!\left(-\frac{D_i^{2}}{2\sigma^{2}}\right)}{\sum_{i=1}^{N} \exp\!\left(-\frac{D_i^{2}}{2\sigma^{2}}\right)} \qquad (3)$$

where $D_i^{2} = (\mathbf{x} - \mathbf{x}_i)^{T}(\mathbf{x} - \mathbf{x}_i)$ and $\sigma$ is the standard deviation. $\hat{f}(\mathbf{x})$ can be visualized as a weighted average of all observed values $y_i$, where each observed value is weighted exponentially according to its Euclidean distance from $\mathbf{x}$. The theory of General Regression Neural Networks discussed above pertains to a single neuron in the output layer. The same technique can also be applied to multiple neurons in the output layer, and can therefore be generalized as shown below.
Let $w_{ij}$ be the target output corresponding to the input training vector $\mathbf{x}_i$ and the $j$-th output node out of the total $p$. Again, let $\mathbf{C}_i$ be the centres chosen from the random vector $\mathbf{x}$. Then

$$y_j(\mathbf{x}) = \frac{\sum_{i=1}^{n} w_{ij} \exp\!\left(-\frac{(\mathbf{x} - \mathbf{C}_i)^{T}(\mathbf{x} - \mathbf{C}_i)}{2\sigma^{2}}\right)}{\sum_{i=1}^{n} \exp\!\left(-\frac{(\mathbf{x} - \mathbf{C}_i)^{T}(\mathbf{x} - \mathbf{C}_i)}{2\sigma^{2}}\right)} \qquad (4)$$
Here $n$ is the number of patterns in the training set. The estimate $y_j$ can be visualized as a weighted average of all the observed values $w_{ij}$, where each observed value is weighted exponentially according to its Euclidean distance from the input vector $\mathbf{x}$.
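A minimal NumPy sketch of the single-pass GRNN predictor of equation (4) is given below. The array names and the value of sigma are illustrative assumptions rather than values taken from this experiment.

```python
import numpy as np

def grnn_predict(X_train, W_target, x, sigma=1.0):
    """GRNN estimate of eq. (4): a Gaussian-kernel-weighted average of
    the stored targets. X_train: (n, d) centres (all training patterns),
    W_target: (n, p) target outputs, x: (d,) query vector."""
    d2 = np.sum((X_train - x) ** 2, axis=1)      # D_i^2 for every centre C_i
    k = np.exp(-d2 / (2.0 * sigma ** 2))         # kernel weights, shape (n,)
    return (k @ W_target) / (np.sum(k) + 1e-12)  # weighted average, shape (p,)
```

Note that there is no iterative weight fitting: the training patterns themselves act as the hidden-layer centres, which is what makes GRNN a single-pass method.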
3. EXPERIMENTAL PROCEDURE
A survey of Part of Speech tagging for Indian languages is given by Antony P J (2011) in [7]. Details of the tag set used for the experiment are available in [8, 9]. Of the annotated corpus, 42100 samples (patterns) are used for training and the remaining 2500 samples for testing. The database is distributed into n = 41 tags. The network architecture consists of 41 x 3 = 123 input neurons, 42100 hidden neurons, which play the role of the centres $\mathbf{C}_i$ (i = 1, 2, ..., 42100) shown in (4), and 41 neurons in the output layer.
Transition and emission probability matrices are constructed for both the sets, viz. training and testing. The transition matrix gives the probability of occurrence of one tag (state) after another tag (state) and hence is a square 41 x 41 matrix, whereas the emission matrix holds the probability with which each Nepali word is allotted the respective tag and hence is of size n x m (m being the number of Nepali words). To fetch the features of the i-th word, the i-th row and i-th column of the transition matrix and the i-th row of the emission matrix are combined, which yields 41 x 3 = 123 features for each word. The ANN architectures therefore consist of 123 input neurons. All the patterns (or Nepali words) are used as centres, and the Euclidean distance is calculated between patterns and centres. The training set consists of 5373 words, hence the same number of hidden neurons is incorporated in the ANN architectures.
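As an illustration of this feature construction, the sketch below assembles the 123-dimensional vector for one word. The matrix layout and the choice of row versus column indexing are assumptions for illustration; the paper does not publish its code.

```python
import numpy as np

def features_for_word(tag_idx, word_idx, A, B):
    """Assemble the 41 x 3 = 123 features for one word.
    A: (41, 41) tag-transition probability matrix,
    B: (41, m) emission matrix (tags x vocabulary),
    tag_idx: tag index associated with the word,
    word_idx: index of the word in the vocabulary."""
    out_of_tag = A[tag_idx, :]   # transition probabilities from the tag
    into_tag = A[:, tag_idx]     # transition probabilities into the tag
    emission = B[:, word_idx]    # emission probabilities of the word per tag
    return np.concatenate([out_of_tag, into_tag, emission])  # shape (123,)
```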
As there are 41 tags, 41 output neurons constitute the output layer of the network. For instance, if a word belongs to the NN (common noun) category, which is the first tag of the tag set, then the first output neuron has the value 1 and all the others are 0.
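The one-hot target encoding described above can be written as follows; the tag order is assumed to match the tag set of [8].

```python
import numpy as np

def one_hot_tag(tag_idx, n_tags=41):
    """Target vector with 1 at the word's tag position and 0 elsewhere."""
    t = np.zeros(n_tags)
    t[tag_idx] = 1.0
    return t

# A word tagged NN (the first tag of the set) maps to [1, 0, 0, ..., 0].
```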
3.1 FEED FORWARD NEURAL NETWORK
The MLP network is trained with 123 input neurons, 30 hidden neurons and 41 output neurons, using a sigmoidal transfer function, until the error between the computed output and the target output reaches the goal of exp(-16). The code is implemented in Matlab. The network is trained for up to 800 epochs.
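For readers who wish to reproduce this configuration outside Matlab, a rough scikit-learn equivalent is sketched below. The solver and stopping details are assumptions, since the paper specifies only the layer sizes, the sigmoidal activation, the error goal, and the 800-epoch limit.

```python
from sklearn.neural_network import MLPClassifier

# 123 input features -> 30 sigmoidal hidden units -> 41 tag classes.
# X_train: (n, 123) feature matrix, y_train: (n,) integer tag labels
# (both names assumed for illustration).
mlp = MLPClassifier(hidden_layer_sizes=(30,),
                    activation='logistic',  # sigmoidal transfer function
                    max_iter=800,           # trained up to 800 epochs
                    tol=1e-16)              # approximate error goal
mlp.fit(X_train, y_train)
```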
3.2 RADIAL BASIS FUNCTION NETWORK
The RBF network is trained with 123 input neurons, as many hidden neurons as there are patterns, and 41 output neurons, using a sigmoidal transfer function, until the error between the computed output and the target output reaches the goal of exp(-16). The code is implemented in Matlab. The network is trained for up to 800 epochs.
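A compact NumPy sketch of an RBF network with one Gaussian centre per training pattern follows. It fits the output weights with a least-squares solve rather than the 800-epoch iterative training used in the paper, so it illustrates the architecture only.

```python
import numpy as np

def rbf_design(X, C, sigma=1.0):
    """Phi[i, j] = exp(-||x_i - c_j||^2 / (2 sigma^2))."""
    d2 = ((X[:, None, :] - C[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def rbf_fit(X_train, Y_onehot, sigma=1.0):
    """Every training pattern is a centre; solve for output weights."""
    Phi = rbf_design(X_train, X_train, sigma)           # (n, n)
    W, *_ = np.linalg.lstsq(Phi, Y_onehot, rcond=None)  # (n, 41)
    return W

def rbf_predict(X, X_train, W, sigma=1.0):
    return rbf_design(X, X_train, sigma) @ W            # tag scores per row
```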
3.3 GENERAL REGRESSION NETWORK
The synaptic weights of the GRNN are computed as shown in Section 2. There are 123 input neurons, as many hidden neurons as there are patterns, and 41 output neurons, using the Gaussian transfer function. The code is implemented in Java.
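With the grnn_predict sketch from Section 2, tagging a word then reduces to one distance computation per centre (variable names again illustrative):

```python
import numpy as np

# x_word: 123-dim feature vector; W_onehot: (n, 41) one-hot targets.
scores = grnn_predict(X_train, W_onehot, x_word, sigma=1.0)
predicted_tag = int(np.argmax(scores))  # index of the winning output neuron
```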
4. RESULT ANALYSIS
The training database contains 42100 words, whereas the testing set consists of 6000 words. Neither database contains words with multiple tags, i.e. the same Nepali word carrying more than one tag. The performance of all three networks is depicted in Table 1. The experiments demonstrate that all the architectures identify the suitable tags well as far as the training set is concerned, but on the testing set no network except GRNN performs efficiently. Table 1 exhibits the fact that 5899 out of 6000 testing samples are tagged properly, whereas in the other cases the tagging accuracy is extremely poor.
Table 1: Performance analysis of ANN techniques
The output of the GRNN on a sample test input is demonstrated as follows. Only the first word नानी (a child) is erroneously tagged PP (pronoun), whereas its actual tag is NN (common noun). This happens because the frequency of the word नानी in the training set is 1, and among all the tags the frequency of PP is the highest, namely 37.
5. CONCLUSIONS
GRNN achieves better accuracy (98.32%) on the testing set than the other two ANN techniques, viz. RBF and the Feed Forward Neural Network, as depicted in Table 1. Words of the testing set that occur with low frequency (1 or 2) in the training set fail to get an appropriate tag in the case of the GRNN architecture. It can be concluded that the GRNN technique, which requires no iterative training and yields the output based only on distances, provides much better accuracy than the other ANN architectures, which do require training.
ACKNOWLEDGEMENTS
The author acknowledges the Department of Science and Technology, Government of India, for financial support vide Reference no. SR/CSRI/28/2015 under the Cognitive Science Research Initiative (CSRI) to carry out this work.
REFERENCES
[1] Richard O. Duda and Peter E. Hart, "Pattern Classification", Wiley-Interscience, New York, USA, 2006.
[2] S. Rama Mohan, Archit Yajnik, "Gujarati Numeral Recognition Using Wavelets and Neural Network", Proceedings of the Indian International Conference on Artificial Intelligence, 2005, pp. 397-406.
[3] Archit Yajnik, S. Rama Mohan, "Identification of Gujarati characters using wavelets and neural networks", Artificial Intelligence and Soft Computing, ACTA Press, 2006, pp. 150-155.
[4] Simon Haykin, "Neural Networks: A Comprehensive Foundation", Second Edition, Prentice Hall International, Inc., New Jersey, 1999.
[5] Prajadip Sinha et al., "Enhancing the Performance of Part of Speech tagging of Nepali language through Hybrid approach", International Journal of Emerging Technology and Advanced Engineering, 5(5), 2015.
[6] Tej Bahadur Shai et al., "Support Vector Machines based Part of Speech Tagging for Nepali Text", International Journal of Computer Applications, Vol. 70, No. 24, 2013.
[7] Antony P J et al., "Parts of Speech Tagging for Indian Languages: A Literature Survey", International Journal of Computer Applications (0975-8887), 34(8), 2011.
[8] http://www.lancaster.ac.uk/staff/hardiea/nepali/postag.php
[9] http://www.panl10n.net/english/Outputs%20Phase%202/CCs/Nepal/MPP/Papers/2008/Report%20on%20Nepali%20Computational%20Grammar.pdf
[10] Archit Yajnik, "Part of Speech Tagging Using Statistical Approach for Nepali Text", International Journal of Computer, Electrical, Automation, Control and Information Engineering, Vol. 11, No. 1, 2017, pp. 76-79.