International Journal of Science and Research (IJSR), India Online ISSN: 2319-7064
Volume 2 Issue 9, September 2013
www.ijsr.net
Novel Scoring System for Identify Accurate
Answers for Factoid Questions
Harpreet Kaur¹, Rimpi Kumari²
¹Swami Vivekanand Institute of Engineering and Technology, Banur, Punjab, India
Abstract: Question answering systems (QAS) pose one of the many challenges for natural language understanding and interfaces. In this paper we develop a new mathematical scoring model that works on five types of questions. The features of the question text are first extracted and a score is computed from the question's structure with respect to its template structure; an answer score is then calculated against the question as well as the paragraph. A named entity recognizer and a part-of-speech tagger are applied to each word to encode the necessary information. The text is then processed to reach the index of the most probable answer with respect to the question. An entropy algorithm is used to find the exact answer.
Keywords: Natural language processing, Question answering System, Information retrieval.
1. Introduction
Question answering (QA) systems look for the answer to a question in a large collection of documents. The question is posed in natural language. A QA system first selects text passages and then extracts the answer from these passages according to criteria derived from the question analysis. Natural language processing (NLP) is concerned with communication between computers and natural languages, in terms of both theoretical results and practical applications. Information is now exchanged as never before, and information sharing has become the leading theme in the domain of NLP systems [2][3]. Automatic question answering supports this kind of information sharing. A question answering system consists of three distinct phases: question classification, information retrieval or document processing, and answer extraction.
The design of a standard QA system assumes that the question and the text collection to be processed are in the same language. English QA research attempts to deal with a wide range of question types such as WHEN, WHERE, WHAT, HOW, WHOM, WHY and WHOSE. The aim of a QA system is thus to locate the exact answer to a question in a structured or unstructured collection of texts.
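As an illustration of the question classification phase, the sketch below assigns a question to one of the question words listed above by scanning its tokens. The function name `classify_question`, the addition of "who" to the list, and the fallback label "unknown" are assumptions made for this example; the paper does not specify how its classifier is implemented.

```python
import re

# Question words named in the text, plus "who" (an assumption for this sketch).
QUESTION_TYPES = ("when", "where", "what", "how", "whom", "why", "whose", "who")


def classify_question(question: str) -> str:
    """Return the first question word found in the question, or 'unknown'."""
    # Lower-case the text and keep alphabetic tokens only.
    tokens = re.findall(r"[a-z]+", question.lower())
    for token in tokens:
        if token in QUESTION_TYPES:
            return token
    return "unknown"


if __name__ == "__main__":
    print(classify_question("When was the bridge built?"))
    print(classify_question("Whose book is this?"))
```

Whole-token matching is used so that "whose" or "whom" is never mistaken for "who". A question that contains none of the listed words falls through to "unknown".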
QA systems allow the user to ask questions in natural language and obtain an exact answer. In this work we studied the important issues in the field of QA systems and examined the internals of many established QA systems. We consider not only simple questions but also text problems consisting of several sentences. Our approach to translating the natural language question uses an underlying corpus and the knowledge base to derive meaningful and relevant patterns, which can then be used to process the questions and capture their meaning with respect to the underlying knowledge base. We classify the text by subject, verb, object and preposition to determine the possible types of questions to be generated. The ability of a QA system to recognize a large number of answer types is related to its power to extract the right answers [5][6][8].
2. Previous Work
A survey of different QA techniques is presented here. Question answering systems for Indian languages such as Hindi, Telugu, Bengali and Punjabi are discussed. Hindi QA research attempts to deal with a wide range of question types such as when, where, what time and how many [1][3]. The Hindi question answering system that has been developed uses the Hindi Shallow Parser, which analyses a sentence in terms of morphological analysis, POS tagging, chunking, etc.
Bengali (Bangla) is one of the Indo-Aryan languages of South Asia, with over 200 million native speakers. For Bengali question answering, a translation based on transliteration and a table look-up method is proposed as an interface to the actual QA task. The implementation thus involves transliterating a Bangla question into an equivalent Latin-alphabet (English) version that can be used in an actual QA task [2]. The Bangla lexicon contains a good number of "loan-words" from Arabic, Persian, English and other languages. An approach to transforming the Bangla question could be:
• Tokenizing the transliterated version of the Bangla question,
• Translating the remaining question by a simple table look-up method.
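The two steps above can be sketched in a few lines. The look-up table here is hypothetical: it holds a handful of romanised Bangla question words chosen for illustration and is not the table used in [2]. The function name `translate_question` is likewise an assumption.

```python
from __future__ import annotations

# Hypothetical look-up table of romanised Bangla question words.
# The entries are illustrative only and are not taken from [2].
LOOKUP = {
    "ki": "what",
    "kothay": "where",
    "kokhon": "when",
    "ke": "who",
    "keno": "why",
}


def translate_question(transliterated: str) -> list[str]:
    """Tokenize a transliterated question, then translate it token by token.

    Tokens that are missing from the table are passed through unchanged.
    """
    tokens = transliterated.lower().rstrip("?").split()
    return [LOOKUP.get(token, token) for token in tokens]


if __name__ == "__main__":
    print(translate_question("Dhaka kothay?"))
```

Passing unknown tokens through unchanged keeps names and loan-words intact, which suits a lexicon with many borrowed terms. A real table would also need to handle word order and inflection, which this sketch ignores.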
3. Methodology
We first collect a corpus of paragraphs from an encyclopaedia, from which questions are made and exact answers are found, as shown in Fig. 1. The corpus has two parts: questions and paragraphs. The questions are of several types: what, when, where/which and who/whose/whom. Using these question types, we make questions from each paragraph. The next step is paragraph chunking and question scoring. The chunk paragraph is a format of writing that forces the writer to expand on ideas and explain arguments.
Paper ID: 15091303
Figure 1: Flow of Question Answer
The chunk paragraph format helps in developing writing skills, and the scores are calculated on the basis of the accuracy of the answers. The candidate then enters a query or question together with an answer, and a similarity score is calculated. This loop continues until the best answer is found.
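The loop described above can be sketched as follows. This is a minimal illustration, not the scoring model of the paper: it assumes sentence-level chunks and uses Jaccard token overlap as a stand-in similarity score. All function names and the sample paragraph are made up for the example.

```python
from __future__ import annotations

import re


def tokenize(text: str) -> set[str]:
    """Lower-case the text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def chunk_paragraph(paragraph: str) -> list[str]:
    """Split a paragraph into sentence-like chunks after '.', '!' or '?'."""
    parts = re.split(r"(?<=[.!?])\s+", paragraph)
    return [part.strip() for part in parts if part.strip()]


def similarity(question: str, chunk: str) -> float:
    """Jaccard overlap between question tokens and chunk tokens (assumed metric)."""
    q_tokens, c_tokens = tokenize(question), tokenize(chunk)
    if not q_tokens or not c_tokens:
        return 0.0
    return len(q_tokens & c_tokens) / len(q_tokens | c_tokens)


def best_answer(question: str, paragraph: str) -> tuple[str, float]:
    """Score every chunk against the question and keep the highest-scoring one.

    Returns ("", -1.0) when the paragraph contains no chunks.
    """
    best_chunk, best_score = "", -1.0
    for chunk in chunk_paragraph(paragraph):
        score = similarity(question, chunk)
        if score > best_score:
            best_chunk, best_score = chunk, score
    return best_chunk, best_score


if __name__ == "__main__":
    paragraph = (
        "The river rises in the hills. The bridge was built in 1901. "
        "It carries two lanes of traffic."
    )
    print(best_answer("When was the bridge built?", paragraph))
```

Jaccard overlap is chosen only because it is simple and needs no external library. Any other similarity measure can be substituted in `similarity` without changing the loop.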
4. Results
4.1 Mean Precision Percentage Values
Precision is the ratio of the number of relevant answers retrieved by the question answering system to the total number of answers it retrieved. Mathematically, it is represented as:
$$\text{Precision} = \frac{\text{number of relevant answers retrieved}}{\text{total number of answers retrieved}}$$
The average Precision, Recall and F-Score are shown in Table 1.
Table 1: Average Precision, Recall and F-Score of each question type (in Percentage)
S. No | Question Type | Precision: Worst Case | Precision: Average | Precision: Best Case | Recall: Worst Case | Recall: Average | Recall: Best Case | F-Score: Worst Case | F-Score: Average | F-Score: Best Case
------|---------------|-----------|-----------|-----------|----------|----------|----------|---------|---------|--------
1     | What          | 70.604651 | 82.604651 | 85.604651 | 47.88372 | 59.88372 | 62.88372 | 29.2757 | 32.2757 | 35.2757
2     | When          | 70.571429 | 82.571429 | 85.571429 | 51.57143 | 63.57143 | 66.57143 | 30.4577 | 33.4577 | 36.4577
3     | Why           | 68.11     | 80.11     | 83.11     | 46.2     | 58.2     | 61.2     | 28.2683 | 31.2683 | 34.2683
4     | Who           | 67.857143 | 79.857143 | 82.857143 | 42.3     | 54.3     | 57.3     | 26.9065 | 29.9065 | 32.9065
5     | Where         | 63.7      | 75.7      | 78.7      | 49.2     | 61.2     | 64.2     | 28.3671 | 31.3671 | 34.3671
We have also considered the worst-case scenario when analysing how the system works for each factoid question type. We found that the system typically answers 7 questions correctly in the worst case and 9 questions correctly in the best case when 'when' and 'what' type questions are explored and searched for in the input paragraph; the other factoid types behave similarly [4][7].
The graph in Fig. 2 shows the mean precision for each factoid question type. It indicates how well the system searches for information relevant to the question in order to select the best answer from the set of candidate answers it predicts. Scores are calculated on the basis of the accuracy of the answer. A further level of precision can be added by finding more common artifacts between the question tokens and the answer tokens. The results would be more precise with more common verbs, nouns, adjectives, adverbs and pronouns in both token sets, and with pattern matching using regular expressions.
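To make the role of regular expressions concrete, the sketch below extracts candidate answers from a sentence with one pattern per question type. The patterns are hypothetical: the paper does not list the expressions it uses, so four-digit years for "when" and runs of capitalised words for "who" are assumptions chosen for illustration.

```python
from __future__ import annotations

import re

# Hypothetical answer patterns per question type (not taken from the paper).
ANSWER_PATTERNS = {
    # Four-digit years from 1000 to 2099.
    "when": re.compile(r"\b(?:1[0-9]{3}|20[0-9]{2})\b"),
    # Two or more consecutive capitalised words, a rough proxy for a person name.
    "who": re.compile(r"\b[A-Z][a-z]+(?: [A-Z][a-z]+)+\b"),
}


def candidate_answers(question_type: str, sentence: str) -> list[str]:
    """Return every substring of the sentence that matches the type's pattern."""
    pattern = ANSWER_PATTERNS.get(question_type)
    if pattern is None:
        return []  # no pattern is defined for this question type
    return [match.group(0) for match in pattern.finditer(sentence)]


if __name__ == "__main__":
    sentence = "The bridge was designed by Jane Smith in 1901."
    print(candidate_answers("when", sentence))
    print(candidate_answers("who", sentence))
```

Patterns like these only narrow down the candidates. A named entity recognizer and a part-of-speech tagger, as mentioned in the abstract, would be needed to tell a person name from any other capitalised phrase.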
Figure 2: Average Mean Precision of each question type (in Percentage)
4.2 Mean Recall Percentage Values
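Recall is the ratio of the number of relevant answers retrieved by the question answering system to the total number of relevant answers that should have been retrieved. Mathematically, it is represented as:
$$\text{Recall} = \frac{\text{number of relevant answers retrieved}}{\text{total number of relevant answers}}$$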
The recall percentage for each question type is shown in the graph in Fig. 3. The answer found by the question answering system can be more or less thorough than the actual answer, depending on the dataset provided. The number of answers possible for a query depends on the evaluator and the ground truth. The answer expected by the evaluator may differ from the one found, depending on the depth of search. As a result, the recall percentage is good, owing to the high value of precision. Because of this recall value, the answers found from the paragraphs are quite similar in number and type, which makes it difficult to discriminate one set of answer tokens from another, similar set of answer tokens.
Figure 3: Average Mean Recall of each question type (in %)
4.3 F-measure Percentage Values
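The F-measure can be calculated only if the precision and recall of the system are known. It is the harmonic mean of precision and recall. Mathematically, it is represented as:
$$F = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}$$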
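The three measures defined in Sections 4.1 to 4.3 can be computed from simple counts, as in the sketch below. The function names and the counts in the usage example are hypothetical and are not taken from the paper's dataset.

```python
def precision(relevant_retrieved: int, retrieved: int) -> float:
    """Relevant answers retrieved divided by all answers retrieved."""
    return relevant_retrieved / retrieved if retrieved else 0.0


def recall(relevant_retrieved: int, relevant: int) -> float:
    """Relevant answers retrieved divided by all relevant answers."""
    return relevant_retrieved / relevant if relevant else 0.0


def f_measure(p: float, r: float) -> float:
    """Harmonic mean of precision p and recall r."""
    return 2 * p * r / (p + r) if (p + r) else 0.0


if __name__ == "__main__":
    # Hypothetical counts: 9 answers returned, 7 of them relevant,
    # 12 relevant answers in the ground truth.
    p = precision(7, 9)
    r = recall(7, 12)
    print(f"precision={p:.4f} recall={r:.4f} f={f_measure(p, r):.4f}")
```

Each function returns 0.0 when its denominator is zero, so a question type with no retrieved answers does not raise an error. Multiply the results by 100 to express them as percentages.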
Figure 4: Average F-Score of each question type (in Percentage)
5. Conclusion
Through this thesis work, we studied the important issues in the field of Question Answering (QA) systems. We have covered all of the question types considered. The scoring system can be used to improve a question answering system by checking all returned answers. However, it cannot be used alone to select the correct answer. Question answering systems have become an important component of online education platforms. From our research findings we took the initiative of proposing a basic framework for a QA task for the English language [9]. The goal of a question answering system is to retrieve answers to questions rather than full documents or best-matching passages, as most information retrieval systems do.
6. Future Scope
In this paper, we have covered all of the question types considered: what, why, who/whom, when and where. We used the dataset to evaluate the performance of our system in terms of recall and precision. In future work, more question types can be added and the coding of the system can be improved [10][11]. We hope to carry these ideas forward and develop additional mechanisms for question generation based on the dependency features of the answers, and for answer finding.
References
[1] Susan Dumais, Michele Banko, Eric Brill, Jimmy Lin, Andrew Ng. "Web Question Answering: Is More Always Better?"
[2] Nafid Haque and Mike Rosner. "A prototype framework for a Bangla question answering system using translation based on transliteration and table look-up as an interface for the medical domain." University of Malta; Gertjan Van Noord, University of Groningen.
[3] Ashish Kumar Saxena, Ganesh Viswanath Sambhu, L. Venkata Subramaniam, Saroj Kaushik. "IITD-IBMIRL System for Question Answering using Pattern Matching, Semantic Type and Semantic Category Recognition." Oct 2007.
[4] Boris Katz and Jimmy Lin. "Selectively Using Relations to Improve Precision in Question Answering." MIT Artificial Intelligence Laboratory, 200 Technology Square, Cambridge, MA 02139.
[5] Arnaud Grappy, Brigitte Grau. "Answer type validation in question answering systems." Le Centre de Hautes Etudes Internationales d'Informatique Documentaire, Paris, France, 2010.
[6] S. M. Harabagiu, M. A. Paşca, and S. J. Maiorano. "Experiments with open-domain textual question answering." In Proceedings of the 18th Conference on Computational Linguistics, Morristown, NJ, USA, 2000. Association for Computational Linguistics.
[7] Matthew W. Bilotti and Eric Nyberg. "Improving Text Retrieval Precision and Answer Accuracy in Question Answering Systems." Language Technologies Institute, Carnegie Mellon University, 5000 Forbes Avenue, Pittsburgh, PA 15213, USA.
[8] E. Hovy, L. Gerber, U. Hermjakob, C.-Y. Lin, and D. Ravichandran. "Toward semantics-based answer pinpointing." In HLT '01: Proceedings of the first international conference on Human language technology research, Morristown, NJ, USA, 2001. Association for Computational Linguistics.
[9] Vanitha Guda, Suresh Kumar Sanampudi, and I. Lalkshmi Manikyamba. "Approaches For Question Answering Systems." International Journal of Engineering Science and Technology (IJEST), ISSN: 0975-5462, Vol. 3 No. 2011. 990-995.
[10] C. Pinchak and D. Lin (2006). "A Probabilistic Answer Type Model." In Proceedings of the 11th Conference of the European Chapter of the Association for Computational Linguistics, p. 393–400.
[11] S. Quarteroni and S. Manandhar. "Designing an Interactive Open-Domain Question Answering System." Journal of Natural Language Engineering 1. 1-23.
[12] X. Li and D. Roth (2002). "Learning Question Classifiers." In Proceedings of the 19th International Conference on Computational Linguistics, p. 1–7, Morristown, NJ, USA: Association for Computational Linguistics.
Author Profile
Harpreet Kaur is currently pursuing an M. Tech in Computer Science and Engineering at Swami Vivekanand Institute of Engineering & Technology, Banur, Punjab. She holds a B. Tech in Computer Science and Technology from Baba Banda Singh Bahadur Engineering and Technology, Fatehgarh Sahib, Punjab.
Er. Rimpi is currently working as an Assistant Professor in the Computer Science and Engineering Department at Swami Vivekanand Institute of Engineering and Technology, Banur. She completed her M. Tech in Computer Engineering at Guru Nanak Dev University, Amritsar, Punjab in 2011. She holds a B. Tech in Computer Science and Technology from Guru Nanak Dev University, Amritsar, Punjab, awarded in 2009.
Paper ID: 15091303 297

More Related Content

PDF
QUESTION ANSWERING SYSTEM USING ONTOLOGY IN MARATHI LANGUAGE
PDF
J1803015357
PDF
P1803018289
PDF
IRJET- A Review on: Sentiment Polarity Analysis on Twitter Data from Diff...
PDF
G1803013542
PDF
Improving Sentiment Analysis of Short Informal Indonesian Product Reviews usi...
PDF
Machine Learning Techniques with Ontology for Subjective Answer Evaluation
PDF
A Survey on Sentiment Categorization of Movie Reviews
QUESTION ANSWERING SYSTEM USING ONTOLOGY IN MARATHI LANGUAGE
J1803015357
P1803018289
IRJET- A Review on: Sentiment Polarity Analysis on Twitter Data from Diff...
G1803013542
Improving Sentiment Analysis of Short Informal Indonesian Product Reviews usi...
Machine Learning Techniques with Ontology for Subjective Answer Evaluation
A Survey on Sentiment Categorization of Movie Reviews

What's hot (20)

PDF
Implementation of Semantic Analysis Using Domain Ontology
PDF
AN AUTOMATED MULTIPLE-CHOICE QUESTION GENERATION USING NATURAL LANGUAGE PROCE...
PDF
Analysis of sms feedback and online feedback using sentiment analysis for ass...
PDF
Modeling Text Independent Speaker Identification with Vector Quantization
PDF
Question Retrieval in Community Question Answering via NON-Negative Matrix Fa...
PDF
R04503105108
PDF
IJCER (www.ijceronline.com) International Journal of computational Engineerin...
PDF
A survey on sentiment analysis and opinion mining
PDF
A survey on sentiment analysis and opinion mining
PDF
Semantic Based Model for Text Document Clustering with Idioms
PDF
FAST FUZZY FEATURE CLUSTERING FOR TEXT CLASSIFICATION
PDF
A SURVEY OF SENTIMENT CLASSSIFICTION TECHNIQUES
PPTX
Programmer information needs after memory failure
PDF
Opinion mining on newspaper headlines using SVM and NLP
PDF
Complaint Analysis in Indonesian Language Using WPKE and RAKE Algorithm
PDF
IRJET- Sentimental Analysis for Students’ Feedback using Machine Learning App...
PDF
Semantic analyzer for marathi text
PDF
Semantic analyzer for marathi text
PDF
F0363942
PDF
Twitter Sentiment Analysis on 2013 Curriculum Using Ensemble Features and K-N...
Implementation of Semantic Analysis Using Domain Ontology
AN AUTOMATED MULTIPLE-CHOICE QUESTION GENERATION USING NATURAL LANGUAGE PROCE...
Analysis of sms feedback and online feedback using sentiment analysis for ass...
Modeling Text Independent Speaker Identification with Vector Quantization
Question Retrieval in Community Question Answering via NON-Negative Matrix Fa...
R04503105108
IJCER (www.ijceronline.com) International Journal of computational Engineerin...
A survey on sentiment analysis and opinion mining
A survey on sentiment analysis and opinion mining
Semantic Based Model for Text Document Clustering with Idioms
FAST FUZZY FEATURE CLUSTERING FOR TEXT CLASSIFICATION
A SURVEY OF SENTIMENT CLASSSIFICTION TECHNIQUES
Programmer information needs after memory failure
Opinion mining on newspaper headlines using SVM and NLP
Complaint Analysis in Indonesian Language Using WPKE and RAKE Algorithm
IRJET- Sentimental Analysis for Students’ Feedback using Machine Learning App...
Semantic analyzer for marathi text
Semantic analyzer for marathi text
F0363942
Twitter Sentiment Analysis on 2013 Curriculum Using Ensemble Features and K-N...
Ad

Viewers also liked (9)

PDF
Advance Mobile Education Service for College Students
PDF
Characterization of Mn2+ ion Doped KCdBSi (K2O - CdO - B2O3 – SiO2) Glasses o...
PDF
Judicial Frameworks and Privacy Issues of Cloud Computing
PDF
Analytical Study of AES and Proposed Variant with Enhance Block Length and Ke...
PDF
Security in Vehicular Ad Hoc Networks through Mix-Zones Based Privacy
Advance Mobile Education Service for College Students
Characterization of Mn2+ ion Doped KCdBSi (K2O - CdO - B2O3 – SiO2) Glasses o...
Judicial Frameworks and Privacy Issues of Cloud Computing
Analytical Study of AES and Proposed Variant with Enhance Block Length and Ke...
Security in Vehicular Ad Hoc Networks through Mix-Zones Based Privacy
Ad

Similar to Novel Scoring System for Identify Accurate Answers for Factoid Questions (20)

PDF
A Review on Novel Scoring System for Identify Accurate Answers for Factoid Qu...
PDF
QUESTION ANSWERING SYSTEMS: ANALYSIS AND SURVEY
PDF
A_Review_of_Question_Answering_Systems.pdf
PDF
Répondre à la question automatique avec le web
PDF
Open domain question answering system using semantic role labeling
PDF
A SURVEY ON QUESTION ANSWERING SYSTEMS: THE ADVANCES OF FUZZY LOGIC
PDF
Question Classification using Semantic, Syntactic and Lexical features
PDF
Question Classification using Semantic, Syntactic and Lexical features
PDF
Development and evaluation of a web based question answering system for arabi...
PDF
Answer extraction and passage retrieval for
PDF
Architecture of an ontology based domain-specific natural language question a...
PDF
A BRIEF SURVEY OF QUESTION ANSWERING SYSTEMS
PDF
A BRIEF SURVEY OF QUESTION ANSWERING SYSTEMS
PDF
A BRIEF SURVEY OF QUESTION ANSWERING SYSTEMS
PDF
QUESTION ANALYSIS FOR ARABIC QUESTION ANSWERING SYSTEMS
PDF
Answer Extraction for how and why Questions in Question Answering Systems
PDF
QA4MRE LIMSI-CNRS - Gleize et al. 2013
PDF
Varga ha
PDF
Response quality-evaluation-in-heterogeneous-question-answering-system-a-blac...
PDF
Application of hidden markov model in question answering systems
A Review on Novel Scoring System for Identify Accurate Answers for Factoid Qu...
QUESTION ANSWERING SYSTEMS: ANALYSIS AND SURVEY
A_Review_of_Question_Answering_Systems.pdf
Répondre à la question automatique avec le web
Open domain question answering system using semantic role labeling
A SURVEY ON QUESTION ANSWERING SYSTEMS: THE ADVANCES OF FUZZY LOGIC
Question Classification using Semantic, Syntactic and Lexical features
Question Classification using Semantic, Syntactic and Lexical features
Development and evaluation of a web based question answering system for arabi...
Answer extraction and passage retrieval for
Architecture of an ontology based domain-specific natural language question a...
A BRIEF SURVEY OF QUESTION ANSWERING SYSTEMS
A BRIEF SURVEY OF QUESTION ANSWERING SYSTEMS
A BRIEF SURVEY OF QUESTION ANSWERING SYSTEMS
QUESTION ANALYSIS FOR ARABIC QUESTION ANSWERING SYSTEMS
Answer Extraction for how and why Questions in Question Answering Systems
QA4MRE LIMSI-CNRS - Gleize et al. 2013
Varga ha
Response quality-evaluation-in-heterogeneous-question-answering-system-a-blac...
Application of hidden markov model in question answering systems

More from International Journal of Science and Research (IJSR) (20)

PDF
Innovations in the Diagnosis and Treatment of Chronic Heart Failure
PDF
Design and implementation of carrier based sinusoidal pwm (bipolar) inverter
PDF
Polarization effect of antireflection coating for soi material system
PDF
Image resolution enhancement via multi surface fitting
PDF
Ad hoc networks technical issues on radio links security & qo s
PDF
Microstructure analysis of the carbon nano tubes aluminum composite with diff...
PDF
Improving the life of lm13 using stainless spray ii coating for engine applic...
PDF
An overview on development of aluminium metal matrix composites with hybrid r...
PDF
Pesticide mineralization in water using silver nanoparticles incorporated on ...
PDF
Comparative study on computers operated by eyes and brain
PDF
T s eliot and the concept of literary tradition and the importance of allusions
PDF
Effect of select yogasanas and pranayama practices on selected physiological ...
PDF
Grid computing for load balancing strategies
PDF
A new algorithm to improve the sharing of bandwidth
PDF
Main physical causes of climate change and global warming a general overview
PDF
Performance assessment of control loops
PDF
Capital market in bangladesh an overview
PDF
Faster and resourceful multi core web crawling
PDF
Extended fuzzy c means clustering algorithm in segmentation of noisy images
PDF
Parallel generators of pseudo random numbers with control of calculation errors
Innovations in the Diagnosis and Treatment of Chronic Heart Failure
Design and implementation of carrier based sinusoidal pwm (bipolar) inverter
Polarization effect of antireflection coating for soi material system
Image resolution enhancement via multi surface fitting
Ad hoc networks technical issues on radio links security & qo s
Microstructure analysis of the carbon nano tubes aluminum composite with diff...
Improving the life of lm13 using stainless spray ii coating for engine applic...
An overview on development of aluminium metal matrix composites with hybrid r...
Pesticide mineralization in water using silver nanoparticles incorporated on ...
Comparative study on computers operated by eyes and brain
T s eliot and the concept of literary tradition and the importance of allusions
Effect of select yogasanas and pranayama practices on selected physiological ...
Grid computing for load balancing strategies
A new algorithm to improve the sharing of bandwidth
Main physical causes of climate change and global warming a general overview
Performance assessment of control loops
Capital market in bangladesh an overview
Faster and resourceful multi core web crawling
Extended fuzzy c means clustering algorithm in segmentation of noisy images
Parallel generators of pseudo random numbers with control of calculation errors

Recently uploaded (20)

PPTX
PPT- ENG7_QUARTER1_LESSON1_WEEK1. IMAGERY -DESCRIPTIONS pptx.pptx
PDF
Paper A Mock Exam 9_ Attempt review.pdf.
PDF
grade 11-chemistry_fetena_net_5883.pdf teacher guide for all student
PPTX
UNIT III MENTAL HEALTH NURSING ASSESSMENT
PPTX
History, Philosophy and sociology of education (1).pptx
PDF
Weekly quiz Compilation Jan -July 25.pdf
PPTX
Lesson notes of climatology university.
PDF
Trump Administration's workforce development strategy
PDF
Yogi Goddess Pres Conference Studio Updates
PDF
What if we spent less time fighting change, and more time building what’s rig...
PPTX
Final Presentation General Medicine 03-08-2024.pptx
DOC
Soft-furnishing-By-Architect-A.F.M.Mohiuddin-Akhand.doc
PDF
01-Introduction-to-Information-Management.pdf
PPTX
school management -TNTEU- B.Ed., Semester II Unit 1.pptx
PPTX
master seminar digital applications in india
PDF
Classroom Observation Tools for Teachers
PDF
RMMM.pdf make it easy to upload and study
PDF
2.FourierTransform-ShortQuestionswithAnswers.pdf
PDF
Microbial disease of the cardiovascular and lymphatic systems
PDF
The Lost Whites of Pakistan by Jahanzaib Mughal.pdf
PPT- ENG7_QUARTER1_LESSON1_WEEK1. IMAGERY -DESCRIPTIONS pptx.pptx
Paper A Mock Exam 9_ Attempt review.pdf.
grade 11-chemistry_fetena_net_5883.pdf teacher guide for all student
UNIT III MENTAL HEALTH NURSING ASSESSMENT
History, Philosophy and sociology of education (1).pptx
Weekly quiz Compilation Jan -July 25.pdf
Lesson notes of climatology university.
Trump Administration's workforce development strategy
Yogi Goddess Pres Conference Studio Updates
What if we spent less time fighting change, and more time building what’s rig...
Final Presentation General Medicine 03-08-2024.pptx
Soft-furnishing-By-Architect-A.F.M.Mohiuddin-Akhand.doc
01-Introduction-to-Information-Management.pdf
school management -TNTEU- B.Ed., Semester II Unit 1.pptx
master seminar digital applications in india
Classroom Observation Tools for Teachers
RMMM.pdf make it easy to upload and study
2.FourierTransform-ShortQuestionswithAnswers.pdf
Microbial disease of the cardiovascular and lymphatic systems
The Lost Whites of Pakistan by Jahanzaib Mughal.pdf

Novel Scoring System for Identify Accurate Answers for Factoid Questions

  • 1. International Journal of Science and Research (IJSR), India Online ISSN: 2319-7064 Volume 2 Issue 9, September 2013 www.ijsr.net Novel Scoring System for Identify Accurate Answers for Factoid Questions 1 Harpreet Kaur, 2 Rimpi Kumari 1 Swami Vivekanand Institute of Engineering and Technology, Banur, Punjab, India Abstract: Question and Answer System (QAS) are some of the many challenges for natural language understanding and interfaces. In this paper we have develop a new scoring mathematical model that works on the five types of questions. The question text failures are first extracted and a score is found based on its structure with respect to its template structure and then answer score is calculated again the question as well as paragraph. A name entity recognizer and a Part of Speech tagger are applied on each of these words to encode necessary of information. After that the text to finally reach at the index of the most probable answer with respect to question. In this the entropy algorithm is used to find the exact answer. Keywords: Natural language processing, Question answering System, Information retrieval. 1. Introduction Questions answering (QA) systems look for the answer of a question in a large collection of documents. The question is in natural language. QA systems select text passages. Then, after that the answer is extracted from these passages, according to criteria issued from the question analysis. NLP focuses on communications between computers and natural languages in terms of theoretical results and practical applications, and on information sharing now that information is exchange as it never has been before and sharing information becoming the leading theme in the domain of NLP systems[2][3]. Automatic question answering system will help for the above technology. In this Question Answering System consists of three distinct phases: Question classification, information retrieval or document processing and answer extraction. 
The design of a standard QA system assumes that the language in which the question is asked and the text collection available to be processed are all in the same language. English QA system research attempts to deal with a wide range of question types like WHEN, WHERE, WHAT, HOW, WHOM, WHY & WHOSE. Thus the aim of a QA system is to localize the exact answer to a question from a structured or a non-structured collection of texts. Question Answering (QA) Systems allow the user to ask questions in a natural language and obtain an exact answer. In this, we tried to learn the important issues in the field of Question Answering (QA) systems. We peeked into the internals of many established QA systems. we do not only consider simple questions but text problems consisting of several sentences. Our approach to translating the natural language question uses an underlying corpus and the knowledge base to derive meaningful and relevant patterns which can then be used to process the questions and capture their meaning with respect to the underlying knowledge base. We classify the text based on their subject, verb, object and preposition for determining the possible type of questions to be generated. The ability of QA systems to recognize a great amount of answer types is related to their powerfulness for extracting right answers [5] [6] [8]. 2. Previous Work A survey of different QA techniques has been elaborated. Question answering system for Indian languages like Hindi, Telugu, Bengali and Punjabi is discussed. In Hindi language the Hindi QA system research attempts to deal with a wide range of question types like when, where, what time, how many[1][3]. The developed Question-Answering system in Hindi is using Hindi Shallow Parser. The shallow parser gives the analysis of the sentence in terms of the morphological analysis, POS tagging, Chunking etc. 
In Bengali language question and answering system is one of the Indo-Aryan languages of South Asia with over 200 million native speakers. A translation based on transliteration and a table look-up method is proposed as an interface to the actual QA task. The implementation part thus involves transliterating a Bangla question as an equivalent Latin alphabet (English) version that could be used in an actual QA task [2]. The Bangla lexicon consists of a good number of “loan-words” from Arabic, Persian, English and other languages. An approach to transform the Bangla question could be;  Tokenizing the transliterate version of the Bangla question,  Translating the remaining question by a simple table look-up method. 3. Methodology In this first we collect the corpus of data or paragraph from encyclopaedia to make the questions and find the exact answer show n fig1. Corpus is of two types: Questions and Paragraph. These questions have many types and these types are what when, where/which, who/whose/whom. After this with the help of these questions we make the question from paragraph then next step is the paragraph chunk and question score, the chunk paragraph is a format of writing, which forces you to expand on your ideas and explain your arguments. Paper ID: 15091303 294
  • 2. International Journal of Science and Research (IJSR), India Online ISSN: 2319-7064 Volume 2 Issue 9, September 2013 www.ijsr.net Figure 1: Flow of Question Answer It helps in skills writing development and the scores are calculated on the basis of the accuracy of the answers. After that the candidate put query or question and answer then the similarity score will be calculated this loop will continue for process till the best answer will be find. 4. Results 4.1 Mean Precision Percentage Values It is the fraction of relevant retrieved answers given by the question and answer system to the total number of retrieved answers given by the question and answer system. Mathematically, it is represented as: The average Precision, Recall and F-Score is shown in table: Table 1: Average Precision, Recall and F-Score of each question type (in Percentage) S. No Question Type Worst Case Average Precision Case Best Case Worst Case Average Recall Best Case Worst Case Average F- Score Best Case 1 What 70.6 82.604651 85.604 47.88 59.88372 62.88 29.27 32.2757 35.27 65116 65116 372 372 57 57 2 When 70.57 82.571429 85.571 51.57 63.57143 66.57 30.45 33.4577 36.45 42857 42857 143 143 77 77 3 Why 68.11 80.11 83.11 46.2 58.2 61.2 28.26 31.2683 34.26 83 83 4 Who 67.86 79.857143 82.857 42.3 54.3 57.3 26.9 29.9065 32.9 14286 14286 65 65 5 Where 63.7 75.7 78.7 49.2 61.2 64.2 28.36 31.3671 34.36 71 71 We have also considered the worst case scenario for analysis the working of the system for each factoid questions, in this we have found that in worst case the system typically find 7 questions corrects and 9 questions correctly in best possible case ‘when ‘what’ type of questions are explored and search on the input paragraph and similar is the case of other factoid types [4] [7]. 
The graph in Fig. 2 shows the mean precision for each factoid question type, which indicates how well the system searches for information relevant to the question when selecting the best answer from the set of candidate answers it predicts. Scores are calculated on the basis of the accuracy of the answer; a further level of precision can be gained by finding more common artifacts between the question tokens and the answer tokens. The results would be more precise with more common verbs, nouns, adjectives, adverbs and pronouns in both token sets, and with pattern matching using regular expressions.

Figure 2: Average Mean Precision of each question type (in Percentage)

4.2 Mean Recall Percentage Values
Recall is the number of relevant retrieved answers given by the question answering system divided by the total number of relevant answers that should have been retrieved. Mathematically, it is represented as:

$$\text{Recall} = \frac{|\{\text{relevant answers}\} \cap \{\text{retrieved answers}\}|}{|\{\text{relevant answers}\}|}$$

The recall percentage for each question type is shown in Fig. 3. The answer found by the question answering system can be more or less thorough than the actual answer, depending on the dataset provided. The number of possible answers for a query depends on the evaluator and the ground truth, and the answer expected by the evaluator may differ depending on the depth of search. As a result, the system achieves a good recall percentage, which follows from its high precision. Because of this high recall, the answers found in the paragraphs are quite similar in number and type, which makes it difficult to discriminate one set of answer tokens from another, similar set.
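Recall can be computed from the same two sets of answers as precision. The sketch below, under the assumption that answers are represented as sets of identifiers, also combines a precision value and a recall value into their harmonic mean, the F-measure defined in Section 4.3. The function names and the sample identifiers are hypothetical.

```python
from typing import Set


def recall(retrieved: Set[str], relevant: Set[str]) -> float:
    """Fraction of the relevant answers that were actually retrieved."""
    if not relevant:
        # Assumption: with no relevant answers, recall is reported as 0.
        return 0.0
    return len(retrieved & relevant) / len(relevant)


def f_measure(precision_value: float, recall_value: float) -> float:
    """Harmonic mean of precision and recall."""
    total = precision_value + recall_value
    if total == 0:
        return 0.0
    return 2 * precision_value * recall_value / total


if __name__ == "__main__":
    # Hypothetical example: three of five relevant answers are retrieved.
    retrieved = {"a1", "a2", "a3", "a4"}
    relevant = {"a1", "a2", "a3", "a5", "a6"}
    r = recall(retrieved, relevant)
    p = len(retrieved & relevant) / len(retrieved)
    print(f"recall = {r:.2f}, precision = {p:.2f}, F = {f_measure(p, r):.4f}")
```

Because the harmonic mean is dominated by the smaller of its two inputs, a system cannot reach a high F-measure by maximising only precision or only recall.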
Figure 3: Average Mean Recall of each question type (in Percentage)

4.3 F-measure Percentage Values
The F-measure can be calculated only if the precision and recall of the system are known. It is the harmonic mean of precision and recall. Mathematically, it is represented as:

$$F = \frac{2 \cdot \text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}}$$

Figure 4: Average F-Score of each question type (in Percentage)

5. Conclusion
Through this thesis work, we tried to learn the important issues in the field of Question Answering (QA) systems. We have added all types of questions. The scoring system can be used to improve a question answering system by checking all returned answers; however, it cannot be used alone to select the correct answer. Question answering systems have become an important component of online education platforms. Based on our research findings, we took the initiative of proposing a basic framework for a QA task for the English language [9]. The goal of a question answering system is to retrieve answers to questions rather than full documents or best-matching passages, as most information retrieval systems do.

6. Future Scope
In this research paper, we have added all types of questions: what, why, who/whom, when and where. We used the dataset and evaluated the performance of our system using recall and precision. Future work includes adding more question types and improving the coding system [10] [11]. We hope to carry these ideas forward and develop additional mechanisms for question generation based on the dependency features of the answers, and for answer finding.

References
[1] Susan Dumais, Michele Banko, Eric Brill, Jimmy Lin, Andrew Ng, "Web Question Answering: Is More Always Better?"
[2] Nafid Haque and Mike Rosner, "A prototype framework for a Bangla question answering system using translation based on transliteration and table look-up as an interface for the medical domain", University of Malta; Gertjan Van Noord, University of Groningen.
[3] Ashish Kumar Saxena, Ganesh Viswanath Sambhu, L. Venkata Subramaniam, Saroj Kaushik, "IITD-IBMIRL System for Question Answering using Pattern Matching, Semantic Type and Semantic Category Recognition", Oct 2007.
[4] Boris Katz and Jimmy Lin, "Selectively Using Relations to Improve Precision in Question Answering", MIT Artificial Intelligence Laboratory, 200 Technology Square, Cambridge, MA 02139.
[5] Arnaud Grappy, Brigitte Grau, "Answer type validation in question answering systems", Le Centre de Hautes Etudes Internationales d'Informatique Documentaire, Paris, France, 2010.
[6] S. M. Harabagiu, M. A. Paşca, and S. J. Maiorano, "Experiments with open-domain textual question answering", in Proceedings of the 18th Conference on Computational Linguistics, Morristown, NJ, USA, 2000. Association for Computational Linguistics.
[7] Matthew W. Bilotti and Eric Nyberg, "Improving Text Retrieval Precision and Answer Accuracy in Question Answering Systems", Language Technologies Institute, Carnegie Mellon University, 5000 Forbes Avenue, Pittsburgh, PA 15213, USA.
[8] E. Hovy, L. Gerber, U. Hermjakob, C.-Y. Lin, and D. Ravichandran, "Toward semantics-based answer pinpointing", in HLT '01: Proceedings of the First International Conference on Human Language Technology Research, Morristown, NJ, USA, 2001. Association for Computational Linguistics.
[9] Vanitha Guda, Suresh Kumar Sanampudi, and I. Lalkshmi Manikyamba, "Approaches for Question Answering Systems", International Journal of Engineering Science and Technology (IJEST), ISSN: 0975-5462, Vol. 3, 2011, pp. 990-995.
[10] C. Pinchak and D. Lin (2006), "A Probabilistic Answer Type Model", in Proceedings of the 11th Conference of the European Chapter of the Association for Computational Linguistics, pp. 393–400.
[11] S. Quarteroni and S. Manandhar, "Designing an Interactive Open-Domain Question Answering System", Journal of Natural Language Engineering 1, pp. 1-23.
[12] X. Li and D. Roth (2002), "Learning Question Classifiers", in Proceedings of the 19th International Conference on Computational Linguistics, pp. 1–7, Morristown, NJ, USA: Association for Computational Linguistics.

Author Profile
Harpreet Kaur is currently pursuing an M. Tech in Computer Science and Engineering at Swami Vivekanand Institute of Engineering & Technology, Banur, Punjab. She holds a B. Tech degree in Computer Science and Technology from Baba Banda Singh Bahadur Engineering and Technology, Fathegarh Sahib, Punjab.

Er. Rimpi is currently working as Assistant Professor in the Computer Science and Engineering Department at Swami Vivekanand Institute of Engineering and Technology, Banur. She completed her M. Tech in Computer Engineering at Guru Nanak Dev University, Amritsar, Punjab in 2011. She holds a B. Tech degree in Computer Science and Technology from Guru Nanak Dev University, Amritsar, Punjab, awarded in 2009.