Hypothesis Transformation and Semantic Variability Rules Used in RTE
Adrian Iftene, Alexandra Balahur-Dobrescu
adiftene@info.uaic.ro, abalahur@info.uaic.ro
"Al. I. Cuza" University, Iasi, Romania
Faculty of Computer Science

Overview
- System presentation
- Tools
- Resources
- Semantic variability rules
- Fitness calculation
- Results
- Peer-to-Peer architecture
- Conclusions and Future Work
Adrian Iftene & Alexandra Balahur-Dobrescu – "Al.I.Cuza" University of Iasi, Romania

System presentation
[Architecture diagram. Labelled components: Initial data; LingPipe module (named entities for (T, H) pairs); Minipar module (dependency trees for (T, H) pairs); Resources (DIRT, Acronyms, Background knowledge, WordNet); Wikipedia; Core Module 1, Core Module 2 and Core Module 3 on P2P computers; Final result.]

Tools - LingPipe
LingPipe (http://www.alias-i.com/lingpipe) is a suite of Java libraries for the linguistic analysis of human language. The major tools are for:
- Sentence
- Parts of Speech
- Named Entities
- Coreference
Example, hypothesis from pair 111:
Leloir was born in Argentina.
<ENAMEX TYPE="PERSON">Leloir</ENAMEX> was born in <ENAMEX TYPE="LOCATION">Argentina</ENAMEX>.

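The inline ENAMEX annotation shown in the LingPipe example can be turned back into a list of entities with a small amount of code. The following is a minimal sketch in Python, not part of the original system; the function name `extract_entities` and the regular expression are assumptions made for illustration.

```python
import re

# Matches inline annotations of the form <ENAMEX TYPE="...">text</ENAMEX>,
# tolerating optional spaces around the annotated text.
ENAMEX = re.compile(r'<ENAMEX TYPE="([^"]+)">\s*(.*?)\s*</ENAMEX>')


def extract_entities(tagged: str) -> list[tuple[str, str]]:
    """Return (entity text, entity type) pairs in order of appearance."""
    return [(text, etype) for etype, text in ENAMEX.findall(tagged)]


tagged = ('<ENAMEX TYPE="PERSON"> Leloir </ENAMEX> was born in '
          '<ENAMEX TYPE="LOCATION"> Argentina </ENAMEX>.')
entities = extract_entities(tagged)
print(entities)
```

For the hypothesis of pair 111 this yields the pair of entities Leloir (PERSON) and Argentina (LOCATION).
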
Tools - MINIPAR
MINIPAR (Lin, 1998) transforms the text and the hypothesis into dependency trees.
Example: Le Beau Serge was directed by Chabrol.
(
E0 (() fin C *)
1 (Le ~ U 3 lex-mod (gov Le Beau Serge))
2 (Beau ~ U 3 lex-mod (gov Le Beau Serge))
3 (Serge Le Beau Serge N 5 s (gov direct))
4 (was be be 5 be (gov direct))
5 (directed direct V E0 i (gov fin))
E2 (() Le Beau Serge N 5 obj (gov direct) (antecedent 3))
6 (by ~ Prep 5 by-subj (gov direct))
7 (Chabrol ~ N 6 pcomp-n (gov by))
8 (. ~ U * punc)
)
[Dependency tree figure: the root direct (V) has the children Le_Beau_Serge (N) via s, be (be) via be, Chabrol (N) via by, and Le_Beau_Serge (N) via obj; Le (U) and Beau (U) attach to Le_Beau_Serge through lex-mod.]

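To make the MINIPAR output easier to read, the rows can be regrouped by their head index and printed as an indented tree. This is a sketch under stated assumptions: the rows are copied by hand from the example rather than parsed from MINIPAR's text format, `~` lemmas are replaced by the word itself, the empty node E2 and the punctuation row are left out, and the helper names `build_children` and `render` are hypothetical.

```python
from collections import defaultdict

# (id, word, lemma, category, head id, relation), copied by hand from the example.
ROWS = [
    ("1", "Le", "Le", "U", "3", "lex-mod"),
    ("2", "Beau", "Beau", "U", "3", "lex-mod"),
    ("3", "Serge", "Le Beau Serge", "N", "5", "s"),
    ("4", "was", "be", "be", "5", "be"),
    ("5", "directed", "direct", "V", "E0", "i"),
    ("6", "by", "by", "Prep", "5", "by-subj"),
    ("7", "Chabrol", "Chabrol", "N", "6", "pcomp-n"),
]


def build_children(rows):
    """Group nodes under the id of their head."""
    children = defaultdict(list)
    for node_id, _word, lemma, cat, head, rel in rows:
        children[head].append((rel, lemma, cat, node_id))
    return children


def render(children, head, depth=0):
    """Return the subtree below `head` as indented 'relation: lemma (category)' lines."""
    lines = []
    for rel, lemma, cat, node_id in children.get(head, []):
        lines.append("  " * depth + f"{rel}: {lemma} ({cat})")
        lines.extend(render(children, node_id, depth + 1))
    return lines


tree = build_children(ROWS)
print("\n".join(render(tree, "E0")))
```

Note that the tree drawn on the slide attaches Chabrol to direct through a single "by" edge, whereas this sketch keeps the preposition as its own node, as in the raw parser output.
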
Resources
- DIRT (Discovery of Inference Rules from Text)
- eXtended WordNet
- Acronyms
- Background Knowledge

Resources – DIRT
DIRT is both an algorithm and a resulting knowledge collection (Lin and Pantel, 2001).
Inference rules for "X solves Y":
- Y is solved by X
- X resolves Y
- X finds a solution to Y
- X tries to solve Y
- X deals with Y
- Y is resolved by X
- …
Example: Le Beau Serge was directed by Chabrol
- N:s:V<direct>V:by:N
- N:obj:V<direct>V:by:N
- N:s:V<direct>V:
- :V<direct>V:by:N
- :V<direct>V:by:N
- N:obj:V<direct>V:

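The six patterns listed for "Le Beau Serge was directed by Chabrol" can be reproduced by combining each left relation of the verb (s, obj) with its right relation (by) and emitting one full pattern plus two partial ones. The sketch below is an assumption about how such strings could be generated; the slide does not state the procedure, and `dirt_patterns` is a hypothetical name.

```python
def dirt_patterns(verb: str, left_rels: list[str], right_rels: list[str]) -> list[str]:
    """Build full and partial path patterns around a verb lemma."""
    core = f"V<{verb}>V"
    patterns = []
    for left in left_rels:
        for right in right_rels:
            patterns.append(f"N:{left}:{core}:{right}:N")  # both slots filled
            patterns.append(f"N:{left}:{core}:")           # left slot only
            patterns.append(f":{core}:{right}:N")          # right slot only
    return patterns


patterns = dirt_patterns("direct", ["s", "obj"], ["by"])
for p in patterns:
    print(p)
```

The right-only pattern appears twice because it is emitted once for each left relation, which agrees with the repeated entry on the slide.
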
Resources – DIRT (cont.)
Pair 37:
T: She was transferred again to Navy when the American Civil War began, 1861.
H: The American Civil War started in 1861.
H': The American Civil War began in 1861.
Left-left relations similarity
[Diagram: HypothesisVerb has relation1 to its left subtree and relation2 to its right subtree; TextVerb has relation1 to its left subtree and relation3 to its right subtree.]

Resources – DIRT (cont.)
Pair 161:
T: The demonstrators, convoked by the solidarity with Latin America committee, verbally attacked Salvadoran President Alfredo Cristiani.
H: President Alfredo Cristiani was attacked by demonstrators.
H': Demonstrators attacked President Alfredo Cristiani.
Left-right relations similarity
[Diagram: HypothesisVerb has relation1 to its left subtree and relation2 to its right subtree; TextVerb has relation3 to its left subtree and relation1 to its right subtree.]

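The two DIRT cases, pair 37 (left-left) and pair 161 (left-right), differ only in whether the subtrees of the hypothesis verb keep their sides when the verb is replaced. A rough sketch of that idea follows, assuming the subtrees are flattened to plain strings; the `Clause` type and `apply_rule` function are illustrative and not taken from the original system.

```python
from typing import NamedTuple


class Clause(NamedTuple):
    left: str   # left subtree of the verb, flattened to text
    verb: str
    right: str  # right subtree of the verb, flattened to text


def apply_rule(hyp: Clause, text_verb: str, similarity: str) -> Clause:
    """Replace the hypothesis verb with the text verb.

    "left-left": the matching relation is on the same side, subtrees stay put.
    "left-right": the matching relation is on the opposite side, subtrees swap.
    """
    if similarity == "left-left":
        return Clause(hyp.left, text_verb, hyp.right)
    if similarity == "left-right":
        return Clause(hyp.right, text_verb, hyp.left)
    raise ValueError(f"unknown similarity type: {similarity}")


pair_37 = apply_rule(
    Clause("The American Civil War", "started", "in 1861"), "began", "left-left")
pair_161 = apply_rule(
    Clause("President Alfredo Cristiani", "was attacked by", "demonstrators"),
    "attacked", "left-right")
print(" ".join(pair_37))
print(" ".join(pair_161))
```

Apart from capitalisation, the two results correspond to the transformed hypotheses H' of pairs 37 and 161.
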
Resources – eXtended WordNet
For every synonym, we check to see which word appears in the text tree, and select the mapping with the best value according to the values from eXtended WordNet (http://xwn.hlt.utdallas.edu/downloads.html).
For example, the relation between "relative" and "niece" is made with a score of 0.078652.

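Selecting the best mapping amounts to filtering the candidate words by presence in the text tree and taking the highest score. A minimal sketch, assuming the candidates for a hypothesis word are available as a word-to-score dictionary; only the "niece" score comes from the slide, the "cousin" entry and the word set are made up for illustration.

```python
from typing import Optional


def best_mapping(candidates: dict[str, float],
                 text_words: set[str]) -> Optional[tuple[str, float]]:
    """Return the highest-scoring candidate that occurs in the text, or None."""
    present = {word: score for word, score in candidates.items() if word in text_words}
    if not present:
        return None
    return max(present.items(), key=lambda item: item[1])


# Candidates related to the hypothesis word "relative".
candidates = {
    "niece": 0.078652,   # score quoted on the slide
    "cousin": 0.05,      # hypothetical value, for illustration only
}
text_words = {"his", "niece", "visited", "him"}  # hypothetical text tree words
print(best_mapping(candidates, text_words))
```
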
Resources - Acronyms
The acronyms' database (http://www.acronym-guide.com) helps our program in finding relations between the acronym and its meaning: "US - United States".

Resources – Background Knowledge
Snippets extracted from Wikipedia for "Argentine":
- ar |calling_code = 54 |footnotes = Argentina also has a territorial dispute
- Argentina', , Nación Argentina (Argentine Nation) for many legal purposes), is in the world.
- Argentina occupies a continental surface area of
- Argentina national football team
Resulting relation: Argentine [is] Argentina
The patterns used are usually "definition" patterns:
- verbs like "is", "define", "represent", etc.
- punctuation context: , " ' () [] :
- anaphora resolution
Examples of extracted relations:
- Netherlands [is] Dutch
- Netherlands [is] Nederlandse
- Netherlands [is] Antillen
- Netherlands [in] Europe
- Netherlands [is] Holland
- Antilles [in] Netherlands
- Chinese [in] China
- Los Angeles [in] California
- 2 [is] two

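Relations written as "X [is] Y" or "X [in] Y" are easy to load into a lookup table keyed by entity and relation type. The following sketch assumes one relation per line in exactly that notation; the storage format used by the original system is not described, and `load_relations` is a hypothetical helper.

```python
import re
from collections import defaultdict

# One relation per line, e.g. "Netherlands [is] Holland".
RELATION = re.compile(r"^(.+?) \[(is|in)\] (.+)$")


def load_relations(lines):
    """Map (entity, relation) to the set of related values."""
    table = defaultdict(set)
    for line in lines:
        match = RELATION.match(line.strip())
        if match:
            entity, rel, value = match.groups()
            table[(entity, rel)].add(value)
    return table


bk = load_relations([
    "Argentine [is] Argentina",
    "Netherlands [is] Dutch",
    "Netherlands [is] Holland",
    "Netherlands [in] Europe",
    "Los Angeles [in] California",
    "2 [is] two",
])
print(sorted(bk[("Netherlands", "is")]))
```
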
Semantic Variability Rules
- Negation rule: given by terms like "no", "not", "never"
- Modal verbs: "may", "might", "cannot", "should", "could"
- Certain cases for the particle "to" when it precedes:
  - a verb like "allow", "impose", "galvanize"
  - an adjective like "necessary", "compulsory", "free"
  - a noun like "attempt", "trial"
- Influence of context:
  - Positive words: "certainly", "absolutely"
  - Negative words: "probably", "likely"

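As an illustration of how the word lists above could be applied, the sketch below tags the tokens of a sentence that belong to one of the lists. It is deliberately simplified and rests on assumptions: it only uses the example words quoted on the slide, it ignores the rule for the particle "to", and it does not model how the original system attaches these markers to verbs.

```python
NEGATION = {"no", "not", "never"}
MODAL = {"may", "might", "cannot", "should", "could"}
POSITIVE_CONTEXT = {"certainly", "absolutely"}
NEGATIVE_CONTEXT = {"probably", "likely"}

CATEGORIES = [
    ("negation", NEGATION),
    ("modal", MODAL),
    ("positive context", POSITIVE_CONTEXT),
    ("negative context", NEGATIVE_CONTEXT),
]


def find_markers(tokens):
    """Return (token, category) for every token found in one of the word lists."""
    found = []
    for token in tokens:
        lowered = token.lower()
        for name, words in CATEGORIES:
            if lowered in words:
                found.append((token, name))
                break
    return found


markers = find_markers("He might never return".split())
print(markers)
```
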
Fitness calculation 1
Local Fitness:
- 1 at direct mapping, Acronyms, BK
- DIRT score
- eXtended WordNet score
Extended Local Fitness:
- Local Fitness
- Parent Fitness
- Mapping of edge label
- Node Position (left or right)
[Diagram: a hypothesis tree node is mapped onto the text tree through node mapping, father mapping and edge label mapping.]

Fitness calculation 2
- Total Fitness
- The Negation Value
- Threshold value = 2.06

Fitness calculation 3
T: The French railway company SNCF is cooperating in the project.
H: The French railway company is called SNCF.

| Initial entity | Node Fitness | Extended local fitness |
|---|---|---|
| (the, company, det) | 1 | 3.125 |
| (French, company, nn) | 1 | 3.125 |
| (railway, company, nn) | 1 | 3.125 |
| (company, call, s) | 1 | 2.5 |
| (be, call, be) | 1 | 4 |
| (call, -, -) | 0.096 | 3.048 |
| (company, call, obj) | 1 | 1.125 |
| (SNCF, call, desc) | 1 | 2.625 |

Total_Fitness = (3.125 + 3.125 + 3.125 + 2.5 + 4 + 3.048 + 1.125 + 2.625)/8 = 22.673/8 = 2.834
Positive_Verbs_Number = 1/1 = 1
GlobalFitness = 1*2.834 + (1–1)*(4–2.834) = 2.834

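The arithmetic of this worked example can be checked in a few lines. The sketch below reads the global fitness as NV * Total + (1 - NV) * (4 - Total), where NV is the negation value (here Positive_Verbs_Number = 1); this generalisation and the parameter name `upper` for the constant 4 are assumptions inferred from the single example, and comparing the score against the threshold 2.06 as "above means entailment" is likewise an assumption.

```python
# Extended local fitness values from the example table, in row order.
extended_local_fitness = [3.125, 3.125, 3.125, 2.5, 4, 3.048, 1.125, 2.625]

THRESHOLD = 2.06  # threshold value quoted in the slides


def total_fitness(values):
    """Mean of the extended local fitness values."""
    return sum(values) / len(values)


def global_fitness(total, negation_value, upper=4.0):
    """Combine total fitness with the negation value (assumed general form)."""
    return negation_value * total + (1 - negation_value) * (upper - total)


total = total_fitness(extended_local_fitness)
negation_value = 1 / 1  # Positive_Verbs_Number in the example
score = global_fitness(total, negation_value)

print(round(total, 3), round(score, 3), score > THRESHOLD)
```

With a negation value of 1 the global fitness equals the total fitness, so the example pair scores 2.834, which is above 2.06.
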
Results 1

| Run | IE | IR | QA | SUM | Global |
|---|---|---|---|---|---|
| Run01 | 0.57 | 0.69 | 0.87 | 0.635 | 0.6913 |
| Run02 | 0.57 | 0.685 | 0.865 | 0.645 | 0.6913 |

Component relevance:

| System Description | Precision | Relevance |
|---|---|---|
| Without DIRT | 0.6876 | 0.54 % |
| Without WordNet | 0.6800 | 1.63 % |
| Without Acronyms | 0.6838 | 1.08 % |
| Without BK | 0.6775 | 2.00 % |
| Without SVR | 0.6763 | 2.17 % |
| Without NEs | 0.5758 | 16.71 % |

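The Relevance column appears to be the relative drop in precision against the full system's global score of 0.6913. The slide does not give the formula, so the sketch below is an inference; it is offered because it reproduces all six listed percentages after rounding to two decimals.

```python
FULL_SYSTEM = 0.6913  # global precision of the complete system

# Precision obtained when one component is removed.
ablations = {
    "DIRT": 0.6876,
    "WordNet": 0.6800,
    "Acronyms": 0.6838,
    "BK": 0.6775,
    "SVR": 0.6763,
    "NEs": 0.5758,
}


def relevance(full, ablated):
    """Relative precision loss, in percent, when a component is removed."""
    return (full - ablated) / full * 100


relevances = {name: round(relevance(FULL_SYSTEM, p), 2) for name, p in ablations.items()}
for name, value in relevances.items():
    print(f"Without {name}: {value:.2f} %")
```
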
Result 2
Pilot task: Yes, No/Unknown + answer justification

| System | Accuracy | F(b=1/3) | Precision | Recall |
|---|---|---|---|---|
| System 1 | 0.569 | 0.595 | 0.547 | 0.805 |
| System 2 | 0.471 | 0.475 | 0.437 | 0.643 |

Table over 12 submitted runs:

| | Accuracy | F(beta=1/3) |
|---|---|---|
| min | 0.365 | 0.211 |
| median | 0.471 | 0.475 |
| max | 0.731 | 0.753 |

Mean [understandability correctness]: [4.1 2.0] [4.3 2.8]* [4.1 1.5] [2.7 1.2] [3.2 1.5] [3.1 1.5]

Peer-to-Peer Architecture
- Speed optimization: P2P architecture, cache mechanism
- Ending synchronization
- Quota mechanism
[Diagram labels: Initiator, six CM nodes, DIRT db, Acronyms, SMB upload, SMB download.]

Results

| No | Run details | Duration |
|---|---|---|
| 1 | One computer without caching mechanism | 5:28:45 |
| 2 | One computer with caching mechanism, but with empty cache at start | 2:03:13 |
| 3 | One computer with full cache at start | 0:00:41 |
| 4 | 5 computers with 7 processes | 0:00:06.7 |

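To put the four durations in proportion, the speedup of each configuration over run 1 can be computed directly. This is only arithmetic on the reported figures; the short run labels are made up for the sketch.

```python
from datetime import timedelta

# Durations as reported for runs 1 to 4.
runs = {
    "no cache": timedelta(hours=5, minutes=28, seconds=45),
    "empty cache at start": timedelta(hours=2, minutes=3, seconds=13),
    "full cache at start": timedelta(seconds=41),
    "5 computers, 7 processes": timedelta(seconds=6.7),
}

baseline = runs["no cache"]
# Dividing one timedelta by another gives a plain float ratio.
speedups = {name: baseline / duration for name, duration in runs.items()}

for name, factor in speedups.items():
    print(f"{name}: {factor:.1f}x")
```

On these figures a warm cache on one computer is roughly 480 times faster than the uncached run, and the five-computer setup is roughly 2900 times faster.
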
Conclusions
- The core of our approach is based on a tree edit distance algorithm (Kouylekov, Magnini, 2005).
- The main idea is to transform the hypothesis using sources such as DIRT, WordNet, Wikipedia and the Acronyms database.
- Additionally, we built a system to acquire the extra background knowledge and applied complex grammar rules for rephrasing in English.
- At each step, we analysed the influence of the resources used, and identified and addressed new subproblems.

Future work
- Search for a method to establish more precise values for penalties
- The multiplication coefficients for the parameters in the extended local fitness
- Using machine learning to establish the global threshold
- Inserting the Textual Entailment system as part of a Question Answering system
- Building a Romanian Textual Entailment System

Acknowledgments
- Pre-processing: Daniel Matei
- NLP group of Iasi:
  - Coordinator: Prof. Dan Cristea
  - Diana Trandabat, Corina Forascu, Ionut Pistol, Marius Raschip
- Anaphora resolution group: Iustin Dornescu, Alex Moruz, Gabriela Pavel

THANK YOU!
