Barbara Rychalska, Katarzyna Pakulska, Krystyna Chodorowska,
Wojciech Walczak, Piotr Andruszkiewicz
Paraphrase Detection Ensemble
SemEval 2016 winner
1st place in the English Semantic Textual Similarity (STS) Task
Agenda
1. What is SemEval?
2. Neural networks – what are they?
3. Vector word representations – what are they?
4. Our solution in SemEval 2016
What is SemEval?
• SemEval (Semantic Evaluation) is an ongoing series of evaluations of computational semantic analysis
systems.
• Umbrella organization: SIGLEX, the Special Interest Group on the Lexicon of the Association for Computational Linguistics
Tasks:
Track I. Textual Similarity and Question
Answering Track
• Task1: Semantic Textual Similarity
• Task2: Interpretable Semantic
Textual Similarity
Track II. Sentiment Analysis Track
Track III. Semantic Parsing Track
Track IV. Semantic Analysis Track
Track V. Semantic Taxonomy Track
Workflow: Input – candidate paraphrases → Annotation by linguists (SemEval team) and by computer systems (competitor teams) → Comparison & results.
Paraphrase Detection
1. Cats eat mice. / Cats catch mice.
2. Boys play football. / Girls play soccer.
3. British PM signed deal. / Chinese president visited Britain.
Score: 4.60 – almost perfect paraphrase!
Score: 0.22 – remotely similar topic, but that’s it.
Score: 1.58 – some common elements, but generally no semantic similarity.
Real-life example:
1. Inheritance in object oriented programming is a way to form new classes using classes that have already been
defined.
2. The peropos of inheritance in object oriented programming is to minimize the reuse of existing code without
modification.
But first… What are neural networks?
A single neuron
A computational unit with 3 inputs and 1 output.
W and b are parameters.
[Diagram: the three inputs are weighted by W1, W2, W3 and combined with the bias b to produce the output.]
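To make this concrete, here is a minimal NumPy sketch of such a neuron; the weight and input values are made up for illustration:

```python
import numpy as np

def sigmoid(z):
    # a squashing non-linearity; tanh or ReLU would work the same way
    return 1.0 / (1.0 + np.exp(-z))

w = np.array([0.5, -1.2, 0.3])   # the weights W1, W2, W3 (learned parameters)
b = 0.1                          # the bias b (learned parameter)

x = np.array([1.0, 0.0, 2.0])    # the 3 inputs
output = sigmoid(np.dot(w, x) + b)
print(output)                    # a single number: the neuron's output
```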
Neural Network
…. just like running many classifiers together
[Diagram: inputs x1, x2, x3 plus a bias unit (1) feed Layer 1; Layer 1 plus a bias unit feeds Layer 2, which produces the output.]
Each classifier (neuron) learns its own thing.
We don’t have to specify what they will learn – they „choose” it during training.
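A minimal sketch of the same picture in code, assuming tanh activations and random (untrained) weights:

```python
import numpy as np

def layer(x, W, b):
    # one layer = many neurons run together; each row of W is one neuron's weights
    return np.tanh(W @ x + b)

rng = np.random.default_rng(0)
x = np.array([0.2, -0.5, 1.0])                  # inputs x1, x2, x3

W1, b1 = rng.normal(size=(4, 3)), np.zeros(4)   # Layer 1: 4 neurons over the 3 inputs
W2, b2 = rng.normal(size=(1, 4)), np.zeros(1)   # Layer 2: 1 neuron over Layer 1

h = layer(x, W1, b1)        # Layer 1 activations
output = layer(h, W2, b2)   # final output
print(output)
```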
And second… What are word vector representations?
Traditional approach: words as atomic symbols.
In vector space, such a word looks like this:
[ 0 0 0 0 0 0 1 0 0 0 0 ]
a sparse vector (many zeros, a single 1) whose length equals the size of the full dictionary.
The problems:
• Dimensionality: vocabularies reach up to a few dozen million words (and as many vectors) – millions of zeros with a single 1.
• No semantic similarity is represented:
Motel: [ 0 0 0 0 0 1 0 ]
Hotel: [ 1 0 0 0 0 0 0 ]
„Motel” · „Hotel” = 0 – the vectors share no non-zero position.
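A quick sketch of the problem with a toy dictionary (the words and their order are chosen arbitrarily):

```python
import numpy as np

vocab = ["a", "cat", "hotel", "mat", "motel", "on", "sat"]   # toy dictionary

def one_hot(word):
    v = np.zeros(len(vocab))
    v[vocab.index(word)] = 1.0
    return v

motel, hotel = one_hot("motel"), one_hot("hotel")
print(np.dot(motel, hotel))   # 0.0 – the vectors never overlap, so no similarity is captured
```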
How to make it better? An idea
Similar words appear in similar neighborhoods.
Represent words with respect to their neighbors.
„You shall know a word by the company it keeps.” – J. R. Firth, 1957
[Diagram: a context window of length 5 on either side of the word.]
The word representation vector
Visualization in space
The word representation vector
• The vector representation can represent deep syntactic relations. Syntactically, x_apple – x_apples = x_car – x_cars, and this is represented in the spatial relations between the vectors (illustrated in the sketch below)!
• Vectors of similar words in two languages tend to be located close to each other.
• It is possible to train representations of images in a similar way; they tend to be mapped next to their word meaning.
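A toy illustration of the apple/apples vs. car/cars relation; the 3-dimensional vectors below are invented, and real embeddings would come from a trained model such as word2vec or GloVe:

```python
import numpy as np

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# made-up toy vectors; only their relative positions matter for the example
emb = {
    "apple":  np.array([0.9, 0.1, 0.0]),
    "apples": np.array([0.9, 0.1, 0.8]),
    "car":    np.array([0.1, 0.9, 0.0]),
    "cars":   np.array([0.1, 0.9, 0.8]),
}

offset_apple = emb["apple"] - emb["apples"]   # singular -> plural direction
offset_car = emb["car"] - emb["cars"]
print(cosine(offset_apple, offset_car))       # ≈ 1.0: the two offsets point the same way
```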
How do we learn them?
We use pairs of training examples:
Positive example: a word in its correct context
Negative example: a random word in the same context
Cat chills on a mat
Cat chills Jeju a mat
Learning target: score for good examples should be greater than for bad examples.
score(Cat chills on a mat) > score(Cat chills Jeju a mat)
However, there are many more methods to learn such vectors, not all of them neural network-based.
[Diagram: the word vectors are the input to the neural network and are themselves updated during training.]
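A minimal sketch of this training objective (a margin ranking loss in the style of Collobert & Weston); the network shape and the random vectors are illustrative only:

```python
import numpy as np

rng = np.random.default_rng(0)
dim, win = 4, 5                        # toy embedding size and window length
W = rng.normal(size=(8, dim * win))    # hidden layer weights
b = np.zeros(8)
u = rng.normal(size=8)                 # scoring vector

def score(window):
    # window: list of word vectors; concatenate, pass through one hidden layer, score
    h = np.tanh(W @ np.concatenate(window) + b)
    return float(u @ h)

good = [rng.normal(size=dim) for _ in range(win)]   # "Cat chills on a mat"
bad = list(good)
bad[2] = rng.normal(size=dim)                       # "Cat chills Jeju a mat"

# push score(good) above score(bad) by a margin; the gradient of this loss
# updates both the network and the word vectors
loss = max(0.0, 1.0 - score(good) + score(bad))
print(loss)
```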
Semantic similarity with word embeddings
Does the lady dance? = Is this woman dancing?
For each sentence, compute the average of its word vectors.
Then compute the distance between the whole-sentence vectors.
The problem with this?
No grammatical relations are represented…
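A sketch of this bag-of-embeddings similarity with a hand-made toy lookup; a real system would load pretrained vectors instead:

```python
import numpy as np

# toy embedding lookup; values are invented for illustration
emb = {
    "lady":    np.array([0.8, 0.1, 0.2]),
    "woman":   np.array([0.7, 0.2, 0.2]),
    "dance":   np.array([0.1, 0.9, 0.3]),
    "dancing": np.array([0.2, 0.8, 0.3]),
}

def sentence_vector(sentence):
    # average the vectors of the words we have embeddings for
    vectors = [emb[w] for w in sentence.lower().split() if w in emb]
    return np.mean(vectors, axis=0)

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

s1 = sentence_vector("Does the lady dance ?")
s2 = sentence_vector("Is this woman dancing ?")
print(cosine(s1, s2))   # high similarity – but word order and grammar are ignored
```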
Paraphrase Detector: rough outline
Input: sentence1, sentence2
1. Training the autoencoder
2. Computing sentence similarity matrices
3. Adding WordNet knowledge (awards & punishments)
4. Converting to a fixed-sized feature vector
5. Adding extra features to the ensemble (from complementary parts of the system)
Output: SCORE
Sentence: The women swimming in the morning
[Diagram: two parses of the sentence – the natural parse tree and an artificially binarized parse tree (a possibility – not an actual tree) – with nodes such as „The women”, „swimming”, „in the morning”.]
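Recursive autoencoders work on binary trees, so an ordinary parse has to be binarized first. One way to see the effect is NLTK's Chomsky-normal-form transform; the bracketing below is hand-written for the example sentence, not the authors' actual parse:

```python
from nltk import Tree

# a hand-written parse of "The women swimming in the morning"
t = Tree.fromstring(
    "(NP (DT The) (NNS women)"
    " (VP (VBG swimming) (PP (IN in) (NP (DT the) (NN morning)))))"
)
t.chomsky_normal_form()   # binarize in place: every node keeps at most two children
t.pretty_print()
```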
[Example pooling-matrix cells holding node-to-node distances such as 0.2, 3.44, 5.64, 7.23.]
A pooling matrix is built by computing each node’s distance to all other nodes.
Then, a fixed-size mini-matrix is extracted by dividing the pooling matrix into a fixed grid and keeping the minimum value of each region.
The minimum is selected because it signifies the smallest distance – and thus the maximum similarity between nodes.
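A sketch in the spirit of dynamic pooling (Socher et al., 2011), which this step follows; the grid size and the toy distance matrix are made up:

```python
import numpy as np

def min_pool(distances, size=4):
    # split the variable-sized node-distance matrix into a fixed size x size grid
    # and keep the minimum distance (i.e. the maximum similarity) in each region
    n, m = distances.shape
    rows = np.array_split(np.arange(n), size)
    cols = np.array_split(np.arange(m), size)
    pooled = np.empty((size, size))
    for i, r in enumerate(rows):
        for j, c in enumerate(cols):
            pooled[i, j] = distances[np.ix_(r, c)].min()
    return pooled

# toy matrix of distances between the nodes of two parse trees
D = np.random.default_rng(0).uniform(0.0, 8.0, size=(9, 7))
features = min_pool(D, size=4).flatten()   # fixed-size feature vector for the classifier
print(features.shape)                      # (16,)
```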
Effects of pooling
http://web.eecs.umich.edu/~mihalcea/papers/agirre.semeval16.pdf
Results
The competitors:
• National Institute for Text Mining, UK
• German Research Center for Artificial Intelligence, Germany
• Institute of Software, Chinese Academy of Sciences, China
• University of Colorado Boulder, USA
• Toyota Technological Institute
…and others. In total there were over 100 runs submitted.
Bibliography
• Barbara Rychalska, Katarzyna Pakulska, Krystyna Chodorowska, Wojciech Walczak and Piotr
Andruszkiewicz; Samsung Poland NLP Team at SemEval-2016 Task 1: Necessity for diversity;
combining recursive autoencoders, WordNet and ensemble methods to measure semantic
similarity.
• Our presentation at IPI PAN: http://zil.ipipan.waw.pl/seminarium-archiwum?action=AttachFile&do=view&target=2016-10-10.pdf
• Some ideas and images in this presentation: http://www.socher.org/index.php/DeepLearningTutorial/DeepLearningTutorial
• http://colah.github.io/posts/2014-07-NLP-RNNs-Representations/