TagBag: Annotating a Foreign Language Lexical Resource with Pictures
Dmitry Ustalov
IMM UB RAS / UrFU
Yekaterinburg, Russia
Outline
•Introduction
•Related Work
•Approach
•Evaluation
•Results
•Discussion
•Conclusion
Introduction
•Mapping images to word senses
is important for:
• multimedia search,
• text illustration,
• quality assessment.
•It is also interesting to assess the
Yet Another RussNet lexical resource
(Braslavski et al., 2014).
Related Work
• PicNet, a proprietary resource
(Mihalcea & Leong, 2008).
• ImageNet annotates WordNet with
pictures & bounding boxes
(Deng et al., 2009).
• Intersection with WordNet.ru is negligible.
• ImageCLEF creates software and datasets
for image indexing (Müller et al., 2010).
Related Work: Flickr
•Single-query image retrieval
(Reiter et al., 2007).
•Semantic Web-based approach
(Trojahn et al., 2008).
•Wikipedia-based approach
(Stampouli et al., 2010).
•Flickr tags with visual saliency of
images (Jiang et al., 2014).
Problem
Given an annotated image I, a bilingual
dictionary B, and a lexical resource S,
produce a mapping I → s, where s is a synset of S.
“cat”, “tomcat”, “kitten” →
«кот, кошка, котёночек» (tomcat, cat, little kitten)
TagBag: Assumptions
•Most image tags are nouns.
•Tags may be polysemous, and
redundant tags may be present.
• “crane” is «журавль» (the bird) or «кран» (the machine)?
•The image has a “main” object.
TagBag
•Tag. Initialize an empty vector.
• Iterate over image tags and retrieve all
the translations for each tag.
• Count each translation occurrence in its own dimension.
•Bag. Prune that vector.
• Remove the low-frequency dimensions
using a cut-off value.
• Return the resulting vector.
TagBag: Pseudocode
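A minimal Python sketch of the Tag and Bag steps listed on the previous slide. The function name `tagbag`, the `cutoff` parameter, the "keep counts at or above the cut-off" rule, and the toy dictionary are assumptions made for illustration; they are not taken from the original pseudocode.

```python
from collections import Counter


def tagbag(tags, dictionary, cutoff=2):
    """Translate image tags and keep the frequent translations.

    tags       -- iterable of English tags attached to one image
    dictionary -- mapping from an English word to a list of translations
    cutoff     -- assumed rule: a dimension survives if its count >= cutoff
    """
    # Tag step: start from an empty sparse vector and count every
    # translation of every tag as one occurrence in its dimension.
    vector = Counter()
    for tag in tags:
        for translation in dictionary.get(tag.lower(), []):
            vector[translation] += 1

    # Bag step: drop the dimensions whose frequency is below the cut-off.
    return {word: count for word, count in vector.items() if count >= cutoff}


# Toy bilingual dictionary, invented for illustration only.
toy_dictionary = {
    "cat": ["кот", "кошка"],
    "tomcat": ["кот"],
    "kitten": ["котёнок", "кошка"],
    "crane": ["журавль", "кран"],
}

bag = tagbag(["cat", "tomcat", "kitten", "crane"], toy_dictionary)
print(bag)  # {'кот': 2, 'кошка': 2}
```

In this toy run the stray tag “crane” contributes two translations with a single occurrence each, so the pruning step removes both.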
Evaluation
•The present approach is pretty simple.
Let’s evaluate it empirically.
•Take the top 1500 English nouns and
search for Flickr photos.
http://www.talkenglish.com/Vocabulary/Top-1500-Nouns.aspx
•Use V. K. Mueller’s English-Russian dictionary.
http://ustalov.imm.uran.ru/pub/mueller.tar.gz
Experimental Setup
•Yet Another RussNet (CC BY-SA).
http://russianword.net/
•Similarity measures:
• cosine similarity,
• Jaccard index.
•Ask three annotators to submit judgements.
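A minimal sketch of how the two similarity measures could rank candidate synsets against a pruned tag vector. It assumes that a synset is treated as a binary vector over its words and that the best-scoring synset wins; the function names and the toy synsets are hypothetical.

```python
import math


def cosine(vector, synset):
    """Cosine similarity between a weighted tag vector and a binary synset vector."""
    words = set(synset)
    norm_vector = math.sqrt(sum(weight ** 2 for weight in vector.values()))
    if norm_vector == 0 or not words:
        return 0.0
    # Only the words shared with the synset contribute to the dot product.
    dot = sum(weight for word, weight in vector.items() if word in words)
    return dot / (norm_vector * math.sqrt(len(words)))


def jaccard(vector, synset):
    """Jaccard index between the vector's dimensions and the synset's words."""
    a, b = set(vector), set(synset)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def best_synset(vector, synsets, measure):
    """Return the candidate synset with the highest similarity."""
    return max(synsets, key=lambda synset: measure(vector, synset))


# Toy data, invented for illustration only.
vector = {"кот": 2, "кошка": 2}
synsets = [("кот", "кошка", "котёночек"), ("журавль",), ("кран",)]

print(best_synset(vector, synsets, cosine))
print(best_synset(vector, synsets, jaccard))
```

Cosine similarity uses the translation counts as weights, while the Jaccard index only looks at which words are present.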
призрак, тень, намёк (ghost, shadow, hint)
https://www.flickr.com/photos/127324269@N03/16217604730
труд, работа, занятие (labour, work, occupation)
https://www.flickr.com/photos/79304587@N07/16192772090
мужчина, парень, юноша (man, guy, youth)
https://www.flickr.com/photos/94029069@N03/15797009873
футбол (football)
https://www.flickr.com/photos/113780395@N05/15789001293
пища, провизия, питание, корм (food, provisions, nutrition, feed)
https://www.flickr.com/photos/80972943@N00/16396295195
Results
•The accuracy is moderately high and
the agreement level is good.
•Both measures demonstrate the same
performance.
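The slides do not say how accuracy and agreement were computed. A minimal sketch, assuming binary judgements from three annotators, a majority vote for accuracy, and Fleiss' kappa for agreement. Both choices are assumptions, and the judgements below are invented toy data, not the experimental results.

```python
def majority_accuracy(judgements):
    """Share of mappings that more than half of the annotators marked correct."""
    accepted = sum(1 for votes in judgements if 2 * sum(votes) > len(votes))
    return accepted / len(judgements)


def fleiss_kappa(judgements):
    """Fleiss' kappa for binary judgements with the same number of raters per item."""
    n_items = len(judgements)
    n_raters = len(judgements[0])
    total_yes = 0
    observed = 0.0
    for votes in judgements:
        yes = sum(votes)
        no = n_raters - yes
        total_yes += yes
        # Proportion of agreeing rater pairs for this item.
        observed += (yes * (yes - 1) + no * (no - 1)) / (n_raters * (n_raters - 1))
    observed /= n_items
    p_yes = total_yes / (n_items * n_raters)
    expected = p_yes ** 2 + (1 - p_yes) ** 2
    if expected == 1:
        return 1.0  # every rater gave the same answer everywhere
    return (observed - expected) / (1 - expected)


# Toy judgements (1 = mapping is correct), invented for illustration only.
toy_judgements = [(1, 1, 1), (1, 1, 0), (0, 0, 0), (1, 0, 1)]

print(majority_accuracy(toy_judgements))  # 0.75
print(round(fleiss_kappa(toy_judgements), 4))  # 0.3143
```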
http://ustalov.imm.uran.ru/pub/tagbag-aist.tar.gz
Discussion
•Some mappings are identical under both
similarity measures; 13 of these 43
mappings are wrong.
•Three sources of errors:
• sloppy image tags (7 of 13),
• actual mapping errors (3 of 13),
• batch uploads (3 of 13).
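The counts above can be turned into shares with a few lines of Python. Only the numbers come from the slide; the variable names are illustrative.

```python
# Error counts reported on the slide.
errors = {
    "sloppy image tags": 7,
    "actual mapping errors": 3,
    "batch uploads": 3,
}
identical_mappings = 43

total_wrong = sum(errors.values())
for source, count in errors.items():
    print(f"{source}: {count}/{total_wrong} = {count / total_wrong:.1%}")

# Share of wrong mappings among those identical under both measures.
print(f"wrong: {total_wrong}/{identical_mappings} = {total_wrong / identical_mappings:.1%}")
```

Sloppy tags account for a little over half of the errors (7 of 13), and about 30% of the identical mappings are wrong (13 of 43).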
Conclusion
•TagBag is an unsupervised approach
for mapping images to synsets.
•The performance depends both on
image tags and ontology bias.
•Visual saliency and spam filtering may
increase the quality.
Further Work
Thank you!
Dmitry Ustalov
a post-graduate student @
IMM UB RAS, Yekaterinburg, Russia.
https://ustalov.name/
dau@imm.uran.ru
The present work is supported by the Russian Foundation
for the Humanities, project no. 13-04-12020.