Placing Images with Refined Language Models and
Similarity Search with PCA-reduced VGG Features
Giorgos Kordopatis-Zilos¹, Adrian Popescu², Symeon Papadopoulos¹ and Yiannis Kompatsiaris¹
¹ Information Technologies Institute (ITI), CERTH, Greece
² CEA LIST, 91190 Gif-sur-Yvette, France
MediaEval 2016 Workshop, Oct. 20-21, 2016, Hilversum, Netherlands.
Summary
Tag-based location estimation (1 run)
• Built upon the scheme of our 2015 participation (Kordopatis-Zilos et al.,
MediaEval 2015)
• Based on a refined probabilistic Language Model
Visual-based location estimation (1 run)
• Extract PCA-reduced VGG features to compute image similarities
• Geospatial clustering scheme of the most visually similar images
Hybrid location estimation (3 runs)
• Combination of the textual and visual approaches using a set of rules
Training sets
• Training set released by the organisers (≈4.7M geotagged items)
• YFCC dataset, excl. images from users in test set (≈40M geotagged items)
• External data derived from gazetteers, i.e. Geonames and OpenStreetMap
G. Kordopatis-Zilos, A. Popescu, S. Papadopoulos, and Y. Kompatsiaris. Socialsensor at mediaeval placing task
2015. In MediaEval 2015 Placing Task, 2015
Tag-based location estimation
• Processing steps of the approach
– Offline: language model construction
– Online: location estimation
Pre-processing
• Tags and titles of the training set items are processed
• Apply
– URL decoding
– lowercase transformation
– tokenization
• Remove
– accents
– symbols
– punctuation
• Multi-word tags are split into their individual terms, which are also included in the item's term set
• Discard terms that are numeric or shorter than three characters
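The pre-processing steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the function names (`preprocess`, `_tokens`, `_keep`) are hypothetical, `+` is assumed to encode a space in URL-encoded tags, and the multi-word tag is kept alongside its individual terms.

```python
import re
import unicodedata
from urllib.parse import unquote_plus


def _tokens(text):
    """URL-decode, lowercase, strip accents, drop symbols/punctuation, tokenize."""
    text = unquote_plus(text).lower()  # assumption: '+' stands for a space
    # Decompose accented characters and drop the combining marks.
    text = "".join(ch for ch in unicodedata.normalize("NFKD", text)
                   if not unicodedata.combining(ch))
    # Replace anything that is not a word character or whitespace by a space.
    return re.sub(r"[^\w\s]|_", " ", text).split()


def _keep(term):
    """Discard numeric terms and terms shorter than three characters."""
    return len(term) >= 3 and not term.replace(" ", "").isdigit()


def preprocess(tags, title=""):
    """Return the term set of one item from its tags and title (hypothetical helper)."""
    terms = set()
    for tag in tags:
        tokens = _tokens(tag)
        terms.update(tokens)                 # individual terms
        if len(tokens) > 1:
            terms.add(" ".join(tokens))      # the multi-word tag itself
    terms.update(_tokens(title))
    return {t for t in terms if _keep(t)}
```

For instance, the tag `New+York` would contribute `new york`, `new` and `york`, while `2016` and `NY` would be discarded.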
Language Model (LM)
• LM-based estimation
– The Most Likely Cell (mlc) is the cell with the highest probability; it is used to produce the estimate
$$\mathit{mlc}_j = \arg\max_i \sum_{k=1}^{|T_j|} p(t_k \mid c_i) \cdot w(t_k)$$
Inspired by (Popescu, MediaEval 2013)
• LM generation scheme
– divide the Earth's surface into rectangular cells with a side length of 0.01°
– calculate term-cell probabilities
$$p(t \mid c) = N_u / N_t$$
A. Popescu. CEA LIST's participation at mediaeval 2013 placing task. In MediaEval 2013 Placing Task, 2013
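The language model construction and the mlc selection can be illustrated as follows. This is a minimal sketch, not the authors' code. It assumes that N_u is the number of distinct users who used term t inside cell c and that N_t is the number of distinct users who used t anywhere (the slide does not define them), and it defaults the term weight w(t) to 1. All function names are hypothetical.

```python
import math
from collections import defaultdict


def cell_of(lat, lon, side=0.01):
    """Map coordinates to the index of a rectangular cell of the given side (degrees)."""
    return (math.floor(lat / side), math.floor(lon / side))


def build_lm(items, side=0.01):
    """items: iterable of (user, lat, lon, terms). Returns {term: {cell: p(t|c)}}."""
    users = defaultdict(lambda: defaultdict(set))
    for user, lat, lon, terms in items:
        cell = cell_of(lat, lon, side)
        for term in terms:
            users[term][cell].add(user)
    lm = {}
    for term, per_cell in users.items():
        n_t = len(set().union(*per_cell.values()))  # assumed meaning of N_t
        lm[term] = {cell: len(u) / n_t for cell, u in per_cell.items()}
    return lm


def most_likely_cell(terms, lm, weight=lambda t: 1.0):
    """Return the cell maximizing sum_k p(t_k|c) * w(t_k), or None if no term is known."""
    scores = defaultdict(float)
    for term in terms:
        for cell, p in lm.get(term, {}).items():
            scores[cell] += p * weight(term)
    return max(scores, key=scores.get) if scores else None
```

Counting users rather than items limits the influence of a single prolific uploader on the probabilities.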
Feature Selection and Weighting
Feature Weighting
• Locality weight function: based on the term's relative position in T
• Spatial entropy weight function: a Gaussian function of the term's spatial entropy
• Linear combination of the two weights
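A possible reading of the weighting scheme is sketched below. Everything beyond the three bullets above is an assumption: the entropy is taken over the term-cell probabilities, the Gaussian is left unnormalized so that it peaks at 1, the locality weight is the term's rank position when T is sorted by descending locality, and `mu`, `sigma` and `omega` are hypothetical parameters whose values the slides do not give.

```python
import math


def spatial_entropy(cell_probs):
    """Entropy of a term's distribution over cells, given {cell: p(t|c)}."""
    return -sum(p * math.log(p) for p in cell_probs.values() if p > 0)


def entropy_weight(entropy, mu, sigma):
    """Unnormalized Gaussian of the spatial entropy (mu and sigma are assumed)."""
    return math.exp(-((entropy - mu) ** 2) / (2 * sigma ** 2))


def locality_weights(locality):
    """Weight each term by its relative position in T ranked by locality (assumed)."""
    ranked = sorted(locality, key=locality.get, reverse=True)
    n = len(ranked)
    return {term: (n - i) / n for i, term in enumerate(ranked)}


def combined_weight(w_se, w_l, omega=0.5):
    """Linear combination of the spatial entropy and locality weights."""
    return omega * w_se + (1 - omega) * w_l
```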
Feature Selection
• Calculate term locality using a grid of 0.01°×0.01° cells
• When a user uses a given term, the user is assigned to the entire cell neighborhood instead of a single cell:
$$l(t) = N_t \cdot \frac{\sum_{c \in C} \sum_{u \in U_{t,c}} |\{u' \mid u' \in U_{t,c},\ u' \neq u\}|}{N_t^2}$$
• Terms with non-zero locality score form the term set 𝑇
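The locality score can be computed directly from the per-cell user sets. The sketch below assumes that U_{t,c} is the set of users who used term t in cell c and that N_t is the number of distinct users of t; for brevity it skips the neighborhood assignment and counts each user only in the cells where the term was actually used. The function names are hypothetical.

```python
def locality(users_per_cell):
    """users_per_cell: {cell: set of users who used the term there} (U_{t,c})."""
    n_t = len(set().union(*users_per_cell.values()))  # assumed meaning of N_t
    if n_t == 0:
        return 0.0
    # For each user u in a cell, count the other users of the term in that cell.
    pairs = sum(len(users) * (len(users) - 1) for users in users_per_cell.values())
    return n_t * pairs / n_t ** 2


def select_terms(users_by_term):
    """Keep only terms with a non-zero locality score; returns {term: locality}."""
    scores = {t: locality(cells) for t, cells in users_by_term.items()}
    return {t: s for t, s in scores.items() if s > 0}
```

A term used by several users in the same cell scores high, while a term whose users never share a cell scores zero and is dropped from T.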
Refinements
• Multiple Grids
– build an additional LM using a finer grid (cell side length of 0.001°)
– combine the MLCs of the individual language models
• Similarity search (Van Laere et al., ICMR 2011)
– determine the $k_t$ most similar training images in the MLC
– their center of gravity is the final location estimate
From: (Kordopatis-Zilos et al., PAISI 2015)
G. Kordopatis-Zilos, S. Papadopoulos, and Y. Kompatsiaris. Geotagging social media content with a
refined language modelling approach. In Intelligence and Security Informatics, pages 21–40, 2015
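Both refinements can be sketched compactly. The slides do not say how the two MLCs are combined or how similarity is measured, so the choices here are assumptions: the finer cell is preferred only when it lies inside the coarse MLC, similarity is the Jaccard index of the term sets, and the center of gravity is an unweighted mean of coordinates. The names and the default `k_t=5` are hypothetical.

```python
def combine_grids(coarse_mlc, fine_mlc, ratio=10):
    """Prefer the fine-grid MLC when it falls inside the coarse MLC (assumed rule).

    Cells are integer index pairs; ratio is the coarse/fine side ratio (0.01 / 0.001).
    """
    if fine_mlc is not None and (fine_mlc[0] // ratio, fine_mlc[1] // ratio) == coarse_mlc:
        return fine_mlc
    return coarse_mlc


def jaccard(a, b):
    """Jaccard similarity of two term sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def refine_in_cell(query_terms, cell_items, k_t=5):
    """cell_items: list of (lat, lon, terms) for training items inside the MLC."""
    ranked = sorted(cell_items, key=lambda it: jaccard(query_terms, it[2]), reverse=True)
    top = ranked[:k_t]
    if not top:
        return None
    # Center of gravity of the k_t most similar items (unweighted mean, an assumption).
    return (sum(it[0] for it in top) / len(top), sum(it[1] for it in top) / len(top))
```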
Visual-based location estimation
Main Objectives
• Ensure that the visual features are generic and transferable
• Provide a compact representation of the features
Model building
• CNN features are extracted with a fine-tuned VGG model
• Training: ~5K Points of Interest (POIs) and over 7M Flickr images, collected using queries with:
– the POI name and a radius of 5 km around its coordinates
– the POI name and the associated city name
• Outputs of the fc7 layer (4096-d) are compressed to 128-d using PCA, learned on a subset of 250,000 training images
• Similarity Search based on the PCA-reduced CNN features
O. Van Laere, S. Schockaert, and B. Dhoedt. Finding locations of Flickr resources using language models and similarity
search. ICMR ’11, pages 48:1–48:8, New York, NY, USA, 2011. ACM
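The similarity search step can be illustrated on vectors that are assumed to be already PCA-reduced (128-d in the slides, 3-d in any toy example). The slides do not name the similarity measure, so cosine similarity is used here as an assumption; the brute-force scan and the function names are also illustrative only.

```python
import heapq
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def most_similar(query, index, k=20):
    """index: list of (image_id, vector, (lat, lon)). Returns the k best entries."""
    return heapq.nlargest(k, index, key=lambda entry: cosine(query, entry[1]))
```

A brute-force scan like this only suits small collections; with millions of training images, an approximate nearest-neighbor index would be the more practical choice.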
Visual-based location estimation
Location Estimation
• Geospatial clustering of the $k_v = 20$ most visually similar images
• The largest cluster (or the first one in case of equal size) is selected and its centroid is used as the location estimate
Visual Confidence
• Confidence metric for the visual estimation is based on the size of
the largest cluster
$$\mathit{conf}_{v_i} = \max\left(\frac{n_i - n_t}{k_v - n_t},\ 0\right)$$
$n_i$: number of neighbors in the largest cluster of image $i$
$n_t$: configuration parameter controlling the "strictness" of the confidence score
K. Simonyan and A. Zisserman. Very deep convolutional networks for large-scale image recognition. In
International Conference on Learning Representations, 2015
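The clustering and the confidence score can be sketched as below. The slides do not specify the clustering scheme, so a greedy pass with a 1 km radius around each cluster's first point is assumed; `radius_km` and the default `n_t=5` are hypothetical values, and `k_v` must be larger than `n_t`.

```python
import math


def haversine_km(p, q):
    """Great-circle distance in km between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*p, *q))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(min(1.0, a)))


def cluster(points, radius_km=1.0):
    """Greedy clustering: join the first cluster whose seed is within radius_km."""
    clusters = []
    for p in points:
        for c in clusters:
            if haversine_km(p, c[0]) <= radius_km:
                c.append(p)
                break
        else:
            clusters.append([p])
    return clusters


def visual_estimate(points, k_v=20, n_t=5, radius_km=1.0):
    """points: (lat, lon) of the most similar images, best first. Returns (estimate, conf)."""
    clusters = cluster(points[:k_v], radius_km)
    if not clusters:
        return None, 0.0
    largest = max(clusters, key=len)  # max() keeps the first cluster on ties
    centroid = (sum(p[0] for p in largest) / len(largest),
                sum(p[1] for p in largest) / len(largest))
    conf = max((len(largest) - n_t) / (k_v - n_t), 0.0)
    return centroid, conf
```

With this definition the confidence is 0 whenever the largest cluster holds at most `n_t` images and reaches 1 only when all `k_v` neighbors fall in one cluster.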
Hybrid location estimation
• A set of rules determines whether the text or the visual approach provides the final estimate
• The visual estimation is chosen when:
→ no estimation could be produced by the text approach
→ the visual estimation falls inside the borders of the mlc
→ the comparison of the confidence scores $\mathit{conf}_v$ and $\mathit{conf}_t$ favors the visual estimation
• Otherwise, the text estimation is selected
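The rule set can be written as a short decision function. This is a sketch under assumptions: the slides do not state the exact confidence comparison, so `conf_v > conf_t` is used here, the text confidence is taken as a given input, and the mlc is represented by a hypothetical bounding box `(lat_min, lon_min, lat_max, lon_max)`.

```python
def hybrid_estimate(text_est, visual_est, mlc_bounds, conf_t, conf_v):
    """Choose between the text and visual estimates, each a (lat, lon) pair or None."""
    if visual_est is None:
        return text_est
    # Rule 1: the text approach produced no estimate.
    if text_est is None:
        return visual_est
    # Rule 2: the visual estimate falls inside the borders of the mlc.
    lat, lon = visual_est
    lat_min, lon_min, lat_max, lon_max = mlc_bounds
    if lat_min <= lat <= lat_max and lon_min <= lon <= lon_max:
        return visual_est
    # Rule 3: confidence comparison (assumed to be a strict inequality).
    if conf_v > conf_t:
        return visual_est
    # Otherwise the text estimate is selected.
    return text_est
```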
Runs and Results
RUN-1: Tag-based location estimation + released training set
RUN-2: Visual-based location estimation + released training set
RUN-3: Hybrid location estimation + released training set
RUN-4: Hybrid location estimation + YFCC dataset
RUN-5: Hybrid location estimation + YFCC + External data
RUN-E: Visual-based location estimation + entire YFCC dataset
Images: [results table not preserved in the extracted slides]
Runs and Results (videos)
RUN-1 to RUN-5: same configurations as for images (RUN-E is not listed for videos)
Videos: [results table not preserved in the extracted slides]
References
G. Kordopatis-Zilos, A. Popescu, S. Papadopoulos, and Y. Kompatsiaris.
Socialsensor at mediaeval placing task 2015. In MediaEval 2015 Placing
Task, 2015
G. Kordopatis-Zilos, S. Papadopoulos, and Y. Kompatsiaris. Geotagging social
media content with a refined language modelling approach. In
Intelligence and Security Informatics, pages 21–40, 2015
A. Popescu. CEA LIST's participation at mediaeval 2013 placing task. In
MediaEval 2013 Placing Task, 2013
K. Simonyan and A. Zisserman. Very deep convolutional networks for large-scale
image recognition. In International Conference on Learning
Representations, 2015
O. Van Laere, S. Schockaert, and B. Dhoedt. Finding locations of Flickr
resources using language models and similarity search. ICMR ’11, pages
48:1–48:8, New York, NY, USA, 2011. ACM
Thank you!
Data/Code:
– https://github.com/MKLab-ITI/multimedia-geotagging/
Get in touch:
– Giorgos Kordopatis-Zilos: georgekordopatis@iti.gr
– Symeon Papadopoulos: papadop@iti.gr / @sympap
