VISUALIZING IMAGE COLLECTIONS
WITH ONTOGEN
From Images To Ontologies
IMAGE DATA
   Difficult to handle

   High-dimensional representations

   The amount of image data is constantly increasing
    and there is a rising need for reliable automatic
    image analysis systems in practical applications
   [Diagram: Image representation (Text, Color info, SIFT features), Extract features, Data Mining, Application]
SIFT FEATURES
   Rotation-, scale- and translation-invariant descriptors
    built from gradient orientations at “interesting” points
    (keypoints) on an image

   Usually, the SIFT feature space is quantized into a set
    of “representative” vectors

   Each feature on an observed image is then
    assigned to its nearest representative, and counting
    these assignments gives the so-called “codebook” histogram
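
A minimal sketch of the assignment step described above. The codebook is assumed to be given already, Euclidean distance is our assumption for "nearest", and the function names are hypothetical. Toy 2-D vectors stand in for real 128-dimensional SIFT descriptors.

```python
import math

def nearest_index(vector, codebook):
    """Index of the codebook vector closest to `vector` (Euclidean distance, assumed)."""
    return min(range(len(codebook)), key=lambda i: math.dist(vector, codebook[i]))

def codebook_histogram(descriptors, codebook):
    """Count the descriptors assigned to each representative, then normalize."""
    counts = [0] * len(codebook)
    for d in descriptors:
        counts[nearest_index(d, codebook)] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * len(codebook)

# Toy data: three representatives and four descriptors of one image.
codebook = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
descriptors = [(0.1, 0.2), (0.9, 1.1), (1.2, 0.8), (4.8, 5.1)]
hist = codebook_histogram(descriptors, codebook)
print(hist)
```

The histogram has one entry per representative, so every image gets a vector of the same length regardless of how many keypoints it has.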
COLOR HISTOGRAMS
   Color information in an image may or may not
    be of interest for a particular problem, but it is
    usually a useful piece of information

   There are several ways to handle this
    information; the simplest and fastest is to
    divide the color spectrum into “buckets” and
    calculate the distribution of pixel colors over these
    buckets, thereby obtaining the color histogram for
    an image
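
A minimal sketch of the bucket approach, assuming RGB pixels with channel values 0..255 and a uniform grid with the same number of buckets per channel. The slides do not say which color space or bucket count was used, so both are assumptions, and the function name is hypothetical.

```python
def color_histogram(pixels, buckets_per_channel=4):
    """Normalized histogram over a uniform RGB grid.

    `pixels` is an iterable of (r, g, b) tuples with values in 0..255.
    """
    n_buckets = buckets_per_channel ** 3
    counts = [0] * n_buckets
    n = 0
    for r, g, b in pixels:
        # Map each channel value 0..255 to a bucket index 0..buckets_per_channel-1.
        ri = r * buckets_per_channel // 256
        gi = g * buckets_per_channel // 256
        bi = b * buckets_per_channel // 256
        counts[(ri * buckets_per_channel + gi) * buckets_per_channel + bi] += 1
        n += 1
    return [c / n for c in counts] if n else [0.0] * n_buckets

# Toy "image": two reddish pixels, one blue, one black; 2 buckets per channel.
pixels = [(255, 0, 0), (250, 10, 5), (0, 0, 255), (0, 0, 0)]
hist = color_histogram(pixels, buckets_per_channel=2)
print(hist)
```

Fewer buckets give a coarser but more robust description; more buckets separate similar shades at the cost of a longer, sparser vector.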
ONTOGEN
   OntoGen is a tool for semi-automatic ontology
    construction, clustering and classification, as well
    as data visualization via multidimensional scaling

   It can easily be applied to image data to gain an
    overview of collections of images
IMAGE FEATURE EXTRACTION
   We extract SIFT features and color histograms for
    each image

   We calculate the distance between images as the
    weighted sum of distances between the two
    distributions (SIFT codebook and color data)

   If images have annotations, these can easily be
    incorporated by adding a third part to the
    representation of each image
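
A minimal sketch of the weighted-sum distance. The slides do not state which distance is used between the histograms or how the weights are chosen, so the L1 distance and the equal default weights below are assumptions, as are the dictionary keys and function names.

```python
def l1_distance(p, q):
    """Sum of absolute differences between two histograms of equal length."""
    return sum(abs(a - b) for a, b in zip(p, q))

def image_distance(img_a, img_b, w_sift=0.5, w_color=0.5):
    """Weighted sum of the SIFT-codebook distance and the color distance."""
    return (w_sift * l1_distance(img_a["sift"], img_b["sift"])
            + w_color * l1_distance(img_a["color"], img_b["color"]))

# Toy images, each represented by a codebook histogram and a color histogram.
a = {"sift": [0.5, 0.5, 0.0], "color": [1.0, 0.0]}
b = {"sift": [0.0, 0.5, 0.5], "color": [0.75, 0.25]}
d = image_distance(a, b)
print(d)
```

An annotation-based part could be handled the same way, by adding a third weighted term computed on a text representation of each image.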
ONTOGEN ON IMAGE DATA
   On the next few slides we show the usage of
    OntoGen on one simple data set

   The data was taken from the ImageNet online image
    collection. The particular subset contains images of
    various types of flowers, as well as images of fire
    and images of buildings
MAIN WINDOW WHEN THE COLLECTION IS
LOADED
DOCUMENT LIST FOR QUICK OVERVIEW
DOCUMENT ATLAS WHEN NOT DISPLAYING
IMAGES
DOCUMENT ATLAS WHEN DISPLAYING IMAGES
CREATING AN ONTOLOGY
 We can do k-means clustering to detect groups of
  similar images
 We can use these groups to create a level in the
  ontology
 The relevant features are displayed on top of the
  nodes
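
A minimal sketch of k-means on image feature vectors. It is a plain Lloyd-style implementation with Euclidean distance and random initialization, written for illustration; it is not OntoGen's own code, and the function name and parameters are hypothetical.

```python
import math
import random

def kmeans(points, k, iterations=20, seed=0):
    """Plain Lloyd-style k-means; returns (centroids, labels)."""
    rng = random.Random(seed)
    # Start from k distinct data points chosen at random.
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
                  for p in points]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, labels

# Toy feature vectors forming two well-separated groups.
points = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2),
          (5.0, 5.1), (5.2, 4.9), (4.9, 5.0)]
centroids, labels = kmeans(points, k=2)
print(labels)
```

Each resulting group of labels corresponds to one candidate sub-concept on the new ontology level.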
SO, LET’S LOOK AT SOME OF THOSE NODES
AND THEIR MEDOIDS…PRETTY GOOD…
HOWEVER…
   One of the first-level sub-concepts is not good,
    which can be seen by observing its medoids:




   So, now we can branch it further into more refined
    sub-concepts to improve the quality
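
A minimal sketch of how medoids can be picked for a concept: the members with the smallest total distance to all other members. How OntoGen selects the medoids it displays is not stated in the slides, so this definition, the Euclidean distance and the function name are assumptions.

```python
import math

def medoids(members, n=1, distance=math.dist):
    """Return the n members with the smallest total distance to all members."""
    def total_distance(m):
        return sum(distance(m, other) for other in members)
    return sorted(members, key=total_distance)[:n]

# Toy cluster with one outlier at x = 10.
cluster = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0), (10.0, 0.0)]
central = medoids(cluster, n=1)
print(central)
```

Unlike a centroid, a medoid is always an actual image from the collection, which is why it can be shown directly. When the medoids of one concept depict visibly different things, the concept is a candidate for further splitting.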
BEFORE WE DO SO, WE CAN VISUALIZE THE
SUB-CONCEPT IN DOCUMENT ATLAS
SO …
   This is definite evidence that the concept should be
    split into at least two different sub-concepts

   Most of the images inside it represent buildings, but
    there are some that belong to a certain type of
    flower, as well as some depicting fire

   So, just to be safe, let’s say we want 5 sub-
    concepts
THIS IS WHAT THE NEW ONTOLOGY WILL LOOK
LIKE:
AND THE MEDOIDS FOR THE FIVE NEW
REFINED SUB-CONCEPTS ARE:
CONCLUSIONS
   We can construct an image ontology in a
    semi-supervised way

   By using k-means clustering based on the SIFT+color
    image representation we can detect candidates for
    concepts in the ontology and then refine them until
    we reach good quality
ACKNOWLEDGEMENTS
 This work was supported by the bilateral
 project between Slovenia and Romania
 “Understanding Human Behavior for Video
 Surveillance Applications,” the Slovenian
 Research Agency and the ICT Programme
 of the EC PlanetData (ICTNoE-257641).

OntoGen Extension for Exploring Image Collections
