Multimedia Content Based Retrieval
Govindaraju Hujigal
govin.tech1@gmail.com
Content based retrieval in multimedia
• An important research area
• A challenging problem, since multimedia data needs detailed interpretation from pixel values
• Different strategies in terms of syntactic and semantic indexing for retrieval

Why do we need MCBR?
• How do I find what I’m looking for?!
Multimedia content retrieval
• Multimedia and storage technology has led to the building of large repositories of digital image, video, and audio data.
• Compared to text search, any assignment of text labels is a massively labor-intensive effort.
• The focus is on calculating statistics that can be approximately correlated with the content features, without costly human interaction.
Multimedia content retrieval
• Search based on syntactic features
  • Shape, texture, color histogram
  • Relatively undemanding
• Search based on semantic features
  • Human perception
  • “List all dogs that look like a cat”
  • “City”, “Landscape”, “Cricket”
Syntactic indexing
• Use syntactic features as the basis for matching, and employ either a query-through-dialog or a query-by-example box to interface with the user.
• Query-through-dialog: the user enters words describing the image.
• Query-through-dialog is not convenient, as the user needs to know the exact details of attributes like shape, color, and texture.
Image descriptors – Color
• Apples are red ... but tomatoes are too!

Image descriptors – Texture
• Texture differentiates between a lawn and a forest.
Syntactic indexing
• Query by example: the system shows example images and the user chooses the closest.
• Various features of the chosen image, such as color, shape, texture, and spatial distribution, are evaluated and matched against the images in the database.
• Matching uses a similarity or distance metric.
• In video, various key frames of the video clips that are close to the user query are shown.
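A minimal sketch of query by example using only a color feature may help make the matching step concrete. It assumes images are given as flat lists of RGB pixels and uses histogram intersection as the similarity measure. The function names, the bin count, and the toy data are all illustrative assumptions, not something the slides specify.

```python
from collections import Counter

def color_histogram(pixels, bins=4):
    """Normalized joint RGB histogram; pixels is a list of (r, g, b) in 0..255."""
    step = 256 // bins
    counts = Counter((r // step, g // step, b // step) for r, g, b in pixels)
    total = len(pixels)
    return {cell: n / total for cell, n in counts.items()}

def histogram_intersection(h1, h2):
    """Similarity in [0, 1]; 1.0 means identical color distributions."""
    return sum(min(v, h2.get(cell, 0.0)) for cell, v in h1.items())

def query_by_example(query_pixels, database, top_k=3):
    """Rank database images (name -> pixel list) by similarity to the query."""
    q = color_histogram(query_pixels)
    scored = [(histogram_intersection(q, color_histogram(p)), name)
              for name, p in database.items()]
    return sorted(scored, reverse=True)[:top_k]

# Toy "images": flat lists of RGB pixels (illustrative data only).
apple = [(200, 20, 30)] * 90 + [(40, 120, 40)] * 10
tomato = [(210, 30, 25)] * 95 + [(30, 110, 30)] * 5
lawn = [(40, 160, 50)] * 100

ranking = query_by_example(apple, {"tomato": tomato, "lawn": lawn})
print(ranking)  # the tomato ranks above the lawn for an apple query
```

With color alone, the tomato is the closest match to the apple, which is exactly the weakness of a single syntactic descriptor pointed out on the color slide.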
Syntactic indexing
• Query by example: limitations
  • An image can be annotated and interpreted in many ways. For example, one user may be interested in a waterfall, another in a mountain, and yet another in the sky, although all of them may be present in the same image.
  • The user may wonder "why do these two images look similar?" or "what specific parts of these images are contributing to the similarity?".
  • The user is required to know the search structure and other details to search the database efficiently.
  • It requires many comparisons, and the results may be too many, depending on the threshold.
Semantic indexing
• Match human perception and cognition.
• Semantic content contains high-level concepts such as objects and events.
• As humans think in terms of events and remember different events and objects after watching a video, these high-level concepts are the most important cues in content-based retrieval. Taking a soccer game as an example, humans usually remember goals, interesting actions, red cards, etc.

Semantic indexing
• There exists a relationship between the degree of action and the structure of the visual patterns that constitute a movie.
• Movies can be classified into four broad categories: comedies, action films, dramas, or horror films.
• Inspired by cinematic principles, four computable video features (average shot length, color variance, motion content, and lighting key) are combined in a framework to provide a mapping to these four high-level semantic classes.
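As a small illustration of what "computable" means here, the sketch below derives one of the four features, average shot length, from a list of shot boundaries. The function name, the frame-rate default, and the sample numbers are assumptions made for the example. The other three features and the mapping to genres are not covered.

```python
def average_shot_length(num_frames, cut_indices, fps=25.0):
    """Mean shot duration in seconds, given the frame indices where new shots start."""
    boundaries = [0] + sorted(cut_indices) + [num_frames]
    lengths = [end - start for start, end in zip(boundaries, boundaries[1:])]
    return sum(lengths) / len(lengths) / fps

# Hypothetical clip: 250 frames with new shots starting at frames 50, 100 and 200.
asl = average_shot_length(250, [50, 100, 200])
print(asl)  # 2.5 seconds: shots of 50, 50, 100 and 50 frames at 25 fps
```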
Motion feature as indexing cue
• Spatial scene analysis on video can be fully transferred from CBIR, but temporal analysis is what is unique about video.
• Temporal information induces the concept of motion for the objects present in the document.

Motion feature as indexing cue
• Frame level: each frame is treated separately. There is no temporal analysis at this level.
• Shot level: a shot is a set of contiguous frames all acquired through a continuous camera recording. Only the temporal information is used.
• Scene level: a scene is a set of contiguous shots having a common semantic significance.
• Video level: the complete video object is treated as a whole.
Motion feature as indexing cue
• The three types of shot transition are as follows:
  • Cut: a sharp boundary between shots. This generally implies a peak in the difference between the color or motion histograms of the two frames surrounding the cut.
  • Dissolve: the content of the last images of the first shot is continuously mixed with that of the first images of the second shot.
  • Wipe: the images of the second shot continuously cover or push out of the display those of the first shot.
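The description of a cut as a peak in the histogram difference can be sketched as follows. This is a simplified illustration: frames are assumed to be flat lists of gray values, and the fixed threshold and all names are hypothetical choices. Gradual transitions such as dissolves and wipes spread the change over several frames, so a single-frame threshold like this one would generally not be enough for them.

```python
def gray_histogram(frame, bins=8):
    """Normalized intensity histogram; frame is a flat list of values in 0..255."""
    step = 256 // bins
    hist = [0.0] * bins
    for value in frame:
        hist[value // step] += 1.0 / len(frame)
    return hist

def histogram_difference(h1, h2):
    """L1 distance between two normalized histograms, in [0, 2]."""
    return sum(abs(a - b) for a, b in zip(h1, h2))

def detect_cuts(frames, threshold=1.0):
    """Return indices i where a cut is declared between frame i-1 and frame i."""
    hists = [gray_histogram(f) for f in frames]
    return [i for i in range(1, len(hists))
            if histogram_difference(hists[i - 1], hists[i]) > threshold]

# Toy sequence: three dark frames followed by two bright frames.
dark = [20] * 64
bright = [230] * 64
frames = [dark, dark, dark, bright, bright]
cuts = detect_cuts(frames)
print(cuts)  # [3]: the only large histogram change is between frames 2 and 3
```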
Motion feature as indexing cue
• It is often through motion that the content of a video is expressed and the attention of viewers is captivated.
• Query techniques
  • A set of motion vector trajectories is mapped to a set of objects. The visual query can be ‘player’. [Dimitrova]
  • Use an animated sketch to formulate queries. Motion and temporal duration are the key attributes assigned to each object in the sketch, in addition to the usual attributes such as shape, color, and texture. [VideoQ]
Matching techniques
• A method of finding the similarity between two sets of multimedia data, which can be either images or videos.
• Search is based on features like location, colors, and concepts, examples of which are ‘mostly red’, ‘sunset’, ‘yellow flowers’, etc.
• The user specifies the relative weights of the features or assigns them equal weight.
• Automatically identifying the relevance of the features is under active research.
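One way to picture user-specified feature weights is a weighted sum of per-feature distances, as sketched below. The sketch assumes each feature has already been reduced to a single score in [0, 1]. The feature names, the scores, and the function itself are hypothetical and only illustrate the idea of relative versus equal weighting.

```python
def weighted_distance(query, candidate, weights=None):
    """Combine per-feature distances with user weights (equal weights if None)."""
    features = list(query)
    if weights is None:
        weights = {f: 1.0 for f in features}
    total = sum(weights[f] for f in features)
    return sum(weights[f] * abs(query[f] - candidate[f]) for f in features) / total

# Hypothetical feature scores in [0, 1].
query = {"color": 1.0, "location": 0.0}
image_a = {"color": 1.0, "location": 1.0}   # same color, different location
image_b = {"color": 0.0, "location": 0.0}   # same location, different color

equal_a = weighted_distance(query, image_a)
equal_b = weighted_distance(query, image_b)
color_first = {"color": 3.0, "location": 1.0}
weighted_a = weighted_distance(query, image_a, color_first)
weighted_b = weighted_distance(query, image_b, color_first)
```

With equal weights the two candidates tie. Once the user weights color three times as heavily as location, image_a becomes the clearly better match.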
Learning methods in retrieval
• The user generates both positive and negative retrieval examples (relevance feedback).
• Each image can represent multiple concepts. To resolve this ambiguity, each image is modeled as a bag of instances (sub-blocks of the image).
• A bag is labeled as a positive example of a concept if some instance in it represents the concept, which could be a car or a waterfall scene. If no such instance exists, the bag is labeled as a negative example.
• The concept is learned from a small collection of positive and negative examples and is then used to retrieve images containing a similar concept from the database.
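The bag-of-instances idea can be sketched with a deliberately crude learner. It picks, among the instances of the positive bags, the one that lies close to every positive bag and far from every negative bag, and then ranks database images by their closest instance. This is a simplified stand-in for illustration, not the learning method the slides refer to. The 2-D feature vectors, names, and scoring rule are all assumptions.

```python
import math

def min_distance(bag, point):
    """Distance from a point to the closest instance in a bag."""
    return min(math.dist(instance, point) for instance in bag)

def learn_concept(positive_bags, negative_bags):
    """Pick the instance that is near every positive bag and far from negative bags."""
    candidates = [inst for bag in positive_bags for inst in bag]

    def score(point):
        near = sum(min_distance(bag, point) for bag in positive_bags)
        far = sum(min_distance(bag, point) for bag in negative_bags)
        return near - far  # lower is better

    return min(candidates, key=score)

def retrieve(concept, database, top_k=2):
    """Rank bags (name -> instances) by their closest instance to the concept."""
    return sorted(database, key=lambda name: min_distance(database[name], concept))[:top_k]

# Hypothetical sub-block features: each positive bag has one block near (1, 1).
positive_bags = [[(1.0, 1.0), (5.0, 0.0)], [(1.1, 0.9), (0.0, 6.0)]]
negative_bags = [[(5.0, 0.1), (0.1, 6.0)]]
concept = learn_concept(positive_bags, negative_bags)

database = {
    "img_a": [(0.9, 1.0), (4.0, 4.0)],
    "img_b": [(5.0, 0.0), (0.0, 6.0)],
    "img_c": [(1.5, 1.5)],
}
results = retrieve(concept, database)
```

Only one sub-block per image needs to match the concept, which is what lets a single image stand for several concepts at once.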
Learning methods in retrieval
• The ability to infer high-level understanding from multimedia content has proven to be a difficult goal to achieve.
• Example: the category “John eating ice cream”.
• Such categories might require sophisticated scene-understanding algorithms along with an understanding of the spatio-temporal relationships between entities (for example, the behavior ‘eating’ can be characterized as repeatedly putting something edible in the mouth).
Structure in multimedia content
• To achieve efficiency in content production, and because of the limited number of available resources, standard techniques are employed.
• The intention of video making is to represent an action or to evoke emotions using various storytelling methods.
• Figure 1 gives an analysis of the basic shot-transition techniques that are used to convey particular intentions.
Structure in multimedia content
• News has a special structure of ‘begin shot’, ‘newscaster shot’, ‘interview’, ‘weather forecast’, etc., from which a video model of news can be built.
• Car-race video has unusual zoom-ins and zoom-outs; basketball has left-panning and right-panning that last for a certain maximum duration.
• The motion activity in interesting shots in sports is higher than in the surrounding shots, and so on.
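The observation about sports footage suggests a simple heuristic: flag shots whose motion activity stands out from their neighbours. The sketch below assumes one motion-activity score per shot is already available. The margin factor, the scores, and the function name are hypothetical.

```python
def interesting_shots(activity, margin=1.5):
    """Indices of shots whose activity exceeds both neighbours by a factor of margin."""
    hits = []
    for i in range(1, len(activity) - 1):
        if activity[i] > margin * max(activity[i - 1], activity[i + 1]):
            hits.append(i)
    return hits

# Hypothetical per-shot motion activity scores for a sports clip.
activity = [0.2, 0.3, 0.9, 0.25, 0.2, 0.22]
highlights = interesting_shots(activity)
print(highlights)  # [2]
```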
Future of CBR systems
• There is ambiguity in making such conclusions; for example, a dissolve can be due to either a ‘flashback’ or a ‘time lapse’. If the number of dissolves is two, it is most probably a ‘flashback’.
• MPEG-7, the “Multimedia Content Description Interface”: specifies a standard set of descriptors that can be used to describe various types of multimedia information.
• Make a collaborative effort to tag the multimedia.
Commercial systems – Like.com
Conclusions
• Systematic exploration of the construction of high-level indexes is lacking.
• None of the work has considered exploring features close to human perception.
• In summary, there is a great need to extract semantic indices to make CBR systems serviceable to the user. Though extracting all such indices might not be possible, there is great scope for furnishing the semantic indices with a certain well-established structure.

Conclusions
• Content-based video indexing and retrieval is an active area of research with continuing contributions from several domains, including image processing, computer vision, database systems, and artificial intelligence.