Combining Multimedia and Semantics (LACNEM2010)

Combining Multimedia and Semantics
Oscar Corcho (ocorcho@fi.upm.es)
Universidad Politécnica de Madrid
http://www.oeg-upm.net/
LACNEM 2010, Cali, Colombia
September 9th 2010
Credits: Adrián Siles, Mariano Rico, Víctor Méndez, Hector Andrés García-Silva, María del Carmen Suárez-Figueroa, Ghislain Atemezing, Raphaël Troncy
Work distributed under the license Creative Commons Attribution-Noncommercial-Share Alike 3.0
http://www.slideshare.net/ocorcho

Ontology Engineering Group: who we are
Director: A. Gómez-Pérez (Asunción Gómez Pérez)
Research Group (37 people): 2 Full Professors, 4 Associate Professors, 1 Assistant Professor, 3 Postdocs, 17 PhD Students, 8 MSc Students, 2 Software Engineers
Management (4 people): 2 Project Managers, 1 System Administrator, 1 Secretary
50+ past collaborators
10+ visitors

Research Areas
(Timeline figure; years shown: 1995, 1997, 2000, 2004, 2008)

Before we start…
How many of you have ever heard about the word “Ontology”?
And how many of you do actually know what it means?

Coming to terms with ontologies and semantics
An ontology is an engineering artifact, which provides:
  A vocabulary of terms
  A set of explicit assumptions regarding the intended meaning of the vocabulary
  Almost always including concepts and their classification
  Almost always including properties between concepts
  Shared understanding of a domain of interest
  Agreement on the meaning of terms
  Formal and machine manipulable model of a domain of interest
Besides...
  The meaning (semantics) of such terms is formally specified
  New terms can be formed by combining existing ones
  Can also specify relationships between terms in multiple ontologies

Example: An ontology about satellites

Outline
Introduction: what I will be talking about and what I will not…
There were several options that I explored before selecting the one that you will be hearing in this talk…

Option 1: The Semantic Gap
The lack of coincidence between the information that one can extract from the sensory data and the interpretation that the same data has for a user in a given situation.
A.W.M. Smeulders, M. Worring, S. Santini, A. Gupta, R. Jain: Content-based image retrieval at the end of the early years, IEEE PAMI, 1349–1380, 2000.
However, I already assumed that Ebroul would be talking a lot about it in his opening keynote (as he did).
Besides, I have not worked at all on the low-level part, so it may be difficult for me to provide you with a good insight on the (many) open problems in this area.

Option 2: MPEG-7 and the Semantic Web
ISO standard since December 2001
Main components:
  Descriptors (Ds) and Description Schemes (DSs)
  DDL (XML Schema + extensions)
Concern all types of media
A good number of ontologies developed around it
(Figure label: Part 5 – MDS, Multimedia Description Schemes)

Option 2: MPEG-7 and the Semantic Web
However, the talk may:
  Be a bit boring and too technical
  Lack the mix of state of the art and vision that an invited talk should normally have
And MPEG-7 is not used too much…
So I will cover only some aspects of this later, when I talk about multimedia ontologies.

Option 3: Canonical Processes of Media Production (and semantics, obviously)
For example: http://www.cewe-photobook.com
Application for authoring digital photo books
Automatic selection, sorting and ordering of photos
  Context analysis methods: timestamp, annotation, etc.
  Content analysis methods: color histograms, edge detection, etc.
Customized layout and background
Print by the European leader photo finisher company
Credits: Raphaël Troncy, Lynda Hardman

CeWe Color PhotoBook Processes
My winter ski holidays with my friends
Credits: Raphaël Troncy, Lynda Hardman

CeWe Color PhotoBook Processes
Credits: Raphaël Troncy, Lynda Hardman

Semantics can be important in the process
Credits: Raphaël Troncy, Lynda Hardman

Option 3: Canonical Processes of Media Production
However, some of you probably attended Raphaël Troncy’s talk last year (available in slideshare).

In summary…
I decided to talk about something that I have been working on for the last couple of years, and which combines:
  Semantics (of course, this is the key expertise of our group): mainly annotation, Linked Data and a bit of Multimedia Ontology Engineering
  Social networks, collaboration, sharing and collective intelligence: exploiting home networks and online multimedia sites
  And, obviously, multimedia
And hence I still leave out many interesting topics (e.g., semantics in user interfaces).

Outline
Introduction: what I will be talking about and what I will not
Sem-UPnP-Grid: sharing multimedia content across homes through semantic annotations
  Credits: Mariano Rico and Adrián Siles (UPM), Víctor Méndez and José Manuel Gómez-Pérez (iSOCO), José Manuel Palacios and Mónica Pérez (TID)
Sem4Tags: tag disambiguation in Flickr
M3 Ontology (only if time permits): a semantic backbone for our multimedia-related work
Conclusions and outlook

Motivation
Multimedia resources in Web 2.0 are stored in centralised servers.
You lose some of your rights as an author when you upload these resources to these servers.
Privacy problems.
Poor annotations and metadata.
These resources cannot be shared with other resources in your home.
(Figure labels: Internet, UpGrid)

Multimedia Content Sharing with UpGrid
Annotation: “Ángel on the beach”
Reasoning: “Ángel is my son”


Ángel is my nephew
Juan
P2P
Semantic-based query: “multimedia content related to my nephew”
Annotation: “Ángel playing soccer”
Pedro
Additional semantic information: “Ángel is my son”

“Pedro is my brother”
Additional semantic information: “Juan is my brother”
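The reasoning in this scenario can be sketched with a single kinship rule: Pedro states that Ángel is his son, Juan and Pedro state that they are brothers, so a query from Juan for content related to his nephew should match media annotated with Ángel. The Python below is only an illustration under stated assumptions. The triple representation, the file names and the function names are hypothetical, and the rule covers only sons of brothers. The slides do not say how UpGrid implements its reasoning.

```python
# Hypothetical facts, written as (subject, relation, object) triples.
FACTS = {
    ("Ángel", "son_of", "Pedro"),
    ("Pedro", "brother_of", "Juan"),
    ("Juan", "brother_of", "Pedro"),
}

# Hypothetical annotations: media item -> people mentioned in its annotation.
ANNOTATIONS = {
    "photo_beach.jpg": {"Ángel"},
    "video_soccer.avi": {"Ángel"},
    "photo_garden.jpg": {"Pedro"},
}


def nephews_of(person, facts):
    """Simplified rule: X is a nephew of P if X is the son of a brother of P."""
    brothers = {s for (s, rel, o) in facts if rel == "brother_of" and o == person}
    return {s for (s, rel, o) in facts if rel == "son_of" and o in brothers}


def media_related_to_nephews(person, facts, annotations):
    """Return the media items whose annotations mention a nephew of `person`."""
    targets = nephews_of(person, facts)
    return sorted(name for name, people in annotations.items() if people & targets)


if __name__ == "__main__":
    print(media_related_to_nephews("Juan", FACTS, ANNOTATIONS))
```

Keeping the rule separate from the annotations mirrors the idea on the slide: the annotations only name Ángel, and the family relation is supplied as additional semantic information.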

Architecture (another view on it)

Snapshots from the application
Check http://www.youtube.com/results?search_query=UPnPGrid

Summary
An effective means for sharing multimedia contents across homes, avoiding Web 2.0 sites where your rights may be compromised.
However, it is still a prototype, and no serious usability testing has been done.
  Much work is still needed in order to go into a real system.
And end users find it difficult to provide annotations.
  Do you imagine your parents and grandparents annotating photos and videos like that?
Let’s see how this could be ameliorated with the next part of our presentation.

Outline
Introduction: what I will be talking about and what I will not
Sem-UPnP-Grid: sharing multimedia content across homes through semantic annotations
Sem4Tags: tag disambiguation in Flickr
  Credits: Héctor Andrés García Silva (graduate of the Universidad del Valle)
M3 Ontology (only if time permits): a semantic backbone for our multimedia-related work
Conclusions and outlook

Introduction
Social Tagging Systems
Web 2.0 applications
Applications for storing, sharing, and discovering information resources.
Users assign tags to identify information resources.
Tags are used to search/discover resources.

Introduction
Folksonomy
Emerging classification scheme from social tagging systems
Folk: People, Taxonomy: Classification
Represented by: Users, Tags, Resources
(Figure labels: Taxonomy, Folksonomy, Top-down)

Introduction
Why is tagging so popular?
It reduces the cognitive burden: it’s easy to use
Users don’t need any special skill or experience
The benefits of tagging are immediate:
  Future retrieval
  Contribution and sharing
  Attract attention
  Self presentation
  Opinion expression

Introduction
However, tags can be ambiguous
Polysemy: party as a celebration as opposed to party as a political organization
Synonymy: party and celebration
Morphological variations: party, parties, partying, partyign
  Plurals
  Acronyms
  Conjugated verbs
  Misspelling
  Compound words: Political party, PoliticalParty, Political_party, Political-Party, etc.
Detail/granularity level: a general tag such as party in contrast to a specific tag such as banquet.

Motivation
The problem: Morphological variations, synonyms, granularity, and polysemy hamper information retrieval processes based on folksonomies.
Systems ignore resources tagged with morphological variations or synonyms of that tag, as well as the resources tagged with more generic or more specific tags.
(Screenshots: 710,659 results vs. 8,661,581 results)

Motivation
When searching with polysemous tags, all the resources tagged with that tag are retrieved without taking into account the tag sense the user was looking for (e.g., querying Flickr with bank results in photos about financial institutions, river edges, fog banks, sand banks, etc.).

Motivation
What if we associate tags with semantic entities?
We can avoid the aforementioned pitfalls.
(Figure: concepts from http://morpheus.cs.umbc.edu/aks1/ontosem.owl: #non-work-activity, #organization, #special-occasion, #political-entity, #party, #Celebration, #political-party, #Coalition, #federation, #Birthday, #Anniversary)
(Example tag sets: “uk, tories, party, conservative, speech” and “party, balloons, colors, bar, crowd”)

State of the Art: Semantic Grounding of Cross-Lingual Folksonomies
Garcia HA, Corcho O, Alani H, Gómez-Pérez A. Review of the state of the art: Discovering and Associating Semantics to Folksonomies. Knowledge Engineering Review (in press)
None of the analyzed approaches deals with multilingual tags.

Semantic Grounding of Cross-Lingual Folksonomies
MSR: a Multilingual Sense Repository based on Wikipedia and enriched with semantic information taken from DBpedia.
(Figure: each sense is stored with its terms and their frequency)
  Banco / Bank: http://dbpedia.org/resource/Bank
  Banco, Cardumen / Swarm: http://dbpedia.org/resource/Swarm
  Banco de Arena / Sandbank: http://dbpedia.org/resource/SandBank

Semantic Grounding of Cross-Lingual Folksonomies
Sem4Tags: A process for Associating Semantics to Tags.
(Figure labels: Dinero, Calle, Santander, Money, Madrid, Atm, cajero, Europe, Euro, Finance, Central bank, awesomePic, Nikon ..; Bank, Banco; http://dbpedia.org/resource/Bank)

Semantic Grounding of Cross-Lingual Folksonomies
Disambiguation activity
The candidate senses and the tag context are represented as vectors.
The vector components are the set of most frequent terms in each Wikipedia page representing a sense.
For each sense, the values of the vector are calculated using TF-IDF.
For the tag context, the value in each position is 1 if the corresponding term appears in the tag context and 0 otherwise.
The tag context vector is compared against each sense vector using the cosine of the angle as similarity measure.
The most similar sense to the tag context is selected as the one representing the meaning of the analyzed tag.
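The disambiguation steps above can be sketched in a few lines of Python. This is a minimal illustration, not the Sem4Tags implementation. The toy sense texts stand in for Wikipedia pages, the tokenisation is a plain whitespace split, and the TF-IDF variant (raw term count times log(N/df) + 1) is an assumption, since the slides do not specify the exact weighting. All function names are hypothetical.

```python
import math
from collections import Counter


def top_terms(text, k):
    """Return the k most frequent terms of a text with their raw counts."""
    counts = Counter(text.lower().split())
    return dict(counts.most_common(k))


def build_sense_vectors(sense_texts, k=5):
    """Build one TF-IDF vector per sense over the union of the top-k terms."""
    term_counts = {sense: top_terms(text, k) for sense, text in sense_texts.items()}
    vocabulary = sorted({t for counts in term_counts.values() for t in counts})
    n_senses = len(term_counts)
    vectors = {}
    for sense, counts in term_counts.items():
        vector = []
        for term in vocabulary:
            df = sum(1 for c in term_counts.values() if term in c)
            idf = math.log(n_senses / df) + 1.0  # assumed smoothing, not from the slides
            vector.append(counts.get(term, 0) * idf)
        vectors[sense] = vector
    return vocabulary, vectors


def context_vector(tag_context, vocabulary):
    """Binary vector: 1 if the vocabulary term appears in the tag context, else 0."""
    context = {t.lower() for t in tag_context}
    return [1.0 if term in context else 0.0 for term in vocabulary]


def cosine(a, b):
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def disambiguate(tag_context, sense_texts, k=5):
    """Return the sense most similar to the tag context, plus all scores."""
    vocabulary, vectors = build_sense_vectors(sense_texts, k)
    ctx = context_vector(tag_context, vocabulary)
    scores = {sense: cosine(ctx, vec) for sense, vec in vectors.items()}
    return max(scores, key=scores.get), scores


# Toy stand-ins for the Wikipedia pages of two candidate senses.
SENSE_TEXTS = {
    "Bank": "bank money finance money account bank loan",
    "Sandbank": "sand sea sand tide beach sand coast",
}

if __name__ == "__main__":
    best, scores = disambiguate(["money", "madrid", "finance"], SENSE_TEXTS)
    print(best, scores)
```

Context tags that are not among the frequent terms of any sense (here "madrid") simply contribute nothing to the context vector.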

Semantic Grounding of Cross-Lingual Folksonomies
Disambiguation activity
We use the information of the Wikipedia default sense for a term:
  Sim(TagContext, Sense_i) = λ*Cosine + β*defaultSense
  We experimentally defined β = 0.2 and λ = 0.8
We attempt to use DBpedia semantic information in the disambiguation activity:
  Sim(TagContext, Sense_i) = λ*Cosine + β*defaultSense + δ*SemanticInfo
  Studies have shown that tags in Flickr refer mainly to: locations, time, given names, and photography-related subjects, among others.
  We use DBpedia and YAGO relations to classify the senses according to these categories.
  However, we found that not all the senses related to a term have the same amount of relations (e.g., Madrid is not a city).
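A small sketch of the first combined score, using the weights reported on the slide (λ = 0.8, β = 0.2). It assumes that defaultSense is 1 when the candidate is the Wikipedia default sense for the term and 0 otherwise; the slides do not spell this out, so treat it as an assumption. The function names and the cosine values in the example are hypothetical.

```python
LAMBDA = 0.8  # weight of the cosine similarity (value from the slide)
BETA = 0.2    # weight of the default-sense indicator (value from the slide)


def combined_similarity(cosine_score, is_default_sense, lam=LAMBDA, beta=BETA):
    """Sim = lambda * Cosine + beta * defaultSense, with defaultSense in {0, 1} (assumed)."""
    default_sense = 1.0 if is_default_sense else 0.0
    return lam * cosine_score + beta * default_sense


def rank_senses(cosine_scores, default_sense):
    """Rank candidate senses by the combined score, best first."""
    scored = {
        sense: combined_similarity(score, sense == default_sense)
        for sense, score in cosine_scores.items()
    }
    return sorted(scored.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Made-up cosine values for illustration only.
    print(rank_senses({"Bank": 0.30, "Sandbank": 0.40}, default_sense="Bank"))
```

With these made-up values the default sense wins despite its lower cosine, which shows how β acts as a prior towards the most common meaning.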

Let’s try it
http://robinson.dia.fi.upm.es:8080/SemanticTagsWebApp/index.jsp
What does “bernabeu” mean if its context is…?
estadio, madrid, fútbol

Experiment
Baseline: Directly associate tags with DBpedia resources
Look for spaces and replace them with '_'.
For tags in English:
  Create a URI of the form http://en.wikipedia.org/wiki/tag
  Query DBpedia using the http://xmlns.com/foaf/0.1/page relation
For tags in Spanish:
  Create a URI of the form http://es.wikipedia.org/wiki/tag
  Query DBpedia using the http://dbpedia.org/property/wikipage-es relation
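The baseline can be sketched as two string operations: build the Wikipedia URI for the tag, then look for the DBpedia resource linked to that page. The sketch below assumes the URIs on the slide are the usual Wikipedia, FOAF and DBpedia ones. The shape of the SPARQL query and the function names are hypothetical, and the code only builds the query text, it does not contact DBpedia or URL-encode special characters.

```python
# Wikipedia page prefixes and the DBpedia relations named on the slide.
WIKIPEDIA_PREFIX = {
    "en": "http://en.wikipedia.org/wiki/",
    "es": "http://es.wikipedia.org/wiki/",
}
PAGE_RELATION = {
    "en": "http://xmlns.com/foaf/0.1/page",
    "es": "http://dbpedia.org/property/wikipage-es",
}


def wikipedia_uri(tag, language):
    """Replace spaces with '_' and append the tag to the Wikipedia prefix."""
    return WIKIPEDIA_PREFIX[language] + tag.strip().replace(" ", "_")


def baseline_query(tag, language):
    """Build a SPARQL query (assumed shape) for resources linked to the tag's page."""
    page = wikipedia_uri(tag, language)
    relation = PAGE_RELATION[language]
    return f"SELECT ?resource WHERE {{ ?resource <{relation}> <{page}> }}"


if __name__ == "__main__":
    print(wikipedia_uri("sand bank", "en"))
    print(baseline_query("banco de arena", "es"))
```

Because the tag is used verbatim, this approach only finds a resource when the tag happens to match a page title, which is consistent with the low coverage reported for the baseline.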

Experiment
Approaches:
  Baseline: Selection of the sense without a disambiguation activity.
  Sem4Tags: For each sense we use the whole Wikipedia article as source for frequent terms.
  Sem4TagsAC: Same as Sem4Tags including the selection of the Active Context.
  Sem4TagsAbs: For each sense we use the Wikipedia article abstract (extracted from DBpedia) as source for frequent terms.
  Sem4TagsAbsAC: Same as Sem4TagsAbs including the selection of the Active Context.

Experiment
Initial Data Set
  Wide range of users, photos, and tags.
  764 photos uploaded by 719 users to Flickr that have been tagged with tags describing tourist places in Spain
  12.4 (+/- 7.85) tags per photo
  9484 tagging activities (TAS): <user,photo,tag>
  4135 distinct tags were used
Processed Data Set
  From each photo we processed on average 2 tags
  2260 tagging activities (TAS)

Experiment
Evaluation Campaign
41 evaluators
Evaluate the semantic associations produced by each approach: <user; tag; photo; DBpedia resource; language>
Three different evaluators evaluated each semantic association.
Questions:
  Able to identify the tag meaning (Known or Unknown)
  Tag language (English, Spanish, Both, Other)
  Whether the tag corresponds to a named entity
  According to the identified tag language, they evaluate the semantic association as Highly related, Related, or Not related.

Experiment
Results
Evaluators identified the semantics of 87% of TAS (known)
62.6% of TAS were considered in English
87.7% of TAS were considered in Spanish
Agreement among evaluators (Fleiss’ kappa statistics):
  k = 0.76 for highly related
  k = 0.71 for the related/highly related case

Experiment
Precision and Recall for Highly Relevant results
(Charts: English, Spanish)

Experiment
Conclusions
Baseline obtained high precision; however, it was able to find semantic resources for just a fraction of the analyzed data set:
  Baseline: 27.7% in English and 19.4% in Spanish.
  Sem4Tags: 79.1% in English and 81.4% in Spanish.
All approaches obtained better precision with named entities than with unnamed entities.
Sem4Tags and Sem4TagsAC are the approaches that obtained the best results in terms of Precision and Recall.
Sometimes Sem4TagsAC obtains better P@1 values but the improvements are supported by no or low statistical evidence.
Sem4TagsAbs and Sem4TagsAbsAC are clearly the worst approaches.

Outline
Introduction: what I will be talking about and what I will not
Sem-UPnP-Grid: sharing multimedia content across homes through semantic annotations
Sem4Tags: tag disambiguation in Flickr
M3 Ontology (only if time permits): a semantic backbone for our multimedia-related work
Conclusions and outlook

There are already multimedia ontologies
MDS Upper Layer represented in RDFS
  2001: Hunter
  Later on: link to the ABC upper ontology
MDS fully represented in OWL-DL
  2004: Tsinaraki et al., DS-MIRF model
MPEG-7 fully represented in OWL-DL
  2005: Garcia and Celma, Rhizomik model
  Fully automatic translation of the whole standard
MDS and Visual parts represented in OWL-DL
  2007: Arndt et al., COMM model
  Re-engineering MPEG-7 using DOLCE design patterns
However, their requirements are not always clear nor have they been developed with clear methodological guidelines.
