A Unified Music Recommender System Using
Users’ Listening Habits and Semantics of Tags
Hyon Hee Kim
Department of Statistics and Information Science,
Dongduk Women’s University
Outline
• Motivation & Objectives
• Overview of the System
• Generation of User Profiles
• A Unified Music Recommendation
• Performance Evaluation
• Related Work
• Conclusions and Future Work
Motivation (1/3)
• In a Social Music Site
– Music recommendation is essential.
– Music recommendation differs from recommendation of other products
• Explicit information : Rating system
• Implicit information : the number of plays
• Listening habits-based User Profiling
– Cold Start Problem
• New users with little information
• New items with only a few ratings
– Data Sparsity Problem
• The available data is very small relative to the number of music items
Motivation (2/3)
[Example tags: Classic rock, british, pop, rock]
• Collaborative Tagging
– A tool for users to represent their preferences about web resources
– Users add freely chosen keywords to web resources
– Tag data can be used for user profiling in personalized recommender systems
• Tag-based User Profiling
– Tags are easy to add, even without listening to the music
– Tags are semantically meaningful
Motivation (3/3)
• In the case of last.fm
• Factual Tags
– 85% of tags
– genre, region, instrumentation
• Emotional Tags
– 10% of tags
– opinion, sentiment, mood
• Personal Tags
– 5% of tags
– to organize, to browse, etc.
Objectives
• A Novel Approach to Music Recommendation
– Combining listening habits and semantics of tags
• Using a Tag Ontology and an Emotion Ontology
– UniTag: Resolving semantic ambiguity of tags
– UniEmotion: Assigning weighted values to the emotional tags
→ Semantically Enhanced Music Recommendation
Overview of the System
Outline
• Motivation & Objectives
• Overview of the System
• Tag-based User Profiling
– Preprocessing of tags
– Algorithms for generating user profiles
– Preliminary experimental results
• A Unified Music Recommendation
• Performance Evaluation
• Related Work
• Conclusions and Future Work
Preprocessing of Tags (1/3)
• Tags have no predefined vocabulary or term hierarchy
• Problems of tag data
– Synonymy
• Different words represent the same meaning
• E.g., hiphop, hip-hop, hip hop; R & B, Rhythm and Blues, Blues
– Polysemy
• A single word has multiple meanings
• E.g., French => French rock, French pop, French artist
– Spelling variants
• Misspellings
• Foreign-language terms
Preprocessing of Tags (2/3)
• Tag Ontology
– Tags, users, items
• UniTag Ontology
– uniTag:Users
• uniTag:userID, uniTag:hasAdded, uniTag:hasAddedTo
– uniTag:Items
• uniTag:itemID
– uniTag:Tags
• uniTag:tagID, uniTag:tagName, uniTag:RTag, uniTag:subTag
• uniTag:Rtags {rock, hiphop, electronic, metal, jazz, rap, funk, folk, blues, reggae}
• uniTag:classifiedAs, uniTag:isKindOf, uniTag:istheSameAs, uniTag:tagVariation
Preprocessing of Tags (3/3)
• Rules for reasoning about prefixes
– French rock, progressive rock, post rock => rock
Tag(?t) ^ tagPrefix(?t, ?p) ^ Prefix(?p) ^ subTag(?t, ?s) ^ Rtags(?s) -> classifiedAs(?t, ?s)
• Rules for reasoning with expert knowledge
– Soul => rhythm and blues, and rhythm and blues => blues, so Soul => blues
Tag(?t) ^ isKindOf(?t, ?A) ^ isKindOf(?A, ?B) -> isKindOf(?t, ?B)
• Rules for reasoning about synonyms
– Hip-hop, hiphop => hip hop
Tag(?t) ^ tagVariation(?t, ?R) ^ istheSameAs(?t, ?s) -> tagVariation(?s, ?R)
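The three rule families can be illustrated with a minimal Python sketch. It is not the UniTag ontology itself: the dictionaries `SAME_AS` and `IS_KIND_OF` are hypothetical stand-ins for the ontology facts, the function name `classify` is invented for illustration, and mapping the hip-hop variants onto the representative tag `hiphop` is an assumption.

```python
# Hypothetical in-memory stand-ins for UniTag facts (assumptions, not the real ontology).
RTAGS = {"rock", "hiphop", "electronic", "metal", "jazz",
         "rap", "funk", "folk", "blues", "reggae"}

# Assumed synonym facts (istheSameAs): spelling variant -> canonical tag.
SAME_AS = {"hip-hop": "hiphop", "hip hop": "hiphop"}

# Assumed expert knowledge (isKindOf): tag -> broader tag.
IS_KIND_OF = {"soul": "rhythm and blues", "rhythm and blues": "blues"}


def classify(tag):
    """Return the representative tag for `tag`, or None if no rule applies."""
    t = tag.strip().lower()
    t = SAME_AS.get(t, t)                 # synonym rule
    seen = set()
    while t not in RTAGS and t in IS_KIND_OF and t not in seen:
        seen.add(t)                       # guard against cyclic facts
        t = IS_KIND_OF[t]                 # transitive isKindOf rule
    if t in RTAGS:
        return t
    words = t.split()
    last = words[-1] if words else ""     # prefix rule: "french rock" -> "rock"
    return last if last in RTAGS else None


print(classify("French rock"), classify("Soul"), classify("Hip-hop"))
```

A rule engine would derive these facts declaratively; the loop above only mimics the transitive closure for a single tag.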
Algorithm for Generating User Profiles (1/2)
Algorithm 1. Generation of a Tag-based Profile
Input: set of representative tags Tr, set of a user’s tags Tu
Output: frequency of each representative tag for the user, FTr
var RTags[] = {rock, hiphop, electronic, metal, jazz, rap, funk, folk, blues, reggae}
var tagFrequency[] = {0, ..., 0}
var RTag = null
while ∃ next tag t in Tu do
  RTag = FindRTag(t)
  for each index i of RTags do
    if RTag == RTags[i] then
      tagFrequency[i] = tagFrequency[i] + 1
endwhile

        rock  hiphop  electronic  metal  jazz  rap  funk  folk  blues  reggae
user1      6       2           2      3     2    4     3     1      1       1
user2      5       0           0      0     0    0     0     0      1       0
user3      2       2           1      1     1    1     2     0      0       1
user4     10       1           0      1     2    0     2     3      3       1
user5      1       4           0      0     0    4     1     0      0       0
Table 1. An example of tag-based profiles
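Algorithm 1 amounts to counting, per user, how many tags map to each representative tag. A minimal Python sketch follows, assuming a simplified `find_rtag` helper (a hypothetical stand-in for FindRTag that only matches the last word of a tag); the real lookup would go through the UniTag ontology.

```python
RTAGS = ["rock", "hiphop", "electronic", "metal", "jazz",
         "rap", "funk", "folk", "blues", "reggae"]


def find_rtag(tag):
    """Hypothetical stand-in for FindRTag: match on the last word only."""
    words = tag.lower().split()
    return words[-1] if words and words[-1] in RTAGS else None


def tag_profile(user_tags):
    """Count the user's tags per representative tag, in RTAGS order."""
    freq = [0] * len(RTAGS)
    for tag in user_tags:
        rtag = find_rtag(tag)
        if rtag is not None:              # tags with no representative tag are skipped
            freq[RTAGS.index(rtag)] += 1
    return freq


profile = tag_profile(["French rock", "post rock", "jazz", "british"])
print(dict(zip(RTAGS, profile)))
```

The same counting loop applies to the track-based profile once each track is mapped to a genre.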
Algorithm for generating User Profiles (2/2)
Algorithm 2. Generation of a Track-based Profile
Input: set of a user’s tracks TRu, set of representative tags Tr
Output: number of the user’s tracks for each representative musical genre, Tn
var RTags[] = {rock, hiphop, electronic, metal, jazz, rap, funk, folk, blues, reggae}
var numTrack[] = {0, ..., 0}
var RTrack = null
while ∃ next track tr in TRu do
  RTrack = FindGenre(tr)
  for each index i of RTags do
    if RTrack == RTags[i] then
      numTrack[i] = numTrack[i] + 1
endwhile

        rock  hiphop  electronic  metal  jazz  rap  funk  folk  blues  reggae
User1     65     176           5      4     0  168     0     3      0       0
User2    411       8          11    109     3    5     8     1      0       0
User3    157       7          11     10     6    2     1    39      4       2
User4    257      20           9     18     2    5     0     9      0       0
User5    110     277          15      8     6   85    10     3      2       7
Table 2. An example of track-based profiles
Preliminary Experimental Results (1/3)
• 1,000 user data set from Last.fm
– Users, tags, music items
• Standardization
– To remove extensive preference
• K-Means clustering algorithm
– Canopy Clustering
– 6 centroid points and 6 clusters
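The standardization step can be sketched as a per-genre z-score, which is one common reading of "standardization"; the slides do not state the exact formula, so the use of the population standard deviation and the function name `standardize` are assumptions. The sample rows are the first three genre columns of the track-based example profiles.

```python
from statistics import mean, pstdev


def standardize(rows):
    """Z-score each column: subtract the column mean, divide by its std dev."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) for c in cols]       # population std dev (an assumption)
    return [[(x - m) / s if s > 0 else 0.0
             for x, m, s in zip(row, mus, sds)]
            for row in rows]


# rock, hiphop, electronic counts for five example users
rows = [[65, 176, 5], [411, 8, 11], [157, 7, 11], [257, 20, 9], [110, 277, 15]]
z = standardize(rows)
print([round(v, 3) for v in z[0]])
```

After this step every genre column has mean 0, so heavy listeners no longer dominate the distance computations used by the clustering.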
Preliminary Experimental Results (2/3)
           X1      X2      X3      X4      X5      X6      X7      X8      X9      X10
Cluster1   0.241   1.472   0.626   0.130   1.267   1.621   2.168   0.274   1.078   0.381
Cluster2   2.171   0.032   0.517   3.052   0.011  -0.030   0.328   1.533   1.245   0.162
Cluster3  -0.206  -0.273  -0.517  -0.178  -0.180  -0.294  -0.233  -0.171  -0.204  -0.136
Cluster4  -0.341   0.660  -0.459  -0.284  -0.208   1.178  -0.179  -0.321  -0.166   0.273
Cluster5  -0.074  -0.155   1.320  -0.230  -0.115  -0.261  -0.209  -0.070  -0.172  -0.071
Cluster6   2.815   7.640   5.168  -0.136   9.254   6.135   7.000   4.286   4.421   5.254
Table 3. Values of Centers of Tag-based Profiles

           X1      X2      X3      X4      X5      X6      X7      X8      X9      X10
Cluster1  -0.411   0.495   0.406  -0.338   1.565   0.131   1.632  -0.135   0.147   0.812
Cluster2   0.200  -0.444   0.007  -0.341   0.907  -0.468  -0.288   2.617   1.097   0.020
Cluster3  -0.897   1.651  -0.539  -0.442  -0.213   1.836   0.059  -0.507  -0.415   0.034
Cluster4   1.925  -0.590  -0.404   0.852  -0.264  -0.491   0.655  -0.002   2.850  -0.108
Cluster5   0.914  -0.557  -0.216   0.794  -0.296  -0.511  -0.297   0.014  -0.157  -0.147
Cluster6  -0.472  -0.327   0.380  -0.373  -0.184  -0.371  -0.241  -0.205  -0.300  -0.093
Table 4. Values of Centers of Track-based Profiles
• Clustering Validity
– Inter-cluster distances
– Distances between all pairs of centroids using cosine distance measure
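With 6 centroids there are 6 × 5 / 2 = 15 centroid pairs, which matches the sample size N = 15 used in the t-test. The sketch below computes those pairwise distances, assuming cosine distance is defined as 1 minus cosine similarity (the slides do not spell out the formula) and using toy 2-dimensional centroids rather than the reported ones.

```python
from itertools import combinations
from math import sqrt


def cosine_distance(a, b):
    """1 - cosine similarity of two non-zero vectors (assumed definition)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def inter_cluster_distances(centroids):
    """Distances between all unordered pairs of centroids."""
    return [cosine_distance(a, b) for a, b in combinations(centroids, 2)]


# Six toy centroids (illustrative values only)
toy = [[1, 0], [0, 1], [1, 1], [2, 1], [1, 3], [-1, 2]]
dists = inter_cluster_distances(toy)
print(len(dists), round(sum(dists) / len(dists), 4))
```

Larger mean inter-cluster distance indicates better separated clusters, which is the quantity compared between the two profile types.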
Preliminary Experimental Results (3/3)
– T-test
• Mean of inter-cluster distances of tag-based profiles
• Mean of inter-cluster distances of track-based profiles
                       N    Mean     Std Dev   t      p-value
Tag-based profiles     15   0.8325   0.6834    2.55   0.0165
Track-based profiles   15   0.3785   0.0885
Table 5. T-test result for the means of inter-cluster distances
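The reported t statistic can be checked from the summary statistics alone. The sketch below assumes a two-sample t-test with pooled variance; the slides do not say which variant was used, but with equal group sizes the pooled and Welch statistics coincide. The function name `pooled_t` is illustrative.

```python
from math import sqrt


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Two-sample t statistic with pooled variance; returns (t, degrees of freedom)."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    std_err = sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / std_err, df


# Summary statistics taken from Table 5
t, df = pooled_t(0.8325, 0.6834, 15, 0.3785, 0.0885, 15)
print(round(t, 2), df)
```

Under these assumptions the statistic agrees with the tabulated value of 2.55 up to rounding, on 28 degrees of freedom.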
Outline
• Motivation & Objectives
• Overview of the System
• Generation of User Profiles
• A Unified Music Recommendation
– UniEmotion Ontology
– Generation of User Profiles
– Music Recommendation Algorithm
• Performance Evaluation
• Related Work
• Conclusions and Future Work
UniEmotion Ontology (1/5)
[Plutchik’s model]
UniEmotion Ontology (2/5)
• Definition of the intensity of emotional tags
• SentiWordNet, http://sentiwordnet.isti.cnr.it/
[Example SentiWordNet scores; P: positive, O: objective, N: negative]
P: 0.625, O: 0.25, N: 0.125
P: 0.375, O: 0.625, N: 0
P: 1.0, O: 0, N: 0
UniEmotion Ontology (3/5)
• Intensity of emotional tags
– Strong
• Positive value >= 0.75 or Negative value>= 0.75
– Middle
• 0.25 <= Positive value < 0.75 or
• 0.25 <= Negative value < 0.75
– Weak
• Positive value < 0.25 and Negative value < 0.25
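The intensity rule can be sketched as a small function over a tag's SentiWordNet positive and negative scores. Checking the Strong condition first is an assumption that makes a score of exactly 0.75 count as strong; the function name `intensity` is illustrative.

```python
def intensity(pos, neg):
    """Classify a tag as 'strong', 'middle' or 'weak' from its polarity scores."""
    strongest = max(pos, neg)
    if strongest >= 0.75:        # Strong: either score is at least 0.75
        return "strong"
    if strongest >= 0.25:        # Middle: either score is at least 0.25
        return "middle"
    return "weak"                # Weak: both scores are below 0.25


# Score triples shown on the SentiWordNet slide (P, N values only)
print(intensity(0.625, 0.125), intensity(0.375, 0.0), intensity(1.0, 0.0))
```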
UniEmotion Ontology (4/5)
• Assigning the weights to the tags
– Factual tags: 1
– Positive tags
• Strong: 2.5
• Middle: 2
• Weak: 1.5
– Negative tags
• Strong: -2.5
• Middle: -2
• Weak: -1.5
• Final score of an item => sum of the weights
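As a worked illustration of the weighting scheme, the sketch below sums the weights of an item's tags. The representation of a tag as a `(kind, intensity)` pair and the name `item_score` are assumptions made for the example; only the numeric weights come from the slide.

```python
# Weights from the slide; the (kind, intensity) tag encoding is an assumption.
EMOTION_WEIGHTS = {
    ("positive", "strong"): 2.5,
    ("positive", "middle"): 2.0,
    ("positive", "weak"): 1.5,
    ("negative", "strong"): -2.5,
    ("negative", "middle"): -2.0,
    ("negative", "weak"): -1.5,
}
FACTUAL_WEIGHT = 1.0


def item_score(tags):
    """Sum the weights of all tags attached to one item."""
    total = 0.0
    for kind, level in tags:
        if kind == "factual":
            total += FACTUAL_WEIGHT
        else:
            total += EMOTION_WEIGHTS[(kind, level)]
    return total


# One factual tag, one strongly positive tag, one weakly negative tag
score = item_score([("factual", None), ("positive", "strong"), ("negative", "weak")])
print(score)  # 1 + 2.5 - 1.5
```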
UniEmotion Ontology (5/5)
• Two classes
– UniEmotion:Positive
• Emotional tags belonging to the positive emotional categories
• trust, surprise, anticipation, and happiness
– UniEmotion:Negative
• Emotional tags belonging to the negative emotional categories
• disgust, anger, fear, and sadness
• Two properties
– UniEmotion:Intensity
• Specifying the intensity of tags
– UniEmotion:Weight
• Specifying the weight of tags
Generation of User Profiles (1/2)
1. Listening habits-based User Profiles
– U1 = {u1, u2, …, um}, I1 = {i1, i2, …, in}
– <u, i, n>
• n: number of plays
2. Tag score-based User Profiles
– U2 = {u1, u2, …, um}, I2 = {i1, i2, …, in}
– <u, i, s>
• s: score of tags assigned by the UniEmotion ontology
3. Hybrid User Profiles
– U3 = {u1, u2, …, um}, I3 = I1 ∩ I2
– <u, i, m>
• m = α * n + (1 − α) * s, with α = 0.5
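The hybrid profile can be sketched for a single user as follows. Representing each profile as a dictionary from item to value is an assumption, as is the name `hybrid_profile`; the slides also do not say whether play counts and tag scores are rescaled before mixing, so none is applied here.

```python
def hybrid_profile(plays, tag_scores, alpha=0.5):
    """Combine play counts and tag scores on the items present in both profiles."""
    shared = plays.keys() & tag_scores.keys()      # I3 = I1 ∩ I2
    return {item: alpha * plays[item] + (1 - alpha) * tag_scores[item]
            for item in shared}


plays = {"track_a": 10, "track_b": 4}              # n: number of plays
tag_scores = {"track_a": 3.5, "track_c": 1.0}      # s: UniEmotion tag score
print(hybrid_profile(plays, tag_scores))
```

Only `track_a` appears in both inputs, so it is the only item in the hybrid profile, with value 0.5 × 10 + 0.5 × 3.5.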
Generation of User Profiles (2/2)
1. Listening habits-based User profiles
2. Tag score-based User profiles
3. Hybrid User profiles
Music Recommendation Algorithm (1/2)
• Finding Similar Users
– Pearson Correlation Similarity
• Calculating scores of items
– Considering the similar users’ ratings
• Recommending top n items
Music Recommendation Algorithm (2/2)
Input: a set of user profiles UP
Output: a set of recommended items RI
1. For all yi ∈ U
   Compute a similarity s between x and yi.
2. Sort by similarity
3. Select top n neighbors
4. [step missing in the source]
5. For all [expression missing in the source]
   Compute a similarity t between x and [expression missing in the source]
   For all [expression missing in the source]
   preference += t * pref
6. Rank by preference
7. Select top n items
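The overall flow (Pearson similarity, neighbor selection, similarity-weighted preference sums, top-n ranking) can be sketched in plain Python. This is not the Apache Mahout implementation used in the system: the function names, the dictionary-based profiles, the exclusion of non-positively correlated neighbors, and the restriction to items the target user has not yet seen are all assumptions made for illustration.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation over items both users have; 0.0 when undefined."""
    common = sorted(a.keys() & b.keys())
    if len(common) < 2:
        return 0.0
    xs = [a[i] for i in common]
    ys = [b[i] for i in common]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def recommend(target, profiles, n_neighbors=2, n_items=3):
    """Recommend unseen items, scored by similarity-weighted neighbor preferences."""
    x = profiles[target]
    sims = sorted(((pearson(x, p), u) for u, p in profiles.items() if u != target),
                  reverse=True)
    neighbors = [(s, u) for s, u in sims[:n_neighbors] if s > 0]
    scores = {}
    for t, u in neighbors:
        for item, pref in profiles[u].items():
            if item not in x:                       # only items new to the target
                scores[item] = scores.get(item, 0.0) + t * pref
    ranked = sorted(scores, key=lambda i: (-scores[i], i))
    return ranked[:n_items]


profiles = {
    "x":  {"a": 5, "b": 3, "c": 1},
    "y1": {"a": 4, "b": 3, "c": 2, "d": 5},
    "y2": {"a": 1, "b": 3, "c": 5, "e": 4},
}
print(recommend("x", profiles))
```

In the toy data, `y1` is perfectly correlated with `x` and `y2` is perfectly anti-correlated, so only `y1` contributes and item `d` is recommended.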
Performance Evaluation
• Implementation Environment: Apache Web Server
– User database : MySQL 5.0
– Listening habits collector, tag score generator: PHP
– Recommendation Engine: Apache Mahout
– UniTag and UniEmotion Ontology: JDK6.0
• Experimental Data
– Data on 1,000 users from Last.fm [http://mir.dcs.gla.ac.uk/]
– Containing 18,700 artists and 12,600 tags
– 70% training data, 30% test data
Performance Evaluation
• Evaluation Model
– Recommended items
• Items which users are interested in (True Positive, TP)
• Items which users are not interested in (False Positive, FP)
– Items which are not recommended
• Items which users are interested in (False Negative, FN)
• Items which users are not interested in (True Negative, TN)
– Precision P = TP / (TP + FP)
• # of correct recommendations / # of all recommended items
– Recall R = TP / (TP + FN)
• # of correct recommendations / # of preferred items
– F-measure F = 2 * P * R / (P + R)
• Harmonic mean of precision and recall
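The three measures can be computed from a recommended list and the set of items a user is actually interested in. The function name and the choice to return 0.0 when a denominator is zero are assumptions for this sketch.

```python
def precision_recall_f(recommended, relevant):
    """Return (precision, recall, F-measure) for one recommendation list."""
    recommended, relevant = set(recommended), set(relevant)
    tp = len(recommended & relevant)      # recommended and interesting
    fp = len(recommended - relevant)      # recommended but not interesting
    fn = len(relevant - recommended)      # interesting but not recommended
    p = tp / (tp + fp) if tp + fp else 0.0
    r = tp / (tp + fn) if tp + fn else 0.0
    f = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f


p, r, f = precision_recall_f(["a", "b", "c", "d"], ["a", "c", "e"])
print(round(p, 3), round(r, 3), round(f, 3))
```

Here TP = 2, FP = 2 and FN = 1, giving P = 1/2, R = 2/3 and F = 4/7.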
Experimental Results (1/3)
• Precision
[Figures: precision by number of similar users and by number of recommended items]
A: Listening habits-based approach
B: Tag-based approach
C: Hybrid approach
Experimental Results (2/3)
• Recall
[Figures: recall by number of similar users and by number of recommended items]
A: Listening habits-based approach
B: Tag-based approach
C: Hybrid approach
Experimental Results (3/3)
• F-measure
[Figures: F-measure by number of similar users and by number of recommended items]
A: Listening habits-based approach
B: Tag-based approach
C: Hybrid approach
Statistical Validation
• One-way ANOVA across three groups
– Method 1: listening habits-based approach
– Method 2: tag-based approach
– Method 3: hybrid approach
• Tukey Multiple Comparison Test
– Asymmetric distributions
• Log transformation
– Different letters (A, B, C) mark groups that differ significantly
Method                    1               2               3               F
Mean of log(precision)    -3.962 B        -4.036 B        -2.879 A        34.27***
Mean precision (SD)       0.020 (0.006)   0.020 (0.009)   0.068 (0.040)
N                         24              24              24
Table 1. Test for precision (***: p < 0.001)

Method                    1               2               3               F
Mean of log(recall)       -3.285 B        -4.099 C        -2.635 A        26.80***
Mean recall (SD)          0.044 (0.023)   0.019 (0.010)   0.093 (0.056)
N                         24              24              24
Table 2. Test for recall (***: p < 0.001)

Method                    1               2               3               F
Mean of log(F-measure)    -3.748 B        -4.117 C        -2.894 A        41.31***
Mean F-measure (SD)       0.024 (0.006)   0.018 (0.008)   0.06 (0.034)
N                         24              24              24
Table 3. Test for F-measure (***: p < 0.001)
Related Work
• MusicBox
– A personalized music recommender system based on social tags
– Third-order tensor model
– The method improves the recommendation quality
• Foafing the music
– Collecting music information in a semantic web environment
– User information, music information, concert information
– Recommendation of similar music items
• OntoEmotions
– An ontology of emotional categories covering the basic emotions
– Armeteo art portal
– New relations can be inferred by reasoning on the ontology of emotions
Conclusions
• Solution to Cold Start Problem
– It takes time to collect users’ listening habits.
– Adding tags is easily done
– Tags look like word-of-mouth
• Performance Enhancement
– Precision, Recall, F-measure
– Hybrid approach > listening habits-based approach, tag-based approach
Future Work
• Elaborating UniEmotion Ontology
– Emerging Internet slang
• Item Selection
– Product Network Analysis Considering Tags
– Analyzing short description
