MODELING ADOPTIONS AND THE STAGES OF THE DIFFUSION OF INNOVATIONS
Nicola Barbieri, Francesco Bonchi
Yahoo Labs, Barcelona, Spain
{barbieri,bonchi}@yahoo-inc.com
Yasir Mehmood
Pompeu Fabra University
yasir@yahoo-inc.com
Background 
• The spread of new ideas in a society is a complex process 
that, starting from a small fraction of the population, propagates 
over time through a diverse set of communication channels, 
potentially reaching a critical mass. 
• Rogers’ seminal work provides a unified tool for modeling 
diffusion processes. 
• His theory identifies five categories of people by considering 
the adoption time of each person with respect to the rest of the 
population.
Our contribution 
• Understanding the dynamics of such a complex process has 
potential implications in sociology, economics and marketing. 
• We study the data mining problem of modeling adoptions and 
the stages of the diffusion of an innovation: 
① Real-world items exhibit consistent differences in the way 
they diffuse; 
② The diffusion of different items may interest different 
segments of the market and the diffusion of an item can 
achieve different levels of success; 
③ Different items may exhibit different temporal patterns of 
diffusion.
MASD: a stochastic framework for modeling 
adoptions and the stages of diffusions 
• The process of diffusion is decomposed into a finite and 
ordered sequence of stages of adoption; 
• Early stages correspond to the introduction of an item in the market, 
while later ones correspond to the maturity phase of its life cycle. 
• In continuity with Rogers’ theory, users have different 
likelihoods of being involved in each stage. 
• Each stage is characterized by a rate which describes the 
relative speed of adoptions.
MASD: modeling procedure (1) 
• The density function for observing the n-th adoption, given 
the previous ones and the current stage, has two components: 
• the probability of observing the activation of the adopter u_{i,n}; 
• the density of this adoption occurring at time t_{i,n}. 
• We assume that, in each stage s_j, the temporal gaps between 
consecutive adoptions are explained by an exponential 
density function with rate λ_j.
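As a rough illustration of the two components listed above, the following sketch scores a single adoption inside one stage. It assumes that the two components combine as a product and that the temporal gap is measured from the previous adoption in the same trace; the function name and the dictionary of per-stage user probabilities are hypothetical and are not taken from the slides.

```python
import math


def adoption_log_density(user, gap, user_probs, rate):
    """Log-density of one adoption within a single stage.

    Assumption (not confirmed by the slides): the density is the product
        P(user | stage) * rate * exp(-rate * gap)
    where `gap` is the time elapsed since the previous adoption.
    """
    if gap < 0 or rate <= 0:
        raise ValueError("gap must be non-negative and rate must be positive")
    p_user = user_probs.get(user, 0.0)
    if p_user <= 0.0:
        # A user with zero probability in this stage cannot be observed here.
        return float("-inf")
    # log of the exponential density: log(rate) - rate * gap
    return math.log(p_user) + math.log(rate) - rate * gap


# Hypothetical stage: two equally likely adopters, rate of 2 adoptions per time unit.
stage_users = {"alice": 0.5, "bob": 0.5}
print(adoption_log_density("alice", 1.0, stage_users, rate=2.0))
```

Working in log space keeps long traces numerically stable, since the per-adoption terms are simply summed.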
MASD: modeling procedure (2) 
• Key idea: we consider a 1-to-1 association between the 
stages of adoptions, and the states of a Markov model. 
• To enforce a clear sequentiality in the evolution of the 
stages of adoption, we introduce constraints on the transitions between states. 
• These structural constraints can be accommodated in a 
left-to-right hidden Markov model. 
A 4-state left-to-right HMM and its transition matrix
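The shape of such a transition matrix can be sketched as follows. In a left-to-right HMM a state is never revisited once it has been left, so every entry below the diagonal is zero. The concrete probabilities, the choice to allow forward jumps over several stages, and the absorbing last stage are illustrative assumptions, not values from the slides.

```python
def left_to_right_transitions(n_states, stay_prob=0.5):
    """Build a row-stochastic matrix with zeros below the diagonal.

    Assumptions for illustration only: each non-final stage keeps
    `stay_prob` on itself and spreads the rest uniformly over all later
    stages; the final stage is absorbing.
    """
    matrix = []
    for i in range(n_states):
        row = [0.0] * n_states
        if i == n_states - 1:
            row[i] = 1.0  # last stage: nowhere further to go
        else:
            row[i] = stay_prob
            n_forward = n_states - 1 - i
            for j in range(i + 1, n_states):
                row[j] = (1.0 - stay_prob) / n_forward
        matrix.append(row)
    return matrix


def is_left_to_right(matrix):
    """True when no transition goes back to an earlier stage."""
    return all(matrix[i][j] == 0.0 for i in range(len(matrix)) for j in range(i))


for row in left_to_right_transitions(4):
    print(row)
```

Because zero entries stay zero under the usual re-estimation updates of EM for HMMs, initializing the matrix in this form is enough to preserve the left-to-right structure during learning.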
MASD: Generative model & learning 
To better explain diffusion mechanisms (e.g. localized trends), 
we devise a learning framework that alternates two phases: 
① Cluster the diffusion traces in different groups; 
② For each group, fit the parameters of the MASD model by applying the 
Expectation Maximization algorithm. 
• The number of states for each cluster is automatically 
detected by relying on the Bayesian Information Criterion.
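The alternation between the two phases can be sketched as a small loop. To keep the example short and runnable, it uses a deliberately simplified stand-in for the MASD model: each cluster is described by a single exponential rate fitted in closed form, instead of a left-to-right HMM fitted with EM. All function names are hypothetical.

```python
import math


def fit_rate(traces):
    """Closed-form MLE of an exponential rate from all gaps in a group."""
    gaps = [g for trace in traces for g in trace]
    return len(gaps) / sum(gaps)


def log_likelihood(trace, rate):
    """Log-likelihood of a trace of inter-adoption gaps under one rate."""
    return sum(math.log(rate) - rate * g for g in trace)


def alternate(traces, initial_rates, max_iters=50):
    """Alternate between assigning traces to models and refitting models."""
    rates = list(initial_rates)
    assignment = [None] * len(traces)
    for _ in range(max_iters):
        # Phase 1: assign each trace to the model that explains it best.
        new_assignment = [
            max(range(len(rates)), key=lambda h: log_likelihood(t, rates[h]))
            for t in traces
        ]
        if new_assignment == assignment:
            break  # no trace changed cluster: converged
        assignment = new_assignment
        # Phase 2: refit the model of every non-empty group.
        for h in range(len(rates)):
            members = [t for t, a in zip(traces, assignment) if a == h]
            if members:
                rates[h] = fit_rate(members)
    return assignment, rates


# Two fast traces and two slow ones (gaps between consecutive adoptions).
traces = [[0.1, 0.1, 0.1], [0.2, 0.1, 0.15], [5.0, 6.0, 4.0], [4.5, 5.5, 5.0]]
print(alternate(traces, initial_rates=[1.0, 0.5]))
```

Replacing `fit_rate` and `log_likelihood` with HMM training and scoring would give the same control flow for the full model.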
MASD: Learning 
• We employ a simple instance of iterative prototype-based 
clustering. 
• Given a cluster C_h (set of traces) and a set of candidate 
models with different degrees of complexity, we select the model Θ*_h 
with the best Bayesian Information Criterion score.
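A minimal sketch of model selection by BIC is given below, using the standard definition BIC = k ln(n) - 2 ln(L), where lower is better. How the slides count the free parameters k and the sample size n (traces or single adoptions) is not stated, so both are plain inputs here, and the candidate values are made up for illustration.

```python
import math


def bic(log_likelihood, n_params, n_observations):
    """Standard Bayesian Information Criterion (lower is better)."""
    return n_params * math.log(n_observations) - 2.0 * log_likelihood


def select_model(candidates, n_observations):
    """Pick the candidate with the lowest BIC.

    `candidates` is a list of (name, log_likelihood, n_params) tuples.
    """
    return min(candidates, key=lambda c: bic(c[1], c[2], n_observations))


# Hypothetical candidates: more states fit better but cost more parameters.
candidates = [
    ("2 states", -120.0, 6),
    ("3 states", -100.0, 12),
    ("4 states", -99.0, 20),
]
print(select_model(candidates, n_observations=100)[0])
```

The penalty term grows with the number of parameters, so adding a state is accepted only when it improves the likelihood enough to pay for itself.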
Evaluation 
• Assess the accuracy, convergence and stability of the learning 
framework on synthetic data. 
• We generate synthetic data with planted clustering structure in two 
steps: 
① Construct a set of MASD models to generate clusters of data; 
② Sample lengths of adoption traces, pick a MASD model at random, and 
generate adoptions using the selected model. 
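The two-step generation of planted clusters could look like the sketch below. The dictionary layout of a model, the uniform distribution of trace lengths, the fixed starting stage, and the fact that a user may be drawn more than once are simplifying assumptions made for illustration.

```python
import random


def sample_trace(model, length, rng):
    """Generate `length` adoptions as (user, timestamp, stage) tuples.

    `model` is a hypothetical dict with per-stage `rates`, a row-stochastic
    `transitions` matrix and per-stage `user_probs` dictionaries.
    """
    state, clock, trace = 0, 0.0, []  # assumption: traces start in stage 0
    n_states = len(model["rates"])
    for _ in range(length):
        users, weights = zip(*model["user_probs"][state].items())
        user = rng.choices(users, weights=weights)[0]
        clock += rng.expovariate(model["rates"][state])  # exponential gap
        trace.append((user, clock, state))
        state = rng.choices(range(n_states), weights=model["transitions"][state])[0]
    return trace


def generate_dataset(models, n_traces, min_len, max_len, seed=0):
    """Return (planted_cluster_label, trace) pairs."""
    rng = random.Random(seed)
    data = []
    for _ in range(n_traces):
        label = rng.randrange(len(models))      # pick a model at random
        length = rng.randint(min_len, max_len)  # sample the trace length
        data.append((label, sample_trace(models[label], length, rng)))
    return data


model = {
    "rates": [5.0, 0.5],
    "transitions": [[0.7, 0.3], [0.0, 1.0]],
    "user_probs": [{"u1": 0.8, "u2": 0.2}, {"u2": 0.5, "u3": 0.5}],
}
print(generate_dataset([model], n_traces=2, min_len=3, max_len=5)[0])
```

Keeping the planted label next to each trace makes it possible to compare the recovered clustering against the ground truth afterwards.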
• Detecting and characterizing different patterns of adoption on 
real data (MovieLens & Yahoo Meme). 
• Predictive tasks. Given a considered time window: 
• Which users are more likely to adopt the item? 
• How many users will adopt the item?
Evaluation on synthetic data. 
(top) Accuracy in the “clustering reconstruction” task on synthetic data with planted clusters, measured in terms of Rand index 
(bottom) Convergence rate of the clustering/learning process, measured as the percentage of swaps observed at each iteration
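For reference, the Rand index used in the top panel can be computed as the fraction of trace pairs on which two clusterings agree (both place the pair together, or both place it apart). The sketch below is a direct pairwise implementation; the function name is ours.

```python
from itertools import combinations


def rand_index(labels_true, labels_pred):
    """Fraction of item pairs on which the two clusterings agree."""
    if len(labels_true) != len(labels_pred):
        raise ValueError("both labelings must cover the same items")
    pairs = list(combinations(range(len(labels_true)), 2))
    if not pairs:
        return 1.0  # fewer than two items: nothing to disagree on
    agreements = sum(
        (labels_true[i] == labels_true[j]) == (labels_pred[i] == labels_pred[j])
        for i, j in pairs
    )
    return agreements / len(pairs)


# Cluster names do not matter, only the grouping does.
print(rand_index([0, 0, 1, 1], [1, 1, 0, 0]))
```

The measure is invariant to relabeling of the clusters, which is why it suits the comparison between planted and recovered groups.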
Evaluation on real-world data (1) 
Adoption patterns, in the 5 clusters, MovieLens (top) and Yahoo Meme (bottom) 
MASD model on the Yahoo Meme dataset. The thickness of the arc indicates the 
strength of transition probability between stages. Numbers inside each state 
represent: (i) the index of the adoption stage; (ii) the average percentage of adoptions 
observed in the considered stage; (iii) the percentage of adoption traces that 
involve the stage.
Evaluation on real-world data (2) 
Each user is generally tied to few stages. 
Stages exhibit different patterns 
considering adoption rates. 
Different diffusion rates (log-scale) in the five clusters in MovieLens (top) and Yahoo Meme (bottom)
Experiments: real-world datasets (3) 
Top-5 traces that minimize the perplexity for the considered model: 
Adoption patterns, in the 5 clusters, MovieLens (top) and Yahoo Meme (bottom)
Predictive tasks: Prediction protocol 
• Learn parameters over the diffusion traces in the training-set. 
• For each trace in the test-set we evaluate prediction accuracy 
by varying the length of the observed data used for fitting. 
• Each partially observed trace in the test-set is associated with 
the model M that maximizes its log-likelihood. 
Fitting and evaluation 
• We find the current diffusion state by applying the Viterbi algorithm. 
• From the current state, we generate multiple samples 
according to the generative process. 
• The variable of interest (adoption of a user/overall size) is 
estimated on the simulated traces (1k).
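The simulation step can be sketched as a Monte Carlo estimate of how many further adoptions occur within a time window. The sketch assumes the current stage has already been decoded (for instance with Viterbi) and is passed in as `state`; the stopping rule based on the time window, the function names and the parameter values are illustrative assumptions.

```python
import random


def simulate_count(state, rates, transitions, window, rng):
    """Simulate one continuation and count adoptions inside the window."""
    clock, count = 0.0, 0
    while True:
        clock += rng.expovariate(rates[state])  # gap to the next adoption
        if clock > window:
            return count
        count += 1
        # Move to the next stage according to the transition row.
        state = rng.choices(range(len(rates)), weights=transitions[state])[0]


def expected_new_adoptions(state, rates, transitions, window,
                           n_samples=1000, seed=0):
    """Average the simulated counts over `n_samples` continuations."""
    rng = random.Random(seed)
    total = sum(
        simulate_count(state, rates, transitions, window, rng)
        for _ in range(n_samples)
    )
    return total / n_samples


# Hypothetical two-stage model: a fast early stage and a slow late stage.
rates = [2.0, 0.2]
transitions = [[0.8, 0.2], [0.0, 1.0]]
print(expected_new_adoptions(0, rates, transitions, window=3.0))
```

The same simulated continuations could also be used to estimate how often a given user appears, by sampling an adopter at each step and counting the traces that contain that user.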
Predictive tasks: Results 
Area under the curve (AUC) for predicting single user activations (k-NN baseline with k = 60, 80, 100): 

              60% partial observation       50% partial observation       40% partial observation
Time window   MASD   k=60   k=80   k=100    MASD   k=60   k=80   k=100    MASD   k=60   k=80   k=100
30 days       0.70   0.54   0.55   0.55     0.69   0.55   0.55   0.55     0.69   0.54   0.54   0.55
21 days       0.69   0.55   0.55   0.55     0.69   0.54   0.54   0.55     0.68   0.54   0.54   0.54
14 days       0.69   0.54   0.54   0.54     0.69   0.54   0.54   0.55     0.69   0.53   0.54   0.54
(a) MovieLens 

              60% partial observation       50% partial observation       40% partial observation
Time window   MASD   k=60   k=80   k=100    MASD   k=60   k=80   k=100    MASD   k=60   k=80   k=100
60 min.       0.83   0.73   0.74   0.75     0.82   0.72   0.73   0.74     0.82   0.72   0.74   0.74
30 min.       0.82   0.72   0.73   0.74     0.81   0.71   0.72   0.72     0.81   0.71   0.72   0.73
15 min.       0.81   0.68   0.70   0.70     0.80   0.69   0.70   0.71     0.81   0.66   0.68   0.69
(b) Yahoo Meme 

TABLE VI: Area under the curve (AUC) for predicting single user activations in different time windows. The baseline procedure is evaluated 
for three selections of k and three different splits of propagations. 
Mean absolute error (MAE) for predicting the final size of the diffusion trace (k-NN baseline with k = 60, 80, 100): 

              60% partial observation       50% partial observation       40% partial observation
Time window   MASD   k=60   k=80   k=100    MASD   k=60   k=80   k=100    MASD   k=60   k=80   k=100
30 days       3.42   3.90   3.90   3.92     3.71   3.90   3.91   3.92     4.61   5.14   5.17   5.17
21 days       2.61   3.02   3.02   3.03     2.88   3.02   3.03   3.03     3.61   3.96   3.97   3.98
14 days       1.93   2.22   2.23   2.23     2.16   2.23   2.23   2.23     2.69   2.90   2.91   2.91
(a) MovieLens 

              60% partial observation       50% partial observation       40% partial observation
Time window   MASD   k=60   k=80   k=100    MASD   k=60   k=80   k=100    MASD   k=60   k=80   k=100
60 min.       3.57   5.56   5.61   5.63     5.32   7.20   7.24   7.25     7.46   9.01   9.08   9.09
30 min.       3.01   4.65   4.69   4.71     4.66   6.13   6.17   6.15     6.69   7.82   7.89   7.93
15 min.       2.49   3.23   3.25   3.26     4.53   5.89   5.92   5.93     5.50   5.63   5.66   5.68
(b) Yahoo Meme
Conclusion and future work 
• We introduce MASD, a stochastic framework for modeling 
users’ adoptions and the different stages of the diffusion of 
innovations. 
• Our model focuses on the two main dimensions, users and rate 
of adoption. Learning is accomplished by fitting a left-to-right 
hidden Markov model. 
• The experimental evaluation over real-world data confirms the 
accuracy in detecting interesting patterns of adoption and in 
prediction scenarios. 
• Future work: account for social influence dynamics and stages 
of virality.
Thanks!

