Challenges in Software Ecosystems Research
Alexander Serebrenik, Eindhoven University of Technology, The Netherlands, @aserebrenik
Tom Mens, UMons, Belgium, @tom_mens
Software
ecosystems in
scientific literature
[Figure: publications per year, 1996 to 2014, y-axis 0 to 500; series: Scholar full text, DBLP titles]
Future challenges?
Definition of an ecosystem
Example of an ecosystem
Trends and challenges
164
authors of an article or a book
chapter on SECO, paper in
IWSECO, WEA or Big Systems 2014
141 authors with a valid email address
26* answered the survey
* response rate 18.4%, comparable with other surveys
Definition of an ecosystem
Respondent: “Defining everything as an
ecosystem. <…> The word is trend-ish and
it causes misunderstandings in the field.”
“The complex system of plant, animal, fungal, and
microorganism communities and their associated
non-living environment interacting as an ecological
unit. Ecosystems have no fixed boundaries”
<biological> | [Lungu 2008] | [Jansen et al. 2009] | [Manikas, Hansen 2013]
communities | software projects | actors | actors
environment | environment | shared market for software and services, shared platform | common technological platform
interaction | developed and evolve together | exchange of information, resources & artefacts | symbiotic relationships
Definition of an ecosystem
social, technical, economical
[Lungu 2008]
[Bosch, Bosch-Sijtsema 2009]
[*Moore 1993] [Jansen et al. 2009]
[Mitleton-Kelly 2003]
[Manikas, Hansen 2013]
Definition of an ecosystem
companies
app stores
OS foundations
programming
languages
operating
systems
forges & social
ecosystems
Example of an ecosystem
Based on the
literature
Definition of an ecosystem
Example of an ecosystem
Respondent: “Defining
everything as an
ecosystem. <…> The
word is trend-ish and it
causes misunderstandings
in the field.”
social, economical, technical
Different perspectives on the same artefacts or different
artefacts altogether?
Trends and challenges
26 survey answers Literature study
29 challenges
8 categories
“One challenge is to be able to characterize
the wealth of the community wrt the wealth
of the software components. What is the
impact of different collaboration and
development practices on the quality of the
ecosystem?”
Trends and challenges
ecosystem quality
socio-technical
SECOs may consist of many systems.
Analysing all these systems as a whole
may raise some technical problems, due to
the quantity of data to take into account.
data analytics
amount (volume)
large databases with comparable information about the details
of a large collection of ecosystems, so that any research could
be conducted in a repeatable and comparable way.
database of comparable info
reproducible research
Software Ecosystems
are/lead to Big Data
male
likes games
NYC
Privacy: digital
trace data
Privacy: surveys
Minority respondents
are easy to identify
Reproducibility vs privacy
Non-sensitive Sensitive
Zip Age Nationality Condition
1 13053 28 Russian Heart Disease
2 13068 29 American Heart Disease
3 13068 21 Japanese Viral Infection
4 13053 23 American Viral Infection
5 14853 50 Indian Cancer
6 14853 55 Russian Heart Disease
7 14850 47 American Viral Infection
8 14850 49 American Viral Infection
9 13053 31 American Cancer
10 13053 37 Indian Cancer
11 13068 36 Japanese Cancer
12 13068 35 American Cancer
Non-sensitive Sensitive
Zip Age Nationality Condition
1 130** <30 * Heart Disease
2 130** <30 * Heart Disease
3 130** <30 * Viral Infection
4 130** <30 * Viral Infection
5 1485* >40 * Cancer
6 1485* >40 * Heart Disease
7 1485* >40 * Viral Infection
8 1485* >40 * Viral Infection
9 130** 30-40 * Cancer
10 130** 30-40 * Cancer
11 130** 30-40 * Cancer
12 130** 30-40 * Cancer
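The trade-off between reproducibility and privacy in the generalised table can be checked mechanically. The sketch below is a minimal illustration, not part of the original slides: it copies the twelve generalised rows, computes the size of the smallest group sharing the same non-sensitive values (the k in k-anonymity), and lists the sensitive values per group. The function names are hypothetical.

```python
from collections import Counter

# Generalised rows copied from the table above:
# (zip, age, nationality, condition); the first three are non-sensitive.
rows = [
    ("130**", "<30", "*", "Heart Disease"),
    ("130**", "<30", "*", "Heart Disease"),
    ("130**", "<30", "*", "Viral Infection"),
    ("130**", "<30", "*", "Viral Infection"),
    ("1485*", ">40", "*", "Cancer"),
    ("1485*", ">40", "*", "Heart Disease"),
    ("1485*", ">40", "*", "Viral Infection"),
    ("1485*", ">40", "*", "Viral Infection"),
    ("130**", "30-40", "*", "Cancer"),
    ("130**", "30-40", "*", "Cancer"),
    ("130**", "30-40", "*", "Cancer"),
    ("130**", "30-40", "*", "Cancer"),
]


def k_anonymity(records, n_quasi=3):
    """Size of the smallest group of records sharing the same non-sensitive values."""
    groups = Counter(record[:n_quasi] for record in records)
    return min(groups.values())


def sensitive_values_per_group(records, n_quasi=3):
    """Map each non-sensitive combination to the set of sensitive values it contains."""
    result = {}
    for record in records:
        result.setdefault(record[:n_quasi], set()).add(record[n_quasi])
    return result


k = k_anonymity(rows)
per_group = sensitive_values_per_group(rows)
print("k =", k)
for group, conditions in per_group.items():
    print(group, sorted(conditions))
```

Every group contains four rows, yet the 30-40 group holds a single condition, so generalisation alone does not hide the sensitive value for those respondents.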
Are some challenges more
important than others?
Second survey
• Group A: respondents of the previous survey that
have provided their email addresses
• 26 answers - 20 with mail, invited - 14 responses - 70%
• Group B: extended list of ecosystem experts
(outside Group A):
• 148 invited - 142 valid addresses - 38* responses ~ 27%
• Better response rate: 32.1% vs 18.4% (first survey)
* One of the respondents that provided an email has not been invited
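The response rates quoted for both surveys follow from the counts on the slides. A minimal sketch of the arithmetic, assuming the rate is computed as responses divided by valid addresses (the helper name is hypothetical):

```python
def response_rate(responses: int, valid_invitations: int) -> float:
    """Response rate as a percentage, rounded to one decimal place."""
    return round(100 * responses / valid_invitations, 1)


# First survey: 26 answers out of 141 authors with a valid email address.
first_survey = response_rate(26, 141)

# Second survey: Group A (14 of 20) and Group B (38 of 142 valid addresses).
group_a = response_rate(14, 20)
group_b = response_rate(38, 142)
second_survey = response_rate(14 + 38, 20 + 142)

print(first_survey, group_a, group_b, second_survey)
```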
No difference between
Group A and Group B
Adonis, Unknown,
restored by Duquesnoy
(1597–1643), Louvre
• Analysis of Similarities
(ANOSIM)
• R: -0.07564
• the closer R is to 1, the more dissimilar the groups
• Permutational Multivariate
Analysis of Variance Using
Distance Matrices (ADONIS)
• p-value: 0.192
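To make the ANOSIM statistic concrete, here is a minimal pure-Python sketch. It is an illustration under stated assumptions, not the analysis behind the slide: the toy data, the function names and the permutation count are all hypothetical, and only ANOSIM is sketched (ADONIS is a different procedure). R compares the mean rank of between-group dissimilarities with the mean rank of within-group dissimilarities, so values near 0 (such as the reported -0.07564) indicate that the groups do not separate.

```python
import itertools
import random


def average_ranks(values):
    """Rank values starting at 1; tied values receive their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def anosim_r(dist, labels):
    """ANOSIM R: (mean between-group rank - mean within-group rank) / (n(n-1)/4)."""
    n = len(labels)
    pairs = list(itertools.combinations(range(n), 2))
    ranks = average_ranks([dist[i][j] for i, j in pairs])
    within = [r for r, (i, j) in zip(ranks, pairs) if labels[i] == labels[j]]
    between = [r for r, (i, j) in zip(ranks, pairs) if labels[i] != labels[j]]
    mean_within = sum(within) / len(within)
    mean_between = sum(between) / len(between)
    return (mean_between - mean_within) / (n * (n - 1) / 4)


def anosim_p(dist, labels, permutations=999, seed=0):
    """Permutation p-value: share of label shuffles with R at least as large."""
    rng = random.Random(seed)
    observed = anosim_r(dist, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(permutations):
        rng.shuffle(shuffled)
        if anosim_r(dist, shuffled) >= observed:
            hits += 1
    return (hits + 1) / (permutations + 1)


# Toy data: two clearly separated groups on a line (hypothetical values).
points = [0, 1, 2, 10, 11, 12]
labels = ["A", "A", "A", "B", "B", "B"]
dist = [[abs(p - q) for q in points] for p in points]
print(anosim_r(dist, labels))
```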
Ordering challenges
1. Consider both groups as one set of answers
2. Per question: #very important - #moderately
important - #slightly important
3. Lexicographic order on the triples
(#very important - #moderately important - #slightly
important)
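The three ordering steps can be sketched as follows. The challenge names and answers are hypothetical, and the answer labels are assumed to match the three levels named on the slide; any other answer is ignored. Python compares tuples lexicographically, which gives the ordering of step 3 directly.

```python
from collections import Counter


def importance_triple(answers):
    """(#very important, #moderately important, #slightly important) for one question."""
    counts = Counter(answers)
    return (
        counts["very important"],
        counts["moderately important"],
        counts["slightly important"],
    )


def rank_challenges(responses):
    """Order challenges by their triples, largest first (lexicographic comparison)."""
    return sorted(responses, key=lambda name: importance_triple(responses[name]), reverse=True)


# Hypothetical answers pooled over both groups (step 1).
responses = {
    "challenge X": ["very important"] * 3 + ["slightly important"],
    "challenge Y": ["very important"] * 3 + ["moderately important"],
    "challenge Z": ["very important"] * 2 + ["moderately important"] * 2,
}
ranking = rank_challenges(responses)
print(ranking)
```

Challenges X and Y tie on the first component, so the number of "moderately important" answers decides between them.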
Top Three
1. Reproducible and Comparable Research [Providing
databases with information about the details of a
large collection of ecosystems]
2. Reproducible and Comparable Research [Making
research results about ecosystems available in a
reproducible way]
3. Offer more advanced ecosystems analysis (e.g., case
studies, qualitative and quantitative analysis) [Use
more advanced statistical techniques (e.g., survival
analysis, econometric aggregation, contrasts)]
Reproducible Research: SE
problem?
Raw data | Processed data set | Tools & scripts | # MSR papers 2004–2009
Y | Y | Y | 2
Y | Y | N | 2
Y | P | Y | 1
Y | P | P | 2
Y | P | N | 2
Y | N | Y | 16
Y | N | P | 19
Y | N | N | 64
P | N | Y | 1
P | N | N | 2
N | Y | N | 2
N | P | N | 1
N | N | Y | 7
N | N | P | 2
N | N | N | 31
N/A | N/A | N/A | 17
Robles 2010
Ghezzi, Gall 2013:
• Replicated 25
• Partially 27
• Not replicated 36
Reproducible and Comparable Research
[Providing databases with information about the
details of a large collection of ecosystems]
Enough?
Too big to share?
Up-to-date?
Still relevant?
1TB
Culture
http://www.nickcobbcopywriter.com/wp-content/uploads/2013/03/whats-in-it-for-me.jpg
Advanced statistics
3. Offer more advanced ecosystems analysis (e.g., case
studies, qualitative and quantitative analysis) [Use more
advanced statistical techniques (e.g., survival
analysis, econometric aggregation, contrasts)]
Advanced statistics
Two distributions:
• t-test
• Mann-Whitney
Multiple distributions:
1.  ANOVA / KW
2.  pairwise t-test / MW
Tests can be
inconsistent with
each other
We need a
one-phase test!
Advanced statistics
Idea:
Pair Low High
B-A -0.56 -0.44
C-A -0.50 -0.31
D-A -0.32 -0.03
C-B -0.01 0.24
D-B 0.24 0.47
D-C 0.09 0.40
A→B
A→C
A→D
D→B
D→C
Konietschke, F., Hothorn, LA, and Brunner, E.
Rank-based multiple test procedures and
simultaneous confidence intervals.
Electron. J. Stat. 6 (2012), 738–759.
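One way to read the table of simultaneous confidence intervals is sketched below. The reading is an assumption: a pair "X-Y" is taken to describe X relative to Y, a pair counts as significantly different when its interval excludes 0, and an arrow X→Y is taken to mean that X tends to be larger than Y. Under that reading the intervals reproduce the five arrows shown on the slide.

```python
# Intervals copied from the table above: (pair) -> (low, high).
intervals = {
    ("B", "A"): (-0.56, -0.44),
    ("C", "A"): (-0.50, -0.31),
    ("D", "A"): (-0.32, -0.03),
    ("C", "B"): (-0.01, 0.24),
    ("D", "B"): (0.24, 0.47),
    ("D", "C"): (0.09, 0.40),
}


def significant_orderings(cis):
    """Return arrows for pairs whose confidence interval excludes 0."""
    arrows = []
    for (x, y), (low, high) in cis.items():
        if low > 0:       # whole interval above 0: x tends to be larger
            arrows.append(f"{x}→{y}")
        elif high < 0:    # whole interval below 0: y tends to be larger
            arrows.append(f"{y}→{x}")
        # otherwise the interval contains 0 and no arrow is drawn
    return arrows


arrows = significant_orderings(intervals)
print(arrows)
```

The pair C-B is the only one whose interval contains 0, so B and C are not ordered.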
T̃ and Software Ecosystems
• Stack Overflow and GitHub - Vasilescu et al. SocialCom 2013
• Simulink models - Dajsuren et al. QoSA 2013
• GNOME - Vasilescu et al. ESE 2014
• Stack Exchange sites - Wang et al. ICSME 2014
• jEdit, ArgoUML, KOffice - Sun et al. Inf & Software
Technology 2015
Advanced statistics
Mean,
median,
sum
Gini, Theil,
Kolm…
Choice of an aggregation
technique provides different
insights but can also affect
validity of the results!
C. Gini, “Measurement of inequality of
incomes,” The Economic Journal, 1921.
H. Theil, Economics and Information Theory.
North-Holland, 1967
A.B. Atkinson, “On the measurement of
inequality,” Journal of Economic Theory,
1970.
…
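A small sketch of why the choice of aggregation matters. The two metric samples are hypothetical (for example, a size metric over four files); both have the same mean and sum, yet the inequality indices tell them apart. Only the Gini coefficient and the Theil T index are sketched, using their standard textbook formulas; Kolm is omitted.

```python
import math


def gini(values):
    """Gini coefficient of non-negative values (0 means perfectly equal)."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    weighted = sum((i + 1) * x for i, x in enumerate(xs))
    return (2 * weighted) / (n * total) - (n + 1) / n


def theil(values):
    """Theil T index of strictly positive values (0 means perfectly equal)."""
    n = len(values)
    mean = sum(values) / n
    return sum((x / mean) * math.log(x / mean) for x in values) / n


even = [5, 5, 5, 5]      # hypothetical metric values, evenly spread
skewed = [1, 1, 1, 17]   # same sum and mean, concentrated in one entity

for name, sample in (("even", even), ("skewed", skewed)):
    mean = sum(sample) / len(sample)
    print(name, mean, round(gini(sample), 3), round(theil(sample), 3))
```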
Gini, Theil & Software Ecos
• Qualitas - Spasojević et al. ICSME 2014
• GNOME - Mens, Goeminne IWSECO 2011,
Vasilescu et al. ESE 2014
• Debian - Serebrenik, vd Brand ICSM 2010
• Market shares - Yu, First Monday 2012
Advanced statistics
% of entities still used
after time t?
Kaplan, E. L.; Meier, P. (1958).
"Nonparametric estimation from incomplete
observations". J. Amer. Statist. Assn. 53
(282): 457–481
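The question "% of entities still used after time t?" is what the Kaplan-Meier estimator answers when some entities are still in use at the end of the observation period (censored). Below is a minimal pure-Python sketch with hypothetical data; the function name and the example lifetimes are assumptions for illustration.

```python
def kaplan_meier(durations, observed):
    """Return [(t, S(t))] for every time t at which at least one event was observed.

    durations: time until the entity stopped being used, or until observation ended
    observed:  True if the end of use was observed, False if censored
    """
    events = sorted(zip(durations, observed))
    at_risk = len(events)
    survival = 1.0
    curve = []
    i = 0
    while i < len(events):
        t = events[i][0]
        deaths = sum(1 for d, seen in events if d == t and seen)
        leaving = sum(1 for d, _ in events if d == t)
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving  # events and censored entities both leave the risk set
        i += leaving
    return curve


# Hypothetical lifetimes (e.g. in releases); two entities are still in use (censored).
durations = [1, 2, 2, 3, 4]
observed = [True, True, False, True, False]
curve = kaplan_meier(durations, observed)
for t, s in curve:
    print(t, round(s, 2))
```

Censored entities contribute to the number at risk up to their last observed time but never count as events, which is what distinguishes the estimate from a plain fraction of surviving entities.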
Survival & Software Ecos
• FLOSSMetrics DB - Samoladas et al. Information &
Software Technology 2010
• Debian packages - Claes et al. MSR 2015
• Databases in Java projects - Goeminne, Mens
ICSME 2015
4. Understanding and improving the design, architecture, quality and
health of software ecosystems [Socio-technical perspective, e.g.,
comparing the health of the community against the health of the
ecosystem components]
5. Ecosystem Governance [Design perspective, e.g., actively
supporting the stakeholders' decisions]
6. Understanding and improving an ecosystem's dynamics and
evolution [Generalisation perspective, e.g., transferring insights from
evolution of individual software systems to evolution of ecosystems]
7. Understanding and improving the design, architecture, quality and
health of software ecosystems [Social perspective, e.g., creating an
active community around the ecosystem]
8. Interdisciplinary research [Applying ecosystem research techniques
to non-classical software ecosystems, e.g., spreadsheets or Matlab
Simulink models]
9. Understanding and improving an ecosystem's dynamics and
evolution [Design perspective, e.g., providing upgrade strategies
when one of the ecosystem elements changes]
10. Ecosystem Governance [Generalisation perspective, e.g., going
beyond anecdotal evidence]
Threats to validity
• Representativeness of the respondents wrt the
research community
National Oceanic and Atmospheric Administration, USA
