Research and Academic Software Projects
at the Institute for Quantitative Social Science
Mercè Crosas, Ph.D.
Chief Data Science and Technology Officer
IQSS, Harvard University
twitter: @mercecrosas web: mercecrosas.com
The Big Picture
Identify a problem or need in research or academia
Build a technology solution, easy-to-use, gives control to researcher
Build a community that makes the technology better
Generalizable, Open-source
Example: Dataverse
๏ How do we increase data sharing, with incentives for researchers, to improve research transparency and replication?
๏ Provide a repository solution where researchers control the branding of, and access to, their data, and get credit through data citation.
Example: OpenScholar
๏ How do we enable scholars to build their academic websites in a cost-effective way?
๏ Provide a website builder with pre-set features for academics, where a single hosted installation serves thousands of sites.
Example: Zelig
๏ How do we simplify using thousands of R statistical methods built by different authors?
๏ Provide a statistical package that uses the same three commands for all methods, with consistent documentation.
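The "same three commands for all methods" idea can be sketched as a uniform interface over methods that were written separately. In Zelig itself the three R commands are zelig(), setx() and sim(). The Python below is only an illustrative analogy: the names estimate, set_covariate and simulate, the two toy models, and the residual-resampling simulation are all assumptions made for this sketch and are not part of Zelig.

```python
import random
import statistics

# Two toy "methods" standing in for models written by different authors.
# Each takes data and returns a prediction function. (Hypothetical examples.)
def fit_mean(xs, ys):
    mu = statistics.fmean(ys)
    return lambda x: mu

def fit_least_squares(xs, ys):
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    intercept = my - slope * mx
    return lambda x: intercept + slope * x

# Hypothetical registry: one name per method, one calling convention for all.
MODELS = {"mean": fit_mean, "ls": fit_least_squares}

def estimate(model, xs, ys):
    """Command 1: fit any registered model through a single entry point."""
    predict = MODELS[model](xs, ys)
    residuals = [y - predict(x) for x, y in zip(xs, ys)]
    return {"predict": predict, "residuals": residuals}

def set_covariate(x):
    """Command 2: choose the covariate value of interest."""
    return {"x": x}

def simulate(fit, scenario, draws=1000, seed=0):
    """Command 3: simulate outcomes by resampling residuals (an assumption)."""
    rng = random.Random(seed)
    center = fit["predict"](scenario["x"])
    return [center + rng.choice(fit["residuals"]) for _ in range(draws)]

if __name__ == "__main__":
    xs, ys = [1, 2, 3, 4], [2.0, 4.0, 6.0, 8.0]
    for name in MODELS:
        sims = simulate(estimate(name, xs, ys), set_covariate(5.0))
        print(name, statistics.fmean(sims))
```

The point of the design is that switching methods changes only the model name passed to the first command; the other two calls stay the same.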
Example: Consilience
๏ How do we make sense of thousands (or millions!) of texts?
๏ Provide an application that helps researchers explore many possible ways of categorizing documents.
The Process
Dataverse Case Study
Research, standards & best practices: metadata standards, harvesting protocols, data transfer, data citation, provenance, connecting to journals, integrating with cloud computing, ….
Development, testing & releases
Input from users, community, stakeholders: usability testing, community calls, annual community meeting, pull requests
The Process Details
Dataverse Case Study
An agile process, integrating Waffle + GitHub + Jenkins, including these steps:
Backlog > Ready > Dev > Code Review > QA > Usability Test > Polishing > Done
Pull Requests
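The step sequence above can be modeled as an ordered set of stages that an issue moves through one at a time. This is a minimal sketch, not how Waffle, GitHub or Jenkins represent the board: the Stage enum and the advance and remaining helpers are hypothetical names introduced here for illustration.

```python
from enum import IntEnum

class Stage(IntEnum):
    """Board columns, in the order listed on the slide."""
    BACKLOG = 0
    READY = 1
    DEV = 2
    CODE_REVIEW = 3
    QA = 4
    USABILITY_TEST = 5
    POLISHING = 6
    DONE = 7

def advance(stage: Stage) -> Stage:
    """Move an issue to the next column; Done is terminal."""
    if stage is Stage.DONE:
        raise ValueError("issue is already Done")
    return Stage(stage + 1)

def remaining(stage: Stage) -> list:
    """List the columns an issue still has to pass through."""
    return [s for s in Stage if s > stage]

if __name__ == "__main__":
    stage = Stage.BACKLOG
    while stage is not Stage.DONE:
        stage = advance(stage)
        print(stage.name)
```

Encoding the order in one place means a check such as "has this issue passed Code Review?" becomes a simple comparison between stages.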
Not only Best Practices in Process, but also in Coding
1. Write programs for people, not computers.
2. Let the computer do the work.
3. Make incremental changes.
4. Don't repeat yourself (or others).
5. Plan for mistakes.
6. Optimize software only after it works correctly.
7. Document design and purpose, not mechanics.
8. Collaborate.
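A few of the practices listed above can be shown in a small example. The sketch below is illustrative only and is not taken from any IQSS code base: the rescale function is a hypothetical helper. It puts shared logic in one function (practice 4), fails early with clear messages and ships with a test (practice 5), and documents purpose rather than mechanics (practice 7).

```python
def rescale(values, lo=0.0, hi=1.0):
    """Map values linearly onto [lo, hi] so variables on different
    scales can be compared. Purpose is documented here; the arithmetic
    below is left to speak for itself."""
    # Plan for mistakes: reject inputs that would give meaningless output.
    if not values:
        raise ValueError("values must not be empty")
    vmin, vmax = min(values), max(values)
    if vmin == vmax:
        raise ValueError("values must not all be equal")
    span = vmax - vmin
    return [lo + (v - vmin) * (hi - lo) / span for v in values]

def test_rescale():
    # A small automated check lets the computer do the repeated work.
    assert rescale([2, 4, 6]) == [0.0, 0.5, 1.0]
    assert rescale([2, 4, 6], lo=-1, hi=1) == [-1.0, 0.0, 1.0]

if __name__ == "__main__":
    test_rescale()
    print("all checks passed")
```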
Impact at Harvard
6,833 OpenScholar sites created
13,904 Registered users
75,378 Publications posted
24 Academic departments
Impact at Harvard
243 Dataverses from Harvard affiliates
1,226 Datasets by Harvard affiliates as authors
1,427 Registered Harvard users
Broader Impact
Dataverse world-wide impact
Dataverses by Category
Datasets by Subject
53 Stats Models, easy-to-use
Thank you!
Presented by @mercecrosas mercecrosas.com
dataverse.org
dataverse.harvard.edu
openscholar.org
projects.iq.harvard.edu
scholar.harvard.edu
zeligproject.org
Coming soon!
iq.harvard.edu
