Mark van der Loo | Statistics Netherlands
Open source statistical software at
the statistical office
Mark van der Loo
Statistics Netherlands
m.vanderloo@cbs.nl
What is open source software?
Free and Open Source Software
Use
Study
Change
Redistribute
FOSS is driving modern stats and data science
And much, much more...
Communities (1) Social coding with GitHub
Communities (2) Q&A with Stack Overflow
Communities (3) News & discussions on Twitter
Role of commercial parties, foundations
And many, many more...
Motivations
Wikipedians
Wikipedians enjoy a sense of accomplishment, collectivism, and
benevolence while working with exceptional freedom and ease. The
values of reputation, community, reciprocity, altruism and autonomy
are fostered by both the people and the technology [...].
(Kuznetsov, 2006)
Commercial parties
[Companies] expect to benefit from their expertise in some segment
whose demand is boosted by the success of a complementary open
source program. (Lerner, 2000)
Motivations for Official Statistics
Use
Economic (it's free!)
New hires
Supporting community
Contribute
Solving shared problems
Many eyes make all bugs shallow
Influence, reputation
Built with taxpayers' money
FOSS for official statistics
Awesome list
Community effort
Curated
50+ Software packages
Covering 14 GSBPM areas
Growing
− 75 commits
− 5 PRs
− 6 contributors
www.awesomeofficialstatistics.org
When is it awesome?
You may be awesome when...
Free, open source, available for download
Used in at least one statistical institute for production, or offers
access to official statistics
Relatively easy to install and use (for non-devs)
Actively maintained
At least one stable release
What’s on the Awesome List?
In the works...
FOSS policy at Statistics Netherlands (in short)
Usage
Selection and introduction follow the same procedure as for COTS
(commercial off-the-shelf) software.
R, Python, node.js
Contributing
When relevant to Statistics Netherlands, with positive business case.
R, node.js
Deployment of R and Python at Statistics
Netherlands
Central read-only folder for executables.
All users have access to the same version with curated list of
libraries installed.
Scripts can be prepared and integrated for non-developers with
ease.
Repositories (CRAN, Anaconda) on internal website, updated
frequently.
Old versions stay available for some time so existing
applications stay working.
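As an illustration of the "same version for all users" idea, the sketch below compares the packages installed in a Python environment with a curated list of pinned versions. It is a minimal sketch under stated assumptions, not the setup used at Statistics Netherlands: the function names and the example package list are hypothetical, and only the standard-library importlib.metadata module is used.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version of a package, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(curated, lookup=installed_version):
    """Compare a curated {package: version} mapping with the environment.

    `lookup` maps a package name to the version found (or None).
    Returns {package: (expected, found)} for every package that deviates.
    """
    mismatches = {}
    for package, expected in curated.items():
        found = lookup(package)
        if found != expected:
            mismatches[package] = (expected, found)
    return mismatches


# Hypothetical curated list; a real one would be maintained centrally.
CURATED = {"example-package": "1.0.0"}
print(find_mismatches(CURATED))
```

A script could run such a check at startup and warn the user when the environment differs from the curated list.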
Example using R in data editing: tools and roles
Database
Rules, metadata, control
parameters...
Graphical user interface
IT dpt
Users
IT dpt
Methodology
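The separation of rules from the code that applies them can be sketched as follows. This is a hedged illustration in Python, not the API of the R validate package and not the production system described on the slide: the rule format, field names and function names are all hypothetical.

```python
import operator

# Comparison operators that a rule may use.
OPERATORS = {
    ">=": operator.ge,
    "<=": operator.le,
    ">": operator.gt,
    "<": operator.lt,
    "==": operator.eq,
}

# Hypothetical rule set, kept as plain data so it can be stored and
# maintained separately from the code. Each rule is
# (name, left term, operator, right term); a term that is a string
# refers to a field in the record, any other value is a constant.
RULES = [
    ("turnover_nonnegative", "turnover", ">=", 0),
    ("costs_below_turnover", "costs", "<=", "turnover"),
]


def resolve(term, record):
    """Look up a field name in the record, or return a constant unchanged."""
    return record[term] if isinstance(term, str) else term


def check(record, rules):
    """Return {rule name: True/False} for a single record."""
    return {
        name: OPERATORS[op](resolve(left, record), resolve(right, record))
        for name, left, op, right in rules
    }


# Made-up example records.
records = [
    {"turnover": 100, "costs": 60},
    {"turnover": 50, "costs": 80},
]
for record in records:
    print(record, check(record, RULES))
```

Because the rules are data rather than code, they can be changed without touching the checking function or the user interface built on top of it.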
Example contributions
R packages
− data editing: validate, simputation, errorlocate,
deductive, dcmodify
− data logging: lumberjack
− small area estimation: hbsae
− datavis: tabplot, tabplotd3, tmap, ...
node.js packages
− scraping: RobotTool, S4Robo
− dashboard: StatMine
...
So you want to contribute?
Here are some options
1. Use it (& send a thumbs-up!)
2. Advocate
− Tell your friends and colleagues
− Write blog posts / articles / presentations
− Social media (Twitter, ...)
3. File bug reports, suggestions
4. Add code to an existing project
5. Start your own project
FREE TIP!
Don’t work alone
Join your local community
− meetups, newsletters, hackathons
Set up a community in your institute
− Local wiki, user meetings, hackathons, ask-the-expert
