2013 open analytics-meetup-mortar
Mission:
Democratize Data Development
There are thousands of publicly available datasets
Common Crawl
OSEMN
Obtain
Scrub
Explore
Model
iNterpret
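The five OSEMN steps can be sketched end to end on a toy dataset. This is a minimal illustration only, not the demo from the talk: the sample rows, the column names (`borough`, `complaint`, `response_hours`), and every function name below are hypothetical, and the "model" is just a per-category average.

```python
import csv
import io
from collections import Counter
from statistics import mean

# Hypothetical sample standing in for a public dataset (not from the talk).
RAW_CSV = """borough,complaint,response_hours
Brooklyn,Noise,4
brooklyn ,Noise,6
Queens,Heating,
Queens,Noise,2
Bronx,Heating,10
"""


def obtain(text):
    """Obtain: read rows from a CSV source into dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def scrub(rows):
    """Scrub: normalize borough names and drop rows with no response time."""
    clean = []
    for row in rows:
        if not row["response_hours"].strip():
            continue  # missing value, skip the row
        clean.append({
            "borough": row["borough"].strip().title(),
            "complaint": row["complaint"].strip(),
            "response_hours": float(row["response_hours"]),
        })
    return clean


def explore(rows):
    """Explore: count complaints per borough."""
    return Counter(row["borough"] for row in rows)


def model(rows):
    """Model: a deliberately simple model, mean response time per complaint type."""
    by_type = {}
    for row in rows:
        by_type.setdefault(row["complaint"], []).append(row["response_hours"])
    return {kind: mean(hours) for kind, hours in by_type.items()}


def interpret(averages):
    """iNterpret: turn the model output into a plain-language statement."""
    slowest = max(averages, key=averages.get)
    return f"{slowest} complaints take longest on average ({averages[slowest]:.1f} h)"


if __name__ == "__main__":
    rows = scrub(obtain(RAW_CSV))
    print(explore(rows))
    print(interpret(model(rows)))
```

Keeping each step as its own function mirrors the acronym: any stage can be swapped out (for example, reading from a real open-data export in `obtain`) without touching the others.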
DEMO
Recap:
Illustrate
Local run
Cluster run
REQUEST:
1,300 NYC public datasets made usable
     https://data.cityofnewyork.us
FREE OFFER:
Log data analysis
Recommender system
K Young, @kky
mortardata.com
APPENDIX
Mortar quick-start

gem install mortar
git clone git@github.com:mortardata/mortar-examples.git
cd mortar-examples
mortar register mortar-examples
mortar illustrate coffee_tweets ordered_output
mortar run coffee_tweets --clustersize=5
Some starting points for public data:

bitly.com/bundles/bigmlcom/4
www.reddit.com/r/datasets
www.kdnuggets.com/datasets
bitly.com/bundles/hmason/1
https://data.cityofnewyork.us
