Interactive Data Science From Scratch with Apache Zeppelin and Apache Spark
• ANY OTHER
PROVIDER PROVISIONING TOOLS
• https://www.vagrantup.com/downloads.html
• VIRTUALIZATION
GUEST OPERATING SYSTEMS
• https://www.virtualbox.org/wiki/Downloads
• https://github.com/felixcheung/vagrant-projects
• SPARK-CASSANDRA-ZEPPELIN
• https://zeppelin.incubator.apache.org/
• https://github.com/felixcheung/spark-notebook-examples/tree/master/ZEPPELIN_NOTEBOOK/APACHECON2016
• http://spark.apache.org/docs/latest/configuration.html
• PARTITION
CLUSTER MEAN PROTOTYPE
• http://theory.stanford.edu/~sergei/papers/vldb12-kmpar.pdf
K-MEANS++
• STREAMING K-MEANS
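
The k-means idea named above (partition the points into clusters, each represented by its mean as a prototype) can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the function names `assign`, `update`, and `kmeans` are hypothetical, the initial centers are simply the first k points (a stand-in, not the k-means++ seeding from the linked paper), and it is not the implementation Spark uses.

```python
from math import dist  # Euclidean distance, Python 3.8+


def assign(points, centers):
    """Return, for each point, the index of its nearest center."""
    return [min(range(len(centers)), key=lambda j: dist(p, centers[j]))
            for p in points]


def update(points, labels, centers):
    """Recompute each center as the mean of the points assigned to it."""
    new_centers = []
    for j, old in enumerate(centers):
        members = [p for p, lab in zip(points, labels) if lab == j]
        if not members:
            new_centers.append(old)  # keep the center of an empty cluster
            continue
        new_centers.append(tuple(sum(m[d] for m in members) / len(members)
                                 for d in range(len(old))))
    return new_centers


def kmeans(points, k, max_iter=100):
    """Alternate assignment and mean update until labels stop changing."""
    centers = [tuple(p) for p in points[:k]]  # assumption: naive seeding
    labels = assign(points, centers)
    for _ in range(max_iter):
        centers = update(points, labels, centers)
        new_labels = assign(points, centers)
        if new_labels == labels:
            break
        labels = new_labels
    return centers, labels


# Hypothetical toy data: two well-separated groups in the plane.
points = [(1.0, 1.0), (9.0, 9.0), (1.0, 2.0),
          (2.0, 1.0), (8.0, 9.0), (9.0, 8.0)]
centers, labels = kmeans(points, k=2)
print(labels)
print(centers)
```

The result depends heavily on the starting centers, which is the problem that seeding schemes such as k-means++ are meant to address.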
GRAPHFRAMES
• Bigtable: A Distributed Storage System for Structured Data
• https://hbase.apache.org/book.html#quickstart
• Born at Facebook; draws on Amazon’s Dynamo and Google’s Bigtable
• http://wiki.apache.org/cassandra/GettingStarted
• https://www.digitalocean.com/community/tutorials/how-to-install-cassandra-and-run-a-single-node-cluster-on-a-ubuntu-vps
• m3.xlarge
• http://www.natalinobusa.com/2015/11/why-is-smack-stack-all-rage-lately.html
• https://docs.mesosphere.com/administration/installing/cloud/aws/
• https://dcos.io/docs/1.7/administration/installing/cloud/aws/
• https://dcos.io/docs/1.7/usage/tutorials/spark/
• https://github.com/felixcheung

Editor's Notes

  • #35: https://docs.mesosphere.com/1-7/usage/services/zeppelin/