Databricks Community Cloud
By: Robert Sanders
2Page:
Databricks Community Cloud
• Free/Paid Standalone Spark Cluster
• Online Notebook
   – Python
   – R
   – Scala
   – SQL
• Tutorials and Guides
• Shareable Notebooks
3Page:
Why is it useful?
• Learning about Spark
• Testing different versions of Spark
• Rapid Prototyping
• Data Analysis
• Saved Code
• Others…
4Page:
Forums
https://forums.databricks.com/
5Page:
Login/Sign Up
https://community.cloud.databricks.com/login.html
6Page:
Home Page
7Page:
Active Clusters
8Page:
Create a Cluster - Steps
1. From the Active Clusters page, click the “+ Create Cluster” button
2. Fill in the cluster name
3. Select the version of Apache Spark
4. Click “Create Cluster”
5. Wait for the cluster to start up and reach the “Running” state
9Page:
Create a Cluster
10Page:
Active Clusters
11Page:
Active Clusters – Spark Cluster UI - Master
12Page:
Workspaces
13Page:
Create a Notebook - Steps
1. Right-click within a Workspace and click Create -> Notebook
2. Fill in the Name
3. Select the programming language
4. Select the running cluster you’ve created that you want to attach to the Notebook
5. Click the “Create” button
14Page:
Create a Notebook
15Page:
Notebook
16Page:
Using the Notebook
17Page:
Using the Notebook – Code Snippets
> sc
> sc.parallelize(1 to 5).collect()
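The two snippets above can be extended into a slightly longer cell. This is a minimal sketch, assuming a Scala notebook attached to a running cluster, where `sc` (the SparkContext) is already defined as in the snippets above; the names `numbers`, `doubled`, `result`, and `total` are illustrative and do not come from the slides.

```scala
// Assumes a Databricks Scala notebook, where `sc` (SparkContext) is predefined.
// `numbers`, `doubled`, `result`, and `total` are illustrative names.

// Distribute a local range across the cluster as an RDD[Int].
val numbers = sc.parallelize(1 to 5)

// Transformations such as map are lazy: nothing runs until an action is called.
val doubled = numbers.map(_ * 2)

// collect() is an action: it runs the job and returns the results to the driver.
val result = doubled.collect()      // Array(2, 4, 6, 8, 10)

// reduce() is another action: it sums the original elements on the cluster.
val total = numbers.reduce(_ + _)   // 15
```

Because `collect()` brings every element back to the driver, it is best kept for small results like this one.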
18Page:
Using the Notebook - Shortcuts
Shortcut          Action
Shift + Enter     Run Selected Cell and Move to Next Cell
Ctrl + Enter      Run Selected Cell
Option + Enter    Run Selected Cell and Insert Cell Below
Ctrl + Alt + P    Create Cell Above Current Cell
Ctrl + Alt + N    Create Cell Below Selected Cell
19Page:
Tables
20Page:
Create a Table - Steps
1. From the Tables section, click “+ Create Table”
2. Select the Data Source (the steps below assume you’re using File as the Data Source)
3. Upload a file from your local file system
   1. Supported file types: CSV, JSON, Avro, Parquet
4. Click Preview Table
5. Fill in the Table Name
6. Select the File Type and other Options depending on the File Type
7. Change Column Names and Types as desired
8. Click “Create Table”
21Page:
Create a Table – Upload File
22Page:
Create a Table – Configure Table
23Page:
Create a Table – Review Table
24Page:
Notebook – Access Table
25Page:
Notebook – Access Table – Code Snippets
> sqlContext
> sqlContext.sql("show tables").collect()
> val got = sqlContext.sql("select * from got")
> got.limit(10).collect()
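A few further calls are often useful when first inspecting a table. The sketch below assumes the same Scala notebook, the predefined `sqlContext`, and a table named `got` created through the steps above; `gotDf` and `rowCount` are illustrative names, not part of the original slides.

```scala
// Assumes a Databricks Scala notebook with `sqlContext` predefined
// and a table named `got` already created through the Tables UI.
// `gotDf` and `rowCount` are illustrative names.

// Load the table as a DataFrame (same rows as: select * from got).
val gotDf = sqlContext.table("got")

// Print column names and types to check the settings chosen at upload time.
gotDf.printSchema()

// Count the rows; this triggers a Spark job.
val rowCount = gotDf.count()

// Print a plain-text preview of the first 10 rows.
gotDf.show(10)
```

Unlike `collect()`, `show(10)` only fetches the rows it prints, which makes it a safer default on larger tables.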
26Page:
Notebook – Display
27Page:
Notebook – Data Cleaning for Charting
28Page:
Notebook – Plot Options
29Page:
Notebook – Charting
30Page:
Notebook – Display and Charting – Code Snippets
> filter(got)
> val got = sqlContext.sql("select * from got")
> got.limit(10).collect()
> import org.apache.spark.sql.functions._
> val allegiancesCleanupUDF = udf[String, String](_.toLowerCase().replace("house ", ""))
> val isDeathUDF = udf { deathYear: Integer => if (deathYear != null) 1 else 0 }
> val gotCleaned = got.filter("Allegiances != 'None'")
    .withColumn("Allegiances", allegiancesCleanupUDF($"Allegiances"))
    .withColumn("isDeath", isDeathUDF($"Death Year"))
> display(gotCleaned)
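One way to make the cleaned data easier to chart is to aggregate it before calling `display`. This is a sketch only, assuming the `gotCleaned` DataFrame from the snippets above with its `Allegiances` and `isDeath` columns; the name `deathsByAllegiance` and the `deaths` alias are illustrative.

```scala
import org.apache.spark.sql.functions.{desc, sum}

// Assumes `gotCleaned` was defined in an earlier cell of the same notebook.
// `deathsByAllegiance` and the "deaths" alias are illustrative names.

// Sum the 0/1 isDeath flag per allegiance, largest totals first.
val deathsByAllegiance = gotCleaned
  .groupBy("Allegiances")
  .agg(sum("isDeath").alias("deaths"))
  .orderBy(desc("deaths"))

// display() is the Databricks notebook helper used in the snippets above.
display(deathsByAllegiance)
```

With one row per allegiance, a bar chart can likely use `Allegiances` as the key and `deaths` as the value in Plot Options.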
31Page:
Publish Notebook - Steps
1. While in a Notebook, click “Publish” at the top right
2. Click “Publish” in the pop-up
3. Copy the link and send it out
32Page:
Publish Notebook
