From the course: Azure Spark Databricks Essential Training
Business scenarios for Spark
- [Instructor] So as a working cloud architect, what types of business scenarios have I found to be the best fit for Apache Spark technologies? In a nutshell, they center on distributed compute, and what really drives them is the volume of data. For example, I've been doing quite a lot of work recently in genomic sequencing and the analysis of genomic information. The kinds of tasks I've used Spark for in these workflows include data cleansing, or Extract, Transform, and Load (ETL); fast data serving pipelines; scalable complex processing; and distributed machine learning. You can think of Azure Databricks as a set of three components. First, the Databricks tools, services, and optimizations surround the core open source Apache Spark distribution. Second, Apache Spark itself provides the distributed computation needed for these intensive workloads. Third, all of this sits on top of some sort of file system. Now natively in…
Contents
- Meet Databricks Apache Spark clusters (2m 22s)
- Business scenarios for Spark (1m 45s)
- Understand Spark key components (2m 43s)
- Azure Databricks concepts (5m 25s)
- Quick start: Use a notebook (7m 7s)
- Set up Databricks AI Playground (1m 56s)
- Use Databricks AI Playground (3m 25s)