TOPIC
.NET developer for Jupyter Notebooks and Apache Spark and vice versa
Marco Parenzan
(c) 2020 Francesca Parenzan
Who I am
https://twitter.com/marco_parenzan
https://github.com/marcoparenzan
https://www.linkedin.com/in/marcoparenzan/
Marco Parenzan
Next 1nn0va event: SQL Saturday Pordenone, 30 May 2020 https://www.sqlsaturday.com/921/
Road to Data Science
Big Data
Volume, Velocity, and Variety of Data
Thousands of IoT sensors in a factory, producing petabytes of data
What is Apache Spark?
• General-purpose distributed processing engine for analytics over large datasets
• For
• SparkSQL
• Streaming
• Machine Learning
• Visualization
• OSS analytics engine
• Access data from a variety of sources and types
• ADLS, HDFS, S3, Kafka
• Text files, CSVs, JSONs, Parquet
• In-memory computation improves efficiency and speed
• Up to 100x faster than disk-based Hadoop MapReduce
• Up to 10x faster even when still on disk
• Faster speeds → more interactive analysis
• Developers can harness the power of big data in a language of their choice with the various high-performance Spark APIs
DataFrame: the core of Spark programming
CSV Data, JSON Data, RDBMS Data, Parquet Data, Binary Data, MySQL Data → DataFrame
The user programs against the DataFrame abstraction
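To illustrate the idea of one tabular abstraction over several source formats, here is a minimal plain-Python sketch. It is not Spark code: the `load_csv` and `load_json` helpers and the sample data are hypothetical, and a list of dicts merely stands in for a DataFrame.

```python
import csv
import io
import json

# Hypothetical sample inputs; in Spark these would be files or tables.
CSV_TEXT = "name,age\nAda,36\nLinus,28\n"
JSON_TEXT = '[{"name": "Grace", "age": 45}]'


def load_csv(text):
    """Parse CSV text into a list of row dicts (toy stand-in for a DataFrame)."""
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        rows.append({"name": row["name"], "age": int(row["age"])})
    return rows


def load_json(text):
    """Parse a JSON array of objects into the same row shape."""
    return [{"name": r["name"], "age": int(r["age"])} for r in json.loads(text)]


def average_age(rows):
    """User logic written once, against the common row shape."""
    return sum(r["age"] for r in rows) / len(rows)


# Rows from both formats are combined and processed by the same function.
people = load_csv(CSV_TEXT) + load_json(JSON_TEXT)
print(average_age(people))
```

The point of the sketch is that `average_age` never needs to know which format a row came from, which is the role the DataFrame plays for user programs.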
A Spark Recipe
Create Session
Create DataFrame
Define a user-defined function
Manipulate and view Data
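The four steps of the recipe can be mirrored in a few lines of plain Python. This is only a shape sketch under stated assumptions: the `session` dict, the `rows` list, and the `shout` function are hypothetical stand-ins, and no Spark API is called.

```python
# Step 1: "create session". Here it is just a dict of settings (hypothetical).
session = {"app_name": "recipe-sketch"}

# Step 2: "create DataFrame". A list of row dicts stands in for it.
rows = [{"word": "spark"}, {"word": "jupyter"}, {"word": "dotnet"}]


# Step 3: "define a user-defined function". It operates on a single value.
def shout(word):
    return word.upper() + "!"


# Step 4: "manipulate and view data". Apply the function to one column
# to produce a new column, then print every row.
result = [{**row, "shouted": shout(row["word"])} for row in rows]
for row in result:
    print(row)
```

In a real Spark program the function in step 3 would be registered with the engine so that it runs on the executors rather than in the local process.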
Elements of Spark
Driver: Spark Session, User's Program
Executor: Task 1, Task 2, Task 3, Task 4
Cluster Manager
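The relationship between driver, tasks, and executors can be sketched in plain Python. This is a toy model, not Spark: worker threads stand in for executors, there is no cluster manager, and the names `make_partitions`, `task`, and `run_job` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def make_partitions(data, n):
    """Driver side: split the data into n partitions (round-robin)."""
    return [data[i::n] for i in range(n)]


def task(partition):
    """Executor side: one task processes one partition."""
    return sum(x * x for x in partition)


def run_job(data, num_tasks=4, num_executors=2):
    """Driver: create tasks, hand them to executors, combine partial results."""
    partitions = make_partitions(data, num_tasks)
    with ThreadPoolExecutor(max_workers=num_executors) as pool:
        partials = list(pool.map(task, partitions))
    return sum(partials)


# Sum of squares of 0..9, computed as four tasks on two worker threads.
total = run_job(list(range(10)))
print(total)
```

Four tasks are used to match the slide; in Spark the number of tasks follows from how the data is partitioned.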
Batch vs. Notebooks
Work on slow data stored in a data lake
Submit a complete app in one single deploy
Receive the entire output
Use «spark-submit»
Run cell by cell (but also all at once)
What about .NET?
In a recent survey, more than 70% of .NET devs expressed interest in Apache Spark
Millions of lines of big data-usable business logic are written in .NET
But .NET devs are locked out of big data processing by the lack of .NET support in OSS big data solutions
We want a first-class .NET experience in Spark
.NET for Apache Spark
• .NET bindings (C# and F#) to Spark
• Written on the Spark interop layer, designed to provide high-performance bindings to multiple languages
• Re-use the knowledge, skills, and code you have as a .NET developer
• Compliant with .NET Standard
• You can use .NET for Apache Spark anywhere you write .NET code
• Original project: Mobius
• https://github.com/microsoft/Mobius
.NET Spark support
Spark DataFrames with SparkSQL
• Spark 2.3.x, 2.4.x
• ~300 SparkSQL functions
• Delta Lake
.NET Standard 2.0
• C#/F#
• .NET Framework 4.6.1+
• .NET Core 2.1+
Batch & Streaming
• Structured Streaming
Data Science
• ML.NET
• Notebooks
DEMO
The .NET Notebook experience
Evolution of REPL
• At the beginning there was Mono
• Then Dynamic/DLR (C# 4)
• C#/F# interactive
• .NET Try
In a world of:
• Python
• Mathematica
Jupyter
• Evolution and generalization of the seminal role of Mathematica (notebook)
• +Python adoption (ipynb)
• +Web (HTTP+Markdown)
• +Kernel
DEMO
Conclusions
• .NET for Spark
• 1.0 GA in May/June
• https://github.com/dotnet/spark/
• .NET Interactive
• https://github.com/dotnet/interactive/
Thanks
Questions?
https://twitter.com/marco_parenzan
https://github.com/marcoparenzan
https://www.linkedin.com/in/marcoparenzan/
