SWAN and its analysis ecosystem
D. Castro, J. Moscicki, M. Lamanna, E. Bocchi,
E. Tejedor, D. Piparo, P. Mato, P. Kothuri
Jan 29th, 2019
CS3 2019 - Cloud Storage Synchronization and Sharing Services
https://cern.ch/swan
Introduction
SWAN in a Nutshell
﹥Analysis only with a web browser
 No local installation needed
 Based on Jupyter Notebooks
 Calculations, input data and results “in the Cloud”
﹥Support for multiple analysis ecosystems
 ROOT, Python, R, Octave…
﹥Easy sharing of scientific results: plots, data, code
﹥Integration with CERN resources
 Software, storage, mass processing power
Integrating services
[Diagram: SWAN integrates Software, Storage and Infrastructure services]
Storage
﹥Uses EOS disk storage system
 All experiment data potentially available
﹥CERNBox is SWAN's home directory
 Storage for your notebooks and data
﹥Sync&Share
 Files synced across devices and the Cloud
 Collaborative analysis
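To make "CERNBox is SWAN's home directory" more concrete, the sketch below builds a per-user home path and a project path inside it. The `/eos/user/<initial>/<username>` layout and the `SWAN_projects` folder name are assumptions made for this example; the slides do not state them.

```python
from pathlib import PurePosixPath

# Assumed values for illustration only; not taken from the slides.
EOS_USER_ROOT = PurePosixPath("/eos/user")
PROJECTS_DIR = "SWAN_projects"


def cernbox_home(username: str) -> PurePosixPath:
    """Return the assumed home directory of a user on EOS."""
    if not username:
        raise ValueError("username must not be empty")
    # Assumed convention: users are grouped by the first letter of their name.
    return EOS_USER_ROOT / username[0] / username


def project_path(username: str, project: str) -> PurePosixPath:
    """Return the assumed location of one project in the user's home."""
    return cernbox_home(username) / PROJECTS_DIR / project


home = cernbox_home("jdoe")
project = project_path("jdoe", "MyAnalysis")
print(home)
print(project)
```

Because the notebook session and the sync client would both point at the same directory, a file written from a notebook would also appear on the user's synced devices.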
Software
﹥Software distributed through CVMFS
 “LCG Releases”: bundles of mutually compatible packages
 Reduced Docker image size
 Lazy fetching of software
﹥Possibility to install libraries in user cloud storage
 Good way to use custom or non-mainstream packages
 Configurable environment
[Diagram: software layers: LCG Release, CERN Software, User Software, Jupyter modules]
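One way to picture user-installed libraries sitting alongside a central release is Python's module search order: the first directory on `sys.path` that contains a module wins. The sketch below uses two temporary directories as stand-ins for a release area and a user's storage. The directory names and the module name `swan_demo_pkg` are hypothetical, and the order a real SWAN session uses is not stated in the slides.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def write_module(directory: Path, name: str, origin: str) -> None:
    """Create a one-line module that records which layer it came from."""
    (directory / f"{name}.py").write_text(f"ORIGIN = {origin!r}\n")


def resolve_origin(search_path: list[str], name: str) -> str:
    """Import `name` with `search_path` placed in front of sys.path."""
    saved = list(sys.path)
    sys.path[:0] = search_path
    try:
        importlib.invalidate_caches()
        sys.modules.pop(name, None)  # force a fresh lookup
        return importlib.import_module(name).ORIGIN
    finally:
        sys.path[:] = saved          # leave the interpreter as we found it
        sys.modules.pop(name, None)


with tempfile.TemporaryDirectory() as tmp:
    release_dir = Path(tmp) / "lcg_release"  # stand-in for a central release
    user_dir = Path(tmp) / "user_packages"   # stand-in for user cloud storage
    release_dir.mkdir()
    user_dir.mkdir()
    write_module(release_dir, "swan_demo_pkg", "release")
    write_module(user_dir, "swan_demo_pkg", "user")

    user_first = resolve_origin([str(user_dir), str(release_dir)], "swan_demo_pkg")
    release_first = resolve_origin([str(release_dir), str(user_dir)], "swan_demo_pkg")

print(user_first, release_first)
```

Putting the user directory first lets a custom package shadow the released one, while the reverse order keeps the release authoritative.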
Previously, at the last CS3 conference…
New User Interface
Sharing made easy
﹥Sharing from inside the SWAN interface
 Integration with CERNBox
﹥Users can share “Projects”
 Special kind of folder that contains notebooks and other files, like input data
 Self-contained
The Share tab
﹥Users can list which projects...
 they have shared
 others have shared with them
﹥Projects can be cloned to the receiver's CERNBox
 The receiver will work on their own copy
﹥Concurrent editing not supported by Jupyter
 Safer to clone
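The clone-instead-of-share-in-place idea can be sketched with plain directory copies. This is a local illustration using temporary folders, not the CERNBox sharing mechanism; the function name `clone_project` and the file names are hypothetical.

```python
import shutil
import tempfile
from pathlib import Path


def clone_project(shared_project: Path, receiver_home: Path) -> Path:
    """Copy a self-contained project folder into the receiver's space."""
    target = receiver_home / shared_project.name
    if target.exists():
        raise FileExistsError(f"{target} already exists")
    shutil.copytree(shared_project, target)
    return target


with tempfile.TemporaryDirectory() as tmp:
    # The owner's project: a notebook plus its input data.
    owner_project = Path(tmp) / "owner" / "MyAnalysis"
    owner_project.mkdir(parents=True)
    (owner_project / "analysis.ipynb").write_text("{}")
    (owner_project / "input.csv").write_text("x,y\n1,2\n")

    receiver_home = Path(tmp) / "receiver"
    receiver_home.mkdir()

    clone = clone_project(owner_project, receiver_home)
    cloned_files = sorted(p.name for p in clone.iterdir())

    # The receiver edits the copy; the owner's files are untouched.
    (clone / "input.csv").write_text("x,y\n3,4\n")
    original_unchanged = (owner_project / "input.csv").read_text() == "x,y\n1,2\n"

print(cloned_files, original_unchanged)
```

Since each party edits a separate copy, two people never write to the same notebook file at once, which is why cloning is the safer option when concurrent editing is unavailable.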
Integration with Spark
﹥Connection to CERN Spark Clusters
﹥Same environment across platforms
 User data - EOS
 Software - CVMFS
﹥Graphical Jupyter extensions developed
 Spark Connector
 Spark Monitor
[Diagram labels: Spark Cluster, Spark Master, Spark Worker, Python tasks, User Notebook]
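The notebook-to-workers pattern in the diagram can be mimicked locally without Spark. The sketch below is a stand-in built on Python's `concurrent.futures` thread pool, not the Spark API: the notebook side splits the data, each "Python task" processes one chunk, and the partial results are combined. All function names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(data: list[int], n_parts: int) -> list[list[int]]:
    """Split data into at most n_parts contiguous chunks."""
    if not data:
        return []
    size = -(-len(data) // n_parts)  # ceiling division
    return [data[i:i + size] for i in range(0, len(data), size)]


def python_task(chunk: list[int]) -> int:
    """Work done by one task: sum of squares of its chunk."""
    return sum(x * x for x in chunk)


def run_job(data: list[int], n_workers: int = 3) -> int:
    """Driver side: distribute chunks to workers, then reduce the results."""
    chunks = partition(data, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(python_task, chunks))
    return sum(partials)


total = run_job(list(range(10)))
print(total)
```

In a real cluster the workers run on remote machines, which is why the slide stresses that user data (EOS) and software (CVMFS) look the same on every platform.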
Spark Connector/Monitor
The result
Stats
﹥~200 user sessions a day on average
 Users doubled last year with the new SWAN interface
﹥~1300 unique users in 6 months
﹥Spark cluster connection: 15–20% of users
 SWAN as entry point for accessing computational resources
 Used for monitoring LHC accelerator hardware devices (NXCALS)
[Chart annotation: Courses]
New developments
Inspecting a Project
﹥Users can inspect shared project contents
 Browsing of the files
 Static rendering of notebooks
﹥Useful for deciding whether to accept the shared project
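Static inspection is possible because a notebook is a JSON document that can be read without running any of its code. The sketch below lists each cell's type and first line using only the standard library. It assumes the version 4 notebook layout (a top-level `cells` list whose entries carry `cell_type` and `source`), and the function name is hypothetical; SWAN's own renderer is not described in the slides.

```python
import json


def summarize_notebook(raw: str) -> list[tuple[str, str]]:
    """Return (cell type, first source line) for each cell, without executing it."""
    notebook = json.loads(raw)
    summary = []
    for cell in notebook.get("cells", []):
        source = cell.get("source", "")
        if isinstance(source, list):  # source may be stored as a list of lines
            source = "".join(source)
        first_line = source.splitlines()[0] if source.strip() else ""
        summary.append((cell.get("cell_type", "unknown"), first_line))
    return summary


# A tiny notebook built in memory for the example.
example = json.dumps({
    "nbformat": 4,
    "nbformat_minor": 5,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {},
         "source": ["# Dimuon spectrum\n", "Notes"]},
        {"cell_type": "code", "metadata": {}, "execution_count": None,
         "outputs": [], "source": "import ROOT\nprint('hi')"},
    ],
})

cells = summarize_notebook(example)
for cell_type, first_line in cells:
    print(f"{cell_type:8} {first_line}")
```

Nothing in the shared project is executed during inspection, so a receiver can look before deciding to accept it.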
Spark improvements
Worldwide LHC Computing Grid (WLCG)
Connecting More Resources
﹥Ongoing effort: submit batch jobs from the notebook
 Monitoring display
 Jobs tab
Outreach, Education
Science Box: SWAN on Premises
﹥Packaged deployment of SWAN
 Includes all SWAN components: CERNBox/EOS, CVMFS, JupyterHub
﹥Deployable through Kubernetes or docker-compose
﹥Some successful community installations
 AARNet
 PSNC
 Open Telekom Cloud (Helix Nebula)
Science Box: SWAN on Premises
﹥UP2University European Project
 Bridge the gap between secondary schools, higher education and the research domain
 Partner universities (OU, UROMA, NTUA, …), pilot schools
 http://up2university.eu
﹥SWAN used by students to learn physics and other sciences
 Let them use the very same tools & services used by scientists at CERN
 Pilot with University of Geneva (Physiscope)
﹥Establishing collaboration with Callysto project
Looking ahead
Future work/challenges
﹥Move to JupyterLab
 Porting the current extensions
 Concurrent editing
﹥New architecture
 Based on Kubernetes
﹥Exploitation of GPUs
 HEP is looking to ML
 Speed up computation of GPU-ready libraries (e.g. TensorFlow)
Where to find us
﹥Contacts
 swan-talk@cern.ch
 http://cern.ch/swan
﹥Repository
 https://github.com/swan-cern/
﹥Science Box
 https://cern.ch/sciencebox
Conclusion
﹥Changes introduced since last year improved the user experience
 Which translated into more users of the service
﹥SWAN became a fundamental interface for mass processing resources (Spark)
 Not only for physics analysis but also for monitoring the LHC hardware
﹥The new JupyterLab interface will bring new possibilities for collaborative analysis
 With the introduction of concurrent editing of notebooks
 Which can help reach more users
﹥Successfully deployed outside CERN premises
 Including in education-related projects
SWAN and its analysis ecosystem
Thank you
Diogo Castro
diogo.castro@cern.ch