Using Conda in Oracle Data Science
By Nicholas Toscano
What Are Conda Environments?
• A conda environment is like a virtual environment
• Lets you run Python processes in different environments with different versions of the same library
• Manages different versions of Python that aren’t installed system-wide
• Lets you upgrade libraries
• Supports the installation of packages for R, Python, Node.js, Java, etc.
There are now over 42 pre-built conda environments to choose from, including ones
dedicated to Oracle PyPGX, PySpark, NVIDIA RAPIDS, and more.
Benefits of Conda Environments
• Install Python libraries from different sources:
  • Conda channels, such as conda-forge
  • The PyPI service
  • A third-party version control provider, such as github.com
• Environments are portable through the conda-pack tool:
  • Archive them in an Object Storage bucket
  • Or ship them across platforms and operating systems
• Access different Conda Environments as different notebook kernels in JupyterLab
• Simultaneously execute different notebooks in different kernels with potentially conflicting sets of dependencies
Install Curated Conda Environments
• From the odsc conda CLI or the Explorer extension, you can install one or more of the Data Science Conda Environments
• These environments are built and curated by the OCI Data Science service team
• More Data Science Conda Environments are added over time
Create Your Own Environment
• Create your own Conda Environment using the odsc conda create command
• List the libraries you want to install in a conda environment.yaml file
• Conda supports the installation of libraries from Conda channels and pip
• Publish your environment to an Object Storage bucket:
  • Use the odsc conda publish command
• Share Conda Environments with colleagues
• Install a published Conda Environment in a different notebook session
Publish an environment and share it with colleagues across notebook sessions
Example Environments
PySpark
Provides a local development environment for a PySpark job. An ideal environment for testing your Oracle Cloud Infrastructure Data Flow jobs before submitting them with ADS (also included in this environment).
General machine learning for CPUs
Includes the new versions of ADS, AutoML, and MLX, along with the usual machine learning suspects, including scikit-learn, XGBoost, LightGBM, and others.
General machine learning for GPUs
Includes the new versions of ADS, AutoML, and MLX. This environment also includes TensorFlow 2.3.1 optimized for GPUs.
* See Oracle documentation for up-to-date information.
Step 1: Open or launch a notebook session
Step 2: Write a conda-compatible environment.yaml File
• This file contains the channels and the dependencies that you want to install in your conda environment
• You can also select packages from PyPI
Adding pip packages to the list of dependencies
• You can install packages directly from PyPI by adding a pip section to the list of dependencies
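The file described above can be sketched as follows. This is a minimal illustration, not the file from the original slides: the channel (conda-forge), the package list, and the helper name write_environment_file are assumptions chosen to match the libraries mentioned later in the deck (numpy and pandas from a conda channel, scikit-learn from PyPI).

```python
from pathlib import Path

# Hypothetical example content: the channel and package choices are
# illustrative assumptions, not copied from the original slides.
ENVIRONMENT_YAML = """\
channels:
  - conda-forge
dependencies:
  - numpy
  - pandas
  - pip
  - pip:
      - scikit-learn
"""


def write_environment_file(path="environment.yaml"):
    """Write the environment definition to disk and return its path."""
    target = Path(path)
    target.write_text(ENVIRONMENT_YAML)
    return target


if __name__ == "__main__":
    # Show the file that would be passed to the create step.
    print(write_environment_file().read_text())
```

Packages listed directly under dependencies are resolved from the conda channels, while anything nested under the pip entry is installed from PyPI.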
Step 3: Create the conda environment with the odsc conda create Command
Open a terminal window in your notebook session and run the odsc conda create command.
• This command creates a brand new kernel in your notebook session, called my-conda-env in this example
• By default, version v1.0 is assigned to the conda environment and appended to its slug name
• You can change the version by passing a value to the optional -v parameter of the create command
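The command itself did not survive in this transcript, so the following is only a sketch of how the call might be assembled. The -v parameter is named in the slides; the -f (environment file) and -n (environment name) flags are assumptions and should be checked against the Oracle documentation or the CLI's own help output. The helper build_create_command is hypothetical and only builds the argument list, it does not run anything.

```python
import shlex


def build_create_command(env_file, name, version=None):
    """Return the argument list for an `odsc conda create` call.

    -v is taken from the slides; -f and -n are ASSUMED flag names.
    """
    argv = ["odsc", "conda", "create", "-f", env_file, "-n", name]
    if version is not None:
        # Optional: override the default version assigned to the environment.
        argv += ["-v", version]
    return argv


if __name__ == "__main__":
    # Print the commands instead of running them, so they can be reviewed first.
    print(shlex.join(build_create_command("environment.yaml", "my-conda-env")))
    print(shlex.join(build_create_command("environment.yaml", "my-conda-env", "2.0")))
```

Once the flags are confirmed, the printed line can be pasted into the notebook session terminal.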
Step 4: Validate the new conda environment
In your notebook, import numpy and pandas and confirm that these libraries are available in your environment. Do the same for scikit-learn if you installed it from PyPI.
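One way to run this check in a notebook cell is sketched below. The helper name check_packages is hypothetical; it uses the standard library's importlib.util.find_spec to report whether each module can be found in the active kernel without failing on the first missing one.

```python
import importlib.util


def check_packages(module_names):
    """Map each top-level module name to True if it can be found in this kernel."""
    return {
        name: importlib.util.find_spec(name) is not None
        for name in module_names
    }


if __name__ == "__main__":
    # "sklearn" is the import name of the scikit-learn package.
    for name, found in check_packages(["numpy", "pandas", "sklearn"]).items():
        print(f"{name}: {'available' if found else 'MISSING'}")
```

A module reported as missing usually means the notebook is running in a different kernel than the newly created environment.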
Step 5: Publish the new environment
• Publishing a conda environment consists of creating a pack and uploading it to an Object Storage bucket that you specify
• We recommend that you publish conda environments to ensure that a model training environment can be reproduced or reused for model deployment
• You can use the odsc CLI to publish an environment
  • First, specify the target Object Storage bucket where the published environment will be stored. This can be done through the odsc conda init command.
• Use the odsc conda publish command. Specify the slug name of the conda environment you just created
  • The slug name is the name of the conda environment and its version. It corresponds to the notebook kernel name minus the "conda-env:" part
• Go to your Object Storage bucket in the OCI console and confirm that the new conda pack is stored in the bucket
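The slug rule above can be sketched as a small helper. The kernel name used in the example is hypothetical, and the -s flag on odsc conda publish is an assumption that should be confirmed against the Oracle documentation; the code only derives the slug and prints the command, it does not publish anything.

```python
KERNEL_PREFIX = "conda-env:"


def slug_from_kernel_name(kernel_name):
    """Return the slug: the kernel name minus the "conda-env:" part."""
    if not kernel_name.startswith(KERNEL_PREFIX):
        raise ValueError(f"expected a name starting with {KERNEL_PREFIX!r}")
    return kernel_name[len(KERNEL_PREFIX):]


def build_publish_command(slug):
    """Return the argument list for publishing; -s is an ASSUMED flag name."""
    return ["odsc", "conda", "publish", "-s", slug]


if __name__ == "__main__":
    # Hypothetical kernel name; read the real one from your notebook's kernel list.
    slug = slug_from_kernel_name("conda-env:myenvv1_0")
    print(" ".join(build_publish_command(slug)))
```

Raising an error on an unexpected name keeps a mistyped kernel name from being published under the wrong slug.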
END

