Getting started with
OpenAI and Data
science
SUSAN IBACH | HOCKEYGEEKGIRL
SUSAN.IBACH@LIVE.COM
You can't go
anywhere these
days without
hearing about
Generative AI
AI won't replace you, but someone with your skills + AI might
Coders are more productive when they
use AI to help them code
- Over 80% of coders say they are more productive when they use a code helper
such as GitHub Copilot
- 74% say it enables them to focus on more satisfying work
- 96% say they complete repetitive tasks faster
- In a study comparing two groups, the group using a built-in AI coding
assistant completed their tasks 50% faster
Okay, I get it, Susan: this AI thing
looks useful. How do I get
started using it for data
science?
You could just
open up ChatGPT,
ask it to write code
for you, then copy
& paste
But the real win is doing it inside your IDE!
Step 1
Find a Large
Language Model
(LLM) you can install
inside your IDE
This takes a bit
of research
OpenAI – backed by Microsoft (a major investor and partner, not the owner)
Codeium – VS Code, Vim, Jupyter Notebook, Eclipse
GitHub Copilot – comes as an extension for VS Code, Visual
Studio, JetBrains
Obsidian Integration, heroml, Superpower extension,
llmops.space, cursor.so, ChatGPT, CometLLM, Cohere
I use Jupyter
notebooks so
I'm going with
Jupyter AI
Jupyter AI is
vendor neutral
and can
connect to
different LLMs
- AI21
- Anthropic
- AWS
- Cohere
- HuggingFace Hub
- OpenAI
I chose OpenAI
because I had
played with it a
bit already
Step 2
Install the
extension or
library in your IDE
If you want to use Jupyter AI with OpenAI
in a Jupyter Notebook
Software versions required
- JupyterLab 4
- Python 3.8 – 3.11 (I installed Python 3.11.6, 64-bit)
Accounts required (you can start with the free version)
- OpenAI
If you want to use Jupyter AI with OpenAI
in a Jupyter Notebook
Install the openai library
- pip install openai
Create an environment variable and set it to the API key for your OpenAI account
- OPENAI_API_KEY=xxxxxxxxxxxxxxxxxxxxxxxxxxxx
- Each supported LLM provider has its own environment variable name
Install the Jupyter AI extension, then load its magics in a notebook cell
- pip install jupyter-ai
- %load_ext jupyter_ai (newer Jupyter AI documentation uses %load_ext jupyter_ai_magics)
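Before loading the extension, it can help to confirm that the key is actually visible to Python. This is a minimal sketch, assuming the key lives in OPENAI_API_KEY as described above. The helper name api_key_is_set is hypothetical and is not part of Jupyter AI or the openai library.

```python
import os


def api_key_is_set(name="OPENAI_API_KEY", env=None):
    """Return True if the variable exists and is not blank (hypothetical helper)."""
    env = os.environ if env is None else env
    return bool(env.get(name, "").strip())


# Report the status without ever printing the key itself.
if api_key_is_set():
    print("OPENAI_API_KEY is set")
else:
    print("OPENAI_API_KEY is missing; set it and restart Jupyter")
```

If the variable was set after Jupyter started, the running kernel will likely not see it until Jupyter is restarted.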
Not all OpenAI models are created equal
Version | GPT-3.5 Turbo | GPT-4.0
Speed | Faster | Slower
Database size | - | 10X size of ChatGPT 3.5 and can handle images
Quality of output | - | 40% more likely to produce factual responses than 3.5, better at dialects
$ input / 1,000 tokens | $0.0005 | $0.03
$ output / 1,000 tokens | $0.0015 | $0.06
You can find more information on pricing at openai.com
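Because input and output tokens are priced separately per 1,000 tokens, the cost of a call is simple arithmetic. The sketch below uses the GPT-4 prices listed above ($0.03 input, $0.06 output). The function name estimate_cost and the token counts are made up for illustration, and prices change, so check the pricing page before relying on any figure.

```python
def estimate_cost(input_tokens, output_tokens, input_price_per_1k, output_price_per_1k):
    """Estimate the cost in dollars of one call from its token counts."""
    input_cost = (input_tokens / 1000) * input_price_per_1k
    output_cost = (output_tokens / 1000) * output_price_per_1k
    return input_cost + output_cost


# Hypothetical call: 500 prompt tokens and 200 response tokens at the GPT-4 prices.
cost = estimate_cost(500, 200, input_price_per_1k=0.03, output_price_per_1k=0.06)
print(f"Estimated cost: ${cost:.4f}")
```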
So what is a token anyway?
You can think of tokens as pieces of words
Wayne Gretzky’s quote "You miss 100% of the shots you don't take" contains 11 tokens
1 token is about 4 characters in English
1 English word is typically 1.3 tokens
1 French word is typically 2 tokens
Each punctuation mark counts as one token
Special characters are one to three tokens each
Emojis are two to three tokens each
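These rules of thumb can be turned into a rough estimator. This is only a sketch of the approximations above (about 4 characters per token, about 1.3 tokens per English word), not a real tokenizer, so exact counts from the model will differ. Both function names are hypothetical.

```python
import math


def estimate_tokens_by_chars(text, chars_per_token=4):
    """Rough estimate: about 4 characters per token in English."""
    return math.ceil(len(text) / chars_per_token)


def estimate_tokens_by_words(text, tokens_per_word=1.3):
    """Rough estimate: about 1.3 tokens per English word."""
    return round(len(text.split()) * tokens_per_word)


quote = "You miss 100% of the shots you don't take"
print(estimate_tokens_by_chars(quote))  # character-based estimate
print(estimate_tokens_by_words(quote))  # word-based estimate
```

The two heuristics usually land close to each other but not on the same number, which is a reminder that they are estimates only.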
Step 3
Try a "hello world"
type command
Ask the AI to create "Hello World"
%%ai chatgpt --format code
display a message that says hello world
Possible successful outputs include
print("Hello World")
System.out.println("Hello World");
console.log("Hello World");
echo "Hello World";
Step 4
Evaluate the
suggested code
AI does not replace programmers.
Programmers with AI replace programmers
- There is more than one way to write code to complete a task
- LLMs make an educated guess based on code they have seen in the past
- The coder provides the knowledge to evaluate the AI's suggestion and to
modify the prompt as needed (referred to as prompt engineering)
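One simple way to evaluate a suggestion is to run it and compare the result with what you asked for. The sketch below runs a snippet in a separate Python interpreter and checks its output. The helper name runs_and_prints is hypothetical, and this is only one possible check: read any generated code before running it, since it executes with your permissions.

```python
import subprocess
import sys


def runs_and_prints(code, expected, timeout=10):
    """Run a Python snippet in a separate interpreter and compare its stdout."""
    result = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return result.returncode == 0 and result.stdout.strip() == expected


# Two of the possible "Hello World" suggestions; only the first is working Python.
suggestions = [
    'print("Hello World")',
    'System.out.println("Hello World");',
]
for code in suggestions:
    print(code, "->", runs_and_prints(code, "Hello World"))
```

If the suggestion comes back in the wrong language, the fix is usually in the prompt, for example by naming Python explicitly.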
Curious about
pricing?
How much did that cost?
How many tokens and calls was it?
Step 5
Now we can play!
Maybe I need a
dataframe with
some sample
data
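The code shown in the live demo is not captured in this transcript. As an illustration only, a prompt like this might produce something along these lines with pandas. The column names and values are made up.

```python
import pandas as pd

# Made-up sample values, for illustration only.
df = pd.DataFrame(
    {
        "name": ["Alice", "Bob", "Sam", "Susan"],
        "games": [10, 12, 8, 15],
        "goals": [3, 5, 2, 7],
    }
)
print(df)
```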
Maybe I forgot the
syntax for returning
entries that start
with a particular
letter
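One common pandas answer to this question is a boolean mask built with the string accessor. This is a sketch with made-up data, not the code from the demo.

```python
import pandas as pd

# Made-up sample values, for illustration only.
df = pd.DataFrame({"name": ["Alice", "Bob", "Sam", "Susan"], "goals": [3, 5, 2, 7]})

# Keep only the rows whose name starts with the letter "S".
starts_with_s = df[df["name"].str.startswith("S")]
print(starts_with_s)
```

Note that str.startswith is case-sensitive, so "sam" would not match "S".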
Let's read a .csv file
and then do some
linear regression
Let's read a .csv file
and then do some
linear regression
ValueError: Input y
contains NaN
AI does not replace
programmers.
Programmers with AI
replace
programmers
What would a
coder do? We'd
get rid of the rows
with Nulls and try
again!
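The fix can be sketched as follows. The demo's file and code are not in this transcript, so the CSV content and column names here are made up, and the fit uses numpy.polyfit as a stand-in for whichever regression library the demo used.

```python
import io

import numpy as np
import pandas as pd

# Made-up CSV with one missing value in the target column.
csv_text = """games,goals
5,2
10,4
12,
15,6
20,8
"""
df = pd.read_csv(io.StringIO(csv_text))

# Count missing values per column before fitting.
print(df.isna().sum())

# Drop the rows with nulls, then fit a straight line (degree 1 polynomial).
clean = df.dropna(subset=["games", "goals"])
slope, intercept = np.polyfit(clean["games"], clean["goals"], 1)
print(f"goals ~ {slope:.3f} * games + {intercept:.3f}")
```

Dropping rows is the quickest fix, but it discards data. Whether that is acceptable is the kind of judgment the AI leaves to you.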
Victory!
I have successfully
produced a plot, but if
you don't know how
to read it, this isn't
going to help you :)
AI does not replace
data scientists. Data
scientists with AI
replace data
scientists
Until today, I have never done a
live code demo
- with this much code
- in a session this short
- without having to look up
method names and parameters
- without spending time in the
session having the audience
help me find my typing mistakes
AI doesn't replace
presenters.
Presenters with AI
replace presenters
References
ChatGPT
OpenAI
Project Jupyter | Installing Jupyter
Generative AI in Jupyter. Jupyter AI, a new open source project… | by Jason Weill | Jupyter Blog
GitHub - jupyterlab/jupyter-ai: A generative AI extension for JupyterLab
What are tokens and how to count them
OpenAI Pricing
Questions?
SUSAN IBACH | HOCKEYGEEKGIRL
SUSAN.IBACH@LIVE.COM
Thank you!


Confoo 2024: Getting started with OpenAI and data science
