Open-Source AI: Community is the Way
Alexy Khrabrov, PhD
Open-Source Science Director
IBM Research, Accelerated Discovery
Chair, Generative AI Commons
LF AI & Data, Linux Foundation
@chiefscientist (X/LinkedIn/Telegram)
alexy@chiefscientist.org
Generative AI needs Community
Why do we need community around LLMs?
• Claims of trust, safety, performance, transparency and openness cannot be unilateral announcements by one or even a few companies
• Need an established community vehicle like LF, MLCommons, NumFOCUS
PyData (Hamburg, Germany)
LLM Avalanche (San Francisco, CA)
ChiPy & PyData (Chicago, IL)
PyData (Accra, Ghana)
UCSC OSPO (Santa Cruz, CA)
PyData (Berlin, Germany)
OSPOs for Good @ United Nations
ACS Off-Site, Almaden
SciPy 2023 (Austin, TX)
LLM Avalanche (San Francisco, CA)
ACS (San Francisco, CA)
NumFOCUS Donation to Data Science Education
What is OSS Generative AI?
• Data: training, lakehouses, retraining
• Models: OSS models, serving, inference
• Applications:
  • frameworks, prompt engineering, DSP, Open Interpreter
  • Enterprise Integration
• Community Validation: benchmarks, openness metrics, measurable broad societal consensus
Generative AI Commons at LF AI & Data
The LF AI & Data Generative AI Commons is dedicated to fostering the
democratization, advancement and adoption of efficient, secure, reliable, and
ethical Generative AI open source innovations through neutral governance, open
and transparent collaboration and education.
Mapping the Impact of Research Software in Science
(Chan Zuckerberg Initiative Hackathon, Oct 24–27, 2023)
Alexy Khrabrov and Peter Staar, CZI HQ, Redwood City, 10/26/23
DeepSearch used to identify software mentions in arXiv at the CZI hackathon
• Which sciences grow faster with OSS
• Which software is most used, by discipline
• Which organizations support OSS
• How to extract software mentions from papers
• Grants, Authors, Organizations
• Software citation intent
Digital Transformation 2.0: DT2
• Digital Transformation 1.0 was low-level automation, a gas-powered horse
• Clerks replaced by PDF flows
• Middle management still in place to operate PDF-enabled clerk teams
• SSAs will replace clerks and middle management workflow (data+instruction)
• Human in the Loop creators will translate strategy to SSAs
• Actual organizational restructuring
• SAP, Oracle, legal integrations
• Industrial infrastructure, machinery, networks, grids subject to DT2
Industrial AI
Material knowledge about specialized processes
• BIMs and Power Tools
• Factory Automation
• Communications and Utilities
• Specialized Machinery (Long Tail)
• Hardware and chip-based infrastructure
• AI vs A/V
• Undocumented tribal knowledge
IBM Confidential | © 2020 IBM
Why is OSS AI needed for Industrial AI?
• Ownership
With open-source, organizations can secure AI sovereignty and protect their IP encapsulated in the models. This empowers them to freely create, modify, and deploy their agents within their own industrial environments, without vendor lock-in. Factory setup also required high-bandwidth local networks.
– Small is Beautiful (Unix => and Efficient!)
The OSS model can be specialized and compressed, fitting in the environments where it should be deployed. It can be reasoned about and proven correct for the specific domain, preserving ownership and expertise.
– Do One Thing, Do It Well! (GM)
Specialist models can be fused with company knowledge.
Why is OSS AI needed for Enterprise AI?
Multiple AI Bodies need to Collaborate
• Generative AI Commons at Linux Foundation
• MLCommons – MLPerf benchmark, AI safety group
• Foundation models in Climate, Chemistry, Biology, IBM+NASA+…
• Partnership on AI
• Frontier Model Forum
• OECD/WEF working groups
Thoughtful Software Engineering + LLMs
scale.bythebay.io — Oakland, November 13-15 — K1st 30% off passes
