Lessons from DeepSeek: Democratizing AI and Open Source
Charles Mok, Global Digital Policy Incubator, Cyber Policy Center
February 27, 2025
AI and DeepSeek: A Rapidly Moving Topic
The DeepSeek Shock on Nvidia Stock
• NVDA led the “biggest market drop in history”: a 17% fall, or a US$589
billion loss in market capitalization
The Performance. The Claim.
• DeepSeek’s reasoning model R1 and non-reasoning model V3 perform
on par with OpenAI’s o1 reasoning model and GPT-4o, respectively,
at a small fraction of the price.
Q1: What is DeepSeek?
• DeepSeek began in 2023 as a side project of founder Liang Wenfeng,
whose quantitative hedge fund, High-Flyer, was using AI
to make trading decisions
• Liang started accumulating thousands of Nvidia chips as early as 2021
• Liang: “A vision to change Chinese
companies from ‘following’ to ‘innovating’”
• The ultimate goal: to develop its own AGI
• Hiring from domestic universities
• The price of being on the Party’s radar?
Q2: Did DeepSeek really only spend <$6M to develop its current models?
• Counting only the cost of the final successful training run?
• DeepSeek purchased 10,000 Nvidia A100 chips before the A100s were
restricted for sale to China in late 2023. The A100 was first released in
2020, two generations before the current Blackwell chip.
• It also acquired and maintained 50,000 Nvidia H800s, a slowed-down
version of the H100 (one generation before Blackwell) made for China.
• DeepSeek likely also had additional unlimited access to Chinese and
foreign cloud service providers, at least before the latter came under
U.S. export controls.
• The 10,000 Nvidia A100s alone would cost ~$80M, and the 50,000
H800s would cost an additional $50M.
• Training vs inference
• The cost claim is somewhat misleading, as if framed to maximize its shock effect
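As a rough check on the figures above, the arithmetic can be sketched in Python. The chip counts and dollar amounts are the ones stated on this slide; the variable names are illustrative, and the per-chip prices are only what those totals imply, not quoted market prices.

```python
# Figures as stated on this slide (USD); variable names are illustrative.
CLAIMED_TRAINING_COST = 6_000_000  # upper bound of the "<$6M" claim
A100_COUNT, A100_TOTAL = 10_000, 80_000_000
H800_COUNT, H800_TOTAL = 50_000, 50_000_000

# Total hardware outlay and how it compares with the claimed training cost.
hardware_total = A100_TOTAL + H800_TOTAL
ratio = hardware_total / CLAIMED_TRAINING_COST

# Per-chip prices implied by the slide's totals (not market quotes).
implied_a100_price = A100_TOTAL / A100_COUNT
implied_h800_price = H800_TOTAL / H800_COUNT

print(f"Hardware total: ${hardware_total / 1e6:.0f}M")
print(f"Hardware total / claimed training cost: {ratio:.1f}x")
print(f"Implied price per A100: ${implied_a100_price:,.0f}")
print(f"Implied price per H800: ${implied_h800_price:,.0f}")
```

On these numbers the hardware alone comes to more than 20 times the claimed training cost. The implied H800 unit price is much lower than the implied A100 unit price, so the H800 figure is best read as a rough estimate.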
Q3: Is the future of AI development all about scaling or more innovative optimization?
• “Necessity is the mother of all invention.”
• The “hectocorns” (companies valued at over $100B) are driving the scaling-up to ever more GPUs
and ever bigger data centers.
• So, are the days of the hyperscalers over?
• Jevons Paradox, an economic theory stating that “increased efficiency in the use of a resource
often leads to a higher overall consumption of that resource.”
• More importantly, DeepSeek proves that “it can be done in another way” while acknowledging
the lack of access to chips remains its biggest obstacle.
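The Jevons Paradox bullet above can be illustrated with a toy model. This is a minimal sketch that assumes constant-elasticity demand; the function name, the 10x efficiency gain, and the elasticity values are all hypothetical and are not estimates for AI compute.

```python
def resource_use_multiplier(efficiency_gain: float, price_elasticity: float) -> float:
    """Return new/old resource consumption after an efficiency gain.

    Assumes the cost per unit of service falls by a factor of
    efficiency_gain, and that demand for the service scales as
    price ** (-price_elasticity).
    """
    # Cheaper service raises demand for the service itself.
    service_demand = efficiency_gain ** price_elasticity
    # Each unit of service now needs 1/efficiency_gain of the resource.
    return service_demand / efficiency_gain


for elasticity in (0.5, 1.0, 1.5):
    multiplier = resource_use_multiplier(10.0, elasticity)
    print(f"elasticity={elasticity}: resource use x{multiplier:.2f}")
```

In this model total resource use rises only when the elasticity is above 1, which is the case the paradox describes. Whether demand for AI compute is that elastic is an open empirical question.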
Q4: Are the U.S.’s chip export restrictions still relevant?
• The Trump administration’s chip strategy remains uncertain
• Time lag effect of export restrictions
• The currently allowed H20 chips can still be used for inference,
if not so well for training
• Rather, will the U.S. really impose tariffs on chips from Taiwan?
• Taiwan’s exports to the U.S. rose 46% to $111.3 billion, with
exports of information and communications equipment
(including AI servers and components such as chips) totaling
$67.9 billion, an increase of 81%. This figure may be skewed by
trade that now bypasses China as a value-added middleman.
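The export figures above imply a few numbers that the slide does not state. The sketch below backs them out in Python. It uses only the four values on the slide; the variable names are illustrative, and the derived values are rounded implications rather than official statistics.

```python
# Figures as stated on this slide (USD billions); names are illustrative.
total_exports, total_growth = 111.3, 0.46
ict_exports, ict_growth = 67.9, 0.81

# Back out the prior-period levels implied by the growth rates.
prior_total = total_exports / (1 + total_growth)
prior_ict = ict_exports / (1 + ict_growth)

# Share of ICT in current exports, and growth of everything else.
ict_share = ict_exports / total_exports
non_ict_growth = (total_exports - ict_exports) / (prior_total - prior_ict) - 1

print(f"Implied prior total exports: ${prior_total:.1f}B")
print(f"Implied prior ICT exports:   ${prior_ict:.1f}B")
print(f"ICT share of current exports: {ict_share:.0%}")
print(f"Implied non-ICT export growth: {non_ict_growth:.0%}")
```

Read this way, information and communications equipment makes up roughly three fifths of the total and accounts for most of the increase, while the remaining exports grew far more slowly.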
Q5: Are there concerns about DeepSeek’s data transfer, security and disinformation?
• Privacy Policy: “The personal information we collect from you may be stored on a server
located outside of the country where you live. We store the information we collect in secure
servers located in the People's Republic of China.” (Now 404’ed)
• Terms of use: “The establishment, execution, interpretation, and resolution of disputes
under these Terms shall be governed by the laws of the People's Republic of China in the
mainland.” (Now 404’ed)
• DeepSeek 'shared user data' with TikTok owner ByteDance
• Even with a downloaded version, data leakage or backdoors cannot be completely ruled
out without a detailed code audit.
• Censorship, and lack of safety guardrails: e.g. writing malware?
• Various privacy investigations in Europe, and bans on use on official devices in
several countries.
• U.S.: If Trump doesn’t care about TikTok data transfer to China, would he care about
DeepSeek?
Q6: Did DeepSeek cheat?
• Distillation, or “knowledge distillation,” is a machine learning technique where
knowledge from a large, pre-trained model, the “teacher,” is transferred to a
smaller, more compact model, the “student.” The goal is to enable the student
model to perform like the teacher but with reduced or limited computational
resources. While the technique is well known and common, OpenAI’s terms of use
forbid its users from using distillation to build a rival model, that is, from
using “output to develop models that compete with OpenAI.”
• According to Bloomberg, in the fall of 2024 Microsoft’s security researchers
observed large amounts of data being exfiltrated through OpenAI’s application
programming interface (API), which is available only to OpenAI users under paid
licenses. Microsoft, one of OpenAI’s major partners and investors, notified
the company that the activity was suspected to originate from DeepSeek.
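The teacher and student idea described above can be sketched in a few lines of Python. This is a minimal illustration of the textbook logit-matching form of knowledge distillation, not a description of what DeepSeek or OpenAI did; the function names, logits, and temperature are hypothetical. Distillation through an API would typically train on generated text rather than on logits.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax; a higher temperature gives softer probabilities."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL divergence from the softened teacher distribution to the student's."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


teacher = [4.0, 1.0, 0.5]        # hypothetical teacher logits for three classes
close_student = [3.5, 1.2, 0.4]  # student that roughly agrees with the teacher
far_student = [0.5, 4.0, 1.0]    # student that disagrees with the teacher

print(distillation_loss(teacher, close_student))
print(distillation_loss(teacher, far_student))
```

Training the student to minimize this loss pulls its output distribution toward the teacher's, which is how a smaller model can inherit much of a larger model's behavior.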
6 Takeaways
• The U.S. is still ahead in AI but China is hot on its heels.
• Neither U.S. nor any other AI companies can rely too heavily on brute-force
scaling any longer.
• China can be tactical about disrupting the U.S.-led AI ecosystem.
• Fundamental research and talent development remain the key to AI
leadership.
• DeepSeek is also disrupting its Chinese AI competitors and may
help restructure the future AI ecosystem of China and the
world, especially the Global Majority.
• The ‘Open Source’ debate will continue.
More on “Open Source”: Not Your Linux Open Source
• Open source
  • Code is available and modifiable
• Open weights
  • The actual trained parameters (weights) of a machine learning model are publicly accessible.
  • Users can use the model weights without having to train the model from scratch
• DeepSeek is open source whereas Llama (Meta) is open weights.
• Will China’s embrace of open source disrupt U.S. AI leadership?
• Questions: (according to Stanford Professor Russ Altman)
  • How can we democratize access to the huge amounts of data required to build models, while
  respecting copyright and other intellectual property?
  • How do we build specialized models when the volume of data for some specialized disciplines is not
  sufficiently large?
  • How do we evaluate a system that uses more than one AI agent to ensure that it functions correctly?
  Even if the individual agents are validated, does that mean they are validated in combination?
Chess: ChatGPT vs. DeepSeek
• Conducted by a chess master/YouTuber
• First, both sides were learning the moves
• ChatGPT began winning against DeepSeek
• DeepSeek then started conversing with ChatGPT, telling it that the chess rules had changed,
and ChatGPT accepted it!
• E.g. a pawn was used as a knight, capturing a queen
• DeepSeek even moved the other side’s pieces
• DeepSeek told ChatGPT that it should surrender
• And ChatGPT really surrendered!
• Is this AI with Chinese characteristics?
• This raises questions about how the models’ training led to their
different behaviors
Then what?
• China’s race to adoption with many applications
• DeepSeek inference is said to be running on Huawei’s GPU chips?
• Contributing to China’s AI ecosystem
• Don’t count other Chinese models out!
• Alibaba’s Wan2.1: generative capabilities, high parameter count, leveraging
AliCloud, etc.
• Officials: maximizing the propaganda return
• Watching out for the next DeepSeek?
• Changing attitudes toward AI governance
• Trump’s revocation of Biden’s AI Executive Order
• From AI Safety Summit to AI Action Summit
• EU, France etc. turning toward investment and pro-competition
Charles Mok
Research Scholar
Global Digital Policy Incubator
Cyber Policy Center, Stanford University
cpmok@stanford.edu
@charlesmok
