NSF’s Computational Ecosystem for
21st Century Science and Engineering
Amy Walton, Deputy Director
Office of Advanced Cyberinfrastructure
National Science Foundation
Fourth National Research Platform (4NRP) Workshop
September 9, 2023
Topics
• Looking Back – The Pacific Research Platform
• A Productive Experiment
• Moving Targets
• Acknowledgements
• Looking Forward – A National Research Ecosystem
• Challenges
• Opportunities
• Resources
NSF 15-534: Data, Networking, and Innovation
An initial – and productive – collaboration between two OAC programs:
• Campus Cyberinfrastructure (CC*)
• Data Infrastructure Building Blocks (DIBBs)
Area 1: Multi-Campus/Multi-Institution Model Implementations
Emphasis on integration of data and network infrastructure activities
• Awards served as models for potential future national-scale, network-aware,
data-focused cyberinfrastructure.
• Expected to be science-driven, demonstrating a strong and credible connection
to the multi-campus, multi-institutional, and/or regional scientific communities
they serve.
• Emphasized the value of sharing data beyond a specific institution to the wider
science, engineering, and education communities.
Pacific Research Platform: Then and Now
• Goal: Expand the campus Science DMZ network systems
model into a regional model for data-intensive science.
• The PRP data-sharing architecture allowed region-wide
virtual co-location of data with computing.
• Endpoints of PRP sites -- devices called Flash I/O Network
Appliances (FIONAs) -- were incorporated into
a Kubernetes cluster of FIONAs called Nautilus.
• Data can traverse multiple, heterogeneous networks with
minimal performance degradation.
Now uses 11 major regional/national networks:
• 737 namespaces (projects)
• >2,100 users
• Researchers at 94 US campuses in 39 states
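To make the "namespaces (projects)" idea concrete, here is a minimal sketch of how a project-scoped GPU job might be described for a Kubernetes cluster such as Nautilus. The namespace, pod name, and container image are hypothetical placeholders that do not come from the slides. `nvidia.com/gpu` is the resource name usually exposed by the NVIDIA device plugin, and it is an assumption that a given cluster uses it. The code only builds and prints a manifest; it does not contact any cluster.

```python
import json


def gpu_pod_manifest(namespace: str, name: str, image: str, gpus: int = 1) -> dict:
    """Build a Kubernetes Pod manifest (as a plain dict) scoped to one namespace."""
    if gpus < 1:
        raise ValueError("gpus must be at least 1")
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        # The namespace ties the pod to one project on the shared cluster.
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "restartPolicy": "Never",
            "containers": [
                {
                    "name": "main",
                    "image": image,
                    # Extended resources such as GPUs are requested under "limits".
                    "resources": {"limits": {"nvidia.com/gpu": gpus}},
                }
            ],
        },
    }


if __name__ == "__main__":
    # All three arguments are illustrative placeholders.
    manifest = gpu_pod_manifest("example-project", "train-job", "example.org/ml-image:latest")
    print(json.dumps(manifest, indent=2))
```

Keeping the manifest as plain data means it can be inspected or written to a file before anything is submitted, which suits a shared platform where each project works inside its own namespace.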
Not Mentioned in the Original Proposal:
• Kubernetes
• Containers
• Automation
• Jupyter
• Ceph
These technologies emerged and were integrated into what became Nautilus
during the period of the PRP grant.
• Machine Learning
• Artificial Intelligence
• Neutrino Observatory
• COVID
• Wildfires
While all applications listed in the original proposal were addressed, these
applications became some of the largest PRP CPU/GPU consumers.
Acknowledgements: Many Contributors
PRP cyberinfrastructure has increased compute capacity through several sources:
• Individual data-intensive research faculty at multiple campuses used their
grant resources
• This added ~1/4 of the total GPUs on Nautilus
Today, Nautilus has nearly 20,000 CPU-cores and nearly 1,500 GPUs
Contributing NSF awards:
• CHASE-CI [CISE/CNS] 1713149: Cloud of GPUs for faculty to train AI algorithms
• CHASE-CI [CISE/CNS] 2100237 and 2120019: Additional GPU nodes, expand community
• T-NRP 1826967 (CC*): Connect Quilt Regional Networks using CENIC and Internet2
• Expanse 1928224 (ACSS-I): NVIDIA GPUs, cloud integration, composable systems
• Voyager 2005369 (ACSS-II): AI-focused hardware, Intel/Habana tools
• Prototype NRP 2112167 (ACSS-II): Distributed across SDSC, U Nebraska – Lincoln,
and MGHPCC
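As a rough worked example of the capacity figures on this slide, the sketch below estimates how many GPUs the faculty-contributed share represents. It assumes the rounded values quoted in the slide ("nearly 1500 GPUs" and "~1/4"), so the result is an upper-end approximation, not a reported count. The function and constant names are illustrative only.

```python
from fractions import Fraction

TOTAL_GPUS = 1500               # "nearly 1500 GPUs" on the slide, treated as an upper bound
FACULTY_SHARE = Fraction(1, 4)  # "~1/4 of the total GPUs"


def faculty_contributed_gpus(total: int = TOTAL_GPUS, share: Fraction = FACULTY_SHARE) -> int:
    """Approximate number of GPUs added through individual faculty grants."""
    return int(total * share)


if __name__ == "__main__":
    contributed = faculty_contributed_gpus()
    print(contributed)               # 375
    print(TOTAL_GPUS - contributed)  # 1125 from other sources
```

Under these assumptions, roughly 375 of the GPUs on Nautilus would have come from individual faculty grants, with the remainder funded through the other sources acknowledged on the slide.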
Looking Forward: Cyberinfrastructure that
Enables Research Across Science Disciplines
Disciplines shown: Astronomy, Physics, Computational Bio, Material Science,
Evolutionary Bio, Climatology
Challenges:
• Large instruments producing
• Big data requiring
• Big compute for
• Highly collaborative scientists
in
• Different specializations
across
• Widely Distributed
infrastructure that must be
• Available, ensure
• Workflow Integrity, and be
• Easy to use while adhering to
• Regulatory or policy
requirements
Data Cyberinfrastructure
• Federal guidance on Open Science and Public Access presents new
opportunities for an agile, scalable and equitable national data
cyberinfrastructure to support data sharing.
• Recent OAC CC* awards provided federated campus storage.
• Required: Follow NSF data practices; sustainability plan; integrate into networks
• Future Directions: How to capitalize on existing investments and achieve a
national scale data CI to support equitable access to and use of data using
FAIR principles?
• Our proposed solution: A loosely federated approach of existing and new repositories
and infrastructure that adhere to basic agreed principles.
• Repositories and other data projects that join the network gain benefit from shared
resources and services.
CI Professionals
• A significant barrier to use of national resources is access to CI
professionals who can provide expertise and support that are
responsive to local needs.
• The new ACCESS Computational Science Support Network (CSSN)
provides a framework for engaging, training/mentoring, and
coordinating a network of CI professionals.
• The new SCIPE Solicitation (NSF 23-574) supports CI professionals
at the campus or regional level.
• Enables engagement of CI professionals in the ACCESS Computational
Science Support Network.
• Requires: A plan for mentoring, professional development, and
sustainability; and 20% of supported individual’s time be dedicated to
national activities.
NSF-supported Advanced CI Resources
Resource categories: Leadership-class, Capacity Systems, Distributed Services,
Cloud resources, Innovative Prototypes/Testbeds
• Anvil: Purdue University
• Bridges 2: Carnegie Mellon University
• Delta: U of Illinois, Urbana-Champaign
• Expanse: U of California, San Diego
• Jetstream 2: Indiana University + Partners
• Stampede 2: U of Texas, Austin
• Frontera: U of Texas, Austin
• Neocortex: Carnegie Mellon University
• Voyager: U of California, San Diego
• Ookami: Stony Brook University
• NRP: U of California, San Diego
• ACES: Texas A&M University
• Cloudbank: U of California, San Diego
• CloudLab: University of Utah
• Chameleon: University of Chicago
• PATh/OSG: U of Wisconsin, Madison
• ACCESS: Several Partners
Learn how to access resources at access-ci.org
Democratizing Science through Cyberinfrastructure
Broad, fair, and equitable access to
advanced computing is essential to
democratizing science in the 21st
century
• Significant barriers
• Knowledge: Awareness, discovery, expertise,
support
• Technical: Allocation, access, on-ramps
• Social: Awareness of the importance of
access to CI, rewards structures
• Complex tradeoffs / optimizations
• Capacity vs. capability
• Stability vs. innovation
• Performance vs. ease of use
• Expert vs. novice
M. Parashar, "Democratizing Science Through Advanced Cyberinfrastructure,"
Computer, vol. 55, no. 9, pp. 79-84, 2022. doi:10.1109/MC.2022.3174928
Advanced Computing Ecosystem as a Strategic
National Asset
National Strategic Computing
Reserve (NSCR)
• A coalition of experts
and resource providers
that could be mobilized
quickly to provide
critical computational
resources in times of
urgent need
• Build on experiences
from the COVID-19
HPC Consortium,
responses to RFI
• Aligns with the FACE
Strategic plan
NSF’s Advanced Cyberinfrastructure
Ecosystem: Highly Accessible
Computing
• Network of advanced
systems and services
• Leadership and
capacity systems,
testbeds
• Federation (PATh) and
coordination services
(ACCESS)
• Scalable user support
networks
https://www.whitehouse.gov/wp-content/uploads/2021/10/National-Strategic-Computing-Reserve-Blueprint-Oct2021.pdf
Democratized access to an
advanced CI Ecosystem
Realizing an Advanced CI Ecosystem for All
• Integrated and user-friendly portals and
gateways for discovering and accessing
resources;
• Access to local CI resources as part of a shared
fabric of national CI resources reachable through
high-speed frictionless data networking;
• Diverse and flexible allocation and access
modes that support a diversity of users and
applications;
• Agile, easily accessible, and scalable networks of
experts providing embedded expertise and
support that is responsive to local needs; and
• Broadly accessible training targeting the
spectrum of CI users and skills.
The Missing Millions: Democratizing Computation and Data
to Bridge Digital Divides and Increase Access to Science for
Underrepresented Communities (A. Blatecky, EAGER)
https://www.rti.org/publication/missing-millions/fulltext.pdf
