A vision for applying machine learning to
parameterization of unresolved subgrid variability in
climate models using higher-resolution simulations
Christopher S. Bretherton
Departments of Atmospheric Science and Applied Mathematics,
University of Washington
What your atmospheric scientists were doing Monday
Cloud processes are multiscale
A single grid column of a global climate model (100 x 100 km)
Representing such unresolved variability is challenging
Park et al. 2014; Park and Bretherton 2009
…so ‘parameterizations’ of moist physics are difficult to
develop, subjective and work imperfectly.
Feeds into weather/climate model biases & uncertainties
CMIP5 (models) vs. GPCP (observations)
Example: Double-ITCZ rainfall bias in climate models
Subgrid parameterization improvements have come more
through ‘targeted tweaking’ than from fundamental advances.
Cloud processes are better simulated on fine grids
Giga-LES (Khairoutdinov et al. 2009)
100x100 km with 100 m grid
1 km global simulations are (briefly) possible
NICAM
3.5-14 km (mo-yr)
Tomita et al. 2005
0.9 km (for 24 hr):
Miyamoto et al. 2013
Multiyear simulations with ‘ultraparameterization’
Ultrafine-grid 2D cloud-resolving model in each GCM
grid column (Δx = 250 m, Δz = 20 m for z=0.5-2 km)
Parishani et al. 2017
Can hi-res simulations help subgrid parameterizations?
• High-resolution models have progressed faster than moist
physics parameterizations in GCMs
• They are conceptually simpler than GCMs, because cloud
properties and air velocity don’t vary much within a grid cell, so
a complicated model for their subgrid covariability is not
needed.
• They must still parameterize smaller-scale processes (e.g.
cloud droplets a few microns wide, complex ice crystals,
turbulence, aerosols). Like other models, they are works in
progress needing constant testing vs. observations.
• Still, high-resolution simulations provide realistic reference
datasets for parameterizing subgrid cloud process variability
The human bottleneck
• Since 1992, the GCSS program has used this approach
• Improved parameterizations of cumulus convection,
turbulence, and cloud microphysics have been
implemented in leading weather and climate models
• But progress has been slow, and the uptake of new
insights from hi-res modelling and new observations is
difficult because humans concoct parameterizations.
So, how about machine learning (ML)?
• Increasingly large and comprehensive training datasets
from high-resolution simulations, if we trust them.
• A coarse-graining problem: With variables computed by
the coarse-grid model (e.g. temperature, moisture and
wind profiles), use the fine-grid model to return needed
quantities to the coarse-grid model (e.g. rainfall, vertical
profiles of fractional cloud cover, turbulence, atmospheric
heating and drying).
• Ideally, the needed quantities should be stochastic -
sampled from the hi-res derived pdf of internal fine-grid
variability consistent with the coarse-grid variables.
• Could machine learning techniques help?
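The coarse-graining step can be sketched as block averaging of a fine-grid field. This is only an illustration: the function names, the block size and the synthetic field are assumptions, not taken from the talk.

```python
import numpy as np

def coarse_grain(field, block):
    """Average a 2D fine-grid field over non-overlapping block x block tiles."""
    ny, nx = field.shape
    if ny % block or nx % block:
        raise ValueError("field dimensions must be multiples of block")
    tiles = field.reshape(ny // block, block, nx // block, block)
    return tiles.mean(axis=(1, 3))

def subgrid_variance(field, block):
    """Variance of the fine-grid values inside each coarse cell."""
    ny, nx = field.shape
    tiles = field.reshape(ny // block, block, nx // block, block)
    return tiles.var(axis=(1, 3))

rng = np.random.default_rng(0)
# Synthetic stand-in for a fine-grid quantity such as rain rate (hypothetical data).
fine = rng.gamma(shape=2.0, scale=1.0, size=(8, 8))
coarse_mean = coarse_grain(fine, 4)      # one value per coarse cell
coarse_var = subgrid_variance(fine, 4)   # unresolved variability per coarse cell
```

The coarse mean is what a coarse-grid model would carry; the within-cell variance is one simple summary of the subgrid variability that a stochastic scheme would need to sample.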
I wish I knew…
Some machine learning challenges
• Supervised (structured) or unsupervised learning?
• How to use training data to make a scheme stochastic?
• Inputs may be highly multidimensional (e.g. whole
multilevel profiles of temperature and humidity)
• The training dataset only covers real atmospheric states,
but a GCM will stray outside such states.
• What to do in the likely event that a ‘black box’ ML-based
scheme is numerically unstable in a GCM?
• Can we make the scheme ‘modular’ so as to separate
subgrid variability issues from changes in microphysics?
• How can we tune an ML-based scheme to minimize
errors of the GCM vs. observations?
• What if we change the GCM grid resolution?
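One simple guard against inputs that stray outside the training data is a per-feature range check. The sketch below is an assumption for illustration (function names, tolerance and the toy temperature/humidity values are hypothetical), not a method from the talk.

```python
import numpy as np

def fit_range(train_inputs):
    """Per-feature min and max of the training inputs (rows = samples)."""
    return train_inputs.min(axis=0), train_inputs.max(axis=0)

def outside_training_range(x, lo, hi, tol=0.0):
    """True for each sample with at least one feature outside [lo - tol, hi + tol]."""
    x = np.atleast_2d(x)
    return np.any((x < lo - tol) | (x > hi + tol), axis=1)

# Hypothetical training inputs: (temperature in K, specific humidity in kg/kg).
train = np.array([[280.0, 0.010],
                  [290.0, 0.015],
                  [300.0, 0.020]])
lo, hi = fit_range(train)

queries = np.array([[285.0, 0.012],   # inside the sampled range
                    [310.0, 0.012]])  # temperature above anything in training
flags = outside_training_range(queries, lo, hi)
```

A bounding-box test like this is crude: a point can lie inside the box and still be far from any training sample, so it flags only the most obvious extrapolation.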
What we are not trying to do with machine learning
• Pure dimension reduction, i.e. using ML to find a
computationally simpler approximation to a deterministic
but compute-intensive parameterization, e.g.
atmospheric radiative transfer.
• Learn governing equations. Our high resolution model
uses equations for cloud microphysics and fluid motions
that we assume are correct for this purpose. Ideally, we
want to learn the subgrid variability needed to apply
these equations on the coarse grid scale by integrating
over distributions (e.g. joint pdfs of vertical air velocity,
cloud water and rain water) that might not depend on the
microphysics parameterizations.
So this is where you help me
• We have just started a small (3 person) group in UW
Atm Sci, with connections to our eScience institute
• I will describe selected past work and our baby steps.
• I would appreciate the feedback of those of you who
have more thoughts about how to do this.
Past work I: Subramanian and Palmer 2017 JAMES
• ‘Superparameterized’ ECMWF IFS (Small 2D cloud
resolving model represents moist physics in each grid
column of the global model)
• Ensemble of SP-IFS forecasts made from particular days,
differing only in random initial temperature noise used to
kick off small-scale motions in each CRM.
• Is spread of the ensemble forecasts a realistic guide to the
overall forecast uncertainty? That is, is this a useful
strategy for stochastic parameterization of moist processes?
Ensemble of 10-day SP-IFS rainfall forecasts initialized 21 Oct. 2011
Subramanian and Palmer 2017 JAMES
Error and spread of rainfall forecasts vs. satellite obs
Error is within the forecast spread, as desired for stochastic parameterization
…a credible stochastic parameterization strategy based on hi-res CRMs
Subramanian and Palmer 2017 JAMES
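The spread-versus-error comparison can be sketched as follows, assuming synthetic data in which the observation and every ensemble member are drawn from the same distribution (a statistically consistent ensemble). The function name and all numbers are illustrative, not from Subramanian and Palmer.

```python
import numpy as np

def spread_and_error(ensemble, obs):
    """ensemble: (members, points); obs: (points,).
    Returns (mean ensemble spread, RMSE of the ensemble mean)."""
    ens_mean = ensemble.mean(axis=0)
    # Spread: square root of the member variance averaged over all points.
    spread = np.sqrt(ensemble.var(axis=0, ddof=1).mean())
    rmse = np.sqrt(np.mean((ens_mean - obs) ** 2))
    return spread, rmse

rng = np.random.default_rng(1)
signal = rng.normal(size=5000)                       # predictable part (synthetic)
obs = signal + rng.normal(size=5000)                 # "truth" = signal + unpredictable noise
members = signal + rng.normal(size=(10, 5000))       # 10 members with the same noise statistics
spread, rmse = spread_and_error(members, obs)
```

For an ensemble built this way the two numbers come out similar in size; an error much larger than the spread would indicate an overconfident (under-dispersive) ensemble.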
Past work II: Krasnopolsky et al. 2013 Adv. Artif. Neur. Syst.
• Training/testing dataset: 120-day CRM simulating a 256x256
km region of the W Pacific Ocean driven by observed
boundary and surface conditions for Nov 1992-Feb 1993.
Lots of deep cumulonimbus cloud systems and rainfall.
First 80% for training, last 20% for testing.
• Neural net (NN) used to learn how atmospheric heating &
moistening profiles due to all CRM-simulated cloud, radiation
and turbulent processes depend upon the time-varying
atmospheric temperature & moisture profiles (unsupervised
learning of parameterization of all moist physics).
NN training
r = 256
In training data, underlying relation {T(z),q(z)} → {Q1(z),Q2(z),cloud} is
not deterministic; it also depends upon the detailed time evolution of the
convection in the CRM domain. For each use, a NN should be
stochastically drawn from an ensemble of NNs that fit the training data
‘well enough’ to be plausible. They use a 10-member ensemble.
5 layers, 594 parameters
Krasnopolsky et al. 2013
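Drawing one member at random from an ensemble of fitted models can be sketched as below. To keep the example small, linear least-squares fits on bootstrap resamples stand in for the neural nets; both the stand-in model and the bootstrap step are assumptions of this sketch, not the procedure of Krasnopolsky et al.

```python
import numpy as np

rng = np.random.default_rng(2)

# Synthetic training data: 3 input features -> 2 outputs, with noise standing in
# for the non-deterministic part of the input-output relation.
X = rng.normal(size=(500, 3))
W_true = np.array([[1.0, -0.5], [0.3, 0.8], [-0.7, 0.2]])
Y = X @ W_true + 0.3 * rng.normal(size=(500, 2))

def fit_member(X, Y, rng):
    """Fit one ensemble member by least squares on a bootstrap resample."""
    idx = rng.integers(0, len(X), size=len(X))
    W, *_ = np.linalg.lstsq(X[idx], Y[idx], rcond=None)
    return W

ensemble = [fit_member(X, Y, rng) for _ in range(10)]  # 10 members, as on the slide

def stochastic_predict(x, ensemble, rng):
    """Each call uses one randomly drawn member, so repeated calls can differ."""
    W = ensemble[rng.integers(len(ensemble))]
    return x @ W

x_new = np.array([0.5, -1.0, 0.2])
draws = np.array([stochastic_predict(x_new, ensemble, rng) for _ in range(100)])
```

The scatter of `draws` in this sketch reflects only the differences between fitted members, which is smaller than the noise in the training outputs themselves.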
Validation using CRM test data set
NN ensemble predicts the CRM heating and cloud profiles encouragingly well
Krasnopolsky et al. 2013
Diagnostic test of NN param over tropical Pacific
NN moist physics parameterization tested diagnostically using inputs from the CAM climate model over a large region of the tropical Pacific.
NN maintains a reasonable cloud cover over the region, but other aspects of the simulation were not comprehensively validated.
Diagnostic test doesn’t demonstrate NN would perform well if used to ‘prognostically’ drive CAM.
Cloud cover
Krasnopolsky et al. 2013
Improving on this approach
• Can a version of this approach work prognostically?
• Best to train with more comprehensive datasets, e.g.
super/ultraparameterization or global cloud resolving model.
Narenpitak et al. 2017 JAMES
CHOMP: Advancing CLUBB using machine learning
A supervised learning strategy for improving the CLUBB moist turbulence
parameterization used in CAM6 (with Jeremy McGibbon)
CLUBB = Cloud Layers Unified By Binormals (Golaz et al. 2002)
CHOMP = Closing Higher Orders with Machine-learning Parameterization
Based on large-eddy simulations of cloudy atmospheric boundary layers
sampled on 12 cruises of a container ship from LA to Hawaii during DOE’s
MAGIC campaign in Oct. 2011-Sept. 2012 (McGibbon and Bretherton 2017).
The LES dataset
• Hourly 3D samples from twelve 4-day cruises, 400+ vertical levels
• 11 cruises used for training, 1 for testing
CLUBB and CHOMP
• Higher-order turbulence closure
(HOC): Gridcell-mean tendencies
depend on second-order moments
• 2nd-order depends on 3rd-order, etc.
• CLUBB: 11 moments of u, v, w, qt and
θl are prognosed; others are
diagnosed (‘closed’) in terms of these
assuming joint double-Gaussian
PDFs → errors!
• CHOMP: Use random forest
regression trained on LES output for
cloud-topped boundary layer cases to
relate unknown moments (blue) to
prognostic variables.
• Each of the 400+ model levels at
each LES output time gives a sample.
Prognostic variables
One typical CLUBB moment equation
(closure terms revised in CHOMP)
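The closure problem described in the bullets above can be written out for liquid-water potential temperature. This is the generic textbook form for a horizontally homogeneous layer, with mean vertical advection, dissipation and source terms omitted; it is offered as an illustration and is not necessarily the exact equation shown on the slide or coded in CLUBB. The reference values θ_0 and ρ_0 are notation assumed here.

```latex
\begin{align}
\frac{\partial \overline{\theta_l}}{\partial t}
  &= -\frac{\partial \overline{w'\theta_l'}}{\partial z}, \\
\frac{\partial \overline{w'\theta_l'}}{\partial t}
  &= -\frac{\partial \overline{w'^2\theta_l'}}{\partial z}
     - \overline{w'^2}\,\frac{\partial \overline{\theta_l}}{\partial z}
     + \frac{g}{\theta_0}\,\overline{\theta_l'\theta_v'}
     - \frac{1}{\rho_0}\,\overline{\theta_l'\frac{\partial p'}{\partial z}} .
\end{align}
```

The mean tendency needs the second-order flux, and the flux equation in turn needs the third-order moment, the buoyancy covariance and the pressure covariance; moments like these are what a closure, whether an assumed PDF or a learned regression, has to supply.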
Profile of input moments at one time in test data
Profiles of desired output moments from same time
• Random forest (CHOMP) matches LES better than CLUBB.
Challenges for CHOMP
• Random forest outputs are piecewise constant functions
of inputs, but we need to take their vertical derivatives.
Even after smoothing this leads to numerical instability in
the prognostic variables.
• A more efficient neural net is harder to train and less
robust on this data set.
• What to do when inputs go well outside the range of the
training data?
• The LES outputs are partly stochastic functions of their
inputs. This would be nice to preserve in CHOMP, but
the random forest regression mostly averages this out.
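The first challenge above, differentiating a piecewise-constant regression output, can be illustrated with a toy profile. Here a bin-average fit stands in for a tree-based regressor; the bin count, the sine target and the identification of the input with height are assumptions of this sketch.

```python
import numpy as np

z = np.linspace(0.0, 1.0, 401)      # 400+ levels, as in the LES dataset
dz = 1.0 / 400                      # uniform level spacing
x = z                               # hypothetical input that varies smoothly with height
target = np.sin(2 * np.pi * x)      # smooth "true" output profile

# Piecewise-constant fit: average the target within 20 equal-width bins of the input.
edges = np.linspace(0.0, 1.0, 21)
bin_index = np.clip(np.digitize(x, edges) - 1, 0, 19)
bin_means = np.array([target[bin_index == k].mean() for k in range(20)])
pred = bin_means[bin_index]

# Vertical derivatives by centered differences.
d_true = np.gradient(target, dz)
d_pred = np.gradient(pred, dz)

frac_flat = np.mean(d_pred == 0.0)                        # share of levels with zero derivative
max_ratio = np.abs(d_pred).max() / np.abs(d_true).max()   # size of spikes at bin edges
```

Although `pred` tracks `target` closely, its derivative is zero at most levels and spikes where the prediction jumps between bins, which is the kind of behaviour that can destabilize a prognostic equation.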
Outlook
• Machine learning using comprehensive training datasets from
realistic high-resolution models of clouds, storms, turbulence
→ could break the human parameterization bottleneck?
• But has not yet been successfully used to develop a moist
physics parameterization implemented in a global model.
• Best approach is unclear, including how to implement
stochasticity and how to tune using observational data.
• Lots of potential and technical challenges - your help needed!

Program on Mathematical and Statistical Methods for Climate and the Earth System Opening Workshop, Developing Stochastic Parameterizations of Subgrid Variability of Clouds and Turbulence using High-Resolution Simulations - Chris Bretherton, Aug 24, 2017
