Numerically Optimized Empirical Modeling
 of Highly Dynamic, Spatially Expansive,
     and Behaviorally Heterogeneous
       Hydrologic Systems – Part 2
    Jana Stewart, U.S. Geological Survey, Middleton, WI
       Matthew Mitro, Wisconsin DNR, Madison, WI
     Ed Roehl, Advanced Data Mining, LLC, Greer, SC
     John Risley, U.S. Geological Survey, Portland, OR
Part 1
International Environmental
  Modelling and Software
Society 2006, Burlington VT
Upper Floridan Aquifer, Suwannee River Valley, Florida
• Research – MLP ANNs to spatially interpolate
• Highly spatially discontinuous
  – MLP ANNs – continuous functions
  – Optimally segment well behaviors?
• High temporal variability
[Figure: 16-year hydrographs]
[Figure: well locations (100 x 100 miles)]
Western Oregon Stream Temperature Modeling
• Thermal TMDL
• Modeled Output - ST hourly time series, Jun-Oct 1999, at 146 “pristine” sites
• Potential Inputs
  – STATIC - 34 variables, including stream shading and basin forestation
  – CLIMATE TIME SERIES - 65 hourly air temperature, dew-point, solar
    radiation, barometric pressure, snowpack, and precipitation from 25
    locations.
[Map: ST sites and climatic sites in western Oregon. Labels: Pacific Ocean, Portland, Corvallis, Eugene, Ashland; Coast Range, Willamette Valley, Cascades, and Klamath Mountains Ecoregions]
Objectives
• Model Highly Dynamic, Spatially Expansive,
  and Behaviorally Heterogeneous Hydrologic
  Systems
• Divide and conquer – big problem
  transformed into multiple small problems
• Use a sequence of numerically optimized
  algorithms
  – minimize subjectivity
Steps (divide and conquer)
1. SEGMENT DATA - into behavioral classes
  •   Cluster time series - k-means, SOM
      –   Intermediate cross correlation matrix
  •   Bonus – identifies redundant/unique sites for network
      optimization
2. MODEL EACH BEHAVIORAL CLASS separately
  •   Process signals to separate low and high frequency
      components
  •   “Stacked” data set for training
  •   Decorrelate input variables as needed
  •   ANNs – multivariate, non-linear curve fitting
  •   Sub-models of low and high frequency components,
      combine predictions = “super model”
  •   Sensitivity analysis determines which static and time
      series variables are predictive
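Step 1 can be sketched as follows. This is a minimal illustration of one reading of the slide, in which each site is described by its row in the cross correlation matrix and those rows are clustered with k-means. The synthetic series, the function names, and the farthest-point initialization are assumptions made for the example, not details from the presentation.

```python
import math

def pearson(a, b):
    """Pearson correlation between two equal-length series."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

def correlation_matrix(series):
    """Intermediate matrix: row i holds site i's correlation with every site."""
    return [[pearson(a, b) for b in series] for a in series]

def sq_dist(u, v):
    return sum((x - y) ** 2 for x, y in zip(u, v))

def kmeans(rows, k, max_iter=100):
    """Plain k-means; initial centers chosen by farthest-point selection (assumption)."""
    centers = [list(rows[0])]
    while len(centers) < k:
        far = max(rows, key=lambda r: min(sq_dist(r, c) for c in centers))
        centers.append(list(far))
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(k), key=lambda c: sq_dist(r, centers[c])) for r in rows]
        if new_labels == labels:
            break
        labels = new_labels
        for c in range(k):
            members = [r for r, lab in zip(rows, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

# Synthetic stand-ins for site time series (not data from the study):
# three sites follow a sine pattern, three follow a cosine pattern.
w = 2 * math.pi / 50
t = range(200)
group_a = [[math.sin(w * i) + 0.05 * s * math.sin(5 * w * i) for i in t] for s in range(3)]
group_b = [[math.cos(w * i) + 0.05 * s * math.cos(3 * w * i) for i in t] for s in range(3)]
series = group_a + group_b

labels = kmeans(correlation_matrix(series), k=2)
print(labels)  # one behavioral class id per site
```

Sites that share a class and correlate almost perfectly with one another are the candidates for the "redundant site" bonus noted in step 1.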
Steps – cont.
3.       BUILD CLASSIFIER – to link static site
         characteristics to dynamic behaviors (classes)
     •      static inputs ⇒ mapping function ⇒ class id
     •      kriging in Floridan Aquifer (x,y,class id)
     •      classification model
            –   Nearest neighbor classifier (linear)
            –   ANN-classifier (non-linear)
4.       RUN MODEL
     i.     Input new site vector of static inputs
     ii.    Run classifier to select behavioral model
     iii.   Run behavioral model
     iv.    Write output
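Steps 3 and 4 can be illustrated with a small sketch: a nearest neighbor classifier maps a new site's static vector to a class id, and the class id selects which behavioral model to run. The training sites, the two-element static vectors, and the linear stand-in models are all hypothetical; in the presentation the behavioral models are ANNs.

```python
import math

# Hypothetical training sites: (scaled static vector, behavioral class id).
TRAINING_SITES = [
    ((0.9, 0.1), 1),
    ((0.8, 0.2), 1),
    ((0.2, 0.7), 2),
    ((0.1, 0.9), 2),
]

# Stand-ins for the trained behavioral models, one per class (assumption).
BEHAVIORAL_MODELS = {
    1: lambda air_temp: 0.5 * air_temp + 5.0,
    2: lambda air_temp: 0.25 * air_temp + 8.0,
}

def classify(static_vector):
    """Return the class id of the closest training site (Euclidean distance)."""
    nearest = min(TRAINING_SITES, key=lambda site: math.dist(site[0], static_vector))
    return nearest[1]

def run_model(static_vector, climate_series):
    """Classify the new site, then run the behavioral model for its class."""
    class_id = classify(static_vector)
    model = BEHAVIORAL_MODELS[class_id]
    return class_id, [model(x) for x in climate_series]

class_id, predicted = run_model((0.85, 0.15), [10.0, 20.0])
print(class_id, predicted)
```

Keeping the classifier separate from the behavioral models means a new site needs only its static inputs to be routed to a model trained on dynamically similar sites.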
Clustering Results – Floridan Aquifer
• Indicates well redundancy
• 12 classes – probably more than necessary
Accuracy by Cluster
[Figure: actual vs. predicted normalized water level above sea level for clusters C1, C3, C6, and C10; history from Apr 1982 to Oct 1998]
Super Model Prediction
• Max elevation above sea level ~ 180 feet
[Figure: run-time application display; map labels: Suwannee River, Gulf of Mexico]
Western Oregon – 1 of 6 validation sites not used for training
[Figure: time series plot, June through September; vertical axis ticks from 11 to 21]
Western Oregon – another validation site
[Figure: time series plot, June through September; vertical axis ticks from 5 to 14]
• Good dynamics
• Static inputs primary source of error
Part 2
 HIC06
Wisconsin Temperature Modeling
      • Fisheries management
      • Modeled Output
        – 254 ST daily time series
          measured Jun-Aug, 1990-2002
           • temporally discontinuous –
             different sites measured
             different years
      • Potential Inputs
        – STATIC - 42 variables including
          land cover, drainage area, and
          streambed characteristics
        – CLIMATE TIME SERIES- 353
          daily air temperature, dew-point,
          solar radiation, barometric
          pressure and precipitation from
          25 locations.
Asynchronous Site Monitoring
•   Modified time series clustering method
•   Steps
    a) Compile populations having overlapping signals
       •   1998 to 2002 made up 241 of the 254 sites
    b) Estimate # classes per population, then choose same k
       for all populations. k=3 for Wisconsin model
    c) Apply the standard time series clustering algorithm to
       each population using k
    d) Perform sensitivity analyses with prototype ANN
       classification models - determine best static variables
    e) Determine overall best static variables
    f) Cluster all sites using best static variables
    g) ANN dynamic models of each behavioral class as before.
    h) ANN classification models as before for “new sites”
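The first step above, compiling populations with overlapping signals, can be sketched as follows. The slides do not define a population precisely, so this example assumes a population is the set of sites with at least one observation inside a chosen window of years, and that sites are compared only on the dates they share. The record layout, site names, and `min_overlap` threshold are hypothetical.

```python
import math

def compile_population(records, start_year, end_year):
    """Sites with at least one observation dated inside [start_year, end_year]."""
    return sorted(
        site for site, obs in records.items()
        if any(start_year <= year <= end_year for (year, _day) in obs)
    )

def overlap_correlation(a, b, min_overlap=3):
    """Pearson correlation over dates present in both records, else None."""
    shared = sorted(set(a) & set(b))
    if len(shared) < min_overlap:
        return None
    xs = [a[d] for d in shared]
    ys = [b[d] for d in shared]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return None
    return sxy / math.sqrt(sxx * syy)

# Hypothetical records: site -> {(year, day_of_year): daily stream temperature}.
records = {
    "site_a": {(1999, 152): 14.0, (1999, 153): 15.0, (1999, 154): 16.5, (2000, 152): 13.0},
    "site_b": {(1999, 152): 11.0, (1999, 153): 12.0, (1999, 154): 13.5},
    "site_c": {(1992, 152): 17.0, (1992, 153): 18.0, (1992, 154): 17.5},
}

population = compile_population(records, 1998, 2002)
r_ab = overlap_correlation(records["site_a"], records["site_b"])
r_ac = overlap_correlation(records["site_a"], records["site_c"])
print(population, r_ab, r_ac)
```

Returning `None` for pairs without enough shared dates keeps sites that were never monitored together from being compared at all, which is the reason for clustering within each population separately.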
Best Static Variables
                                                        Top variables
Variable description                                    6     10    14
Land cover–agriculture (W)                              *     *     *
Area–drainage area (W)                                  *     *     *
Land cover–forest (W)                                   *     *     *
Bedrock depth–depth to bedrock (0–50 feet) (W)          *     *     *
Surficial deposit texture–medium (W)                    *     *     *
Stream network–downstream link (S)                      *     *     *
Stream network–gradient (S)                                   *     *
Land cover–wetland (W)                                        *     *
Darcy value–darcy (W)                                         *     *
Bedrock depth–depth to bedrock (51–100 feet) (W)              *     *
Land cover–urban (W)                                                *
Surficial deposit texture–fine (W)                                  *
Bedrock type–sandstone (W)                                          *
Bedrock depth–depth to bedrock (101–200 feet) (W)                   *
Measured & Predicted Class 1 Stream Temps
[Figure: measured and predicted stream temperatures]
• 14 “test” sites not used to train ANNs
   – concatenated
   – June – August
• R² = 0.66
• Dynamically good
• Offsets (high or low) from static variables
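The combination of good dynamics with a moderate R² can be illustrated numerically. The slide does not say which definition of R² was used, so this sketch computes two common ones on made-up numbers: the coefficient of determination, which is penalized by a constant offset, and the squared Pearson correlation, which is not. The values are illustrative only and are not data from the study.

```python
import math

def coefficient_of_determination(measured, predicted):
    """1 - SS_res / SS_tot; sensitive to a constant bias in the predictions."""
    mean_m = sum(measured) / len(measured)
    ss_res = sum((m - p) ** 2 for m, p in zip(measured, predicted))
    ss_tot = sum((m - mean_m) ** 2 for m in measured)
    return 1.0 - ss_res / ss_tot

def squared_pearson(measured, predicted):
    """Squared correlation; unchanged by a constant offset."""
    n = len(measured)
    mm, mp = sum(measured) / n, sum(predicted) / n
    cov = sum((m - mm) * (p - mp) for m, p in zip(measured, predicted))
    var_m = sum((m - mm) ** 2 for m in measured)
    var_p = sum((p - mp) ** 2 for p in predicted)
    return cov ** 2 / (var_m * var_p)

# Made-up temperatures: the prediction tracks every rise and fall
# but sits 1 degree too high, as a static-variable offset might cause.
measured = [14.0, 16.0, 18.0, 16.0, 14.0]
predicted = [m + 1.0 for m in measured]

r2_det = coefficient_of_determination(measured, predicted)
r2_corr = squared_pearson(measured, predicted)
print(round(r2_det, 3), round(r2_corr, 3))
```

Under these assumptions the squared correlation stays at 1 while the coefficient of determination drops to roughly 0.55, which shows how a pure offset can lower a fit statistic even when the dynamics are reproduced.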
Conclusions
•   Numerical methods
    1. Signal processing, e.g., spectral filtering
    2. Clustering, e.g., k-means
    3. ANN non-linear, dynamic sub-models of behavioral
       components assembled into super-model
    4. Classification, e.g., ANN non-linear classifier
•   Approach uses all available static and time series
    data
•   Divide and conquer makes big problems tractable
•   Near optimal results – limited by data quality
•   Compact finished model
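The signal-processing item in the list above (separating low and high frequency components so each can get its own sub-model) can be sketched with a simple filter. The slides name spectral filtering as an example; this sketch swaps in a centered moving average as the low-pass filter because it is short and easy to check. The window size and the synthetic signal are assumptions.

```python
def moving_average(signal, half_width):
    """Centered moving average; the window shrinks near the ends of the record."""
    n = len(signal)
    smoothed = []
    for i in range(n):
        lo = max(0, i - half_width)
        hi = min(n, i + half_width + 1)
        window = signal[lo:hi]
        smoothed.append(sum(window) / len(window))
    return smoothed

def split_frequencies(signal, half_width):
    """Return (low, high) components such that low + high equals the signal."""
    low = moving_average(signal, half_width)
    high = [s - l for s, l in zip(signal, low)]
    return low, high

# Synthetic signal: slow linear trend plus a fast alternating component.
signal = [10.0 + 0.1 * t + (1.0 if t % 2 == 0 else -1.0) for t in range(20)]
low, high = split_frequencies(signal, half_width=2)

# A "super model" in the sense of the slides would sum the predictions of
# separate sub-models for low and high; here the components are summed directly.
reconstructed = [l + h for l, h in zip(low, high)]
print(max(abs(r - s) for r, s in zip(reconstructed, signal)))
```

Because the high-frequency part is defined as the residual of the low-pass output, the two components always add back to the original signal, so errors in the combined prediction come only from the sub-models.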
Florida Everglades Water Levels
            • In progress
            • Water management
            • Modeled Output - 260
              real-time WL gages
            • Potential Inputs
              – STATIC - 6 variables –
                x,y + 4 vegetation
              – WL TIME SERIES - 260
                real-time WL gages
                 • autoregressive
