Python the Lingua Franca of SeqFEWS
2019 AUS FEWS Users Conference
Lindsay Millard, Hydrologist
This presentation started in 2016:
Coupling external modules into Delft-FEWS
• 2016 FEWS User Day – Antonio Morales (ADASA)
• Outlined Java, .NET, R methods to integrate data in FEWS
– Used Java to Link MIKE hydrological model / Irrigation into
FEWS.
– Used Java and R to link MIKE 11 model to FEWS
• Conclusion:
– Take the best of Delft-FEWS: visualisation, modifiers, general
adapters but don’t be limited by it. Add what you need.
– R can do complex statistical analytics
Brisbane River Catchment Flood Study - 2015
7 TB of data to harvest
What is a Dutch word to describe FEWS?
“Datafabriek”
“Data-centric Workflow Manager”
Why not use it for non-forecasting purposes?
It would:
– assist with task efficiency and maximising skillsets
– avoid buying and learning new software
– enable efficient data management of AR&R 2016
– centralise and allow auditing of tailored design engineering workflows
Refresher from 2018
Itinerary for “terra Python”
• Presentation will walk through how to build an adapter using Python
• Simple steps, a quick-start guide to linking models
• This allows FEWS to become the wrapper for all the code and data.
• Proof of concept before getting official support
• You can access an array of powerful Monte Carlo libraries to drive your
model.
Orientation
• Roll your own adapter/plugin from boilerplate code
• What do you need to do
• Linking models to FEWS: it can do a lot already, but
can do far more with Python, R,
JavaScript, etc.
• Allows use of Monte Carlo frameworks for probabilistic
solutions / Bayesian inference
• Rasmus Bååth, YouTube / sumsar.net
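The Monte Carlo idea can be sketched with nothing but the standard library. This is a toy illustration, not the SeqFEWS workflow: `toy_peak_flow`, its runoff coefficient and the uniform loss distribution are hypothetical stand-ins for a real model run driven through an adapter.

```python
import random


def toy_peak_flow(rain_mm, initial_loss_mm, runoff_coeff=0.6):
    """Hypothetical stand-in for a model run: a peak-flow index from rainfall."""
    excess = max(rain_mm - initial_loss_mm, 0.0)
    return runoff_coeff * excess


def monte_carlo_peaks(rain_mm, n_runs=1000, seed=42):
    """Sample an uncertain initial loss and push each sample through the model."""
    rng = random.Random(seed)  # fixed seed so the ensemble is repeatable
    peaks = []
    for _ in range(n_runs):
        # Assumed loss distribution: uniform between 0 and 40 mm.
        loss = rng.uniform(0.0, 40.0)
        peaks.append(toy_peak_flow(rain_mm, loss))
    return peaks


def summarise(peaks):
    """Pick approximate 10th, 50th and 90th percentiles from the sorted ensemble."""
    ordered = sorted(peaks)

    def pick(fraction):
        return ordered[int(fraction * (len(ordered) - 1))]

    return {"p10": pick(0.10), "p50": pick(0.50), "p90": pick(0.90)}


if __name__ == "__main__":
    print(summarise(monte_carlo_peaks(rain_mm=100.0)))
```

In an adapter setting, the call to `toy_peak_flow` would be replaced by writing the sampled inputs, running the model, and reading its results back.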
“terra Python” Lonely Planet Guidebook
“Datafabriek” POPULATION: YOU
Includes ››
What to Pack …. 1
Activities ……… 2
Sights ………… 3
Phrase book ….. 4
Best Places to Visit
›› Github
›› Delft FEWS Wiki
›› HydPy
›› HKVfewspy
›› Slideshare
Best Places to Stay
›› XML Spy
›› Visual Studio Code
›› Anaconda Distribution
›› R Studio
›› Python 3.6 / Pandas
›› Jupyter Notebook
Why Go?
Delft FEWS is one of the world’s most flexible data management
frameworks. Its architecture allows for the seamless integration of numerous
modelling dialects. FEWS’ epic flexibility is awe-inspiring and there is still
much more substance here than in other dynastic capitals. You just need
a bit of patient exploration to tap into its full potential.
The city’s denizens chat in extensible markup language – the gold standard
of Markup – and marvel at their good fortune for occupying the centre of the
flexible languages. Delft-FEWS dispenses with the persistent pace of
procurement and locals instead find time to develop and build a city that truly
reflects their needs.
When do you go?
As soon as your model is packed and ready to be implemented. The model
environment requires a pre-adapter to translate XML to the local language,
model runtime execution, and then a post-adapter to translate from the local
language back to XML.
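The pre-adapter / post-adapter round trip can be sketched as follows. This is a minimal illustration under stated assumptions: the sample file is hand-written, `demo_location` and `Q.obs` are made-up identifiers, the CSV layout is hypothetical, and the header written by `post_adapter` omits fields that a schema-valid PI-XML file needs (time step, start and end dates, and so on).

```python
import xml.etree.ElementTree as ET

PI_NS = "http://www.wldelft.nl/fews/PI"  # namespace used by FEWS PI-XML files

# Minimal hand-written sample; a real FEWS export carries more header fields.
SAMPLE_PI_XML = f"""<TimeSeries xmlns="{PI_NS}" version="1.2">
  <series>
    <header>
      <type>instantaneous</type>
      <locationId>demo_location</locationId>
      <parameterId>Q.obs</parameterId>
      <missVal>-999.0</missVal>
    </header>
    <event date="2019-01-01" time="00:00:00" value="1.5" flag="0"/>
    <event date="2019-01-01" time="01:00:00" value="-999.0" flag="0"/>
    <event date="2019-01-01" time="02:00:00" value="2.5" flag="0"/>
  </series>
</TimeSeries>
"""


def pre_adapter(pi_xml_text):
    """Translate PI-XML into plain CSV text that a model could read."""
    ns = {"pi": PI_NS}
    root = ET.fromstring(pi_xml_text)
    lines = ["location,parameter,datetime,value"]
    for series in root.findall("pi:series", ns):
        header = series.find("pi:header", ns)
        location = header.findtext("pi:locationId", namespaces=ns)
        parameter = header.findtext("pi:parameterId", namespaces=ns)
        missing = float(header.findtext("pi:missVal", default="-999.0", namespaces=ns))
        for event in series.findall("pi:event", ns):
            value = float(event.get("value"))
            if value == missing:
                continue  # skip missing values rather than pass them to the model
            stamp = f"{event.get('date')}T{event.get('time')}"
            lines.append(f"{location},{parameter},{stamp},{value}")
    return "\n".join(lines)


def post_adapter(csv_text, missing="-999.0"):
    """Translate model CSV output back into a single-series PI-XML string."""
    rows = [line.split(",") for line in csv_text.splitlines()[1:] if line]
    if not rows:
        raise ValueError("no data rows to convert")
    root = ET.Element("TimeSeries", {"xmlns": PI_NS, "version": "1.2"})
    series = ET.SubElement(root, "series")
    header = ET.SubElement(series, "header")
    ET.SubElement(header, "type").text = "instantaneous"
    ET.SubElement(header, "locationId").text = rows[0][0]
    ET.SubElement(header, "parameterId").text = rows[0][1]
    ET.SubElement(header, "missVal").text = missing
    for _, _, stamp, value in rows:
        date, time = stamp.split("T")
        ET.SubElement(series, "event",
                      {"date": date, "time": time, "value": value, "flag": "0"})
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    model_input = pre_adapter(SAMPLE_PI_XML)
    print(model_input)
    print(post_adapter(model_input))
```

The model run itself sits between the two functions; here the pre-adapter output is fed straight back in to show that the translation is reversible.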
Top Places to Visit
• Ingredients:
– Download Anaconda, which includes Jupyter Notebook
– Python dependencies: pandas
– FEWS Standalone – Configuration Course
– A model that can be driven from .bat files (most can be)
• RORB, URBS, TUFLOW, GoldSim, probabilistic modelling
– Set up a bin and model folder that unzips:
• from $Region Home$\config\ModuleDataSetFiles\_yourModel_.zip
• to $Region Home$\Modules\_yourModel\bin and model
– Use the bin folder to store the .bat file that drives the model and the .py file that
pre/post translates to and from FEWS.
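Packaging the bin and model folders into the module data set zip can be sketched like this. The folder layout follows the slide; the file names `run_model.bat`, `pre_adapter.py` and `model_input.csv` are hypothetical placeholders, and the demo writes to a temporary directory rather than to a real region home.

```python
import tempfile
import zipfile
from pathlib import Path


def build_module_dataset(source_dir, zip_path):
    """Zip the bin and model folders, keeping paths relative to source_dir."""
    source_dir = Path(source_dir)
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for folder in ("bin", "model"):
            for path in sorted((source_dir / folder).rglob("*")):
                if path.is_file():
                    archive.write(path, path.relative_to(source_dir).as_posix())
    return zip_path


def make_demo_tree(root):
    """Create a tiny placeholder bin/model tree (hypothetical file names)."""
    root = Path(root)
    (root / "bin").mkdir(parents=True)
    (root / "model").mkdir()
    (root / "bin" / "run_model.bat").write_text("rem placeholder\n")
    (root / "bin" / "pre_adapter.py").write_text("# placeholder\n")
    (root / "model" / "model_input.csv").write_text("placeholder\n")
    return root


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        work = make_demo_tree(Path(tmp) / "yourModel")
        target = build_module_dataset(work, Path(tmp) / "_yourModel_.zip")
        with zipfile.ZipFile(target) as archive:
            print(archive.namelist())
```

Keeping the archive paths relative means the zip unpacks into bin and model folders wherever it is extracted.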
[Diagram: SeqFEWS General Adapter data flow. Labels: General Adapter (.XML); runtime parameters (.XML, .NC, .CSV); Module\bin (.bat / .py) writing Model Input; Model .exe; Module\bin (.bat / .py) reading Model Results/Log; Module Data Set (ModuleDataset model.zip) unpacked to Modules\model with geometry, run files and grid; Input TS and Output TS (0d, 1d, 2d time series); ModuleConfig, RegionConfig, WorkflowFiles; transformation, export, import, plots.]
Getting Around
General Adapter XML
bin
model
GoldSim Import
GoldSim Export
Essential Phrases and Orientation
Config\ModuleDataSetFiles\_yourModel_.ZIP
A zipped folder with a bin folder that contains the .bat and .py scripts to drive the
model, and a model folder that contains all of the model components.
If Python is installed, you can call it from the General Adapter via a batch file:
››› c:\python36\python.exe "yourScript.py" > log.txt
Python will need other modules; most will be available, others will need to
be installed using:
››› c:\python36\scripts\pip.exe install _module_
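A skeleton for a script such as yourScript.py might look like the sketch below. It assumes the batch file redirects standard output to log.txt as shown on the slide, so everything is simply printed. The one-line Python "model" is a hypothetical stand-in for a real model executable.

```python
import subprocess
import sys


def run_model(command, workdir=None):
    """Run the model command, echo its output, and return its exit code."""
    result = subprocess.run(
        command,
        cwd=workdir,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # fold error messages into the same log
        universal_newlines=True,
    )
    print(result.stdout, end="")
    return result.returncode


def main(argv):
    """Run the command given on the command line, or a stand-in model."""
    # Hypothetical stand-in for a model .exe: a one-line Python "model".
    command = argv or [sys.executable, "-c", "print('model finished')"]
    print("Running:", " ".join(command))
    code = run_model(command)
    print("Exit code:", code)
    return code
```

In a real script the last line would be `sys.exit(main(sys.argv[1:]))`, so that the batch file, and anything that inspects its exit status, can see whether the model run failed.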
https://notebooks.azure.com/lamilla79/projects/dfuda-2019-adapter
Worked_Example_FEWS-GoldSim-Adapter.ipynb
Folder
Guided Walking Tour
Missing a Package?
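import subprocess
import sys

def install(package):
    # Install with pip under the same interpreter that runs this script;
    # check_call raises an error if the installation fails.
    subprocess.check_call([sys.executable, "-m", "pip", "install", package])

# If packages are missing, they can also be installed from the command line:
››› c:\Python36\Scripts\pip.exe install pandas
››› c:\ProgramData\Anaconda3\Scripts\conda.exe install pandas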
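A slightly more defensive variant checks whether a module is importable before calling pip at all. This is a sketch, with `ensure` as a hypothetical helper name. It handles top-level module names only, and the pip call still depends on network access and local ICT policy.

```python
import importlib
import importlib.util
import subprocess
import sys


def is_installed(module_name):
    """Return True if this interpreter can import the top-level module."""
    return importlib.util.find_spec(module_name) is not None


def ensure(module_name, package_name=None):
    """Install the package with pip only when the module is missing."""
    if is_installed(module_name):
        return True
    package = package_name or module_name
    # check_call raises CalledProcessError if pip fails (e.g. no network).
    subprocess.check_call([sys.executable, "-m", "pip", "install", package])
    importlib.invalidate_caches()  # let the import system see the new package
    return is_installed(module_name)
```

Calling `ensure("pandas")` at the top of an adapter script would then install pandas only on machines where it is absent.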
Web Links to Aid your visit
HYDPY
• https://github.com/hydpy-dev/hydpy
HKV-FEWS-PY
FEWS Wiki: Adding External Module
Key Takeaways
• Variety of different types of models available in FEWS
– All stitched together using the “adapter” concept
– Python is a toolbox to overcome non-standard issues
– Models can be mixed in a single workflow for auditing
• Increasing use of distributed & complex models in workflows
– Issues: speed, database sizes, complexity, …
Keep the design workflow organised and repeatable
FEWS building blocks:
• An application that manages model runs efficiently
• Management of model queue to:
– assist event and scenario runs
– maximise license/hardware utilisation
• Import/Export Timeseries:
– Point, Grid and export self-contained NC.
• Transformations of Timeseries:
– Grid-grid interpolation / 2D Lookup Tables
Tenets of modelling
All Models are wrong
Models are never finished, only abandoned
If you can’t make it perfect, make it adjustable

Editor's Notes

  • #2: What is this presentation about? In essence, it grew out of a presentation that I saw several years ago. Many presentations are about the ‘what’, but not many are about the ‘how’. I have had a go at solving a few problems that have cropped up. I am no expert configuration guru, nor am I formally trained in programming. I have just hammered away at problems and found solutions that I would like to share with the rest of you. Everything I am showing is available on GitHub, and I am more than happy to talk it through with anyone wanting more detail. If I can inspire one person in this audience to have a go at coding in either Python or Delft-FEWS, then it was a success. A few years ago I began to realise during FEWS training that sloppy spreadsheets and manual translation of model inputs/outputs were not the right way to attack all of this ensemble / Monte Carlo modelling. Last year I presented on Datafabriek.
  • #3: The inspiration for this presentation came from reflecting on why I started doing so much coding. I sat in the audience in 2016 and listened to Antonio discuss coupling FEWS to various models and then chaining them together in a workflow. At about the same time I was involved in the Brisbane River Flood Study, where URBS Monte Carlo models had been run. I had no idea how all of that was happening; it was an itch I had to scratch. It started with a simple Python script that linked URBS CSV output straight into TUFLOW boundaries. Then it became a for loop that scraped all of the ensemble results, and then it moved on to plotting. It was a rabbit hole that I disappeared down. I found it very satisfying and rewarding; with a little persistence I started figuring out how to do more and more.
  • #5: My colleagues Steve and Dave presented on the work they have done using FEWS with design hydrology. This work is also inspired by cracking the same problem: Australian Rainfall and Runoff. I am confident that Steve convinced many of you that using FEWS to aggregate data is possible. I’d like to think that his presentation bookends my presentation from last year on a possible vision for FEWS outside of just forecasting.
  • #6: So this presentation is the HOW. I want you to leave here today knowing that you have all the tools to get started on stitching a model into FEWS: have greater flexibility, stay on the bleeding edge, learn something. You might have simple models and spreadsheets and wonder why you should change; alternatively, you may be interested in expanding the scope of your system, and this will allow you to scale it up. The future is probabilistic models, and handling all of that data is going to need new tools. Try a new feature; calibrate and compare models.
  • #7: Known output: probabilistic determination of possible inputs. Hamiltonian Monte Carlo with PyMC3 or Stan; see sumsar.net and Rasmus Bååth on YouTube. PyMC3 strives to make Bayesian modeling as simple and painless as possible, allowing users to focus on their scientific problem rather than on the methods used to solve it. Here is a partial list of its features: modern methods for fitting Bayesian models, including MCMC and VI; a large suite of well-documented statistical distributions; Theano as the computational backend, allowing for fast expression evaluation, automatic gradient calculation, and GPU computing; built-in support for Gaussian process modeling; model summarization and plotting; model checking and convergence detection; extensibility, easily incorporating custom step methods and unusual probability distributions. Bayesian models can be embedded in larger programs, and results can be analyzed with the full power of Python.
  • #8: Highlights; Top Five Weblinks; Things to Take; Cuisine; Do’s and Don’ts; Resources; Essential Information; Exploring/Orientation; Price Guide; Phrase Book; Itineraries; Don’t Miss; Best Places to Visit; Guide Map; Walking Tour; Trivia
  • #11: I am going to gloss over the detail of the configuration course, but essentially the one file you need to know about is the General Adapter. It lives in the ModuleConfigFiles folder. In this file you can lay out your executables, in this case a batch file that calls the Python pre/post adapter or your model. Other housekeeping can also be accomplished in this XML file. The model and its binary are zipped up and live in the ModuleDataSetFiles folder. The zip contains all of the batch files and your running model files.
  • #14: Simple tasks can be accomplished using batch command syntax. For more complicated steps you can use PowerShell, R or Python. Anaconda takes care of a lot of the necessary steps. Python from
  • #16: One of the biggest frustrations with Python is that you have some code, but it won’t run because you get an error message about a missing library. Arrgh, what to do? It is quite simple: ICT willing, it should just be a matter of pulling the latest library from the repository.