MANAGING MODELS IN THE AGE OF OPEN DATA
Key topics to cover
● The use of spatial databases
● Underlying principles
● Model data – infrastructure networks + network options
● The 4S model
● Conclusions
Why does this matter?
● We are no longer the custodians of data – we are more like curators (collate and contextualise)
● Need to be able to incorporate updates
● Hierarchy of models – share across levels
● Community of modellers – share between modellers and platforms
● Importance of auditing, licensing and change management
● Less tedious, more fun!
GENERAL DATA MANAGEMENT
How Data is Stored – Traditional
● Developed without reference to recent developments in data management
● Stored in proprietary formats
● File-based data / consolidated data bank
● Data stored outside of the model (e.g. GIS) – often deeply nested folders of files on a network share
How Data is Stored – Improved
● Well-established approach – Relational Database Management System (RDBMS)
● Commercial and open source packages
● Large data sets, spatial
● Standardised access and analysis – SQL
● Integrate with other systems (GIS, stats, custom)
● Shared access, security and access control
Guiding principles
● Robustness principle – fuzzy, not brittle
  ○ Be conservative in what you do, be liberal in what you accept from others
● Separate data and processing
  ○ BAD: Excel, GOOD: Database queries
● Data normalisation – never repeat data
  ○ [Every] non-key [attribute] must provide a fact about the key, the whole key, and nothing but the key
● Unified data – everything goes in the database
● Metadata – data about data
  ○ Source, context, limitations
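A minimal sketch of these principles, using Python's built-in sqlite3 module rather than a full spatial RDBMS. The table and column names (source, road_class, link, default_lanes) are illustrative assumptions, not taken from the presentation. Defaults are stored once per road class, overrides are stored per link, and the processing is a query rather than a copy of the data.

```python
import sqlite3

# In-memory database; all table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE source (              -- metadata: source, context, limitations
    source_id   INTEGER PRIMARY KEY,
    name        TEXT NOT NULL,
    licence     TEXT,
    limitations TEXT
);
CREATE TABLE road_class (          -- defaults stored once, never repeated
    class_code    TEXT PRIMARY KEY,
    default_lanes INTEGER NOT NULL
);
CREATE TABLE link (
    link_id    INTEGER PRIMARY KEY,
    class_code TEXT NOT NULL REFERENCES road_class(class_code),
    lanes      INTEGER,            -- NULL means "use the class default"
    source_id  INTEGER NOT NULL REFERENCES source(source_id)
);
""")
conn.execute(
    "INSERT INTO source VALUES (1, 'OpenStreetMap', 'ODbL', 'some attributes missing')"
)
conn.executemany("INSERT INTO road_class VALUES (?, ?)",
                 [("arterial", 2), ("local", 1)])
conn.executemany("INSERT INTO link VALUES (?, ?, ?, ?)",
                 [(10, "arterial", None, 1),   # uses the default
                  (11, "arterial", 3, 1),      # explicit override
                  (12, "local", None, 1)])

# Processing lives in the query: resolve defaults and flag overrides.
rows = conn.execute("""
    SELECT l.link_id,
           COALESCE(l.lanes, c.default_lanes) AS lanes,
           l.lanes IS NOT NULL               AS overridden,
           s.name
    FROM link l
    JOIN road_class c USING (class_code)
    JOIN source s USING (source_id)
    ORDER BY l.link_id
""").fetchall()
```

Because the override column stays NULL unless someone sets it, a default can always be told apart from an override, and a change to a class default flows through to every link that relies on it.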
NETWORKS
Data Sources
● Government street centreline data
  ○ Freely available but limited
● Commercial products
  ○ Full routing information
  ○ Licence issues with derivative works
● Crowd sourced (OpenStreetMap)
  ○ Road networks, points of interest, commercial centres, schools, airports, parking and many other elements
  ○ Good quality, but some data is missing or inconsistent
  ○ Can fix errors/omissions
Network Geometry and Connectivity
Traditional approach:
● Series of links and nodes
● Anode, Bnode and a fixed number of attributes
● Semi-automatic/semi-manual process that creates a new stand-alone artifact
Weaknesses:
● Cannot distinguish defaults from overrides
● Breaks links to original data sources
● Hard to bring in updates to external sources
● Difficult to unify changes (node number conflicts)
The Goal
● No manual processing in network creation
● Repeatable, automatic process
● Share process, not all data
● Fast enough to run every time
● “Fuzzy” enough that it can still work even if there are changes to the underlying spatial data
● No node numbers!
Creating a network from GIS layers
● Two ways of viewing a network
  ○ Geographically (polylines in GIS)
  ○ Topologically (links + nodes in transport model)
● Conversion between these views
  ○ Network connectivity from spatial join
  ○ Cannot use exact coincident points: sensitive to minor changes
  ○ Not too fuzzy, or else incorrect topology
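The trade-off in the tolerance can be sketched in a few lines of Python. This is an illustration only: the function name build_topology and the sample coordinates are assumptions, and a real implementation would use a spatial join in the database with a spatial index rather than the linear scan shown here.

```python
from math import hypot

def build_topology(polylines, tolerance):
    """Derive (nodes, links) from polylines by merging endpoints
    that lie within `tolerance` of an existing node."""
    nodes = []   # representative (x, y) of each node
    links = []   # (a_node, b_node) index pair per polyline

    def node_for(pt):
        # Linear scan for the first node within tolerance (sketch only).
        for i, (x, y) in enumerate(nodes):
            if hypot(pt[0] - x, pt[1] - y) <= tolerance:
                return i
        nodes.append(pt)
        return len(nodes) - 1

    for line in polylines:
        links.append((node_for(line[0]), node_for(line[-1])))
    return nodes, links

# Hypothetical data: two segments of one road digitised slightly apart,
# plus a separate street that ends 30 units away from their junction.
polylines = [
    [(0.0, 0.0), (100.0, 0.0)],
    [(100.02, 0.01), (200.0, 0.0)],
    [(100.0, 30.0), (100.0, 130.0)],
]

_, exact = build_topology(polylines, tolerance=0.0)   # too brittle
_, fuzzy = build_topology(polylines, tolerance=1.0)   # joins the road
_, loose = build_topology(polylines, tolerance=50.0)  # too fuzzy
```

With zero tolerance the two road segments never connect; with a tolerance of 1.0 they share a node; with 50.0 the separate street is wrongly attached to the same node.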
Network Connection Points
Adding more detailed information
● Need to add detail to source data (lanes, capacities)
● Common approaches – both break the connection to the source data
  ○ Edit source data
  ○ Make model network and then edit
Our Approach – “Link Transitions”
● A point with a bearing (unit vector)
● Specifies the start or end of an attribute change
Directional Points – Link Transitions
● Bearing allows direction
● Better identification when position data is ambiguous – location + bearing eliminates most ambiguities
● Remaining problems can be identified and solved through more careful coding
● Works with named roads – consistent with the way that we think about roads
● Robust when network changes – coordinates, added or removed links
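As a rough illustration of why location plus bearing resolves most ambiguities, the sketch below matches a directional point to the nearest link whose direction agrees with the bearing. The function names, the straight-line link representation and the min_alignment threshold are all assumptions made for this example; the presentation does not describe the matching code.

```python
from math import hypot

def closest_point_on_segment(p, a, b):
    """Closest point to p on the segment a-b."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    length_sq = dx * dx + dy * dy
    if length_sq == 0:
        return a
    t = ((p[0] - a[0]) * dx + (p[1] - a[1]) * dy) / length_sq
    t = max(0.0, min(1.0, t))          # clamp to the segment
    return (a[0] + t * dx, a[1] + t * dy)

def match_transition(point, bearing, links, radius, min_alignment=0.7):
    """Return the id of the nearest link within `radius` whose direction
    agrees with the unit vector `bearing`, or None if there is none."""
    best_id, best_dist = None, None
    for link_id, (a, b) in links.items():
        length = hypot(b[0] - a[0], b[1] - a[1])
        if length == 0:
            continue
        cx, cy = closest_point_on_segment(point, a, b)
        dist = hypot(point[0] - cx, point[1] - cy)
        if dist > radius:
            continue
        # Cosine of the angle between link direction and bearing.
        alignment = ((b[0] - a[0]) * bearing[0]
                     + (b[1] - a[1]) * bearing[1]) / length
        if alignment < min_alignment:
            continue
        if best_dist is None or dist < best_dist:
            best_id, best_dist = link_id, dist
    return best_id

# Hypothetical crossroads: a two-way main road and a one-way cross street.
links = {
    "main_eb": ((-100.0, 0.0), (100.0, 0.0)),
    "main_wb": ((100.0, 0.0), (-100.0, 0.0)),
    "cross_nb": ((0.0, -100.0), (0.0, 100.0)),
}
point = (2.0, 1.0)   # close to all three links
```

A point near the intersection is within range of all three links, so position alone cannot pick one. Calling match_transition(point, (1.0, 0.0), links, radius=5.0) selects the eastbound link, while a bearing of (0.0, 1.0) selects the cross street.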
Link Transitions
Option Coding
● Option Links
● Option Connection Points
● Option Link Transitions
● Option Nodes
● All have OptionCode
● Scenarios have hierarchical sets of OptionCodes
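One way to read "hierarchical sets of OptionCodes" is that a scenario inherits the option codes of its parent scenario and adds its own. That reading, and every scenario, code and link name below, is an assumption made for illustration.

```python
# Hypothetical scenarios: each has an optional parent and its own codes.
scenarios = {
    "base": {"parent": None, "codes": set()},
    "2031_committed": {"parent": "base", "codes": {"BYPASS"}},
    "2031_with_rail": {"parent": "2031_committed", "codes": {"RAIL_EXT"}},
}

# Hypothetical option links, each tagged with an OptionCode.
option_links = [
    {"name": "Town bypass", "option_code": "BYPASS"},
    {"name": "Rail extension", "option_code": "RAIL_EXT"},
    {"name": "Second bridge", "option_code": "BRIDGE2"},
]

def active_codes(scenario, scenarios):
    """Collect the OptionCodes of a scenario and all of its ancestors."""
    codes = set()
    while scenario is not None:
        entry = scenarios[scenario]
        codes |= entry["codes"]
        scenario = entry["parent"]
    return codes

def links_for(scenario):
    """Names of the option links that are switched on in a scenario."""
    codes = active_codes(scenario, scenarios)
    return [link["name"] for link in option_links
            if link["option_code"] in codes]
```

The same filter would apply to option connection points, link transitions and nodes, since all of them carry an OptionCode.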
THE 4S MODEL
4S Structure
Stochastic:
● Monte Carlo methods to draw values from probability distributions
● Random variable parameters
● Number of slices can be varied
Segmented:
● Comprehensive breakdown of travel markets (20 private + 40 CV segments)
● Behavioural parameters vary by market segment
Slice:
● Takes slices of the travel market
  ○ across model area
  ○ through probability distributions
● Very efficient – detailed networks, large models
Simulation:
● Uses state-machine with very flexible transition rules
● Simulates all aspects of travel choice
● Complex public transport
● Multimodal freight
● Easily extended
Diagram labels: SIMULTANEOUS, EXPLICIT RANDOM UTILITY
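The stochastic and segmented ideas can be illustrated with a short Monte Carlo sketch. This is not the 4S implementation: the segment names, the choice of a lognormal distribution and the parameter values are all placeholders chosen for the example.

```python
import random

# Hypothetical market segments with lognormal parameters for one
# behavioural parameter (for example a value of time). Placeholder values.
SEGMENTS = {
    "commute": {"mu": 2.7, "sigma": 0.5},
    "business": {"mu": 3.4, "sigma": 0.4},
    "light_cv": {"mu": 3.6, "sigma": 0.3},
}

def draw_slices(n_slices, segments, seed=0):
    """Draw one parameter value per segment for each slice."""
    rng = random.Random(seed)   # seeded so that runs are repeatable
    slices = []
    for _ in range(n_slices):
        slices.append({
            name: rng.lognormvariate(p["mu"], p["sigma"])
            for name, p in segments.items()
        })
    return slices

slices = draw_slices(8, SEGMENTS)
```

Raising n_slices samples each distribution more finely at the cost of run time, which is the trade-off behind "number of slices can be varied".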
Key features of 4S model
● No matrices, no skims, no zones, no centroid connectors
● All travel is from node to node
● Models constructed with MUCH less manual effort
● Include all roads, all paths, timetabled transit
● Population and employment from multiple sources
● Multimodal with all modes assigned
● Continuous time and simultaneous choice
● Easily include any demand-based effects and capacity constraints (not just roads and transit)
● Much more detailed outputs (volumes by purpose)
Australia-wide model
● All roads except local streets
● Some timetabled PT
● Walk/cycle
● Commercial vehicles
● Runs in under 2 hrs (500k links, 400k nodes)
Detailed Australia model
● All roads
● Some timetabled PT
● Walk/cycle
● Commercial vehicles
● Runs in under 8 hrs (2m links (2-way), 1.5m nodes)
NSW
Central Sydney
ACT
Hobart
Orange, NSW
Great Britain
● Excluding residential streets
● 864k links
● 293,000 km
● 3:19 hrs
California
● All roads and paths
● 1.9m links
● 509,000 km (316,000 mi)
● 8:44 hrs

Editor's Notes

  • #7: Commercial and open source packages: Oracle, Microsoft SQL Server, IBM Db2, PostgreSQL and MySQL/MariaDB
  • #11: Include just the additional information and tools necessary for downstream users to generate their own networks. The extra information can be independently licensed, and downstream model users can obtain their own licence to the base data.
  • #15: Network Connection Points Algorithm
      For each connection point
          Find all lines within range (using manual tolerance)
          For each line in range
              Find the point of closest approach between the line and the connection point
              Identify if the point is an endpoint or an intermediate point
              If the intermediate point is close to the end of the line
                  Use the end point instead
              Add the point to a set of adjustment points
          If all adjustment points are end points
              Extend each of the adjustment lines to the connection point
          Else if only one adjustment point is an intermediate point
              Extend all of the end point adjustment lines to the intermediate point
          Else
              Extend all end points to the connection point
              Add a new mid point to all intermediate points to the connection point
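The inner loop of this algorithm (find the point of closest approach, then decide whether it counts as an end point or an intermediate point) can be sketched in Python as follows. The function name and the two tolerance parameters are assumptions; the note only mentions a manually set tolerance, and the line-extension steps are not shown.

```python
from math import hypot

def classify_adjustment(cp, line, search_tol, end_tol):
    """Classify how polyline `line` relates to connection point `cp`.

    Returns None if the line is out of range, otherwise a tuple
    ("end" or "intermediate", (x, y)) giving the adjustment point.
    """
    best = None   # (distance, closest point) over all segments
    for a, b in zip(line, line[1:]):
        dx, dy = b[0] - a[0], b[1] - a[1]
        seg_sq = dx * dx + dy * dy
        if seg_sq == 0:
            t = 0.0
        else:
            t = ((cp[0] - a[0]) * dx + (cp[1] - a[1]) * dy) / seg_sq
            t = max(0.0, min(1.0, t))   # clamp to the segment
        q = (a[0] + t * dx, a[1] + t * dy)
        d = hypot(cp[0] - q[0], cp[1] - q[1])
        if best is None or d < best[0]:
            best = (d, q)
    if best is None or best[0] > search_tol:
        return None
    q = best[1]
    # If the closest point is near either end, use the end point instead.
    for end in (line[0], line[-1]):
        if hypot(q[0] - end[0], q[1] - end[1]) <= end_tol:
            return ("end", end)
    return ("intermediate", q)

# Hypothetical connection point and three nearby polylines.
cp = (100.0, 0.0)
stub = [(0.0, 0.0), (99.5, 0.0)]             # stops just short of cp
through = [(100.4, -50.0), (100.4, 50.0)]    # passes alongside cp
far = [(0.0, 50.0), (50.0, 50.0)]            # well out of range

results = [classify_adjustment(cp, line, search_tol=2.0, end_tol=5.0)
           for line in (stub, through, far)]
```

Here the first line yields an end point, the second an intermediate point and the third is ignored, which corresponds to the "only one adjustment point is an intermediate point" branch of the algorithm.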
  • #18: However, one problem remains: since we do not have a clearly gazetted road, how can we identify the start and end of each road? The only real candidate is to use the road name to determine road identity. This is somewhat problematic, since some roads are not named, there is not always consistency on when names change, and the same name could be used on multiple roads.
    In order to make the process work, we first construct a unique road name identifier. This is prepared by identifying contiguous sections of identically named roads; by requiring them to be contiguous we avoid problems of similarly named roads and isolated sections of unnamed roads. Some flexibility in the contiguous test is desirable, though, since we have found that there are sometimes small sections of differently named roads, ramps and roundabouts that can break what should be a single road.
    To simplify coding, and eliminate extraneous end points, we have adopted a range of different Link Transition types:
      ○ Merge Start – notes that the attribute changes at the closest intersection back from this point, and applies to all sections of this road until the end (unless an End Transition is found)
      ○ Split Start – splits the road at the transition point, and applies the attribute change
      ○ Merge End – notes the end of an attribute change at the next intersection forward from this point
      ○ Split End – splits the road at the transition point, and ends the previous attribute change
      ○ Single Link – applies an attribute change only to the specified link (however that has been defined)
      ○ Single Block – applies the attribute change from the previous intersection through to the next intersection
    The link transitions are easily created in any GIS system.
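The unique road identifier can be sketched with a union-find over links that share both a name and a node. This is an illustrative simplification: the data layout and function name are assumptions, and the flexibility mentioned in the note for short differently named sections, ramps and roundabouts is not implemented.

```python
def road_ids(links):
    """Group contiguous, identically named links into roads.

    `links` maps link_id -> (name, a_node, b_node). Returns a dict
    mapping link_id -> road id (the id of a representative link).
    """
    parent = {lid: lid for lid in links}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    seen = {}   # (name, node) -> a link with that name touching that node
    for lid, (name, a, b) in links.items():
        if not name:
            continue   # unnamed links are never merged
        for node in (a, b):
            key = (name, node)
            if key in seen:
                parent[find(lid)] = find(seen[key])
            else:
                seen[key] = lid
    return {lid: find(lid) for lid in links}

# Hypothetical links: two contiguous "High St" sections, a separate
# "High St" elsewhere, an unnamed link and a side road.
links = {
    1: ("High St", "A", "B"),
    2: ("High St", "B", "C"),
    3: ("High St", "X", "Y"),
    4: ("", "C", "D"),
    5: ("Mill Rd", "B", "E"),
}
ids = road_ids(links)
```

Links 1 and 2 end up with the same road id because they share a name and node B, while the other "High St" (link 3) keeps its own id because it is not contiguous with them.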
  • #20:
      ○ OptionLinks – contains all the new links that are to be added in any scenario (includes link attributes)
      ○ OptionLinkConnectionPoints – shows where the new links should connect to the existing network
      ○ OptionNodes – identifies nodes that are added or changed in any scenario (includes node attributes)
      ○ OptionLinkTransitionPoints – shows the start and stop of any changes in link attributes (includes link attributes)
  • #21: There are complexities in updating a timetable to incorporate congestion, because any change in a stop-to-stop time will alter the whole timetable and is likely to have follow-on effects on other services that use the same vehicle. We have explored methods to address this problem, but they are beyond the scope of this paper.
  • #27: Can build from OpenStreetMap and GTFS
  • #30: We have focused on SEQ (South East Queensland), which combines 3 cities and also some rural areas