Data-Driven Discovery of
Dynamical Systems and
Governing Equations
SINDy Autoencoders: A Hybrid Approach
Champion, K., Lusch, B., Kutz, J. N., & Brunton, S. L. (2019). Data-driven discovery of coordinates and governing equations. Proceedings of the National Academy of Sciences, 116(45), 22445-22451.
References
https://www.eigensteve.com/
https://www.youtube.com/watch?v=NLFboNNKCME
Personal Motivation – 3 levels of ML usage
Decide Interpret Reveal
Personal Motivation – 3 levels of Murder Mystery Resolution
Who? How? Why?
Agenda
• Ambition
• Key Assumptions
• Example: Lotka-Volterra (Predator-Prey) Equations
• Generalized SINDy
• Transforming Co-ordinates
• SINDy Autoencoder
• Applications & Challenges
• Future Extension
Ambition
Given the current state of the system, we want to learn the
governing equations that can estimate the system's rate of change.
Ambition of Sparse Identification of Nonlinear
Dynamical Systems (SINDy)
Given the current state of the system, we want to learn the sparse
governing equations that accurately estimate the system's rate of
change.
\frac{dx(t)}{dt} = f(x(t))
Ambition of SINDy
Given the current state of the system, we want to learn the sparse
governing equations that accurately estimate the system's rate of
change.
\frac{dx(t)}{dt} = f(x(t))
Turn high-dimensional scientific data into parsimonious dynamical
models
Key Assumptions
• Causality: The Current State Encapsulates Future Behavior
  o Knowing the current state x(t) allows us to predict how the system will evolve in the next instant.
• Markov Property: Future Depends Only on Present
Key Assumptions
SINDy models the dynamics of a system using only the current state x(t):

\frac{dx(t)}{dt} = f(x(t))

Simplicity | Limitation
Mathematical Tractability | Non-Markovian Systems
Applicability to a Wide Range of Systems | Incomplete State Representation
Computational Efficiency | Extensions Required for Complex Systems
Example: Lotka-Volterra (Predator-Prey)
Equations
\frac{dx}{dt} = \alpha x - \beta x y
\frac{dy}{dt} = \delta x y - \gamma y
Where,
𝑥: Prey population
𝑦: Predator population
𝛼, 𝛽, 𝛿, 𝛾: Positive constants representing interaction rates
Generating data
• Initial Conditions
Generating data
• Simulated data
X(t)
Generating data
• Simulated data
X(t) and its time derivative \dot{X}
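To make the data-generation step concrete, here is a minimal sketch, assuming SciPy, of simulating the Lotka-Volterra system and collecting both X(t) and its derivative. The parameter values, initial condition, and time grid are illustrative assumptions, not the ones used in the slides.

```python
# Simulate the Lotka-Volterra system and collect state snapshots X and
# their derivatives X_dot (illustrative parameters and time grid).
import numpy as np
from scipy.integrate import solve_ivp

alpha, beta, delta, gamma = 1.0, 0.1, 0.075, 1.5   # assumed interaction rates

def lotka_volterra(t, state):
    x, y = state
    return [alpha * x - beta * x * y, delta * x * y - gamma * y]

t_eval = np.linspace(0.0, 30.0, 3000)
sol = solve_ivp(lotka_volterra, (0.0, 30.0), [10.0, 5.0], t_eval=t_eval)

X = sol.y.T                                            # snapshots, shape (m, 2)
X_dot = np.array([lotka_volterra(0.0, s) for s in X])  # derivative at each snapshot
```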
Learning the governing equations
X(t), \dot{X}
Learn a governing set of equations so that
f(X(t)) = \dot{X}
Learning the governing equations
X(t), \dot{X}
Library Functions Θ(𝑋)
Learning the governing equations
X(t), \dot{X}
Library Functions \Theta(X)

\frac{dx}{dt} = \alpha x - \beta x y
\frac{dy}{dt} = \delta x y - \gamma y

\Xi =
\begin{bmatrix}
0 & 0 \\
0 & -\gamma \\
\alpha & 0 \\
0 & 0 \\
-\beta & \delta
\end{bmatrix}
(rows correspond to the candidate library terms; columns give the coefficients for \dot{x} and \dot{y})
Generalized SINDy
\dot{X} = \Theta(X)\,\Xi
Brunton, S. (2017, March). Discovering governing equations from data by sparse identification of nonlinear dynamics. In APS March Meeting Abstracts (Vol. 2017, pp. X49-004).
Generalized SINDy

\dot{X} = \Theta(X)\,\Xi

where

X =
\begin{bmatrix} x^T(t_1) \\ x^T(t_2) \\ \vdots \\ x^T(t_m) \end{bmatrix}
=
\begin{bmatrix}
x_1(t_1) & x_2(t_1) & \dots & x_n(t_1) \\
x_1(t_2) & x_2(t_2) & \dots & x_n(t_2) \\
\vdots & \vdots & \ddots & \vdots \\
x_1(t_m) & x_2(t_m) & \dots & x_n(t_m)
\end{bmatrix}

\Theta(X) =
\begin{bmatrix} 1 & X & X^{P_2} & X^{P_3} & \dots & \sin(X) & \cos(X) & \dots \end{bmatrix}
\qquad
\Xi = \begin{bmatrix} \xi_1 & \xi_2 & \dots & \xi_n \end{bmatrix}

Once \Xi has been determined,

\dot{x}_k = f_k(x) = \Theta(x^T)\,\xi_k
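As an illustration of the regression \dot{X} = \Theta(X)\,\Xi, the following sketch builds a small polynomial library and applies sequentially thresholded least squares, one common SINDy solver. The library terms and threshold are assumptions, not the exact setup used in the slides.

```python
# Build a candidate library Theta(X) and solve X_dot = Theta(X) Xi with
# sequentially thresholded least squares (STLSQ).
import numpy as np

def build_library(X):
    x, y = X[:, 0], X[:, 1]
    # candidate functions: 1, x, y, x^2, x*y, y^2
    return np.column_stack([np.ones(len(X)), x, y, x**2, x * y, y**2])

def stlsq(Theta, X_dot, threshold=0.05, n_iter=10):
    Xi = np.linalg.lstsq(Theta, X_dot, rcond=None)[0]
    for _ in range(n_iter):
        small = np.abs(Xi) < threshold            # prune small coefficients
        Xi[small] = 0.0
        for k in range(X_dot.shape[1]):           # refit each column on the survivors
            big = ~small[:, k]
            if big.any():
                Xi[big, k] = np.linalg.lstsq(Theta[:, big], X_dot[:, k], rcond=None)[0]
    return Xi

# Using X, X_dot from the simulation sketch above:
# Xi = stlsq(build_library(X), X_dot)   # nonzero rows reveal alpha, -beta, delta, -gamma
```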
Coordinates
… Choosing the right coordinates to simplify dynamics has always been important …
Brunton, S. (2017, March). Discovering governing equations from data by sparse identification of nonlinear dynamics. In APS March Meeting Abstracts (Vol. 2017, pp. X49-004).
Simplifying co-ordinate systems
Example | Complex Coordinate System | Simplified Coordinate System
Celestial Mechanics | Geocentric (Earth-centered) | Heliocentric (Sun-centered)
Fourier Transform (Heat Equation) | Time Domain | Frequency Domain (Fourier space)
Principal Component Analysis (PCA) | Original high-dimensional space | Principal Component Space
Polar Coordinates | Cartesian Coordinates (x, y) | Polar Coordinates (r, θ)
Transforming Co-ordinates
SVD (Shallow, Linear)
Deep Autoencoder (Deep, Nonlinear)
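For contrast, here is a minimal NumPy sketch of the shallow linear option (truncated SVD, i.e. PCA); the snapshot shapes and latent dimension are assumed for illustration, and a deep autoencoder would replace the linear maps with learned nonlinear ones.

```python
# Shallow linear coordinate transform via truncated SVD of a snapshot matrix.
import numpy as np

X = np.random.rand(1000, 128)            # assumed snapshot matrix: m samples, n measurements
X_centered = X - X.mean(axis=0)

U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
d = 3                                    # assumed latent dimension
Z_linear = X_centered @ Vt[:d].T         # shallow linear coordinates (PCA/SVD)
X_recon = Z_linear @ Vt[:d]              # linear reconstruction back to measurement space

# A deep autoencoder would instead learn nonlinear maps z = phi(x) and x_hat = psi(z).
```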
Losses
Loss Term I
Training the SINDy model
Ensure that we get a good representation of \dot{z}
• Capturing Core Dynamics
• Efficient Prediction
• Simplified Understanding
Loss Term I
• Minimize (\dot{z} - \text{SINDy-predicted } \dot{z})

\text{SINDy-predicted } \dot{z} = \Theta(z^T)\,\Xi = \Theta(\varphi(x)^T)\,\Xi

With z = \varphi(x), what is \dot{z}?

Utilize the Jacobian, where \Delta f = J\,\Delta x:

\dot{z} = J_\varphi\,\dot{x} = \nabla_x(z)\,\dot{x} = \nabla_x(\varphi(x))\,\dot{x}

This yields the first loss term:

\mathcal{L}_{dz/dt} = \left\| \nabla_x \varphi(x)\,\dot{x} - \Theta(\varphi(x)^T)\,\Xi \right\|_2^2
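A minimal PyTorch sketch (an assumption on the framework; the original implementation is in TensorFlow) of obtaining \dot{z} = \nabla_x \varphi(x)\,\dot{x} with a Jacobian-vector product, so it can be compared against the SINDy prediction. The encoder architecture and sizes are illustrative.

```python
# Compute z = phi(x) and z_dot = grad_x phi(x) @ x_dot via a Jacobian-vector product.
import torch
from torch.autograd.functional import jvp

encoder = torch.nn.Sequential(            # hypothetical encoder phi: R^128 -> R^3
    torch.nn.Linear(128, 64), torch.nn.Sigmoid(), torch.nn.Linear(64, 3)
)

x = torch.randn(128)                      # one measurement snapshot
x_dot = torch.randn(128)                  # its time derivative

z, z_dot = jvp(encoder, (x,), (x_dot,))   # z = phi(x), z_dot = J_phi(x) x_dot
```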
Loss Term II
• Predicting Time Evolution
• Capturing Data and Dynamics
• Improving Model Accuracy with Time Derivatives
Minimize (\dot{x} - \dot{x} \text{ from SINDy predictors})
Loss Term II
Calculating \dot{x} from the SINDy predictors:

\hat{x} = \psi(z)

\dot{x} = J_\psi\,\dot{z} = \nabla_z(\hat{x})\,\dot{z} = \nabla_z(\psi(z))\,\dot{z} = \nabla_z(\psi(\varphi(x)))\,\dot{z} = \nabla_z(\psi(\varphi(x)))\,(\Theta(z^T)\,\Xi)

This yields the second loss term:

\mathcal{L}_{dx/dt} = \left\| \dot{x} - \nabla_z \psi(\varphi(x))\,(\Theta(z^T)\,\Xi) \right\|_2^2
Loss Term III
• Recreating the input
\mathcal{L}_{recon} = \left\| x - \psi(\varphi(x)) \right\|_2^2
Total Loss
Overall, Loss = \mathcal{L}_{recon} + \lambda_1 \mathcal{L}_{dx/dt} + \lambda_2 \mathcal{L}_{dz/dt} + \lambda_3 \mathcal{L}_{reg}

\mathcal{L}_{recon} = \left\| x - \psi(\varphi(x)) \right\|_2^2
\mathcal{L}_{dx/dt} = \left\| \dot{x} - \nabla_z \psi(\varphi(x))\,(\Theta(z^T)\,\Xi) \right\|_2^2
\mathcal{L}_{dz/dt} = \left\| \nabla_x \varphi(x)\,\dot{x} - \Theta(\varphi(x)^T)\,\Xi \right\|_2^2
\mathcal{L}_{reg} = \left\| \Xi \right\|_1
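Putting the four terms together, here is a minimal PyTorch sketch of the total loss (again an assumption; the paper's code is TensorFlow). The network sizes, the latent library, and the weights \lambda_1, \lambda_2, \lambda_3 are illustrative assumptions rather than the paper's settings.

```python
# Assemble the SINDy autoencoder loss: reconstruction + dx/dt + dz/dt + L1 on Xi.
import torch
from torch.autograd.functional import jvp

n, d = 128, 3                                        # assumed input and latent dimensions
encoder = torch.nn.Sequential(torch.nn.Linear(n, 64), torch.nn.Sigmoid(), torch.nn.Linear(64, d))
decoder = torch.nn.Sequential(torch.nn.Linear(d, 64), torch.nn.Sigmoid(), torch.nn.Linear(64, n))

def theta(z):                                        # small illustrative library: 1, z, z^2
    return torch.cat([torch.ones(1), z, z ** 2])

Xi = torch.randn(2 * d + 1, d, requires_grad=True)   # SINDy coefficient matrix

def total_loss(x, x_dot, lam1=1e-4, lam2=1e-5, lam3=1e-5):
    # z = phi(x) and z_dot = grad_x phi(x) x_dot via a Jacobian-vector product
    z, z_dot = jvp(encoder, (x,), (x_dot,), create_graph=True)
    z_dot_sindy = theta(z) @ Xi                      # SINDy-predicted latent derivative
    # x_hat = psi(z) and the decoded derivative grad_z psi(z) (Theta(z) Xi)
    x_hat, x_dot_sindy = jvp(decoder, (z,), (z_dot_sindy,), create_graph=True)

    L_recon = torch.sum((x - x_hat) ** 2)            # ||x - psi(phi(x))||^2
    L_dxdt = torch.sum((x_dot - x_dot_sindy) ** 2)   # ||x_dot - grad_z psi (Theta Xi)||^2
    L_dzdt = torch.sum((z_dot - z_dot_sindy) ** 2)   # ||grad_x phi x_dot - Theta Xi||^2
    L_reg = torch.sum(torch.abs(Xi))                 # ||Xi||_1
    return L_recon + lam1 * L_dxdt + lam2 * L_dzdt + lam3 * L_reg

# loss = total_loss(torch.randn(n), torch.randn(n))
```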
Applications: Nonlinear Pendulum
Champion, K., Lusch, B., Kutz, J. N., & Brunton, S. L. (2019). Data-driven discovery of coordinates and governing equations. Proceedings of the National Academy of Sciences, 116(45),
22445-22451.
Applications: Lorenz System
Champion, K., Lusch, B., Kutz, J. N., & Brunton, S. L. (2019). Data-driven discovery of coordinates and governing equations. Proceedings of the National Academy of Sciences, 116(45),
22445-22451.
Applications: Reaction Diffusion System
Champion, K., Lusch, B., Kutz, J. N., & Brunton, S. L. (2019). Data-driven discovery of coordinates and governing equations. Proceedings of the National Academy of Sciences, 116(45),
22445-22451.
Challenges
• Requires clean, noise-free measurement data
• Limited interpretability of deep learning models
• Coordinate transformation limitations
• Need for integration with domain knowledge
Future Extension
Chebyshev Polynomials for System Identification
T_0(x) = 1, \quad T_1(x) = x
T_{n+1}(x) = 2x\,T_n(x) - T_{n-1}(x), \quad x \in [-1, 1]
y(x) = \alpha_0 T_0(x) + \alpha_1 T_1(x) + \alpha_2 T_2(x) + \dots + \alpha_n T_n(x)
Why Chebyshev Polynomials?
• Orthogonality
• Numerical Stability
• Efficient Approximation
Application to System Identification:
• Use high-order Chebyshev polynomials to create candidate basis functions.
• Each candidate function becomes a weighted sum of these polynomials, as in y(x) above (see the sketch below).
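A minimal sketch of the recurrence and a one-dimensional least-squares fit of y(x); the target function and polynomial degree are assumed for illustration.

```python
# Build Chebyshev basis columns T_0(x)..T_degree(x) by the recurrence and fit
# y(x) = sum_n alpha_n T_n(x) by least squares.
import numpy as np

def chebyshev_basis(x, degree):
    """Return [T_0(x), ..., T_degree(x)] as columns, using the recurrence."""
    T = [np.ones_like(x), x]
    for n in range(1, degree):
        T.append(2.0 * x * T[n] - T[n - 1])
    return np.column_stack(T[: degree + 1])

x = np.linspace(-1.0, 1.0, 200)
y = np.sin(3.0 * x) + 0.5 * x ** 2                 # assumed target to approximate

A = chebyshev_basis(x, degree=8)
alpha = np.linalg.lstsq(A, y, rcond=None)[0]        # coefficients alpha_0..alpha_8
```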
Extending Chebyshev Polynomials to Multiple
Features
• Generalizing to Multiple Features
• For two features x_1 and x_2:

y(x_1, x_2) = \sum_{i=0}^{n} \sum_{j=0}^{n} \alpha_{i,j}\, T_i(x_1)\, T_j(x_2)

• Including Interactions Between Features via Tensor Products
• For m features x_1, x_2, \dots, x_m:

y(x_1, x_2, \dots, x_m) = \sum_{i,j,\dots,k} \alpha_{i,j,\dots,k}\, T_i(x_1)\, T_j(x_2) \cdots T_k(x_m)
Optimizing and Sparsifying with Cross-
Feature Terms
• Learn the coefficients \alpha_{i,j,\dots,k} that best fit the system while promoting sparsity
• Optimization problem:

\min_\alpha \left\| y - \sum_{i,j,\dots,k} \alpha_{i,j,\dots,k}\, T_i(x_1)\, T_j(x_2) \cdots T_k(x_m) \right\|^2 + \lambda \left\| \alpha \right\|_1
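A minimal sketch of this optimization, assuming scikit-learn's Lasso for the \ell_1 penalty; the degrees, synthetic data, and regularization strength are illustrative assumptions.

```python
# Build tensor-product Chebyshev features T_i(x1) T_j(x2) ... and fit a sparse
# coefficient vector alpha with an l1-regularised least-squares (Lasso) solve.
import itertools
import numpy as np
from sklearn.linear_model import Lasso

def cheb(x, n):                          # T_n(x) by the recurrence
    t0, t1 = np.ones_like(x), x
    for _ in range(n):
        t0, t1 = t1, 2.0 * x * t1 - t0
    return t0

def tensor_features(X, degree):
    """Columns T_i(x1) * T_j(x2) * ... for every multi-index (i, j, ..., k)."""
    m = X.shape[1]
    cols, index_sets = [], list(itertools.product(range(degree + 1), repeat=m))
    for idx in index_sets:
        col = np.ones(X.shape[0])
        for feat, deg in enumerate(idx):
            col = col * cheb(X[:, feat], deg)
        cols.append(col)
    return np.column_stack(cols), index_sets

rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(500, 2))            # two assumed features in [-1, 1]
y = 1.5 * X[:, 0] - 0.8 * X[:, 0] * X[:, 1]           # assumed ground-truth relationship

Phi, index_sets = tensor_features(X, degree=3)
model = Lasso(alpha=1e-3, fit_intercept=False, max_iter=50_000).fit(Phi, y)
significant = [(idx, c) for idx, c in zip(index_sets, model.coef_) if abs(c) > 1e-3]
```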
Sparsity and Interpretability
• Pruning Insignificant Terms: Coefficients corresponding to unimportant
polynomials shrink to zero.
• Simplified Model: Results in a model that is both accurate and interpretable, using
only the most significant terms.
Synergy with Neural Networks
• Neural Network Enhancement
• Adaptive Coefficients
• Dynamic Modeling
• Model Architecture
• Input Layer: Receives features 𝑥1, 𝑥2, … , 𝑥𝑚
• Hidden Layers
• Output Layer: 𝛼𝑖,𝑗,…,𝑘
• Combined Model:

y(x_1, x_2, \dots, x_m) = \sum_{i,j,\dots,k} \alpha_{i,j,\dots,k}(x_1, x_2, \dots, x_m)\, T_i(x_1)\, T_j(x_2) \cdots T_k(x_m)
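A minimal PyTorch sketch of such a combined model, where a small network outputs state-dependent coefficients \alpha_{i,j,\dots,k}(x) that weight fixed tensor-product Chebyshev terms; the architecture and degree are assumptions.

```python
# A network predicts one adaptive coefficient per Chebyshev multi-index; the
# output is the coefficient-weighted sum of tensor-product terms.
import itertools
import torch

def cheb_t(x, n):                        # T_n(x) elementwise, by the recurrence
    t0, t1 = torch.ones_like(x), x
    for _ in range(n):
        t0, t1 = t1, 2.0 * x * t1 - t0
    return t0

class AdaptiveChebyshev(torch.nn.Module):
    def __init__(self, n_features, degree, hidden=32):
        super().__init__()
        self.index_sets = list(itertools.product(range(degree + 1), repeat=n_features))
        self.coef_net = torch.nn.Sequential(      # predicts one alpha per multi-index
            torch.nn.Linear(n_features, hidden), torch.nn.Tanh(),
            torch.nn.Linear(hidden, len(self.index_sets)),
        )

    def forward(self, x):                          # x: (batch, n_features)
        terms = torch.stack(
            [torch.stack([cheb_t(x[:, f], d) for f, d in enumerate(idx)]).prod(dim=0)
             for idx in self.index_sets], dim=1)   # (batch, n_terms)
        alpha = self.coef_net(x)                   # (batch, n_terms), state-dependent
        return (alpha * terms).sum(dim=1)          # y(x)

# model = AdaptiveChebyshev(n_features=2, degree=3)
# y_hat = model(torch.rand(16, 2) * 2 - 1)
```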
Remarks
Advantages
• Enhanced Expressiveness
• Improved Accuracy
• Interpretability
Practical considerations
• Computational Complexity
• Regularization Techniques
• Training the Neural Network
Example application: Neural Networks for
classification
Normally, the output of a neuron is \sum_{i=0}^{m} w_i x_i, where m is the number of features and w_i is the learnable parameter.

For Chebyshev, each w_i is instead

w_i = \sum_{d=0}^{k} c_{i,d}\, T_d(x_i)

where c_{i,d} is the learnable parameter and k is the order of the Chebyshev polynomial.

Thus, for Chebyshev, the output of a neuron is

\sum_{i=0}^{m} w_i x_i = \sum_{i=0}^{m} \sum_{d=0}^{k} c_{i,d}\, T_d(x_i)\, x_i
Example application: Neural Networks for classification
For k = 2 (i.e., d = 0, 1, 2), the Chebyshev output is

\sum_{i=0}^{m} \sum_{d=0}^{k} c_{i,d}\, T_d(x_i)\, x_i = \sum_{i=0}^{m} \left( c_{i,0} T_0(x_i) + c_{i,1} T_1(x_i) + c_{i,2} T_2(x_i) \right) x_i

Now, if c_{i,1} and c_{i,2} are forced to be 0, and T_0 = 1, the output becomes

\sum_{i=0}^{m} c_{i,0}\, x_i, \quad \text{which is the same as} \quad \sum_{i=0}^{m} w_i x_i

where each c_{i,0} term represents w_i.
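A minimal PyTorch sketch of the Chebyshev neuron described above; layer sizes, initialization, and the degree k are assumptions.

```python
# Each input weight w_i is replaced by sum_d c_{i,d} T_d(x_i), so the neuron
# output is sum_i (sum_d c_{i,d} T_d(x_i)) * x_i.
import torch

class ChebyshevNeuron(torch.nn.Module):
    def __init__(self, n_features, k=2):
        super().__init__()
        self.k = k
        self.c = torch.nn.Parameter(torch.randn(n_features, k + 1) * 0.1)  # c_{i,d}

    def chebyshev(self, x):                       # T_0..T_k of x, shape (batch, m, k+1)
        T = [torch.ones_like(x), x]
        for d in range(1, self.k):
            T.append(2.0 * x * T[d] - T[d - 1])
        return torch.stack(T[: self.k + 1], dim=-1)

    def forward(self, x):                          # x: (batch, n_features)
        w = (self.chebyshev(x) * self.c).sum(dim=-1)   # data-dependent weights w_i(x_i)
        return (w * x).sum(dim=1)                      # scalar output per sample

# With c[:, 1:] forced to zero and T_0 = 1, this reduces to a plain linear neuron.
```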
Results
File | Size | Columns | Number of Classes | MLP F1 Score | F1 Score by ChebyshevNetwork | Delta percentage
krkopt.csv 28056 6 18 0.509 0.632 24.165
contraceptive.csv 1473 9 3 0.529 0.582 10.019
led7.csv 3200 7 10 0.703 0.764 8.677
cmc.csv 1473 9 3 0.536 0.58 8.209
car.csv 1728 6 4 0.898 0.967 7.684
splice.csv 3188 60 3 0.851 0.899 5.64
wine_quality_red.csv 1599 11 6 0.559 0.59 5.546
wine_quality_white.csv 4898 11 7 0.517 0.542 4.836
letter.csv 20000 16 26 0.896 0.924 3.125
satimage.csv 6435 36 6 0.86 0.886 3.023
solar_flare_2.csv 1066 12 6 0.72 0.74 2.778
page_blocks.csv 5473 10 5 0.959 0.978 1.981
nursery.csv 12958 8 4 0.984 0.997 1.321
sleep.csv 105908 13 5 0.744 0.753 1.21
allhypo.csv 3770 29 3 0.938 0.948 1.066
allhyper.csv 3771 29 4 0.981 0.988 0.714
allrep.csv 3772 29 4 0.971 0.975 0.412
ann_thyroid.csv 7200 21 3 0.982 0.986 0.407
texture.csv 5500 40 11 0.986 0.99 0.406
segmentation.csv 2310 19 7 0.972 0.974 0.206
dna.csv 3186 180 3 0.938 0.939 0.107
pendigits.csv 10992 16 10 0.994 0.994 0
yeast.csv 1479 8 9 0.588 0.587 -0.17
allbp.csv 3772 29 3 0.977 0.973 -0.409
optdigits.csv 5620 64 10 0.976 0.972 -0.41
car_evaluation.csv 1728 21 4 0.991 0.986 -0.505
led24.csv 3200 24 10 0.742 0.735 -0.943
Data Extrapolation
• Initial Hypothesis Generation
• Physical Model Simulation
• Prediction and Uncertainty Estimation
• Data Integration and Model Update
• Active Learning: Data Acquisition
Thank You!