An Introduction to
Hamiltonian Neural Networks
Presented by Miles Cranmer, Princeton University
@MilesCranmer
(advised by Shirley Ho/David Spergel)
This is based on none of my own research.
The work is by:
Sam Greydanus, Misko Dzamba, and Jason Yosinski
(+ Tom Bertalan, Felix Dietrich, Igor Mesić, and Ioannis G. Kevrekidis, whose work was posted at a similar time)
Ordering:
1. Classical Mechanics Review
2. Neural Networks
3. Hamiltonian Neural Networks
4. Bonus: Neural ODEs
5. Code Demo
Forces
• Objects and fields by themselves induce forces on other objects
• The vector sum of all forces gives the net force
• Divide by the mass of the body to get its acceleration (see the short sketch after this slide)
• Common forces:
• Normal force (desk holding something)
• Friction
• Tension (string)
• Gravity
[1]
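As a quick illustration of the bullets above (a minimal sketch of my own, not from the slides), summing force vectors and dividing by mass:

```python
import numpy as np

# Hypothetical example: three forces acting on a 2 kg body (units: newtons).
gravity = np.array([0.0, -19.6])     # m * g, pointing down
normal  = np.array([0.0, 19.6])      # desk pushing back up
push    = np.array([5.0, 0.0])       # a sideways push

mass = 2.0                            # kg
net_force = gravity + normal + push   # vector sum of all forces
acceleration = net_force / mass       # Newton's second law: a = F_net / m
print(acceleration)                   # -> [2.5 0. ] m/s^2
```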
Lagrangian Mechanics
• For a coordinate system with generalized coordinates q,
• (Focus on object coordinates for today)
• Write down kinetic energy = T(q, q̇)
• Potential energy = V(q)
• Lagrangian is a function of coordinates and (usually) their first-order derivatives: L(q, q̇) = T - V
• Action is: S = ∫ L(q, q̇) dt
• Apply principle of stationary action: δS = 0
Lagrangian Mechanics 2
• By extremizing the action, we get the Euler-Lagrange equations: d/dt (∂L/∂q̇) = ∂L/∂q
• Example: falling ball: L = ½ m ẏ² - m g y, which gives m ÿ = -m g
• Numerically integrate these to get the dynamics of the system
Hamiltonian Mechanics
• Canonical momenta for a system: p = ∂L/∂q̇
• Legendre transformation of L is the Hamiltonian: H(q, p) = p q̇ - L
• This is usually the energy, conserved in a dynamical system.
• What path preserves H?
• Move perpendicular to its gradient!
• Called the symplectic gradient
• Falling ball: H = p²/(2m) + m g y
[2]
Hamiltonian Mechanics 2
• H-preserving path = symplectic gradient: dq/dt = ∂H/∂p, dp/dt = -∂H/∂q
• Also known as Hamilton’s equations!
• Can use these first-order, explicit ODEs to integrate physical dynamics (see the code sketch after this slide)
• Problems with L:
• Second-order, implicit ODEs
• L isn’t meaningful by itself
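To make Hamilton’s equations concrete, here is a minimal sketch (my own, not from the slides) of the falling-ball Hamiltonian H = p²/(2m) + m g y and its symplectic gradient:

```python
m, g = 1.0, 9.81

def hamiltonian(y, p):
    """Total energy of a ball of mass m at height y with momentum p."""
    return p**2 / (2 * m) + m * g * y

def symplectic_gradient(y, p):
    """Hamilton's equations: dy/dt = dH/dp, dp/dt = -dH/dy."""
    dH_dy = m * g           # partial of H with respect to y
    dH_dp = p / m           # partial of H with respect to p
    return dH_dp, -dH_dy    # (dy/dt, dp/dt): a first-order, explicit ODE

print(symplectic_gradient(y=10.0, p=0.0))  # -> (0.0, -9.81)
```

This is the first-order, explicit ODE that the integrators on the next slides will consume.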
Things to worry about with L, H
• Dissipation/friction
• Need to add a force term to the Euler-Lagrange equation
• Can also use a multiplicative factor
• Energy pools/boundaries
• Constraints
• E.g., normal forces
• Sol’n: Use better coordinates (sometimes tricky)
• Or, use a constraint function that equals 0
• (Lagrange multiplier method)
• *After reading the presentation – if you manage to think of a way
to add these techniques to a Hamiltonian NN, come talk to me!
Integrators
• Presented with an explicit differential equation, we can use several methods to numerically integrate it.
• Recall that: dq/dt = ∂H/∂p, dp/dt = -∂H/∂q
• This is an Euler integrator: q ← q + Δt (∂H/∂p), p ← p - Δt (∂H/∂q) (sketched in code below)
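A sketch of that Euler update, applied to the `symplectic_gradient` function defined in the earlier falling-ball sketch (an assumption carried over from that example):

```python
def euler_step(y, p, dt=0.01):
    """One explicit Euler step: z <- z + dt * f(z)."""
    dy_dt, dp_dt = symplectic_gradient(y, p)
    return y + dt * dy_dt, p + dt * dp_dt

# Integrate the falling ball for one second.
y, p = 10.0, 0.0
for _ in range(100):
    y, p = euler_step(y, p)
print(y, p)
```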
Accurate Integrators
• Advanced integrators take several intermediate steps to improve accuracy
• Runge-Kutta integrators target accuracy
• Can be very accurate, but do not preserve known invariants!
• Symplectic integrators target energy conservation
• Can preserve energy very well, but at lower positional accuracy!
• (All integrators are bad for long-term accuracy)
[3]
Integrator Examples
• Runge-Kutta 4th order
(most common)
• High accuracy, low cost
• Does not necessarily
preserve energy
[3]
[3]
• Symplectic 4th order (Yoshida)
• These conserve energy extremely well: errors stay bounded, with no long-term drift!
• Do drift (update x) and kick (update p) steps separately (see the leapfrog sketch after this slide)
• (c, d) are ugly constants, some negative, which sum to 1
[4]
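For comparison, a sketch of the simplest symplectic scheme, second-order kick-drift-kick leapfrog (reference [4]); the 4th-order Yoshida method composes several such drift/kick stages using the (c, d) coefficients mentioned above. This reuses the hypothetical `symplectic_gradient` function from the falling-ball sketch.

```python
def leapfrog_step(y, p, dt=0.01):
    """Kick-drift-kick leapfrog: alternates momentum and position updates."""
    _, dp_dt = symplectic_gradient(y, p)
    p = p + 0.5 * dt * dp_dt             # half kick (update p)
    dy_dt, _ = symplectic_gradient(y, p)
    y = y + dt * dy_dt                   # full drift (update y)
    _, dp_dt = symplectic_gradient(y, p)
    p = p + 0.5 * dt * dp_dt             # half kick (update p)
    return y, p
```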
Pivot to Machine Learning
• Recall (or not?): Machine Learning is parameter estimation where
the parameters lack explicit physical meaning!
• Many types of ML:
• Supervised (common):
• Regression
• Classification
• Unsupervised
• E.g., clustering, density estimation
• Semi-supervised – a mix
• Linear Regression – this counts as ML!
[5]
Neural Networks
• Repeat after me:
Neural Networks are piecewise Linear Regression!
• Mathematically (we’ll only talk Multi-Layer Perceptrons): h_{k+1} = ReLU(W_k h_k + b_k), with a final linear output layer
• (You do a linear regression -> zero the negatives -> repeat; see the sketch after this slide)
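A sketch of that recipe as a two-hidden-layer MLP in PyTorch (a minimal example of my own, not code from the talk):

```python
import torch
import torch.nn as nn

class MLP(nn.Module):
    def __init__(self, n_in=2, n_hidden=50, n_out=1):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_in, n_hidden),      # linear regression
            nn.ReLU(),                      # zero the negatives
            nn.Linear(n_hidden, n_hidden),  # repeat
            nn.ReLU(),
            nn.Linear(n_hidden, n_out),     # final linear layer
        )

    def forward(self, x):
        return self.net(x)

model = MLP()
print(model(torch.randn(4, 2)).shape)       # -> torch.Size([4, 1])
```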
Neural Networks 2
• Repeat after me:
Neural Networks are piecewise Linear Regression!
• 0-hidden layer Neural Network: linear regression!
• 1-hidden layer NN with ReLU: Piecewise
• Whatever combination of “neurons” are on = different “region” for linear
regression
• Up to 2^(layers*hidden size) different linear regression solutions
• Continuously connected
• Don’t expect good extrapolation! Only nearby interpolation
• The Neural Net parameters determine both the slopes and the region boundaries.
I don’t believe you!
• Randomly-initialized 2-hidden layer 50-node NN:
Why?
• ReLU on = linear regression
• ReLU off = 0
• Remaining nodes simplify to
linear regression!
[6]
Neural Network Aside
• Other activation functions, like tanh and softplus, smooth out this piecewise linearity
• Neural Networks are universal function approximators. In the
limit of infinitely wide layers, even with two hidden ones, they can
express any mapping.
• They happen to be efficient at doing this too!
• All Neural Network techniques are about getting them to cheat
less. They are very good at cheating.
• Data Augmentation (hugely important)
• Regularization
• Structure (Convolutional NN, Graph Net, etc)
Differentiability
• The derivative is well-defined: just a product of layer Jacobians (sparse matrices)!
• Interested in:
• Derivative w.r.t. the weights, used for optimization (SGD or Adam)
• Auto-diff frameworks like TensorFlow and PyTorch make this easy.
• Demo: https://playground.tensorflow.org
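A minimal autodiff sketch showing both kinds of derivative: with respect to the weights (what SGD/Adam consume) and with respect to the inputs (which Hamiltonian NNs will need later). It assumes the `MLP` class and `model` object from the previous sketch.

```python
x = torch.randn(8, 2, requires_grad=True)
y_target = torch.randn(8, 1)

loss = ((model(x) - y_target) ** 2).mean()
loss.backward()                                  # gradients w.r.t. the weights
print(model.net[0].weight.grad.shape)            # -> torch.Size([50, 2])

# Gradient of the network output w.r.t. its inputs:
dy_dx = torch.autograd.grad(model(x).sum(), x)[0]
print(dy_dx.shape)                               # -> torch.Size([8, 2])
```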
Neural Nets for Physical Dynamics
• Here we will focus on physical systems over time.
• Many other things like sequences can be reframed as dynamics
problems.
• We are interested in problems where we have:
• positions and momenta (or velocities), (qᵢ(t), pᵢ(t)), for i particles over time
• In addition to other fixed properties...
• How do we use Neural Nets to simulate systems?
Example - Pendulum
• How to learn to estimate the future position and velocity of a
pendulum?
• Neural Net: maps (state x ∈ ℝⁿ, fixed parameters c ∈ ℝˡ) to a state change Δx ∈ ℝⁿ
• n is the number of particles × dynamical parameters
• l is the number of fixed parameters
• Pendulum:
• n = 2 (theta, theta velocity)
• l = 2 (gravity, length of pendulum)
• Want to predict only the change in the state - an easier regression problem
• So, here we are learning a function that approximates a velocity update and a force law (see the sketch after this slide)
[7]
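A sketch of this baseline approach for the pendulum: a network mapping (state, fixed parameters) to the change in state over one timestep. The layer sizes, names, and training step are my own illustration, not the talk's code.

```python
import torch
import torch.nn as nn

# state = (theta, theta_dot); fixed = (g, length)
net = nn.Sequential(nn.Linear(4, 64), nn.Softplus(), nn.Linear(64, 2))
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

def train_step(state, fixed, delta_true):
    """state, fixed, delta_true: (N, 2) tensors; delta_true is the observed state change."""
    delta_pred = net(torch.cat([state, fixed], dim=1))
    loss = ((delta_pred - delta_true) ** 2).mean()
    opt.zero_grad(); loss.backward(); opt.step()
    return loss.item()
```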
Real World Applications (of NNs for
simulation)
• Neural Networks learn "effective" forces in simulations
• They only look at the most relevant degrees of freedom!
• Can be more accurate at reduced computational cost
• Some examples:
• Shirley Ho's U-Net can do cosmological simulations much faster and more
accurately than standard simulators
• Peter Battaglia's Interaction Network used in many applications
• Drug discovery/molecular+protein modelling – getting very popular
• E.g., Cecilia Clementi, Frank Noe, Mark Waller, many others
• DeepMind's AlphaFold Protein Folding algorithm - destroys baseline algorithms at
finding structure from genetic code
• See IPAM's recent workshop for good list!
• Some say intelligent reasoning is based on learning to simulate potential
outcomes => path to general intelligence?
Hamiltonian Neural Networks
• Learn a mapping from coordinates and momenta to a single number, H_θ(q, p)
• The derivatives of this can describe your dynamics by Hamilton's equations: dq/dt = ∂H_θ/∂p, dp/dt = -∂H_θ/∂q
• Comparing the true and predicted dynamical updates gives a minimization objective: L = ‖∂H_θ/∂p - dq/dt‖² + ‖∂H_θ/∂q + dp/dt‖² (see the sketch below)
(Sam’s blog)
(Sam’s blog)
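The idea above in code: the network outputs a single scalar H(q, p), we differentiate it with respect to its inputs, and we match the resulting symplectic gradient to the observed time derivatives. This is a minimal PyTorch sketch of my own for one degree of freedom; the authors' reference implementation is linked in the resources.

```python
import torch
import torch.nn as nn

# H_theta: maps (q, p) -> a single scalar.
H = nn.Sequential(nn.Linear(2, 200), nn.Softplus(), nn.Linear(200, 1))
opt = torch.optim.Adam(H.parameters(), lr=1e-3)

def hnn_loss(qp, dqp_dt_true):
    """qp: (N, 2) coordinates and momenta; dqp_dt_true: (N, 2) observed time derivatives."""
    qp = qp.detach().requires_grad_(True)
    dH = torch.autograd.grad(H(qp).sum(), qp, create_graph=True)[0]  # (dH/dq, dH/dp)
    dq_dt_pred = dH[:, 1]        # Hamilton:  dq/dt =  dH/dp
    dp_dt_pred = -dH[:, 0]       #            dp/dt = -dH/dq
    pred = torch.stack([dq_dt_pred, dp_dt_pred], dim=1)
    return ((pred - dqp_dt_true) ** 2).mean()

# One training step:
# loss = hnn_loss(qp_batch, dqp_dt_batch); opt.zero_grad(); loss.backward(); opt.step()
```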
Why?
• It works better; it’s more interpretable. Not only
do we have a simulator, we know the energy!
(Sam’s blog)
Why does it work?
• It uses symplectic gradients: by prescribing that we can only move
along the level set of H, it learns the proper H.
Start of training vs. final:
(Sam’s blog)
Graph Network extension: (Sanchez-Gonzalez et al)
Integrators
• So far we have only talked about Euler integrators. But since Hamilton's equations are just an ODE, we can use any integrator: RK4 and symplectic included.
• If H has learned the true energy, we can exactly preserve it with symplectic integrators.
• In practice, RK4 is still more accurate. Maybe some combination is best?
This model is less than 6 months old! We don't know what is best yet.
• Can train + eval with RK4 or Symplectic Methods!
• Do multiple queries and multiple derivatives of your network’s H
• This works very well in practice.
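A sketch of one RK4 step driven by the learned Hamiltonian (it assumes the `H` network from the previous sketch and that `torch` is already imported):

```python
def learned_dynamics(qp):
    """Symplectic gradient of the learned H at a single state qp of shape (2,)."""
    qp = qp.detach().requires_grad_(True)
    dH = torch.autograd.grad(H(qp.unsqueeze(0)).sum(), qp)[0]
    return torch.stack([dH[1], -dH[0]])      # (dq/dt, dp/dt)

def rk4_step(qp, dt=0.01):
    k1 = learned_dynamics(qp)
    k2 = learned_dynamics(qp + 0.5 * dt * k1)
    k3 = learned_dynamics(qp + 0.5 * dt * k2)
    k4 = learned_dynamics(qp + dt * k3)
    return qp + dt / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)
```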
I don’t know the canonical coordinates!
• Pair two Neural Networks:
• g, an autoencoder to latent variables
• H, a Hamiltonian that pretends those
latent variables are (q, p).
• Training this setup in combination
will learn the
canonical coords
+ the Hamiltonian!
(Sam’s blog)
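A rough sketch of that pairing (my own illustration of the idea, not the authors' code), reusing the `H` network from the earlier HNN sketch: an encoder maps raw observations to two latent variables treated as (q, p), a decoder reconstructs the observations, and the Hamiltonian loss is applied in the latent space.

```python
import torch
import torch.nn as nn

obs_dim, latent_dim = 28 * 28, 2   # e.g. pixel observations -> latent (q, p)
encoder = nn.Sequential(nn.Linear(obs_dim, 200), nn.Softplus(), nn.Linear(200, latent_dim))
decoder = nn.Sequential(nn.Linear(latent_dim, 200), nn.Softplus(), nn.Linear(200, obs_dim))

def total_loss(x_t, x_next, dt):
    """x_t, x_next: (N, obs_dim) observations one timestep apart."""
    qp_t, qp_next = encoder(x_t), encoder(x_next)
    recon = ((decoder(qp_t) - x_t) ** 2).mean()                        # autoencoder term
    dH = torch.autograd.grad(H(qp_t).sum(), qp_t, create_graph=True)[0]
    pred = torch.stack([dH[:, 1], -dH[:, 0]], dim=1)                   # Hamilton's equations in latent space
    dqp_dt = (qp_next - qp_t) / dt                                     # finite-difference time derivatives
    return recon + ((pred - dqp_dt) ** 2).mean()
```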
Tips
• Activations:
• Recall: Neural Networks are piecewise linear regression.
• The derivative of a ReLU network is piecewise constant, so differentiating H means we are literally learning a lookup table – not good!
• Use Softplus or Tanh to make H have a smoother derivative
• Use more hidden nodes than for regular NNs, as H needs to be very
smooth
• Stability:
• According to some (Stephan Hoyer), better to learn multiple timesteps at
once.
• Use RK4 integrators
Bonus: Neural ODEs
• Famous 2018 paper:
Neural Ordinary Differential
Equations.
• Hamiltonian Neural
Networks -ARE- a Neural
ODE.
• Paper connects ResNets
with Euler integrators
• Paper: “Why not just learn a
derivative and integrate it?”
• Smoother output!
(Chen et al)
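A sketch of the "learn a derivative and integrate it" idea using the torchdiffeq package released with the Neural ODE paper (the `odeint(func, y0, t)` call is the package's real interface; the tiny network is my own illustration):

```python
import torch
import torch.nn as nn
from torchdiffeq import odeint    # pip install torchdiffeq

class Dynamics(nn.Module):
    """Learns dz/dt = f_theta(z); odeint integrates it (and backpropagates through it)."""
    def __init__(self, dim=2):
        super().__init__()
        self.f = nn.Sequential(nn.Linear(dim, 64), nn.Tanh(), nn.Linear(64, dim))

    def forward(self, t, z):
        return self.f(z)

func = Dynamics()
z0 = torch.randn(1, 2)                  # initial state
t = torch.linspace(0.0, 1.0, 20)        # times at which to report the solution
trajectory = odeint(func, z0, t)        # shape (20, 1, 2)
```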
PyTorch Tutorial – Falling Ball
• Short: https://bit.ly/2JiTEJE
• (Copy to new notebook in your drive)
Figure + other references
1. http://ffden-2.phys.uaf.edu/211_fall2004.web.dir/Jeff_Levison/Freebody%20diagram.htm
2. https://physics.stackexchange.com/questions/384990/why-will-a-dropped-object-land-at-the-same-time-as-a-sideways-thrown-one
3. https://en.wikipedia.org/wiki/Runge%E2%80%93Kutta_methods
4. https://en.wikipedia.org/wiki/Leapfrog_integration
5. https://en.wikipedia.org/wiki/Linear_regression#/media/File:Linear_regression.svg
6. https://medium.com/@amarbudhiraja/https-medium-com-amarbudhiraja-learning-less-to-learn-better-dropout-in-deep-machine-learning-74334da4bfc5
7. https://medium.com/@kriswilliams/how-life-is-like-a-pendulum-8811c4177685
Other resources used:
1. https://arxiv.org/abs/1906.01563
2. https://arxiv.org/abs/1907.12715
3. https://arxiv.org/pdf/1909.12790.pdf
4. https://greydanus.github.io/2019/05/15/hamiltonian-nns/
5. https://arxiv.org/pdf/1806.07366.pdf