SPARSITY IN THE NEOCORTEX,
AND ITS IMPLICATIONS FOR CONTINUOUS LEARNING
CVPR WORKSHOP ON CONTINUAL LEARNING IN COMPUTER VISION
JUNE 14, 2020
Subutai Ahmad
Email: sahmad@numenta.com
Twitter: @SubutaiAhmad
1) Reverse engineer the neocortex
- biologically accurate theories
- open access neuroscience publications
2) Apply neocortical principles to AI
- improve current techniques
- move toward truly intelligent systems
Mission
Founded in 2005 by Jeff Hawkins and Donna Dubinsky
OUTLINE
1. Sparsity in the neocortex
• Sparse activations and connectivity
• Neuron model
• Learning rules
2. Sparse representations and catastrophic forgetting
• Stability
• Plasticity
3. Network model
• Unsupervised continuously learning system
Source: Prof. Hasan, Max-Planck-Institute for Research
“mostly missing”
sparse vector = vector with mostly zero elements
Most neuroscience papers describe three types of sparsity:
1) Population sparsity
How many neurons are active right now?
Estimate: roughly 0.5% to 2% of cells are active at a time (Attwell & Laughlin, 2001; Lennie, 2003).
2) Lifetime sparsity
How often does a given cell fire?
3) Connection sparsity
When a layer of cells projects to another layer, what percentage are connected?
Estimate: 1%-5% of possible neuron-to-neuron connections exist (Holmgren et al., 2003).
WHAT EXACTLY IS “SPARSITY”?
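To make the three definitions concrete, here is a minimal sketch (not from the talk; the activity and connectivity matrices are synthetic stand-ins) computing each measure from binary data:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: activity[t, i] = 1 if neuron i fired in time bin t;
# connections[i, j] = 1 if a synapse from neuron j onto neuron i exists.
n_neurons, n_steps = 1000, 500
activity = (rng.random((n_steps, n_neurons)) < 0.02).astype(np.int8)       # ~2% active
connections = (rng.random((n_neurons, n_neurons)) < 0.03).astype(np.int8)  # ~3% connected

# 1) Population sparsity: what fraction of neurons is active right now?
population_sparsity = activity.mean(axis=1)   # one value per time bin

# 2) Lifetime sparsity: how often does a given cell fire?
lifetime_sparsity = activity.mean(axis=0)     # one value per neuron

# 3) Connection sparsity: what fraction of possible connections exists?
connection_sparsity = connections.mean()

print(f"mean population sparsity: {population_sparsity.mean():.3f}")
print(f"mean lifetime sparsity:   {lifetime_sparsity.mean():.3f}")
print(f"connection sparsity:      {connection_sparsity:.3f}")
```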
NEURON MODEL
Point Neuron Model (“soma” and “axon” only) ✗ Not a neuron
Integrate-and-fire neuron: Lapicque, 1907
Perceptron: Rosenblatt, 1962
Deep learning: Rumelhart et al., 1986; LeCun et al., 2015
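For contrast with what follows, the point neuron is just one weighted sum over all inputs followed by a nonlinearity; a minimal sketch, with a ReLU chosen purely for illustration:

```python
import numpy as np

def point_neuron(x, w, bias=0.0):
    """Point neuron: a single weighted sum over all inputs, then one nonlinearity."""
    return max(0.0, float(np.dot(w, x) + bias))   # ReLU used only as an example

rng = np.random.default_rng(0)
x = rng.random(100)            # input vector
w = rng.normal(0, 0.1, 100)    # one weight per input, no dendritic structure
print(point_neuron(x, w))
```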
Source: Smirnakis Lab, Baylor College of Medicine
DENDRITES DETECT SPARSE PATTERNS
(Mel, 1992; Branco & Häusser, 2011; Schiller et al., 2000; Losonczy, 2006; Antic et al., 2010; Major et al., 2013; Spruston, 2008; Milojkovic et al., 2005, etc.)
Major, Larkum and Schiller 2013
Pyramidal neuron
3K to 10K synapses
Dendrites split into dozens of independent computational segments
These segments activate with a cluster of 10-20 active synapses
Neurons detect dozens of highly sparse patterns, in parallel
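A minimal sketch of that idea, with made-up sizes: a neuron holds dozens of segments, each a small random sample of the input space, and a segment detects a pattern when enough of its synapses (here 15 of 25) are active at once:

```python
import numpy as np

rng = np.random.default_rng(1)

N_INPUTS = 5000            # presynaptic cells this neuron can sample from
SYNAPSES_PER_SEGMENT = 25
SEGMENT_THRESHOLD = 15     # segment "fires" when this many of its synapses are active

# Dozens of independent segments, each a small random subset of the inputs.
segments = [rng.choice(N_INPUTS, SYNAPSES_PER_SEGMENT, replace=False) for _ in range(40)]

def active_segments(active_inputs, segments, threshold=SEGMENT_THRESHOLD):
    """Indices of segments whose overlap with the active inputs reaches the threshold."""
    active = set(active_inputs)
    return [k for k, syns in enumerate(segments)
            if sum(1 for s in syns if s in active) >= threshold]

# Build a sparse input (~2% of inputs active) that contains 20 of segment 0's synapses,
# so segment 0 should detect its pattern while the others stay silent.
background = rng.choice(N_INPUTS, 80, replace=False)
active_inputs = np.unique(np.concatenate([segments[0][:20], background]))
print("segments detecting a pattern:", active_segments(active_inputs, segments))
```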
Pyramidal neuron
Sparse feedforward patterns
Sparse local patterns
Sparse top-down patterns
Learning is localized to dendritic segments
“Branch-specific plasticity” (sketched in code below)
If the cell becomes active:
• If there was a dendritic spike, reinforce that segment
• If there was no dendritic spike, grow new connections by subsampling cells that were active in the past
If the cell is not active:
• If there was a dendritic spike, weaken those segments
(Gordon et al., 2006; Losonczy et al., 2008; Yang et al., 2014; Cichon & Gang, 2015;
El-Boustani et al., 2018; Weber et al., 2016; Sander et al., 2016; Holthoff et al., 2004)
NEURONS UNDERGO SPARSE LEARNING
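A rough code sketch of the branch-specific rule above (the sketch mentioned earlier). The thresholds, increments, and data layout are assumptions for illustration, not taken from any specific model:

```python
import random

SPIKE_THRESHOLD = 15    # active synapses needed for a "dendritic spike" (assumed)
REINFORCE = 0.1         # illustrative increments
PUNISH = 0.05
NEW_SYNAPSES = 20

def update_neuron(cell_active, segments, prev_active_cells):
    """segments: list of dicts {presynaptic cell id: synapse strength}.
       prev_active_cells: set of cells active on the previous time step."""
    spiking = [seg for seg in segments
               if sum(c in prev_active_cells for c in seg) >= SPIKE_THRESHOLD]

    if cell_active:
        if spiking:
            # Dendritic spike preceded activity: reinforce the responsible segment(s).
            for seg in spiking:
                for c in seg:
                    if c in prev_active_cells:
                        seg[c] = min(1.0, seg[c] + REINFORCE)
        else:
            # No dendritic spike: grow connections by subsampling recently active cells.
            sample = random.sample(sorted(prev_active_cells),
                                   min(NEW_SYNAPSES, len(prev_active_cells)))
            segments.append({c: 0.3 for c in sample})
    else:
        # Cell stayed silent despite a dendritic spike: weaken those segments.
        for seg in spiking:
            for c in seg:
                if c in prev_active_cells:
                    seg[c] = max(0.0, seg[c] - PUNISH)

# Toy usage: an empty neuron, a burst of previously active cells, and the cell firing.
segments = []
update_neuron(True, segments, prev_active_cells=set(range(100, 150)))
print(len(segments), "segment(s) after learning")
```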
“We observed substantial spine turnover, indicating that the architecture of the neuronal circuits in the
auditory cortex is dynamic (Fig. 1B). Indeed, 31% ± 1% (SEM) of the spines in a given imaging
session were not detected in the previous imaging session; and, similarly, 31 ± 1% (SEM) of the spines
identified in an imaging session were no longer found in the next imaging session.”
(Loewenstein et al., 2015)
Learning involves growing and removing synapses
• Structural plasticity: network structure is dynamically altered during learning
HIGHLY DYNAMIC LEARNING AND CONNECTIVITY
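A toy simulation of that turnover rate (parameters assumed): roughly 31% of existing synapses are removed between “sessions” and an equal number of new ones are grown elsewhere, so the synapse count stays roughly constant while the wiring keeps changing:

```python
import numpy as np

rng = np.random.default_rng(0)
TURNOVER = 0.31          # fraction of spines replaced per session (Loewenstein et al., 2015)
N_POSSIBLE = 10_000      # potential presynaptic partners (assumed)

synapses = set(rng.choice(N_POSSIBLE, 500, replace=False).tolist())

def one_session(synapses):
    """Remove ~31% of existing synapses and grow an equal number of new ones."""
    removed = {s for s in synapses if rng.random() < TURNOVER}
    survivors = synapses - removed
    candidates = np.array([c for c in range(N_POSSIBLE) if c not in synapses])
    grown = set(rng.choice(candidates, len(removed), replace=False).tolist())
    return survivors | grown

after = one_session(synapses)
print(f"kept {len(synapses & after)} of {len(synapses)} synapses, "
      f"grew {len(after - synapses)} new ones")
```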
OUTLINE
1. Sparsity in the neocortex
• Neural activations and connectivity are highly sparse
• Neurons detect dozens of independent sparse patterns
• Learning is sparse and incredibly dynamic
2. Sparse representations and catastrophic forgetting
• Stability
• Plasticity
3. Network model
• Unsupervised continuously learning system
Thousands of neurons send input to any single neuron.
On each neuron, 8-20 synapses on tiny segments of dendrites recognize patterns.
The connections are learned.
STABILITY OF SPARSE REPRESENTATIONS
Pyramidal neuron
3K to 10K synapses
Sparse vector matching
$x_i$ = connections on a dendritic segment
$x_j$ = input activity (over $n$ inputs)
$P(x_i \cdot x_j \ge \theta)$ = probability that the input matches the segment
We can get excellent robustness by reducing $\theta$, at the cost of increased “false positives” and interference.
We can compute the probability of a random vector $x_j$ matching a given $x_i$:
Numerator: volume around point (white)
Denominator: full volume of space (grey)
$$P(x_i \cdot x_j \ge \theta) \;=\; \frac{\sum_{b=\theta}^{|x_i|} \left|\Omega_n(x_i, b, |x_j|)\right|}{\binom{n}{|x_j|}}$$

where the number of vectors with $k$ active bits that overlap $x_i$ in exactly $b$ places is

$$\left|\Omega_n(x_i, b, k)\right| = \binom{|x_i|}{b}\binom{n - |x_i|}{k - b}$$
STABILITY OF SPARSE REPRESENTATIONS
(Ahmad & Scheinkman, 2019)
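The expression above is a hypergeometric tail and can be evaluated exactly with binomial coefficients. A short sketch (function name and parameter choices are mine), also scanning over $n$ to preview the claims on the next slide:

```python
from math import comb

def p_match(n, a, theta):
    """P(x_i . x_j >= theta) for two random binary vectors of dimension n,
       each with a active bits, using the formula above."""
    total = comb(n, a)                                # all placements of x_j's active bits
    matching = sum(comb(a, b) * comb(n - a, a - b)    # |Omega_n(x_i, b, a)|
                   for b in range(theta, a + 1))
    return matching / total

# Fixed sparse activity: the false-positive probability falls rapidly as n grows.
for n in (500, 1000, 2000, 4000):
    print(n, p_match(n, a=40, theta=12))

# Dense activity (a = n/2): the error does not become small.
print("dense:", p_match(1000, a=500, theta=256))
```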
1) False-positive error decreases exponentially with dimensionality when activity is sparse.
2) Error rates do not decrease when activity is dense (a = n/2).
3) Assumes a uniform random distribution of vectors.
Sparse binary vectors: probability of interference
Sparse scalar vectors: probability of interference
STABILITY OF SPARSE REPRESENTATIONS
(Ahmad & Scheinkman, 2019)
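A quick empirical spot check of the same quantities by sampling random vector pairs (feasible only because the thresholds here are chosen so the probabilities are not astronomically small; all sizes are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

def empirical_p_match(n, a, theta, trials=50_000):
    """Estimate P(x_i . x_j >= theta) for random binary vectors with a active bits each."""
    xi = np.zeros(n, dtype=bool)
    xi[rng.choice(n, a, replace=False)] = True
    hits = 0
    for _ in range(trials):
        xj = np.zeros(n, dtype=bool)
        xj[rng.choice(n, a, replace=False)] = True
        hits += int(np.count_nonzero(xi & xj) >= theta)
    return hits / trials

print("sparse (n=400, a=40, theta=8):   ", empirical_p_match(400, 40, 8))
print("dense  (n=400, a=200, theta=105):", empirical_p_match(400, 200, 105))
```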
Pyramidal neuron
Sparse feedforward patterns
Sparse top-down patterns
Sparse local patterns
STABILITY VS PLASTICITY
(Hawkins & Ahmad, 2016)
Model pyramidal neuron
Simple localized learning rules
When a cell becomes active:
1) If a segment detected a pattern, reinforce that segment
2) If no segment detected a pattern, grow new connections on a new dendritic segment
If the cell did not become active:
1) If a segment detected a pattern, weaken that segment
- Learning consists of growing new connections
- Neurons learn continuously, but since patterns are sparse and learning is sparse, new patterns don’t interfere with old ones
Sparse top-down context
Sparse local context
Sparse feedforward patterns
STABILITY VS PLASTICITY
OUTLINE
1. Sparsity in the neocortex
• Neural activations and connectivity are highly sparse
• Neurons detect dozens of independent sparse patterns
• Learning is sparse and incredibly dynamic
2. Sparse representations and catastrophic forgetting
• Sparse high dimensional representations are remarkably stable
• Local plasticity rules enable learning new patterns without interference
3. Network model
• Unsupervised continuously learning system
(Hawkins & Ahmad, 2016)
HTM SEQUENCE MEMORY
Model pyramidal neuron
Sparse top-down context
Sparse local context
Sparse feedforward patterns
1) Associates past activity as context for current activity
2) Automatically learns from prediction errors
3) Learns continuously without forgetting past patterns
4) Can learn complex, high-order Markov sequences
CONTINUOUS LEARNING AND FAULT TOLERANCE
Input: continuous stream of non-Markov sequences interspersed with random input
Task: correctly predict the next element (max accuracy is 50%)
XABCDE noise YABCFG noise YABCFG noise……
time
(Hawkins & Ahmad, 2016)
Changed sequences mid-stream
“killed” neurons
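A sketch of a stream like the one described (the alphabet and noise model are assumptions): fixed high-order sequences, each followed by a burst of random symbols, with the task of predicting the next element at every step:

```python
import random

random.seed(0)

SEQUENCES = ["XABCDE", "YABCFG"]      # after "ABC", the correct next symbol (D vs. F)
                                      # depends on the first symbol, several steps back
NOISE_ALPHABET = list("nopqrstuvw")   # filler symbols with no predictable structure

def make_stream(n_sequences, noise_len=5):
    """Interleave randomly chosen sequences with bursts of random symbols."""
    stream = []
    for _ in range(n_sequences):
        stream.extend(random.choice(SEQUENCES))
        stream.extend(random.choices(NOISE_ALPHABET, k=noise_len))
    return stream

# A predictor sees stream[:t] and must guess stream[t]; only elements inside the fixed
# sequences are predictable, which is why the achievable accuracy is capped well below 1.
print("".join(make_stream(3)))
```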
Recurrent neural network (ESN, LSTM) vs. HTM*
CONTINUOUS LEARNING WITH STREAMING DATA SOURCES
(Cui et al, Neural Computation, 2016)
[Figure: NYC taxi passenger counts in 30-minute windows over one week (Mon 2015-04-20 through Sun 2015-04-26), with prediction-error comparisons (NRMSE, MAPE, negative log-likelihood) for Shift, ARIMA, LSTM-1000, LSTM-3000, LSTM-6000, and TM]
NYC Taxi demand datastream
Source: http://www.nyc.gov/html/tlc/html/about/trip_record_data.shtml
[Figure: mean absolute percent error over time (April to May 2015) for LSTM-6000 vs. HTM after the dynamics of the pattern changed]
(Cui et al, Neural Computation, 2016)
ADAPTS QUICKLY TO CHANGING STATISTICS
ANOMALY DETECTION
Benchmark for anomaly detection in streaming applications
Detector Score
Perfect 100.0
HTM 70.1
CAD OSE† 69.9
nab-comportex† 64.6
KNN CAD† 58.0
Relative Entropy 54.6
Twitter ADVec v1.0.0 47.1
Windowed Gaussian 39.6
Etsy Skyline 35.7
Sliding Threshold 30.7
Bayesian Changepoint 17.7
EXPoSE 16.4
Random 11.0
• Real-world data (365,551 points, 58 data streams)
• Scoring encourages early detection
• Published, open resource
(Ahmad et al, 2017)
SUMMARY
1. Sparsity in the neocortex
• Neural activations and connectivity are highly sparse
• Neurons detect dozens of independent sparse patterns
• Learning is sparse and incredibly dynamic
2. Sparse representations and catastrophic forgetting
• Sparse high dimensional representations are remarkably stable
• Local plasticity rules enable learning new patterns without interference
3. Network model
• Biologically inspired unsupervised continuously learning system
• Inherently stable representations
• Thank you! Questions? sahmad@numenta.com
  • 23. SUMMARY 1. Sparsity in the neocortex • Neural activations and connectivity are highly sparse • Neurons detect dozens of independent sparse patterns • Learning is sparse and incredibly dynamic 2. Sparse representations and catastrophic forgetting • Sparse high dimensional representations are remarkably stable • Local plasticity rules enable learning new patterns without interference 3. Network model • Biologically inspired unsupervised continuously learning system • Inherently stable representations • Thank you! Questions? sahmad@numenta.com