November 21, 2014
Jeff Hawkins
jhawkins@Numenta.com
What the Brain Says About Machine Intelligence
The Birth of Programmable Computing
1940's: Many approaches
- Dedicated vs. universal
- Analog vs. digital
- Decimal vs. binary
- Wired vs. memory-based programming
- Serial vs. random access memory
1950's: One dominant paradigm
- Universal
- Digital
- Binary
- Memory-based programming
- Two-tier memory
Why Did One Paradigm Win?
- Network effects
Why Did This Paradigm Win?
- Most flexible
- Most scalable
The Birth of Machine Intelligence
2010's: Many approaches
- Specific vs. universal algorithms
- Mathematical vs. memory-based
- Batch vs. on-line learning
- Labeled vs. behavior-based learning
2020's: One dominant paradigm
- Universal algorithms
- Memory-based
- On-line learning
- Behavior-based learning
Why Will One Paradigm Win?
- Network effects
Why Will This Paradigm Win?
- Most flexible
- Most scalable
How Do We Know This is Going to Happen?
- The brain is the proof case
- We have made great progress
Numenta's Mission
1) Discover the operating principles of the neocortex.
2) Create machine intelligence technology based on neocortical principles.
Talk Topics
- Cortical facts
- Cortical theory
- Research roadmap
- Applications
- Thoughts on Machine Intelligence
What the Cortex Does
Learns a model of the world from changing patterns of sensory data.
The model generates
- predictions
- anomalies
- actions
Most sensory changes are due to your own movement.
The neocortex learns a sensory-motor model of the world.
(Figure: patterns of light, sound, and touch reach the cortex via the retina, cochlea, and somatic senses.)
Cortical Facts
Hierarchy
Cellular layers
Mini-columns
Neurons: 3-10K synapses
- 10% proximal
- 90% distal
Active dendrites
Learning = new synapses
Remarkably uniform
- anatomically
- functionally
Sheet of cells, ~2.5 mm thick, with cellular layers 2/3, 4, 5, 6.
Cortical Theory
(Same cortical facts as above: hierarchy, cellular layers, mini-columns, neurons with 3-10K synapses, active dendrites, learning = new synapses, remarkable uniformity.)
HTM: Hierarchical Temporal Memory
1) Hierarchy of identical regions
2) Each region learns sequences
3) Stability increases going up hierarchy if
input is predictable
4) Sequences unfold going down
Questions
- What does a region do?
- What do the cellular layers do?
- How do neurons implement this?
- How does this work in hierarchy?
Cellular Layers
- Layer 2/3: sequence memory: inference (high-order)
- Layer 4: sequence memory: inference (sensory-motor)
- Layer 5: sequence memory: motor
- Layer 6: sequence memory: attention
(Feedforward input enters from below; feedback descends from above.)
Each layer is a variation of a common sequence memory algorithm.
These are universal functions. They apply to:
- all cortical regions
- all sensory-motor modalities.
(Figure: sensor data and copies of motor commands enter from below; connections run to higher and lower regions and to sub-cortical motor centers.)
(Figure: layers 2/3, 4, 5, and 6, each labeled "Sequence memory:" with a question mark.)
How Does Sequence Memory Work?
HTM Temporal Memory
Learns sequences
Recognizes and recalls sequences
Predicts next inputs
- High capacity
- Distributed
- Local learning rules
- Fault tolerant
- No sensitive parameters
- Generalizes
HTM Temporal Memory
Not Just Another ANN
1) Cortical Anatomy
Mini-columns
Inhibitory cells
Cell connectivity patterns
2) Sparse Distributed Representations
3) Realistic Neurons
Active dendrites
Thousands of synapses
Learn via synapse formation
numenta.com/learn/
Research Roadmap
- High-order Inference: theory 98%, extensively tested, commercial
- Sensory-motor Inference: theory 80%, in development
- Motor Sequences: theory 50%
- Attention/Feedback: theory 30%
Streaming Data
Capabilities: Prediction, Anomaly detection, Classification
Applications: Predictive maintenance, Security, Natural Language Processing
Streaming Data Applications
Pipeline: Data stream → Encoder → SDR → HTM → Predictions, Anomalies, Classification
Encoders for: Numbers, Categories, Date, Time, GPS, Words
Applications: Servers, Biometrics, Medical, Vehicles, Industrial equipment, Social media, Comm. networks
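The encoder step in the pipeline above can be illustrated with a minimal scalar encoder. This is a hypothetical sketch for illustration only (the function name and parameters are assumptions, not NuPIC's actual API); the key property is that nearby values produce overlapping SDRs.

```python
def scalar_encoder(value, min_val=0.0, max_val=100.0, n=400, w=21):
    """Hypothetical minimal scalar encoder (illustrative only, not NuPIC's
    actual API): map a number to an SDR of n bits with w contiguous active
    bits. Nearby values share bits, so semantic similarity is preserved."""
    value = max(min(value, max_val), min_val)          # clip to range
    span = n - w                                       # possible start positions
    start = int(round((value - min_val) / (max_val - min_val) * span))
    return frozenset(range(start, start + w))

a = scalar_encoder(50.0)
b = scalar_encoder(52.0)   # close to a
c = scalar_encoder(90.0)   # far from a
assert len(a & b) > len(a & c)   # close values overlap more than distant ones
```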
Streaming Data Applications
Server metrics, human metrics, natural language, GPS data, EEG data, financial data
Anomaly Detection in Server Metrics (Grok for AWS)
Pipeline: Server metric → Encoder → SDR → HTM → Anomaly score
Mobile Dashboard
- Servers sorted by anomaly score
- Continuously updated
Web Dashboard
What Kind of Anomalies Can HTM Detect?
- Sudden changes
- Slow changes
- Subtle changes in regular data
- Changes in noisy data
What Kind of Anomalies Can HTM Detect?
- Changes that humans can't see (e.g., an engineer manually started a build on an automated build server)
Anomaly Detection in Human Metrics
Signals: keystrokes, file access, CPU usage, app access
(Detected anomaly in the figure: a large zip file was created.)
Anomaly Detection in Financial and Social Media Data
Signals: stock volume, social media
Berkeley Cognitive Technology Group
Classification of EEG Data
GPS Data: SmartHarbors
Natural Language
Document corpus (e.g. Wikipedia) → 100K "Word SDRs" (128 x 128 bits each)
Semantic arithmetic: Apple − Fruit = Computer
Nearest terms: Macintosh, Microsoft, Mac, Linux, Operating system, …
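The word-SDR arithmetic can be illustrated with toy bit assignments. The sets below are entirely hypothetical (real word SDRs have 16,384 bits); they only show how subtracting one SDR from another shifts its nearest neighbors.

```python
# Toy "word SDRs": each word is a set of active-bit indices. The specific
# bit assignments below are hypothetical, chosen only to illustrate the idea.
apple  = frozenset({1, 2, 3, 10, 11, 20, 21})  # shares bits with both groups
fruit  = frozenset({1, 2, 3, 4, 5})            # fruit-related bits
mac    = frozenset({10, 11, 20, 22})           # computer-related bits
banana = frozenset({1, 2, 4, 30})              # another fruit

# "Apple - Fruit": subtract the fruit-related bits from Apple's SDR.
residue = apple - fruit

def overlap(x, y):
    """Shared active bits = semantic similarity."""
    return len(x & y)

# What remains is closer to computer terms than to fruit terms.
assert overlap(residue, mac) > overlap(residue, banana)
```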
Training set
frog eats flies
cow eats grain
elephant eats leaves
goat eats grass
wolf eats rabbit
cat likes ball
elephant likes water
sheep eats grass
cat eats salmon
wolf eats mice
lion eats cow
dog likes sleep
elephant likes water
cat likes ball
coyote eats rodent
coyote eats rabbit
wolf eats squirrel
dog likes sleep
cat likes ball
(Each sentence is a sequence of three word SDRs: Word 1, Word 2, Word 3.)
Sequences of Word SDRs
HTM
With the same training set, the HTM is asked: "fox" eats → ?
The HTM answers: "fox" eats → rodent
- Learning is unsupervised
- Semantic generalization
- Works across languages
- Many applications
Intelligent search
Sentiment analysis
Semantic filtering
Server metrics, human metrics, natural language, GPS data, EEG data, financial data
All these applications run on
the exact same HTM code.
Research Roadmap
- High-order Inference: theory 98%, extensively tested, commercial
- Sensory-motor Inference: theory 80%, in development
- Motor Sequences: theory 50%
- Attention/Feedback: theory 30%
Streaming Data
Capabilities: Prediction, Anomaly detection, Classification
Applications: IT, Security, Natural Language Processing
Static Data (via active learning)
Capabilities: Classification, Prediction
Applications: Vision (image classification), Network classification, Classification of connected graphs
Static and/or Streaming Data
Capabilities: Goal-oriented behavior
Applications: Robotics, Smart bots, Proactive defense
Enables: Multi-sensory modalities, Multi-behavioral modalities
Research Transparency
- Algorithms are documented
- Multiple independent implementations
- Numenta's software is open source (GPLv3)
- Numenta's daily research code is online
NuPIC Community (www.numenta.org)
- Active discussion groups for theory and implementation
- Collaborators: IBM Almaden Research (San Jose, CA), DARPA (Washington, D.C.), Cortical.IO (Austria)
Machine Intelligence Landscape

              Cortical (e.g. HTM)       ANNs (e.g. Deep learning)   A.I. (e.g. Watson)
Premise       Biological                Mathematical                Engineered
Data          Spatial-temporal,         Spatial-temporal,           Documents
              Language, Behavior        Language
Capabilities  Classification,           Classification              NL Query
              Prediction,
              Goal-oriented Behavior
Path to M.I.? Yes                       Probably not                Probably not
Learning Normal Behavior
Geospatial Anomalies
- Deviation in path
- Change in direction
Learning Transitions (shown over successive time steps)
Form connections to previously active cells.
Predict future activity.
- This is a first order sequence memory.
- It cannot learn A-B-C-D vs. X-B-C-Y.
- Mini-columns turn this into a high-order sequence memory.
Learning Transitions
Multiple predictions can occur at once.
A-B A-C A-D
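The transition learning described above can be sketched in a few lines. This is a toy illustration, not Numenta's implementation; symbols stand in for cells, and the successor sets stand in for connections to previously active cells.

```python
from collections import defaultdict

class FirstOrderMemory:
    """Toy illustration of first-order transition learning (not Numenta's
    implementation): elements form connections to previously active elements
    and use them to predict future activity."""
    def __init__(self):
        self.successors = defaultdict(set)

    def learn(self, sequence):
        # Connect each element to its predecessor.
        for prev, nxt in zip(sequence, sequence[1:]):
            self.successors[prev].add(nxt)

    def predict(self, symbol):
        # All learned successors become predicted at once.
        return self.successors[symbol]

m = FirstOrderMemory()
for seq in ["AB", "AC", "AD", "ABCD", "XBCY"]:
    m.learn(seq)

# Multiple predictions can occur at once: A-B, A-C, A-D.
assert m.predict("A") == {"B", "C", "D"}
# First-order limitation: after C it predicts both D and Y, so it cannot
# distinguish A-B-C-D from X-B-C-Y without high-order context.
assert m.predict("C") == {"D", "Y"}
```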
Forming High-Order Representations
Feedforward input causes sparse activation of columns.
- Unpredicted input → a burst of activity in the active columns
- Predicted input → a highly sparse, unique pattern
Representing High-order Sequences
Before training: the sequences A-B-C-D and X-B-C-Y activate the same cells for B and C.
After training: A-B'-C'-D' and X-B''-C''-Y''
Same columns, but only one cell active per column.
If 40 active columns and 10 cells per column,
then there are 10^40 ways to represent the same input in different contexts.
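The arithmetic behind that count is direct: each active column independently selects one of its cells.

```python
# Each of the 40 active columns independently chooses 1 of its 10 cells,
# so the same column-level input can appear in 10^40 distinct cell-level
# contexts.
active_columns = 40
cells_per_column = 10
contexts = cells_per_column ** active_columns
assert contexts == 10 ** 40
```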
SDR Properties
1) Similarity: shared bits = semantic similarity
2) Store and Compare: store the indices of active bits; subsampling the indices is OK
3) Union membership: OR together many SDRs (e.g. 10 SDRs at 2% sparsity yield a union ~20% dense), then ask "is this SDR a member?"
What Can Be Done With Software
- 1 layer: 2,048 columns, 65,000 neurons, 300M synapses
- 30 msec per learning-inference-prediction step
- ~10^-6 of human cortex
Challenges and Opportunities for Neuromorphic HW
Challenges
- Dendritic regions, active dendrites
- 1,000s of synapses, 10,000s of potential synapses
- Continuous learning
Opportunities
- Low precision memory (synapses)
- Fault tolerant: memory, connectivity, neurons, natural recovery
- Simple activation states (no spikes)
- Connectivity: very sparse, topological
Why Will Machine Intelligence be Based on Cortical Principles?
1) Cortex uses a common learning algorithm (vision, hearing, touch, behavior)
2) Cortical algorithm is incredibly adaptable (languages, engineering, science, arts, …)
3) Network effects: hardware and software efforts will focus on the most universal solution
Cellular Layers
- Layer 2/3: sequence memory: inference (high-order)
- Layer 4: sequence memory: inference (sensory-motor)
- Layer 5: sequence memory: motor
- Layer 6: sequence memory: attention
Each layer is a variation of a common sequence memory algorithm.
Inputs/outputs define the role of each layer: feedforward from sensors and lower cortex, feedback from higher cortex, output to sub-cortical motor centers.
Learning Transitions
- Feedforward activation
- Inhibition
Sparse Distributed Representations (SDRs)
- Sensory perception
- Planning
- Motor control
- Prediction
- Attention
Sparse Distributed Representations are used everywhere in the cortex.
Sparse Distributed Representations
What are they
• Many bits (thousands)
• Few 1’s, mostly 0’s
• Example: 2,000 bits, 2% active
• Each bit has semantic meaning
• No bit is essential
01000000000000000001000000000000000000000000000000000010000…………01000
Desirable attributes
• High capacity
• Robust to noise and deletion
• Efficient and fast
• Enable new operations
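A minimal sketch of such an SDR, representing it as a set of active-bit indices (the set representation is an assumption of this illustration, not a required format):

```python
import random

def make_sdr(n=2000, sparsity=0.02, seed=None):
    """Create a random SDR with n bits and a small fraction active.
    Represented here as a frozenset of active-bit indices."""
    rng = random.Random(seed)
    w = int(n * sparsity)                 # number of 1 bits: 40
    return frozenset(rng.sample(range(n), w))

sdr = make_sdr(seed=1)
assert len(sdr) == 40                     # 2% of 2000 bits are active

# No bit is essential: delete a quarter of the active bits and the
# representation still shares most of its bits with the original.
damaged = frozenset(list(sdr)[:30])
overlap = len(sdr & damaged)
assert overlap == 30
```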
SDR Operations
1) Similarity: shared bits = semantic similarity
2) Store and Compare: store the indices of active bits; subsampling the indices is OK
3) Union membership: OR together many SDRs (e.g. 10 SDRs at 2% sparsity yield a union ~20% dense), then ask "is this SDR a member?"
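The three operations can be sketched directly on index-set SDRs. Sizes and thresholds below are illustrative assumptions, not prescribed values.

```python
import random

rng = random.Random(42)
N, W = 2048, 40

def rand_sdr():
    return frozenset(rng.sample(range(N), W))

def overlap(x, y):
    # 1) Similarity: shared active bits = semantic similarity.
    return len(x & y)

a, b = rand_sdr(), rand_sdr()

# 2) Store and compare: keep only a subsample of a's 40 indices.
subsample = frozenset(list(a)[:10])
assert overlap(a, subsample) == 10   # a matches its own subsample perfectly
assert overlap(b, subsample) < 5     # an unrelated SDR barely matches it

# 3) Union membership: OR together 10 SDRs (~20% dense), then test membership.
members = [rand_sdr() for _ in range(10)]
union = frozenset().union(*members)
assert all(m <= union for m in members)   # every stored SDR is a member
assert not (rand_sdr() <= union)          # a random SDR almost surely is not
```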
SmartHarbors
GPS to SDR Encoder
(HTM neuron dendritic zones: feedforward input activates the cell; local and feedback inputs provide context.)
Neurons
Biological neuron: dendrites are non-linear coincidence detectors; dendritic APs depolarize the soma.
HTM neuron: models these same properties.
Biological Synapses vs. HTM Synapses
Biological: learning is the formation of new synapses; synapses have low fidelity.
HTM: the connection weight is binary; learning forms new connections by growing a scalar "permanence" value (0.0 to 1.0) past a connection threshold (e.g. ~0.4).
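The binary-weight/scalar-permanence idea can be sketched as follows. The threshold and increment values are illustrative assumptions, not Numenta's exact parameters.

```python
CONNECT_THRESHOLD = 0.4   # illustrative threshold, not an exact model value

class Synapse:
    """Sketch of an HTM-style synapse: the effective weight is binary
    (connected or not), while learning adjusts a scalar 'permanence'
    in [0.0, 1.0]. Crossing the threshold forms a new connection."""
    def __init__(self, permanence):
        self.permanence = permanence

    @property
    def connected(self):
        return self.permanence >= CONNECT_THRESHOLD

    def reinforce(self, delta=0.1):
        self.permanence = min(1.0, self.permanence + delta)

s = Synapse(permanence=0.35)
assert not s.connected   # potential synapse: below threshold, weight 0
s.reinforce()            # permanence rises to ~0.45
assert s.connected       # a new connection has formed, weight 1
```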
Feedforward input activates the cell.
Activation: synapses recognize dozens of unique patterns.
Prediction: synapses recognize hundreds of unique patterns.
SDRs are used everywhere in the cortex.
Sparse Distributed Representations (SDRs)
From: Prof. Hasan, Max-Planck-Institute for Research
x = 0100000000000000000100000000000110000000
Attributes
• Extremely high capacity
• Robust to noise and deletions
• Have many desirable properties
• Solve the semantic representation problem
SDR Basics
• Large number of neurons
• Few active at once
• Every cell represents something
• Information is distributed
• SDRs are binary
10 to 15 synapses are sufficient to recognize patterns in thousands of cells.
A single dendrite can recognize multiple unique patterns without confusion.
Example: SDR Classification Capacity in Presence of Noise
• n = number of bits in the SDR
• w = number of 1 bits
• $|\Omega_x(n, w, b)|$ = number of vectors that overlap vector x in exactly b bits:
$$|\Omega_x(n, w, b)| = \binom{w_x}{b} \binom{n - w_x}{w - b}$$
• Probability of a false positive for one stored pattern, with match threshold $\theta$:
$$fp_w^n(\theta) = \frac{\sum_{b=\theta}^{w} |\Omega_x(n, w, b)|}{\binom{n}{w}}$$
• Probability of a false positive for M stored patterns (union bound):
$$fp_X(\theta) \le \sum_{i=0}^{M-1} fp_{w_{x_i}}^n(\theta)$$
n = 2048, w = 40: with 50% noise, you can classify 10^15 patterns with an error < 10^-11.
n = 64, w = 12: with 33% noise, you can classify only 10 patterns with an error of 0.04%.
Link.to.whitepaper.com
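The formulas above can be evaluated directly to check the quoted numbers, using Python's `math.comb`. The thresholds assume noise deletes that fraction of active bits, so a match requires the surviving bits (20 of 40, or 8 of 12) to overlap.

```python
from math import comb

def fp_single(n, w, theta):
    """Probability that a random SDR of w active bits (out of n) overlaps a
    stored pattern in at least theta bits: a false positive for one pattern."""
    hits = sum(comb(w, b) * comb(n - w, w - b) for b in range(theta, w + 1))
    return hits / comb(n, w)

# n = 2048, w = 40; 50% noise leaves 20 matching bits, so theta = 20.
fp_big = fp_single(2048, 40, 20)
assert fp_big * 1e15 < 1e-9        # union bound over 10^15 stored patterns

# n = 64, w = 12; 33% noise leaves 8 matching bits, so theta = 8.
fp_small = fp_single(64, 12, 8)
assert 1e-4 < fp_small * 10 < 1e-3  # ~0.04% over just 10 stored patterns
```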
More Related Content

PPTX
Principles of Hierarchical Temporal Memory - Foundations of Machine Intelligence
PPT
HTM Theory
PPTX
Brains, Data, and Machine Intelligence (2014 04 14 London Meetup)
PPTX
Why Neurons have thousands of synapses? A model of sequence memory in the brain
PDF
Hierarchical Temporal Memory: Computing Like the Brain - Matt Taylor, Numenta
PDF
The Biological Path Towards Strong AI Strange Loop 2017, St. Louis
PDF
ICMNS Presentation: Presence of high order cell assemblies in mouse visual co...
PPTX
Sparse Distributed Representations: Our Brain's Data Structure
Principles of Hierarchical Temporal Memory - Foundations of Machine Intelligence
HTM Theory
Brains, Data, and Machine Intelligence (2014 04 14 London Meetup)
Why Neurons have thousands of synapses? A model of sequence memory in the brain
Hierarchical Temporal Memory: Computing Like the Brain - Matt Taylor, Numenta
The Biological Path Towards Strong AI Strange Loop 2017, St. Louis
ICMNS Presentation: Presence of high order cell assemblies in mouse visual co...
Sparse Distributed Representations: Our Brain's Data Structure

What's hot (20)

PDF
Biological path toward strong AI
PDF
The Predictive Neuron: How Active Dendrites Enable Spatiotemporal Computation...
PDF
Numenta Brain Theory Discoveries of 2016/2017 by Jeff Hawkins
PDF
Introduction to Deep Learning
PDF
The Biological Path Toward Strong AI by Matt Taylor (05/17/18)
PDF
Deep Learning
PDF
Introduction of Deep Learning
PPTX
Ai ml dl_bct and mariners-1
PPTX
Deep learning tutorial 9/2019
PDF
SBMT 2021: Can Neuroscience Insights Transform AI? - Lawrence Spracklen
PPTX
Neural networks...
PDF
Deep Learning: Application & Opportunity
PDF
Does the neocortex use grid cell-like mechanisms to learn the structure of ob...
PPTX
Deep Visual Understanding from Deep Learning by Prof. Jitendra Malik
PPTX
Artificial Intelligence, Machine Learning and Deep Learning with CNN
PDF
Deep learning
PDF
Artificial Neural Network Seminar - Google Brain
PDF
Neural networks and deep learning
PDF
Deep Learning Class #0 - You Can Do It
PDF
Deep Learning - The Past, Present and Future of Artificial Intelligence
Biological path toward strong AI
The Predictive Neuron: How Active Dendrites Enable Spatiotemporal Computation...
Numenta Brain Theory Discoveries of 2016/2017 by Jeff Hawkins
Introduction to Deep Learning
The Biological Path Toward Strong AI by Matt Taylor (05/17/18)
Deep Learning
Introduction of Deep Learning
Ai ml dl_bct and mariners-1
Deep learning tutorial 9/2019
SBMT 2021: Can Neuroscience Insights Transform AI? - Lawrence Spracklen
Neural networks...
Deep Learning: Application & Opportunity
Does the neocortex use grid cell-like mechanisms to learn the structure of ob...
Deep Visual Understanding from Deep Learning by Prof. Jitendra Malik
Artificial Intelligence, Machine Learning and Deep Learning with CNN
Deep learning
Artificial Neural Network Seminar - Google Brain
Neural networks and deep learning
Deep Learning Class #0 - You Can Do It
Deep Learning - The Past, Present and Future of Artificial Intelligence
Ad

Viewers also liked (8)

PPTX
HTM Spatial Pooler
PPTX
Beginner's Guide to NuPIC
PPTX
a tour of several popular tensorflow models
PDF
Recognizing Locations on Objects by Marcus Lewis
PPTX
Applications of Hierarchical Temporal Memory (HTM)
PPTX
Getting Started with Numenta Technology
PDF
Predictive Analytics with Numenta Machine Intelligence
PDF
TouchNet preview at Numenta
HTM Spatial Pooler
Beginner's Guide to NuPIC
a tour of several popular tensorflow models
Recognizing Locations on Objects by Marcus Lewis
Applications of Hierarchical Temporal Memory (HTM)
Getting Started with Numenta Technology
Predictive Analytics with Numenta Machine Intelligence
TouchNet preview at Numenta
Ad

Similar to What the Brain says about Machine Intelligence (20)

PPT
Useful Techniques in Artificial Intelligence
PDF
SF Big Analytics20170706: What the brain tells us about the future of streami...
PDF
Ai ml dl_bct and mariners
PPTX
Ai ml dl_bct and mariners
PDF
Ch 1 Introduction to AI Applications.pdf
PDF
Deep learning - A Visual Introduction
PPTX
Artificial Intelligence Today (22 June 2017)
PDF
AI in 6 Hours this pdf contains a general idea of how AI will be asked in the...
PDF
AI_in_6_Hours_lyst1728638806090-invert.pdf
PPT
SMACS Research
PPTX
Big Sky Earth 2018 Introduction to machine learning
PDF
Pharo-AI
PPTX
AI/ML/DL/BCT A Revolution in Maritime Sector
PPT
AI and Expert Systems
PDF
AI for Cybersecurity Innovation
PPT
Nural network ER. Abhishek k. upadhyay
PPTX
[DSC Europe 23] Goran S. Milovanovic - Deciphering the AI Landscape: Business...
PPTX
AI for Everyone: Master the Basics
PPTX
Parsimony and Self-Consistency-with-Translation.pptx
PDF
Novi sad ai event 1-2018
Useful Techniques in Artificial Intelligence
SF Big Analytics20170706: What the brain tells us about the future of streami...
Ai ml dl_bct and mariners
Ai ml dl_bct and mariners
Ch 1 Introduction to AI Applications.pdf
Deep learning - A Visual Introduction
Artificial Intelligence Today (22 June 2017)
AI in 6 Hours this pdf contains a general idea of how AI will be asked in the...
AI_in_6_Hours_lyst1728638806090-invert.pdf
SMACS Research
Big Sky Earth 2018 Introduction to machine learning
Pharo-AI
AI/ML/DL/BCT A Revolution in Maritime Sector
AI and Expert Systems
AI for Cybersecurity Innovation
Nural network ER. Abhishek k. upadhyay
[DSC Europe 23] Goran S. Milovanovic - Deciphering the AI Landscape: Business...
AI for Everyone: Master the Basics
Parsimony and Self-Consistency-with-Translation.pptx
Novi sad ai event 1-2018

More from Numenta (20)

PDF
Deep learning at the edge: 100x Inference improvement on edge devices
PDF
Brains@Bay Meetup: A Primer on Neuromodulatory Systems - Srikanth Ramaswamy
PDF
Brains@Bay Meetup: How to Evolve Your Own Lab Rat - Thomas Miconi
PDF
Brains@Bay Meetup: The Increasing Role of Sensorimotor Experience in Artifici...
PDF
Brains@Bay Meetup: Open-ended Skill Acquisition in Humans and Machines: An Ev...
PDF
Brains@Bay Meetup: The Effect of Sensorimotor Learning on the Learned Represe...
PDF
FPGA Conference 2021: Breaking the TOPS ceiling with sparse neural networks -...
PDF
BAAI Conference 2021: The Thousand Brains Theory - A Roadmap for Creating Mac...
PDF
Jeff Hawkins NAISys 2020: How the Brain Uses Reference Frames, Why AI Needs t...
PDF
OpenAI’s GPT 3 Language Model - guest Steve Omohundro
PDF
CVPR 2020 Workshop: Sparsity in the neocortex, and its implications for conti...
PDF
Sparsity In The Neocortex, And Its Implications For Machine Learning
PDF
The Thousand Brains Theory: A Framework for Understanding the Neocortex and B...
PPTX
Jeff Hawkins Human Brain Project Summit Keynote: "Location, Location, Locatio...
PPTX
Location, Location, Location - A Framework for Intelligence and Cortical Comp...
PPTX
Have We Missed Half of What the Neocortex Does? A New Predictive Framework ...
PPTX
Locations in the Neocortex: A Theory of Sensorimotor Prediction Using Cortica...
PPTX
The Predictive Neuron: How Active Dendrites Enable Spatiotemporal Computation...
PDF
Could A Model Of Predictive Voting Explain Many Long-Range Connections? by Su...
PPTX
Have We Missed Half of What the Neocortex Does? by Jeff Hawkins (12/15/2017)
Deep learning at the edge: 100x Inference improvement on edge devices
Brains@Bay Meetup: A Primer on Neuromodulatory Systems - Srikanth Ramaswamy
Brains@Bay Meetup: How to Evolve Your Own Lab Rat - Thomas Miconi
Brains@Bay Meetup: The Increasing Role of Sensorimotor Experience in Artifici...
Brains@Bay Meetup: Open-ended Skill Acquisition in Humans and Machines: An Ev...
Brains@Bay Meetup: The Effect of Sensorimotor Learning on the Learned Represe...
FPGA Conference 2021: Breaking the TOPS ceiling with sparse neural networks -...
BAAI Conference 2021: The Thousand Brains Theory - A Roadmap for Creating Mac...
Jeff Hawkins NAISys 2020: How the Brain Uses Reference Frames, Why AI Needs t...
OpenAI’s GPT 3 Language Model - guest Steve Omohundro
CVPR 2020 Workshop: Sparsity in the neocortex, and its implications for conti...
Sparsity In The Neocortex, And Its Implications For Machine Learning
The Thousand Brains Theory: A Framework for Understanding the Neocortex and B...
Jeff Hawkins Human Brain Project Summit Keynote: "Location, Location, Locatio...
Location, Location, Location - A Framework for Intelligence and Cortical Comp...
Have We Missed Half of What the Neocortex Does? A New Predictive Framework ...
Locations in the Neocortex: A Theory of Sensorimotor Prediction Using Cortica...
The Predictive Neuron: How Active Dendrites Enable Spatiotemporal Computation...
Could A Model Of Predictive Voting Explain Many Long-Range Connections? by Su...
Have We Missed Half of What the Neocortex Does? by Jeff Hawkins (12/15/2017)

Recently uploaded (20)

PPTX
MYSQL Presentation for SQL database connectivity
PDF
Diabetes mellitus diagnosis method based random forest with bat algorithm
PDF
Chapter 3 Spatial Domain Image Processing.pdf
PDF
Blue Purple Modern Animated Computer Science Presentation.pdf.pdf
PDF
Peak of Data & AI Encore- AI for Metadata and Smarter Workflows
PDF
cuic standard and advanced reporting.pdf
PDF
Encapsulation theory and applications.pdf
PPTX
VMware vSphere Foundation How to Sell Presentation-Ver1.4-2-14-2024.pptx
PPTX
Understanding_Digital_Forensics_Presentation.pptx
PDF
Encapsulation_ Review paper, used for researhc scholars
PDF
MIND Revenue Release Quarter 2 2025 Press Release
PDF
Profit Center Accounting in SAP S/4HANA, S4F28 Col11
PDF
Mobile App Security Testing_ A Comprehensive Guide.pdf
PDF
Network Security Unit 5.pdf for BCA BBA.
PPTX
Effective Security Operations Center (SOC) A Modern, Strategic, and Threat-In...
PPTX
Spectroscopy.pptx food analysis technology
PDF
Per capita expenditure prediction using model stacking based on satellite ima...
PDF
Architecting across the Boundaries of two Complex Domains - Healthcare & Tech...
PPT
“AI and Expert System Decision Support & Business Intelligence Systems”
PDF
Optimiser vos workloads AI/ML sur Amazon EC2 et AWS Graviton
MYSQL Presentation for SQL database connectivity
Diabetes mellitus diagnosis method based random forest with bat algorithm
Chapter 3 Spatial Domain Image Processing.pdf
Blue Purple Modern Animated Computer Science Presentation.pdf.pdf
Peak of Data & AI Encore- AI for Metadata and Smarter Workflows
cuic standard and advanced reporting.pdf
Encapsulation theory and applications.pdf
VMware vSphere Foundation How to Sell Presentation-Ver1.4-2-14-2024.pptx
Understanding_Digital_Forensics_Presentation.pptx
Encapsulation_ Review paper, used for researhc scholars
MIND Revenue Release Quarter 2 2025 Press Release
Profit Center Accounting in SAP S/4HANA, S4F28 Col11
Mobile App Security Testing_ A Comprehensive Guide.pdf
Network Security Unit 5.pdf for BCA BBA.
Effective Security Operations Center (SOC) A Modern, Strategic, and Threat-In...
Spectroscopy.pptx food analysis technology
Per capita expenditure prediction using model stacking based on satellite ima...
Architecting across the Boundaries of two Complex Domains - Healthcare & Tech...
“AI and Expert System Decision Support & Business Intelligence Systems”
Optimiser vos workloads AI/ML sur Amazon EC2 et AWS Graviton

What the Brain says about Machine Intelligence

  • 1. November 21, 2014 Jeff Hawkins jhawkins@Numenta.com What the Brain Says About Machine Intelligence
  • 2. 1940’s 1950’s - Dedicated vs. universal - Analog vs. digital - Decimal vs. binary - Wired vs. memory-based programming - Serial vs. random access memory Many approaches - Universal - Digital - Binary - Memory-based programming - Two tier memory One dominant paradigm The Birth of Programmable Computing Why Did One Paradigm Win? - Network effects Why Did This Paradigm Win? - Most flexible - Most scalable
  • 3. 2010’s 2020’s The Birth of Machine Intelligence - Specific vs. universal algorithms - Mathematical vs. memory-based - Batch vs. on-line learning - Labeled vs. behavior-based learning Many approaches - Universal algorithms - Memory-based - On-line learning - Behavior-based learning One dominant paradigm Why Will One Paradigm Win? - Network effects Why Will This Paradigm Win? - Most flexible - Most scalable How Do We Know This is Going to Happen? - Brain is proof case - We have made great progress
  • 4. 1) Discover operating principles of neocortex. 2) Create machine intelligence technology based on neocortical principles. Numenta’s Mission Talk Topics - Cortical facts - Cortical theory - Research roadmap - Applications - Thoughts on Machine Intelligence
  • 5. What the Cortex Does patterns Learns a model of world from changing sensory data The model generates - predictions - anomalies - actions Most sensory changes are due to your own movement The neocortex learns a sensory-motor model of the world patterns patterns light sound touch retina cochlear somatic
  • 6. Cortical Facts Hierarchy Cellular layers Mini-columns Neurons: 3-10K synapses - 10% proximal - 90% distal Active dendrites Learning = new synapses Remarkably uniform - anatomically - functionally 2.5 mm Sheet of cells 2/3 4 6 5
  • 7. Cortical Theory Hierarchy Cellular layers Mini-columns Neurons: 3-10K synapses - 10% proximal - 90% distal Active dendrites Learning = new synapses Remarkably uniform - anatomically - functionally Sheet of cellsHTM Hierarchical Temporal Memory 1) Hierarchy of identical regions 2) Each region learns sequences 3) Stability increases going up hierarchy if input is predictable 4) Sequences unfold going down Questions - What does a region do? - What do the cellular layers do? - How do neurons implement this? - How does this work in hierarchy? 2/3 4 6 5
  • 8. 2/3 4 5 6 Cellular Layers Sequence memory: Sequence memory: Sequence memory: Sequence memory: Inference (high-order) Inference (sensory-motor) Motor Attention FeedforwardFeedback Each layer is a variation of common sequence memory algorithm. These are universal functions. They apply to: - all cortical regions - all sensory-motor modalities. Copy of motor commands Sensor data Higher region Sub-cortical Motor centers Lower region
  • 9. 2/3 4 5 6 Sequence memory: Sequence memory: Sequence memory: Sequence memory: ? ? ? ? How Does Sequence Memory Work?
  • 10. HTM Temporal Memory Learns sequences Recognizes and recalls sequences Predicts next inputs - High capacity - Distributed - Local learning rules - Fault tolerant - No sensitive parameters - Generalizes
  • 11. HTM Temporal Memory Not Just Another ANN 1) Cortical Anatomy Mini-columns Inhibitory cells Cell connectivity patterns 2) Sparse Distributed Representations 3) Realistic Neurons Active dendrites Thousands of synapses Learn via synapse formation numenta.com/learn/
  • 12. 2/3 4 5 6 Research Roadmap Sensory-motor Inference High-order Inference Motor Sequences Attention/Feedback Theory 98% Extensively tested Commercial Theory 80% In development Theory 50% Theory 30% Streaming Data Capabilities: Prediction Anomaly detection Classification Applications: Predictive maintenance Security Natural Language Processing
  • 13. HTM Encoder SDRData stream Predictions Anomalies Classification Streaming Data Applications Numbers Categories Date Time GPS Words Applications Servers Biometrics Medical Vehicles Industrial equipment Social media Comm. networks
  • 14. Streaming Data Applications Server metrics Human metrics Natural languageGPS dataEEG data Financial data
  • 15. . . . Anomaly Detection in Server Metrics (Grok for AWS) HTM Encoder SDRServer Metric Anomaly Score HTM Encoder SDRServer Metric Anomaly Score Mobile Dashboard  Servers sorted by anomaly score  Continuously updated Web Dashboard
  • 16. What Kind of Anomalies Can HTM Detect? Sudden changes Slow changes Changes in noisy dataSubtle changes in regular data
  • 17. Changes that humans can’t see Engineer manually started build on automated build server What Kind of Anomalies Can HTM Detect?
  • 18. Created large Zip file Anomaly Detection in Human Metrics Keystrokes File access CPU usage App access
  • 19. Anomaly Detection in Financial and Social Media Data Stock volume Social media Stock volume Social media
  • 20. Berkeley Cognitive Technology Group Classification of EEG Data
  • 22. Document corpus (e.g. Wikipedia) 128 x 128 100K “Word SDRs” - = Apple Fruit Computer Macintosh Microsoft Mac Linux Operating system …. Natural Language
  • 23. Training set frog eats flies cow eats grain elephant eats leaves goat eats grass wolf eats rabbit cat likes ball elephant likes water sheep eats grass cat eats salmon wolf eats mice lion eats cow dog likes sleep elephant likes water cat likes ball coyote eats rodent coyote eats rabbit wolf eats squirrel dog likes sleep cat likes ball ---- ---- ----- Word 3Word 2Word 1 Sequences of Word SDRs HTM
  • 24. Training set eats“fox” ? frog eats flies cow eats grain elephant eats leaves goat eats grass wolf eats rabbit cat likes ball elephant likes water sheep eats grass cat eats salmon wolf eats mice lion eats cow dog likes sleep elephant likes water cat likes ball coyote eats rodent coyote eats rabbit wolf eats squirrel dog likes sleep cat likes ball ---- ---- ----- Sequences of Word SDRs HTM
  • 25. Sequences of Word SDRs. The HTM predicts "fox" eats → rodent. Learning is unsupervised; semantic generalization; works across languages; many applications: intelligent search, sentiment analysis, semantic filtering
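The generalization to the never-trained word "fox" rests on semantic overlap between word SDRs. The sketch below is a toy invented for illustration — the bit sets and the nearest-neighbor lookup are not Numenta's or Cortical.IO's actual representations — but it shows the mechanism: similar animals share bits, so an unseen subject inherits the prediction of its closest trained neighbor.

```python
# Toy word SDRs as sets of active-bit indices (invented for this sketch):
# semantically similar animals are given overlapping bit sets.
word_sdr = {
    "wolf":   {1, 2, 3, 10},
    "coyote": {1, 2, 5, 11},
    "cow":    {20, 21, 22, 30},
    "fox":    {1, 2, 5, 12},   # never seen in training
}

# What each trained subject "eats".
training = {"wolf": "rabbit", "coyote": "rodent", "cow": "grain"}

def predict_eats(subject):
    # Generalize by picking the trained subject whose SDR shares
    # the most bits with the query word.
    best = max(training, key=lambda w: len(word_sdr[w] & word_sdr[subject]))
    return training[best]

print(predict_eats("fox"))  # rodent -- fox's SDR overlaps coyote's most
```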
  • 26. Server metrics, human metrics, natural language, GPS data, EEG data, financial data: all these applications run on the exact same HTM code.
  • 29. Research Roadmap (across cortical layers 2/3, 4, 5, 6):
    High-order inference — theory 98%, extensively tested, commercial. Streaming data. Capabilities: prediction, anomaly detection, classification. Applications: IT security, natural language processing.
    Sensory-motor inference — theory 80%, in development. Static data (via active learning). Capabilities: classification, prediction. Applications: vision image classification, network classification, classification of connected graphs.
    Motor sequences — theory 50%. Static and/or streaming data. Capabilities: goal-oriented behavior. Applications: robotics, smart bots, proactive defense.
    Attention/feedback — theory 30%. Enables multi-sensory modalities and multi-behavioral modalities.
  • 30. Research Transparency. Numenta's software (NuPIC, www.Numenta.org) is open source (GPLv3); Numenta's daily research code is online; algorithms are documented; multiple independent implementations exist; active discussion groups for theory and implementation. Collaborative: IBM Almaden Research, San Jose, CA; DARPA, Washington, D.C.; Cortical.IO, Austria
  • 36. Machine Intelligence Landscape, three approaches compared:
    Premise — Cortical (e.g. HTM): biological; ANNs (e.g. deep learning): mathematical; A.I. (e.g. Watson): engineered
    Data — Cortical: spatial-temporal, language, behavior; ANNs: spatial-temporal, language; A.I.: documents
    Capabilities — Cortical: classification, prediction, goal-oriented behavior; ANNs: classification; A.I.: NL query
    Path to M.I.? — Cortical: yes; ANNs: probably not; A.I.: probably not
  • 40. Geospatial Anomalies Deviation in path Change in direction
  • 42. Time = 1 Learning Transitions
  • 43. Time = 2 Learning Transitions
  • 44. Learning Transitions Form connections to previously active cells. Predict future activity.
  • 45. Learning Transitions. Multiple predictions can occur at once (A-B, A-C, A-D). This is a first-order sequence memory: it cannot learn A-B-C-D vs. X-B-C-Y. Mini-columns turn this into a high-order sequence memory.
  • 46. Forming High-Order Representations. Feedforward input causes sparse activation of columns. Unpredicted input → a burst of activity in the column. Predicted input → a highly sparse, unique pattern.
  • 47. Representing High-Order Sequences. Before training: A X B B C C Y D. After training: A X B'' B' C'' C' Y'' D'. Same columns, but only one cell active per column. If 40 active columns and 10 cells per column, there are 10^40 ways to represent the same input in different contexts.
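The 10^40 figure on this slide is just the capacity arithmetic: with one of 10 cells chosen per column to encode context, 40 active columns yield 10^40 distinct cell-level codes for the same column-level input.

```python
# Capacity arithmetic from the slide: choosing one of `cells_per_column`
# cells independently in each of the 40 active columns gives
# cells_per_column ** active_columns distinct contextual representations.
active_columns = 40
cells_per_column = 10

contexts = cells_per_column ** active_columns
print(contexts == 10 ** 40)  # True
print(len(str(contexts)))    # 41 -- a 41-digit number of contexts
```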
  • 48. SDR Properties. 1) Similarity: shared bits = semantic similarity. 2) Store and compare: store the indices of the active bits (indices 1 2 3 4 5 | 40); subsampling is OK (indices 1 2 | 10). 3) Union membership: OR together many SDRs (2% active each, ~20% active union), then test "is this SDR a member?"
  • 49. What Can Be Done With Software: 1 layer, 2048 columns, 65,000 neurons, 300M synapses (about 10^-6 of human cortex), 30 msec per learning-inference-prediction step
  • 50. Challenges and Opportunities for Neuromorphic HW.
    Challenges: dendritic regions, active dendrites; 1,000s of synapses, 10,000s of potential synapses; continuous learning.
    Opportunities: low-precision memory (synapses); fault tolerance (memory, connectivity, neurons, natural recovery); simple activation states (no spikes); very sparse, topological connectivity.
  • 51. Cellular Layers. Layers 2/3, 4, 5, and 6 each implement a variation of a common sequence memory algorithm: layers 2/3 and 4 for inference, layer 5 for motor, layer 6 for attention. Feedforward input arrives from the sensor/lower cortex, feedback from higher cortex; outputs go to lower cortex and the motor center.
  • 52. Why Will Machine Intelligence Be Based on Cortical Principles? 1) The cortex uses a common learning algorithm (vision, hearing, touch, behavior). 2) The cortical algorithm is incredibly adaptable (languages, engineering, science, arts, …). 3) Network effects: hardware and software efforts will focus on the most universal solution.
  • 53. Cellular Layers. Layers 2/3, 4, 5, and 6 each implement a variation of a common sequence memory algorithm: layers 2/3 and 4 for inference, layer 5 for motor, layer 6 for attention. Feedforward input arrives from the sensor/lower cortex, feedback from higher cortex; outputs go to lower cortex and the sub-cortical motor center. Inputs/outputs define the role of each layer.
  • 56. Sparse Distributed Representations (SDRs). SDRs are used everywhere in the cortex: sensory perception, planning, motor control, prediction, attention.
  • 57. Sparse Distributed Representations. What they are: many bits (thousands); few 1's, mostly 0's; example: 2,000 bits, 2% active (01000000000000000001000000000000000000000000000000000010000…………01000); each bit has semantic meaning; no bit is essential. Desirable attributes: high capacity; robust to noise and deletion; efficient and fast; enable new operations.
  • 58. SDR Operations. 1) Similarity: shared bits = semantic similarity. 2) Store and compare: store the indices of the active bits (indices 1 2 3 4 5 | 40); subsampling is OK (indices 1 2 | 10). 3) Union membership: OR together many SDRs (2% active each, ~20% active union), then test "is this SDR a member?"
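The three SDR operations above map naturally onto sets of active-bit indices. This is a minimal sketch of that representation (an illustration, not NuPIC's implementation), using the slide's parameters of 2048 bits with 40 active.

```python
import random

N, W = 2048, 40  # 2048 bits, ~2% active

def random_sdr():
    # An SDR represented as the set of its active-bit indices.
    return frozenset(random.sample(range(N), W))

# 1) Similarity: the number of shared bits measures semantic similarity.
def overlap(a, b):
    return len(a & b)

# 2) Store and compare: store only the active indices; matching still
#    works against a subsample of the stored indices.
def matches(stored_subsample, candidate, threshold):
    return len(stored_subsample & candidate) >= threshold

# 3) Union membership: OR many SDRs together, then test membership.
a, b = random_sdr(), random_sdr()
union = a | b

print(overlap(a, a) == W)                       # True: all bits shared
print(matches(frozenset(list(a)[:10]), a, 10))  # True with a 10-bit subsample
print(union >= a and union >= b)                # True: both are in the union
```

The union stays sparse enough (here at most ~4% active) that false membership hits remain unlikely, which is what makes operation 3 useful.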
  • 60. GPS to SDR Encoder
  • 64. Neurons and Synapses.
    Biological neuron: feedforward input activates the cell; dendrites are coincidence detectors; non-linear dendritic APs depolarize the soma; local and feedback inputs.
    HTM neuron: feedforward input activates the cell (activation recognizes dozens of unique patterns); distal synapses make predictions (prediction recognizes hundreds of unique patterns).
    Biological synapses: learning is the formation of new synapses; synapses have low fidelity.
    HTM synapses: connection weight is binary (0 or 1); learning forms new connections via a scalar "permanence" value (0.0 to 1.0).
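The binary-weight / scalar-permanence idea above can be sketched as follows. The threshold and learning rates here are illustrative values chosen for the example, not NuPIC's defaults.

```python
# Each potential synapse carries a scalar permanence in [0, 1]; the
# effective connection weight is binary: the synapse counts only once
# its permanence crosses a threshold.
CONNECTED_THRESHOLD = 0.5          # illustrative value
INCREMENT, DECREMENT = 0.1, 0.05   # illustrative learning rates

def is_connected(permanence):
    return permanence >= CONNECTED_THRESHOLD

def learn(permanence, presynaptic_was_active):
    # Reinforce synapses to cells that were active; weaken the rest.
    if presynaptic_was_active:
        return min(1.0, permanence + INCREMENT)
    return max(0.0, permanence - DECREMENT)

p = 0.45                # a potential synapse, not yet connected
p = learn(p, True)      # the presynaptic cell fired
print(is_connected(p))  # True: learning "formed" a new connection
```

Because the effective weight is binary, the representation tolerates low-fidelity storage, which is one of the neuromorphic-hardware opportunities listed earlier.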
  • 65. Sparse Distributed Representations (SDRs). SDRs are used everywhere in the cortex.
  • 66. From: Prof. Hasan, Max-Planck-Institute for Research
  • 67. SDR Basics. Large number of neurons; few active at once; every cell represents something; information is distributed; SDRs are binary, e.g. x = 0100000000000000000100000000000110000000. Attributes: extremely high capacity; robust to noise and deletions; many desirable properties; solve the semantic representation problem. 10 to 15 synapses are sufficient to recognize patterns in thousands of cells. A single dendrite can recognize multiple unique patterns without confusion.
  • 68. Example: SDR Classification Capacity in the Presence of Noise.
    n = number of bits in the SDR; w = number of 1 bits; θ = match threshold (minimum overlap).
    Number of vectors with w active bits that overlap vector x in exactly b bits:
      Ω_x(n, w, b) = C(w_x, b) · C(n − w_x, w − b)
    Probability of a false positive for one stored pattern:
      fp_w^n(θ) = [ Σ_{b=θ}^{w} Ω_x(n, w, b) ] / C(n, w)
    Probability of a false positive for M stored patterns (union bound):
      fp_X(θ) ≤ Σ_{i=0}^{M−1} fp_{w_xi}^n(θ)
    With n = 2048 and w = 40, at 50% noise you can classify 10^15 patterns with an error < 10^-11.
    With n = 64 and w = 12, at 33% noise you can classify only 10 patterns with an error of 0.04%.
    Link.to.whitepaper.com
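The capacity formulas above can be evaluated exactly with integer binomials; this sketch reproduces the two scenarios on the slide (the threshold values θ = 20 and θ = 8 correspond to tolerating 50% and 33% noise respectively).

```python
from math import comb  # exact binomial coefficients (Python 3.8+)

def overlap_set_size(n, w, b):
    # Number of w-bit SDRs that overlap a fixed w-bit SDR in exactly b bits.
    return comb(w, b) * comb(n - w, w - b)

def false_positive_prob(n, w, theta):
    # Probability that a random w-bit SDR matches a stored one,
    # i.e. overlaps it in at least theta of n bits.
    return sum(overlap_set_size(n, w, b) for b in range(theta, w + 1)) / comb(n, w)

# Large SDR: n = 2048, w = 40, theta = 20 (tolerates 50% noise).
# The per-pattern false-positive rate is on the order of 1e-26, so even
# ~1e15 stored patterns keep the union-bound error vanishingly small.
print(false_positive_prob(2048, 40, 20))

# Small SDR: n = 64, w = 12, theta = 8 (tolerates 33% noise).
# For 10 stored patterns the union-bound error is roughly 0.04%.
print(10 * false_positive_prob(64, 12, 8))
```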