Hiroki Sayama
sayama@binghamton.edu
2
https://medium.com/swlh/the-map-of-artificial-intelligence-2020-2c4f446f4e43
1. The Origin: Understanding “Intelligence”
2. Key Ingredient I: Statistics & Data Analytics
3. Key Ingredient II: Optimization
4. Machine Learning
5. Artificial Neural Networks
6. Deep Learning
7. Other Topics and Tools
8. Research Examples
9. Challenges
3
The Origin: Understanding “Intelligence”
4
5
https://www.felienne.com/archives/2974
6
https://en.wikipedia.org/wiki/Turing_test
7
The first formal model of
computational mechanisms of
(artificial) neurons
8
Multilayer perceptron
(Rosenblatt 1958)
Backpropagation
(Rumelhart, Hinton &
Williams 1986)
Deep learning
https://commons.wikimedia.org/wiki/File:Example_of_a_deep_neural_network.png
9
10
Norbert Wiener
(This is where the word “cyber-” came from!)
▪ Herbert Simon et al.’s “Logic Theorist” (1956)
▪ Functional programming, list processing (e.g.,
LISP (1958-))
▪ Logic-based chatbots (e.g., ELIZA (1966))
▪ Expert systems
▪ Fuzzy logic (Zadeh, 1965)
11
12
Key Ingredient I: Statistics & Data Analytics
13
▪ Descriptive statistics
▪ Distribution, correlation,
regression
▪ Inferential statistics
▪ Hypothesis testing, estimation,
Bayesian inference
▪ Parametric / non-parametric
approaches
14
https://en.wikipedia.org/wiki/Statistics
▪ Legendre, Gauss (early 1800s)
▪ Representing the behavior of a
dependent variable (DV) as a
function of independent
variable(s) (IV)
▪ Linear regression, polynomial
regression, logistic regression,
etc.
▪ Optimization (minimization) of
errors between model and data
15
https://en.wikipedia.org/wiki/Regression_analysis
https://en.wikipedia.org/wiki/Polynomial_regression
▪ Original idea dates back to
1700s
▪ Pearson, Gosset, Fisher (early
1900s)
▪ Set up hypothesis(-ses) and
see how (un)likely the
observed data could be
explained by them
▪ Type-I error (false positive),
Type-II error (false negative)
16
https://en.wikibooks.org/wiki/Statistics/Testing_Statistical_Hypothesis
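As a rough illustration of the hypothesis-testing idea, the sketch below computes an exact one-sided p-value for a coin-flip experiment: how likely is it to see at least k heads in n flips if the coin is fair? The function name `binomial_p_value` and the fair-coin null hypothesis are assumptions made for this example, not something taken from the slides.

```python
from math import comb


def binomial_p_value(k, n, p=0.5):
    """One-sided p-value P(X >= k) under H0: X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Hypothetical experiment: 9 heads observed in 10 flips.
p_value = binomial_p_value(9, 10)

# Rejecting H0 when p_value < 0.05 accepts a 5% risk of a Type-I error
# (a false positive) whenever H0 is actually true.
reject_null = p_value < 0.05
```

A small p-value says the observed data would be unlikely under the null hypothesis; it does not by itself give the probability that the hypothesis is true.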
▪ Bayes & Price (1763), Laplace
(1774)
▪ Probability as a degree of belief
that an event or a proposition is
true
▪ Beliefs (posterior probabilities) updated
as additional data are obtained
▪ Empowered by Markov Chain
Monte Carlo (MCMC) numerical
integration methods (Metropolis
1953; Hastings 1970)
17
https://en.wikipedia.org/wiki/Bayes%27_theorem
https://en.wikipedia.org/wiki/Markov_chain_Monte_Carlo
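The belief-updating idea can be sketched with the simplest conjugate case: a Beta prior on a coin's heads probability, updated with observed flips. This is an illustrative example under assumed names (`update_beta`, `beta_mean`); it has a closed-form posterior, so it needs no MCMC, which becomes useful when no such closed form exists.

```python
def update_beta(alpha, beta, successes, failures):
    """Posterior Beta parameters after observing binomial data."""
    return alpha + successes, beta + failures


def beta_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution."""
    return alpha / (alpha + beta)


# Uniform prior Beta(1, 1): no initial preference about the coin's bias.
prior = (1.0, 1.0)

# Hypothetical data: 7 heads and 3 tails.
posterior = update_beta(*prior, successes=7, failures=3)
posterior_mean = beta_mean(*posterior)

# Updating in two smaller batches gives the same posterior as one big batch.
step1 = update_beta(*prior, successes=4, failures=1)
step2 = update_beta(*step1, successes=3, failures=2)
```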
Key Ingredient II: Optimization
18
▪ Legendre, Gauss (early 1800s)
▪ Find the formula that minimizes
the sum of squared errors
(residuals) analytically
19
https://en.wikipedia.org/wiki/Least_squares
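A minimal sketch of the analytical least-squares solution for a straight line y = a*x + b. The helper names `fit_line` and `sum_squared_errors` and the sample data are assumptions made for illustration.

```python
def fit_line(xs, ys):
    """Closed-form least-squares fit of y = a*x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx              # slope that minimizes the squared error
    b = mean_y - a * mean_x    # intercept follows from the means
    return a, b


def sum_squared_errors(xs, ys, a, b):
    """Sum of squared residuals between the line and the data."""
    return sum((y - (a * x + b)) ** 2 for x, y in zip(xs, ys))


# Hypothetical noisy measurements of a roughly linear relationship.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 7.1, 8.8]
a, b = fit_line(xs, ys)
best_error = sum_squared_errors(xs, ys, a, b)
```

Any other slope or intercept gives a larger sum of squared errors on the same data, which is what "minimizes analytically" means here.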
▪ Find a local minimum of a
function computationally
▪ Gradient descent (Cauchy
1847) and its variants
▪ More than 150 years later,
this is still what modern
AI/ML/DL systems are
essentially doing!!
▪ Error minimization
20
https://commons.wikimedia.org/wiki/File:Gradient_descent.gif
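Gradient descent in its plainest form can be sketched in a few lines. The function `gradient_descent`, the learning rate, and the test function f(x) = (x - 3)^2 are assumptions chosen for illustration; modern variants add momentum, adaptive step sizes, and mini-batches on top of this loop.

```python
def gradient_descent(grad, x0, learning_rate=0.1, steps=100):
    """Repeatedly step against the gradient, starting from x0."""
    x = x0
    for _ in range(steps):
        x = x - learning_rate * grad(x)
    return x


# f(x) = (x - 3)^2 has gradient 2*(x - 3) and its minimum at x = 3.
x_min = gradient_descent(lambda x: 2.0 * (x - 3.0), x0=0.0)
```

On a function with several valleys, the same loop settles into whichever local minimum is downhill from the starting point.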
▪ Extensively studied and used in
Operations Research
▪ Practical optimization algorithms
under various constraints
21
https://en.wikipedia.org/wiki/Linear_programming
https://en.wikipedia.org/wiki/Integer_programming
https://en.wikipedia.org/wiki/Floyd%E2%80%93Warshall_algorithm
▪ Original idea by Turing (1950)
▪ Genetic algorithm (Holland 1975)
▪ Genetic programming (Cramer 1985, Koza 1988)
▪ Differential evolution (Storn & Price 1997)
▪ Neuroevolution (Stanley & Miikkulainen 2002)
22
https://becominghuman.ai/my-new-genetic-algorithm-for-time-series-f7f0df31343d
https://en.wikipedia.org/wiki/Genetic_programming
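A toy genetic algorithm, sketched under assumptions: the "OneMax" task (maximize the number of 1s in a bit string), tournament selection, one-point crossover, and elitism are illustrative choices, not the specific formulation of any work cited above.

```python
import random


def evolve_onemax(n_bits=20, pop_size=30, generations=40,
                  mutation_rate=0.05, seed=0):
    """Evolve bit strings toward all ones; returns (best, best-fitness history)."""
    rng = random.Random(seed)
    fitness = sum  # OneMax: fitness is the number of 1 bits
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    history = [max(map(fitness, pop))]
    for _ in range(generations):
        elite = max(pop, key=fitness)
        children = [elite[:]]  # elitism: the best individual always survives
        while len(children) < pop_size:
            # Tournament selection: the fitter of 3 random individuals wins.
            p1 = max(rng.sample(pop, 3), key=fitness)
            p2 = max(rng.sample(pop, 3), key=fitness)
            cut = rng.randrange(1, n_bits)            # one-point crossover
            child = p1[:cut] + p2[cut:]
            child = [1 - g if rng.random() < mutation_rate else g
                     for g in child]                  # bit-flip mutation
            children.append(child)
        pop = children
        history.append(max(map(fitness, pop)))
    return max(pop, key=fitness), history


best, history = evolve_onemax()
```

Because of elitism, the best fitness in `history` never decreases from one generation to the next.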
▪ Ant colony optimization
(Dorigo 1992)
▪ Particle swarm optimization
(Kennedy & Eberhart 1995)
▪ And various other metaphor-based metaheuristic algorithms
https://en.wikipedia.org/wiki/List_of_metaphor-based_metaheuristics
23
https://en.wikipedia.org/wiki/Ant_colony_optimization_algorithms
https://en.wikipedia.org/wiki/Particle_swarm_optimization
Machine Learning
24
▪ Unsupervised learning
▪ Find patterns in the data
▪ Supervised learning
▪ Find patterns in the input-output mapping
▪ Reinforcement learning
▪ Learn the world by taking actions and receiving
rewards from the environment
25
▪ Clustering
▪ k-means, agglomerative
clustering, DBSCAN,
Gaussian mixture, community
detection, Jarvis-Patrick, etc.
▪ Anomaly detection
▪ Feature
extraction/selection
▪ Dimension reduction
▪ PCA, t-SNE, etc.
26
https://reference.wolfram.com/language/ref/FindClusters.html
https://commons.wikimedia.org/wiki/File:T-SNE_and_PCA.png
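To make the clustering idea concrete, here is a minimal k-means sketch on one-dimensional data. The function name `kmeans_1d`, the fixed iteration count, and the sample points are assumptions for illustration; library implementations handle multiple dimensions, initialization, and convergence checks.

```python
def kmeans_1d(points, centers, iterations=10):
    """Alternate between assigning points to the nearest center
    and moving each center to the mean of its assigned points."""
    centers = list(centers)
    clusters = [[] for _ in centers]
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: abs(p - centers[i]))
            clusters[nearest].append(p)
        # Keep the old center if a cluster ends up empty.
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return centers, clusters


# Two visually obvious groups, with deliberately poor initial centers.
points = [1, 2, 3, 10, 11, 12]
centers, clusters = kmeans_1d(points, centers=[0, 5])
```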
▪ Regression
▪ Linear regression, Lasso, polynomial
regression, nearest neighbors,
decision tree, random forest,
Gaussian process, gradient boosted
trees, neural networks, support vector
machine, etc.
▪ Classification
▪ Logistic regression, decision tree,
gradient boosted trees, naive Bayes,
nearest neighbors, support vector
machine, neural networks, etc.
▪ Risk of overfitting
▪ Addressed by model selection, cross-
validation, etc.
27
https://scikit-learn.org/stable/auto_examples/classification/plot_classifier_comparison.html
https://scikit-learn.org/stable/auto_examples/model_selection/plot_underfitting_overfitting.html
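Cross-validation guards against overfitting by always scoring a model on data it was not trained on. The sketch below only builds the k-fold index splits; `k_fold_indices` is a hypothetical helper written for this example, and in practice a library routine (for instance from scikit-learn) would also shuffle the data.

```python
def k_fold_indices(n_samples, k):
    """Split indices 0..n_samples-1 into k (train, validation) pairs."""
    # Spread any remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    indices = list(range(n_samples))
    folds, start = [], 0
    for size in fold_sizes:
        validation = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        folds.append((train, validation))
        start += size
    return folds


folds = k_fold_indices(10, 3)
```

Each sample is used for validation exactly once, and a model's score is averaged over the k folds.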
▪ Environment typically
formulated as a Markov
decision process (MDP)
▪ State of the world + agent’s
action
→ next state of the world +
reward
▪ Monte Carlo methods
▪ TD learning, Q-learning
28
https://en.wikipedia.org/wiki/Markov_decision_process
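A small Q-learning sketch on a hypothetical 5-state corridor: the agent starts at state 0 and receives a reward of 1 only when it reaches state 4. The environment, the purely random exploration policy, and the parameter values are all assumptions made for illustration.

```python
import random

N_STATES, GOAL = 5, 4
LEFT, RIGHT = 0, 1


def step(state, action):
    """MDP transition: (state, action) -> (next state, reward, done)."""
    nxt = max(state - 1, 0) if action == LEFT else state + 1
    reward = 1.0 if nxt == GOAL else 0.0
    return nxt, reward, nxt == GOAL


def q_learning(episodes=500, alpha=0.5, gamma=0.9, seed=0):
    """Learn action values from random exploration (off-policy)."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            action = rng.choice((LEFT, RIGHT))
            nxt, reward, done = step(state, action)
            # Temporal-difference target: reward plus discounted best next value.
            target = reward if done else reward + gamma * max(q[nxt])
            q[state][action] += alpha * (target - q[state][action])
            state = nxt
    return q


q_table = q_learning()
greedy_policy = [max((LEFT, RIGHT), key=lambda a: q_table[s][a])
                 for s in range(GOAL)]
```

Even though the agent only ever acts randomly while learning, the greedy policy read off the learned table heads straight for the goal.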
Artificial Neural Networks
29
▪ Hopfield (1982)
▪ A.k.a. “attractor networks”
▪ Fully connected networks with
symmetric weights can recover
imprinted patterns from imperfect
initial conditions
▪ “Associative memory”
Input Output
30
https://github.com/nosratullah/hopfieldNeuralNetwork
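The associative-memory behavior can be sketched with a tiny Hopfield network. The Hebbian outer-product rule and synchronous sign updates used here are common textbook choices, assumed for this example; the function names and the two stored patterns are hypothetical.

```python
def train_hopfield(patterns):
    """Hebbian learning: symmetric weights, zero self-connections."""
    n = len(patterns[0])
    weights = [[0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:
                    weights[i][j] += p[i] * p[j]
    return weights


def recall(weights, state, max_steps=10):
    """Update all units from their weighted input until nothing changes."""
    state = list(state)
    for _ in range(max_steps):
        new_state = [
            1 if sum(w * s for w, s in zip(row, state)) >= 0 else -1
            for row in weights
        ]
        if new_state == state:
            break
        state = new_state
    return state


# Two orthogonal +1/-1 patterns to imprint.
pattern_a = [1, 1, 1, 1, -1, -1, -1, -1]
pattern_b = [1, -1, 1, -1, 1, -1, 1, -1]
weights = train_hopfield([pattern_a, pattern_b])

# Corrupt the first bit of pattern_a and let the network repair it.
noisy = [-1] + pattern_a[1:]
recovered = recall(weights, noisy)
```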
▪ Hinton & Sejnowski (1983),
Hinton & Salakhutdinov (2006)
▪ Stochastic, learnable variants
of Hopfield networks
▪ Restricted (bipartite) Boltzmann
machine was at the core of the
HS 2006 Science paper that
ignited the current boom of “Deep
Learning”
31
https://en.wikipedia.org/wiki/Boltzmann_machine
https://en.wikipedia.org/wiki/Restricted_Boltzmann_machine
▪ Multilayer perceptron
(Rosenblatt 1958)
▪ Backpropagation (Werbos
1974; Rumelhart, Hinton &
Williams 1986)
▪ Minimization of errors by
gradient descent method
▪ Note that this is NOT how our
brain learns
▪ “Vanishing gradient” problem
32
Computation
Error correction
Input
Output
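Backpropagation is the chain rule applied layer by layer, followed by a gradient descent step. The sketch below uses a deliberately tiny, hypothetical network y = w2 * tanh(w1 * x) with a squared-error loss; all names and numbers are assumptions made for illustration.

```python
import math


def forward(w1, w2, x):
    """Computation pass: input -> hidden activation -> output."""
    h = math.tanh(w1 * x)
    return h, w2 * h


def loss(w1, w2, x, target):
    return 0.5 * (forward(w1, w2, x)[1] - target) ** 2


def backward(w1, w2, x, target):
    """Error-correction pass: propagate dLoss/dOutput back to each weight."""
    h, y = forward(w1, w2, x)
    dloss_dy = y - target
    grad_w2 = dloss_dy * h                  # output-layer weight
    dloss_dh = dloss_dy * w2                # error sent back to the hidden unit
    grad_w1 = dloss_dh * (1 - h * h) * x    # tanh'(z) = 1 - tanh(z)^2
    return grad_w1, grad_w2


# Train on one hypothetical example by plain gradient descent.
w1, w2, x, target = 0.5, 0.5, 1.0, 0.8
initial_loss = loss(w1, w2, x, target)
for _ in range(200):
    g1, g2 = backward(w1, w2, x, target)
    w1, w2 = w1 - 0.1 * g1, w2 - 0.1 * g2
final_loss = loss(w1, w2, x, target)
```

The factor `1 - h * h` shrinks toward zero when the hidden unit saturates; multiplying many such factors across layers is one source of the vanishing gradient problem.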
▪ Rumelhart, Hinton & Williams
(1986) (again!)
▪ Feed-forward ANNs that try
to reproduce the input
▪ Smaller intermediate layers
→ dimension reduction,
feature learning
▪ HS 2006 Science paper also
used stacked restricted
Boltzmann machines to
pretrain deep autoencoders
33
https://towardsdatascience.com/applied-deep-learning-part-3-autoencoders-1c083af4d798
https://doi.org/10.1126/science.1127647
▪ Hopfield (1982);
Rumelhart, Hinton &
Williams (1986) (again!!)
▪ ANNs that contain
feedback loops
▪ Have internal states and
can, in principle, learn
temporal behaviors with
arbitrarily long-term dependencies
▪ In practice, training suffers
from vanishing or exploding
long-term gradients
34
https://commons.wikimedia.org/wiki/File:Neuronal-Networks-Feedback.png
https://en.wikipedia.org/wiki/Recurrent_neural_network
[Figure: a recurrent network unfolded in time, with hidden states h_{t-1}, h_t, h_{t+1}, outputs o_{t-1}, o_t, o_{t+1}, and weights V]
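A scalar recurrent unit is enough to sketch both the feedback loop and the vanishing long-term gradient. The update rule h_t = tanh(w_x * x_t + w_h * h_{t-1}) is a standard simple RNN, and the parameter names and values are assumptions chosen for illustration.

```python
import math


def rnn_forward(inputs, w_x=1.0, w_h=0.5, v=1.0, h0=0.0):
    """Run a one-unit RNN over a sequence; returns hidden states and outputs."""
    h, hidden, outputs = h0, [], []
    for x in inputs:
        h = math.tanh(w_x * x + w_h * h)   # new state depends on the old state
        hidden.append(h)
        outputs.append(v * h)
    return hidden, outputs


def long_term_gradient(hidden, w_h=0.5):
    """|d h_T / d h_0|: product over time of |w_h * (1 - h_t^2)|."""
    g = 1.0
    for h in hidden:
        g *= abs(w_h * (1.0 - h * h))
    return g


hidden_short, _ = rnn_forward([0.1] * 5)
hidden_long, outputs_long = rnn_forward([0.1] * 20)
grad_short = long_term_gradient(hidden_short)
grad_long = long_term_gradient(hidden_long)
```

With a recurrent weight below 1 in magnitude the product shrinks geometrically with sequence length; with a large recurrent weight it can grow instead, which is the exploding case.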
▪ Hochreiter & Schmidhuber
(1997)
▪ An improved neural module
for RNNs that can learn long-
term dependencies
effectively
▪ Vanishing gradient problem
largely mitigated by gated cell
states and error flow control
▪ “The most cited NN paper of
the 20th century”
35
▪ Actively studied since the 2000s
▪ Use inherent behaviors of
complex dynamical systems
(usually a random RNN) as
a “reservoir” of various
solutions
▪ Learning takes place only at
the readout layer (i.e., no
backpropagation needed)
▪ Discrete-time, continuous-
time versions
36
https://doi.org/10.1515/nanoph-2016-0132
https://doi.org/10.1103/PhysRevLett.120.024102
▪ Self-organizing map (Kohonen 1982)
▪ Neural gas (Martinetz & Schulten 1991)
▪ Spiking neural networks (1990s-)
▪ Hierarchical Temporal Memory (2004-)
etc…
37
https://en.wikipedia.org/wiki/Self-organizing_map
https://doi.org/10.1016/j.neucom.2019.10.104
https://numenta.com/neuroscience-research/sequence-learning/
Deep Learning
38
▪ Ideas have been around since
the beginning of ANNs
▪ Became feasible and popular
in the 2010s because of:
▪ Huge increase in available
computational power thanks
to GPUs
▪ Wide availability of training
data over the Internet
39
https://commons.wikimedia.org/wiki/File:Example_of_a_deep_neural_network.png
https://www.techradar.com/news/computing-components/graphics-cards/best-graphics-cards-1291458
▪ Fukushima (1980), Homma
et al. (1988), LeCun et al.
(1989, 1998)
▪ DNNs with convolution
operations between layers
▪ Layers represent spatial
(and/or temporal) patterns
▪ Many great applications to
image/video/time series
analyses
40
https://towardsdatascience.com/a-comprehensive-guide-to-convolutional-neural-networks-the-eli5-way-3bd2b1164a53
https://cs231n.github.io/convolutional-networks/
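The convolution operation itself is simple to sketch: slide a small kernel over the image and take a weighted sum at each position. The function `conv2d` below is an illustrative helper without padding or stride, and, as in most deep learning libraries, it does not flip the kernel.

```python
def conv2d(image, kernel):
    """'Valid' 2D convolution (cross-correlation) of nested lists."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


# A hypothetical image with a vertical edge between columns 1 and 2.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
edge_kernel = [[-1, 1]]  # responds where brightness increases left to right
feature_map = conv2d(image, edge_kernel)
```

In a CNN the kernel values are not hand-designed like this; they are learned by backpropagation.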
41
https://arxiv.org/abs/1412.6572
https://en.wikipedia.org/wiki/Generative_adversarial_network
▪ Goodfellow et al. (2014a,b)
▪ DNNs are vulnerable
to adversarial attacks
▪ Exploit this to create
co-evolutionary systems of
generator and discriminator
https://commons.wikimedia.org/wiki/File:A-Standard-GAN-and-b-conditional-GAN-architecturpn.png
▪ Scarselli et al. (2008),
Kipf & Welling (2016)
▪ Non-regular graph
structure used as
network topology
within each layer of
DNN
▪ Applications to graph-
based data modeling,
e.g., social networks,
molecular biology, etc.
42
https://tkipf.github.io/graph-convolutional-networks/
https://towardsdatascience.com/how-to-do-deep-learning-on-graphs-with-graph-convolutional-networks-7d2250723780
▪ Vaswani et al. (2017)
▪ DNNs with self-attention
mechanism for natural
language processing (NLP)
▪ Enhanced parallelizability
leading to shorter training time
than LSTM
▪ BERT (2018) for Google search
▪ Massive language models:
OpenAI’s GPT-3 (2020),
Google's Switch Transformer
(2021), etc.
43
https://arxiv.org/abs/1706.03762
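The core of self-attention, scaled dot-product attention, can be sketched without any library. This simplified version omits the learned query/key/value projections and multiple heads of a full Transformer; the function names and the two toy token vectors are assumptions for illustration.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(queries, keys, values):
    """Each output is a weighted average of all values,
    weighted by softmax(query . key / sqrt(d))."""
    d = len(keys[0])
    outputs, all_weights = [], []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in keys]
        weights = softmax(scores)
        out = [sum(w * v[c] for w, v in zip(weights, values))
               for c in range(len(values[0]))]
        outputs.append(out)
        all_weights.append(weights)
    return outputs, all_weights


# Two toy token embeddings used directly as queries, keys, and values.
tokens = [[1.0, 0.0], [0.0, 1.0]]
outputs, weights = self_attention(tokens, tokens, tokens)
```

Every query is processed independently of the others, which is why attention parallelizes well compared with the step-by-step recurrence of an LSTM.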
44
OpenAI GPT-3 / DALL-E
https://www.theguardian.com/commentisfree/2020/sep/08/robot-wrote-this-article-gpt-3
Other Topics and Tools
45
46
Time series analysis
• Autoregression, ARMA/ARIMA, time series
embedding, phase space reconstruction, etc.
Natural language processing (NLP)
• Classic syntactic/semantic approaches
Information theory
• Entropy, mutual information
Computation theory
• Automata, computational complexity
47
Brain/neuroscience, cognitive science
Complex systems and networks
Robotics and control
Consciousness, sentience, self
▪ Python!!
▪ scikit-learn
▪ TensorFlow / Keras
▪ PyTorch
▪ Mathematica, MATLAB
48
Research Examples (of My Own)
49
50
Zamani Esfahlani, F. et al. (2018). A network-based classification framework
for predicting treatment response of schizophrenia patients. Expert Systems
with Applications, 109, 152-161. https://doi.org/10.1016/j.eswa.2018.05.005
Graduate Award for Excellence
in Research (2018)
51
Cao, Y., et al. (2022). Visualizing collective
idea generation and innovation processes in
social networks. IEEE Transactions on
Computational Social Systems.
https://doi.org/10.1109/TCSS.2022.3184628
52
Dong, Y. et al. (2021).
Utterance clustering using
stereo audio channels.
Computational Intelligence
and Neuroscience, 2021,
6151651.
https://doi.org/10.1155/2021/6151651
53
Sayama, H. (2022). Social fragmentation transitions in
large-scale adaptive social network simulations,
Proceedings of the 14th International Conference on
Parallel Processing and Applied Mathematics (PPAM 2022)
/ 7th Workshop on Complex Collective Systems, Springer,
in press. https://arxiv.org/abs/2205.10489
Challenges
54
▪ Words, numbers, facts
▪ Maintaining stability and plasticity
▪ Catastrophic forgetting
▪ Transfer Learning
▪ Application of acquired knowledge
to different problems
55
https://spectrum.ieee.org/openai-dall-e-2
https://www.invistaperforms.org/getting-ahead-forgetting-curve-training/
https://www.analyticsvidhya.com/blog/2021/10/understanding-transfer-learning-for-deep-learning/
56
https://spectrum.ieee.org/openai-dall-e-2
istockphoto.com
57
58
59
https://www.wired.com/story/deepfakes-getting-better-theyre-easy-spot/
60
https://www.analyticsvidhya.com/blog/2022/03/the-carbon-footprint-of-ai-and-deep-learning/
61
Fall 2020: “How to
safely reopen the
campus”
62
63
https://en.wikipedia.org/wiki/Tree_of_life_(biology)
Are We Getting Any Closer to Understanding True “Intelligence”?
64
▪ Don’t drown in the vast
ocean of methods and tools
▪ Hundreds of years of history
▪ Buzzwords and fads keep changing
▪ Keep the big picture in mind –
focus on what the real problem is
and how you will solve it
▪ Being able to develop unique,
original, creative solutions is
key to differentiating your
intelligence from AI/machines
65
▪ Wikipedia, various websites and many AI/ML bloggers
for great info and images!!
▪ The following people for providing feedback on the
initial version:
▪ Sofia Teixeira, Arseny Krasikov, Odai Yousef Dweekat,
Mohammed Jarbou, Seth Bullock, Dobromir Dotov, and
others
66
67
@hirokisayama

A Quick Overview of Artificial Intelligence and Machine Learning (revised version)
