Machine Learning
                                      Homework 2-1
                                 Due Tuesday, May 3, 2005

    1.   Non-linear auto-regression – the Mackey-Glass equation.

         In this exercise you will use a neural net to predict the dynamics of a chaotic time
         series generated by the Mackey-Glass delay differential equation. The data are in
         the files mack2_tr.dat , mack2_val.dat, and mack2_tst.dat on
         the class web page. There is a description of the data in the file mack2.txt. To
         get a look at the behavior of this time series, plot the first column of data.

         For your experiments, use the Netlab toolbox for Matlab. Both Matlab and the
         Netlab package are on the OGI-CSEE and PSU-CS education compute servers.
         The idea behind prediction of univariate time series is that their past behavior
         provides the information needed to predict future behavior. For linear systems
         and models, this is the basis for autoregressive (AR) time series models. For
         nonlinear systems, the notion of predictability from the past is formally captured
         by Takens’ theorem.

         Takens’ theorem tells us that we can represent the dynamical state of the system
         that generated the observed time series x(t) by embedding the time series in a
         vector

                v(t) = [ x(t), x(t − ∆), x(t − 2∆), x(t − 3∆), ..., x(t − (m − 1)∆) ]

         where ∆ is the embedding lag, and m is the embedding dimension.

The data in the files are already embedded for you, using a lag and dimension known to
provide decent results for this data set.

The data consist of five columns. The first four are the lagged values used as inputs;
the last column is the value to be predicted:

                     x(t)  x(t − 6)  x(t − 12)  x(t − 18)  |  x(t + 85)

Thus we use the current value x(t) and three previous values to predict the value 85
steps in the future, x(t + 85).
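The construction of these rows can be sketched in plain Python (the assignment itself uses Matlab/Netlab; the function name `make_rows` and the plain-list representation of the series are illustrative assumptions):

```python
def make_rows(x, lag=6, m=4, horizon=85):
    """Build five-column rows from a series x: inputs
    x(t), x(t-lag), ..., x(t-(m-1)*lag) and target x(t+horizon),
    for every t where all of those indices are valid."""
    rows = []
    for t in range((m - 1) * lag, len(x) - horizon):
        inputs = [x[t - k * lag] for k in range(m)]
        rows.append(inputs + [x[t + horizon]])
    return rows
```

With the lag of 6, embedding dimension 4, and horizon 85 described above, each row matches the five-column layout of the data files.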
Your assignment is to apply several neural nets to this prediction problem and obtain the
best results you can. You are to use mack2_tr.dat to optimize the network
weights. When you have arrived at what you think is your best network architecture
(size, optimization …), then run the test data mack2_tst.dat through your models.
(This is noise-free data, so you will not see overfitting.)

First fit a single linear node (linear AR model) to the problem. You can prepare, train,
and evaluate the one-node linear model using the NetLab functions glm, glmtrain,
and glmfwd. Plot the prediction results (on mack2_tst.dat) and also the data on the
same plot so you can compare the shapes visually. Measure and report the mean square
error on the test data also.
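For intuition, the linear AR fit and its mean square error can be sketched in pure Python via the normal equations (an illustrative stand-in for Netlab's glm, glmtrain, and glmfwd, not the required workflow; the helper names are hypothetical):

```python
def fit_linear(X, y):
    """Ordinary least squares with a bias term, via the normal equations."""
    A = [list(row) + [1.0] for row in X]      # append a bias column
    n = len(A[0])
    # Form G = A^T A and b = A^T y
    G = [[sum(a[i] * a[j] for a in A) for j in range(n)] for i in range(n)]
    b = [sum(a[i] * yi for a, yi in zip(A, y)) for i in range(n)]
    # Solve G w = b by Gaussian elimination with partial pivoting
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(G[r][col]))
        G[col], G[piv] = G[piv], G[col]
        b[col], b[piv] = b[piv], b[col]
        for r in range(col + 1, n):
            f = G[r][col] / G[col][col]
            for c in range(col, n):
                G[r][c] -= f * G[col][c]
            b[r] -= f * b[col]
    w = [0.0] * n
    for i in reversed(range(n)):
        w[i] = (b[i] - sum(G[i][j] * w[j] for j in range(i + 1, n))) / G[i][i]
    return w

def predict(w, X):
    """Apply the fitted weights (last entry is the bias)."""
    return [sum(wi * xi for wi, xi in zip(w, list(row) + [1.0])) for row in X]

def mse(y_true, y_pred):
    return sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true)
```

The mean square error reported for the assignment would be computed exactly this way, just on the Netlab model's test-set predictions.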

Then move on to a network with a single layer of sigmoidal hidden nodes and a linear
output. Experiment with the number of hidden nodes and with different optimizers. Try
the scaled conjugate gradient (scg) and the quasi-Newton (quasinew) optimizers.
You can try quite large networks – up to 50 hidden units – although they will take a bit
longer to train.

As for the linear node, plot the network predictions along with the data. Does the non-
linear neural net do significantly better than the linear model? Report the best result
you’ve achieved on the test data, and compare the plot visually with the linear node
prediction.
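To make the moving parts concrete, here is a toy one-hidden-layer network (tanh hidden units, linear output) trained by plain full-batch gradient descent. This is a pure-Python sketch only; in the assignment you would use Netlab's mlp and netopt with the scg or quasinew optimizers, and every name and hyperparameter below is an illustrative assumption:

```python
import math
import random

def train_mlp(X, y, n_hidden=8, lr=0.2, epochs=2000, seed=0):
    """Train a single-hidden-layer regression MLP by gradient descent
    on mean squared error; returns a predict(x) function."""
    rng = random.Random(seed)
    n_in = len(X[0])
    W1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
    b1 = [0.0] * n_hidden
    W2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
    b2 = 0.0
    n = len(X)
    for _ in range(epochs):
        gW1 = [[0.0] * n_in for _ in range(n_hidden)]
        gb1 = [0.0] * n_hidden
        gW2 = [0.0] * n_hidden
        gb2 = 0.0
        for x, t in zip(X, y):
            # forward pass
            h = [math.tanh(sum(w * xi for w, xi in zip(W1[j], x)) + b1[j])
                 for j in range(n_hidden)]
            out = sum(w * hj for w, hj in zip(W2, h)) + b2
            # backward pass: d(MSE)/d(out), averaged over the batch
            d_out = 2.0 * (out - t) / n
            for j in range(n_hidden):
                gW2[j] += d_out * h[j]
                d_h = d_out * W2[j] * (1.0 - h[j] ** 2)   # tanh derivative
                gb1[j] += d_h
                for i in range(n_in):
                    gW1[j][i] += d_h * x[i]
            gb2 += d_out
        # gradient-descent update
        for j in range(n_hidden):
            W2[j] -= lr * gW2[j]
            b1[j] -= lr * gb1[j]
            for i in range(n_in):
                W1[j][i] -= lr * gW1[j][i]
        b2 -= lr * gb2

    def predict(x):
        h = [math.tanh(sum(w * xi for w, xi in zip(W1[j], x)) + b1[j])
             for j in range(n_hidden)]
        return sum(w * hj for w, hj in zip(W2, h)) + b2
    return predict
```

Plain gradient descent is used here only because it is simple to write down; the scg and quasinew optimizers the assignment asks for converge much faster on the same objective.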



2. Classification – Iris Data

   For this exercise, you will train a neural network to classify the three different iris
   species in the famous Fisher iris data. The data are in the files IrisDev.dat (the
   development data) and IrisTest.dat (the test data). The files contain the input
   features in columns one through four, and in the last three columns, the class of each
   example encoded in a one-of-three representation:

                   T = (0 0 1) for class 1
                   T = (0 1 0) for class 2
                   T = (1 0 0) for class 3

   You will need to construct a network with four inputs and three outputs. Use a
   logistic unit in the output layer. You can compute the classification accuracy with the
   Netlab function confmat, which computes both the overall classification accuracy
   (expressed as percent), and writes out a confusion matrix.

   The rows of a confusion matrix contain the true class labels, while the columns are
   the network assignments. For example, suppose we have a three-class problem and a
   classifier that generates the following confusion matrix:
                 C1     C2     C3
          C1     44      5      1
          C2      1     39     10
          C3      0      9     41

For this example, 44 of 50 C1 examples were correctly classified, 5 were mislabeled
as C2, and 1 was mislabeled as C3. The total number of misclassified examples is the
sum of the off-diagonal elements, i.e. 26 of 150, for an error rate of 0.173.
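The arithmetic confmat performs can be reproduced in a few lines (an illustrative sketch; `confusion_matrix` and `error_rate` are hypothetical helper names, not Netlab functions):

```python
def confusion_matrix(true, pred, n_classes):
    """Rows are true class labels, columns are predicted classes,
    following the convention described above."""
    C = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(true, pred):
        C[t][p] += 1
    return C

def error_rate(C):
    """Fraction misclassified: off-diagonal sum over the total."""
    total = sum(sum(row) for row in C)
    off_diag = total - sum(C[i][i] for i in range(len(C)))
    return off_diag / total
```

Applied to the matrix above, error_rate gives 26/150 ≈ 0.173, matching the worked example.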


Segment the development data set into five segments of 15 examples, and use a 5-
fold cross-validation to pick the network size. Do you see any overfitting? When
you have what you believe is the best network, run the test data through the network
and report the classification accuracy, and include the confusion matrix.
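The fold bookkeeping for this 5-fold scheme might look as follows (a sketch only; the train and evaluate steps inside the loop would be the Netlab mlp/netopt/mlpfwd calls, which are omitted here, and `k_fold_indices` is a hypothetical helper):

```python
def k_fold_indices(n, k):
    """Split indices 0..n-1 into k contiguous folds and yield
    (train, validation) index lists, one pair per fold."""
    fold = n // k
    for i in range(k):
        val = list(range(i * fold, (i + 1) * fold if i < k - 1 else n))
        in_val = set(val)
        train = [j for j in range(n) if j not in in_val]
        yield train, val

# For each candidate network size, average the validation error over the
# five folds and pick the size with the lowest average, e.g.:
#
# for n_hidden in (2, 4, 8, 16):
#     for train_idx, val_idx in k_fold_indices(75, 5):
#         ...train on train_idx, evaluate on val_idx...
```

With 75 development examples this yields exactly the five segments of 15 examples described above.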
