International University of Business Agriculture and Technology
(IUBAT University)
Report
on
“Secondary Memory”
This report is prepared for partial fulfillment of
CSC 347 – Computer Hardware and Maintenance
Prepared For:
Abhijit Saha, Ph.D
Course Instructor
Department of Computer Science and Engineering
IUBAT University
Prepared By:
WGM Group – C5
Spring 2017
Group – C5
Sl. No. | Name | ID | Email | Signature
1
2
3
4
5
6
Submission Date:
Group Photo
Table of Contents
Title Fly
Chapter 1: Introduction
Chapter 2: Secondary memory
2.0. Secondary memory
2.1. Types of Secondary memory
2.2. Some latest secondary memory
2.3. Modern examples by shape
Chapter 3: Floppy disk
3.0. Floppy disk
3.1. How a Floppy Disk Works
3.2. Advantages
3.3. Disadvantages
Chapter 4: Hard Disk Drive
4.0. Hard Disk Drive
4.1. External and Internal hard drives
4.2. Types of Computer Hard Drive
Chapter 5: Optical Memory
5.0. Optical Memory
5.1. CD
5.2. DVD
Chapter 6: External devices of the Secondary Memory
6.0. External devices
6.4. Jazz disk
6.4.1. Some careens when we use Jazz disk
Chapter 7: Flash Memory
7.0. Flash Memory
7.1. Uses Flash technology
7.2. Types of Flash Memory
7.2.1. NOR flash
7.2.2. NAND flash
Chapter 8: Installation
8.0. How to install Hard Drive
8.1. Desktop hard drive
8.2. Laptop Hard Drive
Chapter 9: Comparison among secondary storages
9.1. Hard drive vs magnetic tape
9.2. CD vs DVD
9.3. Floppy drive vs USB drive
Chapter 10: Market Available Secondary Memory
10.1. Top hard disks available in the market
10.2. Market Available USB drives
10.3. Some Optical drive Brands in market
Chapter 11: Conclusion
List of Figures
Figure 4.1. Parts of hard disk
Figure 6.1. Icon for the Zip drive
Figure 7.1. Programming a NOR memory cell
Figure 7.2. Erasing a NOR memory cell
List of Tables
Table I. Speed of a CD-Audio
Table II. CD Physical Specifications
Abstract/Executive Summary
1. Introduction
Network operators and system administrators are interested in the mixture of traffic carried in their
networks for several reasons. Knowledge about traffic composition is valuable for network
planning, accounting, security, and traffic control. Traffic control includes packet scheduling and
intelligent buffer management to provide the quality of service (QoS) needed by applications. It is
necessary to determine to which applications packets belong, but traditional protocol layering
principles restrict the network to processing only the IP packet header.
In Section 2, we review the previous work in traffic classification. Section 3 addresses the
question of useful features and the number of QoS classes. We describe experiments with unsupervised
clustering of real traffic traces to build classification rules. Given the discovered QoS classes,
Section 4 presents experimental evaluation of classification accuracy using k-nearest neighbor
compared to minimum mean distance clustering.
2. Related Work
Research in traffic classification, which avoids payload inspection, has accelerated over the last
five years. It is generally difficult to compare different approaches, because they vary in the
selection of features (some requiring inspection of the packet payload), choice of supervised or
unsupervised classification algorithms, and set of classified traffic classes. The wide range of
previous approaches can be seen in the comprehensive survey by Nguyen and Armitage [1].
Further complicating comparisons between different studies is the fact that classification
performance depends on how the classifier is trained and the test data used to evaluate accuracy.
Unfortunately, a universal set of test traffic data does not exist to allow uniform comparisons of
different classifiers.
A common approach is to classify traffic on the basis of flows instead of individual packets.
Trussell et al. proposed the distribution of packet lengths as a useful feature [2]. McGregor et al.
used a variety of features: packet length statistics, interarrival times, byte counts, connection
duration [3]. Flows with similar features were grouped together using EM
(expectation-maximization) clustering. Having found the clusters representing a set of traffic
classes, the features contributing little were deleted to simplify classification and the clusters were
recomputed with the reduced feature set. EM clustering was also studied by Zander, Nguyen, and
Armitage [4]. Sequential forward selection (SFS) was used to reduce the feature set. The same
authors also tried AutoClass, an unsupervised Bayesian classifier, for cluster formation and SFS for
feature set reduction [5]. …
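The per-flow features mentioned above (packet length statistics, interarrival times, byte counts, connection duration) can be illustrated with a minimal sketch. The function name `flow_features`, the dictionary keys, and the sample packets are hypothetical and chosen only for illustration; the cited studies may define their features differently.

```python
from statistics import mean, pstdev

def flow_features(packets):
    """Summarize one flow.

    packets: list of (timestamp_seconds, length_bytes) tuples in arrival order
    (an assumed input format, not one taken from the cited papers).
    """
    times = [t for t, _ in packets]
    lengths = [n for _, n in packets]
    # Interarrival times are the gaps between consecutive packets.
    gaps = [later - earlier for earlier, later in zip(times, times[1:])]
    return {
        "mean_len": mean(lengths),
        "std_len": pstdev(lengths),
        "mean_gap": mean(gaps) if gaps else 0.0,
        "byte_count": sum(lengths),
        "duration": times[-1] - times[0],
    }

# Made-up flow of four packets.
flow = [(0.0, 60), (0.5, 1500), (1.5, 1500), (2.0, 60)]
features = flow_features(flow)
```

Each flow is reduced to a fixed-length vector, which is the form a clustering or classification algorithm expects as input.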
3. Unsupervised Clustering
3.1 Self-Organizing Map
SOM is trained iteratively. In each training step, one sample vector x from the input data pool is
chosen randomly, and the distances between it and all the SOM codebook vectors are calculated
using some distance measure. The neuron whose codebook vector is closest to the input vector is
called the best-matching unit (BMU), denoted by $m_c$:

$$\|x - m_c\| = \min_i \|x - m_i\| \qquad (1)$$

where $\|\cdot\|$ is the Euclidean distance, and $m_i$ are the codebook vectors.
After finding the BMU, the SOM codebook vectors are updated such that the BMU is moved closer
to the input vector. The topological neighbors of the BMU are treated the same way, so the procedure
moves the BMU and its topological neighbors towards the sample vector. The update rule for the ith
codebook vector is:

$$m_i(n+1) = m_i(n) + r(n)\,h_{ci}(n)\,[x(n) - m_i(n)] \qquad (2)$$

where $n$ is the training iteration number, $x(n)$ is an input vector randomly selected from the input
data set at the $n$th training step, $r(n)$ is the learning rate at the $n$th training step, and $h_{ci}(n)$ is the kernel
function around the BMU $m_c$. The kernel function defines the region of influence that $x$ has on the
map.
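A single training step following equations (1) and (2) might look like the sketch below. The Gaussian form of the kernel, the one-dimensional grid positions, and the fixed `rate` and `sigma` values are assumptions made for illustration; the text does not specify which kernel or schedules were used.

```python
import math

def best_matching_unit(x, codebook):
    """Equation (1): index c of the codebook vector closest to x."""
    return min(range(len(codebook)), key=lambda i: math.dist(x, codebook[i]))

def som_step(x, codebook, positions, rate, sigma):
    """Equation (2): move every codebook vector toward x, weighted by h_ci.

    positions[i] is the grid coordinate of neuron i. The Gaussian kernel
    below is an assumed choice for h_ci, not one stated in the text.
    """
    c = best_matching_unit(x, codebook)
    updated = []
    for m_i, pos_i in zip(codebook, positions):
        grid_dist = math.dist(pos_i, positions[c])
        h_ci = math.exp(-grid_dist ** 2 / (2 * sigma ** 2))
        updated.append([m + rate * h_ci * (xk - m) for m, xk in zip(m_i, x)])
    return c, updated

# Three neurons on a line, with made-up 2-D codebook vectors.
codebook = [[0.0, 0.0], [1.0, 1.0], [0.0, 1.0]]
positions = [(0, 0), (1, 0), (2, 0)]
bmu, new_codebook = som_step([0.9, 0.8], codebook, positions, rate=0.5, sigma=1.0)
```

In a full training run the step would be repeated over many randomly chosen samples, with the learning rate and kernel width typically shrinking as the iteration number grows.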
…
Fig. 1 shows the U-matrix and the component planes for the feature variables. The U-matrix is a
visualization of distance between neurons, where distance is color coded according to the spectrum
shown next to the map. Blue areas represent codebook vectors close to each other in input space,
i.e., clusters.
Fig. 1. U-matrix with 7 components scaled to [0,1]
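One simple way to compute a U-matrix of the kind described above is to average, for each neuron, the distance between its codebook vector and those of its grid neighbors. The sketch below assumes a rectangular grid with four neighbors per neuron and at least two neurons; the map used for Fig. 1 may use a different topology, and the example values are made up.

```python
import math

def u_matrix(grid):
    """grid[r][c] is the codebook vector of the neuron at row r, column c."""
    rows, cols = len(grid), len(grid[0])
    result = []
    for r in range(rows):
        row = []
        for c in range(cols):
            # Distances to the neighbors above, below, left, and right.
            dists = [
                math.dist(grid[r][c], grid[nr][nc])
                for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
                if 0 <= nr < rows and 0 <= nc < cols
            ]
            row.append(sum(dists) / len(dists))
        result.append(row)
    return result

# Made-up 2x3 map: the two left columns are similar, the right column is far away.
grid = [
    [[0.0, 0.0], [0.1, 0.0], [1.0, 1.0]],
    [[0.0, 0.1], [0.1, 0.1], [1.0, 0.9]],
]
u = u_matrix(grid)
```

Small values mark neurons whose neighbors are close in input space (inside a cluster), while large values mark cluster borders.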
3.2 K-Means Clustering
The K-means clustering algorithm starts with a training data set and a given number of clusters K.
The samples in the training data set are assigned to a cluster based on a similarity measurement.
Euclidean distance is generally used to measure the similarity. The K-means algorithm tries to find
an optimal solution by minimizing the square error:

$$E_r = \sum_{i=1}^{K} \sum_{j=1}^{n} \|x_j - c_i\|^2 \qquad (3)$$

where $K$ is the number of clusters, $n$ is the number of training samples, $c_i$ is the center of the $i$th
cluster, and $\|x - c_i\|$ is the Euclidean distance between sample $x$ and the center $c_i$ of the $i$th cluster.
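The assign-and-update loop behind K-means can be sketched as follows. This is a minimal illustration, not the procedure used in the experiments: the initial centers are supplied by the caller, the iteration count is fixed, and the square error assumes the usual reading of equation (3) in which each sample contributes only the distance to its own cluster center.

```python
import math

def assign(samples, centers):
    """Index of the nearest center for each sample (Euclidean distance)."""
    return [
        min(range(len(centers)), key=lambda i: math.dist(x, centers[i]))
        for x in samples
    ]

def square_error(samples, centers, labels):
    """Sum of squared distances from each sample to its assigned center."""
    return sum(math.dist(x, centers[k]) ** 2 for x, k in zip(samples, labels))

def k_means(samples, centers, iterations=10):
    for _ in range(iterations):
        labels = assign(samples, centers)
        new_centers = []
        for i, old in enumerate(centers):
            members = [x for x, k in zip(samples, labels) if k == i]
            if members:
                # New center is the per-dimension mean of the cluster members.
                new_centers.append([sum(col) / len(members) for col in zip(*members)])
            else:
                new_centers.append(old)  # keep an empty cluster's center unchanged
        centers = new_centers
    return centers, assign(samples, centers)

# Made-up data with two obvious groups.
samples = [[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]]
centers, labels = k_means(samples, [[0.0, 0.0], [10.0, 10.0]])
error = square_error(samples, centers, labels)
```

Because the result depends on the starting centers, the algorithm is usually run several times and the solution with the smallest square error is kept.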
…
4. Experimental Classification Results and Analysis
The previous section identified three clusters for QoS classes and features to build up classification
rules through unsupervised learning. In this section, the accuracy of the classification rules is
evaluated experimentally. For classification, we chose the K-nearest neighbor (KNN) algorithm.
Experimental results are compared with the minimum mean distance (MMD) classifier.
The selected application lists for each class and the number of applications in each class are
shown in Table I.
Table I. Applications in each class

| Class | Applications | Total number |
|---|---|---|
| Transactional | 53/TCP, 13/TCP, 111/TCP, … | 112 |
| Interactive | 23/TCP, 21/TCP, 43/TCP, 513/TCP, 514/TCP, 540/TCP, 251/TCP, 1017/TCP, 1019/TCP, 1020/TCP, 1022/TCP, … | 77 |
| Bulk data | 80/TCP, 20/TCP, 25/TCP, 70/TCP, 79/TCP, 81/TCP, 82/TCP, 83/TCP, 84/TCP, 119/TCP, 210/TCP, 8080/TCP, … | 1351 |
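The two classifiers compared in this section can be sketched side by side. The sketch assumes that the MMD classifier assigns a sample to the class whose mean feature vector is nearest, which is one plausible reading of "minimum mean distance"; the two-dimensional feature vectors are made up and are not the experimental data.

```python
import math
from collections import Counter

def knn_classify(x, training, k=3):
    """Majority vote among the k training samples nearest to x.

    training: list of (feature_vector, class_label) pairs.
    """
    nearest = sorted(training, key=lambda item: math.dist(x, item[0]))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def class_means(training):
    """Mean feature vector of each class."""
    sums, counts = {}, Counter()
    for vec, label in training:
        acc = sums.setdefault(label, [0.0] * len(vec))
        for d, v in enumerate(vec):
            acc[d] += v
        counts[label] += 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}

def mmd_classify(x, means):
    """Assumed MMD rule: pick the class whose mean vector is closest to x."""
    return min(means, key=lambda label: math.dist(x, means[label]))

# Hypothetical training set with three samples per QoS class.
training = [
    ([0.10, 0.10], "transactional"), ([0.20, 0.10], "transactional"), ([0.15, 0.20], "transactional"),
    ([0.50, 0.50], "interactive"), ([0.60, 0.50], "interactive"), ([0.55, 0.60], "interactive"),
    ([0.90, 0.90], "bulk data"), ([0.80, 0.90], "bulk data"), ([0.90, 0.80], "bulk data"),
]
sample = [0.58, 0.52]
knn_label = knn_classify(sample, training)
mmd_label = mmd_classify(sample, class_means(training))
```

KNN keeps every training sample and compares against all of them, whereas the mean-based rule stores only one vector per class, so it is cheaper but cannot follow irregular class shapes.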
5. Conclusion
Traffic classification was carried out in two phases. In the first off-line phase, we started with no
assumptions about traffic classes and used the unsupervised SOM and K-means clustering
algorithms to find the structure in the traffic data. The data exploration procedure found three
clusters corresponding to three QoS classes: transactional, interactive, and bulk data transfer. …
In the second classification phase, the accuracy of the KNN classifier was evaluated for test data.
Leave-one-out cross-validation tests showed that this algorithm had a low error rate. The KNN
classifier was found to have an error rate of about 2 percent for the test data, compared to an error
rate of 7 percent for an MMD classifier. KNN is one of the simplest classification algorithms, but not
necessarily the most accurate. Other supervised algorithms, such as back propagation (BP) and
SVM, also have attractive features and should be compared in future work.
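The leave-one-out procedure mentioned above holds out each sample in turn, trains on the rest, and counts how often the held-out sample is misclassified. The sketch below is a generic illustration with a small KNN helper and made-up one-dimensional data; it does not reproduce the reported error rates.

```python
import math
from collections import Counter

def knn_classify(x, training, k=3):
    """Majority vote among the k training samples nearest to x."""
    nearest = sorted(training, key=lambda item: math.dist(x, item[0]))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def leave_one_out_error(data, classify):
    """Fraction of samples misclassified when each is held out in turn."""
    errors = 0
    for i, (vec, label) in enumerate(data):
        rest = data[:i] + data[i + 1:]  # every sample except the held-out one
        if classify(vec, rest) != label:
            errors += 1
    return errors / len(data)

# Hypothetical data: two well-separated classes plus one atypical "b" sample.
data = [
    ([0.0], "a"), ([0.1], "a"), ([0.2], "a"),
    ([1.0], "b"), ([1.1], "b"), ([1.2], "b"),
    ([0.05], "b"),
]
error_rate = leave_one_out_error(data, knn_classify)
```

The same function accepts any classifier with the signature `classify(x, training)`, so other algorithms can be compared on identical splits.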
References
[1] Thuy Nguyen and Grenville Armitage, “A survey of techniques for Internet traffic classification using
machine learning,” IEEE Communications Surveys and Tutorials, vol. 10, no. 4, pp. 56-76, November,
2008.
[2] H. Trussell, A. Nilsson, P. Patel and Y. Wang, “Estimation and detection of network traffic,” in Proc. of
11th Digital Signal Processing Workshop, pp. 246-248, January 12-16, 2004.
[3] Anthony McGregor, Mark Hall, Perry Lorier and James Brunskill, “Flow clustering using machine
learning techniques,” in Proc. of 5th Int. Workshop on Passive and Active Network Measurement,
pp. 205-214, June 2-7, 2004.
[4] Sebastian Zander, Thuy Nguyen and Grenville Armitage, “Self-learning IP traffic classification based
on statistical flow characteristics,” in Proc. of 6th Int. Workshop on Passive and Active Measurement,
pp. 325-328, March 23-27, 2005.
[5] Sebastian Zander, Thuy Nguyen and Grenville Armitage, “Automated traffic classification and
application identification using machine learning,” in Proc. of IEEE Conf. on Local Computer Networks,
pp. 250-257, February 11-12, 2005.
[6] T. M. Cover and J. A. Thomas, Elements of Information Theory, 2nd Edition, Wiley, New York,
2009.
