Machine Learning with WEKA
Eibe Frank
Department of Computer Science,
University of Waikato, New Zealand
➢ WEKA: A Machine Learning Toolkit
➢ The Explorer
  • Classification and Regression
  • Clustering
  • Association Rules
  • Attribute Selection
  • Data Visualization
➢ The Experimenter
➢ The Knowledge Flow GUI
➢ Conclusions
8/3/2022 University of Waikato 2
WEKA: the bird
Copyright: Martin Kramer (mkramer@wxs.nl)
WEKA: the software
➢ Machine learning/data mining software written in Java (distributed under the GNU General Public License)
➢ Used for research, education, and applications
➢ Complements “Data Mining” by Witten & Frank
➢ Main features:
  ✓ Comprehensive set of data pre-processing tools, learning algorithms and evaluation methods
  ✓ Graphical user interfaces (incl. data visualization)
  ✓ Environment for comparing learning algorithms
WEKA: versions
➢ There are several versions of WEKA:
  ✓ WEKA 3.0: “book version”, compatible with the description in the data mining book
  ✓ WEKA 3.2: “GUI version”, adds graphical user interfaces (the book version is command-line only)
  ✓ WEKA 3.3: “development version” with lots of improvements
➢ This talk is based on the latest snapshot of WEKA 3.3 (soon to be WEKA 3.4)
WEKA only deals with “flat” files
@relation heart-disease-simplified
@attribute age numeric
@attribute sex { female, male}
@attribute chest_pain_type { typ_angina, asympt, non_anginal, atyp_angina}
@attribute cholesterol numeric
@attribute exercise_induced_angina { no, yes}
@attribute class { present, not_present}
@data
63,male,typ_angina,233,no,not_present
67,male,asympt,286,yes,present
67,male,asympt,229,yes,present
38,female,non_anginal,?,no,not_present
...
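To make the ARFF layout on this slide concrete, here is a minimal Python sketch that reads such a "flat" file into memory. It is not WEKA's own loader: the function name `parse_arff` is hypothetical, and the sketch assumes only `numeric` and nominal attributes, comma-separated values without quoting, and `?` for missing values.

```python
SAMPLE = """\
@relation heart-disease-simplified
@attribute age numeric
@attribute sex { female, male}
@attribute chest_pain_type { typ_angina, asympt, non_anginal, atyp_angina}
@attribute cholesterol numeric
@attribute exercise_induced_angina { no, yes}
@attribute class { present, not_present}
@data
63,male,typ_angina,233,no,not_present
67,male,asympt,286,yes,present
67,male,asympt,229,yes,present
38,female,non_anginal,?,no,not_present
"""


def parse_arff(text):
    """Parse a minimal ARFF string into (relation, attributes, rows).

    Assumption: no quoted values, sparse rows, string or date attributes.
    """
    relation, attributes, rows = None, [], []
    in_data = False
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("%"):  # skip blanks and comments
            continue
        lower = line.lower()
        if in_data:
            values = [v.strip() for v in line.split(",")]
            row = []
            for (name, kind), value in zip(attributes, values):
                if value == "?":            # missing value
                    row.append(None)
                elif kind == "numeric":
                    row.append(float(value))
                else:                       # nominal: keep the label
                    row.append(value)
            rows.append(row)
        elif lower.startswith("@relation"):
            relation = line.split(None, 1)[1]
        elif lower.startswith("@attribute"):
            _, name, spec = line.split(None, 2)
            if spec.startswith("{"):
                # nominal attribute: store the list of allowed labels
                kind = [v.strip() for v in spec.strip("{}").split(",")]
            else:
                kind = spec.lower()
            attributes.append((name, kind))
        elif lower.startswith("@data"):
            in_data = True
    return relation, attributes, rows


relation, attributes, rows = parse_arff(SAMPLE)
```

Each row comes back as a plain list with one entry per attribute, which is what "flat" means here: one table, no nested or relational structure.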
Explorer: pre-processing the data
➢ Data can be imported from a file in various formats: ARFF, CSV, C4.5, binary
➢ Data can also be read from a URL or from an SQL database (using JDBC)
➢ Pre-processing tools in WEKA are called “filters”
➢ WEKA contains filters for:
  ✓ Discretization, normalization, resampling, attribute selection, transforming and combining attributes, …
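As a rough illustration of what two of these filters do to a numeric attribute, the following Python sketch applies min-max normalization and equal-width discretization to the cholesterol values from the earlier example. The function names are hypothetical and this is not WEKA's filter code; it also assumes there are no missing values.

```python
def normalize(values):
    """Rescale numeric values linearly into the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:                      # constant attribute: avoid dividing by 0
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def discretize_equal_width(values, bins):
    """Map each numeric value to the index of an equal-width bin."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins
    if width == 0:
        return [0 for _ in values]
    # The maximum value falls on the upper edge, so clamp it into the last bin.
    return [min(int((v - lo) / width), bins - 1) for v in values]


cholesterol = [233.0, 286.0, 229.0]
scaled = normalize(cholesterol)
binned = discretize_equal_width(cholesterol, bins=3)
```

Discretization matters later in the talk: schemes that only accept discrete data, such as association rule learning, need numeric attributes converted this way first.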
Explorer: building “classifiers”
➢ Classifiers in WEKA are models for predicting nominal or numeric quantities
➢ Implemented learning schemes include:
  ✓ Decision trees and lists, instance-based classifiers, support vector machines, multi-layer perceptrons, logistic regression, Bayes’ nets, …
➢ “Meta”-classifiers include:
  ✓ Bagging, boosting, stacking, error-correcting output codes, locally weighted learning, …
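To give a feel for the simplest of the schemes listed, an instance-based classifier, here is a minimal 1-nearest-neighbour sketch in Python. It is an illustration only, not WEKA's implementation; the function name is hypothetical, and the training instances reuse the age and cholesterol values from the heart-disease example.

```python
def nearest_neighbour_predict(train, query):
    """Return the label of the training instance closest to `query`.

    `train` is a list of (features, label) pairs with numeric features.
    Squared Euclidean distance is used, which gives the same ranking as
    the Euclidean distance itself.
    """
    best_label, best_dist = None, float("inf")
    for features, label in train:
        dist = sum((a - b) ** 2 for a, b in zip(features, query))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


# (age, cholesterol) -> class
train = [
    ((63.0, 233.0), "not_present"),
    ((67.0, 286.0), "present"),
    ((67.0, 229.0), "present"),
]
first = nearest_neighbour_predict(train, (66.0, 280.0))
second = nearest_neighbour_predict(train, (62.0, 235.0))
```

In practice the attributes would usually be normalized first, since cholesterol otherwise dominates the distance simply because its values are larger.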
Explorer: clustering data
➢ WEKA contains “clusterers” for finding groups of similar instances in a dataset
➢ Implemented schemes are:
  ✓ k-Means, EM, Cobweb, X-means, FarthestFirst
➢ Clusters can be visualized and compared to “true” clusters (if given)
➢ Evaluation is based on log-likelihood if the clustering scheme produces a probability distribution
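The first scheme on the list, k-means, can be sketched in a few lines of Python. This is a simplified illustration rather than WEKA's clusterer: the initial centroids are passed in explicitly (an assumption made to keep the run deterministic), the data points are invented, and the function names are hypothetical.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_means(points, centroids, iterations=10):
    """Alternate between assigning points and recomputing centroids."""
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            idx = min(range(len(centroids)),
                      key=lambda i: sq_dist(p, centroids[i]))
            clusters[idx].append(p)
        # Update step: each centroid moves to the mean of its cluster.
        new_centroids = []
        for cluster, old in zip(clusters, centroids):
            if cluster:
                new_centroids.append(
                    tuple(sum(c) / len(cluster) for c in zip(*cluster)))
            else:
                new_centroids.append(old)  # keep an empty cluster's centroid
        if new_centroids == centroids:     # converged
            break
        centroids = new_centroids
    return centroids, clusters


points = [(1.0, 1.0), (1.5, 2.0), (8.0, 8.0), (9.0, 9.5)]
centroids, clusters = k_means(points, [(1.0, 1.0), (9.0, 9.5)])
```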
Explorer: finding associations
➢ WEKA contains an implementation of the Apriori algorithm for learning association rules
  ✓ Works only with discrete data
➢ Can identify statistical dependencies between groups of attributes:
  ✓ milk, butter ⇒ bread, eggs (with confidence 0.9 and support 2000)
➢ Apriori can compute all rules that have a given minimum support and exceed a given confidence
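The two numbers attached to a rule, support and confidence, can be computed directly. The Python sketch below does this for the milk and butter rule on a handful of invented transactions; it shows only the measures, not the Apriori search itself, and the function names are hypothetical.

```python
def support(transactions, itemset):
    """Number of transactions that contain every item in `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


def confidence(transactions, antecedent, consequent):
    """Fraction of transactions with the antecedent that also contain
    the consequent."""
    covered = support(transactions, antecedent)
    if covered == 0:
        return 0.0
    return support(transactions, antecedent | consequent) / covered


# Invented example data: each transaction is a set of purchased items.
transactions = [
    {"milk", "butter", "bread", "eggs"},
    {"milk", "butter", "bread", "eggs"},
    {"milk", "butter"},
    {"bread", "eggs"},
    {"milk", "bread"},
]
rule_support = support(transactions, {"milk", "butter", "bread", "eggs"})
rule_confidence = confidence(transactions,
                             {"milk", "butter"}, {"bread", "eggs"})
```

Here the rule holds in 2 transactions (its support) out of the 3 that contain milk and butter, so its confidence is 2/3.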
Explorer: attribute selection
➢ Panel that can be used to investigate which (subsets of) attributes are the most predictive ones
➢ Attribute selection methods contain two parts:
  ✓ A search method: best-first, forward selection, random, exhaustive, genetic algorithm, ranking
  ✓ An evaluation method: correlation-based, wrapper, information gain, chi-squared, …
➢ Very flexible: WEKA allows (almost) arbitrary combinations of these two
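One such combination, ranking as the search method with information gain as the evaluation method, might look like the following Python sketch. It is a simplified illustration on nominal attributes, not WEKA's code; the rows are taken from the heart-disease example and the function names are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(rows, attr_index, class_index=-1):
    """Reduction in class entropy obtained by splitting on one attribute."""
    labels = [row[class_index] for row in rows]
    groups = {}
    for row in rows:
        groups.setdefault(row[attr_index], []).append(row[class_index])
    remainder = sum(len(g) / len(rows) * entropy(g)
                    for g in groups.values())
    return entropy(labels) - remainder


# (sex, exercise_induced_angina, class)
rows = [
    ("male", "no", "not_present"),
    ("male", "yes", "present"),
    ("male", "yes", "present"),
    ("female", "no", "not_present"),
]
gains = {name: information_gain(rows, i)
         for i, name in enumerate(["sex", "exercise_induced_angina"])}
# Ranking search: order the attributes by their individual scores.
ranking = sorted(gains, key=gains.get, reverse=True)
```

On these four rows exercise_induced_angina separates the classes perfectly, so it ranks ahead of sex.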
Explorer: data visualization
➢ Visualization very useful in practice: e.g. helps to determine difficulty of the learning problem
➢ WEKA can visualize single attributes (1-d) and pairs of attributes (2-d)
  ✓ To do: rotating 3-d visualizations (Xgobi-style)
➢ Color-coded class values
➢ “Jitter” option to deal with nominal attributes (and to detect “hidden” data points)
➢ “Zoom-in” function
Performing experiments
➢ Experimenter makes it easy to compare the performance of different learning schemes
➢ For classification and regression problems
➢ Results can be written to a file or database
➢ Evaluation options: cross-validation, learning curve, hold-out
➢ Can also iterate over different parameter settings
➢ Significance-testing built in!
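As a sketch of the cross-validation option, the Python code below splits a dataset into k folds and scores a majority-class baseline on each held-out fold. It is not the Experimenter's implementation: the fold assignment is a simple round-robin without shuffling or stratification (an assumption for reproducibility), the labels are invented, and the function names are hypothetical.

```python
from collections import Counter


def k_fold_indices(n, k):
    """Yield (train, test) index lists for k round-robin folds."""
    folds = [list(range(i, n, k)) for i in range(k)]
    for test in folds:
        held_out = set(test)
        train = [i for i in range(n) if i not in held_out]
        yield train, test


def majority_class(train_labels):
    """Baseline 'learner': always predict the most frequent class."""
    return Counter(train_labels).most_common(1)[0][0]


def cross_validate(labels, k):
    """Accuracy of the majority-class baseline under k-fold CV."""
    correct = 0
    for train, test in k_fold_indices(len(labels), k):
        prediction = majority_class([labels[i] for i in train])
        correct += sum(1 for i in test if labels[i] == prediction)
    return correct / len(labels)


labels = ["present"] * 6 + ["not_present"] * 3
accuracy = cross_validate(labels, k=3)
```

Comparing two learning schemes then amounts to running both on the same folds and testing whether the difference in their per-fold scores is significant.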
The Knowledge Flow GUI
➢ New graphical user interface for WEKA
➢ Java-Beans-based interface for setting up and running machine learning experiments
➢ Data sources, classifiers, etc. are beans and can be connected graphically
➢ Data “flows” through components: e.g., “data source” -> “filter” -> “classifier” -> “evaluator”
➢ Layouts can be saved and loaded again later
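The “data source” -> “filter” -> “classifier” -> “evaluator” chain can be mimicked with Python generators, each stage consuming the output of the previous one. This is only an analogy for the flow of data, not the Java Beans mechanism itself; every component below, including the fixed-threshold classifier, is a hypothetical stand-in.

```python
def data_source(rows):
    """Emit instances one at a time."""
    for row in rows:
        yield row


def drop_missing(stream):
    """Filter: pass on only instances without missing values."""
    for row in stream:
        if None not in row:
            yield row


def threshold_classifier(stream, threshold):
    """Toy classifier: predict 'present' when cholesterol exceeds a
    fixed threshold (an arbitrary rule chosen for illustration)."""
    for cholesterol, actual in stream:
        predicted = "present" if cholesterol > threshold else "not_present"
        yield predicted, actual


def evaluator(stream):
    """Consume (predicted, actual) pairs and report accuracy."""
    results = list(stream)
    return sum(1 for p, a in results if p == a) / len(results)


rows = [
    (233.0, "not_present"),
    (286.0, "present"),
    (229.0, "present"),
    (None, "not_present"),
]
# Connect the components in the same order as on the slide.
accuracy = evaluator(
    threshold_classifier(drop_missing(data_source(rows)), threshold=250.0))
```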
Conclusion: try it yourself!
➢ WEKA is available at http://www.cs.waikato.ac.nz/ml/weka
  ✓ Also has a list of projects based on WEKA
➢ WEKA contributors:
Abdelaziz Mahoui, Alexander K. Seewald, Ashraf M. Kibriya, Bernhard
Pfahringer, Brent Martin, Peter Flach, Eibe Frank, Gabi Schmidberger,
Ian H. Witten, J. Lindgren, Janice Boughton, Jason Wells, Len Trigg,
Lucio de Souza Coelho, Malcolm Ware, Mark Hall, Remco Bouckaert,
Richard Kirkby, Shane Butler, Shane Legg, Stuart Inglis, Sylvain Roy,
Tony Voyle, Xin Xu, Yong Wang, Zhihai Wang