Weka: A Short Introduction
Introduction
  • A collection of open-source ML algorithms:
      • pre-processing
      • classifiers
      • clustering
      • association rules
  • Created by researchers at the University of Waikato in New Zealand
  • Software platform: Java-based
What is WEKA
  • Waikato Environment for Knowledge Analysis (WEKA)
  • Developed by the Department of Computer Science, University of Waikato, New Zealand
  • Machine learning/data mining software written in Java (distributed under the GNU General Public License)
  • Used for research, education, and applications
  • http://www.cs.waikato.ac.nz/ml/weka/
Installation
  • Download the software from http://www.cs.waikato.ac.nz/ml/weka/
  • If you are interested in modifying or extending Weka, there is a developer version that includes the source code
  • Set the Weka environment variable for Java (typically, add weka.jar to the Java CLASSPATH)
  • Download some ML data from http://mlearn.ics.uci.edu/MLRepository.html
Main Features
  • 49 data preprocessing tools
  • 76 classification/regression algorithms
  • 8 clustering algorithms
  • 15 attribute/subset evaluators + 10 search algorithms for feature selection
  • 3 algorithms for finding association rules
  • More algorithms are being added
  • The Java source code is available, so the toolkit can be customized
  • Custom extensions and plug-ins can be developed
  • Excellent mailing and discussion lists are available
  • 3 graphical user interfaces:
      • “The Explorer” (exploratory data analysis)
      • “The Experimenter” (experimental environment)
      • “The KnowledgeFlow” (new process-model-inspired interface)
Weka Interfaces
  • Command-line
  • Explorer: preprocessing, attribute selection, learning, visualization
  • Knowledge Flow: visual design of the KDD process; capabilities roughly equal to the Explorer's
  • Experimenter: testing and evaluating machine learning algorithms
WEKA GUI Interface
WEKA Data Format
  • Uses flat text files to describe the data
  • Can work with a wide variety of data files, including its own “.arff” format and C4.5 file formats
  • Data can be imported from a file in various formats: ARFF, CSV, C4.5, binary
  • Data can also be read from a URL or from an SQL database (using JDBC)
WEKA: ARFF file format
    @relation heart-disease-simplified
    @attribute age numeric
    @attribute sex { female, male}
    @attribute chest_pain_type { typ_angina, asympt, non_anginal, atyp_angina}
    @attribute cholesterol numeric
    @attribute exercise_induced_angina { no, yes}
    @attribute class { present, not_present}
    @data
    63,male,typ_angina,233,no,not_present
    67,male,asympt,286,yes,present
    67,male,asympt,229,yes,present
    38,female,non_anginal,?,no,not_present
    ...
  • age and cholesterol are numeric attributes; the other attributes are nominal
  • A “?” in a data row marks a missing value
Attribute-Relation File Format (ARFF)
  • Weka reads ARFF files:
    @relation adult
    @attribute age numeric
    @attribute name string
    @attribute education {College, Masters, Doctorate}
    @attribute class {>50K,<=50K}
    @data
    50,Lisa,College,<=50K
    30,'Martin John',College,<=50K
  • Values that contain spaces, such as 'Martin John', must be quoted
  • Supported attribute types: numeric, nominal, string, date
  • Details at: www.cs.waikato.ac.nz/~ml/weka/arff.html
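To make the file layout concrete, here is a minimal Python sketch of a reader for a small subset of ARFF (the @relation, @attribute, and @data sections, "%" comments, quoted values, and "?" for missing values). It is an illustration only, not Weka's own loader; the names parse_arff, convert, and SAMPLE are hypothetical, and date attributes and sparse rows are not handled.

```python
import csv


def convert(value, attr_type):
    """Turn one raw field into a Python value; '?' means missing."""
    if value == "?":
        return None
    if attr_type == "numeric":
        return float(value)
    return value  # nominal and string values stay as text


def parse_arff(text):
    """Parse a small ARFF subset.

    Returns (relation, attributes, rows). Each attribute is a (name, type)
    pair, where type is 'numeric', 'string', or a list of nominal values.
    """
    relation, attributes, rows = None, [], []
    in_data = False
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("%"):  # skip blank lines and comments
            continue
        keyword = line.lower()
        if in_data:
            # csv copes with quoted values such as 'Martin John'
            fields = next(csv.reader([line], quotechar="'", skipinitialspace=True))
            rows.append([convert(f.strip(), t) for f, (_, t) in zip(fields, attributes)])
        elif keyword.startswith("@relation"):
            relation = line.split(None, 1)[1]
        elif keyword.startswith("@attribute"):
            _, name, spec = line.split(None, 2)
            if spec.startswith("{"):  # nominal: list of allowed values
                spec = [v.strip() for v in spec.strip("{}").split(",")]
            else:
                spec = spec.lower()
            attributes.append((name, spec))
        elif keyword.startswith("@data"):
            in_data = True
    return relation, attributes, rows


# A shortened version of the heart-disease example
SAMPLE = """\
@relation heart-disease-simplified
@attribute age numeric
@attribute sex { female, male}
@attribute cholesterol numeric
@attribute class { present, not_present}
@data
63,male,233,not_present
38,female,?,not_present
"""

relation, attributes, rows = parse_arff(SAMPLE)
print(relation)
print(attributes)
print(rows)
```

The missing cholesterol value in the second row comes back as None, which keeps "unknown" distinct from a numeric zero.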
Weka Explorer
  • What we will use today in Weka:
      • Pre-process: load, analyze, and filter data
      • Visualize: compare pairs of attributes; plot matrices
      • Classify: all algorithms seen in class (Naive Bayes, etc.)
      • Feature selection: forward feature subset selection, etc.
Explorer: pre-processing the data
  • Data can be imported from a file in various formats: ARFF, CSV, C4.5, binary
  • Data can also be read from a URL or from an SQL database using JDBC
  • Pre-processing tools in WEKA are called “filters”
  • WEKA contains filters for discretization, normalization, resampling, attribute selection, attribute combination, …
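As a rough illustration of what two of these filters do, the Python sketch below applies min-max normalization and equal-width discretization to one numeric attribute. It is a simplified stand-in under stated assumptions, not Weka's filter code: the function names normalize and discretize are hypothetical, and missing values (None) are simply passed through.

```python
def normalize(values):
    """Min-max scale numeric values to the range [0, 1]."""
    present = [v for v in values if v is not None]
    lo, hi = min(present), max(present)
    span = hi - lo
    result = []
    for v in values:
        if v is None:
            result.append(None)      # keep missing values missing
        elif span == 0:
            result.append(0.0)       # constant attribute: nothing to scale
        else:
            result.append((v - lo) / span)
    return result


def discretize(values, bins):
    """Equal-width binning: map each value to a bin index in 0..bins-1."""
    present = [v for v in values if v is not None]
    lo, hi = min(present), max(present)
    width = (hi - lo) / bins
    result = []
    for v in values:
        if v is None:
            result.append(None)
        elif width == 0:
            result.append(0)
        else:
            # the maximum value would land in bin `bins`, so clamp it
            result.append(min(int((v - lo) / width), bins - 1))
    return result


# Cholesterol column from the heart-disease example ("?" read as None)
cholesterol = [233, 286, 229, None]
scaled = normalize(cholesterol)
binned = discretize(cholesterol, bins=3)
print(scaled)
print(binned)
```

After discretization the numeric attribute can be treated as a nominal one, which is useful for learning schemes that only accept nominal inputs.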
Explorer: Building classification models
  • “Classifiers” in WEKA are models for predicting nominal or numeric quantities
  • Implemented schemes include decision trees and lists, instance-based classifiers, support vector machines, multi-layer perceptrons, logistic regression, Bayes’ nets, …
  • “Meta”-classifiers include bagging, boosting, stacking, error-correcting output codes, data cleansing, …
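To show what one of the listed scheme families looks like in practice, here is a minimal Python sketch of an instance-based (k-nearest-neighbour) classifier predicting a nominal class. It is a toy illustration, not Weka's implementation; the function name knn_predict is hypothetical, and the training data reuses the age and cholesterol values from the heart-disease example without any scaling.

```python
import math
from collections import Counter


def knn_predict(train, query, k=1):
    """Predict the class of `query` by majority vote of its k nearest neighbours.

    `train` is a list of (feature_vector, label) pairs with numeric features.
    """
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# (age, cholesterol) -> class, taken from the heart-disease example rows
train = [
    ((63, 233), "not_present"),
    ((67, 286), "present"),
    ((67, 229), "present"),
]

print(knn_predict(train, (66, 280)))        # closest to (67, 286)
print(knn_predict(train, (62, 235)))        # closest to (63, 233)
print(knn_predict(train, (62, 235), k=3))   # majority of all three instances
```

An instance-based scheme builds no explicit model: the training data is the model, and all the work happens at prediction time.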
[Screenshot; callouts: load, filter, analyze]
[Screenshot; callout: visualize attributes]
Weka Experimenter
  • If you need to perform many experiments:
      • The Experimenter makes it easy to compare the performance of different learning schemes
      • Results can be written to a file or a database
      • Evaluation options: cross-validation, learning curve, etc.
      • Can also iterate over different parameter settings
      • Significance testing is built in
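The cross-validation option can be sketched in a few lines of Python. This is a minimal illustration of k-fold cross-validation under simplifying assumptions (no stratification, accuracy pooled over all folds), not the Experimenter's code; cross_validate, train_majority, and predict_majority are hypothetical names, and the majority-class baseline is there only to keep the example self-contained.

```python
import random
from collections import Counter


def cross_validate(instances, train_fn, predict_fn, folds=10, seed=1):
    """Return the accuracy pooled over all k folds.

    Each instance is a (features, label) pair. Every instance is used for
    testing exactly once and for training in the other folds.
    """
    data = list(instances)
    random.Random(seed).shuffle(data)   # fixed seed keeps runs repeatable
    correct = 0
    for i in range(folds):
        test = data[i::folds]           # every folds-th instance, offset i
        train = [x for j, x in enumerate(data) if j % folds != i]
        model = train_fn(train)
        correct += sum(predict_fn(model, features) == label
                       for features, label in test)
    return correct / len(data)


def train_majority(train):
    """Baseline learner: remember the most frequent class."""
    return Counter(label for _, label in train).most_common(1)[0][0]


def predict_majority(model, features):
    """Baseline prediction: always answer with the remembered class."""
    return model


# Toy data set: 8 "yes" instances and 2 "no" instances
data = [((i,), "yes") for i in range(8)] + [((i,), "no") for i in range(2)]
accuracy = cross_validate(data, train_majority, predict_majority, folds=5)
print(accuracy)
```

Comparing two learning schemes then amounts to calling cross_validate with the same data and seed but different train and predict functions.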
Visit more self-help tutorials
  • Pick a tutorial of your choice and browse through it at your own pace.
  • The tutorials section is free and self-guiding, and does not involve any additional support.
  • Visit us at www.dataminingtools.net



Editor's Notes

  • #3: Talk about hacking Weka, discretization, and cross-validation.
  • #8: Simple CLI provides a command-line interface to Weka's routines. Explorer provides a graphical front end to Weka's routines and components. Experimenter allows you to build classification experiments. KnowledgeFlow provides an alternative to the Explorer as a graphical front end to Weka's core algorithms.