Principal Component Analysis
PCA is a technique for reducing the dimension of a dataset by identifying the few most influential
parameters (if they exist). It does so by constructing principal components (PCs): linear combinations of the
original variables, ordered by how much of the data's variance each one explains. This sort of variable screening
or feature selection makes it easier to apply other predictive modeling techniques and to interpret the results.
Properties of PCs:
a. They are uncorrelated with each other.
b. The first few PCs cumulatively explain a large share of the variance in the data.
c. Original variables with very low weights (loadings) in the retained principal components can be removed from
the dataset.
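The properties above can be checked on a small case. The following is a minimal sketch, assuming only two variables so that the principal components can be obtained in closed form from the 2x2 covariance matrix. The function name pca_2d and the data values are illustrative assumptions and are not taken from the slides or from WEKA.

```python
import math


def pca_2d(xs, ys):
    """PCA for two variables using the closed-form eigen-decomposition
    of the 2x2 (population) covariance matrix."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    # Center each variable on its mean.
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    # Entries of the covariance matrix [[sxx, sxy], [sxy, syy]].
    sxx = sum(d * d for d in dx) / n
    syy = sum(d * d for d in dy) / n
    sxy = sum(a * b for a, b in zip(dx, dy)) / n
    # Eigenvalues of a symmetric 2x2 matrix from its trace and determinant.
    trace = sxx + syy
    det = sxx * syy - sxy * sxy
    disc = math.sqrt(max(trace * trace / 4 - det, 0.0))
    l1, l2 = trace / 2 + disc, trace / 2 - disc
    # Eigenvector for the largest eigenvalue (the weights of PC1).
    if abs(sxy) > 1e-12:
        v1 = (l1 - syy, sxy)
    elif sxx >= syy:
        v1 = (1.0, 0.0)
    else:
        v1 = (0.0, 1.0)
    norm = math.hypot(v1[0], v1[1])
    v1 = (v1[0] / norm, v1[1] / norm)
    # PC2 is orthogonal to PC1.
    v2 = (-v1[1], v1[0])
    # Project the centered data onto each component.
    scores1 = [a * v1[0] + b * v1[1] for a, b in zip(dx, dy)]
    scores2 = [a * v2[0] + b * v2[1] for a, b in zip(dx, dy)]
    return (l1, l2), (v1, v2), (scores1, scores2)


# Illustrative values only (two strongly related variables).
xs = [2.5, 0.5, 2.2, 1.9, 3.1, 2.3, 2.0, 1.0, 1.5, 1.1]
ys = [2.4, 0.7, 2.9, 2.2, 3.0, 2.7, 1.6, 1.1, 1.6, 0.9]

(l1, l2), (v1, v2), (scores1, scores2) = pca_2d(xs, ys)
print("weights of PC1:", v1)
print("share of variance explained by PC1:", l1 / (l1 + l2))
print("covariance between PC scores:",
      sum(a * b for a, b in zip(scores1, scores2)) / len(xs))
```

The printed covariance between the two sets of scores should be zero up to rounding (property a), and the explained share shows how much of the total variance the first component carries (property b). With more than two variables, the same idea would normally rely on a numerical eigen-solver rather than a closed form.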
Variance measures how far a set of numbers is spread out around its mean; it is always non-negative.
Covariance is the mean of the products of the deviations of two variables from their respective means.
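As a small worked illustration of these two definitions, here is a sketch in plain Python. It assumes the population form (dividing by n, matching "the mean of the products" above) rather than the sample form that divides by n - 1. The function names and the numbers are illustrative assumptions.

```python
def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


def variance(values):
    """Population variance: mean of the squared deviations from the mean."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values) / len(values)


def covariance(xs, ys):
    """Population covariance: mean of the products of the deviations
    of xs and ys from their respective means."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)


# Illustrative values only.
data = [2, 4, 4, 4, 5, 5, 7, 9]
print(variance(data))                      # 4.0
print(covariance([1, 2, 3], [6, 4, 2]))    # negative: ys falls as xs rises
print(covariance(data, data) == variance(data))  # True
```

The last line shows that the covariance of a variable with itself is its variance, which is why the diagonal of a covariance matrix holds the variances.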
Association
http://www.slideshare.net/zafarjcp/data-mining-association-rules-basics
Comparison of Weka and RapidMiner
Both are open source and written in Java.
RapidMiner | Weka
More flexible | Easier to use; apt for beginners
More analysis steps | Faster
Better visualization; suggests fixes |
More algorithms supported |
Disadvantages
Many ETL modules, but difficult to use | Poor connectivity to non-Java-based databases

Implementation of algorithms using WEKA
