K-NEAREST NEIGHBORS
(K-NN)
knn classification
PYTHON
READING DATASET DYNAMICALLY
from tkinter import Tk
from tkinter.filedialog import askopenfilename
root = Tk()
root.withdraw()   # hide the empty main window
root.update()
file_path = askopenfilename()   # path of the file chosen in the dialog
root.destroy()
IMPORTING LIBRARIES
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
IMPORTING DATASET
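dataset = pd.read_csv('Social_Network_Ads.csv')   # or pd.read_csv(file_path) to use the file chosen in the dialog
X = dataset.iloc[:, 2:4]   # columns at positions 2 and 3 as features
y = dataset.iloc[:, -1]    # last column as target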
SPLITTING THE DATASET INTO THE
TRAINING SET AND TEST SET
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)
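As a rough illustration of what this split does, the sketch below shuffles row indices with a fixed seed and holds out 25% of the rows as the test set. It uses only the standard library; the helper name `simple_train_test_split` and the toy rows are hypothetical, and scikit-learn's own shuffling will not produce the same partition.

```python
import random

def simple_train_test_split(rows, labels, test_size=0.25, seed=0):
    """Shuffle indices with a fixed seed, then hold out a fraction as the test set."""
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)
    n_test = int(round(len(rows) * test_size))
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    pick = lambda seq, idx: [seq[i] for i in idx]
    return (pick(rows, train_idx), pick(rows, test_idx),
            pick(labels, train_idx), pick(labels, test_idx))

# Toy data: two numeric features per row and a 0/1 label; the values are made up.
toy_rows = [(19, 19000), (35, 20000), (26, 43000), (27, 57000),
            (45, 76000), (32, 150000), (25, 33000), (47, 25000)]
toy_labels = [0, 0, 0, 0, 1, 1, 0, 1]

X_tr, X_te, y_tr, y_te = simple_train_test_split(toy_rows, toy_labels)
print(len(X_tr), len(X_te))
```

Fixing the seed is what makes the split repeatable from run to run, which is the role `random_state` plays in the scikit-learn call.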
FEATURE SCALING
from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)
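The point of calling `fit_transform` on the training set and only `transform` on the test set is that the test data is scaled with the training mean and standard deviation. A minimal standard-library sketch of that idea for a single column follows; the function names and the numbers are illustrative assumptions, not part of the slides.

```python
import math

def fit_scaler(column):
    """Return the mean and population standard deviation of a training column."""
    mean = sum(column) / len(column)
    variance = sum((v - mean) ** 2 for v in column) / len(column)
    return mean, math.sqrt(variance)

def apply_scaler(column, mean, std):
    """Standardize values with statistics learned elsewhere."""
    return [(v - mean) / std for v in column]

train_ages = [20.0, 30.0, 40.0, 50.0]   # made-up training values
test_ages = [35.0, 60.0]                # made-up test values

age_mean, age_std = fit_scaler(train_ages)                    # learned from training data only
train_scaled = apply_scaler(train_ages, age_mean, age_std)
test_scaled = apply_scaler(test_ages, age_mean, age_std)      # reuses the training statistics
print(age_mean, round(age_std, 3))
print([round(v, 3) for v in test_scaled])
```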
FEATURE SELECTION …
Recursive Feature Elimination (RFE) repeatedly fits a model,
identifies the best- or worst-performing feature, sets that
feature aside, and repeats the process on the remaining features.
In the general formulation this continues until every feature has
been processed, which yields a ranking of the features.
scikit-learn's RFE drops the weakest features and stops once the
requested number remains. The goal of RFE is to select the
features that contribute most to the model.
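The elimination loop described above can be sketched without any library. This is a simplified illustration under stated assumptions: a plain gradient-descent linear fit stands in for the model, the absolute weight stands in for feature importance, and the function names and toy data are hypothetical. It is not how scikit-learn implements RFE internally.

```python
from itertools import product

def fit_linear_weights(X, y, lr=0.1, steps=200):
    """Least-squares weights found by plain gradient descent (no intercept)."""
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for _ in range(steps):
        residuals = [sum(wj * xj for wj, xj in zip(w, row)) - yi
                     for row, yi in zip(X, y)]
        for j in range(d):
            grad = 2.0 / n * sum(r * row[j] for r, row in zip(residuals, X))
            w[j] -= lr * grad
    return w

def recursive_feature_elimination(X, y, n_features_to_select=1):
    """Refit on the remaining features, drop the smallest |weight|, repeat."""
    n_features = len(X[0])
    remaining = list(range(n_features))
    ranking = [1] * n_features              # features that survive keep rank 1
    while len(remaining) > n_features_to_select:
        X_sub = [[row[j] for j in remaining] for row in X]
        w = fit_linear_weights(X_sub, y)
        weakest = min(range(len(remaining)), key=lambda k: abs(w[k]))
        # Features eliminated earlier receive a larger (worse) rank.
        ranking[remaining[weakest]] = len(remaining) - n_features_to_select + 1
        del remaining[weakest]
    support = [j in remaining for j in range(n_features)]
    return support, ranking

# Toy data: the target depends strongly on feature 0, weakly on feature 1,
# and not at all on feature 2.
toy_X = [list(map(float, combo)) for combo in product([-1, 1], repeat=3)]
toy_y = [3.0 * row[0] + 1.0 * row[1] for row in toy_X]

support, ranking = recursive_feature_elimination(toy_X, toy_y, n_features_to_select=1)
print(support)
print(ranking)
```

On this toy data the unused feature is dropped first and the strongest feature is the one that survives.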
FEATURE SELECTION …
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression
logreg = LogisticRegression()
rfe = RFE(logreg, n_features_to_select=2)   # keep the two top-ranked features
rfe = rfe.fit(X, y)
print(rfe.support_)   # boolean mask: True for each selected feature
FITTING CLASSIFIER TO THE
TRAINING SET
from sklearn.neighbors import KNeighborsClassifier
classifier = KNeighborsClassifier(n_neighbors=5, metric='minkowski', p=2)   # p=2 gives Euclidean distance
classifier.fit(X_train, y_train)
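To show what `n_neighbors=5`, `metric='minkowski'` and `p=2` mean in practice, here is a small from-scratch sketch: compute the Minkowski distance from a query point to every training point, take the k closest, and return the majority label. The helper names and the toy points are assumptions for illustration; the scikit-learn classifier uses optimized neighbor search rather than this brute-force loop.

```python
from collections import Counter

def minkowski(a, b, p=2):
    """Minkowski distance; p=2 is Euclidean, p=1 is Manhattan."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)

def knn_predict(train_X, train_y, query, k=5, p=2):
    """Label a query point by majority vote among its k nearest training points."""
    by_distance = sorted(zip(train_X, train_y),
                         key=lambda pair: minkowski(pair[0], query, p))
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

# Toy, already-scaled training points: three of class 0, four of class 1.
toy_train_X = [(-1.0, -1.0), (-0.8, -1.2), (-1.1, -0.7),
               (1.0, 1.0), (0.9, 1.3), (1.2, 0.8), (0.7, 1.1)]
toy_train_y = [0, 0, 0, 1, 1, 1, 1]

print(knn_predict(toy_train_X, toy_train_y, (0.8, 0.9)))          # query near the class-1 cluster
print(knn_predict(toy_train_X, toy_train_y, (-0.9, -1.0), k=3))   # query near the class-0 cluster
```

Because the vote depends on raw distances, features on very different scales would dominate the result, which is why the feature scaling step comes first.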
CONFUSION MATRIX
from sklearn.metrics import confusion_matrix
# y_pred comes from classifier.predict(X_test) in the PREDICTION step; run that step first
cm = confusion_matrix(y_test, y_pred)
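For a binary problem the confusion matrix is just a 2x2 table of counts. The sketch below builds one by hand with rows for the true class and columns for the predicted class, the same layout `confusion_matrix` uses. The function name and the toy label lists are made up for illustration.

```python
def binary_confusion_matrix(true_labels, predicted_labels):
    """Rows index the true class, columns the predicted class (labels 0 and 1)."""
    cm = [[0, 0], [0, 0]]
    for t, p in zip(true_labels, predicted_labels):
        cm[t][p] += 1
    return cm

toy_true = [0, 0, 0, 1, 1, 1, 1, 0]
toy_pred = [0, 1, 0, 1, 0, 1, 1, 0]

toy_cm = binary_confusion_matrix(toy_true, toy_pred)
# toy_cm[0][0] = true negatives, toy_cm[0][1] = false positives,
# toy_cm[1][0] = false negatives, toy_cm[1][1] = true positives
accuracy = (toy_cm[0][0] + toy_cm[1][1]) / len(toy_true)
print(toy_cm, accuracy)
```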
K FOLD
from sklearn.model_selection import KFold, cross_val_score
kfold = KFold(n_splits=10, shuffle=True, random_state=7)   # random_state only takes effect with shuffle=True
modelCV = KNeighborsClassifier()
scoring = 'accuracy'
results = cross_val_score(modelCV, X_train, y_train, cv=kfold, scoring=scoring)
print("10-fold cross validation average accuracy: %.3f" % (results.mean()))
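The idea behind k-fold cross-validation is that the samples are partitioned into k folds and each fold serves once as the validation set while the rest is used for training. A minimal sketch of that index bookkeeping follows; the generator name `k_fold_indices` is hypothetical, and the shuffling will not reproduce the exact folds that scikit-learn draws.

```python
import random

def k_fold_indices(n_samples, n_splits=10, seed=7):
    """Yield (train_idx, val_idx) pairs; every sample is validated exactly once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Spread the remainder over the first folds so fold sizes differ by at most one.
    base, extra = divmod(n_samples, n_splits)
    start = 0
    for fold in range(n_splits):
        size = base + (1 if fold < extra else 0)
        val_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        start += size
        yield train_idx, val_idx

folds = list(k_fold_indices(23, n_splits=5))
print([len(val_idx) for _, val_idx in folds])
```

Averaging the score over all folds, as the `results.mean()` call does, gives an estimate that depends less on one particular split.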
PREDICTION
y_pred = classifier.predict(X_test)
EVALUATING CLASSIFICATION
REPORT
from sklearn.metrics import classification_report
print(classification_report(y_test, y_pred))
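The report lists precision, recall and F1 score per class. As a hedged illustration of how those three numbers are defined, the sketch below computes them by hand for one class; the function name and the toy labels are assumptions, and the report itself also includes support counts and averages that are not reproduced here.

```python
def precision_recall_f1(true_labels, predicted_labels, positive=1):
    """Precision, recall and F1 for one class treated as the positive class."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

report_true = [0, 0, 0, 1, 1, 1, 1, 0, 1, 0]
report_pred = [0, 1, 0, 1, 0, 1, 1, 0, 1, 0]

for label in (0, 1):
    p, r, f = precision_recall_f1(report_true, report_pred, positive=label)
    print(label, round(p, 2), round(r, 2), round(f, 2))
```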
R
READ DATASET
library(readr)
dataset <- read_csv("D:/machine learning AZ/Machine Learning A-Z Template Folder/Part 3 - Classification/Section 14 - Logistic Regression/Logistic_Regression/Social_Network_Ads.csv")
dataset = dataset[3:5]
ENCODING THE TARGET FEATURE
AS FACTOR
dataset$Purchased = factor(dataset$Purchased, levels = c(0, 1))
SPLITTING THE DATASET INTO THE
TRAINING SET AND TEST SET
# install.packages('caTools')
library(caTools)
set.seed(123)
split = sample.split(dataset$Purchased, SplitRatio = 0.75)
training_set = subset(dataset, split == TRUE)
test_set = subset(dataset, split == FALSE)
FEATURE SCALING
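train_scaled = scale(training_set[-3])
training_set[-3] = train_scaled
# reuse the training means and standard deviations for the test set
test_set[-3] = scale(test_set[-3], center = attr(train_scaled, "scaled:center"), scale = attr(train_scaled, "scaled:scale"))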
FITTING K-NN TO THE TRAINING SET &
PREDICTION
library(class)
y_pred = knn(train = training_set[, c(1,2)],
test = test_set[, c(1,2)],
cl = training_set$Purchased,
k = 5,
prob = TRUE)
CONFUSION MATRIX
cm = table(test_set[, 3], y_pred)

