What is Machine Learning?
Soumya Mukherjee
Md Shamimuddin
https://www.linkedin.com/in/soumyarmukherjee/
https://www.linkedin.com/in/mdshamimuddin/
Agenda
• Overview of AI and ML
• Terminology awareness
• Applications in real world
• Use cases within Nokia
• Types of Learning
• Regression
• Classification
• Clustering
• Linear Regression Single Variable with Python
Machine Learning Definition
• Arthur Samuel (1959)
Machine Learning: Field of study that gives computers the ability to learn without being explicitly programmed.
• Tom Mitchell (1998)
A computer program is said to learn from experience E with respect to some task T and some performance measure P, if its performance on T, as measured by P, improves with experience E.
Artificial Intelligence Vs Machine Learning Vs Deep Learning
Terminology Awareness
Big Data
Implies huge data volumes that cannot be processed effectively with traditional applications. Big Data processing begins with raw data that is not aggregated, and it is often impossible to store such data in the memory of a single computer.
Data Mining
Is about using statistics as well as other programming methods to find patterns hidden in the data so that you can explain some phenomenon. Machine Learning uses Data Mining techniques and other learning algorithms to build models of what is happening behind some data.
Machine Learning
Is an artificial intelligence technique that is broadly used in Data Mining. ML uses a training dataset to build a model that can predict values of target variables. Data Mining uses the predictive force of Machine Learning by applying various ML algorithms on Big Data.
WHAT IS ARTIFICIAL INTELLIGENCE
• Artificial intelligence (AI) is an area of computer science that emphasizes the creation of intelligent machines that work and react like humans. Some of the activities computers with artificial intelligence are designed for include:
• Knowledge gain
• Reasoning
• Problem solving
• Learning
Artificial Intelligence
Machine Learning
Supervised Unsupervised Reinforcement
Types of Learning
Supervised Learning
Target/outcome variable to be predicted from a set of predictors is known at the training phase.
E.g. Regression, Decision Tree, Random Forest, KNN
Unsupervised Learning
Target/outcome variable to be predicted from a set of predictors is unknown at the training phase.
E.g. Clustering (K-means), association rule mining (Apriori)
Reinforcement Learning
Machine is trained to take specific decisions. It is exposed to an environment where it trains itself continually using trial and error.
E.g. Markov Decision Process
Applications in real world
• Google search engine
• Self driving cars
• Facebook auto tagging
• Netflix movie recommendation
• Amazon product recommendation
• Healthcare diagnosis
• Speech recognition
• StackOverflow QA tagging
• Chatbot
Pipeline solving ML Problem
1. Data as input (text files, spreadsheets, SQL databases)
2. Feature Engineering (removing unwanted data, handling missing values, normalization or standardization)
3. Algorithm
4. Output/Model
Pipeline in solving ML Problem
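The feature engineering stage of the pipeline can be sketched in plain Python. This is a minimal illustration rather than the presenters' code: the toy values, the choice of mean imputation for the missing entry, and the helper names `impute_mean` and `standardize` are all assumptions made for the example.

```python
from statistics import mean, pstdev

# Step 1: data as input (hypothetical toy values; None marks a missing value).
areas = [2600.0, 3000.0, None, 3600.0, 4000.0]


def impute_mean(values):
    """Step 2a: replace missing entries with the mean of the observed ones."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]


def standardize(values):
    """Step 2b: rescale to zero mean and unit (population) standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    return [(v - mu) / sigma for v in values]


cleaned = impute_mean(areas)
features = standardize(cleaned)  # ready to hand to a learning algorithm
print(cleaned)
print(features)
```

Standardization is used here only as one of the two scaling options the slide names; min-max normalization would slot into the same place.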
Data Exploration/Feature Engineering
1. Variable Identification
• Predictor(s) and Target
• Type and Category of variable
2. Univariate Analysis
• Central tendency
• Measure of Dispersion
• Visualization Method
• Frequency table(categorical)
3. Bivariate Analysis
• Relation between 2 variables
• Correlation
• Chi-square test
• Z-test
4. Missing Value Treatment
• Deletion
• Imputation
• Prediction Model
• KNN Imputation
5. Outlier Handling
Detection
• Very important to handle outliers
• Visualization techniques like box plot, scatter plot, histogram
• Any value below Q1 - 1.5*IQR or above Q3 + 1.5*IQR is an outlier, where IQR = Q3 - Q1
Treatment
• Remove
• Scale or Normalize
• Transform
• Impute
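The IQR rule for outlier detection can be sketched with the standard library. This is an illustrative example, not code from the deck: the function name `iqr_outliers`, the multiplier parameter `k`, and the sample values are assumptions.

```python
from statistics import quantiles


def iqr_outliers(values, k=1.5):
    """Return the values lying outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _median, q3 = quantiles(values, n=4)  # three quartile cut points
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]


# Hypothetical sample with one obviously extreme value.
sample = [10, 12, 12, 13, 12, 11, 14, 13, 15, 102]
print(iqr_outliers(sample))
```

Quartile conventions differ slightly between tools, so the exact fences may not match what a box plot in another package draws; clearly extreme points are flagged either way.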
SUPERVISED LEARNING
• Supervised learning is used whenever we want to predict a certain outcome from
a given input, and we have examples of input/output pairs.
• We build a machine learning model from these input/output pairs, which
comprise our training set.
• Our goal is to make accurate predictions for new, never-before-seen data.
• Supervised learning often requires human effort to build the training set, but
afterward automates and often speeds up an otherwise laborious or infeasible
task.
TYPES OF SUPERVISED MODEL
• Regression
• Regression is the process of predicting a continuous value.
• Classification
• Classification predicts a class label, which is a choice from a predefined list of possibilities.
CLASSIFICATION
• Binary Classification : Distinguishing between exactly two classes
• Multiclass classification : Classification between more than two classes.
Types of regression
1. Simple Linear Regression
Single predictor + single target
y = m*x + c
2. Multiple Linear Regression
Multiple predictors + single target
y = m1*x1 + m2*x2 + c
3. Polynomial Regression
One or many predictors + single target
y = mn*x^n + … + m2*x^2 + m1*x + c
4. Stepwise Regression
Useful in case of multiple predictors
Add or Remove predictors as needed
Forward selection
Backward elimination
5. Lasso Regression
6. Ridge Regression
7. ElasticNet Regression
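The first three equations in the list differ only in how the predictors enter the sum. A small sketch, with hypothetical function names and coefficients chosen only to keep the arithmetic easy to follow:

```python
def predict_simple(x, m, c):
    """Simple linear regression: y = m*x + c."""
    return m * x + c


def predict_multiple(xs, ms, c):
    """Multiple linear regression: y = m1*x1 + m2*x2 + ... + c."""
    return sum(m * x for m, x in zip(ms, xs)) + c


def predict_polynomial(x, ms, c):
    """Polynomial regression in one predictor: y = m1*x + m2*x^2 + ... + c.

    ms[0] multiplies x, ms[1] multiplies x^2, and so on.
    """
    return sum(m * x ** (power + 1) for power, m in enumerate(ms)) + c


print(predict_simple(3, m=2, c=1))               # 2*3 + 1
print(predict_multiple([3, 4], ms=[2, 5], c=1))  # 2*3 + 5*4 + 1
print(predict_polynomial(3, ms=[2, 5], c=1))     # 2*3 + 5*9 + 1
```

In all three cases the model is linear in its coefficients, which is why the same least-squares fitting idea applies to each.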
Simple Linear Regression
• Single predictor and single target
• Y = b0 + b1*X
• Minimum sum squared error
• Standard packages are already available
• Formula
• Programming example
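The "Formula" bullet most likely refers to the standard closed-form least-squares estimates; the slide's own formula did not survive extraction, so the block below is the textbook result written in the slide's b0/b1 notation, not a transcription. Here x̄ and ȳ denote the sample means of X and Y over n observations.

```latex
\begin{aligned}
\mathrm{SSE}(b_0, b_1) &= \sum_{i=1}^{n} \bigl(y_i - (b_0 + b_1 x_i)\bigr)^2 \\
b_1 &= \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2} \\
b_0 &= \bar{y} - b_1 \bar{x}
\end{aligned}
```

These values of b0 and b1 are the ones that minimize the sum of squared errors.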
Classification
• Type of supervised learning
• Output or target is a categorical outcome
Example
• Mail: spam or not spam
• Weather: rainy, sunny, humid
• Stock price: up or down
Predictor(s) → Algorithm → Categorical Target
Types of Classification
1. K-nearest Neighbor Classifier
2. Logistic Regression
3. Naïve Bayes Classifier
4. Decision Tree Classifier
5. Random Forest Classifier
6. Support Vector Machine
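As an illustration of the first classifier in the list, here is a minimal k-nearest-neighbor sketch. The function name `knn_predict`, the two-feature toy points, and the class labels are assumptions made for the example; in practice a library implementation would normally be used.

```python
from collections import Counter
from math import dist


def knn_predict(train_points, train_labels, query, k=3):
    """Label `query` by majority vote among its k nearest training points."""
    # Sort training indices by Euclidean distance to the query point.
    order = sorted(range(len(train_points)),
                   key=lambda i: dist(train_points[i], query))
    nearest_labels = [train_labels[i] for i in order[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Hypothetical two-feature training set with two classes.
points = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.5), (6.0, 6.0), (6.5, 7.0), (7.0, 6.5)]
labels = ["spam", "spam", "spam", "not spam", "not spam", "not spam"]

print(knn_predict(points, labels, (1.8, 1.6)))
print(knn_predict(points, labels, (6.4, 6.2)))
```

The target here is categorical, which is what distinguishes this from the regression examples.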
Clustering (Unsupervised learning)
[Figure: data points grouped into Cluster 1, Cluster 2 and Cluster 3]
Unsupervised learning
• Unsupervised learning is the training of a machine using information that is neither classified nor labelled.
For instance, given an image containing both dogs and cats that the machine has never seen before, the machine tries to find patterns based on the shape of the head, ears, body structure, etc.
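Clustering is the usual way to find such patterns without labels. The sketch below is a bare-bones k-means on hypothetical 2-D points; the function name `kmeans`, the data, and the fixed starting centroids (chosen so the run is repeatable) are all assumptions for illustration.

```python
from math import dist


def kmeans(points, centroids, iterations=10):
    """Plain k-means: alternate assignment and centroid-update steps."""
    centroids = list(centroids)
    assignments = []
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        assignments = [
            min(range(len(centroids)), key=lambda c: dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its assigned points.
        for c in range(len(centroids)):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[c] = tuple(sum(axis) / len(members)
                                     for axis in zip(*members))
    return assignments, centroids


# Hypothetical unlabelled 2-D points forming two visible groups.
data = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (8.0, 8.0), (8.0, 9.0), (9.0, 8.0)]
cluster_ids, centers = kmeans(data, centroids=[(0.0, 0.0), (10.0, 10.0)])
print(cluster_ids)
print(centers)
```

No target variable is supplied anywhere; the grouping comes only from distances between the points.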
Reinforcement Learning
• Reinforcement learning (RL) is an area of machine learning concerned with
how software agents ought to take actions in an environment so as to maximize some
notion of cumulative reward. (source : Wikipedia)
E.g.: you go near a fire and it is warm: positive reinforcement.
You touch the fire and it burns your hand: negative reinforcement, so you learn not to touch fire.
• Approaches used in RL include Monte Carlo methods, Q-learning, etc., typically with the problem modelled as a Markov Decision Process.
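To make the reward idea concrete, here is a sketch of a single tabular Q-learning update applied to the fire example. The state and action names, the reward values, and the learning-rate and discount parameters (`alpha`, `gamma`) are hypothetical choices for illustration.

```python
def q_update(q, state, action, reward, next_state, alpha=0.5, gamma=0.9):
    """Apply one tabular Q-learning update in place and return the new value.

    Q(s, a) <- Q(s, a) + alpha * (reward + gamma * max_a' Q(s', a') - Q(s, a))
    """
    best_next = max(q[next_state].values())
    q[state][action] += alpha * (reward + gamma * best_next - q[state][action])
    return q[state][action]


# Hypothetical two-state world; all action values start at 0.
q_table = {
    "near_fire": {"stay": 0.0, "touch": 0.0},
    "burned": {"stay": 0.0, "touch": 0.0},
}

# Staying near the fire is warm (+1); touching it burns (-10).
q_update(q_table, "near_fire", "stay", reward=1.0, next_state="near_fire")
q_update(q_table, "near_fire", "touch", reward=-10.0, next_state="burned")
print(q_table["near_fire"])
```

After these two experiences the value of "stay" is higher than the value of "touch", so an agent that picks the highest-valued action has learned not to touch the fire.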
Tools And Packages
ML in Python:
• NumPy
• Pandas
• Scikit-learn
• Matplotlib
• Seaborn
Non-Programming:
• Weka
• Orange
• RapidMiner
• Qlik Sense
• xls
Deep Learning:
• TensorFlow
• Keras
• PyTorch
• Theano
LINEAR REGRESSION
SINGLE VARIABLE
LINEAR REGRESSION
• Linear regression, or ordinary least squares (OLS), is the simplest and most classic linear method for regression. Linear regression finds the slope m and intercept b (the m and c of y = m*x + c) that minimize the mean squared error between predictions and the true regression targets, y, on the training set.
HOME PRICES
area price
2600 550000
3000 565000
3200 610000
3600 680000
4000 725000
Given these home prices, find out the price of homes whose area is
• 3300 square feet
• 5000 square feet
SCATTER PLOT
BEST FIT LINE.
PREDICT HOME PRICES FOR A GIVEN AREA
PREDICT HOME PRICES FOR A GIVEN AREA (CONT.)
PREDICT HOME PRICES FOR A GIVEN AREA (CONT.)
SLOPE-INTERCEPT EQUATION OF A STRAIGHT LINE
PROGRAM IN PYTHON
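The program on this slide was an image and did not survive extraction. The sketch below is a reconstruction under assumptions: it uses pandas and scikit-learn's `LinearRegression`, and the names `df` and `model` follow the plotting snippets in the editor's notes, but the original code may have differed.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression

# Home prices from the table above (area in square feet).
df = pd.DataFrame({
    "area": [2600, 3000, 3200, 3600, 4000],
    "price": [550000, 565000, 610000, 680000, 725000],
})

# Fit price = m * area + b by ordinary least squares.
model = LinearRegression()
model.fit(df[["area"]], df["price"])

slope = model.coef_[0]
intercept = model.intercept_

# Predict for the two areas asked about.
new_areas = pd.DataFrame({"area": [3300, 5000]})
predictions = model.predict(new_areas)

print(f"slope m = {slope:.2f}, intercept b = {intercept:.2f}")
for area, price in zip(new_areas["area"], predictions):
    print(f"area {area} sq ft -> predicted price {price:.2f}")
```

On this data the fitted slope is about 135.79 and the intercept about 180,616.44, giving predictions of roughly 628,716 for 3300 square feet and 859,555 for 5000 square feet. The second prediction lies outside the range of areas in the training data, so it should be read with more caution.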
EVALUATING MODEL PERFORMANCE
• The performance of a regression model can be understood from the error of the predictions made by the model. You can also measure the performance by how well your regression line fits the dataset.
• Let's try to understand how to measure the performance of regression models.
• A good regression model is one where the difference between the actual or observed values and the predicted values is small and unbiased for the train, validation and test data sets.
EVALUATING MODEL PERFORMANCE
• To measure the performance of your regression model, some statistical metrics are used. They are:
• Mean Absolute Error (MAE)
• Root Mean Square Error (RMSE)
• Coefficient of determination or R^2
• Adjusted R^2
MEAN ABSOLUTE ERROR(MAE)
• This is the simplest of all the metrics. It is measured by taking the average of the absolute
difference between actual values and the predictions.
MEAN ABSOLUTE ERROR (MAE)
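Written out, MAE = (1/n) * Σ |actual_i - predicted_i|. A minimal sketch with hypothetical values (the function name is an assumption; scikit-learn provides an equivalent metric):

```python
def mean_absolute_error(actual, predicted):
    """Average of |actual - predicted| over all observations."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical values: the absolute errors are 1, 0 and 2.
print(mean_absolute_error([3, 5, 7], [2, 5, 9]))  # (1 + 0 + 2) / 3
```

Because every error counts in proportion to its size, MAE is expressed in the same units as the target.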
ROOT MEAN SQUARE ERROR(RMSE)
• The Root Mean Square Error is measured
by taking the square root of the average
of the squared difference between the
prediction and the actual value.
• It represents the sample standard deviation of the differences between predicted values and observed values (these differences are called residuals). It is calculated using the following formula:
RMSE = sqrt( (1/n) * Σ (predicted_i - actual_i)^2 )
ROOT MEAN SQUARE ERROR(RMSE)
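The same definition as a short function, using hypothetical values; the name `root_mean_square_error` is an assumption for illustration.

```python
from math import sqrt


def root_mean_square_error(actual, predicted):
    """Square root of the mean squared difference between the two series."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return sqrt(sum(squared) / len(squared))


# Hypothetical values: the squared errors are 1, 0 and 4.
print(root_mean_square_error([3, 5, 7], [2, 5, 9]))  # sqrt((1 + 0 + 4) / 3)
```

Squaring before averaging makes large errors weigh more heavily than they do in MAE, so RMSE is at least as large as MAE on the same predictions.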
COEFFICIENT OF DETERMINATION OR R^2
• It measures how well the actual outcomes are replicated by the regression line.
• It helps you to understand how much of the variance in the target is explained by the independent variable(s) in your model.
• In other words, it tells you how good your model is for a dataset.
• The mathematical representation for R^2 is
R^2 = 1 - SSR / SST
Here, SSR = Sum of Squared Residuals (the sum of squared differences between the actual and the predicted values)
SST = Total Sum of Squares (the sum of squared differences between the actual values and their average)
COEFFICIENT OF DETERMINATION OR R^2 (CONT.)
• Here the green line represents the regression line and the red line represents the average line. The differences of the data points from these lines are used in the equation.
• Usually, the value of R^2 lies between 0 and 1 (it can be negative if the regression line somehow has a worse fit than the average!). The closer its value is to one, the better your model is. This is because a regression line that fits the dataset well leaves the data points with low variance around it, which lessens the sum of squared residuals. Hence, the value of R^2 gets closer to one.
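A small sketch of the calculation, taking SSR as the sum of squared differences between actual and predicted values and SST as the sum of squared differences between actual values and their average, so that R^2 = 1 - SSR/SST. The function name and the toy values are assumptions.

```python
def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SSR / SST."""
    average = sum(actual) / len(actual)
    ssr = sum((a - p) ** 2 for a, p in zip(actual, predicted))  # residuals
    sst = sum((a - average) ** 2 for a in actual)               # total
    return 1 - ssr / sst


actual = [3, 5, 7]
print(r_squared(actual, [2, 5, 9]))  # 1 - 5/8
print(r_squared(actual, [5, 5, 5]))  # always predicting the average
print(r_squared(actual, [7, 5, 3]))  # worse than predicting the average
```

The three calls give 0.375, 0.0 and -3.0, which shows the three regimes described above: a partly useful fit, a fit no better than the average line, and a fit worse than the average line.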
THANK YOU

Machine learning and linear regression programming


Editor's Notes

  • #16: The classification approach can be thought of as a means of categorizing or "classifying" some unknown items into a discrete set of "classes."
  • #30: plt.scatter(df['area'],df['price'] , marker = '*', color = 'red')
  • #31: plt.xlabel('area'); plt.ylabel('price'); plt.scatter(df['area'], df['price'], marker='*', color='red'); plt.plot(df['area'], model.predict(df[['area']]))