Ml ppt at

PYTHON WITH
MACHINE LEARNING
ACRANTON TECHNOLOGIES PVT LTD

CONTENT OF THE INTERNSHIP
• Introduction
• Linear Regression with One Variable & Python
functions Programming
• Linear Regression with Multiple Variables
• Logistic Regression
• Support Vector Machines / Unsupervised Learning
• Applying Machine Learning & Python Manipulations
& Intelligence Programming with Mini Project

INTRODUCTION
• Machine learning is about extracting knowledge from data.
• Machine Learning theory is a field at the intersection of statistics,
probability, computer science and algorithms.
• Despite the immense possibilities of Machine and Deep
Learning, a thorough mathematical understanding of many of
these techniques is necessary for a good grasp of the inner
workings of the algorithms and getting good results.

Origin of Learning.. What is intelligence??
• Ability to comprehend, to understand and profit from
experience.
Three buzz words of ML
• Capability to acquire and apply knowledge
• Mystic Connection to the world
• Ability to learn or adapt to changing world.

We are in an era where…
• “People worry that computers will get too smart and take over the world, but the real problem is
that they're too stupid and they've already taken over the world.” (Pedro Domingos)
• Data is the key to unlocking machine learning, as much as machine learning is the key to unlocking
the insight hidden in data.

A Brief History of AI
• 1943: McCulloch and Pitts propose a model of artificial neurons
• 1951: Minsky and Edmonds build the first neural network computer, the SNARC
• The Dartmouth Conference (1956)
• John McCarthy organizes a two-month workshop for researchers interested in neural networks and the study of intelligence
• Agreement to adopt a new name for this field of study: Artificial Intelligence
• 1952-1969 Enthusiasm:
• Arthur Samuel’s checkers player
• Shakey the robot
• Lots of work on neural networks
• 1966-1974 Reality:
• AI problems appear to be too big and complex
• Computers are very slow, very expensive, and have very little memory (compared to today)
• 1969-1979 Knowledge-based systems:
• Birth of expert systems
• Idea is to give AI systems lots of information to start with
• 1980-1988 AI in industry:
• R1 becomes first successful commercial expert system
• Some interesting phone company systems for diagnosing failures of telephone service
• 1990s to the present:
• Increases in computational power (computers are cheaper, faster, and have tons more memory than they used to)
• An example of the coolness of speed: Computer Chess
• 2/96: Kasparov vs Deep Blue : Kasparov victorious: 3 wins, 2 draws, 1 loss
• 5/97: Kasparov vs Deeper Blue: first match won against a world champion; 512 processors, 200 million chess positions per second

Why renewed interest in ML
• Loads of data from sensors, and other sources across the
globe!
• Cheap storage (thanks to cloud computing)!!
• Lowest ever computing cost!!!

Machine Learning is almost everywhere
• Virtual Personal Assistants
• Predictions while Commuting
• Video Surveillance
• Self-Driving Cars
• Online Recommendations and Customer Support
• Email Spam and Malware Filtering
• Epidemic Outbreak Prediction
• Online Fraud Detection
• Predicting delayed airplane flights
• Determining which voters to canvass during an election
• Developing pharmaceutical drugs (combinatorial chemistry)
• Identifying human genes that make people more likely to develop cancer
• Predicting housing prices for real estate companies

Traditional programming vs.
machine learning

What is it???
• “Using data” is what is typically referred to as “training”, while
• “answering questions” is referred to as “making predictions”,
or “inference”.
• What connects these two parts together is the model. We
train the model to make increasingly better and more
useful predictions, using our datasets.
• This predictive model can then be deployed to serve up
predictions on previously unseen data.

What is Machine Learning?
• Science of getting computers to learn without being explicitly
programmed.
• The world is filled with data.
• Machine learning brings the promise of deriving meaning from all of that
data.
• Field of computer science that uses statistical techniques to give
computer systems the ability to "learn" with data, without being
explicitly programmed.

Another definition: (Tom Mitchell)
A computer program is said to learn from
experience E with respect to some task T and some
performance measure P if its performance on
T, as measured by P, improves with E.
Example: playing checkers.
E = the experience of playing many games of
checkers.
T = the task of playing checkers.
P = the probability that the program will win the
next game.

Training set and testing set
• Machine learning is about learning some properties of a data
set and applying them to new data.
• The data is split into two sets:
Training set, on which we learn the data properties.
Testing set, on which we test these properties.
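The split described above can be sketched in plain Python. This is a minimal illustration only: the helper name `split_train_test`, the 25% test fraction, the seed and the toy data are assumptions made for the example, not part of the original slides.

```python
import random


def split_train_test(samples, test_fraction=0.25, seed=0):
    """Shuffle a copy of the samples and split it into (train, test)."""
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be between 0 and 1")
    shuffled = list(samples)
    # A seeded generator makes the split reproducible.
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical data set: (feature, label) pairs.
data = [(x, 2 * x + 1) for x in range(20)]
train_set, test_set = split_train_test(data, test_fraction=0.25, seed=42)
print(len(train_set), "training samples,", len(test_set), "testing samples")
```

The model is fitted on `train_set` only; `test_set` is held back so that performance is measured on data the model has not seen.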

Types of Learning
• Supervised (inductive) learning – Given: training data + desired outputs (labels). Learning with a labeled
training set. Example: email classification with already labeled emails.
• Unsupervised learning – Given: training data (without desired outputs). Discover patterns in unlabeled data.
Example: cluster similar documents based on text.
• Reinforcement learning – Rewards from a sequence of actions; learn to act based on feedback/reward. Example:
learn to play Go, reward: win or lose.

Simple ML Program
Installation (from the Anaconda PowerShell Prompt):
pip install numpy
pip install pandas
pip install matplotlib
pip install scikit-learn
pip install scipy
pip install opencv-python
pip install librosa
Sample program: detection of good and bad wine.

Regression
• Regression searches for relationships among variables.
• Predict a value of a given continuous valued variable based on the values of other
variables, assuming a linear or nonlinear model of dependency.
• Widely studied in statistics and in the neural network field.
Examples: Predicting sales amounts of a new product based on advertising expenditure.
Predicting wind velocities as a function of temperature, humidity, air pressure, etc.
Time series prediction of stock market indices.

Why Regression (contd.)
• The objective of a linear regression model is to find a relationship between one or more
features (independent variables) and a continuous target variable (dependent variable). When there is
only one feature it is called univariate linear regression, and if there are multiple features it is
called multiple linear regression.
• The dependent features are called the dependent variables, outputs, or responses.
• The independent features are called the independent variables, inputs, or predictors.
• Typically, regression is needed to answer whether and how some phenomenon influences the other
or how several variables are related.
• Regression is also useful when you want to forecast a response using a new set of predictors.
• Example: predicting housing prices. Regression is used in many fields, including economics, computer
science, and the social sciences. Its importance rises every day with the availability of large amounts
of data and increased awareness of the practical value of data.

Problem Formulation
• When implementing linear regression of some dependent variable 𝑦 on the set of independent variables
𝐱 = (𝑥₁, …, 𝑥ᵣ), where 𝑟 is the number of predictors, you assume a linear relationship between 𝑦 and 𝐱:
𝑦 = 𝛽₀ + 𝛽₁𝑥₁ + ⋯ + 𝛽ᵣ𝑥ᵣ + 𝜀. This equation is the regression equation. 𝛽₀, 𝛽₁, …, 𝛽ᵣ are
the regression coefficients, and 𝜀 is the random error.
• Linear regression calculates the estimators of the regression coefficients or simply the predicted weights, denoted with 𝑏₀, 𝑏₁, …, 𝑏ᵣ.
They define the estimated regression function 𝑓(𝐱) = 𝑏₀ + 𝑏₁𝑥₁ + ⋯ + 𝑏ᵣ𝑥ᵣ. This function should capture the dependencies between the
inputs and output sufficiently well.
• The estimated or predicted response, 𝑓(𝐱ᵢ), for each observation 𝑖 = 1, …, 𝑛, should be as close as possible to the corresponding actual
response 𝑦ᵢ. The differences 𝑦ᵢ - 𝑓(𝐱ᵢ) for all observations 𝑖 = 1, …, 𝑛, are called the residuals. Regression is about determining the best
predicted weights, that is the weights corresponding to the smallest residuals.
• To get the best weights, you usually minimize the sum of squared residuals (SSR) for all observations 𝑖 = 1, …, 𝑛: SSR = Σᵢ(𝑦ᵢ - 𝑓(𝐱ᵢ))². This
approach is called the method of ordinary least squares.
• The output (𝑦) can therefore be calculated from a linear combination of the input variables (𝐱). When there is a single input variable, the method is
referred to as simple linear regression.
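To make the residuals and the SSR concrete, here is a small sketch that evaluates 𝑓(𝐱) = 𝑏₀ + 𝑏₁𝑥₁ + ⋯ + 𝑏ᵣ𝑥ᵣ and the sum of squared residuals for two candidate weight vectors. The function names and the toy observations are assumptions chosen for illustration; the data was generated from 𝑦 = 1 + 2𝑥₁ + 3𝑥₂ with no noise.

```python
def predict(weights, x):
    """Estimated regression function f(x) = b0 + b1*x1 + ... + br*xr.

    `weights` is [b0, b1, ..., br]; `x` is [x1, ..., xr].
    """
    return weights[0] + sum(b * xi for b, xi in zip(weights[1:], x))


def sum_squared_residuals(weights, X, y):
    """SSR = sum over all observations of (y_i - f(x_i))**2."""
    return sum((yi - predict(weights, xi)) ** 2 for xi, yi in zip(X, y))


# Hypothetical observations generated from y = 1 + 2*x1 + 3*x2.
X = [[1.0, 2.0], [2.0, 0.0], [3.0, 1.0]]
y = [9.0, 5.0, 10.0]

exact_weights = [1.0, 2.0, 3.0]   # reproduces the data, so every residual is 0
worse_weights = [0.0, 2.0, 3.0]   # wrong intercept, so every residual is 1

ssr_exact = sum_squared_residuals(exact_weights, X, y)
ssr_worse = sum_squared_residuals(worse_weights, X, y)
print("SSR with exact weights:", ssr_exact)
print("SSR with worse weights:", ssr_worse)
```

Ordinary least squares picks, among all possible weight vectors, the one with the smallest SSR.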

Linear Regression with One
Variable & Python functions
Programming
Simple Linear Regression
Simple or single-variate linear regression is the simplest case of linear regression with a
single independent variable, 𝐱 = 𝑥.
The following figure illustrates simple linear regression:
• When implementing simple linear regression,
you typically start with a given set of input-
output (𝑥-𝑦) pairs (green circles).
• The estimated regression function (black line) has
the equation 𝑓(𝑥) = 𝑏₀ + 𝑏₁𝑥.
• The predicted responses (red squares) are the
points on the regression line that correspond to
the input values.
• The residuals (vertical dashed gray lines) can be
calculated as 𝑦ᵢ - 𝑓(𝐱ᵢ) = 𝑦ᵢ - 𝑏₀ - 𝑏₁𝑥ᵢ for 𝑖 = 1, …, 𝑛.
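For a single input variable, the least-squares weights have a well-known closed form: 𝑏₁ = Σ(𝑥ᵢ - x̄)(𝑦ᵢ - ȳ) / Σ(𝑥ᵢ - x̄)² and 𝑏₀ = ȳ - 𝑏₁x̄. The sketch below applies it to a few made-up (𝑥, 𝑦) pairs; the function name and the data are illustrative assumptions.

```python
def fit_simple_linear_regression(xs, ys):
    """Return (b0, b1) minimizing the SSR of f(x) = b0 + b1*x."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    s_xy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    s_xx = sum((x - x_mean) ** 2 for x in xs)
    if s_xx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    b1 = s_xy / s_xx
    b0 = y_mean - b1 * x_mean
    return b0, b1


# Hypothetical input-output pairs lying roughly on a line.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [3.1, 4.9, 7.2, 8.8, 11.0]

b0, b1 = fit_simple_linear_regression(xs, ys)
# Residuals y_i - b0 - b1*x_i, one per observation.
residuals = [y - (b0 + b1 * x) for x, y in zip(xs, ys)]
print("b0 =", round(b0, 4), "b1 =", round(b1, 4))
print("residuals:", [round(r, 4) for r in residuals])
```

With an intercept in the model, the least-squares residuals always sum to zero (up to rounding), which is a quick sanity check on the fit.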

Linear Regression with Multiple Variables
• Multiple or multivariate linear regression is a case of linear regression with two or
more independent variables.
• If there are just two independent variables, the estimated regression function is
𝑓(𝑥₁, 𝑥₂) = 𝑏₀ + 𝑏₁𝑥₁ + 𝑏₂𝑥₂. It represents a regression plane in a three-dimensional
space. The goal of regression is to determine the values of the weights 𝑏₀, 𝑏₁, and
𝑏₂ such that this plane is as close as possible to the actual responses and yields the
minimal SSR.
• The case of more than two independent variables is similar, but more general. The
estimated regression function is 𝑓(𝑥₁, …, 𝑥ᵣ) = 𝑏₀ + 𝑏₁𝑥₁ + ⋯ + 𝑏ᵣ𝑥ᵣ, and there
are 𝑟 + 1 weights to be determined when the number of inputs is 𝑟.
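One standard way to obtain the least-squares weights for several inputs is to solve the normal equations (XᵀX)𝐛 = Xᵀ𝐲, where X has a leading column of ones for the intercept. The sketch below does this with a small Gaussian elimination in plain Python. The function names and the toy data (generated from 𝑦 = 1 + 2𝑥₁ + 3𝑥₂) are assumptions for illustration; in practice a library such as NumPy or scikit-learn would normally be used.

```python
def solve_linear_system(A, b):
    """Solve A w = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("singular system: predictors may be collinear")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    # Back substitution.
    w = [0.0] * n
    for i in range(n - 1, -1, -1):
        known = sum(M[i][j] * w[j] for j in range(i + 1, n))
        w[i] = (M[i][n] - known) / M[i][i]
    return w


def fit_multiple_linear_regression(X, y):
    """Return [b0, b1, ..., br] minimizing the SSR via the normal equations."""
    rows = [[1.0] + list(x) for x in X]  # prepend 1 for the intercept b0
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Hypothetical observations generated from y = 1 + 2*x1 + 3*x2.
X = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0]]
y = [1.0, 3.0, 4.0, 6.0, 8.0]

weights = fit_multiple_linear_regression(X, y)
print("b0, b1, b2 =", [round(w, 6) for w in weights])
```

Because the toy data contains no noise, the recovered weights match the generating coefficients up to rounding.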

Logistic Regression
• Logistic Regression is used when the
dependent variable(target) is categorical.
• Consider a scenario where we need to
classify whether an email is spam or not.
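As a rough sketch of the spam scenario, logistic regression passes a linear score 𝑏₀ + 𝑏₁𝑥 through the sigmoid function to obtain a probability, then thresholds it to get a class. The single feature (a count of suspicious words), the toy labels, the learning rate and the number of epochs below are all assumptions made for illustration; the weights are fitted with plain batch gradient descent on the log-loss.

```python
import math


def sigmoid(z):
    """Map any real number to a probability in (0, 1), avoiding overflow."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def fit_logistic_regression(xs, labels, learning_rate=0.1, epochs=5000):
    """Fit p(spam | x) = sigmoid(b0 + b1*x) by batch gradient descent."""
    b0, b1 = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad0 = grad1 = 0.0
        for x, label in zip(xs, labels):
            error = sigmoid(b0 + b1 * x) - label  # gradient of the log-loss
            grad0 += error
            grad1 += error * x
        b0 -= learning_rate * grad0 / n
        b1 -= learning_rate * grad1 / n
    return b0, b1


def predict_label(b0, b1, x, threshold=0.5):
    """Return 1 (spam) if the predicted probability reaches the threshold."""
    return 1 if sigmoid(b0 + b1 * x) >= threshold else 0


# Hypothetical training data: suspicious-word count -> 1 = spam, 0 = not spam.
counts = [0, 1, 2, 3, 6, 7, 8, 9]
is_spam = [0, 0, 0, 0, 1, 1, 1, 1]

b0, b1 = fit_logistic_regression(counts, is_spam)
print("email with 1 suspicious word  ->", predict_label(b0, b1, 1))
print("email with 8 suspicious words ->", predict_label(b0, b1, 8))
```

The output is categorical (spam or not spam), which is why a thresholded probability is used here instead of the raw linear prediction.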

Gradient Descent Algorithm
Gradient Descent Algorithm-Part 1
Gradient Descent Algorithm-Part 2
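The slides give only the titles here, so the following is a generic sketch of gradient descent, not a reconstruction of the original material. It assumes the cost is the mean squared error of 𝑓(𝑥) = 𝑏₀ + 𝑏₁𝑥 and repeatedly moves the weights a small step against the gradient. The learning rate, epoch count and data are illustrative assumptions.

```python
def gradient_descent(xs, ys, learning_rate=0.05, epochs=5000):
    """Minimize the mean squared error of f(x) = b0 + b1*x."""
    b0, b1 = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        residuals = [y - (b0 + b1 * x) for x, y in zip(xs, ys)]
        # Partial derivatives of (1/n) * sum(residual**2) w.r.t. b0 and b1.
        grad_b0 = -2.0 * sum(residuals) / n
        grad_b1 = -2.0 * sum(r * x for r, x in zip(residuals, xs)) / n
        # Step in the direction that decreases the cost.
        b0 -= learning_rate * grad_b0
        b1 -= learning_rate * grad_b1
    return b0, b1


# Hypothetical data generated from y = 1 + 2*x.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

gd_b0, gd_b1 = gradient_descent(xs, ys)
print("b0 =", round(gd_b0, 4), "b1 =", round(gd_b1, 4))
```

If the learning rate is too large the updates can diverge, and if it is too small convergence becomes slow, so the value usually needs tuning for each data set.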

Unsupervised Learning
• No labels are given to the learning algorithm, leaving it on its own to find
structure in its input. Unsupervised learning can be a goal in itself
(discovering hidden patterns in data) or a means towards an end (feature
learning).
• In some pattern recognition problems, the training data consists of a set of
input vectors x without any corresponding target values. The goal in such
unsupervised learning problems may be to discover groups of similar
examples within the data, where it is called clustering, or to determine how
the data is distributed in the space, known as density estimation.

Why Unsupervised Learning
• Annotating large datasets is very costly and hence we can
label only a few examples manually. Example: Speech
Recognition
• There may be cases where we don’t know how many classes, or which
classes, the data is divided into. Example: Data Mining
• We may want to use clustering to gain some insight into the
structure of the data before designing a classifier.

What is Clustering
Clustering can be considered the most important unsupervised
learning problem; so, as every other problem of this kind, it deals with finding
a structure in a collection of unlabeled data. A loose definition of clustering could
be “the process of organizing objects into groups whose members are similar in
some way”. A cluster is therefore a collection of objects which are “similar”
between them and are “dissimilar” to the objects belonging to other clusters.

Goal of Clustering
The goal of clustering is to determine the internal grouping in a set of
unlabeled data. But how to decide what constitutes a good clustering? It can
be shown that there is no absolute “best” criterion which would be independent
of the final aim of the clustering.

Proximity Measures
• For clustering, we need to define a proximity measure for two data
points. Proximity here means how similar/dissimilar the samples are
with respect to each other.
• Similarity measure S(xi,xk): large if xi,xk are similar
• Dissimilarity(or distance) measure D(xi,xk): small if xi,xk are similar
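Two common choices can illustrate the idea: Euclidean distance as a dissimilarity measure D and cosine similarity as a similarity measure S. These particular measures and the sample points are assumptions for the example; the slides do not name a specific measure.

```python
import math


def euclidean_distance(a, b):
    """Dissimilarity D(a, b): small when the two points are similar."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


def cosine_similarity(a, b):
    """Similarity S(a, b): close to 1 when the vectors point the same way."""
    dot = sum(ai * bi for ai, bi in zip(a, b))
    norm_a = math.sqrt(sum(ai * ai for ai in a))
    norm_b = math.sqrt(sum(bi * bi for bi in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Hypothetical samples: xk is near xi, far_point is not.
xi = [1.0, 2.0]
xk = [1.5, 2.5]
far_point = [-4.0, 1.0]

print("D(xi, xk)  =", round(euclidean_distance(xi, xk), 3))
print("D(xi, far) =", round(euclidean_distance(xi, far_point), 3))
print("S(xi, xk)  =", round(cosine_similarity(xi, xk), 3))
print("S(xi, far) =", round(cosine_similarity(xi, far_point), 3))
```

The similar pair gets the smaller distance and the larger similarity, matching the two definitions above.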

K-Means Clustering
• The procedure classifies a given data set into a certain number of clusters
(assume k clusters) fixed a priori. The main idea is to define k centres, one for each cluster.
• These centroids should be placed carefully, because different starting locations lead to different results.
• The next step is to take each point in the data set and associate it with the nearest centroid.
• At this point we re-calculate k new centroids as barycenters of the clusters resulting from the previous
step. With these k new centroids, a new binding is made between the same data set points and
the nearest new centroid.
• This creates a loop. As the loop repeats, the k centroids change their location step
by step until no more changes occur.
• Finally, this algorithm aims at minimizing an objective function, in this case a squared error function:
J = Σⱼ Σᵢ ‖xᵢ⁽ʲ⁾ − cⱼ‖², the sum of squared distances between each data point xᵢ⁽ʲ⁾ and the centre cⱼ of its cluster.

Algorithm Steps
The algorithm is composed of the following steps:
• Let X = {x1, x2, x3, …, xn} be the set of data points and V = {v1, v2, …, vc} be the set of
centers.
1) Randomly select ‘c’ cluster centers.
2) Calculate the distance between each data point and the cluster centers.
3) Assign each data point to the cluster center that is nearest to it among
all the cluster centers.
4) Recalculate each new cluster center using:
vi = (1/ci) Σ xj, summed over the data points xj in the ith cluster,
where ‘ci’ represents the number of data points in the ith cluster.
5) Recalculate the distance between each data point and the newly obtained cluster centers.
6) If no data point was reassigned then stop, otherwise repeat from step 3).
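The steps above can be sketched in plain Python as follows. This is a minimal illustration under some assumptions not stated in the slides: the initial centers are drawn from the data points themselves, squared Euclidean distance is used, an empty cluster keeps its previous center, and a `max_iterations` cap guards against long loops. All names and the sample points are hypothetical.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two points."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def k_means(points, c, seed=0, max_iterations=100):
    """Cluster `points` into `c` groups; return (centers, assignments)."""
    rng = random.Random(seed)
    # Randomly select c cluster centers (here: c distinct data points).
    centers = [list(p) for p in rng.sample(points, c)]
    assignments = [None] * len(points)
    for _ in range(max_iterations):
        # Assign each point to its nearest center.
        new_assignments = [
            min(range(c), key=lambda i: squared_distance(p, centers[i]))
            for p in points
        ]
        # Stop when no data point was reassigned.
        if new_assignments == assignments:
            break
        assignments = new_assignments
        # Recalculate each center as the mean of its cluster members.
        for i in range(c):
            members = [p for p, a in zip(points, assignments) if a == i]
            if members:  # an empty cluster keeps its old center
                centers[i] = [sum(coord) / len(members) for coord in zip(*members)]
    return centers, assignments


# Hypothetical data: two well-separated groups of 2-D points.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
centers, assignments = k_means(points, c=2, seed=1)
print("centers:", centers)
print("assignments:", assignments)
```

Because the result depends on the random starting centers, it is common to run the algorithm several times with different seeds and keep the clustering with the lowest squared error.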

Working on projects
1.AN IMPROVED OF SPAM E-MAIL CLASSIFICATION
MECHANISM USING K-MEANS CLUSTERING

Terms used
• Training example: a sample from x including its output from the target function
• Target function: the mapping function f from x to f(x)
• Hypothesis: approximation of f, a candidate function.
Example: in e-mail spam classification, the hypothesis would be the rule we came up with that
allows us to separate spam from non-spam emails.
• Concept: A Boolean target function, positive examples and negative examples
• Classifier: Learning program outputs a classifier that can be used to classify.
• Learner: Process that creates the classifier.
• Hypothesis space: set of possible approximations of f that the algorithm can create.
