Mis End Term Exam Theory Concepts

MIS End Term Exam for AI and A2
on Thursday, 6th February 2020
Concepts
Please also read the handwritten scanned
notes.

End Term Syllabus
• Decision Trees – Graphical Representation – Root
Node, Internal Nodes, Leaf
• Level, Height and Depth of a Node
• Decision Tree as a Classifier
• Classifier Case Study / Numerical using 3 Classifier
Algorithms
- ID3
- CART
- Naïve Bayes

End Term Syllabus (Cont.)
• Supervised, Unsupervised and Reinforcement
Learning
• Unsupervised Learning – Clustering – use of K-Means
in a numerical example
• Bias & Variance, Overfitting and Underfitting of
Trees with reference to Training and Test Data
• Ensemble Learning Technique using Bagging
• Random Forest

How does the ID3 algorithm work?
The ID3 algorithm begins with the original set as the root node. On
each iteration of the algorithm, it iterates through every unused
attribute of the set and calculates the entropy or the information gain
of that attribute. ... The set is then split or partitioned by the selected
attribute to produce subsets of the data.
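The entropy and information-gain calculation described above can be sketched in a few lines of Python. This is a minimal illustration, not a full ID3 implementation: the toy weather data and the function names are hypothetical, and the selection rule (split on the attribute with the highest information gain) is the standard ID3 choice, assumed here because the notes elide that step.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())

def information_gain(rows, attribute, target):
    """Entropy of the full set minus the weighted entropy after splitting on `attribute`."""
    base = entropy([row[target] for row in rows])
    remainder = 0.0
    for value in {row[attribute] for row in rows}:
        subset = [row[target] for row in rows if row[attribute] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return base - remainder

def best_attribute(rows, attributes, target):
    """Pick the unused attribute with the highest information gain (assumed ID3 rule)."""
    return max(attributes, key=lambda a: information_gain(rows, a, target))

# Hypothetical toy data set: should we play outside?
rows = [
    {"outlook": "sunny", "windy": "no", "play": "no"},
    {"outlook": "sunny", "windy": "yes", "play": "no"},
    {"outlook": "overcast", "windy": "no", "play": "yes"},
    {"outlook": "rainy", "windy": "no", "play": "yes"},
    {"outlook": "rainy", "windy": "yes", "play": "no"},
    {"outlook": "overcast", "windy": "yes", "play": "yes"},
]

for attribute in ("outlook", "windy"):
    print(attribute, round(information_gain(rows, attribute, "play"), 3))
print("split on:", best_attribute(rows, ["outlook", "windy"], "play"))
```

In this toy set, "outlook" separates the classes much better than "windy", so it would become the root node; the same calculation is then repeated on each subset with the remaining attributes.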

What is ID3 in a decision tree?
Very simply, ID3 builds a decision tree from a fixed set of examples. ...
The leaf nodes of the decision tree contain the class name whereas a
non-leaf node is a decision node. The decision node is an attribute test
with each branch (to another decision tree) being a possible value of
the attribute.

How does the CART algorithm work?
A Classification and Regression Tree (CART) is a predictive algorithm used
in machine learning. It explains how a target variable's values can be
predicted based on other values. It is a decision tree where each fork is a
split in a predictor variable and each node at the end has a prediction for
the target variable.
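To make "a split in a predictor variable" concrete, here is a small sketch that searches for the best binary threshold on one numeric predictor. The notes do not name a splitting criterion; Gini impurity, the measure usually associated with CART classification trees, is assumed here. The data and function names are hypothetical.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels (0.0 means the group is pure)."""
    total = len(labels)
    return 1.0 - sum((n / total) ** 2 for n in Counter(labels).values())

def best_threshold(values, labels):
    """Return (threshold, weighted_gini) for the best split of the form value <= threshold."""
    pairs = sorted(zip(values, labels))
    best = (None, float("inf"))
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold fits between two equal values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [label for value, label in pairs if value <= threshold]
        right = [label for value, label in pairs if value > threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(pairs)
        if score < best[1]:
            best = (threshold, score)
    return best

# Hypothetical data: customer age and whether they bought the product.
ages = [22, 25, 30, 35, 40, 52]
bought = ["no", "no", "no", "yes", "yes", "yes"]

threshold, impurity = best_threshold(ages, bought)
print("split at age <=", threshold, "weighted Gini:", impurity)
```

A full tree would apply the same search to every predictor, take the best split, and recurse on the two resulting groups until a stopping rule is met.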

What is the CART technique?
Classification and Regression Trees, or CART for short, is a term introduced
by Leo Breiman to refer to decision tree algorithms that can be used for
classification or regression predictive modeling problems.

What is CART analysis?
Classification and regression tree (CART) analysis recursively
partitions observations in a matched data set, consisting of a
categorical (for classification trees) or continuous (for regression
trees) dependent (response) variable and one or more independent
(explanatory) variables, into progressively smaller groups.

What is the Naive Bayes classifier
algorithm?
Naive Bayes classifiers are a collection of
classification algorithms based on Bayes' Theorem. It is not a
single algorithm but a family of algorithms that all share a
common principle: every pair of features is assumed to be
independent of each other, given the class.
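The independence assumption means the score for a class is simply the class prior multiplied by one probability per feature. The sketch below shows this for categorical features. It is an illustration only: the data are hypothetical, and the add-one (Laplace) smoothing is an assumption not mentioned in the notes, included so that an unseen feature value does not force a probability of zero.

```python
from collections import Counter, defaultdict

def train(rows, labels):
    """Count class frequencies and per-class feature-value frequencies."""
    class_counts = Counter(labels)
    feature_counts = defaultdict(Counter)  # (feature index, class) -> value counts
    for row, label in zip(rows, labels):
        for i, value in enumerate(row):
            feature_counts[(i, label)][value] += 1
    n_values = [len({row[i] for row in rows}) for i in range(len(rows[0]))]
    return class_counts, feature_counts, n_values

def predict(model, row):
    """Return the most probable class and the unnormalised score of each class."""
    class_counts, feature_counts, n_values = model
    total = sum(class_counts.values())
    scores = {}
    for label, count in class_counts.items():
        score = count / total  # prior P(class)
        for i, value in enumerate(row):
            # P(value | class) with add-one smoothing (an assumption of this sketch)
            score *= (feature_counts[(i, label)][value] + 1) / (count + n_values[i])
        scores[label] = score
    return max(scores, key=scores.get), scores

# Hypothetical training data: (outlook, temperature) -> play?
rows = [("sunny", "hot"), ("sunny", "mild"), ("rainy", "mild"),
        ("rainy", "cool"), ("overcast", "hot"), ("overcast", "cool")]
labels = ["no", "no", "yes", "no", "yes", "yes"]

model = train(rows, labels)
label, scores = predict(model, ("overcast", "mild"))
print(label, scores)
```

The scores are not normalised probabilities; dividing each by their sum would give the posterior probability of each class under the model.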

What is K-means clustering used for?
Business Uses
The K-means clustering algorithm is used to find groups which have
not been explicitly labelled in the data. ... Once the algorithm has
been run and the groups are defined, any new data point can be
assigned to the group with the nearest center. It is a versatile
algorithm that can be applied to many kinds of grouping problems.

What is K-means clustering in data
mining?
Data mining with the K-Means algorithm
The k-means clustering algorithm is a data mining and machine
learning tool used to cluster observations into groups of related
observations without any prior knowledge of those relationships. ...
The term "k-means" was coined in 1967 by James MacQueen.

How is K-means calculated?
K-Means Clustering
1. Select k points at random as cluster centers.
2. Assign objects to their closest cluster center according to the
Euclidean distance function.
3. Calculate the centroid or mean of all objects in each cluster.
4. Repeat steps 2 and 3 until the same points are assigned to each
cluster in consecutive rounds.
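The procedure above translates almost line for line into code. The following is a minimal sketch using only the Python standard library (math.dist needs Python 3.8 or later); the sample points, the seed and the max_rounds safety limit are assumptions made for the illustration.

```python
import random
from math import dist

def k_means(points, k, seed=0, max_rounds=100):
    """Cluster points into k groups; returns (centers, cluster index of each point)."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # select k points at random as cluster centers
    assignment = None
    for _ in range(max_rounds):
        # Assign each point to its closest center by Euclidean distance.
        new_assignment = [min(range(k), key=lambda c: dist(p, centers[c])) for p in points]
        if new_assignment == assignment:
            break  # same assignment in consecutive rounds: converged
        assignment = new_assignment
        # Recompute each center as the mean of the points assigned to it.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centers[c] = tuple(sum(coord) / len(members) for coord in zip(*members))
    return centers, assignment

# Hypothetical 2-D data with two visibly separate groups.
points = [(1, 1), (1.5, 2), (1, 0.5), (8, 8), (9, 9.5), (8.5, 9)]
centers, assignment = k_means(points, k=2)
print(centers)
print(assignment)
```

Which group is numbered 0 and which 1 depends on the random starting centers, so only the grouping itself is meaningful, not the cluster index.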

What is a supervised learning model?
Supervised learning builds a model that makes predictions
for previously unseen input instances. A supervised
learning algorithm takes a known set of input data and the
known responses to that data (outputs) to learn a
regression or classification model.

How does supervised machine
learning work?
Supervised machine learning builds a model that makes predictions
based on evidence in the presence of uncertainty. A supervised
learning algorithm takes a known set of input data and known
responses to the data (output) and trains a model to generate
reasonable predictions for the response to new data.

What is meant by unsupervised
learning?
Unsupervised learning is a type of machine learning algorithm used
to draw inferences from datasets consisting of input data without
labeled responses. The most common unsupervised
learning method is cluster analysis, which is used for exploratory
data analysis to find hidden patterns or grouping in data.

How does unsupervised learning
work?
In unsupervised learning, an AI system is presented with unlabelled,
uncategorized data and the system's algorithms act on the data without
prior training. ... In essence, unsupervised learning can be thought of
as learning without a teacher. In supervised learning, by contrast, the system
is given both the inputs and the corresponding outputs.

What is meant by reinforcement
learning?
Reinforcement learning is a type of machine learning, and thereby
also a branch of artificial intelligence. It allows machines and software
agents to automatically determine the ideal behaviour within a specific
context, in order to maximize their performance.

What is reinforcement learning used
for?
Reinforcement learning is a class of machine learning algorithms
that allow software agents and machines to automatically determine
the ideal behaviour within a specific context, to maximize their
performance.

What is bias and variance?
Bias refers to the simplifying assumptions made by the model to make the
target function easier to approximate. Variance is the amount by which
the estimate of the target function will change given different
training data. The trade-off is the tension between the error introduced by
the bias and the error introduced by the variance.
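The trade-off is usually written as a decomposition of the expected squared error at a point x. The formula below is the standard textbook result rather than something stated in these notes; it assumes the data follow y = f(x) + ε with zero-mean noise of variance σ², and that the expectation is taken over training sets and noise, with f̂ denoting the model fitted to a given training set.

```latex
\mathbb{E}\big[(y - \hat{f}(x))^2\big]
  = \underbrace{\big(\mathbb{E}[\hat{f}(x)] - f(x)\big)^2}_{\text{bias}^2}
  + \underbrace{\mathbb{E}\big[(\hat{f}(x) - \mathbb{E}[\hat{f}(x)])^2\big]}_{\text{variance}}
  + \underbrace{\sigma^2}_{\text{irreducible noise}}
```

Simpler models tend to raise the first term and lower the second, while more flexible models tend to do the opposite; the noise term cannot be reduced by any choice of model.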

What is bias and variance in
Supervised learning?
The bias–variance dilemma or bias–variance problem is the conflict in
trying to simultaneously minimize these two sources of error that
prevent supervised learning algorithms from generalizing beyond their
training set: The bias error is an error from erroneous assumptions in
the learning algorithm.

What is bias and variance in Decision
Trees?
Q: Explain the bias vs. variance trade-off in statistical learning.
A: The bias-variance trade-off is an important aspect of data
science projects based on machine learning. ... The error due to
squared bias is the amount by which the expected model prediction
differs from the true value or target, over the training data.

What is Underfitting and Overfitting in
Decision Trees?
Overfitting: Good performance on the training data, poor
generalization to other data. Underfitting: Poor performance on the
training data and poor generalization to other data.

What is meant by Overfitting of data?
Overfitting is a modelling error which occurs when a function is too
closely fit to a limited set of data points. Overfitting the model
generally takes the form of making an overly complex model to
explain idiosyncrasies in the data under study.

How do you deal with Underfitting?
According to Andrew Ng, the best ways of dealing with
an underfitting model are to try a bigger neural network (adding new layers
or increasing the number of neurons in existing layers) or to train the
model a little longer.

What are the differences between
Overfitting and Underfitting?
Overfitting is a modelling error which occurs when a function is too
closely fit to a limited set of data points. Underfitting refers to a model
that can neither model the training data nor generalize to new data. ...
Intuitively, underfitting occurs when the model or the algorithm does
not fit the data well enough.

What is Ensemble Learning?
Ensemble learning is a process by which multiple machine
learning models (such as classifiers) are strategically constructed and
combined to solve a particular problem. Let's take a real example to build the
intuition. Suppose you want to invest in a company XYZ but are not
sure about its performance.

How do ensemble models work?
Ensemble models in machine
learning combine the decisions from multiple models to improve the
overall performance. They operate on an idea similar to the one employed
when buying headphones. The main causes of error in
learning models are noise, bias and variance.
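Bagging, the ensemble technique named in the syllabus, combines decisions by training each model on a bootstrap sample (drawn with replacement from the training data) and taking a majority vote. The sketch below is a minimal illustration under stated assumptions: the base model is a hypothetical one-split "decision stump" on a single numeric feature, and the data, seed and number of models are made up for the example.

```python
import random
from collections import Counter

def fit_stump(sample):
    """Fit a one-split rule on (x, label) pairs: x <= t gives left_label, else right_label."""
    best = None
    for t in sorted({x for x, _ in sample}):
        left = [label for x, label in sample if x <= t]
        right = [label for x, label in sample if x > t]
        left_label = Counter(left).most_common(1)[0][0]
        right_label = Counter(right).most_common(1)[0][0] if right else left_label
        errors = sum(label != (left_label if x <= t else right_label) for x, label in sample)
        if best is None or errors < best[0]:
            best = (errors, t, left_label, right_label)
    _, t, left_label, right_label = best
    return lambda x: left_label if x <= t else right_label

def bagging(data, n_models=25, seed=0):
    """Train n_models stumps on bootstrap samples and return a majority-vote predictor."""
    rng = random.Random(seed)
    models = [fit_stump([rng.choice(data) for _ in data]) for _ in range(n_models)]

    def predict(x):
        votes = Counter(model(x) for model in models)
        return votes.most_common(1)[0][0]

    return predict

# Hypothetical one-feature data set.
data = [(1, "low"), (2, "low"), (3, "low"), (4, "low"),
        (6, "high"), (7, "high"), (8, "high"), (9, "high")]

predict = bagging(data)
print(predict(1), predict(9))
```

Each stump sees a slightly different sample and so may place its split in a different position; the vote averages out these differences, which is how bagging reduces the variance component of the error.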

Random Forest Model
Random forests, otherwise known as the random forest model, are a
method for classification and other tasks. A random forest builds many decision
trees and outputs the class chosen by most of the individual trees. Random forests
correct for the habit of decision trees to overfit to their training set.

What is test data and train data?
The training data set is a grouped set of examples that are used to fit
the parameters. The test data set is used for the last evaluation of the final
model fit on the training data set, like a last test before the real
deal. Evaluating on data that was held out from training gives a more honest
estimate of how accurate the model will be on new data.
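A short sketch of a train/test split shows why the held-out set matters. Everything here is hypothetical: the data, the 25% test fraction, and the two "models". The memoriser is a deliberately extreme overfitting model, and the threshold rule is written by hand rather than learned.

```python
import random

def train_test_split(examples, test_fraction=0.25, seed=0):
    """Shuffle a copy of the examples and hold out a fraction as the test set."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

def accuracy(model, examples):
    """Fraction of (x, label) examples that the model labels correctly."""
    return sum(model(x) == label for x, label in examples) / len(examples)

# Hypothetical data: the label is "high" when x >= 10, with two noisy labels.
data = [(x, "high" if x >= 10 else "low") for x in range(20)]
data[3] = (3, "high")
data[15] = (15, "low")

train, test = train_test_split(data)

# Memorises every training example, noise included; guesses "low" for unseen inputs.
lookup = dict(train)
memoriser = lambda x: lookup.get(x, "low")

# A simple hand-written rule that ignores the noisy labels.
threshold_rule = lambda x: "high" if x >= 10 else "low"

for name, model in [("memoriser", memoriser), ("threshold rule", threshold_rule)]:
    print(name, "train:", accuracy(model, train), "test:", accuracy(model, test))
```

The memoriser always scores perfectly on the training set because it has stored the answers, so its training accuracy says nothing about how it generalises; only the test accuracy does.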

What is meant by test data?
In software testing, test data is the data that is used in tests of a software
system. ... When valid test data is entered, the expected result should be
produced; other test data is used to verify the software's behaviour on invalid
input. Test data is generated by testers or by automation tools
which support testing.
