1. Sanjivani Rural Education Society’s
Sanjivani College of Engineering, Kopargaon-423 603
(An Autonomous Institute, Affiliated to Savitribai Phule Pune University, Pune)
NAAC ‘A’ Grade Accredited, ISO 9001:2015 Certified
Department of Computer Engineering
(NBA Accredited)
Prof. S. A. Shivarkar
Assistant Professor
Contact No.8275032712
Email- shivarkarsandipcomp@sanjivani.org.in
Subject- Unsupervised Modeling for AIML (CO9301)
3. Course Outcome
Understand project management methodology and exploratory data analysis.
Apply feature engineering techniques.
Apply clustering techniques.
Apply dimensionality reduction techniques.
Apply association rules and recommendation system techniques.
Apply text mining and NLP techniques.
4. Course Objective
To learn the CRISP-ML(Q) methodology for machine learning projects.
To understand clustering and dimensionality reduction.
To learn association rules and recommendation systems.
To understand various NLP strategies.
To learn how to evaluate models and their performance metrics.
5. Unit I: Requirement to Machine Learning
Project management methodology (CRISP-ML(Q)), prescriptive
analytics, predictive analytics, diagnostic analytics, descriptive
analytics, introduction to data types, measurement levels,
measures of central tendency, expected value, exploratory data
analysis, number summary, boxplot, bar graph, histogram,
correlation graph, scatter plots, exploring two or more
variables, data sampling and its types, various types of bias.
6. Unit II: Feature Engineering Techniques
Dummy variable conversion techniques, standardization and
normalization, outlier identification and outlier treatment
techniques, skewness identification and its treatment, finding
null values and their treatment.
7. Unit III: Unsupervised Learning - Clustering
Supervised vs. unsupervised learning, clustering/segmentation
algorithms: hierarchical clustering, distance metrics for
categorical data, distance metrics for continuous data, distance
metrics for mixed data, distance between clusters, k-means
clustering, k selection with the elbow curve, drawbacks and
comparison.
8. Unit IV: Unsupervised Learning - Dimensionality Reduction
Need for dimensionality reduction, Principal Component
Analysis (PCA), applications of PCA, Singular Value
Decomposition (SVD), applications of SVD.
9. Unit V: Unsupervised Learning - Association Rules and Recommendation System
Market basket analysis, association rules intuition, association
rules applications, association rules terminology, need for
recommendation systems, similarity measures, user-based
recommendation system, item-to-item collaborative filtering.
10. Unit VI: Text Mining - Sentiment Analysis and NLP
Need for text mining, bag of words, terminology and
preprocessing, DTM and TDM, corpus-level word cloud.
Introduction to NLP, data preprocessing in the NLP context, NLP
terminology, feature extraction from text, topic modeling,
vector representation.
12. Project management methodology (CRISP-ML(Q))
Overall, the CRISP-ML(Q) process model describes six phases:
1. Business and Data Understanding
2. Data Engineering (Data Preparation)
3. Machine Learning Model Engineering
4. Quality Assurance for Machine Learning Applications
5. Deployment
6. Monitoring and Maintenance.
15. Project management methodology (CRISP-ML(Q))
Business and Data Understanding:
Developing machine learning applications starts with
identifying the scope of the ML application, defining the success
criteria, and verifying data quality.
The goal of this first phase is to ensure the feasibility of the
project.
Clear and measurable Key Performance Indicators (KPIs), such as
“time savings per user and session”, must be defined.
16. Project management methodology (CRISP-ML(Q))
Machine Learning Model Engineering
The modeling phase includes model selection, model
specialization, and model training tasks.
Additionally, depending on the application, we might use a pre-trained
model, compress the model, or apply ensemble learning methods to get the
final ML model.
Many phases in ML development are iterative.
Sometimes, we might need to revisit the business goals, KPIs, and available
data from the previous steps in order to adjust the outcomes of the ML model.
Finally, we package the ML workflow in a pipeline so that model training
in the modeling phase is repeatable.
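The idea of packaging the workflow in a pipeline can be sketched as follows. This is a minimal, standard-library-only illustration and not the method prescribed by CRISP-ML(Q): the class names `Standardizer`, `ThresholdClassifier`, and `TrainingPipeline`, as well as the toy data, are assumptions made for this example. The point is that preprocessing and model fitting are bundled into one object, so retraining repeats exactly the same steps.

```python
from statistics import mean, pstdev


class Standardizer:
    """Hypothetical preprocessing step: rescale a 1-D feature to zero mean and unit variance."""

    def fit(self, xs):
        self.mu = mean(xs)
        self.sigma = pstdev(xs) or 1.0  # fall back to 1.0 for constant data
        return self

    def transform(self, xs):
        return [(x - self.mu) / self.sigma for x in xs]


class ThresholdClassifier:
    """Hypothetical toy model: predict 1 above the midpoint between the two class means."""

    def fit(self, xs, ys):
        pos = [x for x, y in zip(xs, ys) if y == 1]
        neg = [x for x, y in zip(xs, ys) if y == 0]
        self.threshold = (mean(pos) + mean(neg)) / 2
        return self

    def predict(self, xs):
        return [1 if x > self.threshold else 0 for x in xs]


class TrainingPipeline:
    """Chain preprocessing steps and a model so training is one repeatable call."""

    def __init__(self, preprocessors, model):
        self.preprocessors = preprocessors
        self.model = model

    def fit(self, xs, ys):
        for step in self.preprocessors:
            xs = step.fit(xs).transform(xs)
        self.model.fit(xs, ys)
        return self

    def predict(self, xs):
        # Reuse the parameters learned during fit; never refit on new data here.
        for step in self.preprocessors:
            xs = step.transform(xs)
        return self.model.predict(xs)


# Toy training data (assumed for illustration only).
train_x = [1.0, 2.0, 3.0, 7.0, 8.0, 9.0]
train_y = [0, 0, 0, 1, 1, 1]

pipeline = TrainingPipeline([Standardizer()], ThresholdClassifier()).fit(train_x, train_y)
predictions = pipeline.predict([2.5, 7.5])
print(predictions)
```

In practice a library pipeline abstraction would usually replace these hand-written classes; the sketch only shows why bundling the steps makes training repeatable.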
17. Project management methodology (CRISP-ML(Q))
Evaluating Machine Learning Models
Model training is followed by a model evaluation phase, also known as offline testing.
During this phase, the performance of the trained model needs to be validated on a
test set.
Additionally, the model's robustness should be assessed using noisy or wrong input data.
Finally, the model deployment decision should be made automatically based on success
criteria or manually by domain and ML experts. As in the modeling phase, all
outcomes of the evaluation phase need to be documented.
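The evaluation steps described above (test-set validation, a robustness check on noisy input, and an automatic deployment decision) could look roughly like the sketch below. The thresholds, the Gaussian noise model, and the function names `accuracy`, `add_noise`, and `deployment_decision` are all assumptions chosen for illustration; real success criteria come from the business and data understanding phase.

```python
import random


def accuracy(model, xs, ys):
    """Fraction of test examples the model classifies correctly."""
    correct = sum(1 for x, y in zip(xs, ys) if model(x) == y)
    return correct / len(xs)


def add_noise(xs, scale, seed=0):
    """Perturb each input with Gaussian noise (seeded so the check is repeatable)."""
    rng = random.Random(seed)
    return [x + rng.gauss(0.0, scale) for x in xs]


def deployment_decision(model, test_x, test_y,
                        min_accuracy=0.9, min_noisy_accuracy=0.8, noise_scale=0.25):
    """Return a small report that documents the evaluation and the go/no-go decision."""
    clean = accuracy(model, test_x, test_y)
    noisy = accuracy(model, add_noise(test_x, noise_scale), test_y)
    return {
        "clean_accuracy": clean,
        "noisy_accuracy": noisy,
        "deploy": clean >= min_accuracy and noisy >= min_noisy_accuracy,
    }


# Hypothetical held-out test set and two candidate models.
test_x = [1.0, 2.0, 3.0, 7.0, 8.0, 9.0]
test_y = [0, 0, 0, 1, 1, 1]


def good_model(x):
    return 1 if x > 5 else 0


def constant_model(x):
    return 0  # always predicts class 0


good_report = deployment_decision(good_model, test_x, test_y)
bad_report = deployment_decision(constant_model, test_x, test_y)
print(good_report["deploy"], bad_report["deploy"])
```

Returning a report rather than a bare boolean keeps a record of the evaluation outcome, in line with the documentation requirement stated above.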
Deployment: the process of integrating the ML model into the existing software system.
Monitoring and Maintenance
https://ml-ops.org/content/crisp-ml
18. Project management methodology (CRISP-ML(Q))
Deployment:
ML model deployment is the process of integrating the ML model
into the existing software system.
After passing the evaluation step of the ML development life cycle, the
ML model is promoted to the (pre-)production environment.
ML model deployment includes the following tasks: defining the inference
hardware, evaluating the model in a production environment (online testing, e.g.,
A/B tests), user acceptance and usability testing, providing a fallback
plan for model outages, and setting up a deployment strategy to roll
out the new model gradually (e.g., canary or blue/green deployment).
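A gradual rollout such as a canary deployment can be illustrated with a small routing function. This is a sketch under assumptions: the names `route_request`, `new_model`, and `current_model`, the user ID format, and the 10% canary fraction are invented for the example, and real systems usually do this in a load balancer or serving layer rather than in application code.

```python
import hashlib


def route_request(user_id: str, canary_fraction: float) -> str:
    """Send a stable fraction of users to the new model and the rest to the current one."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    # Map the user to a stable 64-bit bucket so the same user always gets the same model.
    bucket = int.from_bytes(digest[:8], "big")
    return "new_model" if bucket < canary_fraction * 2**64 else "current_model"


# Hypothetical user population used to inspect the traffic split.
users = [f"user-{i}" for i in range(10_000)]
canary_share = sum(route_request(u, 0.10) == "new_model" for u in users) / len(users)
print(f"share of users on the new model: {canary_share:.3f}")
```

Because the bucket depends only on the user ID, raising `canary_fraction` keeps existing canary users on the new model and only adds more users, which is what makes the rollout gradual.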
19. Project management methodology (CRISP-ML(Q))
Monitoring and Maintenance
Once the ML model has been put into production, it is essential to monitor its performance
and maintain it.
When an ML model runs on real-world data, the main risk is the “model staleness”
effect, in which the performance of the ML model drops as it starts operating on unseen data.
Furthermore, model performance is affected by hardware performance and the existing
software stack.
Therefore, the best practice for preventing a drop in model performance is a
monitoring task in which model performance is continuously evaluated to decide
whether the model needs to be re-trained.
This is known as the Continued Model Evaluation pattern.
The decision from the monitoring task leads to the second task: updating the ML model.
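The continued evaluation idea above can be sketched as a rolling-window monitor that flags when retraining is needed. The class name `PerformanceMonitor`, the window size, and the accuracy threshold are assumptions for illustration; the sketch also assumes that true labels eventually become available for production predictions.

```python
from collections import deque


class PerformanceMonitor:
    """Track accuracy over the most recent predictions and flag when retraining is needed."""

    def __init__(self, window_size=100, min_accuracy=0.8):
        self.outcomes = deque(maxlen=window_size)  # True where the prediction was correct
        self.min_accuracy = min_accuracy

    def record(self, prediction, actual):
        self.outcomes.append(prediction == actual)

    def rolling_accuracy(self):
        if not self.outcomes:
            return None
        return sum(self.outcomes) / len(self.outcomes)

    def needs_retraining(self):
        # Wait until the window is full so a few early errors do not trigger retraining.
        if len(self.outcomes) < self.outcomes.maxlen:
            return False
        return self.rolling_accuracy() < self.min_accuracy


monitor = PerformanceMonitor(window_size=10, min_accuracy=0.8)

# The model starts out correct on every labelled production example.
for _ in range(10):
    monitor.record(prediction=1, actual=1)
healthy = monitor.needs_retraining()

# Later the data changes and the model starts making mistakes.
for _ in range(5):
    monitor.record(prediction=1, actual=0)
stale = monitor.needs_retraining()

print(healthy, stale, monitor.rolling_accuracy())
```

A fixed-size window is one simple design choice; it makes the monitor react to recent behaviour instead of averaging over the model's whole lifetime.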
20. Reference
https://ml-ops.org/content/crisp-ml