1. Weather Patterns Analysis and Prediction using Machine Learning
This is a project report on the pattern analysis of the provided weather dataset.
We'll look at the key features of the dataset and at some predictions made
using machine learning models.
by Yash Saxena
2. Project Overview
Objective
Predict weather conditions (e.g., rain presence)
Identify weather patterns and clusters
Use machine learning algorithms for prediction
Dataset Overview
The dataset covers several aspects of weather, including temperature,
weather condition, and rain presence. It has 730 instances, 8 features,
and 1 key column. The dataset contains weather data from 1st Jan 2015
to 31st March 2017.
3. Exploratory Data Analysis (EDA)
1 Temperature
Lowest: 12 °C
Highest: 45 °C
Average: 30.78 °C
Most frequent: 35 °C
Occurring 70 times
2 Humidity
Lowest: 6%
Highest: 100%
Average: 36.34%
Most frequent: 31%
Occurring 30 times
3 Pressure
Lowest: 994 hPa
Highest: 1026 hPa
Average: 1007.742 hPa
Most frequent: 1014 hPa
4 Dew Point
Lowest: 1 °C
Highest: 28 °C
Average: 16.64 °C
Most frequent: 12 °C
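The summary figures above (lowest, highest, average, most frequent value and its count) can be reproduced for any numeric column with a few lines of Python. This is only a minimal sketch: the report does not say how the figures were computed, and the function name `summarize` and the sample values are made up for illustration.

```python
from collections import Counter
from statistics import mean


def summarize(values):
    """Return lowest, highest, average, and the most frequent value with its count."""
    most_common_value, count = Counter(values).most_common(1)[0]
    return {
        "lowest": min(values),
        "highest": max(values),
        "average": round(mean(values), 2),
        "most_frequent": most_common_value,
        "occurrences": count,
    }


# Hypothetical sample of daily temperatures in °C (not the project data).
sample_temperatures = [12, 35, 35, 28, 45, 35, 30]
summary = summarize(sample_temperatures)
print(summary)
```

The same function applies unchanged to humidity, pressure, or dew point columns.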
4. Methodology: Learning Algorithms Used
K-NN Classification
The K-Nearest Neighbors algorithm calculates the distance between
the query point and every data point using a chosen metric (e.g.,
Euclidean distance), then assigns the majority class among the k
nearest neighbours. Here, k is the number of neighbours taken into
consideration; it is supplied by the user.
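The K-NN procedure described above can be sketched in plain Python. This is an illustration only, not the implementation used in the project; the function name `knn_predict`, the two features, and the sample rows are assumptions made for the example.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Euclidean distance from the query to every training point.
    distances = [
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    ]
    distances.sort(key=lambda pair: pair[0])
    nearest = distances[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0], nearest


# Hypothetical rows: (temperature in °C, humidity in %); label 1 = rain, 0 = no rain.
points = [(30, 20), (32, 25), (29, 22), (24, 90), (25, 95)]
labels = [0, 0, 0, 1, 1]
prediction, neighbours = knn_predict(points, labels, query=(31, 24), k=3)
print(prediction)
print(neighbours)
```

Because Euclidean distance mixes all features, columns measured on very different scales (for example pressure in hPa and humidity in %) are commonly normalized before distances are computed.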
K-Means Clustering
K-Means Clustering aims to partition the data into k clusters. It
chooses k centroids, assigns each point to its nearest centroid,
and then recomputes each centroid's location from its members.
The process repeats until the centroid locations no longer
change.
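The assign-and-recompute loop described above can be sketched as follows. This is a minimal illustration, not the implementation used in the project; the function name `k_means`, the 2-dimensional sample points, and the choice of the first two rows as starting centroids are assumptions made for the example.

```python
import math


def k_means(points, initial_centroids, max_iterations=100):
    """Assign points to the nearest centroid, recompute centroids, repeat."""
    centroids = [tuple(c) for c in initial_centroids]
    for _ in range(max_iterations):
        # Assignment step: index of the nearest centroid for every point.
        assignments = [
            min(range(len(centroids)), key=lambda i: math.dist(p, centroids[i]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for i, old in enumerate(centroids):
            members = [p for p, a in zip(points, assignments) if a == i]
            if members:
                new_centroids.append(
                    tuple(sum(axis) / len(members) for axis in zip(*members))
                )
            else:
                new_centroids.append(old)  # an empty cluster keeps its centroid
        if new_centroids == centroids:  # no centroid moved: converged
            break
        centroids = new_centroids
    return centroids, assignments


# Hypothetical 2-D points (not the project data); first two rows seed the centroids.
points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
centroids, assignments = k_means(points, initial_centroids=points[:2])
print(centroids)
print(assignments)
```

Passing the starting centroids explicitly, rather than picking them at random, makes the run reproducible.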
5. Results: Model Performance
K-NN Classification
With k = 3, the rain presence predicted for 06 Jan 2019 was 0,
indicating no rain. This output came from the following three
nearest neighbours:
Date      14-Jan-2016  28-Jan-2017  29-Jan-2017
Distance  4.034        4.585        2.833
K-Means Clustering
The K-Means algorithm successfully identified distinct clusters
within the weather dataset, with the records for 01-Jan-2015 and
02-Jan-2015 as the two centroids.
6. Insights and Learnings
Trends
The analysis showed that lower
humidity was associated with
higher visibility.
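One common way to quantify a relationship like this is the Pearson correlation coefficient, where a value close to -1 indicates that one quantity falls as the other rises. The sketch below is an illustration only: the report does not state how the trend was measured, and the readings and the visibility unit (km) are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical readings (not the project data): humidity in %, visibility in km.
humidity = [20, 35, 50, 70, 90]
visibility = [10, 8, 7, 4, 2]
r = pearson(humidity, visibility)
print(round(r, 3))
```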
Insights
The analysis also showed a steady
temperature increase of 0.88 °C
per year. This raises concern about
global warming. Combined with
other natural phenomena such as
El Niño, an excessive rise in
temperature is a real possibility.
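A per-year rate of change like the one quoted above can be estimated as the slope of a least-squares line through yearly average temperatures. The sketch below is an illustration only: the report does not say how the 0.88 °C figure was derived, and the yearly averages used here are hypothetical.

```python
def least_squares_slope(xs, ys):
    """Slope of the ordinary least-squares line fitted through (xs, ys)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    denominator = sum((x - mean_x) ** 2 for x in xs)
    return numerator / denominator


# Hypothetical yearly average temperatures in °C (not the project data).
years = [2015, 2016, 2017]
yearly_mean_temperature = [30.0, 31.0, 32.0]
slope = least_squares_slope(years, yearly_mean_temperature)
print(slope)  # change in °C per year for the sample values
```

With only a few years of data, a slope like this describes the sample rather than a long-term climate trend.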
Impact
Machine learning techniques proved effective in analyzing and predicting
weather patterns, demonstrating the potential of data science and AI in
advancing weather forecasting and improving decision-making.
7. Challenges and Recommendations
1 Challenges
The biggest challenges were applying clustering to 7-dimensional data and assigning custom centroid points for
K-Means clustering.
2 Recommendations
Future projects could benefit from using more sophisticated machine learning models,
such as deep learning algorithms, to capture complex weather patterns and improve
predictive accuracy.
3 Recommendations
A more varied dataset would likely have improved model accuracy
and predictions.
8. Conclusion
This project demonstrated the successful application of machine learning
techniques for analyzing and predicting weather patterns. The results
suggest that AI and machine learning can become the future of data analysis.
AI can also be applied in other fields, such as stock market prediction,
generative AI, and many more.
9. References
Google Sheets
LibreOffice Impress (for the presentation)
Gamma app (presentation template)
Orange (machine learning algorithms)