Dynamic Weather Data Scraping
Solution Brief
We need a system that fetches historical temperature information at day level with
minimal or no manual effort.
Weather data should be contextual, taking into account inputs such as Date, City and State.
The data should be accurate and include the Day Minimum and Day Maximum Temperature,
computed as the averages of the Hourly Minimum and Hourly Maximum readings. The system
also handles hours for which no weather data is available.
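The daily aggregation described above can be sketched as follows. The brief shows no code (the original work was done in R), so this Python sketch is only an illustration. It assumes that each hour supplies a minimum and a maximum reading, that a missing reading is represented as None, and that missing hours are simply skipped when averaging. The function name daily_min_max and the sample readings are hypothetical.

```python
from statistics import mean
from typing import List, Optional, Tuple

# Hypothetical hourly record: (hourly_min, hourly_max); None marks a missing reading.
HourlyReading = Tuple[Optional[float], Optional[float]]


def daily_min_max(hourly: List[HourlyReading]) -> Tuple[Optional[float], Optional[float]]:
    """Average the available hourly minimums and maximums for one day.

    Hours with no data are skipped; if no reading is available at all,
    the corresponding daily value is None.
    """
    mins = [lo for lo, _ in hourly if lo is not None]
    maxs = [hi for _, hi in hourly if hi is not None]
    day_min = mean(mins) if mins else None
    day_max = mean(maxs) if maxs else None
    return day_min, day_max


# Illustrative values only: one hour is missing entirely, one has no maximum.
readings = [(50.0, 54.0), (None, None), (52.0, 58.0), (54.0, None)]
print(daily_min_max(readings))  # (52.0, 56.0)
```

Skipping missing hours keeps the daily figure computable from partial data; whether the original system skips, interpolates, or flags such hours is not stated in the brief.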
Solution Approach
Our solution approach comprises several distinct components:
1. Finding the State ids in the master data from the state name provided in the input data
2. Searching all the Station ids and matching them to the right City Name
3. Fetching the hourly temperatures and averaging them to get the daily minimum and maximum
4. Handling a few NA (missing value) issues
5. Maintaining a master sheet of State ids and Station ids for future scalability
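The lookup steps in the list above (state id, then station id by city) might look roughly like the Python sketch below. The layout of the master data is not shown in the brief, so the dictionaries, column names, and id values here are all hypothetical placeholders.

```python
from typing import Dict, List, Optional

# Hypothetical master data; the real sheet's columns and id formats are assumptions.
STATE_IDS: Dict[str, str] = {"texas": "TX01", "ohio": "OH01"}
STATIONS: List[Dict[str, str]] = [
    {"station_id": "ST100", "state_id": "TX01", "city": "Austin"},
    {"station_id": "ST200", "state_id": "TX01", "city": "Dallas"},
    {"station_id": "ST300", "state_id": "OH01", "city": "Columbus"},
]


def find_state_id(state_name: str) -> Optional[str]:
    """Step 1: look up the state id by full state name, ignoring case and spaces."""
    return STATE_IDS.get(state_name.strip().lower())


def find_station_id(city: str, state_name: str) -> Optional[str]:
    """Step 2: scan the stations of the matched state for the requested city."""
    state_id = find_state_id(state_name)
    if state_id is None:
        return None  # unknown state: nothing to search
    wanted = city.strip().lower()
    for station in STATIONS:
        if station["state_id"] == state_id and station["city"].lower() == wanted:
            return station["station_id"]
    return None  # state found, but no station matches the city


print(find_station_id("Austin", "Texas"))
```

Keeping both tables in one master sheet means new cities can be supported by adding rows rather than changing code, which is the scalability point made in step 5.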
Future Goals:
1. Exposing it to our Dashboard through an API
2. Expanding the master data file to incorporate other countries (currently it has all the
major cities of the United States); this will also require unit-of-measure (UOM) handling
3. Fetching data when State Abbreviations are provided rather than just the full name
4. Putting many more validations in place
5. Testing with false or abnormal inputs
6. Storing the master file in the cloud so that it becomes platform independent
7. Searching by city only when no state name is given (the issue is that some city
names exist in multiple states)
8. Code optimization, as the current system will slow down the dashboard
9. Including Wind Speed and Humidity data along with temperature
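Goals 3 and 7 (state abbreviations and city-only search) could be approached along the lines of the Python sketch below. It is not part of the current system: the function names and the small sample tables are hypothetical, and raising an error for an ambiguous city is just one possible policy.

```python
from typing import List, Optional

# Small illustrative subset of US state abbreviations.
STATE_ABBREVIATIONS = {"TX": "Texas", "OH": "Ohio", "IL": "Illinois", "MO": "Missouri"}

# Hypothetical (city, state) pairs standing in for the master data.
CITY_STATES = [("Springfield", "Illinois"), ("Springfield", "Missouri"), ("Austin", "Texas")]


def normalize_state(state: str) -> Optional[str]:
    """Accept either an abbreviation or a full name; return the full name or None."""
    cleaned = state.strip()
    full = STATE_ABBREVIATIONS.get(cleaned.upper())
    if full is not None:
        return full
    for name in STATE_ABBREVIATIONS.values():
        if name.lower() == cleaned.lower():
            return name
    return None


def states_for_city(city: str) -> List[str]:
    """Return every state in the sample data that contains the given city."""
    wanted = city.strip().lower()
    return sorted(state for c, state in CITY_STATES if c.lower() == wanted)


def resolve_state(city: str, state: Optional[str] = None) -> str:
    """Resolve the state for a city, refusing to guess when the city is ambiguous."""
    if state:
        normalized = normalize_state(state)
        if normalized is None:
            raise ValueError(f"Unknown state: {state!r}")
        return normalized
    candidates = states_for_city(city)
    if len(candidates) == 1:
        return candidates[0]
    if not candidates:
        raise ValueError(f"Unknown city: {city!r}")
    raise ValueError(f"City {city!r} is ambiguous: {candidates}")


print(resolve_state("Austin"))        # Texas
print(resolve_state("Austin", "tx"))  # Texas
```

Failing loudly on an ambiguous city, rather than picking the first match, also supports goals 4 and 5 (validation and testing with abnormal inputs).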
Sample Master Data:
Sample Input Data:
Sample Output:
Snapshot of R Environment:
