Data Science with Python
Student Project Presentation
WeCloudData
Web Crawler Project - Aritzia
By Shan Gao (WeCloudData)
Content
• Motivation
• Data
• Interesting Findings
• Conclusion
• Challenges
Info & Motivation
• Type: Public
• Traded as: TSX: ATZ
• Industry: Fashion
• Founded: 1984
• Founder: Brian Hill
• Headquarters: Vancouver, British Columbia, Canada
• Products: Clothing
Website
Dataframe
• Total data: 856,452
• Date range: 2019-06-08 21:53:50 to 2019-06-20 13:57:19
• Number of files: 30
crawler.py
Interesting Findings
• Categories & Brands
• Price Distribution
• Top 20 Colors
• Weekdays vs. Weekend - Avg Stock
• On-Sale Event - Discount %
• Price Change vs. Stock Correlation
Category Distribution
Brand Distribution Vs. Brand Average Price
Top 20 Colors
Weekdays Vs Weekend - Avg Stock
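A weekday/weekend comparison of this kind can be sketched with pandas as follows. This is a minimal illustration, not the project's actual code: the column names `crawled_at` and `stock` and the sample values are assumptions.

```python
import pandas as pd

# Hypothetical crawl snapshots; column names and values are illustrative only.
df = pd.DataFrame({
    "crawled_at": pd.to_datetime([
        "2019-06-08 21:53:50",  # Saturday
        "2019-06-09 10:00:00",  # Sunday
        "2019-06-10 10:00:00",  # Monday
        "2019-06-11 10:00:00",  # Tuesday
    ]),
    "stock": [10, 8, 20, 22],
})

# dayofweek is Monday=0 ... Sunday=6, so values >= 5 are weekend days.
is_weekend = df["crawled_at"].dt.dayofweek.ge(5)
df["day_type"] = is_weekend.map({True: "weekend", False: "weekday"})

# Average stock level per day type.
avg_stock = df.groupby("day_type")["stock"].mean()
print(avg_stock)
```

A lower average stock on weekend snapshots would be consistent with more purchases on those days, although restocking schedules could produce the same pattern.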
SALE !
Discount % of Each Brand
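One way to compute a per-brand discount percentage is sketched below. The columns `regular_price` and `sale_price`, the brand names, and the choice of a simple mean per brand are all assumptions; the slides do not show how the figure was actually derived.

```python
import pandas as pd

# Hypothetical sale items; brands and prices are placeholders.
df = pd.DataFrame({
    "brand": ["Brand X", "Brand X", "Brand Y"],
    "regular_price": [100.0, 200.0, 80.0],
    "sale_price": [70.0, 150.0, 60.0],
})

# Discount as a percentage of the regular price, per item.
df["discount_pct"] = (
    (df["regular_price"] - df["sale_price"]) / df["regular_price"] * 100
)

# Mean discount per brand, rounded for display.
brand_discount = df.groupby("brand")["discount_pct"].mean().round(1)
print(brand_discount)
```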
Top 10 products – stock change
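A ranking like this could be produced by comparing each product's first and last observed stock level. The sketch below assumes hypothetical `product`, `crawled_at`, and `stock` columns and defines "stock change" as last minus first; the deck does not state the definition it used.

```python
import pandas as pd

# Hypothetical snapshots for three products at the start and end of the crawl.
df = pd.DataFrame({
    "product": ["A", "A", "B", "B", "C", "C"],
    "crawled_at": pd.to_datetime(["2019-06-08", "2019-06-20"] * 3),
    "stock": [50, 20, 30, 28, 10, 15],
})

# Sort by time so "first" and "last" mean earliest and latest snapshot.
df = df.sort_values("crawled_at")
per_product = df.groupby("product")["stock"].agg(["first", "last"])
per_product["stock_change"] = per_product["last"] - per_product["first"]

# Largest decreases first; head(10) keeps at most ten products.
top = per_product.sort_values("stock_change").head(10)
print(top)
```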
Conclusion
• Business-casual clothing is priced higher than other categories
• More purchases happen on weekends
• Sale events: good deals on popular brands
• Promotions influence stock changes
Challenges
• Saving data as a tree structure (.json)
• Loading the data
• Moving root-node properties to child nodes
• Data analysis using pandas
• Visualization with Plotly (multiple chart types)
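The step of moving root-node properties onto child nodes can be sketched with the standard library alone. The JSON layout and field names below (`crawled_at`, `category`, `products`) and the helper `flatten_snapshot` are hypothetical, chosen only to illustrate the idea; the project's real schema is not shown in the slides.

```python
import json

# Hypothetical tree-shaped snapshot: one root node per crawl, products nested
# below it. Field names are illustrative, not the project's real schema.
snapshot_json = """
{
  "crawled_at": "2019-06-08 21:53:50",
  "category": "dresses",
  "products": [
    {"name": "Dress A", "brand": "Brand X", "price": 120.0, "stock": 14},
    {"name": "Dress B", "brand": "Brand Y", "price": 98.0, "stock": 3}
  ]
}
"""

def flatten_snapshot(root, children_key="products"):
    """Copy every root-level property onto each child and return flat rows."""
    root_props = {k: v for k, v in root.items() if k != children_key}
    return [{**root_props, **child} for child in root.get(children_key, [])]

rows = flatten_snapshot(json.loads(snapshot_json))
for row in rows:
    print(row)
```

Each row is then a flat dictionary, so a list of rows collected across files can be passed to `pandas.DataFrame(rows)` for analysis.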
Next Step
• Detailed size distribution of brands / products
• Influence of discount depth
• Stock refill timing
• Long-term data analysis (winter vs. summer)
Web scraping project   aritza-compressed
