Giuliana Benedetti - Can Magento handle 1M products?
About me
• Project Manager @ Webformat
• Magento and TYPO3 projects
• Requirements analysis
• Planning of development and support activities
What’s in the menu?
• Huge catalog
• What we did
• What we are doing
Once upon a time..
• The project began as a migration from a proprietary platform to Magento 1 Community
• Shoes and accessories e-commerce
• We developed the integration with the client's management software, which handles product master data, warehouse stock and order master data
• Integration with Amazon and eBay
Products Database
• The original products database counted around 150k products
• Configurable products
• On average, 10 simple products for each configurable
• Virtual products
Continued Growth
• In only one year we reached 700K products stored in Magento
• 66k configurable products
Challenges faced
Alignment between catalog and management software
Updating warehouse
Reindexing
Generating images
Server response time
Backoffice operations
Third-party modules integration
Export of feed for Google shopping & Co.
Marketplace synchronization
Disk space
Updating the catalog (1/2)
• Initially 150k products, this is what we planned:
• Massive initial import
• Frequent update during the day via webservice
• When the catalog started growing, the data exchange volumes via webservice became unsustainable. The exchange procedure needed a redesign.
Updating the catalog (2/2)
• Today we have 700k products
• Based on Magmi and CSV file exchange (product master data)
• Nighttime update: only the DIFF
• Whole-catalog update only in exceptional cases
• The client accepted that new products are published with a delay of 1 day
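The nightly DIFF can be illustrated with a minimal sketch, assuming two CSV snapshots keyed by a `sku` column. The column names and sample rows are hypothetical, and the talk does not say how the DIFF was actually computed; deleted products are not handled here.

```python
import csv
import io


def read_snapshot(csv_text, key="sku"):
    """Index a CSV export by SKU so two snapshots can be compared."""
    return {row[key]: row for row in csv.DictReader(io.StringIO(csv_text))}


def catalog_diff(previous, current):
    """Return only the rows that are new or changed since the previous export."""
    return [row for sku, row in current.items() if previous.get(sku) != row]


# Hypothetical sample data: A1 changes price, A2 is unchanged, A3 is new.
yesterday = read_snapshot("sku,name,price\nA1,Boot,90\nA2,Sandal,40\n")
today = read_snapshot("sku,name,price\nA1,Boot,85\nA2,Sandal,40\nA3,Belt,25\n")

changed = catalog_diff(yesterday, today)
print([row["sku"] for row in changed])  # ['A1', 'A3']
```

Only the changed rows would then be written to the import file, which keeps the nightly run proportional to the day's changes rather than to the catalog size.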
Warehouse update (1/2)
• No warehouse fully dedicated to web
• Shared with the offline shops
• It’s not possible to update the stock only at night and rely on those figures during the day
• Frequent updates
Warehouse update (2/2)
• Every 15 minutes → update from the management software by loading the DIFF
• Only stock update
• Via CSV file, written directly to the database (Magmi)
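A stock-only file of this kind might look like the sketch below. The column names `sku`, `qty` and `is_in_stock` are an assumption about the import profile, not something stated in the talk.

```python
import csv
import io


def stock_rows(stock_changes):
    """Turn {sku: quantity} into stock-only rows; no other attribute is touched."""
    for sku, qty in sorted(stock_changes.items()):
        yield {"sku": sku, "qty": qty, "is_in_stock": 1 if qty > 0 else 0}


def write_stock_csv(stock_changes):
    """Serialize the stock DIFF as CSV text (assumed column layout)."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=["sku", "qty", "is_in_stock"], lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(stock_rows(stock_changes))
    return buffer.getvalue()


# Hypothetical DIFF from the management software.
csv_text = write_stock_csv({"A2": 0, "A1": 12})
print(csv_text)
```

Keeping the file limited to stock columns means the 15-minute cycle never rewrites names, prices or images.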
Reindex (1/2)
• The bigger the catalog, the slower the reindex
• Initially, the reindex was launched after each update (every 15 min)
• After a while, the reindex became too time-consuming: a new update cycle was starting while the reindex of the previous update was still running.
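One generic way to keep cycles from overlapping is a lock file, sketched below. This is not the solution the team adopted, and the lock path is hypothetical; a stale lock left behind by a crashed run is not handled.

```python
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def single_run(lock_path):
    """Yield True if this run acquired the lock, False if another run holds it."""
    try:
        # O_EXCL makes creation fail if the lock file already exists.
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        yield False
        return
    try:
        yield True
    finally:
        os.close(fd)
        os.remove(lock_path)


lock = os.path.join(tempfile.mkdtemp(), "reindex.lock")
results = []
with single_run(lock) as first:
    results.append(first)
    with single_run(lock) as second:  # simulates a cycle starting too early
        results.append(second)
with single_run(lock) as third:  # the previous run has finished
    results.append(third)
print(results)  # [True, False, True]
```

A cycle that receives `False` would simply skip its reindex and leave the work to the next run.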
Reindex (2/2)
• Solution:
• All reindexes have been disabled during the day, except for the stock reindex
• The full reindex is now performed after the nighttime import
• Today a full reindex takes around 75 minutes and generates a heavy load on the database
Catalog_url_rewrite (1/2)
• Magento 1 has a weak spot in the URL rewrite process:
• URLs are rewritten for all products, including simple products that are «Not visible individually» and exist only to be associated with a configurable.
• With a 700k-product catalog, this meant:
• Creating millions of rows in the catalog_url_rewrite table
• A URL rewrite process that takes hours to complete
Catalog_url_rewrite (2/2)
• A patch has been installed to skip URL generation for simple products that are not visible individually
• Module Dnd_Patchindexurl:
https://www.magentocommerce.com/magento-connect/dn-d-patch-index-url-1.html
• Now the reindex process takes around 20 minutes
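The idea behind skipping those products can be sketched as a simple filter. This is an illustration only, not the module's code; the product dictionaries are hypothetical, and the only Magento fact relied on is that visibility value 1 means «Not visible individually».

```python
NOT_VISIBLE_INDIVIDUALLY = 1  # Magento 1 visibility value


def needs_url_rewrite(product):
    """A product needs its own URL only if shoppers can reach it directly."""
    return product["visibility"] != NOT_VISIBLE_INDIVIDUALLY


def count_rewrite_targets(products):
    """Count the products for which URL rewrites would still be generated."""
    return sum(1 for product in products if needs_url_rewrite(product))


# Hypothetical sample: one configurable with ten hidden simple products,
# matching the average ratio mentioned in the talk.
catalog = [{"sku": "SHOE", "visibility": 4}] + [
    {"sku": f"SHOE-{size}", "visibility": NOT_VISIBLE_INDIVIDUALLY}
    for size in range(36, 46)
]

print(len(catalog), count_rewrite_targets(catalog))  # 11 1
```

With roughly ten hidden simple products per configurable, most of the catalog drops out of the rewrite process.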
Images generation (1/2)
• One of the main problems we had to face was product thumbnail generation, done by ImageMagick
• Every day hundreds of products are published
→ We verified that the frontend CPUs were often under stress because of the ImageMagick process and the write operations on the database
Images generation (2/2)
• We found a solution in generating the thumbnails during the massive import, so ImageMagick could work together with the import procedure
• At night, the images are generated and saved on a dedicated server, without interfering with user navigation
• Today we have around 881K images saved
Server response time
• With such a huge catalog, some categories hold hundreds of products
• The first load of these pages (when they are not cached) is slow
• We activated caching on Redis and Varnish
• That was not enough: the first load was still too slow
Solutions 1/2
• Moved the cache clearing process to the night
• At 8 in the morning, website navigation was starting to suffer
• We scheduled a job to pre-cache all the critical pages
• Minimized cache invalidation
• Clear the cache only for products whose stock quantity was updated via WS
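A pre-caching job of this kind can be sketched as a loop that requests each critical page once. The URLs are hypothetical, and the `fetch` parameter is an illustrative hook so the logic can be exercised without network access; the talk does not describe the actual job.

```python
from urllib.request import urlopen


def warm_cache(urls, fetch=None):
    """Request each URL once so the cache is populated before traffic arrives."""
    fetch = fetch or (lambda url: urlopen(url, timeout=30).status)
    results = {}
    for url in urls:
        try:
            results[url] = fetch(url)
        except OSError as error:  # URLError and HTTPError derive from OSError
            results[url] = str(error)
    return results


# Hypothetical list of critical pages, warmed with a stand-in fetcher.
critical_pages = [
    "https://shop.example.com/",
    "https://shop.example.com/women/shoes",
]
report = warm_cache(critical_pages, fetch=lambda url: 200)
print(report)
```

Scheduling such a job right after the nightly cache clear means the first morning visitors hit warm pages.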
Solutions 2/2
• Client training to better handle cache clearing
• Minimized the number of filters in layered navigation
• Each filter increases the reindex time and the number of page combinations that are not cached
Backoffice operations
• Initially all the catalog update activities were performed from
Magento backoffice
• Problems:
• Frequent reindexes
• Frequent cache updates
• Server load (the backoffice product list filters are CPU demanding and they load MySQL)
• Common operations were slowed down
• Several BE users ended up working concurrently
Solutions
• Initially a new backoffice server was introduced
• The MySQL load problem was not solved, and neither were reindexing and re-caching.
• We introduced a new process to handle the catalog, using an Excel file
• This improved the efficiency of the people managing the master data
• Massive Excel file import performed every 3 days via FTP
• Categories still handled from the backoffice
Third party modules integration
• Critical point
• Not all the modules found in the Marketplace are developed in an optimal way
• They «simply» load the products collection without pagination
• They execute nested queries
• There are loops over collections that initialize all products unnecessarily
• …
• Substantial profiling and optimization work was needed
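The pagination those modules were missing can be sketched generically. `fetch_page` is a hypothetical callable standing in for a paged database query; it is not a Magento API.

```python
def iter_batches(fetch_page, page_size=500):
    """Yield the catalog one page at a time instead of loading it all at once."""
    page = 1
    while True:
        rows = fetch_page(page, page_size)
        if not rows:
            break
        yield rows
        if len(rows) < page_size:
            break  # a short page means this was the last one
        page += 1


# Stand-in data source: 1200 fake SKUs served in slices.
skus = [f"SKU-{n}" for n in range(1200)]


def fetch_page(page, size):
    return skus[(page - 1) * size : page * size]


batch_sizes = [len(batch) for batch in iter_batches(fetch_page)]
print(batch_sizes)  # [500, 500, 200]
```

Memory use then depends on the page size rather than on the 700k-product catalog.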
Feed export (Google Shopping & Co.) 1/2
• As the catalog grew, the feed export time increased as well
• In the very beginning, the exports were handled by a Magento module
Feed export (Google Shopping & Co.) 2/2
• Solution steps:
• The module has been replaced with highly optimized ad-hoc procedures
• The export jobs are executed on the backoffice server during the night, so as not to load the frontend
• A MySQL slave has been introduced as the data source, so as not to load the master and, as a consequence, the website
Marketplace synch
• We are using M2E Pro
• Client side: EAN code full check
• Tech side: handling the automatic synchronization process
• An automatic full synchronization is too heavy. When to synchronize?
• What to synchronize?
• Magmi
Disk space (1/2)
• Well, here we are: even if disk space is quite cheap, using too much of it is not convenient
• Very heavy data exchange logs
• Frequent data exchange and huge amounts of data
• Log files were growing fast
• Log rotation was activated hourly
• Logs are archived after a few days
Disk space (2/2)
• High image quantity, continuously growing
• Huge feed export
• Huge CSV import files
• …
• Solutions applied:
• Constant monitoring activated
• Activated automatic procedures to clean up logs, old images, expired feeds, etc.
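An automatic clean-up procedure can be sketched as an age-based sweep of a directory. The retention period and file names are hypothetical; the talk does not give the actual rules.

```python
import os
import tempfile
import time


def remove_older_than(directory, max_age_days, now=None):
    """Delete regular files whose modification time is older than the cutoff."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    removed = []
    for entry in list(os.scandir(directory)):
        if entry.is_file() and entry.stat().st_mtime < cutoff:
            os.remove(entry.path)
            removed.append(entry.name)
    return sorted(removed)


# Demonstration in a temporary directory with one stale and one fresh file.
workdir = tempfile.mkdtemp()
for name in ("old.log", "new.log"):
    with open(os.path.join(workdir, name), "w") as handle:
        handle.write("data exchange log\n")
ten_days_ago = time.time() - 10 * 86400
os.utime(os.path.join(workdir, "old.log"), (ten_days_ago, ten_days_ago))

removed = remove_older_than(workdir, max_age_days=7)
print(removed)  # ['old.log']
```

The same sweep could be pointed at feed exports or import files, each with its own retention period.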
Challenges to be faced
Elasticsearch integration
Growing catalog, up to 1M products
More sales, more page views
Magento 2 migration
Elasticsearch
• For two reasons:
• Improve the search functionality offered to the client
• Minimize the load produced by the Magento internal search engine
• Critical issues to be faced:
• Catalog index time
• Only configurable products?
• What about the sizes?
1M products
• Expected growth: in 1 year we’ll have 1M products
• At the moment we are performing tests with fake products
• We didn’t detect other critical aspects
• So far, we have had to further optimize some data exchange and feed generation procedures
More sales, more page views
• Sessions are increasing → the number of non-cached page views is increasing
• Pre-caching extension
• Increasing Varnish cache TTL
• Minimize products in categories and filters used
• Sales are increasing → the frequency of out-of-stock products is increasing too
• To be evaluated: the impact of new reindex and re-caching policies on the client
What if..?
• We’re planning a Magento 2 migration with the client
• We started our tests by migrating the current Magento 1 environment (700K products) to a Magento 2 installation
• We collected the results and are still performing some other tests
HW specs
All tests were run on a VirtualBox VM with Linux Ubuntu 16.04.1 LTS, 8 GB RAM, 1 x 2.60 GHz CPU
The LAMP configuration featured PHP version 5.6, Apache 2.4.18, MySQL 14.14
Migration was performed from Magento version 1.9.2.2 to 2.1.3
Magento 2 migration (1/4)
• DB migration time: 1h 20'
• BE performance:
BE operation         | Magento 1 with cache | Magento 2 with cache
Access to catalog    | almost 5'            | 7''
Access to product    | 3''                  | 10''
Access to categories | 7''                  | 6''
Product searching    | 1' 5''               | 3''
Magento 2 migration (2/4)
• FE performance for catalog browsing:
FE operation                  | Magento 1 with cache | Magento 2 with cache
Catalog browsing / categories | 30''                 | 7''
Magento 2 migration – Reindex Times (3/4)
M1 total: 2h 55' 11''
M2 total: 2h 53' 47''
Magento 2 migration (4/4)
• We had some issues with the Catalog Fullsearch reindex (Magento 2)
• We had to apply a patch → https://github.com/magento/magento2/issues/5146
• Without the patch, the Catalog Fullsearch reindex takes around 2 hours; with the patch applied it took around 1 hour, so the times are quite comparable
02:12:37
02:12:37
Catalog URL rewrite
• M1 with Dnd_Patchindexurl module: 00:14:34
• M1 without Dnd_Patchindexurl module: 01:03:50
• M2: no catalog URL rewrite reindex. URL rewrites are handled when the product is saved
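The timings reported in the migration slides can be compared with a little arithmetic. The figures below are taken from the slides; the helper name is illustrative.

```python
def to_seconds(hms):
    """Convert an HH:MM:SS string into a number of seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


# Total reindex times: M1 2h 55' 11'', M2 2h 53' 47''.
m1_total = to_seconds("02:55:11")
m2_total = to_seconds("02:53:47")

# M1 catalog URL rewrite, with and without Dnd_Patchindexurl.
with_patch = to_seconds("00:14:34")
without_patch = to_seconds("01:03:50")

total_gap = m1_total - m2_total
speedup = without_patch / with_patch
print(total_gap, round(speedup, 2))  # 84 4.38
```

On this test VM the two full reindexes differ by under a minute and a half, while the patch makes the M1 URL rewrite reindex a bit more than four times faster.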
Tools
Xdebug
New Relic
AOE Profiler
Conclusions
Yes, we can!
• It’s possible, but not without effort
• Large initial analysis
• Special attention to optimization processes
• What about Magento 2?
Q & A
• Giuliana Benedetti – giuliana.benedetti@webformat.com
• WEBFORMAT srl - www.webformat.com
