eCommerce data migration
in moving systems across
Data Centres
Regu, Sharad
Flipkart in recent times
● Leading eCommerce player in India
○ 10M page visits, 2M shipments a day
○ 30 million products across more than 70 categories
○ Big Billion Days ($300M sales, top-ranked app on Google Play Store)
○ Ping - social collaboration in eCommerce
○ Progressive web app - native-like experience in the browser (debuted at #chromedevsummit 2015)
Data at Flipkart - User & Order path
Data at Flipkart - Data Platform
● 6 TB new data ingested daily
○ 30 TB on sale days
● 1100 Raw Streams
● 3 Billion Raw events in a day
● 0.6 PB data processed daily
● 10K Hadoop jobs daily
● 3000 Report views daily
DC Landscape
● Three DCs in Chennai - DC1, DC2, DC3 - connected by 1 Gbps and 2 x 10 Gbps links
● Primary: All User, Order & FDP systems
● Secondary: Few batch-processing systems (UIE, Reco, Ads)
● New: All User & FDP systems
Data migration needs, challenges
● User path systems
○ Minimize downtime. Site & App downtime is visible
■ Data - mostly eventually consistent
○ Session Data - Avoid user logout; service scale: 250K RPS
○ Promise Data (Stock, Serviceability) - Avoid OOS and over-booking (consistency matters)
○ Live Orders - Accept orders, let customers checkout & pay (consistency matters)
○ User Accounts - No data loss; change velocity is low
● Order path systems
○ Availability is not a constraint; throughput and data durability are
■ Data - strong durability, consistent
○ Current orders being fulfilled
○ Warehouse stock inwarding, movement
Data migration needs, challenges
● User Insights
○ Inter-DC bandwidth limitations (1 Gbps shared link)
○ 130 TB (snappy-compressed) data in HBase; derived data (Insights) much smaller though
● MySQL instance footprint - 600+
● Flipkart Data Platform
○ Data publishers/consumers not moving together
■ Data consumers could move earlier than the publishers, and vice versa
○ Migrating a couple of PB of data not feasible over the network
○ Consistency for raw, prepared and reporting data
Migration planning and execution
● Most useful tool - Google spreadsheets & docs!
○ Inventory of systems in each business cluster - split by service and backing data store
○ Defined data migration recipes and an SME group for each data store type
■ Advised on IaaS constructs (instance types), PaaS integration (service discovery), data migration strategy (export vs live replication); built tooling
○ Created cutover sequence and interdependencies
■ e.g. Catalog → Search → Cart/CO → Mobile apps
○ Wrote a playbook for each cutover activity - including checklists and verification of data export/restore
● Program-managed a plan that touched 1000+ systems and many of the 1000-member engineering org
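The cutover sequence above is a dependency ordering, so it can be derived mechanically with a topological sort. A minimal sketch, using the example chain from the deck (the dependency map here is illustrative, not the actual inventory):

```python
from graphlib import TopologicalSorter  # stdlib, Python 3.9+

# Hypothetical dependency map: each system lists the systems that
# must cut over before it (taken from the example sequence above).
deps = {
    "Search": {"Catalog"},
    "Cart/CO": {"Search"},
    "Mobile apps": {"Cart/CO"},
}

# static_order() yields systems only after all their prerequisites
order = list(TopologicalSorter(deps).static_order())
print(order)  # Catalog first, Mobile apps last
```

With a real inventory spreadsheet exported as such a map, the same call flags cycles (impossible cutover plans) by raising `CycleError`.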
Knowledge base on 3rd party data stores, packages
Hacks, Tools and Utilities
● “Never underestimate the bandwidth of a station wagon full of tapes hurtling down the highway” -- Andrew S. Tanenbaum (Computer Networks, 4th ed., p. 91)
○ We used disks instead to move User Insights data (stored in HBase)
■ Moved snapshots of derived/computed data over the wire (relatively small)
■ Avoided HBase export. Instead transferred HFiles onto disks using a custom ‘distcp’-like tool which knapsacked ~40K files into 6 disks. Open sourced as: https://github.com/flipkart-incubator/blueshift
■ Disks shipped to the new DC
■ Transferred HFiles into HDFS using Blueshift
■ Imported HFiles into HBase using HBase Bulk Load
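The "knapsack" step - fitting ~40K HFiles onto 6 disks - is a bin-packing problem. A hedged sketch of one common heuristic (first-fit decreasing); Blueshift's actual packing logic may differ, and the file names and sizes below are illustrative:

```python
# Pack files onto a fixed number of disks so no disk exceeds capacity.
# First-fit-decreasing: place largest files first, each on the
# emptiest disk that still has room.
def pack(files, num_disks, capacity):
    """files: dict name -> size (GB). Returns list of [used, names]."""
    disks = [[0, []] for _ in range(num_disks)]
    for name, size in sorted(files.items(), key=lambda kv: -kv[1]):
        target = min((d for d in disks if d[0] + size <= capacity),
                     key=lambda d: d[0], default=None)
        if target is None:
            raise ValueError(f"{name} does not fit on any disk")
        target[0] += size
        target[1].append(name)
    return disks

layout = pack({"hfile-a": 900, "hfile-b": 700,
               "hfile-c": 600, "hfile-d": 400},
              num_disks=2, capacity=1600)
```

At ~40K files the greedy pass stays fast (one sort plus a linear scan per file), which matters more than a perfectly optimal packing.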
Migrating live User sessions - dual write
● Cold data in HBase (9 TB compressed), hot data in Memcached (1 TB)
● Live read-writes on Memcached, async batched writes to HBase
● Migration via dual writes
○ Fresh Memcached cluster in the new DC
○ Added this cluster as another batched-write destination in the old DC
○ Data move initiated 21 days before the actual cutover to allow for catch-up
○ HBase data was exported using standard snapshotting and incremental CopyTable runs periodically
○ Batch interval reduced from 10 minutes to 1 minute during cutover for aggressive copy
● No user logout or session loss after cutover
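The dual-write setup above can be sketched as a batched writer that fans the same batch out to every registered destination - the old-DC HBase plus, during migration, the new-DC Memcached. Class, method and variable names are illustrative, not Flipkart's actual code, and plain dicts stand in for the stores:

```python
# Minimal dual-write sketch: live writes stay local and cheap; an
# async batcher periodically flushes the accumulated writes to all
# destinations. Adding the new DC's cluster to `destinations` is the
# whole migration hook.
class DualWriteSession:
    def __init__(self, destinations, batch_interval_s=600):
        self.destinations = destinations       # e.g. [old_hbase, new_memcached]
        self.batch_interval_s = batch_interval_s  # 600s normally, 60s at cutover
        self.pending = {}

    def write(self, session_id, data):
        self.pending[session_id] = data        # hot write, local only

    def flush(self):
        batch, self.pending = self.pending, {}
        for dest in self.destinations:         # same batch to every store
            dest.update(batch)

old_hbase, new_memcached = {}, {}
s = DualWriteSession([old_hbase, new_memcached])
s.write("u1", {"cart": 2})
s.flush()
```

Because sessions only need eventual consistency, a missed batch is repaired by the next flush - which is why shrinking the interval from 10 minutes to 1 minute at cutover was enough.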
Migrating Product catalog data
● Data modelled as Entities & Relationships: clients have “Views” of this data
● Views expressed as a JSON DSL
● Raw data exported from HBase and Elasticsearch and copied to the new DC
● Required a solution that could migrate updates after the initial move
○ Developed a JSON diff library that could work over 100 million views. Open sourced: https://github.com/flipkart-incubator/zjsonpatch
■ Diffs are applied in order - important for the DC move
○ Bandwidth consumption for applying updates dropped from 800 Mbps to 13-14 Mbps
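The idea behind shipping diffs instead of full views is JSON Patch (RFC 6902, the format zjsonpatch produces): compute a small list of operations in the old DC, apply them in order in the new DC. A minimal sketch of ordered patch application - only top-level add/replace/remove ops, with illustrative data:

```python
# Apply a JSON Patch-style op list to a document, in order.
# Order matters: later ops may assume the state earlier ops created,
# which is why the DC-move pipeline preserved diff ordering.
def apply_patch(doc, ops):
    doc = dict(doc)                       # leave the input untouched
    for op in ops:
        key = op["path"].lstrip("/")      # top-level paths only, for brevity
        if op["op"] in ("add", "replace"):
            doc[key] = op["value"]
        elif op["op"] == "remove":
            del doc[key]
    return doc

view = {"title": "Phone", "price": 100}
patch = [{"op": "replace", "path": "/price", "value": 90},
         {"op": "add", "path": "/in_stock", "value": True}]
updated = apply_patch(view, patch)
```

A patch for a changed product is a few dozen bytes versus a multi-KB view, which is the mechanism behind the 800 Mbps → 13-14 Mbps drop.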
MySQL migration utility
Application Relay bridge over Kafka queues
● RESTBus: orchestrates all Order fulfillment systems
● Pattern: locally committed messages in MySQL, relayed over Kafka to HTTP endpoints
● Bridge over 2 DCs with destinations resolved from ELB endpoints
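The RESTBus pattern above is a transactional outbox with a relay: the message is committed in MySQL alongside the business write, and a separate relay forwards it. A minimal sketch with in-memory stand-ins for MySQL, Kafka and the HTTP endpoint (all names here are illustrative, not the actual RESTBus API):

```python
# Outbox-and-relay sketch: producers never talk to the network
# directly; they commit a message row locally, and the relay drains
# the outbox toward whichever endpoint resolution returns.
outbox = []      # stands in for the MySQL messages table
delivered = []   # stands in for the downstream HTTP endpoint

def create_order(order_id):
    # in the real system this insert shares one MySQL transaction
    # with the business write, so the message is never lost
    outbox.append({"type": "ORDER_CREATED", "order_id": order_id})

def relay(resolve_endpoint):
    # during the DC move, resolution goes through ELB endpoints that
    # may point at either DC - that indirection is the 2-DC bridge
    while outbox:
        msg = outbox.pop(0)
        resolve_endpoint(msg).append(msg)

create_order("OD123")
relay(lambda msg: delivered)
```

Because delivery is decoupled from the local commit, the relay's destination can be flipped between DCs without touching the order-fulfillment services themselves.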
2 way sync of Kafka Streams
● Kafka streams mirrored between the Old DC and the New DC in both directions
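Two-way mirroring of this kind can be set up with Kafka's stock MirrorMaker tool, one process per direction - the deck does not name the exact tooling used, so this is a hedged sketch. Property-file names and topic patterns are illustrative; note each direction must mirror a disjoint topic set, or the two mirrors feed each other in a loop:

```shell
# Old DC -> New DC: mirror topics produced in the old DC
kafka-mirror-maker.sh \
  --consumer.config old-dc-consumer.properties \
  --producer.config new-dc-producer.properties \
  --whitelist 'olddc\..*'

# New DC -> Old DC: mirror topics produced in the new DC
kafka-mirror-maker.sh \
  --consumer.config new-dc-consumer.properties \
  --producer.config old-dc-producer.properties \
  --whitelist 'newdc\..*'
```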
Copying data across clusters
● Only raw data was copied - about 200 TB compressed
○ All prepared and reporting data regenerated from raw data
● Verification utilities to check correctness of data in both clusters
● Ran the full data platform stack in both places for over 2 weeks till all data publishers and consumers moved
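One way to verify correctness across two clusters is to compare a row count plus an order-independent content digest per dataset in each DC - the deck does not detail the actual verification utilities, so this sketch is an assumption, with illustrative event records:

```python
import hashlib

# Fingerprint a dataset as (count, digest). Sorting before hashing
# makes the digest independent of row order, which typically differs
# between clusters after a bulk copy.
def fingerprint(records):
    h = hashlib.sha256()
    for rec in sorted(records):
        h.update(rec.encode())
    return len(records), h.hexdigest()

old_dc = ["evt-1", "evt-2", "evt-3"]
new_dc = ["evt-3", "evt-1", "evt-2"]   # same data, copied out of order
assert fingerprint(old_dc) == fingerprint(new_dc)
```

At the 200 TB scale this comparison would run as a distributed job (e.g. one fingerprint per partition), but the per-partition check is the same.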
Thank You
