Migrating 100,000 pages of content
       From Legacy CMS to Drupal

                          Rachel Jaro
Solutions Architect at PrometSource
            www.prometsource.com
Overview
We’ll talk about:
 Successful migration recipe
 Common questions you should be asking before you
  start
 Top 3 tools to do migration in Drupal
 Issues
   Tools to use in URL Rewriting
   File management Comparison in D6
 Testing
 Deploying Solution
Data Migration
 “Data migration solutions extract data from a source
 system, correct errors, reformat, restructure and load
 the data into a replacement target system”.

 It sounds simple, but poorly managed data migration
 is the most common cause of failure in implementing
 a replacement system.

 -- Gershon Pick, March 2001
Successful Migration Recipe
Planning




  Source: http://www.flickr.com/photos/bjornmeansbear/4380595283/
Plan: What to Ask
 Node types (Content separation, fields)
   Do you want to separate content into pages, articles,
    biographies, news, etc.?
   What fields are needed for each node?
   Who can access it?
   Do you really need that content type? Or can we use
    taxonomy terms instead for similar content?
Plan: What to Ask
 Taxonomy (Categorization, tags)
    Do you need to categorize nodes?
    Would you need different access?
    What kinds of taxonomy groups or vocabularies would
     you need?
 Permissions (per node) and user roles
    Who is going to use the site?
    What specifically are their access rights?
Plan: What to Ask
 New URL mapping
   Do you need SEO-friendly URLs?
 Files, file permissions and file directories
   Do you need an advanced file management or document
      management tool?
     Do you need a simpler solution? How simple?
     Do you need access rights for each folder?
     Do you need a browser-style interface to access them?
     What kinds of files do you need to store? Images, PDFs?
Build
Requirements
 Use CSV files to import data
 Divide the migration into groups or sections
 Map and replace old URLs with SEO-friendly URLs
   Before: 05-200.htm
Data in CSV Example
December 13, 2005 3:39:54 PM||||||||||December 13, 2005||||||||||Report
  Spotlights Need for Reform in Jackpot
  Jurisdictions||||||||||/press/releases/2005/december/||||||||||05-
  200||||||||||{UUID}|||||||||| Economics^^^^^^^^^^Economy ||||||||||

<p>LoremIpsum is simply dummy text of the printing and typesetting
  industry. LoremIpsum has been the industry's standard dummy text
  ever since the 1500s, when an unknown printer took a galley of type and
  scrambled it to make a type specimen book. </p>

<p>LoremIpsum is simply dummy text of the printing and typesetting
  industry. LoremIpsum has been the industry's standard dummy text
  ever since the 1500s, when an unknown printer took a galley of type and
  scrambled it to make a type specimen book. </p>
$$$$$$$$$$

                                                     Separator: ||||||||||
                                                     End of Row: $$$$$$$$$$
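
A minimal Python sketch of reading this export format before handing it to an import script. The separator lengths are counted from the slide (ten characters each), the `^` run is assumed to separate multiple category terms, and the column names are hypothetical because the slide does not label the columns.

```python
FIELD_SEP = "|" * 10   # field separator, as counted from the slide
ROW_END = "$" * 10     # end-of-row marker, as counted from the slide
VALUE_SEP = "^" * 10   # assumption: separates multiple terms inside one field

# Hypothetical column names; the slide does not name the columns.
COLUMNS = ["created", "display_date", "title", "path",
           "filename", "uuid", "terms", "body"]


def parse_export(text):
    """Split a raw export into a list of dicts, one per row."""
    records = []
    for raw_row in text.split(ROW_END):
        if not raw_row.strip():
            continue  # skip the empty chunk after the last row marker
        # Collapse line wraps inside each field and trim the padding.
        fields = [" ".join(f.split()) for f in raw_row.split(FIELD_SEP)]
        record = dict(zip(COLUMNS, fields))
        record["terms"] = [t.strip()
                           for t in record.get("terms", "").split(VALUE_SEP)
                           if t.strip()]
        records.append(record)
    return records


# One row modeled on the slide; the body text is shortened.
sample = FIELD_SEP.join([
    "December 13, 2005 3:39:54 PM",
    "December 13, 2005",
    "Report Spotlights Need for Reform in Jackpot Jurisdictions",
    "/press/releases/2005/december/",
    "05-200",
    "{UUID}",
    " Economics" + VALUE_SEP + "Economy ",
    "\n<p>Lorem ipsum body text.</p>\n",
]) + ROW_END + "\n"

rows = parse_export(sample)
print(rows[0]["title"], rows[0]["terms"])
```

Long, unusual separators like these are a reasonable choice when the body field contains HTML, since commas, quotes and newlines inside the body then need no escaping.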
Content Type Division
Example: CNN.com
Divide migration sequences into US, World, Politics, Justice, etc.
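
One way to sketch this division in Python is to group the exported records by the first segment of their source path, so each section can be migrated and tested as its own batch. The record fields and the second and third sample rows are hypothetical.

```python
from collections import defaultdict


def section_of(path):
    """Return the first path segment, e.g. '/press/releases/' -> 'press'."""
    parts = [p for p in path.split("/") if p]
    return parts[0] if parts else "(root)"


def group_by_section(records):
    """Bucket records by section so each bucket can be migrated separately."""
    groups = defaultdict(list)
    for rec in records:
        groups[section_of(rec["path"])].append(rec)
    return dict(groups)


records = [
    {"path": "/press/releases/2005/december/", "filename": "05-200"},
    {"path": "/press/releases/2006/january/", "filename": "06-001"},  # hypothetical
    {"path": "/world/asia/", "filename": "a-17"},                     # hypothetical
]

batches = group_by_section(records)
for name, batch in sorted(batches.items()):
    print(name, len(batch))
```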
Solutions/Tools
 TW and Migrate modules Combo
 node_import()
 Drush + custom script
TW & Migrate Module Combo
 http://drupal.org/project/tw
    Table Wizard (TW) exposes source data tables to Views for the Migrate module


 http://drupal.org/project/migrate
    a flexible framework for migrating content
Migrate Module
Features:
 users browse their legacy data using views
 support for creating Drupal nodes, users, and
  comments is included
 hooks permit migration of other types of content.
 provides a dashboard for running mini migrations
 Drush support
Why I did not choose migrate
 Importing into MySQL was not an option; CSV files were
  used instead
 Cannot map old URLs to new URLs
node_import()
http://drupal.org/project/node_import
Features:
 Easy to learn, Point and click
 Uses CSV to upload contents
 Can easily delete previously imported data
 Can download the errors when an import fails, for easy
  reference when fixing issues
node_import() Problems
 I can’t define a mapping from old URLs to new URLs
 No Drush support
 It doesn’t save my old settings for a CSV file.
Drush + Custom script


             Flexibility
     - I can do whatever I want with the data
Create your own migration script



            [demo]
Issues
 File Management
 URL Rewriting
File Management
Client requirements
 Intuitive
 Has WYSIWYG support
 Access control – upload, edit, delete, revise files by
  different roles
 Revision control – optional but good to have
 Limited time!
File Management Modules




*DbFm was not included due to problems encountered during tests in D6
URL Rewriting




   Source: http://www.flickr.com/photos/randomfactor/483264915/
URL Rewriting Solution
Not recommended
 .htaccess
     Too many URLs to handle
     Too much server load


Recommended
 pathauto + path_redirect modules
     automated alias settings
     301 redirect set
 global redirect


Additional reference:
http://acquia.com/blog/migrating-drupal-way-part-ii-saving-those-old-urls
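
The old-to-new mapping behind the recommended approach can be sketched in Python as a lookup table built from the exported records. This is not the pathauto or path_redirect API; the alias pattern (source path plus a slug of the title) and the record field names are assumptions for illustration.

```python
import re


def slugify(title):
    """Lowercase the title and join its alphanumeric runs with hyphens."""
    return "-".join(re.findall(r"[a-z0-9]+", title.lower()))


def build_redirects(records):
    """Return {old_url: new_alias}; fail loudly if two pages share an alias."""
    redirects, seen = {}, set()
    for rec in records:
        old_url = rec["path"] + rec["filename"] + ".htm"
        new_alias = rec["path"] + slugify(rec["title"])  # assumed alias pattern
        if new_alias in seen:
            raise ValueError("duplicate alias: " + new_alias)
        seen.add(new_alias)
        redirects[old_url] = new_alias
    return redirects


def resolve(url, redirects):
    """Return (301, new_alias) for a mapped legacy URL, else (404, None)."""
    if url in redirects:
        return 301, redirects[url]
    return 404, None


records = [{
    "path": "/press/releases/2005/december/",
    "filename": "05-200",
    "title": "Report Spotlights Need for Reform in Jackpot Jurisdictions",
}]

redirects = build_redirects(records)
print(resolve("/press/releases/2005/december/05-200.htm", redirects))
```

A single table lookup per request stays cheap as the number of pages grows, whereas a long list of .htaccess rules is evaluated on every request.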
URL Checker
 http://drupal.org/project/linkchecker
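
The linkchecker module checks links on the running site. As a complementary pre-import check, a short Python sketch (not the module's API) can scan each migrated body for internal links that still point at old .htm pages and have no entry in the redirect map. The sample URLs and the map contents are hypothetical.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def legacy_links(body_html):
    """Return internal hrefs that still point at old .htm pages."""
    collector = LinkCollector()
    collector.feed(body_html)
    return [h for h in collector.links
            if h.startswith("/") and h.endswith(".htm")]


def unmapped_links(body_html, redirects):
    """Return legacy links with no entry in the old-to-new URL map."""
    return [h for h in legacy_links(body_html) if h not in redirects]


body = ('<p><a href="/press/releases/2005/december/05-200.htm">release</a> '
        '<a href="/old/page.htm">old page</a> '
        '<a href="http://example.com/">external</a></p>')
known = {"/press/releases/2005/december/05-200.htm": "/press/new-alias"}

print(unmapped_links(body, known))
```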
Access control Alternative
 /default/files/PressReleases
 /default/files/Documents
 /default/files/International
    /default/files/International/America
    /default/files/International/England
    /default/files/International/Asia
Test, Test and did I say Test?




   Source: http://www.flickr.com/photos/paperpariah/2424107350/
Common problems
 Broken links
 Misconfigured page
 Empty pages
 Invalid date
 File not found or orphan pages
 Page format


            Test when CACHE is on
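
Two of these problems, invalid dates and empty pages, can be caught before import with a small Python check over the exported records. This is a sketch under assumptions: the date format is taken from the CSV example ("December 13, 2005 3:39:54 PM"), the field names are hypothetical, and month names are parsed in the default English locale.

```python
import re
from datetime import datetime

# Matches the timestamp style in the CSV example.
DATE_FORMAT = "%B %d, %Y %I:%M:%S %p"


def check_record(rec):
    """Return a list of problems found in one exported record."""
    problems = []
    try:
        datetime.strptime(rec.get("created", ""), DATE_FORMAT)
    except ValueError:
        problems.append("invalid date")
    # Strip tags so a body holding only empty markup counts as empty.
    text = re.sub(r"<[^>]+>", "", rec.get("body", ""))
    if not text.strip():
        problems.append("empty page")
    return problems


good = {"created": "December 13, 2005 3:39:54 PM",
        "body": "<p>Lorem ipsum body text.</p>"}
bad = {"created": "13/12/2005", "body": "<p> </p>"}

print(check_record(good))
print(check_record(bad))
```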
Deployment
Deployment
Two ways to deploy your data to the live environment
1. All at once
2. Divide and conquer
Deployment: Divide and Conquer
Example: CNN Division
Deployment Mockup




 * shadow box is your migrated data’s production box
 * old CMS is still active at this time
Deployment
• Coordination between the old CMS and Drupal
• URL Testing
Deployment Mockup




 * shadow box is your migrated data’s production box
 * replacing old CMS with Drupal
Deployment
Pros
 Less risk, less stress
 Editors can continue daily data entry


Cons
 URL rewriting can be tricky
 Updating the production box with new content can be
  an arduous task
Deployment: Updating Production
Automation
 SVN
 Drush scripts to migrate content from the tester’s box to
  the shadow box
 Deploy – http://drupal.org/project/deploy


Manual
 Document configuration changes
 Document database changes
Recap
 SDLC + Agile
 Common questions you should be asking before you
  start
 Top 3 tools to do migration in Drupal
   TW & Migrate, node_import(), drush
 Issues
    File management Comparison in D6
    Tools to use in URL Rewriting
 Testing
 Deployment Solution
Questions?
Resources
 http://groups.drupal.org/content-migration-import-and-export
 http://drupal.org/handbook/migrating

Editor's Notes

  • #5: Todo – make comparison of normal sdlc to migration of sdlc
  • #14: http://www.flickr.com/photos/14804582@N08/2111269218/