Mark Gross, Founder and CEO, Data Conversion Laboratory
Creating a Hybrid Approach to Legacy Conversion
16 May, 2014
Valuable Content Transformed
• Document Digitization
• XML and HTML Conversion
• eBook Production
• Hosted Solutions
• Big Data Automation
• Conversion Management
• Editorial Services
• Harmonizer
Experience the DCL Difference
DCL blends years of conversion experience with cutting-edge technology and
the infrastructure to make the process easy and efficient.
• World-Class Services
• Leading-Edge Technology
• Unparalleled Infrastructure
• US-Based Management
• Complex-Content Expertise
• 24/7 Online Project Tracking
• Automated Quality Control
• Global Capabilities
We Serve a Very Broad Client Base . . .
. . . Spanning All Industries
• Aerospace
• Associations
• Defense
• Distribution
• Education
• Financial
• Government
• Libraries
• Life Sciences
• Manufacturing
• Medical
• Museums
• Periodicals
• Professional
• Publishing
• Reference
• Research
• Societies
• Software
• STM
• Technology
• Telecommunications
• Universities
• Utilities
What Does a Conversion Project Look Like?
Conversion Setup Components
• Inventory & Assessment
• Reuse Analysis
• Document Analysis
• Conversion Specification
• Architecture Design & Configuration
• Design & Develop Conversion SW
• Design & Develop Automation & Workflow SW
• Conversion SW Testing
• Training
Conversion Production Components
• Organizing Content for Conversion
• Hosting & Running Conversion SW
• Hosting & Running Automation & Workflow SW
• Scanning & OCR
• Image Processing
• Proofreading
• Pre-Conversion Document Preparation
• Conversion
• Parse/View
• Quality Control
• Reporting, Audit & Reconciliation
Conversion Setup Components in Detail
Inventory & Assessment
• Identify materials that are candidates for conversion
• Assess the material’s importance and how it might be used
• Classify and prioritize
Reuse Analysis
• Analyze documents to identify potentially redundant materials
• Normalize documents to maximize reusability
Document Analysis
• Evaluate document sources to determine the relative ease & accuracy of content extraction
• Identify metadata sources
• Identify the types of information in the documents and the appropriate level of tagging
• Identify processes for various materials
• Identify a suitable DTD or Schema
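As a rough illustration of the reuse-analysis step (finding potentially redundant materials after normalizing them), the sketch below groups documents whose text is identical once case and whitespace are normalized. The function names, the normalization rule, and the sample inventory are assumptions made for this example; a real project would likely need fuzzier matching than an exact hash.

```python
import hashlib
import re
from collections import defaultdict


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial differences do not hide duplicates."""
    return re.sub(r"\s+", " ", text).strip().lower()


def find_redundant(docs: dict[str, str]) -> list[list[str]]:
    """Group document IDs whose normalized text is identical."""
    groups = defaultdict(list)
    for doc_id, text in docs.items():
        digest = hashlib.sha256(normalize(text).encode("utf-8")).hexdigest()
        groups[digest].append(doc_id)
    # Only groups with more than one member are reuse candidates.
    return [sorted(ids) for ids in groups.values() if len(ids) > 1]


# Hypothetical inventory: document ID -> extracted text
inventory = {
    "manual-a": "Torque the bolt to  25 Nm.",
    "manual-b": "torque the bolt to 25 nm.",
    "manual-c": "Replace the filter annually.",
}
duplicates = find_redundant(inventory)
```

Here `manual-a` and `manual-b` differ only in case and spacing, so they end up in the same group and can be reviewed as a single reusable asset.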
Conversion Setup Components in Detail (cont’d)
Conversion Specification
• Detailed analysis of documents by type
• Review enough documents to understand the potential variations
• Develop tagging instructions
• Prepare specification
Architecture Design & Configuration
• Load balancing
• Capacity requirements
• Hardware requirements
Design & Develop Conversion SW
• Identify conversion SW requirements
• Evaluate tools
• Identify manual conversion needs
• Develop or modify conversion software per conversion specification
Conversion Setup Components in Detail (cont’d)
Design & Develop Automation & Workflow SW
• Identify the various steps and plan a workflow
• Evaluate control and QA mechanisms that will be needed
• Design workflow process to route documents appropriately
Conversion Software Testing
• Prepare a test plan
• Develop a document test baseline
• Create process to test documents coming through conversion flow
• Create process for:
− random testing
− testing new material types
− software changes
Training
• XML training
• Company standards training
• How to write for XML
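The "document test baseline" idea from the testing step can be sketched as a regression check: keep approved output for a set of sample documents, rerun the conversion after each software change, and flag anything that differs. The function name and the in-memory dictionaries are assumptions for illustration; this is not a description of DCL's tooling.

```python
import difflib


def compare_to_baseline(baseline: dict[str, str], current: dict[str, str]) -> dict[str, list[str]]:
    """Return a diff for each baseline document whose current output differs."""
    regressions = {}
    for doc_id, expected in baseline.items():
        actual = current.get(doc_id)
        if actual is None:
            regressions[doc_id] = ["missing from current output"]
        elif actual != expected:
            regressions[doc_id] = list(
                difflib.unified_diff(
                    expected.splitlines(),
                    actual.splitlines(),
                    fromfile="baseline",
                    tofile="current",
                    lineterm="",
                )
            )
    return regressions


# Hypothetical approved output versus output after a software change
baseline = {"doc-1": "<p>alpha</p>", "doc-2": "<p>beta</p>"}
current = {"doc-1": "<p>alpha</p>", "doc-2": "<p>Beta</p>"}
report = compare_to_baseline(baseline, current)
```

Only `doc-2` appears in the report, with a diff showing the changed line, which is the kind of evidence a reviewer needs to decide whether a software change was an improvement or a regression.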
Conversion Production Components in Detail
Organizing Content for Conversion
• Pulling content together from the various locations
• Delivering to the processing group
• Logging content into the workflow system
Hosting & Running Conversion SW
• Maintain facility to run software and keep it updated
• Monitor performance and operations
• Sample materials on a continual basis
Hosting & Running Automation & Workflow SW
• Maintain facility to route materials between software and manual operations
• Monitor performance and keep software and process updated
Conversion Production Components in Detail (cont’d)
Scanning & OCR
• Paper preparation
• Scanning & zoning
• OCR processing
Image Processing
• Image extraction
• Resizing and image correction
• Image conversion
Proofreading
• Proofread to required level of accuracy
• How much can automation do?
Conversion Production Components in Detail (cont’d)
Pre-Conversion Document Preparation
• Export text to normalized form
• Automated & manual pre-tagging
• Pre-conversion review
• Styling QC
• SME (subject matter expert) support
Conversion
• Automated conversion
• Tagged output
Parse/View
• Parse document
• Review error logs and correct until validated
• Render document for viewing with images
• View document and correct errors
• Image review
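The parse step ("parse document, review error logs and correct until validated") can be approximated with Python's standard XML parser, as in the minimal sketch below. It checks well-formedness only; validating against a DTD or schema, as the slides imply, would need an additional validating parser, which is not shown. The function name and sample batch are assumptions for this example.

```python
import xml.etree.ElementTree as ET


def parse_log(documents: dict[str, str]) -> dict[str, str]:
    """Return an error log: document ID -> parser message, for documents that fail to parse."""
    errors = {}
    for doc_id, xml_text in documents.items():
        try:
            ET.fromstring(xml_text)
        except ET.ParseError as exc:
            # The message includes the line and column of the problem.
            errors[doc_id] = str(exc)
    return errors


# Hypothetical batch of converted output
batch = {
    "article-1": "<article><title>Ok</title></article>",
    "article-2": "<article><title>Broken</article>",
}
error_log = parse_log(batch)
```

Documents listed in `error_log` would go back for correction and be parsed again, repeating until the log is empty.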
Conversion Production Components in Detail (cont’d)
Quality Control
• Execute test plans
• Automated and manual QC
• Fix errors or provide feedback
• Random sampling
• Continuous improvement
Reporting, Audit and Reconciliation
• Management reporting
• Process monitoring
• Exception reporting
• Audit and reconciliation of production throughput
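Two of the items above, random sampling for QC and reconciliation of production throughput, lend themselves to a small sketch. It assumes each document carries a unique ID that is logged when received and again when delivered; the function names, the sampling rate, and the sample data are all hypothetical.

```python
import random


def reconcile(received: set[str], delivered: set[str]) -> dict[str, list[str]]:
    """Compare what entered the workflow with what came out of it."""
    return {
        "missing": sorted(received - delivered),      # logged in, never delivered
        "unexpected": sorted(delivered - received),   # delivered, never logged in
    }


def qc_sample(doc_ids: list[str], rate: float, seed: int = 0) -> list[str]:
    """Pick a reproducible random sample of documents for manual QC."""
    if not doc_ids:
        return []
    rng = random.Random(seed)
    k = min(len(doc_ids), max(1, round(len(doc_ids) * rate)))
    return sorted(rng.sample(doc_ids, k))


# Hypothetical production batch
received = {"d1", "d2", "d3", "d4"}
delivered = {"d1", "d2", "d4", "d9"}
exceptions = reconcile(received, delivered)
to_review = qc_sample(sorted(delivered), rate=0.5)
```

The fixed seed keeps the sample reproducible for audit purposes; a production process might prefer a fresh seed per batch so that the selection cannot be anticipated.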
Various Skills You May Need on Board
• Consultant/Strategist
• Architecture Developer/Specialization Expert
• Trainers
• XML/Content Experts
• Subject Matter Experts (SMEs)
• Project/Program Management
• Conversion Operators
• Production Tracking
• Software Developers
• Filter Developers
• IT
• QA Experts
• Editors/Writers/Authors
Consider Your Options …
• Outsource it all
• Convert in-house
• Partner with an expert
• All of the above
Case Study 1: Converting a Large Content Repository
• Client Situation
- Build a database of scientific journals – 750,000 pages spanning almost 100 years
- Complex materials with lots of math, tables, and images
- Multiple formats and types needed to be normalized into a manageable database to produce new products and support future products not yet conceived
- The organization wanted to keep its limited personnel resources focused on their expertise
• Approach
- Flexibility: the size and breadth of the collection made it impractical to develop full specifications in advance
- Develop an overall specification, with allowance for change as new scenarios are discovered
− Software development sprints to incorporate changes
− Close collaboration between vendor and client to manage new situations
− The organization leveraged its knowledge of its materials to identify potential problems in advance, sequence the materials, and actively review materials as they were produced
− Frequent review meetings to assess nuances in new materials as they came up
• Results
- This was a three-year project, to be completed this summer
- On schedule and on budget, with several new products already developed and on the market
- The close collaboration and involvement of the client shaved 6-8 months off the project schedule and created a product that met all goals
Case Study 1: Project Components Breakdown
[Diagram: the Conversion Setup and Conversion Production components, each marked by responsibility. Legend: Client, DCL, Specialty Provider, Shared Responsibility.]
Case Study 2: International Technology Hardware and Software Company
• Client Situation
- Company had developed many thousands of hours of instructional materials that it wanted to centralize and convert to XML using a SCORM-based Schema
- Materials included slides, video and taped lectures, and written materials in various forms
- Goal was to identify the reusable assets and normalize them into a library that could be used for training the company’s own engineers and other personnel
- Some materials would be offered for external training
- The materials were very specialized, and subject matter expert (SME) input was needed to review all materials
• Approach
- DCL integrated as part of the client’s team
- DCL prepared transcripts of all oral materials with timings keyed to PowerPoint slides
- DCL copyedited transcripts and PowerPoint slides and normalized style for both
- Client provided SME and legal review of transcripts
- Client re-recorded any needed voice-overs
- Client created Flash format for web publishing
- DCL created integrated XML products for loading into the client’s educational database
• Results
- Full integration of client and DCL teams allowed for a rapid ramp-up to produce a pilot and move into larger production
- Client was able to use its own personnel, who knew the product well, for SME support
- The client also contracted with another engineering company to provide additional SME support for those products that could be supported by outside engineers
Case Study 2: Project Components Breakdown
[Diagram: the Conversion Setup and Conversion Production components, each marked by responsibility. Legend: Client, DCL, Specialty Provider, Shared Responsibility.]
Case Study 3: Engineering Company Supplying the US Air Force
• Client Situation
- Materials were to be converted from SGML and delivered in S1000D
- Company had created a fully automated conversion; Air Force wanted an independent audit of the converted documents
• Approach
- Client had developed the specified conversion and converted the documents to S1000D
- DCL validated that the final XML met S1000D requirements
- DCL developed a conversion plan and tools to perform the audit
- DCL performed both automated and manual analysis and review of the conversion processes and converted documents, checking for inventory accuracy, tagging accuracy, and text accuracy of tags and tag values
- DCL performed a 100% audit of all materials and reported results, along with suggestions, to the client and to the Air Force
• Results
- Client was able to utilize DCL’s S1000D expertise and take advantage of DCL’s automated audit and QA tools
- The client produced a better product as a result of feedback DCL was able to provide
- Air Force received a fully audited document set that satisfied its independent review requirement
Case Study 3: Project Components Breakdown
[Diagram: the Conversion Setup and Conversion Production components, each marked by responsibility. Legend: Client, DCL, Specialty Provider, Shared Responsibility.]
Case Study 4: Large Journal Publisher with Facilities in China and India
• Client Situation
- Ongoing publishing operations with a good understanding of its workflow and requirements
- Growing very quickly and needing to ramp up its capacity to convert author-written articles from Word and PDF into XML
- Has in-place facilities in China to handle process management and labor-intensive tasks
- Had been building its own software capability, but it was taking longer than expected
- Wanted to take advantage of DCL’s infrastructure for conversion and workflow while maintaining its own facilities for the human processing tasks
• Approach
- DCL configured its workflow and conversion software to the client’s requirements
- Instead of using DCL’s facilities, all preliminary work and all manual work was routed by the workflow system directly to the client’s facility
• Results
- Process made use of DCL’s existing infrastructure and software, which were quickly reconfigured to the client’s specification; the client was able to improve the automation of its process quickly and at lower cost
- Client was able to take advantage of the efficient facilities and infrastructure it had put into place
- DCL would monitor software and provide enhancements and updates as needed
- DCL would provide backup capability for overflow surges
Case Study 4: Project Components Breakdown
[Diagram: the Conversion Setup and Conversion Production components, each marked by responsibility. Legend: Client, DCL, Specialty Provider, Shared Responsibility.]
The Model That Maximizes Results and Minimizes Risk is Best for Your Organization
Ask yourself these questions to help make the determination ...
• Which parts of the process are your core business?
• Will this be a permanent process, or a limited time project?
• Do you have the needed in-house expertise?
• Do you want to build the staff and infrastructure?
• What are the risks?
• What combination will be best for your business?
… the good news – it’s not “one size fits all” anymore
“You don’t have to go it alone.”
Q&A
Mark Gross
Founder and CEO, Data Conversion Laboratory
(718) 307-5711
Mgross@dclab.com


Creating a Hybrid Approach to Legacy Conversion

  • 1. Mark Gross, Founder and CEO, Data Conversion Laboratory Creating a Hybrid Approach to Legacy Conversion 16 May, 2014
  • 2. Valuable Content Transformed • Document Digitization • XML and HTML Conversion • eBook Production • Hosted Solutions • Big Data Automation • Conversion Management • Editorial Services • Harmonizer
  • 3. Experience the DCL Difference DCL blends years of conversion experience with cutting-edge technology and the infrastructure to make the process easy and efficient. • World-Class Services • Leading-Edge Technology • Unparalleled Infrastructure • US-Based Management • Complex-Content Expertise • 24/7 Online Project Tracking • Automated Quality Control • Global Capabilities
  • 4. We Serve a Very Broad Client Base . . .
  • 5. . . . Spanning All Industries • Aerospace • Associations • Defense • Distribution • Education • Financial • Government • Libraries • Life Sciences • Manufacturing • Medical • Museums • Periodicals • Professional • Publishing • Reference • Research • Societies • Software • STM • Technology • Telecommunications • Universities • Utilities
  • 6. Conversion Setup Components Conversion Production Components Inventory & Assessment Reuse Analysis Document Analysis Conversion Specification Architecture Design & Configuration Design & Develop Conversion SW Design & Develop Automation & Workflow SW Conversion SW Testing Training Organizing Content for Conversion Hosting & Running Conversion SW Hosting & Running Automation & Workflow SW Scanning & OCR Image Processing Proofreading Pre-Conversion Document Preparation Conversion Parse/View Quality Control Reporting, Audit & Reconciliation What Does a Conversion Project Look Like?
  • 7. • Identify materials that are candidates for conversion • Assess the material’s importance, how it might be used • Classify and prioritize Conversion Setup Components in Detail Inventory & Assessment Reuse Analysis Document Analysis • Analyze documents to identify potentially redundant materials • Normalize documents to maximize reusability • Evaluate document sources to determine the relative ease & accuracy of content extraction • Identify metadata sources • Identify the types of information in the documents and the appropriate level of tagging • Identify processes for various materials • Identify a suitable DTD or Schema
  • 8. • Detailed analysis of documents by type • Review enough documents to understand the potential variations • Develop tagging instructions • Prepare specification Conversion Setup Components in Detail (cont’d) Conversion Specification Architecture Design & Configuration Design & Develop Conversion SW • Load balancing • Capacity requirements • Hardware requirements • Identify conversion SW requirement • Evaluate tools • Identify manual conversion needs • Develop or modify conversion software per conversion specification
  • 9. • Identify the various steps and plan a workflow • Evaluate control and QA mechanisms that will be needed • Design workflow process to route documents appropriately Conversion Setup Components in Detail (cont’d) Design & Develop Automation & Workflow SW Conversion Software Testing Training • Prepare a test plan • Develop a document test baseline • Create process to test documents coming through conversion flow • Create process for: − random testing − testing new material types − software changes • XML training • Company standards training • How to write for XML
  • 10. • Pulling content together from the various locations • Delivering to the processing group • Logging content into the workflow system Conversion Production Components in Detail Organizing Content for Conversion Hosting & Running Conversion SW Hosting & Running Automation & Workflow SW • Maintaining facility to run software and keep it updated • Monitor performance and operations • Sample materials on a continual basis • Maintain facility to route materials between software and manual operations • Monitor performance and keep software and process updated
  • 11. • Paper preparation • Scanning & zoning • OCR processing Conversion Production Components in Detail (cont’d) Scanning & OCR Image Processing Proofreading • Image extraction • Resizing and image correction • Image conversion • Proofread to required level of accuracy • How much can automation do?
  • 12. • Export text to normalized form • Automated & Manual pre-tagging • Pre-conversion review • Styling QC • SME (subject matter expert) support Conversion Production Components in Detail (cont’d) Pre-Conversion Document Preparation Conversion Parse/View • Automated conversion • Tagged output • Parse document • Review error logs and correct until validated • Render document for viewing with images • View document and correct errors • Image review
  • 13. • Execute test plans • Automated and Manual QC • Fix errors or provide feedback • Random sampling • Continuous improvement Conversion Production Components in Detail (cont’d) Quality Control Reporting, Audit and Reconciliation • Management reporting • Process monitoring • Exception reporting • Audit and reconciliation of production throughput
  • 14. • Consultant/Strategist • Architecture Developer/Specialization Expert • Trainers • XML/Content Experts • Subject Matter Experts (SMEs) • Project/Program Management • Conversion Operators • Production Tracking • Software Developers • Filter Developers • IT • QA Experts • Editors/Writers/Authors Various Skills You May Need on Board
  • 15. Consider Your Options … • Outsource it all • Convert in-house • Partner with an expert • All of the above
  • 16. Case Study 1: Converting a Large Content Repository • Client Situation - Build a database of scientific journals – 750,000 pages spanning almost 100 years - Complex materials with lots of math, tables, and images - Multiple formats and types needed to be normalized to a manageable database to produce new products, and support future products not yet conceived - The organization wanted to keep its limited personnel resources focused on their expertise • Approach - Flexibility - The size and breadth of the collection made it impractical to develop full specifications in advance. - Develop an overall specification, with allowance for change as new scenarios are discovered − Software development sprints to incorporate changes − Close collaboration between vendor and client to manage new situations − The organization leveraged it’s knowledge of its materials to identify potential problems in advance, sequence the materials, actively review materials as they got produced − Frequent review meetings to assess nuances in new materials as they came up • Results − This was a three year project to be completed this summer − On schedule and on budget, with several new products already developed and out on the market − The close collaboration and involvement of the client shaved 6-8 months off the project schedule, and created a product that all goals.
  • 17. Case Study 1: Project Components Breakdown Conversion Production ComponentsConversion Setup Components Inventory & Assessment Reuse Analysis Document Analysis Conversion Specification Architecture Design & Configuration Design & Develop Conversion SW Design & Develop Automation & Workflow SW Conversion SW Testing Training Organizing Content for Conversion Hosting & Running Conversion SW Hosting & Running Automation & Workflow SW Scanning & OCR Image Processing Proofreading Pre-Conversion Document Preparation Conversion Parse/View Quality Control Reporting, Audit & Reconciliation Client DCL Specialty Provider Shared Responsibility
  • 18. Case Study 2: International Technology Hardware and Software Company • Client Situation - Company has developed many thousands of hours of instructional materials that it wants to centralize and convert to XML using a SCORM-based schema - Materials included slides, video and taped lectures, and written materials in various forms - Goal was to identify the reusable assets and normalize them into a library that can be used for training its own engineers and other personnel - Some materials would be offered for external training - The materials were very specialized, and subject matter expert (SME) input was needed to review all materials • Approach - DCL integrated as part of the client's team - DCL prepared transcripts of all oral materials with timings keyed to PowerPoint slides - DCL copyedited transcripts and PowerPoint slides and normalized style for both - Client provided SME and legal review of transcripts - Client re-recorded any needed voice-overs - Client created Flash format for web publishing - DCL created integrated XML products for loading into the client's educational database • Results - Full integration of client and DCL teams allowed for a rapid ramp-up to produce a pilot and move into larger production - Client was able to use its own personnel, who knew the product well, for SME support - The client also contracted with another engineering company to provide additional SME support for those products that could be supported by outside engineers
  • 19. Case Study 2: Project Components Breakdown [Diagram] Conversion Setup Components and Conversion Production Components: Inventory & Assessment; Reuse Analysis; Document Analysis; Conversion Specification; Architecture Design & Configuration; Design & Develop Conversion SW; Design & Develop Automation & Workflow SW; Conversion SW Testing; Training; Organizing Content for Conversion; Hosting & Running Conversion SW; Hosting & Running Automation & Workflow SW; Scanning & OCR; Image Processing; Proofreading; Pre-Conversion Document Preparation; Conversion; Parse/View; Quality Control; Reporting, Audit & Reconciliation. Legend: Client; DCL; Specialty Provider; Shared Responsibility
  • 20. Case Study 3: Engineering Company Supplying the US Air Force • Client Situation - Materials were to be converted from SGML and delivered in S1000D - Company had created a fully automated conversion; Air Force wanted an independent audit of the converted documents • Approach - Client had developed the conversion specification and converted the documents to S1000D - DCL validated that the final XML met S1000D requirements - DCL developed a conversion plan and tools to perform the audit - DCL performed both automated and manual analysis and review of the conversion processes and converted documents, checking for inventory accuracy, tagging accuracy, and text accuracy of tags and tag values - DCL performed a 100% audit of all materials and reported results, along with suggestions, to the client and to the Air Force • Results - Client was able to utilize DCL's S1000D expertise and take advantage of DCL's automated audit and QA tools - The client produced a better product as a result of the feedback DCL was able to provide - Air Force received a fully audited document set that satisfied its independent review requirement
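The inventory-accuracy part of the audit in Case Study 3 can be illustrated with a minimal sketch. It is not DCL's audit tooling, which the slides do not describe. The sketch assumes each document has a unique identifier in both the source inventory and the converted set; the names `InventoryReport` and `audit_inventory` and the sample IDs are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class InventoryReport:
    """Result of comparing a source inventory with a converted set (hypothetical type)."""
    missing: list[str]      # listed in the source inventory but never converted
    unexpected: list[str]   # converted but absent from the source inventory
    matched: int            # identifiers present on both sides

    @property
    def is_clean(self) -> bool:
        # The inventory reconciles only when nothing is missing or unexpected.
        return not self.missing and not self.unexpected


def audit_inventory(source_ids, converted_ids) -> InventoryReport:
    """Compare document identifiers on both sides of a conversion."""
    source, converted = set(source_ids), set(converted_ids)
    return InventoryReport(
        missing=sorted(source - converted),
        unexpected=sorted(converted - source),
        matched=len(source & converted),
    )


# Example with made-up identifiers.
report = audit_inventory(
    ["dm-001", "dm-002", "dm-003"],
    ["dm-001", "dm-003", "dm-004"],
)
print(report.missing, report.unexpected, report.matched, report.is_clean)
```

A check like this covers only the inventory; tagging accuracy and text accuracy would need schema validation and content comparison, with manual review on top, as the slide notes.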
  • 21. Case Study 3: Project Components Breakdown [Diagram] Conversion Setup Components and Conversion Production Components: Inventory & Assessment; Reuse Analysis; Document Analysis; Conversion Specification; Architecture Design & Configuration; Design & Develop Conversion SW; Design & Develop Automation & Workflow SW; Conversion SW Testing; Training; Organizing Content for Conversion; Hosting & Running Conversion SW; Hosting & Running Automation & Workflow SW; Scanning & OCR; Image Processing; Proofreading; Pre-Conversion Document Preparation; Conversion; Parse/View; Quality Control; Reporting, Audit & Reconciliation. Legend: Client; DCL; Specialty Provider; Shared Responsibility
  • 22. Case Study 4: Large Journal Publisher with Facilities in China and India • Client Situation - Ongoing publishing operations with a good understanding of its workflow and requirements - Growing very quickly and needing to ramp up its capacity to convert author-written articles from Word and PDF into XML - Has in-place facilities in China to handle process management and labor-intensive tasks - Had been building its own software capability, but it was taking longer than expected - Wanted to take advantage of DCL's infrastructure for conversion and workflow while maintaining its own facilities for the human processing tasks • Approach - DCL configured its workflow and conversion software to the client's requirements - But instead of using DCL's facilities, all preliminary work and all manual work were routed by the workflow system directly to the client's facility • Results - Process made use of DCL's existing infrastructure and software, which were quickly reconfigured to the client's specification, so the client was able to improve the automation of its process quickly and at lower cost - Client was able to take advantage of the efficient facilities and infrastructure it had put into place - DCL would monitor software and provide enhancements and updates as needed - DCL would provide backup capability for overflow surges
  • 23. Case Study 4: Project Components Breakdown [Diagram] Conversion Setup Components and Conversion Production Components: Inventory & Assessment; Reuse Analysis; Document Analysis; Conversion Specification; Architecture Design & Configuration; Design & Develop Conversion SW; Design & Develop Automation & Workflow SW; Conversion SW Testing; Training; Organizing Content for Conversion; Hosting & Running Conversion SW; Hosting & Running Automation & Workflow SW; Scanning & OCR; Image Processing; Proofreading; Pre-Conversion Document Preparation; Conversion; Parse/View; Quality Control; Reporting, Audit & Reconciliation. Legend: Client; DCL; Specialty Provider; Shared Responsibility
  • 24. The Model That Maximizes Results and Minimizes Risk is Best for Your Organization. Ask yourself these questions to help make the determination: • Which parts of the process are your core business? • Will this be a permanent process, or a limited-time project? • Do you have the needed in-house expertise? • Do you want to build the staff and infrastructure? • What are the risks? • What combination will be best for your business? The good news: it's not "one size fits all" anymore. "You don't have to go it alone."
  • 25. Q&A - Mark Gross, Founder and CEO, Data Conversion Laboratory - (718) 307-5711 - Mgross@dclab.com