Confidential & Proprietary www.dclab.com
Preparing Your Legacy Data for Automation
in S1000D
Naveh Greenberg,
Director, U.S. Defense Development,
Data Conversion Laboratory
Valuable Content Transformed
• Document Digitization
• XML and HTML Conversion
• eBook Production
• Hosted Solutions
• Big Data Automation
• Conversion Management
• Editorial Services
• Harmonizer
Experience the DCL Difference
DCL blends years of conversion experience with cutting-edge technology and the
infrastructure to make the process easy and efficient.
• World-Class Services
• Leading-Edge Technology
• Unparalleled Infrastructure
• US-Based Management
• Complex-Content Expertise
• 24/7 Online Project Tracking
• Automated Quality Control
• Global Capabilities
We Serve a Very Broad Client Base . . .
. . . Spanning All Industries
• Aerospace
• Associations
• Defense
• Distribution
• Education
• Financial
• Government
• Libraries
• Life Sciences
• Manufacturing
• Medical
• Museums
• Periodicals
• Professional
• Publishing
• Reference
• Research
• Societies
• Software
• STM
• Technology
• Telecommunications
• Universities
• Utilities
What Makes S1000D Conversion Difficult
• S1000D is a conceptual departure from linear information – and
is difficult for many to get used to
• Turns the traditional book into a collection of DMs
– Introductory material that applies to numerous DMs
– Placement of Warnings, Cautions and Notes
– Writer creativity
• DMC & business rules.
– Assigning DMCs and ICNs
– Hierarchy in Map Files (Publication Module)
– Data can fit more than one information code
• …but your documents weren’t likely to have been designed to do
this.
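To make the DMC bullet concrete, here is a minimal sketch of how a data module code can be assembled from its parts. The field names follow the S1000D element names as commonly documented; the sample values (including the model identifier MYPROJ and the info codes) are made-up placeholders, and no project business rules are checked. It also shows why content that fits more than one information code ends up with more than one candidate DMC.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class DataModuleCode:
    """Illustrative container for the parts of a DMC (values are not validated)."""
    model_ident_code: str
    system_diff_code: str
    system_code: str
    sub_system_code: str
    sub_sub_system_code: str
    assy_code: str
    disassy_code: str
    disassy_code_variant: str
    info_code: str
    info_code_variant: str
    item_location_code: str

    def __str__(self) -> str:
        # Hyphen-separated form; paired fields are written together.
        return "-".join([
            "DMC",
            self.model_ident_code,
            self.system_diff_code,
            self.system_code,
            self.sub_system_code + self.sub_sub_system_code,
            self.assy_code,
            self.disassy_code + self.disassy_code_variant,
            self.info_code + self.info_code_variant,
            self.item_location_code,
        ])


# Hypothetical legacy paragraph that could be filed under two info codes.
first = DataModuleCode("MYPROJ", "A", "29", "1", "0", "00", "00", "A", "520", "A", "A")
second = replace(first, info_code="720")

print(first)
print(second)
```

Which of the two codes is correct is a business-rule decision, which is why the rules need to be settled before conversion starts.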
Structuring a Book into Data Modules in S1000D
[Figure: legacy books (a PDF book, a 38784 book, and an ATA book, made up of paragraphs, appendices, pageblocks, tasks and subtasks) are split into typed data modules (Descriptive, Procedural, IPD, Wiring, Crew, Process, Maintenance and Fault DMs) stored in the S1000D Common Source Database. Publications 1, 2 and 3 are then assembled from those data modules.]
Further Complications in S1000D Conversion
• There are the usual conversion issues
– Accuracy of the transferred text
– Tables
– Math or odd-looking text
– Special Characters
• There are also the structuring issues
– Identifying DMs
– Identifying reusable content
– Identifying Applicability
• And the people issues
– Getting rugged individualists to collaborate more
– Deciding what needs re-authoring
– Getting used to a new “document” paradigm
Most Importantly – Plan!!!
• Ask the important initial questions
– Who are the stakeholders? Who is the final client/user?
– What is the estimated volume and deadline?
– What is the source format? Not all source data are created equal.
– What version of S1000D?
– Do we know what CMS or rendering tools will be used?
– What is the budget?
• Ask around or join discussion groups.
• Get your hands on the source data, business rules, and schemas.
• Begin looking for the right people. You don’t need to be S1000D-savvy,
but you do need, at a minimum, to understand the concept.
Ask Questions
“If I had eight hours to chop
down a tree, I'd spend six
sharpening my ax.”
- Abraham Lincoln
DCL’s Project Start-up Methodology
What Does a Conversion Project Look Like?
• Conversion Setup Components
– Inventory & Assessment
– Reuse Analysis
– Document Analysis
– Conversion Specification
– Architecture Design & Configuration
– Design & Develop Conversion SW
– Design & Develop Automation & Workflow SW
– Conversion SW Testing
– Training
• Conversion Production Components
– Organizing Content for Conversion
– Hosting & Running Conversion SW
– Hosting & Running Automation & Workflow SW
– Scanning & OCR
– Image Processing
– Proofreading
– Pre-Conversion Document Preparation
– Conversion
– Parse/View
– Quality Control
– Reporting, Audit & Reconciliation
Inventory & Assessment
• Log the batches received into a production control system.
• By logging and tracking each unit you can gather information
that can be used to:
– Project delivery schedules
– Confirm that processes are working properly
– Track each unit and show which step of the production
process it is in.
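As a rough illustration of the logging and tracking described above, the following sketch keeps an in-memory record of each unit and the steps it has passed through. The class names, step names and identifiers are hypothetical; a real production control system would persist this data and carry far more detail.

```python
from collections import Counter
from dataclasses import dataclass, field
from datetime import date

# Hypothetical step names, loosely based on the production components in this deck.
STEPS = ["received", "document_preparation", "conversion", "quality_control", "delivered"]


@dataclass
class Unit:
    unit_id: str
    batch_id: str
    history: list = field(default_factory=list)  # (step, date) pairs, oldest first

    @property
    def current_step(self):
        return self.history[-1][0] if self.history else None


class ProductionLog:
    def __init__(self):
        self.units = {}

    def log_received(self, batch_id, unit_ids, when):
        """Log every unit of a received batch."""
        for unit_id in unit_ids:
            unit = Unit(unit_id, batch_id)
            unit.history.append(("received", when))
            self.units[unit_id] = unit

    def advance(self, unit_id, step, when):
        """Record that a unit has entered a new step."""
        if step not in STEPS:
            raise ValueError(f"unknown step: {step}")
        self.units[unit_id].history.append((step, when))

    def where_is(self, unit_id):
        """Return the step a unit is currently in."""
        return self.units[unit_id].current_step

    def step_counts(self):
        """Count units per step, a simple input for schedule projections."""
        return Counter(unit.current_step for unit in self.units.values())


log = ProductionLog()
log.log_received("BATCH-001", ["TM-1", "TM-2"], date(2024, 1, 2))
log.advance("TM-1", "conversion", date(2024, 1, 5))
print(log.where_is("TM-1"), dict(log.step_counts()))
```

A step that accumulates units while later steps stay empty is one sign that a process is not working properly.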
Inventory & Assessment: What to Convert, and in What Order
• Categorizing
– Active documents in good shape
– Active documents that need a lot of work
– Somewhat inactive documents that will likely be retired
– Archival materials
• Prioritizing
– Documents that are most used
– Documents that are customer favorites
– Documents with longest product life
– Start with most recent documents and go back
• Identifying the process
– Can be converted as is
– Can be converted with some work
– Needs to be rewritten
– Don’t convert – just keep archival copies
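The categorize, prioritize and identify-the-process steps above could be sketched as a small triage routine. Everything here is an assumption made for illustration: the field names, the condition labels and the sort key are invented, and a real project would set its own criteria with its stakeholders.

```python
from dataclasses import dataclass


@dataclass
class Doc:
    title: str
    active: bool
    condition: str              # hypothetical labels: "good", "poor", "unusable"
    monthly_uses: int
    years_of_product_life: int


def process_for(doc):
    """Pick one of the four processes listed on the slide."""
    if not doc.active:
        return "keep archival copy"
    if doc.condition == "good":
        return "convert as is"
    if doc.condition == "poor":
        return "convert with some work"
    return "rewrite"


def conversion_order(docs):
    """Most-used documents first, then those with the longest product life."""
    candidates = [d for d in docs if process_for(d) != "keep archival copy"]
    return sorted(
        candidates,
        key=lambda d: (d.monthly_uses, d.years_of_product_life),
        reverse=True,
    )


docs = [
    Doc("Engine manual", True, "good", 120, 15),
    Doc("Wiring manual", True, "poor", 300, 10),
    Doc("Retired trainer guide", False, "good", 2, 0),
]
for doc in conversion_order(docs):
    print(doc.title, "->", process_for(doc))
```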
Why Is Reuse Analysis Important?
• Increased consistency
• Reduced development time
• Lower maintenance costs
• Rapid reconfiguration
• Divide and conquer
Content Reuse Analysis Reports
• Finding exact or similar text will help you when mapping to Data Modules
• It will also help to detect applicability and inconsistencies
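One simple way to find exact or similar text is a pairwise comparison with Python's standard difflib module, sketched below. This is not DCL's tooling; the function names, the paragraph IDs and the 0.85 threshold are assumptions chosen for the example, and a pairwise scan like this only suits small document sets.

```python
import difflib
import re
from itertools import combinations


def normalize(text):
    """Lower-case and collapse whitespace so layout differences are ignored."""
    return re.sub(r"\s+", " ", text.strip().lower())


def reuse_report(paragraphs, threshold=0.85):
    """Return (id_a, id_b, kind, ratio) for exact and near-duplicate pairs.

    paragraphs maps a paragraph ID to its text.
    """
    normalized = {pid: normalize(text) for pid, text in paragraphs.items()}
    report = []
    for a, b in combinations(sorted(normalized), 2):
        if normalized[a] == normalized[b]:
            report.append((a, b, "exact", 1.0))
            continue
        ratio = difflib.SequenceMatcher(None, normalized[a], normalized[b]).ratio()
        if ratio >= threshold:
            report.append((a, b, "similar", round(ratio, 2)))
    return report


paragraphs = {
    "p1": "Disconnect the battery before servicing.",
    "p2": "Disconnect  the battery before servicing.",
    "p3": "Disconnect the battery before servicing the pump.",
    "p4": "Torque the bolts to 25 Nm.",
}
for row in reuse_report(paragraphs):
    print(row)
```

Exact matches are candidates for a single shared data module. Similar matches deserve a closer look, since the difference may be a genuine applicability variation or an inconsistency to fix.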
Document Analysis & Conversion Specification
• Evaluate document sources to determine the
relative ease & accuracy of content extraction
• Identify metadata sources
• Identify the types of information in the documents
and the appropriate level of tagging
• Identify processes for various materials
• Detailed analysis of documents by type
• Review enough documents to understand the
potential variations
• Develop tagging instructions
• Prepare specification
• Normalize your data
Document Analysis – Text extraction
[Figure: sample document text shown alongside its OCR output]
The Conversion Specification (DMRL & specific rules)
Normalizing Your Data
Q&A
Naveh Greenberg
Director, U.S. Defense Development,
Data Conversion Laboratory
(718) 307-5758
ngreenberg@dclab.com
@dclaboratory

Editor's Notes

  • #13, #14, #17, #22:
    – There are many more components to getting a conversion project done than most people think.
    – There are also many more things that need to be set up so that there is no surprise, or rework, later when you’re chunking things out.
    – I tried to lay out the common tasks that I would expect in a large conversion project. There are of course some variations, but these are the major ones.
    – Traditionally, most of this was done by whoever was “in charge of the conversion”, and that was the predominant model until a few years ago.
    – What we’re finding today is that a hybrid model, where different parties handle some of the tasks, often works better, especially when the client company already has significant resources for some of the tasks but needs expertise for others.
    – Later in this talk I will discuss several case studies of how this might work.
    – But first, I would like to go through what the various steps are, and a little about what gets done in each one.
    – These two wheels represent the various tasks: the left wheel, read clockwise, represents what gets done to get set up, and the right wheel represents the production tasks.