easyDITA How-To Series:
Taxonomy 101: Classifying DITA Tasks



    Paul Wlodarczyk
    CEO, Jorsek LLC
    June 28, 2012
Poll: Please complete while folks arrive

 How are you delivering DITA Tasks or other
 procedural / how-to content?
 •    Portal with advanced / faceted search
 •    Static web pages or web help
 •    Print / PDF
 •    Windows Help
 •    Other



  6/28/2012              © Jorsek, LLC. All Rights Reserved.   2
Why talk about task oriented content?
 Task-oriented content is
 valuable:
 • It is versatile and can be
    reused in more
    deliverables than
    conceptual content
        –     Product user guides
        –     Context-sensitive help
        –     Knowledge base
        –     Support
        –     Training
 • It’s what most users are
   searching for in a
   knowledge base or help

 Figure: A DITA Task published to MindTouch

Benefits of using DITA for authoring tasks
 • Tasks authored in DITA are
        –     Concise
        –     Consistent
        –     Modular
        –     Semantic
 • DITA Tasks make good
   templates for content
   contributed by SMEs (like
   product engineers)
 • For software user assistance (UA) in
   particular, task-oriented
   content is well suited to QA: the
   task can double as the test case.

 Figure: The XML Source for a DITA Task
Anatomy of a DITA Task
 •    Title
 •    Short description
 •    Context
 •    Prerequisite
 •    Step section
 •    Step
 •    Command
 •    Sub Step
 •    Step Info
 •    Step Result
 •    Step Example
 •    Choice and Choice Table
 •    Example
 •    Post-requisite
 •    Result

 Figure: A DITA Task in easyDITA
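The parts listed above map onto DITA element names (title, shortdesc, prereq, context, steps, step, cmd, info, stepresult, result, postreq). As a minimal sketch, assuming Python's standard xml.etree.ElementTree and invented sample wording for a copier task, a small task topic could be assembled like this. The function name build_task and the id value are illustrative only, and the output is not validated against the DITA DTD.

```python
import xml.etree.ElementTree as ET


def build_task() -> ET.Element:
    """Assemble a small DITA task topic (sample wording is illustrative)."""
    task = ET.Element("task", id="clear_paper_jam")
    ET.SubElement(task, "title").text = "Clearing a paper jam"
    ET.SubElement(task, "shortdesc").text = "Remove jammed paper from the copier."

    # taskbody holds the sections in the order the task model expects.
    body = ET.SubElement(task, "taskbody")
    ET.SubElement(body, "prereq").text = "Turn the copier off."
    ET.SubElement(body, "context").text = "Jams can occur in the envelope tray."

    steps = ET.SubElement(body, "steps")
    first = ET.SubElement(steps, "step")
    ET.SubElement(first, "cmd").text = "Open the front cover."
    ET.SubElement(first, "info").text = "The latch is on the left side."
    ET.SubElement(first, "stepresult").text = "The paper path is visible."

    second = ET.SubElement(steps, "step")
    ET.SubElement(second, "cmd").text = "Pull the jammed sheet out slowly."

    ET.SubElement(body, "result").text = "The jam indicator turns off."
    ET.SubElement(body, "postreq").text = "Close the cover and turn the copier on."
    return task


if __name__ == "__main__":
    print(ET.tostring(build_task(), encoding="unicode"))
```

Because every part is its own element rather than a list item or paragraph, downstream tools can address each one by name.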
DITA Tasks are semantic

 • DITA tasks are inherently semantic
        – Not simple ordered lists
        – Not simple paragraphs
 • This is useful for
        – Dynamic rendition, e.g.
              • Expand / collapse steps
              • Interactive UI controls
        – Semantic Search in the context of the structure, e.g.
              • Find STEPS that contain MENU CASCADES
              • Find STEP INFORMATION that contains IMAGES tagged with [text]
              • Find PREREQUISITES that contain [text]




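The structure-aware searches above can be sketched with Python's standard xml.etree.ElementTree. This is an illustration only, not how any particular CMS or search engine implements it; the sample topic and the helper names steps_with_menu_cascade and prereqs_containing are assumptions.

```python
import xml.etree.ElementTree as ET

# Sample topic invented for illustration.
TASK_XML = """<task id="publish_map">
  <title>Publishing a DITA map to PDF</title>
  <taskbody>
    <prereq>Install the DITA Open Toolkit.</prereq>
    <steps>
      <step><cmd>Choose <menucascade><uicontrol>File</uicontrol>
        <uicontrol>Publish</uicontrol></menucascade>.</cmd></step>
      <step><cmd>Select PDF as the output format.</cmd></step>
    </steps>
  </taskbody>
</task>"""


def steps_with_menu_cascade(task):
    """Return the step elements that contain a menucascade at any depth."""
    return [s for s in task.iter("step") if s.find(".//menucascade") is not None]


def prereqs_containing(task, text):
    """Return prereq elements whose text contains `text` (case-insensitive)."""
    needle = text.lower()
    return [p for p in task.iter("prereq")
            if needle in "".join(p.itertext()).lower()]


task = ET.fromstring(TASK_XML)

if __name__ == "__main__":
    print(len(steps_with_menu_cascade(task)), "step(s) contain a menu cascade")
    print(len(prereqs_containing(task, "open toolkit")), "prerequisite(s) match")
```

The same queries against an ordered list in a word processor file would have to guess which paragraphs are steps and which are prerequisites.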
Making tasks more findable with metadata

 • Q: How can we make content
   even more findable
        – For authors and content
          managers?
        – For end users in a dynamic
          delivery system?
 • A: Tag tasks with semantic
   metadata
        – Semantic = “meaning”
        – Metadata can be set with terms
          from controlled vocabularies
          defined and managed in a
          taxonomy
What is Metadata?
  • Literally “Data about the data”
  • Also known as “tags”
             – Not to be confused with the content itself (e.g.
               XML structure)
             – Can be embedded in a file (e.g. the DITA Prolog
               or attributes; JPEG image data) or associated in
               a CMS
  • Two main flavors:
             – Administrative metadata
                 • e.g. Content Type, Author, Date
                   Modified, Version, Title, etc.,
                 • Usually system-generated
                 • What the content is
             – Descriptive metadata
                 • Subject classification, keywords, etc.
                 • Usually manually authored
                 • What the content is about

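To make the two flavors concrete, here is a minimal sketch that reads a DITA prolog and separates administrative values (author, dates) from descriptive ones (keywords, audience). It assumes Python's standard xml.etree.ElementTree; the sample prolog values and the function name read_prolog are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Sample prolog invented for illustration.
PROLOG_XML = """<prolog>
  <author>Pat Example</author>
  <critdates>
    <created date="2012-06-01"/>
    <revised modified="2012-06-28"/>
  </critdates>
  <metadata>
    <audience type="user" experiencelevel="novice"/>
    <keywords>
      <keyword>paper jam</keyword>
      <keyword>envelope tray</keyword>
    </keywords>
  </metadata>
</prolog>"""


def read_prolog(prolog):
    """Split prolog values into administrative and descriptive metadata."""
    administrative = {  # what the content is
        "author": prolog.findtext("author"),
        "created": prolog.find("critdates/created").get("date"),
        "revised": prolog.find("critdates/revised").get("modified"),
    }
    descriptive = {  # what the content is about
        "keywords": [k.text for k in prolog.iter("keyword")],
        "audience": prolog.find("metadata/audience").get("type"),
    }
    return administrative, descriptive


administrative, descriptive = read_prolog(ET.fromstring(PROLOG_XML))

if __name__ == "__main__":
    print(administrative)
    print(descriptive)
```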
Key Concept: Taxonomy
   taxonomy n. A categorization
   scheme for concepts, often
   hierarchical
   • Most often, taxonomies show “is a”
     relationships, e.g. A mammal is a
     vertebrate, A rodent is a mammal, etc.
   • Navigation up and down the tree yields
     broader term (BT) and narrower term
     (NT) relationships
             – Can be used to adjust search scope
   • Can also show related terms (RT)
             – Can be used to suggest related searches / “see
               also”
   • Can manage synonyms (UF – Used For)
             – Can be used to find content when search
               terms are not the preferred terms



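The BT / NT / RT / UF relationships can be modeled in a few lines. The sketch below is an assumption about one possible data structure, not the format of any taxonomy tool; the class names Term and Taxonomy and the synonym "Rodentia" are illustrative. It uses the mammal example from this slide, with plural category labels.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Term:
    label: str                      # preferred term
    broader: Optional[str] = None   # BT: the parent term, if any
    related: List[str] = field(default_factory=list)   # RT
    used_for: List[str] = field(default_factory=list)  # UF: synonyms


class Taxonomy:
    def __init__(self, terms: List[Term]) -> None:
        self.terms: Dict[str, Term] = {t.label: t for t in terms}

    def narrower(self, label: str) -> List[str]:
        """NT: terms whose broader term is `label`."""
        return sorted(t.label for t in self.terms.values() if t.broader == label)

    def preferred(self, word: str) -> Optional[str]:
        """Map a search word to its preferred term via label or UF synonyms."""
        w = word.lower()
        for t in self.terms.values():
            if w == t.label.lower() or w in (s.lower() for s in t.used_for):
                return t.label
        return None

    def expand(self, label: str) -> List[str]:
        """Widen search scope: the term plus all narrower terms, recursively."""
        result = [label]
        for child in self.narrower(label):
            result.extend(self.expand(child))
        return result


taxonomy = Taxonomy([
    Term("Vertebrates"),
    Term("Mammals", broader="Vertebrates"),
    Term("Rodents", broader="Mammals", used_for=["Rodentia"]),
])

if __name__ == "__main__":
    print(taxonomy.expand("Vertebrates"))
    print(taxonomy.preferred("rodentia"))
```

A search for a broad term can then match content tagged with any of its narrower terms, and a search for a synonym can be redirected to the preferred term.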
Using Taxonomy for controlled vocabularies
 • A taxonomy is the “source of truth” for what terms to use for
   various concepts – so terms are consistent.
 • Taxonomy terms can be used as controlled vocabularies (“pick-
   lists”) for metadata, so authors simply select preferred terms
        – Avoids typos, duplicates, word form variations, use of non-preferred terms
 • Some content management systems enable controlled
   vocabularies from taxonomies to be used for setting attribute
   values in DITA (e.g. select-atts attributes such as audience, product, and platform).
 • Relationships between terms in a Taxonomy can improve search
        – CMS search and site search indexing tools can use equivalent and related
          terms to find content that does not contain the search term
        – Relationships between terms can be expressed as RDF in HTML content for
          improving web search indexing



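Where a CMS does not enforce pick-lists, a simple check can catch values that are not in the controlled vocabulary. The sketch below assumes Python's standard xml.etree.ElementTree; the vocabulary contents, the sample steps, and the function name invalid_values are invented for illustration. DITA conditional attributes hold space-delimited values, so each value is checked separately.

```python
import xml.etree.ElementTree as ET

# Hypothetical pick-lists; in practice these would come from the taxonomy.
VOCABULARIES = {
    "audience": {"technician", "novice", "administrator"},
    "product": {"MFD100", "P1000"},
    "platform": {"windows", "linux"},
}

SAMPLE = """<steps>
  <step audience="technician" product="MFD100"><cmd>Open the cover.</cmd></step>
  <step audience="technician novise" product="P1000"><cmd>Run diagnostics.</cmd></step>
</steps>"""


def invalid_values(root, vocabularies):
    """Return (element, attribute, value) for each value outside its vocabulary."""
    problems = []
    for element in root.iter():
        for attribute, allowed in vocabularies.items():
            # Attribute values are space-delimited lists of terms.
            for value in element.get(attribute, "").split():
                if value not in allowed:
                    problems.append((element.tag, attribute, value))
    return problems


problems = invalid_values(ET.fromstring(SAMPLE), VOCABULARIES)

if __name__ == "__main__":
    for tag, attribute, value in problems:
        print(f"<{tag}> has unknown {attribute} value: {value}")
```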
Simple framework for tagging tasks
• In any industry, we’re all trying to help people do something to
  something in a context:
      – Who is doing what to what (+ other important context or condition)
• Examples
     – Junior Service Technician doing preventive maintenance on Acme Jetpack
       XR7 that uses nitrous oxide injection technology
     – Casual User clearing paper jam on MFD100 Copier with envelope tray option
     – Case Worker performing an intake interview for a recently unemployed
       person in New York State
     – Intermediate User publishing a DITA Map using DITA OT to PDF format
     – Financial analyst calculating a WACC for a publicly traded company located in
       a country using GAAP accounting
     – Registered Nurse administering medication to patient in the ICU and drug is a
       controlled substance
     – Contract Service Technician doing diagnosis on P1000 Printer showing missing
       sections of the printed image
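The "who is doing what to what, in what context" framework can be captured as a small record. This is a sketch under assumptions: the class name TaskTags and its field names are illustrative, and the values come from two of the examples on this slide.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class TaskTags:
    performer: str                  # who
    activity: str                   # is doing what
    obj: str                        # to what / to whom
    context: Tuple[str, ...] = ()   # other important context or conditions

    def facets(self) -> Dict[str, List[str]]:
        """Return the tags grouped by facet name, ready for indexing."""
        return {
            "performer": [self.performer],
            "activity": [self.activity],
            "object": [self.obj],
            "context": list(self.context),
        }


copier_jam = TaskTags(
    performer="Casual User",
    activity="Clearing paper jam",
    obj="MFD100 Copier",
    context=("Envelope tray option",),
)

nurse_medication = TaskTags(
    performer="Registered Nurse",
    activity="Administering medication",
    obj="Patient",
    context=("ICU", "Controlled substance"),
)

if __name__ == "__main__":
    print(copier_jam.facets())
    print(nurse_medication.facets())
```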
What metadata do you need?
 Information about the
 Performer, Activity, Object, and Context will help
 narrow search results for a user or author (see our
 blog post on Metadata 101: A Search First Approach)
 • Performer metadata:
     – Types of users (roles, experience, education
       level, etc.)
     – Types of employees
       (title, training, certifications, clearance, department,
       skill level, etc.)
     – Types of customers
 • Activity metadata:
       – Broad Task Types (e.g. for service:
           maintenance, diagnosis, repair, calibration, startup,
            etc.)
       – High Level Task names from a performance analysis
           / instructional design
       – Competencies from a model
What metadata do you need?
 • Object (i.e. “To what / to whom”) metadata:
       – Things: Product, product components, product
         subsystems
       – People: Types of customers or clients
 • Context metadata:
       –     Market / locale
       –     Product options
       –     Technologies
       –     Special situations
       –     Tools required
       –     Security classification
       –     Symptoms / Fault codes


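Once tasks carry Performer, Activity, Object, and Context metadata, narrowing search results amounts to filtering on facets. The sketch below is illustrative only: the task records, the third task title, and the function name narrow are assumptions, and a real delivery portal would do this in its search index.

```python
# Each task lists its tags per facet; titles and values are illustrative.
TASKS = [
    {
        "title": "Diagnosing missing image sections",
        "performer": {"Contract Service Technician"},
        "activity": {"Diagnosis"},
        "object": {"P1000 Printer"},
        "context": {"Missing sections of printed image"},
    },
    {
        "title": "Clearing a paper jam",
        "performer": {"Casual User"},
        "activity": {"Repair"},
        "object": {"MFD100 Copier"},
        "context": {"Envelope tray option"},
    },
    {
        "title": "Calibrating the print head",
        "performer": {"Contract Service Technician"},
        "activity": {"Calibration"},
        "object": {"P1000 Printer"},
        "context": set(),
    },
]


def narrow(tasks, **facets):
    """Keep the tasks that carry every requested facet value."""
    return [
        task for task in tasks
        if all(value in task.get(name, set()) for name, value in facets.items())
    ]


if __name__ == "__main__":
    for task in narrow(TASKS, object="P1000 Printer", activity="Diagnosis"):
        print(task["title"])
```

Each added facet shrinks the result set, which is the effect a faceted search portal gives end users.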
Do we have to create these terms from scratch?
 No! You are surrounded by free sources for term lists, many of them
 governed and authoritative. Don’t reinvent – borrow!
 Here are some common sources of terms:
 •       Corporate ECM or Web taxonomy (from IT or marketing)
 •       Industry-specific taxonomies (e.g. MeSH for life sciences, DSM for mental health)
 •       Government taxonomies (e.g. UK IPSV - Integrated Public Sector Vocabulary)
 •       Generic public domain taxonomies (e.g. People, Places, and Cultures; AP News)
 •       Other corporate sources:
           –     Training group (competency models, task analyses)
           –     HR (Job codes and Job Titles)
           –     Support / field service systems (Parts, fault classifications, failure modes, tools used)
           –     CRM data (Customer names, Customer categories, SKUs, Products & Services)
           –     Product data (Product BOMs, platforms, parts, subsystems, options)
           –     Organization Charts (Divisions, departments, locations, budget centers)
           –     Business Process Analysis (process names and steps, inputs and outputs)



Taxonomy Tools
 • You can build and manage a simple taxonomy in Microsoft Excel
 • Even if authors manually tag metadata, the Excel taxonomy can be a useful
   guide and source of terms to copy/paste
 • Each row is a term and each column is a level in the hierarchy




 • Put other data required for related and equivalent terms in columns to right of
   preferred term hierarchy
 • Add a column for scope notes
 • Use Grouping to help expand / collapse sections of a long taxonomy
 • If you have a CMS or other tool that consumes taxonomy, you can export a CSV
   file from Excel and import it to the CMS (see Mary Garcia’s excellent blog posts at
     TaxoDiary.com to learn how)
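A spreadsheet laid out as described (one term per row, one column per hierarchy level, extra columns to the right) can be read back from its CSV export with a short script. The sketch below assumes a hypothetical column layout of three level columns followed by "Used For" and "Scope Note", with synonyms separated by semicolons; the function name load_taxonomy and the sample rows are illustrative, and the import format of any given CMS will differ.

```python
import csv
import io

# Hypothetical CSV export: three level columns, then Used For and Scope Note.
CSV_TEXT = """Level 1,Level 2,Level 3,Used For,Scope Note
Activities,,,,
,Maintenance,,Servicing,
,,Preventive maintenance,PM,Scheduled work done before a failure
,Diagnosis,,Troubleshooting,
"""


def load_taxonomy(text, levels=3):
    """Turn indented rows into terms that know their broader term."""
    rows = csv.reader(io.StringIO(text))
    next(rows)  # skip the header row
    path = []   # current chain of terms from the top level down
    terms = []
    for row in rows:
        if not row:
            continue
        for depth in range(levels):
            label = row[depth].strip()
            if label:
                # The column position gives the depth; trim the path to match.
                path = path[:depth] + [label]
                synonyms = [s.strip() for s in row[levels].split(";") if s.strip()]
                terms.append({
                    "label": label,
                    "broader": path[depth - 1] if depth else None,
                    "used_for": synonyms,
                    "scope_note": row[levels + 1].strip(),
                })
                break
    return terms


terms = load_taxonomy(CSV_TEXT)

if __name__ == "__main__":
    for term in terms:
        print(term["label"], "-> BT:", term["broader"])
```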
Taxonomy Tools
 • Consider using a Taxonomy Management System if:
    – You have a large taxonomy (over 500 terms)
    – The taxonomy changes often
    – You have a complex governance process for approving new terms
    – The taxonomy needs to be consumed by more than one system
    – You are using term relationships to improve search indexing




Guidelines for taxonomy quality
 • The hierarchy should reflect any of three relationships:
        – Generic (e.g. VehicleCar)
        – Instance (e.g. Mountain regionsRockies)
        – Whole-Part (e.g. HouseRoof)
 • Terms should be nouns or noun phrases.
 • Activities should be nouns or gerunds.
 • Avoid adjectives and prepositions unless integral to the term.
 • When in doubt about singular vs. plural, choose plural; these are
   categories. Singular is OK for instances at the narrow end.
 • Named entities should be proper nouns.
 • Avoid punctuation and ampersands. Eliminate hyphens except
   where the term is confusing or unclear without them.
 • Make the most commonly used term the preferred term, even if it
   is an acronym (e.g. NASA). Make other forms Equivalent Terms.
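Some of these guidelines are mechanical enough to check automatically. The sketch below covers only the punctuation, ampersand, and hyphen rules; the function name lint_term and the issue labels are assumptions, and a hyphen is flagged for review rather than rejected, since the guideline allows it where the term is unclear without it.

```python
import re


def lint_term(term):
    """Return a list of guideline issues found in a candidate term."""
    issues = []
    if "&" in term:
        issues.append("ampersand")
    # Anything other than letters, digits, spaces, ampersands, and hyphens.
    if re.search(r"[^\w\s&-]", term):
        issues.append("punctuation")
    if "-" in term:
        issues.append("hyphen")  # review: keep only if the term needs it
    return issues


if __name__ == "__main__":
    for candidate in ["Fault codes", "Products & Services", "Symptoms / Fault codes"]:
        print(candidate, "->", lint_term(candidate) or "ok")
```

Rules about part of speech, singular vs. plural, and preferred-term choice still need a human reviewer.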
Poll:

 Are you currently using controlled
 vocabularies for any of the following?
 •    CMS Metadata
 •    DITA Attributes
 •    Prolog Metadata and Keywords
 •    Other
 •    Not using controlled vocabularies



Resources
 • LinkedIn Taxonomy Community of Practice
 • ANSI/NISO Z39.19-2005 - Guidelines on
   Construction, Format, and Management of Monolingual
   Controlled Vocabularies
 • IBM Presentation: Writing Effective DITA Task Topics
       – http://svdig.ditamap.com/DITATaskTopics_090310SR.ppt
 • TaxoDiary blog posts by Mary Garcia:
   Maintaining a Thesaurus in an Excel Workbook (two parts)
       – http://taxodiary.com/2012/04/maintaining-a-thesaurus-in-an-excel-workbook/
       – http://taxodiary.com/2012/05/maintaining-a-thesaurus-in-an-excel-workbook-part-2/
 • easyDITA blog posts and Twitter
       – easyDITA.com/blog and @easydita

Thank you!
 • Questions?
 • Recorded webcast will be available soon through our website –
   you will get an email with the link
 • Anyone can register after the event to view the recording
 • Slides will be available on SlideShare
       – www.slideshare.net/easydita
 • Next webcast July 25, featuring Amber Swope of DITA Strategies
   discussing Using Taxonomy for DITA Content. Please join us!




