Metadata in general and
Dublin Core in specific;
  some experiences
                  Kerstin Forsberg
            Senior Information Architect
Information Strategy, Clinical Information Science
    Mailto:kerstin.l.forsberg@astrazeneca.com
Public homepage: http://www.viktoria.se/~kerstinf/




                      1
Metadata?
• The magic word metadata comes up both as
  a problem solver and a big problem in itself
  when …
• … talking about integrating databases,
  reviewing data, archiving records, loading
  source tables into DW, decommissioning
  systems, navigating between documents,
  people and projects, searching for
  information, etc.


                      2
Challenges and Insights
• Providing professionals with contextualised
  information
   • “Volvo Core” metadata standard embryo for
     Volvo’s intranet 1998-99
   • Journalists out in the field need information
     based on their current tasks at hand
   • Clinical Scientists need information relevant
     for their research questions and decisions
• Information services for professionals must
  enable continuously ongoing structuring and
  networking; they can never rely on stable
  structures or hierarchies
                       3
”Volvo Core”
• A very early attempt to make use of the
  15 Dublin Core elements
• Identified problems
• “These problems are a consequence of trying
  to describe information resources without
  taking into account the context in which end
  users create and consume information.”
  Experiences of metadata usage reported in a research paper:
  Forsberg, K. and L. Dannstedt (2000) "Extensible use of RDF in a
  business context," Presented at the 9th International World Wide
  Web Conference, Amsterdam, Netherlands, May 2000.
                             4
Research interest: New ways of using IT in
            newsmaking
                                     ”… solutions that move beyond the
                                     desktop out to the workplace.”
                                      V. Bellotti and Y. Rogers







                 Metadata based architecture described in a research paper:
                 Fagrell, H., K. Forsberg and J. Sanneblad (2000)
                 “FieldWise: a Mobile Knowledge Management
                 Architecture,” In Proceedings of ACM 2000 Conference
                 on Computer Supported Cooperative Work
                         5
Medical Informatics Vision

   Increase creativity,
 support decision making
       and efficiency
by enabling researchers to
  exploit clinical scientific
 information globally, and
support personal networks.



                   6
Meeting the needs
• Powerful range of medicines, including
  many world leaders, in 7 major therapy
  areas:
   • Gastrointestinal
   • Cardiovascular
   • Cancer
   • Respiratory
   • Pain Control & Anaesthesia
   • Central Nervous System
   • Infection
• Active portfolio management to
  maintain quality and value

                             7
Drivers for strategic management and
optimal utilisation of clinical information
• Ensuring the usefulness of
  information over time (project in
  progress, abandoned project,
  product on the market or withdrawn
  from the market)
    • Formal and external
       requirements to preserve the
       evidential value for
       regulatory and legal reasons
    • Informal and internal
       requirements to enable
       re-usability for
       scientific and historical
       reasons

“the industry has not yet learned to make best use
of the tools it already has, such as ways to share
information across the various businesses”.
                        The Economist July 2002



                               8
Today’s business focus on …
            Have you delivered your data and
            documents?

            [Diagram: flow from study outline (CDP, CSP) through
            data capture from patient and investigator (p-CRF,
            e-CRF) to study data, CSR document, CSR and a
            submission-ready e-CTD, with the SMF shown alongside]
                                                      9
Instead ...
   Are you motivated, and provided with
   tools and procedures, to contribute to
   our shared information assets?




                     10
Are you motivated, and provided with tools
and procedures, to …
• … make the information assets
  accessible?
   • Do you know where to store and
      how to manage the different
      types of information (e.g.
      applying relevant version
      handling)?
   • Is it available through different
      information services (e.g. is the
      source being properly indexed by
      search engines)?
   • Is it formatted in a way that is
      open for different communication
      channels, presentation interfaces
      and device types?
                             11
Are you motivated, and provided with tools
and procedures, to …
• …make the information assets
  understandable by putting them in
  context?
   •   Relating it to the operational and
       scientific context, i.e. topics, things
       and tasks, we talk about and act
       upon today
   •   Making sense for the present
       community
   •   Combining it with other information
       types and other pieces of
       information



                                   12
Are you motivated, and provided with tools
and procedures, to …
• …ensure that the information assets
  can be part of other contexts, that is,
  to enable re-purposing and future-proofing
  of the information?
   •   To be able to relate it to other parts
       of operational and scientific contexts
   •   In the future, to be able to relate it to
       the operational and scientific
       contexts as they may look then
   •   Making sense for future
       communities
   •   In new combinations




                                   13
Vision
[Diagram: single point access to clinical specific information
(Portal), spanning a range from explicit to tacit: highly
structured data, semi-structured data, unstructured data.
An information model / metaprocess layer sits between the
portal and the underlying sources.]
Sources shown: Impact, Diplomat, SAS, Amos, Maud Olsson’s Notes
Database, Planet, GEL, Shared Files, Library, Partners, External
Databases, Internal Networks, External Networks
Further label in the diagram: Disease characteristics
Examples of existing sources and
applications
[Diagram: R&D Portal and Study Webs, the existing GEL, and a
targeted gap, with two kinds of service:]
• General Search Service: GEL repository, a
  “search enabled source”
• Operational Services: views of GEL information,
  “portlet enabled functionality”

Occasional users                                              Power users
• Reviewers                                                   • GLAs
• Occasional authors                                          • Publishers
• Document consumers                                          • SLiM contributors
• SLiM consumers                                              • Technical writers
One Key Problem
The lack of metadata to enable reuse of
information and to facilitate navigation
between data and documents!

[Diagram: single point access to clinical specific information
(Portal), spanning a range from explicit to tacit: highly
structured data, semi-structured data, unstructured data.
An information model / metaprocess layer sits between the
portal and the underlying sources.]
Sources shown: Impact, Diplomat, SAS, Amos, MATRIX, Planet, GEL,
Shared Files, Library, Partners, External Databases, Internal
Networks, External Networks
Further label in the diagram: Disease characteristics
AZ R&D IM/KM metadata standard
        Implementing Dublin Core
Content
•   Title: A name given to the resource.
•   Subject: The topic of the content of the resource.
•   Description: An account of the content of the resource.
•   Type: The nature or genre of the content of the resource.
•   Source: A reference to a resource from which the
    present resource is derived.
•   Relation: A reference to a related resource.
•   Coverage: The extent or scope of the content of the
    resource.
Intellectual property
•   Creator: An entity primarily responsible for making the
    content of the resource.
•   Publisher: An entity responsible for making the
    resource available.
•   Contributor: An entity responsible for making
    contributions to the content of the resource.
•   Rights: Information about rights held in and over the
    resource.
Instantiation
•   Date: A date associated with an event in the life
    cycle of the resource.
•   Format: The physical or digital manifestation of the
    resource.
•   Identifier: An unambiguous reference to the resource
    within a given context.
•   Language: A language of the intellectual content of
    the resource.

                                                  18
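As a minimal sketch of the grouping on this slide, the 15 elements can be held in a small lookup table and used to check and group a metadata record. The helper names (`make_record`, `by_group`) and the sample values are hypothetical and are not part of the AZ standard.

```python
from typing import Dict, List

# The three groups and their elements, as listed on the slide.
ELEMENT_GROUPS: Dict[str, List[str]] = {
    "Content": ["title", "subject", "description", "type",
                "source", "relation", "coverage"],
    "Intellectual property": ["creator", "publisher", "contributor", "rights"],
    "Instantiation": ["date", "format", "identifier", "language"],
}

# Flat list of all 15 element names.
ALL_ELEMENTS: List[str] = [e for group in ELEMENT_GROUPS.values() for e in group]


def make_record(**values: str) -> Dict[str, str]:
    """Build a record, rejecting names that are not among the 15 elements."""
    unknown = sorted(set(values) - set(ALL_ELEMENTS))
    if unknown:
        raise ValueError(f"not Dublin Core elements: {unknown}")
    return dict(values)


def by_group(record: Dict[str, str]) -> Dict[str, Dict[str, str]]:
    """Arrange the filled-in elements of a record under the three groups."""
    return {
        group: {e: record[e] for e in elements if e in record}
        for group, elements in ELEMENT_GROUPS.items()
    }


if __name__ == "__main__":
    # Hypothetical sample values, for illustration only.
    record = make_record(title="Example study report", language="en")
    print(by_group(record))
```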
AZ R&D IM/KM metadata standard
 Core Metadata Elements, parts of
Element Name   Description                                                      Comments

Identifier     An unambiguous reference to the resource within a given          Unique within the
               context. Recommended best practice is to identify the            Information Resource
               resource by means of a string or number conforming to a
               formal identification system.

Title          The name given to the resource. Typically, a Title will be a     Free text
               name by which the resource is formally known.

Description    An account of the content of the resource.                       Free text

Subject        The topic of the content of the resource. Typically, a Subject   Controlled Vocabulary
               will be expressed as keywords or key phrases or                  required
               classification codes that describe the topic of the resource.
               Recommended best practice is to select a value from a
               controlled vocabulary or formal *classification scheme.
               *In the IM/KM program we will pick one or several
               Subject(s) from a selected Taxonomy. Subjects are also
               known as Taxonomy Nodes/Terms in a Taxonomy context.
                                        19
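The comments column above can be read as simple validation rules. The sketch below is one possible reading and is not the IM/KM implementation: the taxonomy terms are placeholders borrowed from the therapy areas named earlier in the deck, and treating all four elements as mandatory is an assumption.

```python
from typing import Any, Dict, List, Set

# Placeholder taxonomy; the real IM/KM taxonomy is not given in the slides.
TAXONOMY: Set[str] = {"Cardiovascular", "Respiratory", "Cancer"}


def check_record(record: Dict[str, Any], seen_identifiers: Set[str]) -> List[str]:
    """Return a list of problems found in one metadata record.

    seen_identifiers collects the identifiers already used within the same
    information resource, so that uniqueness can be checked.
    """
    problems: List[str] = []

    # Identifier: unique within the information resource.
    identifier = record.get("identifier")
    if not identifier:
        problems.append("identifier is missing")
    elif identifier in seen_identifiers:
        problems.append(f"identifier {identifier!r} is not unique")
    else:
        seen_identifiers.add(identifier)

    # Title and Description: free text (assumed here to be non-empty).
    for name in ("title", "description"):
        value = record.get(name)
        if not isinstance(value, str) or not value.strip():
            problems.append(f"{name} must be non-empty free text")

    # Subject: one or several terms picked from the controlled vocabulary.
    subjects = record.get("subject") or []
    if not subjects:
        problems.append("at least one subject is required")
    for term in subjects:
        if term not in TAXONOMY:
            problems.append(f"subject {term!r} is not in the taxonomy")

    return problems
```

A caller would keep one `seen_identifiers` set per information resource and pass it to every check, so that a repeated identifier is reported on its second use.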
General Issues
• What types of Information Resources does
 Dublin Core fit?
   • Information Resources
       • Work Area
       • eRoom
       • Infospace
   • Information Content
       • News
       • Web Content
       • Links
   • Information Presentation layer/Container
       • Portlet
                        20
General Issues
• It is not a static list of standard metadata tags!
    • Only to be used as a requirements document for
        programming of content management applications
• Is it an extensible metadata framework for
  standardisation of metadata?
   •   For metadata element naming and encoding of
       metadata values across heterogeneous information
       sources
   •   To enhance usage and sharing, searching and
       navigation between documents, data and web content
   •   Supporting portals, search engines, document
       management systems, content management systems,
       archiving of information, etc.
                          21
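Standardised element naming across heterogeneous sources is often handled with a crosswalk, that is, a per-source table mapping native field names to the shared element names. The sketch below assumes two made-up sources and made-up field names; none of them come from the AZ systems mentioned in the deck.

```python
from typing import Dict

# Hypothetical crosswalks: native field name -> Dublin Core element name.
CROSSWALKS: Dict[str, Dict[str, str]] = {
    "document_system": {
        "DocName": "title",
        "Author": "creator",
        "DocId": "identifier",
        "Keywords": "subject",
    },
    "web_content": {
        "headline": "title",
        "byline": "creator",
        "url": "identifier",
        "tags": "subject",
    },
}


def to_dublin_core(source: str, native: Dict[str, str]) -> Dict[str, str]:
    """Rename the fields of one native record to Dublin Core element names.

    Fields without a mapping are dropped, so every source ends up
    exposing the same vocabulary to search and navigation services.
    """
    mapping = CROSSWALKS[source]
    return {mapping[key]: value for key, value in native.items() if key in mapping}


if __name__ == "__main__":
    item = {"headline": "News item", "url": "http://example.org/n/1", "layout": "wide"}
    print(to_dublin_core("web_content", item))
```

Keeping the mapping as data rather than code means a new source can be added by extending the table, which fits the idea of an extensible framework instead of a static tag list.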
Tricky, but important: Type?
• Type, as originally defined in Dublin Core:
   • “The nature or genre of the content of the
     resource. Type includes terms describing
     general categories, functions, genres, or
     aggregation levels for content.”




                       23
Information type (class of
content)


 Types of “information asset”
that are specified as classes
 of content, having a purpose
         and lifecycle.



     Information Type




                                 24
Actual information, physical
representation


• Information Type
   Types of “information asset” that are specified
   as classes of content, having a purpose and
   lifecycle.
• Operational perspective
   Metadata detailing how and where the
   representation of the content (or the
   embodiment of the information) is created,
   stored and managed
                                             25
Actual information, logical
”aboutness”
• Information Type
   Types of “information asset” that are specified
   as classes of content, having a purpose and
   lifecycle.
• “Subject” perspective
   Metadata representing the “aboutness” of the
   actual content and classifying it according to
   a sustainable hierarchy of organised subjects
   (themes, topics, overall ideas)
• Operational perspective
   Metadata detailing how and where the
   representation of the content (or the
   embodiment of the information) is created,
   stored and managed
                                             26
Actual information, logical
”coverage”
• Information Type
   Types of “information asset” that are specified
   as classes of content, having a purpose and
   lifecycle in their contexts.
• “Subject” perspective
   Metadata representing the “aboutness” of the
   actual content and classifying it according to
   a sustainable hierarchy of organised subjects
   (themes, topics, overall ideas)
• Business perspective
   Metadata representing the “aboutness” of the
   actual content and describing the extent or
   scope of the content in relation to the
   changing business context of interrelated
   organisations, processes, products, etc.
• Operational perspective
   Metadata detailing how and where the
   representation of the content (or the
   embodiment of the information) is created,
   stored and managed
                                              27
Information type, metadata
application
• Information Type
   Types of “information asset” that are specified
   as classes of content, having purpose and
   lifecycle in their contexts.
   Specifies the metadata to be applied in the
   creation and management of information
• “Subject” perspective
   Metadata representing the “aboutness” of the
   actual content and classifying it according to
   a sustainable hierarchy of organised subjects
   (themes, topics, overall ideas)
• Business perspective
   Metadata representing the “aboutness” of the
   actual content and describing the extent or
   scope of the content in relation to the
   changing business context of interrelated
   organisations, processes, products, etc.
• Operational perspective
   Metadata detailing how and where the
   representation of the content (or the
   embodiment of the information) is created,
   stored and managed
                                              28
Information type, metadata
application
• Information Type
   Types of “information asset” that are specified
   as classes of content, having purpose and
   lifecycle in their contexts.
   Specifies the metadata to be applied in the
   creation and management of information and
   used for selection and access to information
• “Subject” perspective
   Metadata representing the “aboutness” of the
   actual content and classifying it according to
   a sustainable hierarchy of organised subjects
   (themes, topics, overall ideas)
• Business perspective
   Metadata representing the “aboutness” of the
   actual content and describing the extent or
   scope of the content in relation to the
   changing business context of interrelated
   organisations, processes, products, etc.
• Operational perspective
   Metadata detailing how and where the
   representation of the content (or the
   embodiment of the information) is created,
   stored and managed
                                              29
Information Type (or Class????)
• Proposed definition and usage within the AZ
  R&D IM/KM context:
   • A class of content having a specified lifecycle
     and required utilisation (behaviour)
   • To specify the metadata structure and the
     metadata rules to be applied in the creation
     and management of information.
   • To specify available metadata to be utilised
     for selection, search, access and presentation
     of information
• Type List for now, but later on a Type
  Registry????
                        30
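One way to picture the proposed definition is a small type list in which each Information Type carries its lifecycle, the metadata required at creation, and the metadata offered for selection and search. This is a sketch under stated assumptions: the class, its field names, the example type "Study report" and its lifecycle states are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List, Tuple


@dataclass(frozen=True)
class InformationType:
    """A class of content with a lifecycle and metadata rules (hypothetical shape)."""
    name: str
    lifecycle: Tuple[str, ...]   # ordered lifecycle states
    required: FrozenSet[str]     # metadata that must be set at creation
    searchable: FrozenSet[str]   # metadata offered for selection and search


# A plain type list for now; a registry could later replace this dictionary.
TYPE_LIST: Dict[str, InformationType] = {
    "Study report": InformationType(
        name="Study report",
        lifecycle=("draft", "approved", "archived"),
        required=frozenset({"identifier", "title", "subject"}),
        searchable=frozenset({"title", "subject"}),
    ),
}


def missing_elements(type_name: str, record: Dict[str, str]) -> List[str]:
    """Metadata rule for creation: which required elements are still missing?"""
    info_type = TYPE_LIST[type_name]
    return sorted(info_type.required - set(record))


def search_view(type_name: str, record: Dict[str, str]) -> Dict[str, str]:
    """Metadata made available for selection, search, access and presentation."""
    info_type = TYPE_LIST[type_name]
    return {k: v for k, v in record.items() if k in info_type.searchable}
```

The same type definition serves both uses named on the slide: `missing_elements` applies it when information is created, and `search_view` applies it when information is selected.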
Are you motivated, and provided with
tools and procedures, to contribute to
our shared information assets?




                  31

More Related Content

PDF
Analytic Platforms in the Real World with 451Research and Calpont_July 2012
PDF
Informatics technologies in an evolving r & d landscape
PPTX
CNI Fall 2011 Meeting Presentation Margaret Hedstrom & Robert McDonald (Dec. ...
PPTX
Repository Federation: Towards Data Interoperability
PPTX
SEAD Datanet and Sustainability Science
PPTX
Needs for Data Management & Citation Throughout the Information Lifecycle
PPTX
SEAD Virtual Archive: Building a Federation of Institutional Repositories fo...
PDF
Translational Research Intelligence - Beyond Traditional Bi
Analytic Platforms in the Real World with 451Research and Calpont_July 2012
Informatics technologies in an evolving r & d landscape
CNI Fall 2011 Meeting Presentation Margaret Hedstrom & Robert McDonald (Dec. ...
Repository Federation: Towards Data Interoperability
SEAD Datanet and Sustainability Science
Needs for Data Management & Citation Throughout the Information Lifecycle
SEAD Virtual Archive: Building a Federation of Institutional Repositories fo...
Translational Research Intelligence - Beyond Traditional Bi

What's hot (20)

PPTX
NISO Forum, Denver, Sept. 24, 2012: Scientific discovery and innovation in an...
PPT
Where is the opportunity for libraries in the collaborative data infrastructure?
PDF
Provenance and Trust
KEY
NISO Forum, Denver, Sept. 24, 2012: Data Equivalence
PDF
Silverton cleversafe-object-based-dispersed-storage
PDF
Educating a New Breed of Data Scientists for Scientific Data Management
PDF
Anthony J brookes
PDF
20120718 linkedopendataandnextgenerationsciencemcguinnessesip final
PDF
Graham Pryor
PDF
Dc sheridan dlf_2011_final
PDF
Maximize the Business Value of Your Information
PPTX
Semantic Web powering Enterprise and Web Applications
PDF
03 heemskerk eramind mobility mtg_trieste italy_fh_27_may10
PPTX
10052012 luc vervenne synergetics van syntax portfolio naar semantische uitwi...
PDF
Hawaii Pacific GIS Conference 2012: GIS in Education: K-12 and University - H...
PPTX
Why I don't use Semantic Web technologies anymore, event if they still influe...
PDF
ER Studio Facts and Features
PDF
20120419 linkedopendataandteamsciencemcguinnesschicago
PDF
التحول الرقمي للوثائق والمحفوظات من النظم التقليدية لنظم المعلومات في المؤسسات
PPTX
Changing the Curation Equation: A Data Lifecycle Approach to Lowering Costs a...
NISO Forum, Denver, Sept. 24, 2012: Scientific discovery and innovation in an...
Where is the opportunity for libraries in the collaborative data infrastructure?
Provenance and Trust
NISO Forum, Denver, Sept. 24, 2012: Data Equivalence
Silverton cleversafe-object-based-dispersed-storage
Educating a New Breed of Data Scientists for Scientific Data Management
Anthony J brookes
20120718 linkedopendataandnextgenerationsciencemcguinnessesip final
Graham Pryor
Dc sheridan dlf_2011_final
Maximize the Business Value of Your Information
Semantic Web powering Enterprise and Web Applications
03 heemskerk eramind mobility mtg_trieste italy_fh_27_may10
10052012 luc vervenne synergetics van syntax portfolio naar semantische uitwi...
Hawaii Pacific GIS Conference 2012: GIS in Education: K-12 and University - H...
Why I don't use Semantic Web technologies anymore, event if they still influe...
ER Studio Facts and Features
20120419 linkedopendataandteamsciencemcguinnesschicago
التحول الرقمي للوثائق والمحفوظات من النظم التقليدية لنظم المعلومات في المؤسسات
Changing the Curation Equation: A Data Lifecycle Approach to Lowering Costs a...
Ad

Similar to Metadata in general and Dublin Core in specific; some experiences (20)

PDF
ASA conference Feb 2013
PDF
Pistoia Alliance SESL pilot Bio IT World Hanover 12 Oct 2011
PDF
Towards a brokering framework for knowledge-based services: Learning from the...
PDF
Linked data and the future of scientific publishing
PPTX
Conférence Open Data par où commencer ? "How to achieve interoperability?" E....
PPTX
Open data and Collaborative Governance (the UW lecture)
PPT
Supporting Libraries in Leading the Way in Research Data Management
PPTX
ExLibris National Library Meeting @ IFLA-Helsinki - Aug 15th 2012
PDF
Jena based implementation of a iso 11179 meta data registry
PPTX
Information Management and Analytics
PPSX
Content Management in the Pharmaceutical Industry ©RIL
PPTX
Introduction to Advance Analytics Course
PPTX
Open Science
PDF
How Search 2.0 Has Been Redefined by Enterprise 2.0
PDF
Smarter Computing Big Data
PPTX
Small Data: How Elsevier Might Help with Research Data Management
PDF
Expert Webinar Series 5: &quot;De-mystifying Content Types - Four Key Content...
PDF
Big data and big content
PDF
Enterprise Content Management and the Librarian
PPTX
Industry Transformation via Health Analytics
ASA conference Feb 2013
Pistoia Alliance SESL pilot Bio IT World Hanover 12 Oct 2011
Towards a brokering framework for knowledge-based services: Learning from the...
Linked data and the future of scientific publishing
Conférence Open Data par où commencer ? "How to achieve interoperability?" E....
Open data and Collaborative Governance (the UW lecture)
Supporting Libraries in Leading the Way in Research Data Management
ExLibris National Library Meeting @ IFLA-Helsinki - Aug 15th 2012
Jena based implementation of a iso 11179 meta data registry
Information Management and Analytics
Content Management in the Pharmaceutical Industry ©RIL
Introduction to Advance Analytics Course
Open Science
How Search 2.0 Has Been Redefined by Enterprise 2.0
Smarter Computing Big Data
Small Data: How Elsevier Might Help with Research Data Management
Expert Webinar Series 5: &quot;De-mystifying Content Types - Four Key Content...
Big data and big content
Enterprise Content Management and the Librarian
Industry Transformation via Health Analytics
Ad

More from Kerstin Forsberg (20)

PDF
Semantics and linked data at astra zeneca
PPTX
Linked Data efforts for data standards in biopharma and healthcare
PPTX
Linked data presentation for who umc 21 jan 2015
PPTX
A Justification-based Semantic Framework for Representing, Evaluating and Uti...
PPTX
MIE2014: A Framework for Evaluating and Utilizing Medical Terminology Mappings
PPTX
Lankade data Vinnova webbinarium
PPTX
Pushing back, standards and standard organizations in a Semantic Web enabled ...
PDF
CDISC2RDF overview with examples
PDF
CDISC2RDF poster for Conference on Data Integration in the Life Sciences 2013
PDF
Cdisc2 rdf overveiw
PDF
Linked open data it univ 22 nov 2012
PDF
Linked open data example uk spending
PDF
Semantic models for cdisc based standards and metadata management (1)
PDF
Semantic models for cdisc based standards and metadata management
PDF
Linked data in pharma it univ 2 april 2012
PPTX
Linked data introduction w exempel
PPT
Designing and launching the Clinical Reference Library
PDF
Linking clinical data standards
PPTX
Linked data in pharma
PDF
Linked data in pharma R&D
Semantics and linked data at astra zeneca
Linked Data efforts for data standards in biopharma and healthcare
Linked data presentation for who umc 21 jan 2015
A Justification-based Semantic Framework for Representing, Evaluating and Uti...
MIE2014: A Framework for Evaluating and Utilizing Medical Terminology Mappings
Lankade data Vinnova webbinarium
Pushing back, standards and standard organizations in a Semantic Web enabled ...
CDISC2RDF overview with examples
CDISC2RDF poster for Conference on Data Integration in the Life Sciences 2013
Cdisc2 rdf overveiw
Linked open data it univ 22 nov 2012
Linked open data example uk spending
Semantic models for cdisc based standards and metadata management (1)
Semantic models for cdisc based standards and metadata management
Linked data in pharma it univ 2 april 2012
Linked data introduction w exempel
Designing and launching the Clinical Reference Library
Linking clinical data standards
Linked data in pharma
Linked data in pharma R&D

Metadata in general and Dublin Core in specific; some experiences

  • 1. Metadata in general and Dublin Core in specific; some experiences. Kerstin Forsberg, Senior Information Architect, Information Strategy, Clinical Information Science. Mailto: kerstin.l.forsberg@astrazeneca.com; public homepage: http://www.viktoria.se/~kerstinf/
  • 2. Metadata? • The magic word metadata comes up both as a problem solver and a big problem in itself when … • … talking about integrating databases, reviewing data, archiving records, loading source tables into DW, decommissioning systems, navigating between documents, people and projects, searching for information, etc.
  • 3. Challenges and Insights • Providing professionals with contextualised information • “Volvo Core” metadata standard embryo for Volvo’s intranet 1998-99 • Journalists out in the field need information based on their current tasks at hand • Clinical Scientists need information relevant to their research questions and decisions • Information services for professionals must enable ever-ongoing structuring and networking; they can never rely on stable structures or hierarchies
  • 4. ”Volvo Core” • A very early attempt to make use of the 15 Dublin Core elements • Identified problems • “These problems are a consequence of trying to describe information resources without taking into account the context in which end users create and consume information.” Experiences of metadata usage reported in a research paper: Forsberg, K. and L. Dannstedt (2000) "Extensible use of RDF in a business context," Presented at the 9th International World Wide Web Conference, Amsterdam, Netherlands, May 2000.
  • 5. Research interest: New ways of using IT in newsmaking • ”… solutions that move beyond the desktop out to the workplace.” (V. Bellotti and Y. Rogers) • Metadata-based architecture described in a research paper: Fagrell, H., K. Forsberg and J. Sanneblad (2000) “FieldWise: a Mobile Knowledge Management Architecture,” In Proceedings of ACM 2000 Conference on Computer Supported Cooperative Work
  • 6. Medical Informatics Vision: Increase creativity, support decision making and efficiency by enabling researchers to exploit clinical scientific information globally, and support personal networks.
  • 7. Meeting the needs • Powerful range of medicines, including many world leaders, in 7 major therapy areas: • Gastrointestinal • Cardiovascular • Cancer • Respiratory • Pain Control & Anaesthesia • Central Nervous System • Infection • Active portfolio management to maintain quality and value
  • 8. Drivers for strategic management and optimal utilisation of clinical information • Ensuring the usefulness of information over time (project in progress, abandoned project, product on the market or withdrawn from the market) • Formal and external requirements to preserve the evidential value, for regulatory and legal reasons • Informal and internal requirements to enable re-usability, for scientific and historical reasons • “the industry has not yet learned to make best use of the tools it already has, such as ways to share information across the various businesses”. The Economist, July 2002
  • 9. Today’s business focus on … Have you delivered your data and documents? [Diagram: the flow of data and documents from data capture to submission, with labels including p-CRF, e-CRF, CRF, CDP, CSP, CSR, SMF and e-CTD]
  • 10. Instead ... Are you motivated, and provided with tools and procedures, to contribute to our shared information assets?
  • 11. Are you motivated, and provided with tools and procedures, to … • … make the information assets accessible? • Do you know where to store and how to manage the different types of information (e.g. applying relevant version handling)? • Is it available through different information services (e.g. is the source being properly indexed by search engines)? • Is it formatted in a way that is open for different communication channels, presentation interfaces and device types?
  • 12. Are you motivated, and provided with tools and procedures, to … • … make the information assets understandable by putting them in a context? • Relating them to the operational and scientific context, i.e. the topics, things and tasks we talk about and act upon today • Making sense for the present community • Combining them with other information types and other pieces of information
  • 13. Are you motivated, and provided with tools and procedures, to … • … ensure that the information assets can be part of other contexts, that is, to enable re-purposing and future-proofing of the information? • To be able to relate it to other parts of the operational and scientific contexts • In the future, to be able to relate it to the operational and scientific contexts as they may look then • Making sense for future communities • In new combinations
  • 14. Vision: Single point access to clinical specific information (Portal). [Diagram: information ranging from explicit to tacit (highly structured data, semi-structured data, unstructured data), held together by an Information model / Metaprocess. Source labels: Diplomat, Planet, Library, Impact, SAS, GEL, Amos, Maud, Olsson’s Notes Database, Shared Files, External Databases, Internal Networks, External Networks, Partners, Disease characteristics]
  • 16. Examples: existing sources and applications. [Diagram labels: General Search Service; GEL repository, a “search enabled source”; R&D Portal and Targeted Study Webs; Existing GEL Operational Services; Gap; Views of GEL information, “portlet enabled functionality”] • Occasional users: Reviewers, Occasional authors, Document consumers, SLiM consumers • Power users: GLAs, Publishers, SLiM contributors, Technical writers
  • 17. The lack of metadata: One Key Problem to enable reuse of information and to facilitate navigation between data and documents! [Diagram: single point access to clinical specific information (Portal), ranging from explicit to tacit (highly structured data, semi-structured data, unstructured data), held together by an Information model / Metaprocess. Source labels: Diplomat, Planet, Library, Impact, SAS, GEL, Amos, MATRIX, Shared Files, External Databases, Internal Networks, External Networks, Partners, Disease characteristics]
  • 18. AZ R&D IM/KM metadata standard: Implementing Dublin Core
      • Content
          • Title: A name given to the resource.
          • Subject: The topic of the content of the resource.
          • Description: An account of the content of the resource.
          • Type: The nature or genre of the content of the resource.
          • Source: A reference to a resource from which the present resource is derived.
          • Relation: A reference to a related resource.
          • Coverage: The extent or scope of the content of the resource.
      • Intellectual property
          • Creator: An entity primarily responsible for making the content of the resource.
          • Publisher: An entity responsible for making the resource available.
          • Contributor: An entity responsible for making contributions to the content of the resource.
          • Rights: Information about rights held in and over the resource.
      • Instantiation
          • Date: A date associated with an event in the life cycle of the resource.
          • Format: The physical or digital manifestation of the resource.
          • Identifier: An unambiguous reference to the resource within a given context.
          • Language: A language of the intellectual content of the resource.
  • 19. AZ R&D IM/KM metadata standard: Core Metadata Elements (parts of)
      • Identifier. Description: An unambiguous reference to the resource within a given context. Recommended best practice is to identify the resource by means of a string or number conforming to a formal identification system. Comments: Unique within the Information Resource.
      • Title. Description: The name given to the resource. Typically, a Title will be a name by which the resource is formally known. Comments: Free text.
      • Description. Description: An account of the content of the resource. Comments: Free text.
      • Subject. Description: The topic of the content of the resource. Typically, a Subject will be expressed as keywords or key phrases or classification codes that describe the topic of the resource. Recommended best practice is to select a value from a controlled vocabulary or formal classification scheme*. Comments: Controlled Vocabulary required.
      • *In the IM/KM program we will pick one or several Subject(s) from a selected Taxonomy. Subjects are also known as Taxonomy Nodes/Terms in a Taxonomy context.
  • 20. General Issues • What types of Information Resources does Dublin Core fit? • Information Resources: Work Area, eRoom, Infospace • Information Content: News, Web Content, Links • Information Presentation layer/Container: Portlet
  • 21. General Issues • It is not a static list of standard metadata tags! • Only to be used as a requirement document for programming of content management applications • Is it an extensible metadata framework for standardisation of metadata? • For metadata element naming and encoding of metadata values across heterogeneous information sources • To enhance usage and sharing, searching and navigation between documents, data and web content • Supporting portals, search engines, document management systems, content management systems, archiving of information, etc.
  • 22. AZ R&D IM/KM metadata standard: Implementing Dublin Core • Content: Title, Subject, Description, Type, Source, Relation, Coverage • Intellectual property: Creator, Publisher, Contributor, Rights • Instantiation: Date, Format, Identifier, Language (element definitions as on slide 18)
  • 23. Tricky, but important: Type? • Type, as originally defined in Dublin Core: • “The nature or genre of the content of the resource. Type includes terms describing general categories, functions, genres, or aggregation levels for content.”
  • 24. Information type (class of content) • Information Type: Types of “information asset” that are specified as classes of content, having a purpose and lifecycle.
  • 25. Actual information, physical representation • Information Type: Types of “information asset” that are specified as classes of content, having a purpose and lifecycle. • Operational perspective: Metadata detailing how and where the representation of the content (or the embodiment of the information) is created, stored and managed.
  • 26. Actual information, logical ”aboutness” • “Subject” perspective: Metadata representing the “aboutness” of the actual content and classifying it according to a sustainable hierarchy of organised subjects (themes, topics, overall ideas). • Information Type: Types of “information asset” that are specified as classes of content, having a purpose and lifecycle. • Operational perspective: Metadata detailing how and where the representation of the content (or the embodiment of the information) is created, stored and managed.
  • 27. Actual information, logical ”coverage” • “Subject” perspective: Metadata representing the “aboutness” of the actual content and classifying it according to a sustainable hierarchy of organised subjects (themes, topics, overall ideas). • Information Type: Types of “information asset” that are specified as classes of content, having a purpose and lifecycle in their contexts. • Business perspective: Metadata representing the “aboutness” of the actual content and describing the extent or scope of the content in relation to the changing business context of interrelated organisations, processes, products, etc. • Operational perspective: Metadata detailing how and where the representation of the content (or the embodiment of the information) is created, stored and managed.
  • 28. Information type, metadata application • “Subject” perspective: Metadata representing the “aboutness” of the actual content and classifying it according to a sustainable hierarchy of organised subjects (themes, topics, overall ideas). • Information Type: Types of “information asset” that are specified as classes of content, having a purpose and lifecycle in their contexts. Specifies the metadata to be applied in the creation and management of information. • Business perspective: Metadata representing the “aboutness” of the actual content and describing the extent or scope of the content in relation to the changing business context of interrelated organisations, processes, products, etc. • Operational perspective: Metadata detailing how and where the representation of the content (or the embodiment of the information) is created, stored and managed.
  • 29. Information type, metadata application • “Subject” perspective: Metadata representing the “aboutness” of the actual content and classifying it according to a sustainable hierarchy of organised subjects (themes, topics, overall ideas). • Information Type: Types of “information asset” that are specified as classes of content, having a purpose and lifecycle in their contexts. Specifies the metadata to be applied in the creation and management of information and used for selection and access to information. • Business perspective: Metadata representing the “aboutness” of the actual content and describing the extent or scope of the content in relation to the changing business context of interrelated organisations, processes, products, etc. • Operational perspective: Metadata detailing how and where the representation of the content (or the embodiment of the information) is created, stored and managed.
  • 30. Information Type (or Class????) • Proposed definition and usage within the AZ R&D IM/KM context: • A class of content having a specified lifecycle and required utilisation (behaviour) • To specify the metadata structure and the metadata rules to be applied in the creation and management of information. • To specify available metadata to be utilised for selection, search, access and presentation of information • Type List for now, but later on a Type Registry????
  • 31. Are you motivated, and provided with tools and procedures, to contribute to our shared information assets?
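
The Information Type proposed on slide 30 (a class of content that specifies which metadata structure and rules apply) can be illustrated with a small Python sketch. It is only an illustration under stated assumptions, not the AZ R&D IM/KM implementation: the `InformationType` class, its `validate` method, the "Clinical Study Report" type and the sample record are all hypothetical. The element grouping follows slide 18, the required elements follow slide 19, and the vocabulary values are borrowed from the therapy areas on slide 7 as a stand-in for a taxonomy.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set

# The 15 Dublin Core elements, grouped as on slide 18.
DUBLIN_CORE: Dict[str, List[str]] = {
    "content": ["title", "subject", "description", "type",
                "source", "relation", "coverage"],
    "intellectual_property": ["creator", "publisher", "contributor", "rights"],
    "instantiation": ["date", "format", "identifier", "language"],
}
ALL_ELEMENTS: Set[str] = {name for group in DUBLIN_CORE.values() for name in group}


@dataclass
class InformationType:
    """Hypothetical model: a class of content plus the metadata rules for it."""
    name: str
    required: Set[str]
    # Element name -> allowed values (a stand-in for a taxonomy).
    controlled: Dict[str, Set[str]] = field(default_factory=dict)

    def validate(self, record: Dict[str, List[str]]) -> List[str]:
        """Return a list of problems; an empty list means the record passes."""
        problems: List[str] = []
        # Rule 1: only Dublin Core element names are accepted.
        for element in sorted(set(record) - ALL_ELEMENTS):
            problems.append(f"unknown element: {element}")
        # Rule 2: every required element must carry at least one value.
        for element in sorted(self.required):
            if not record.get(element):
                problems.append(f"missing required element: {element}")
        # Rule 3: controlled elements may only use vocabulary values.
        for element in sorted(self.controlled):
            for value in record.get(element, []):
                if value not in self.controlled[element]:
                    problems.append(f"{element} value not in vocabulary: {value}")
        return problems


# Hypothetical information type and record, for illustration only.
study_report = InformationType(
    name="Clinical Study Report",
    required={"identifier", "title", "description", "subject"},
    controlled={"subject": {"Cardiovascular", "Respiratory", "Cancer"}},
)
record = {
    "identifier": ["CSR-0001"],
    "title": ["Example study report"],
    "subject": ["Cardiovascular", "Oncology"],
}
problems = study_report.validate(record)
for line in problems:
    print(line)
# missing required element: description
# subject value not in vocabulary: Oncology
```

Keeping the rules in the type rather than in each application is one way to read the "Type List for now, but later on a Type Registry" idea: a registry would then simply be a lookup from type name to such rule sets.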