Integrating Know-How in the Linked Data Cloud 
Paolo Pareti, Benoit Testu, Ryutaro Ichise, Ewan Klein and Adam Barker 
https://w3id.org/prohow/ 
“As we all know, a large number of facts are available on the Web. But what about human activities, or know-how? The goal of this talk is to 
tell you how this kind of knowledge can be made machine-understandable and available on the Web.”
Human activities (or know-how) 
1. can be represented as Linked Data 
2. can be automatically extracted 
3. can be automatically interlinked 
4. experiment: extracted a large Linked Data dataset 
5. evaluation: our system outperforms humans 
“In particular, the presentation will focus on those five points.”
393,600 
“If we ask an intelligent system the question ‘What is the population of the capital of New Zealand?’, we would now expect it to answer 
correctly by accessing knowledge bases available on the Web. But what happens if we ask a seemingly easier question: ‘What do you 
need to wash your hands?’ In this case, the system would not be able to answer.”
??? 
“This is because, to answer this question, the intelligent system would need some understanding of what an activity is, and perhaps of what 
its requirements are. This knowledge, however, is not currently available in existing knowledge bases.”
Why Know-How? 
“Yet know-how is very useful and has many applications. It is relevant in almost all domains, and it can be common-sense 
know-how available on the Web, or the internal know-how of specific organizations, such as standard operating procedures. This knowledge 
also has applications in fields such as question answering, recommender systems and activity recognition.”
“Human know-how is on the Web, but why is it not accessible? First of all, this knowledge is usually represented in unstructured resources. We 
can think for example of step-by-step instructions, which are typically represented as text in natural language, 
or maybe as pictures and videos.”
? 
? 
? 
“But the most serious limitation is that a single document contains only limited information. What happens if we (or a machine) do not 
understand how to do a specific step, or what a particular ingredient is? In fact, humans often look at multiple resources at the same time to 
complete a complex task.”
Data 
“The first step in making know-how machine-understandable is to use a structured representation. We can identify several entities in a 
process, such as steps, methods, requirements and outputs. We can link those entities with each other, depending on which relation exists 
between them.”
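
As a rough illustration of such a structured representation, the sketch below models a process as entities connected by named relations. The class name `Entity`, the relation names `has_method` and `has_output`, and the hand-washing requirements ("Soap", "Water") are assumptions made for this example only; they are not taken from the talk or from the project's vocabulary.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# Illustrative relation names. "has_step" and "requires" appear on the slides;
# "has_method" and "has_output" are assumed names for this sketch.
RELATIONS = ("has_step", "has_method", "requires", "has_output")


@dataclass
class Entity:
    """One node of a process: a task, step, method, requirement or output."""
    label: str
    links: List[Tuple[str, "Entity"]] = field(default_factory=list)

    def link(self, relation: str, target: "Entity") -> None:
        """Attach a typed link from this entity to another entity."""
        if relation not in RELATIONS:
            raise ValueError(f"unknown relation: {relation}")
        self.links.append((relation, target))


# Hypothetical data: the requirements below are invented for illustration.
wash = Entity("Wash your hands")
wash.link("requires", Entity("Soap"))
wash.link("requires", Entity("Water"))
wash.link("has_step", Entity("Rinse your hands"))

# "What do you need to wash your hands?" becomes a lookup over typed links.
requirements = [target.label for relation, target in wash.links
                if relation == "requires"]
print(requirements)
```

With the entities and relations made explicit, a question about requirements no longer needs any natural-language understanding of the instructions.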
Linked Data 
“To solve the problem of the isolation of single resources, we have adopted a Linked Data representation. In this way, humans and machines can 
discover related resources when they are interested in more information about a specific entity. It is important to notice that these are not just 
links between documents, but between specific entities contained in these documents.”
“Our simple Linked Data representation of know-how is a point of contact between humans and machines. From the human perspective, know-how 
as Linked Data is a way to manage and find relevant resources which are human-understandable. From the machine perspective, this data 
can easily be used for analysis and inference, and it can be extended to more complex representations where required.”
“So all of this is not just an idea. It is actually possible and we have run experiments and evaluated our results.”
“What do we want to achieve exactly, when we talk about machine-understandable activities? While it is true that we want to have a knowledge 
representation more powerful than simple text in a document, we cannot yet aim to have machines capable of automating all human activities. 
Therefore we need to start by reaching a first significant but realistic goal.”
“We show the usefulness of this system in a real application. A task currently done by humans is the interlinking of related know-how resources. 
In particular, the WikiHow community is actively creating this kind of link, for example between a step of a process and another set of 
instructions that explains how to do it.”
How to Make a Pancake 
Steps: 
1. Prepare the mix 
2. Pour the mix in a hot pan 
3. Cook until golden 
Make a Pancake has_step Prepare the mix 
Make a Pancake has_step Pour the mix in a hot pan 
Make a Pancake has_step Cook until golden 
“This is a simplified example (e.g. missing the relations to specify the order of the steps) of how our system generates a Linked Data 
representation of a Web document. This can be done in many ways, but when the original document has some degree of structure, this 
knowledge extraction can be done easily and accurately.”
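
The pancake example can be written out as Linked Data by giving every entity a URI and emitting one statement per link. The sketch below prints N-Triples-style lines. The namespace `http://example.org/knowhow/` and the helper names `uri` and `to_ntriples` are placeholders assumed for this example, not the identifiers used by the project. Like the slide, it omits the relations that specify step order.

```python
from typing import List
from urllib.parse import quote

# Hypothetical namespace, used only for this sketch.
BASE = "http://example.org/knowhow/"


def uri(label: str) -> str:
    """Mint a URI reference from a human-readable label."""
    slug = quote(label.lower().replace(" ", "_"))
    return f"<{BASE}{slug}>"


def to_ntriples(task: str, steps: List[str]) -> str:
    """Emit one 'task has_step step' statement per listed step."""
    predicate = uri("has_step")
    lines = [f"{uri(task)} {predicate} {uri(step)} ." for step in steps]
    return "\n".join(lines)


document = to_ntriples(
    "Make a Pancake",
    ["Prepare the mix", "Pour the mix in a hot pan", "Cook until golden"],
)
print(document)
```

Because each step now has its own identifier, other datasets can link to an individual step rather than to the whole document.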
How to Make a Pancake 
Steps: 
1. Prepare the mix 
2. Pour the mix in a hot pan 
3. Cook until golden 
Requirements: 
● Eggs 
● Milk 
● Flour 
Make a Pancake has_step Prepare the mix 
Make a Pancake has_step Pour the mix in a hot pan 
Make a Pancake has_step Cook until golden 
Make a Pancake requires Eggs 
Make a Pancake requires Milk 
Make a Pancake requires Flour 
“On the Web, most of these resources have some degree of structure. This is because a well-structured set of instructions is better understood 
by humans, even before machines. This structure usually takes the form of a simple enumeration of steps, methods and requirements.”
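
When a page is laid out as a title followed by enumerated sections, extraction can be as simple as mapping each section heading to a relation. The sketch below is a minimal version of that idea for plain text. The section headings, the `SECTION_RELATION` table and the function `extract_triples` are assumptions for illustration; the talk does not describe the actual extraction code, which would work on the HTML of the source websites.

```python
import re
from typing import List, Optional, Tuple

Triple = Tuple[str, str, str]

# Assumed mapping from section heading to relation name.
SECTION_RELATION = {"Steps:": "has_step", "Requirements:": "requires"}

DOC = """How to Make a Pancake
Steps:
1. Prepare the mix
2. Pour the mix in a hot pan
3. Cook until golden
Requirements:
* Eggs
* Milk
* Flour
"""


def extract_triples(text: str) -> List[Triple]:
    """Turn a title plus enumerated sections into (task, relation, item) triples."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    title = lines[0]
    if title.startswith("How to "):
        title = title[len("How to "):]
    triples: List[Triple] = []
    relation: Optional[str] = None
    for line in lines[1:]:
        if line in SECTION_RELATION:
            relation = SECTION_RELATION[line]  # a new section starts here
        elif relation is not None:
            # Strip the list marker ("1.", "*" or "-") from the item.
            item = re.sub(r"^(\d+\.|\*|-)\s*", "", line)
            triples.append((title, relation, item))
    return triples


triples = extract_triples(DOC)
for triple in triples:
    print(triple)
```

The same loop handles any number of sections, so adding a "Methods:" section would only require one more entry in the mapping.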
> 200,000 
procedures 
> 2,600,000 
entities 
“WikiHow and Snapguide are two large repositories that contain well-organized know-how. We have extracted the knowledge of these websites 
and obtained a large dataset of over 200,000 procedures decomposed into over 2,600,000 entities. This can be seen as a large-scale extraction of 
know-how from the Web and conversion to Linked Data.”
How to Install an Operating System 
create a partition 
How to Create a Partition 
“In order to interlink the extracted entities, we have created a system to automatically discover two kinds of links. The first kind is a functional link 
between a step and another set of instructions that explains how this step can be done.”
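
The talk does not say how these functional links are discovered. As a deliberately naive stand-in, the sketch below links a step to a set of instructions when their normalized labels are identical. The functions `normalize` and `find_functional_links` and the second step label are hypothetical, and a real system would need a more tolerant similarity measure than exact matching.

```python
import re
from typing import List, Tuple


def normalize(label: str) -> str:
    """Lowercase, drop a leading 'how to' and punctuation, collapse spaces."""
    text = label.lower()
    text = re.sub(r"^how to\s+", "", text)
    text = re.sub(r"[^a-z0-9 ]", "", text)
    return " ".join(text.split())


def find_functional_links(steps: List[str],
                          titles: List[str]) -> List[Tuple[str, str]]:
    """Pair each step with the instruction set whose title matches it exactly."""
    index = {normalize(title): title for title in titles}
    return [(step, index[normalize(step)])
            for step in steps if normalize(step) in index]


# Steps of "How to Install an Operating System"; the second one is invented.
steps = ["create a partition", "boot from the installation media"]
titles = ["How to Create a Partition", "How to Make Guacamole"]

links = find_functional_links(steps, titles)
print(links)
```

Exact matching keeps precision high at the cost of missing paraphrases such as "partition the disk", which is one reason it is only a starting point.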
DBpedia: Guacamole 
How to Make Guacamole 
How to Serve Nachos 
“The second kind of link we discovered is similar to an Input/Output link between two processes. Instead of representing it directly, we leave this 
link implicit in the types of the inputs and outputs of processes. In this example, we can infer that there is an Input/Output relation 
between the two processes, as one requires the object ‘Guacamole’ while the other outputs it.”
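
The inference described here can be sketched as a set intersection: if the types of one process's outputs overlap with the types of another process's requirements, an Input/Output relation is implied. The dictionary layout and the function `infer_io_links` below are assumptions made for illustration; only the two process titles and the DBpedia resource for guacamole come from the slide.

```python
from typing import Dict, List, Set, Tuple

DBPEDIA_GUACAMOLE = "http://dbpedia.org/resource/Guacamole"

# Each process lists the types of what it outputs and of what it requires.
# The structure of this dictionary is an assumption for this sketch.
processes: Dict[str, Dict[str, Set[str]]] = {
    "How to Make Guacamole": {"outputs": {DBPEDIA_GUACAMOLE}, "requires": set()},
    "How to Serve Nachos": {"outputs": set(), "requires": {DBPEDIA_GUACAMOLE}},
}


def infer_io_links(procs: Dict[str, Dict[str, Set[str]]]) -> List[Tuple[str, str, str]]:
    """Return (producer, shared type, consumer) for every implied I/O relation."""
    links = []
    for producer, p in procs.items():
        for consumer, c in procs.items():
            if producer == consumer:
                continue
            # A shared type between outputs and requirements implies a link.
            for shared in sorted(p["outputs"] & c["requires"]):
                links.append((producer, shared, consumer))
    return links


io_links = infer_io_links(processes)
print(io_links)
```

Nothing in the data states the relation directly; it follows from both processes pointing at the same external type.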
Evaluation 
+ 16% precision 
+ ×2 number of links 
+ ×2 coverage 
+ automatic 
+ semantic links 
“Finally, we evaluated the links extracted by our system against the links generated manually by the WikiHow community. The result was a 
significant improvement. Our system identified links of better quality, in greater number, and better spread across all resources. All of this comes on top of 
being a completely automatic system which creates semantic Linked Data links, more expressive than simple HTML links.”
Know-How as Linked Data? 
...a dream that comes true! 
● Generated a large dataset of > 200,000 human activities as Linked Data 
● Integrated in the Linked Data Cloud 
● Outperformed the human baseline 
https://w3id.org/prohow/ 
“In conclusion, we have seen how know-how can become a new useful resource on the Linked Data Cloud. Our system automated the extraction 
and the integration of this knowledge on a large scale. Please visit this website if you are interested in the dataset or in information about the 
project. The website also contains a link to an online visualization tool for exploring the dataset.”

Human Activities as Linked Data
