Using OpenCalais API in the context of Linked Data

                               Eldorina Andreea Alergus

                     Faculty of Computer Science, Distributed Systems
                        eldorina.alergus@info.uaic.ro



       Abstract. This paper discusses the OpenCalais API in the context of
       Linked Data. As Linked Datasets grow, automating tasks such as
       discovering and interlinking data becomes increasingly important. We
       survey what OpenCalais offers for linking data.

       Keywords: OpenCalais, linked data, Web of Data



1      Introduction
The OpenCalais Web Service automatically generates rich semantic metadata for the
content submitted to it. OpenCalais analyses the content using methods such as natural
language processing (NLP) and machine learning, and finds the entities (Company,
Country, City, Product, Movie, etc.) within it. Beyond entities, it also finds events
(person P was hired at company C) and facts (person P works for company C) in the
text. The metadata returned in the response is an RDF construct that is also stored
centrally.
    The metadata lets us build maps, networks or graphs by linking documents to
people, geographies, places, companies, etc. Those maps can be used to verify that
our content contains what we expect, to tag and organize it, to create structured
folksonomies, or to improve site navigation. We can share our maps with anyone else
in the content ecosystem.
    The Calais ecosystem is exposed via Linked Data endpoints. We use the term
Linked Data to describe a method of exposing, sharing and connecting data on the
Web via dereferenceable URIs.[15] Starting from linked data, we can find other
related data. This is the idea of the Semantic Web: interlinking data so that a person
or a machine can explore the web of data. The main idea behind linked data is that
we can increase the value and usability of data by connecting it with other related
data.
    Calais is part of the Linked Open Data (LOD) Cloud and links to the following
assets: DBpedia, Wikipedia, Freebase, Reuters.com, GeoNames, Shopping.com,
IMDB and LinkedMDB. To understand what Calais offers, we must first understand
the concept of Linked Data.



2        Linked Data
As stated above, Linked Data is the technique of publishing data on the Web and
interlinking data between different sources. Such data is machine-readable, its
meaning is explicitly defined, it links to external data sets and can in turn be linked
from external data sets. Linked Data is based on RDF (Resource Description
Framework) documents, which make typed statements that link arbitrary things in
the world. To access the web of data, we use Linked Data browsers (Tabulator,
Disco, RDFViz, BrowseRDF, etc.), which enable navigation between different sources
by following RDF links. For instance, while looking at data about a product, a user
may be interested in information about the company that produces it. By following
the RDF link, the user can navigate to information about that company contained in
another dataset.
    Berners-Lee outlined a set of rules for publishing data on the Web so that all
published data becomes part of a single global data space:
    1.   Name things using URIs (Uniform Resource Identifiers).
    2.   Use HTTP URIs so that people can look up those names.
    3.   When someone looks up a URI, provide useful information using the
         standards (RDF, SPARQL).
    4.   Data should be interlinked with other data.
    These principles provide a basic recipe for publishing and connecting data using
the infrastructure of the Web while adhering to its architecture and standards.
    Linked Data relies on two fundamental technologies: URIs and HTTP. URIs
provide a generic means of identifying any entity. Entities identified by URIs that
use the http:// scheme can be looked up by dereferencing the URI, that is, by
retrieving a representation of the resource identified by that URI.[16]
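
Dereferencing can be illustrated with a short sketch. The Python fragment below only prepares the HTTP request and does not send it. The helper name build_dereference_request and the choice of the Accept header value are assumptions made for this illustration, not something prescribed by the paper or by any particular server.

```python
from urllib.request import Request


def build_dereference_request(uri: str) -> Request:
    """Prepare an HTTP GET asking for an RDF representation of `uri`.

    Hypothetical helper for illustration; nothing is sent over the network.
    """
    if not uri.startswith(("http://", "https://")):
        raise ValueError("only HTTP(S) URIs can be dereferenced this way")
    # Content negotiation: prefer RDF/XML, accept anything else as a fallback.
    return Request(uri, headers={"Accept": "application/rdf+xml, */*;q=0.1"})


request = build_dereference_request("http://dbpedia.org/resource/Paris")
print(request.get_method(), request.full_url)
print(request.get_header("Accept"))
```

Passing the prepared request to urllib.request.urlopen would perform the actual lookup; whether a server answers with RDF depends on that server.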




   To URIs and HTTP we add a third technology that is necessary for the Web of
Data: RDF. Just as HTML provides the means to structure and link documents on the
Web, RDF provides a graph-based data model to structure and link data that describes
things.
   In RDF, data takes the form of triples: subject, predicate, object. The subject is a
URI that identifies a resource; the object is either another URI or a literal value such
as a string. The predicate describes how the subject and the object are related, and is
also represented by a URI.
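
The triple model can be made concrete with a minimal sketch that writes triples in an N-Triples-like form. It is a simplification under stated assumptions: an object starting with "http://" is treated as a URI and anything else as a plain string literal, and the function name to_ntriples_line is invented for this example. The two sample statements are illustrative only.

```python
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"


def to_ntriples_line(subject, predicate, obj):
    """Serialize one (subject, predicate, object) triple as a single line."""
    if obj.startswith("http://"):
        # Assumption of this sketch: such objects are URIs.
        obj_term = f"<{obj}>"
    else:
        # Everything else is a string literal; escape backslashes,
        # double quotes and newlines.
        escaped = (
            obj.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
        )
        obj_term = f'"{escaped}"'
    return f"<{subject}> <{predicate}> {obj_term} ."


triples = [
    ("http://dbpedia.org/resource/Paris", RDFS_LABEL, "Paris"),
    ("http://dbpedia.org/resource/Paris", OWL_SAME_AS,
     "http://sws.geonames.org/2988507/"),
]
for triple in triples:
    print(to_ntriples_line(*triple))
```

The first statement has a literal as object, the second a URI, which is the kind of statement that links one dataset to another.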
   A linked dataset is a collection of data, published and maintained by a single
provider, available as RDF on the Web, where at least some of the resources in the
dataset are identified by dereferenceable URIs (http://rdfs.org/ns/void/html). The
image below shows the Linked Open Data Cloud, with the available datasets and the
links between them.




   By publishing data on the Web according to the Linked Data principles, we add
our data to a global data space, which allows it to be discovered and used by
various applications. To publish a data set as Linked Data on the Web, we must
follow three basic steps:
    -   Assign URIs to the entities described by the dataset and provide for
        dereferencing these URIs into RDF representations.
    -   Set RDF links to other data sources on the Web.
    -   Provide metadata about the published data so that clients can evaluate its
        quality.
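
The first two steps can be sketched in a few lines of Python. Everything named here is hypothetical: the namespace http://example.org/id/, the helpers mint_uri and link_to, and the slug rule are assumptions chosen for illustration. The third step (dataset metadata) is not covered by this sketch.

```python
from urllib.parse import quote

BASE = "http://example.org/id/"  # hypothetical namespace, not from the paper
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"


def mint_uri(kind, name):
    """Step 1: derive a stable HTTP URI from an entity type and a name."""
    slug = quote(name.strip().lower().replace(" ", "-"), safe="-")
    return f"{BASE}{kind}/{slug}"


def link_to(local_uri, external_uri):
    """Step 2: state that the local and the external URI name the same thing."""
    return (local_uri, OWL_SAME_AS, external_uri)


paris = mint_uri("city", "Paris")
print(paris)
print(link_to(paris, "http://dbpedia.org/resource/Paris"))
```

For the URIs to be dereferenceable, a web server would still have to answer requests for them with RDF, which is outside the scope of this fragment.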
    Next, we discuss how rich semantic metadata can be created for a piece of
content.



3       OpenCalais Web Service

    As noted in the Introduction, the OpenCalais Web Service automatically
generates rich semantic metadata for the submitted content. It uses natural language
processing (NLP), machine learning and other methods to analyze content and return
the entities it finds, such as cities, countries and people, with dereferenceable
Linked Data style URIs. The events, facts and entity types are defined in the
OpenCalais RDF Schemas (http://s.opencalais.com/1/pred/asf/1/pred/.html).
    To get started with OpenCalais, you first need an API key. To get the key, you
must register at http://www.opencalais.com/user/register. The Calais web service
(WS) can be called from .NET, Java, PHP, etc. using SOAP or REST. We can also use
the Calais Viewer to see how the service works and what the output of a Calais call
looks like.
    When we call the Calais API, we must provide some input parameters, which
must be HTTP encoded. The service we invoke is described at
http://api.opencalais.com/enlighten/?wsdl. Below we explain what is needed to call
the service via SOAP.
    The enlighten method, which calls the OpenCalais web service via SOAP, has
three parameters:
    -   licenseId. This is your API key, which you can get from the Calais site.
    -   paramsXML. These are the input parameters of the service in XML format.
        More information about the input parameters can be found at
        http://opencalais.com/documentation/calais-web-service-api/forming-api-calls/input-parameters.
    -   content. This is the content on which the extraction will be performed.
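
To show what "HTTP encoded" means for these three parameters, here is a hedged Python sketch that assembles a form-encoded request body without sending it. The layout of the paramsXML document, the directive names in it and the exact spelling of the form field names are assumptions for illustration; they should be checked against the input-parameters documentation referenced above. The helper names are invented for this example.

```python
from urllib.parse import parse_qs, urlencode
from xml.sax.saxutils import quoteattr


def build_params_xml(content_type="text/txt", output_format="xml/rdf"):
    """Build a small paramsXML document (assumed layout, see lead-in)."""
    return (
        '<c:params xmlns:c="http://s.opencalais.com/1/pred/">'
        f"<c:processingDirectives c:contentType={quoteattr(content_type)} "
        f"c:outputFormat={quoteattr(output_format)}/>"
        "</c:params>"
    )


def build_form_body(license_id, content, params_xml):
    """URL-encode the three enlighten parameters into one request body."""
    fields = {
        "licenseId": license_id,   # field names follow the list above
        "content": content,
        "paramsXML": params_xml,
    }
    return urlencode(fields).encode("utf-8")


body = build_form_body("YOUR-API-KEY", "a royal château in Versailles",
                       build_params_xml())
print(body[:60])
# Decoding the body gives back the original, unescaped values.
print(parse_qs(body.decode("ascii"))["content"][0])
```

Encoding matters here because both the content (accented characters, ampersands) and the XML parameters contain characters that cannot appear unescaped in a form body.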
   To start, we use a simple text as content: The Palace of Versailles, or simply
Versailles, is a royal château in Versailles, the Île-de-France region of France.
When the château was built, Versailles was a country village; today, however, it is a
suburb of Paris, some twenty kilometers southwest of the French capital. The court of
Versailles was the center of political power in France from 1682, when Louis XIV
moved from Paris, until the royal family was forced to return to the capital in October
1789 after the beginning of French Revolution. Versailles is therefore famous not only
as a building, but as a symbol of the system of absolute monarchy of the Ancien
Régime.
   We call the service from C# as follows: we add a service reference to the Calais
WSDL in our project, then invoke the service:
   // Proxy class generated from the service reference to the Calais WSDL.
   CalaisReference.calaisSoapClient client =
       new CalaisReference.calaisSoapClient();
   // Arguments: API key, text to analyze, input parameters as XML.
   string response = client.Enlighten(m_Licence, m_Content, m_Params());
   It is better to read the content (m_Content) and the parameters (m_Params) from
files, and the response (an RDF document) should also be saved to a file.
   The entities found are: City (Paris, France), Country (France) and Facility (Palace
of Versailles). If we look at the URI
http://d.opencalais.com/er/geo/city/ralg-geo1/797c999a-d455-520d-e5cf-04ca7fb255c1.html,
we can tell that the entity (City) has been disambiguated, because the URI contains
/er/. Entities whose URIs contain /em/ have not been disambiguated by OpenCalais.
If we open the link in a browser, we see that the entity is linked to other data sets
(OpenCalais links to Freebase, DBpedia, GeoNames and LinkedMDB), for example
http://dbpedia.org/resource/Paris,
http://rdf.freebase.com/ns/guid.9202a8c04000641f800000000002db30 and
http://sws.geonames.org/2988507/, and that it has also been assigned a Web link:
http://en.wikipedia.org/wiki/Paris.
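
The /er/ versus /em/ convention described above can be checked mechanically. The sketch below assumes that the marker is the first segment of the URI path, as in the example URI; the function name is_disambiguated is invented here, and the /em/ URI in the usage lines is a made-up placeholder rather than a real Calais identifier.

```python
from urllib.parse import urlparse


def is_disambiguated(entity_uri):
    """Return True for /er/ (disambiguated) URIs and False for /em/ URIs."""
    segments = [s for s in urlparse(entity_uri).path.split("/") if s]
    if not segments or segments[0] not in ("er", "em"):
        raise ValueError(f"not a recognised Calais entity URI: {entity_uri}")
    return segments[0] == "er"


city = ("http://d.opencalais.com/er/geo/city/ralg-geo1/"
        "797c999a-d455-520d-e5cf-04ca7fb255c1.html")
print(is_disambiguated(city))
# Placeholder /em/ URI, used only to exercise the other branch.
print(is_disambiguated("http://d.opencalais.com/em/placeholder-entity"))
```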
   For the detected entities, OpenCalais provides an entity relevance score (shown
for each entity in the screenshots below). The relevance capability detects the
importance of each unique entity and assigns it a relevance score in the range 0-1 (1
being the most relevant and important). We see that France is the most relevant
(0.69, i.e. 69%).
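
One plausible use of the relevance score is to keep only the most relevant entities as tags. The sketch below sorts and filters a small list. Only the 0.69 score for France comes from the text; the other two scores are made-up placeholders, and the 0.5 threshold and the function name most_relevant are arbitrary choices for this illustration.

```python
# Each entry: (entity name, entity type, relevance in the range 0-1).
entities = [
    ("Paris", "City", 0.40),                     # placeholder score
    ("France", "Country", 0.69),                 # score reported in the text
    ("Palace of Versailles", "Facility", 0.55),  # placeholder score
]


def most_relevant(items, threshold=0.5):
    """Keep entities at or above the threshold, most relevant first."""
    kept = [entity for entity in items if entity[2] >= threshold]
    return sorted(kept, key=lambda entity: entity[2], reverse=True)


for name, kind, score in most_relevant(entities):
    print(f"{kind:<10} {name:<22} {score:.0%}")
```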




    For a better understanding of how Calais can be used, we take a look at
http://gvlt.appspot.com/opencalais-geo/. In this project, the Calais API is used to
identify geographic references in a text and display them on an OpenLayers map.
Calais is used with JSON output, and all the processing is done on the client side, in
the browser.
    OpenCalais can also help content managers create smart indexes. Instead of
indexing by keyword, you can index by referenced subject. If you have a collection
of unstructured documents, on a website for example, you can use OpenCalais to
help manage and cross-reference them. Using the OpenCalais API, a website's side
navigation bar can suggest related documents based on the conceptual subject,
instead of the word matching used by most indexes. By taking the RDF/XML
document returned by the OpenCalais HTTP interface and storing it in an RDF store,
you can enable an application to find documents related to anything in the RDF
store (http://www.devx.com/semantic/Article/38517/1763/page/2).
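
The idea of indexing by subject rather than by keyword can be sketched with an in-memory index from entity URIs to documents. This stands in for a real RDF store and is only an illustration: the document names and the urn:example: identifiers are invented, and ranking by the number of shared entities is one simple choice among many.

```python
from collections import defaultdict


def build_index(doc_entities):
    """Map each entity URI to the set of documents that mention it."""
    index = defaultdict(set)
    for doc_id, uris in doc_entities.items():
        for uri in uris:
            index[uri].add(doc_id)
    return index


def related_documents(doc_id, doc_entities, index):
    """Rank other documents by how many entity URIs they share with doc_id."""
    shared = defaultdict(int)
    for uri in doc_entities[doc_id]:
        for other in index[uri]:
            if other != doc_id:
                shared[other] += 1
    return sorted(shared, key=lambda d: (-shared[d], d))


# Hypothetical extraction results: document -> entity URIs found in it.
doc_entities = {
    "versailles.html": {"urn:example:paris", "urn:example:france"},
    "louvre.html": {"urn:example:paris", "urn:example:france"},
    "eiffel.html": {"urn:example:paris"},
    "berlin.html": {"urn:example:germany"},
}
index = build_index(doc_entities)
print(related_documents("versailles.html", doc_entities, index))
```

Because the index is keyed by entity URI, two documents are related when they refer to the same thing, even if they use different words for it.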



4        Conclusions
    Nowadays, the Web means more than just putting data online; it means
interlinking and sharing data in the same way we share documents. The Web is seen
as a growing global graph. This view starts from the assumption that the value and
usefulness of data increase as links are created between data. This is what Linked
Data means: using the Web to create typed links between data from different sources.


    Calais is a rapidly growing toolkit of capabilities that allows you to readily
incorporate state-of-the-art semantic functionality within your blog, content
management system, website or application. In this paper we have described how
the Calais WS can be invoked and what the RDF output offers. OpenCalais
represents an important step towards the Semantic Web. With OpenCalais, computers
could do the research for you, combing through and comparing company names,
locations and rumored or real transactions in real time to give you answers in a way
that keyword search simply cannot.



5        References

[1] C. Bizer, R. Cyganiak, T. Heath, How to Publish Linked Data on the Web
[2] T. Heath, An Introduction to Linked Data, 2009
[3] C. Bizer, T. Heath, T. Berners-Lee, Linked Data - The Story So Far
[4] M. Watson, Practical Semantic Web Programming With AllegroGraph, 2009
[5] K. Alexander, R. Cyganiak, M. Hausenblas, J. Zhao, Describing Linked Datasets
[6] http://opencalais.com/
[7] http://www.w3.org/DesignIssues/LinkedData.html
[8] http://thomsonreuters.com/content/corporate/articles/398062
[9] http://philippeadjiman.com/blog/2009/09/16/open-calais-from-java-with-eclipse-extract-entities-facts-and-events-in-4-minutes/
[10] http://www.devx.com/semantic/Article/38517/1763/page/2
[11] http://blog.3kbo.com/2009/09/26/opencalais-response/
[12] http://wiki.dbpedia.org/Interlinking
[13] http://esw.w3.org/topic/TaskForces/CommunityProjects/LinkingOpenData/DataSets
[14] http://gvlt.wordpress.com/2008/10/17/tutorial-text-geotagging-with-opencalais/
[15] http://en.wikipedia.org/wiki/Linked_Data
[16] http://www.w3.org/2001/tag/doc/httpRange-14/2007-05-31/HttpRange-14



