Querying the Web Out of the Slipstream :: September 27, 2007
Querying the Web
  “Information wants to be free”
      Stewart Brand, Whole Earth Review, May 1985
  “If the new computer set up allowed folks inside to be more creative and independent, why not open it up to outsiders, too?”
      Jeff Bezos, Amazon, March 2002
  “Data is the Next Intel Inside”
      Tim O’Reilly, September 2005
  Open Source has commoditized software
  Creative Commons will commoditize information
  Which leaves servers, services and service…
General Medical Council
Freebase
Freebase Metaweb Query Language
  Request:
    {
      "type": "/medicine/physician",
      "name": "Michael Maher"
    }
  Response (JSON):
    {
      "code": "/api/status/ok",
      "result": {
        "type": "/medicine/physician",
        "name": "Michael Maher",
        "gender": "Male",
        "education": "Leeds University"
      }
    }
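The request and response on this slide are plain JSON, so a client needs little more than a JSON library. The sketch below is only an illustration under stated assumptions: `build_query` and `parse_response` are hypothetical helper names, the sample response is copied from the slide, and no network call is made.

```python
import json


def build_query(type_id, name):
    """Serialise a query object shaped like the one on the slide."""
    return json.dumps({"type": type_id, "name": name})


def parse_response(text):
    """Return the "result" object, or raise if the status code is not ok."""
    envelope = json.loads(text)
    if envelope.get("code") != "/api/status/ok":
        raise ValueError(f"query failed: {envelope.get('code')}")
    return envelope["result"]


# Response copied from the slide, with straight quotes so it is valid JSON.
SAMPLE_RESPONSE = """
{
  "code": "/api/status/ok",
  "result": {
    "type": "/medicine/physician",
    "name": "Michael Maher",
    "gender": "Male",
    "education": "Leeds University"
  }
}
"""

query = build_query("/medicine/physician", "Michael Maher")
result = parse_response(SAMPLE_RESPONSE)
print(query)
print(result["education"])
```

The request names what is known (type and name) and the response fills in the rest, which is why the reply carries gender and education although the request never mentioned them.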
GMC vs Freebase
  GMC
    Authoritative
    Website based search
    Static
    Restrictive license
    Even if you pay for the data you still cannot use it, legally.
    Periodic updates
  Freebase
    User sourced content
    API
    Extensible, dynamic
    Creative Commons / PD
    Automatic right to use
    Stepwise refinement
REST
  REpresentational State Transfer
  Less rigorous equivalent of SOAP
  Data are considered to be resources
  Every resource has a unique address
  Layered over HTTP:
    Client/Server separation
    Stateless
    Cacheable
  Request:
    GET http://rest.georgejames.com/product/Serenji/
  Response:
    Name=Serenji
    Price=195.00
    OrderCode=H1001
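The request/response pair on this slide can be sketched in a few lines of Python. This is an illustration rather than the presenter's implementation: `product_request` and `parse_product` are hypothetical helpers, `http://example.com` is a placeholder base address, and the request object is built but never sent.

```python
from urllib.parse import quote
from urllib.request import Request


def product_request(base_url, product):
    """Build (but do not send) a GET request for one product resource."""
    url = f"{base_url.rstrip('/')}/product/{quote(product)}/"
    return Request(url, method="GET")


def parse_product(body):
    """Parse a Name=Value-per-line body, as on the slide, into a dict."""
    fields = {}
    for line in body.splitlines():
        line = line.strip()
        if not line:
            continue
        key, _, value = line.partition("=")
        fields[key] = value
    return fields


# Placeholder host; the slide's service address is not contacted here.
request = product_request("http://example.com", "Serenji")
product = parse_product("Name=Serenji\nPrice=195.00\nOrderCode=H1001\n")
print(request.get_method(), request.full_url)
print(product)
```

Because every resource has its own address and the exchange is stateless, the client needs nothing beyond the URL and the verb to retrieve the product.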
Amazon S3
  S3 :: Simple Storage Service
  Online storage space
  $0.15 per Gbyte per month for storage
  ~$0.20 per Gbyte data transfer
  Storage request:
    PUT http://s3.amazonaws.com/[bucket-name]/[key-name]
  Retrieval request:
    GET http://s3.amazonaws.com/[bucket-name]/[key-name]
  EC2 :: Elastic Compute Cloud
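As a rough worked example of the addressing scheme and the 2007 prices quoted on this slide, the sketch below builds the bucket/key URL and estimates a monthly bill. The helper names are hypothetical, the transfer price is the slide's approximate figure, and request signing (which real S3 calls require) is deliberately left out.

```python
from urllib.parse import quote

S3_ENDPOINT = "http://s3.amazonaws.com"
STORAGE_USD_PER_GB_MONTH = 0.15  # price quoted on the slide
TRANSFER_USD_PER_GB = 0.20       # approximate price quoted on the slide


def object_url(bucket, key):
    """Address of an object; the same URL serves both PUT and GET."""
    return f"{S3_ENDPOINT}/{quote(bucket)}/{quote(key)}"


def monthly_cost(stored_gb, transferred_gb):
    """Estimated monthly charge in USD at the slide's prices."""
    return (stored_gb * STORAGE_USD_PER_GB_MONTH
            + transferred_gb * TRANSFER_USD_PER_GB)


# Hypothetical bucket and key names, for illustration only.
print(object_url("slides", "2007/querying the web.ppt"))
print(f"${monthly_cost(100, 50):.2f} per month")
```

Storing 100 Gbyte and transferring 50 Gbyte in a month works out to 100 × 0.15 + 50 × 0.20 = 25 dollars at these rates.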
Microformats
Microformats
  Without Microformats:
    <div class="opaque">
      Out of the Slipstream is a one-day conference on
      Thursday 27 September 2007 at Brooklands Museum, Surrey, UK.
    </div>
  With Microformats:
    <div class="opaque vevent">
      <span class="summary">Out of the Slipstream</span>
      is a one-day conference on
      <abbr class="dtstart" title="20070927">Thursday 27 September 2007</abbr>
      at Brooklands Museum, Surrey, UK.
    </div>
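To show what the extra class names buy, here is a minimal sketch that pulls the event name and start date out of the marked-up version using only Python's standard library. It is not a full hCalendar parser: `VEventParser` is a hypothetical class that assumes the summary sits in a single `span` and the date in an `abbr` title, as on the slide.

```python
from datetime import datetime
from html.parser import HTMLParser

SNIPPET = """<div class="opaque vevent">
<span class="summary"> Out of the Slipstream </span>
is a one-day conference on
<abbr class="dtstart" title="20070927">Thursday 27 September 2007</abbr>
at Brooklands Museum, Surrey, UK.
</div>"""


class VEventParser(HTMLParser):
    """Collect the summary text and dtstart value from one vevent."""

    def __init__(self):
        super().__init__()
        self.event = {}
        self._in_summary = False
        self._summary_parts = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if "summary" in classes:
            self._in_summary = True
        if tag == "abbr" and "dtstart" in classes:
            # The machine-readable date lives in the title attribute.
            raw = attrs.get("title") or ""
            self.event["dtstart"] = datetime.strptime(raw, "%Y%m%d").date()

    def handle_endtag(self, tag):
        if self._in_summary and tag == "span":
            self._in_summary = False
            text = "".join(self._summary_parts)
            self.event["summary"] = " ".join(text.split())

    def handle_data(self, data):
        if self._in_summary:
            self._summary_parts.append(data)


parser = VEventParser()
parser.feed(SNIPPET)
print(parser.event)
```

Run against the version without microformats, the same parser finds nothing, since the date and title are indistinguishable from the surrounding prose.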
Astoria
Astoria in action
  Request:
    http://astoria.sandbox.live.com/northwind/northwind.rse/Categories
  Response:
Astoria in action
  Request:
    http://astoria.sandbox.live.com/northwind/northwind.rse/Customers
  Response:
Astoria in action
  Request:
    /Customers[FRANK]
  Response:
Astoria in action
  Request:
    /Customers[FRANK]/Orders
  Response:
Astoria in action
  A variety of response formats:
    POX
    Web3S (Web, Structured, Schema’d and Searchable)
    ATOM
    JSON
  JSON request:
    /Customers[FRANK]?$format=json
  Response:
Astoria is still evolving
  Ongoing discussion about the format of requests:
    /Customers!'FRANK'
    /Customers!'FRANK'/Orders!10267
    /Customers!CustomerID='FRANK'
    /Customers('FRANK')
    /Customers('FRANK')/Orders(10267)
  Qualifiers control the response format:
    /Customers('FRANK')/CustomerName
    /Customers('FRANK')/CustomerName/$value
    /Customers('FRANK')/$format=json
    /Customers/$skip=30&$take=10
  Currently being Microsoftened…
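The parenthesised request style listed on this slide is easy to generate programmatically. The sketch below is an assumption-laden illustration, not Astoria's own client library: `resource_path` and `with_options` are hypothetical helpers, options are attached with `?` as in the earlier `?$format=json` example, and key values are assumed to contain no quote characters.

```python
def key_literal(value):
    """Integers are written bare, everything else in single quotes."""
    return str(value) if isinstance(value, int) else f"'{value}'"


def resource_path(*segments):
    """Join segments; a (name, key) tuple becomes name(key)."""
    parts = []
    for segment in segments:
        if isinstance(segment, tuple):
            name, key = segment
            parts.append(f"{name}({key_literal(key)})")
        else:
            parts.append(segment)
    return "/" + "/".join(parts)


def with_options(path, **options):
    """Append $-prefixed qualifiers such as $format, $skip and $take."""
    if not options:
        return path
    query = "&".join(f"${name}={value}" for name, value in options.items())
    return f"{path}?{query}"


orders = resource_path(("Customers", "FRANK"), ("Orders", 10267))
page = with_options(resource_path("Customers"), skip=30, take=10)
print(orders)
print(page)
```

Keeping path construction separate from the qualifiers means a change in the key syntax, which the slide says is still under discussion, touches only `resource_path`.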
Where is all this information going to come from?
Crowdsourcing
  Jeff Howe, Wired Magazine, June 2006
  Delegating an activity to a large number of unidentified individuals
  Small finite tasks
  Quantity more important than quality
  The sum is greater than the parts
  Examples:
    Wikipedia
Google Maps
Crowdsourcing
  Jeff Howe, Wired Magazine, June 2006
  Delegating an activity to a large number of unidentified individuals
  Small finite tasks
  Quantity more important than quality
  The sum is greater than the parts
  Examples:
    Wikipedia
    Galaxy Zoo
    Amazon Mechanical Turk
    Google route planner
  Consequences:
    Drives down the cost of data
    Ownership may not lie with the traditional incumbents
    Client / user needs to discriminate
The Power of Information Review
  Commissioned by the Cabinet Office and published in June 2007 to review and advise on the use of public sector information.
  Recommendation 9: By Budget 2008, government should commission and publish an independent review of the costs and benefits of the current trading fund charging model for the re-use of public sector information, including the role of the five largest trading funds, the balance of direct versus downstream economic revenue, and the impact on the quality of public sector information.
  US: Public Domain
  UK: Crown Copyright
AND - Automotive Navigation Data
  Press release: July 4, 2007
  Rotterdam - AND Automotive Navigation Data has agreed ... to donate digital maps of the Netherlands, China and India to the community.
More ways of querying the web
  Google Search
  Google Events
  Google Base
  Yahoo! Pipes
  RSS (Really Simple Syndication)
  KML
  BBC Backstage
The Internet is the Database
Thank you Questions?

George James :: Querying The Web
