Instructor: Professor Lothar Piepmeyer




Beautifying Data
in the Real World
         Group 5:
     Toan Do - An Du
  Vinh Nguyen - Tan Tran

How big is the data on the Internet?


2004: Internet data exceeds 1 EB for the first time
2005: Eric Schmidt estimates it at 5 million
 terabytes (~5 EB)
Cisco forecasts that in 2015, the size of the
 Internet will reach nearly 1,000 EB

           How big is it?
                    Source: http://www.wisegeek.com/how-big-is-the-internet.htm
                                                      http://techland.time.com/
If 1 byte = 0.5mm




                    Source: http://blog.fliptop.com/how-much-data-is-on-the-internet/
Content



Introduction
The Open Notebook Science approach
Curating and presenting the data
Beautifying the data
Data Visualization & Building a portal from
 open data and free services
Demonstration
Data on the internet




                Source: http://news.bbc.co.uk/2/hi/technology/8562801.stm
Problems of (scientific) data in the
real world


Noisy data sources
Barriers in data presentation
  OCR version
  Text version
  Human-readable
  Machine-readable
  …
How can the data be verified?
Open Notebook Science


Purpose: record all raw data of scientific research
 and make it available online
Benefits:
   obtain detailed descriptions of procedures
   improve the communication of science
   speed up progress
   reduce time lost to the repetition of failed
    experiments
   …
Applying ONS with free services
Crowdsourcing


A distributed problem-solving and
 production model
Crowdsourcing




                Source: http://r18ultrachair.com/
Validating crowdsourced data



Following ONS, all detailed data is
 recorded
Doubtful data is also kept and marked
 for
Unique Identifiers for Chemical
Entities



Standardize data

Facilitate the integration with other data sets

Consider 3 possibilities
   CAS Registry Number
   InChI
   SMILES
CAS Registry Number



 Proprietary

 Cannot be converted to a chemical structure

 Depends on an external organization to issue numbers

For example, the CAS number of water is 7732-18-5: the
   checksum 5 is calculated as (8×1 + 1×2 + 2×3 + 3×4 + 7×5 +
   7×6) = 105; 105 mod 10 = 5
http://en.wikipedia.org/wiki/CAS_registry_number
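
The checksum rule in the water example can be sketched in a few lines of Python. The function names `cas_check_digit` and `is_valid_cas` are illustrative only and do not come from any CAS tool; the sketch assumes the usual three-part `digits-digits-check` layout.

```python
def cas_check_digit(cas_number: str) -> int:
    """Return the check digit implied by the first two parts of a CAS number."""
    body = cas_number.rsplit("-", 1)[0]            # e.g. "7732-18"
    digits = [int(ch) for ch in body if ch.isdigit()]
    # The rightmost digit is weighted 1, the next one 2, and so on.
    total = sum(weight * digit
                for weight, digit in enumerate(reversed(digits), start=1))
    return total % 10


def is_valid_cas(cas_number: str) -> bool:
    """Check the layout and compare the stated check digit with the computed one."""
    parts = cas_number.split("-")
    if len(parts) != 3 or not all(part.isdigit() for part in parts):
        return False
    return int(parts[2]) == cas_check_digit(cas_number)


print(cas_check_digit("7732-18-5"))   # water
print(is_valid_cas("7732-18-5"))
```

A check like this only catches typing errors; it says nothing about which substance the number refers to, which is the limitation the slide points out.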
InChI
 IUPAC International Chemical Identifier
 Freely usable and non-proprietary
 Does not have to be assigned by an organization
 Can be computed from structural information
 Human-readable (with practice)




            http://en.wikipedia.org/wiki/Inchi
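
To make "human-readable with practice" concrete, here is a minimal sketch that splits an InChI string into its slash-separated layers. It is a simplification: it assumes the second segment is always the formula and that each later layer starts with a distinct one-letter prefix, which holds for the two simple molecules shown but not for every InChI. The helper name `split_inchi` is hypothetical.

```python
def split_inchi(inchi: str) -> dict:
    """Split a simple standard InChI into version, formula and prefixed layers."""
    prefix = "InChI="
    if not inchi.startswith(prefix):
        raise ValueError("not an InChI string")
    version, formula, *layers = inchi[len(prefix):].split("/")
    return {
        "version": version,
        "formula": formula,
        # Each layer is keyed by its one-letter prefix, e.g. "c" or "h".
        "layers": {layer[0]: layer[1:] for layer in layers if layer},
    }


water = split_inchi("InChI=1S/H2O/h1H2")
ethanol = split_inchi("InChI=1S/C2H6O/c1-2-3/h3H,1-2H3")
print(water)
print(ethanol)
```

For ethanol, the `c` layer describes how the heavy atoms are connected and the `h` layer where the hydrogens sit; the corresponding SMILES string is simply `CCO`.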
SMILES

   Simplified molecular-input
    line-entry system

   More human-readable than
    InChI

   Can be converted to InChI




http://en.wikipedia.org/wiki/Simplified_molecular-input_line-entry_system
Analysis Options



Access to live data
Get Summary
Complex Statistical representations of
 models
Mark doubtful data for later
 consideration
Google Docs API


Allows developers to create, retrieve, update, and
 delete Google Docs files and collections
Also provides advanced features such as resource
 archives, Optical Character Recognition (OCR),
 translation, and revision history
Useful for storing data in the cloud, managing
 resources, and converting document formats


https://developers.google.com/google-apps/documents-list/
Google Visualization API


Chart Library
  JavaScript classes
Data Table
  JavaScript DataTable class
Data Source
  Chart Tools Datasource
   protocol

                        https://developers.google.com/chart/interactive/docs/index
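
As a rough illustration of the Data Table piece, the sketch below builds the `cols`/`rows` JSON literal that the JavaScript `DataTable` class accepts. It is written in Python only to show the data shape a server might send to the chart code; the helper `to_datatable`, the generated column ids, and the sample values are made up for this example.

```python
import json


def to_datatable(columns, rows):
    """Build a DataTable-style literal from (label, type) pairs and row values."""
    return {
        "cols": [
            {"id": f"col{index}", "label": label, "type": col_type}
            for index, (label, col_type) in enumerate(columns)
        ],
        # Every cell is wrapped in an object with a "v" (value) key.
        "rows": [{"c": [{"v": value} for value in row]} for row in rows],
    }


table = to_datatable(
    [("Sample", "string"), ("Measurement", "number")],   # hypothetical columns
    [["A", 1.5], ["B", 2.25]],                            # hypothetical data
)
print(json.dumps(table, indent=2))
```

On the page, a literal like this would be passed to the `DataTable` constructor and then handed to a chart from the Chart Library.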
https://google-developers.appspot.com/chart/interactive/docs/gallery
RESTful Web Service


 Representational State Transfer (REST): a simpler alternative to
  SOAP- and Web Services Description Language (WSDL)-based
  Web services
 Principles:
      Use HTTP methods explicitly.
      Be stateless.
      Expose directory structure-like URIs.
      Transfer XML, JavaScript Object Notation (JSON), or both.

http://www.ibm.com/developerworks/webservices/library/ws-restful/
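
The four principles can be sketched without a real web server. The function below is a toy request handler, not a production service: the `/compounds/<name>` URI scheme, the `handle` function, and the in-memory `store` are all assumptions made for illustration. Each call carries everything needed to process it (stateless), the HTTP method decides the action, the path looks like a directory, and payloads are JSON.

```python
import json


def handle(store, method, path, body=None):
    """Map one HTTP-style request onto a resource in `store`; return (status, body)."""
    parts = [part for part in path.split("/") if part]
    if len(parts) != 2 or parts[0] != "compounds":
        return 404, json.dumps({"error": "unknown resource"})
    name = parts[1]

    if method == "GET":                       # read
        if name not in store:
            return 404, json.dumps({"error": "not found"})
        return 200, json.dumps(store[name])
    if method == "PUT":                       # create or replace
        created = name not in store
        store[name] = json.loads(body)
        return (201 if created else 200), json.dumps(store[name])
    if method == "DELETE":                    # remove
        if name not in store:
            return 404, json.dumps({"error": "not found"})
        del store[name]
        return 204, ""
    return 405, json.dumps({"error": "method not allowed"})


store = {}
print(handle(store, "PUT", "/compounds/water", '{"formula": "H2O"}'))
print(handle(store, "GET", "/compounds/water"))
```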
Compare REST and SOAP


Who's using REST?
     All of Yahoo's web services use REST, including Flickr; the
      del.icio.us API, pubsub, bloglines, and technorati use it too.
      eBay and Amazon offer web services for both
      REST and SOAP.
Who's using SOAP?
     Google seems to be consistent in implementing its
      web services with SOAP, with the exception of
      Blogger, which uses XML-RPC. You will find SOAP web
      services in lots of enterprise software as well.
http://www.petefreitag.com/item/431.cfm
Compare REST and SOAP



REST                                  SOAP
 Lightweight - not a lot of extra      Easy to consume - sometimes
  XML markup
 Human-readable results                Rigid - type checking, adheres
                                        to a contract
 Easy to build - no toolkits           Development tools
  required
An Effort to Aggregate Data from
Multiple Sources



Introducing ChemSpider
  An online lookup engine for chemists
     http://www.chemspider.com
     40 million substances
     Multiple data sources
     A "link farm" to other sources
What is "wrong" with
  wikipedia.com?


Wikipedia.com


Not “wrong”:

   Very informative for human readers
Wikipedia.com


This little guy is left behind

  Not machine-readable
Semantic Web

Describes things in a way that computer
 applications can understand.
   “The Beatles was a band from Liverpool”
Describes the relationships between things (like A
 is a part of B and Y is a member of Z) and
 the properties of things (like size, weight, age, and
 price)
“... will make all the data in the world look like
 one huge database” – Tim Berners-Lee
                             http://www.w3schools.com/web/web_semantic.asp
Resource Description Framework

Is a language to describe resources on
 the web
Component of the Semantic Web
Data is self-describing
  Triples: "subject", "predicate" and "object" (the value)
  URIs are used to denote resources
RDF

Graph Database
  Nodes
  Edges




Well-suited for Knowledge Representation
  Beautified Data => Knowledge
RDF Example

<?xml version="1.0"?>
<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:cd="http://www.recshop.fake/cd#">
  <rdf:Description
      rdf:about="http://www.recshop.fake/cd/Empire Burlesque">
    <cd:artist>Bob Dylan</cd:artist>
    <cd:country>USA</cd:country>
    <cd:company>Columbia</cd:company>
    <cd:price>10.90</cd:price>
    <cd:year>1985</cd:year>
  </rdf:Description>
</rdf:RDF>
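
To show how this XML maps onto subject/predicate/value triples, here is a minimal sketch using only Python's standard `xml.etree.ElementTree`. It handles just the simple pattern used on this slide (an `rdf:Description` with literal-valued child elements) and is not a general RDF/XML parser; the function name `rdf_xml_to_triples` is made up for the example.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

RDF_XML = """<?xml version="1.0"?>
<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:cd="http://www.recshop.fake/cd#">
  <rdf:Description
      rdf:about="http://www.recshop.fake/cd/Empire Burlesque">
    <cd:artist>Bob Dylan</cd:artist>
    <cd:country>USA</cd:country>
    <cd:company>Columbia</cd:company>
    <cd:price>10.90</cd:price>
    <cd:year>1985</cd:year>
  </rdf:Description>
</rdf:RDF>"""


def rdf_xml_to_triples(xml_text):
    """Return (subject, predicate, value) tuples for simple rdf:Description blocks."""
    root = ET.fromstring(xml_text)
    triples = []
    for description in root.findall(f"{{{RDF_NS}}}Description"):
        subject = description.get(f"{{{RDF_NS}}}about")
        for prop in description:
            # ElementTree tags look like "{namespace}local"; join them into one URI.
            namespace, _, local = prop.tag[1:].partition("}")
            triples.append((subject, namespace + local, prop.text))
    return triples


for triple in rdf_xml_to_triples(RDF_XML):
    print(triple)
```

Each child element becomes one edge of the graph: the CD is the subject, the element name (expanded to a full URI) is the predicate, and the element text is the value.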
Semantic Web Example: DBpedia

“Old School” Wikipedia:
     http://en.wikipedia.org/wiki/Porsche_Panamera


DBpedia entries

   http://dbpedia.org/page/Porsche_Panamera
   http://dbpedia.org/page/Chromium_carbide
Query Language: SPARQL (pronounced "sparkle")

Query language for RDF
    Graph traversal
    Matching triples
Example:
    Data:
  <http://example.org/book/book1> <http://purl.org/dc/elements/1.1/title> "SPARQL Tutorial" .

    Query:
  SELECT ?title
  WHERE { <http://example.org/book/book1> <http://purl.org/dc/elements/1.1/title> ?title . }

    Query result:
  title: "SPARQL Tutorial"
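
The idea of "matching the triples" can be imitated in plain Python. This is a toy matcher for a single triple pattern, not a SPARQL engine: terms starting with `?` are treated as variables, everything else must match exactly. The function name `match` is chosen for this sketch.

```python
def match(pattern, triples):
    """Return one {variable: value} binding per triple that fits the pattern."""
    results = []
    for triple in triples:
        binding = {}
        for wanted, actual in zip(pattern, triple):
            if wanted.startswith("?"):
                # A variable binds to whatever is there, but must stay consistent.
                if binding.setdefault(wanted, actual) != actual:
                    break
            elif wanted != actual:
                break
        else:
            results.append(binding)
    return results


DATA = [
    ("http://example.org/book/book1",
     "http://purl.org/dc/elements/1.1/title",
     "SPARQL Tutorial"),
]

QUERY = ("http://example.org/book/book1",
         "http://purl.org/dc/elements/1.1/title",
         "?title")

print(match(QUERY, DATA))
```

A real query engine joins many such patterns and shares variables between them, which is what turns triple matching into graph traversal.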
To Infinity and Beyond

• DB2 and Oracle are ready for this train

•Object Database
    Versant OODBMS, anybody?

•Machine-Readable Data
    Will it become self-aware?

“Data Finds Data” and Semantic Data
       Model – A Hypothesis




Non-Obvious Relationship Awareness

[Diagram, built up step by step over several slides. Labels: LÂM, BẢO,
 LÂM’s iPhone, BẢO’s SS Galaxy, TheGioiDiDong.com]

Connection Detected!
 - Bao could have met Lam at Thegioididong?
 - They could have discussed their world domination
   scheme during the meeting there?
 - ???
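
One way to read the "Connection Detected!" step is as a path search in a graph. The sketch below is an interpretation, not the presenters' method: the edges (person owns phone, phone bought at shop) are assumed from the diagram labels, and `find_connection` is a hypothetical helper doing a plain breadth-first search.

```python
from collections import deque

# Assumed edges, inferred from the labels in the diagram.
EDGES = [
    ("LÂM", "LÂM's iPhone"),
    ("BẢO", "BẢO's SS Galaxy"),
    ("LÂM's iPhone", "TheGioiDiDong.com"),
    ("BẢO's SS Galaxy", "TheGioiDiDong.com"),
]


def find_connection(edges, start, goal):
    """Return the shortest chain of nodes linking start to goal, or None."""
    neighbours = {}
    for a, b in edges:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in sorted(neighbours.get(path[-1], ())):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


print(find_connection(EDGES, "LÂM", "BẢO"))
```

Neither person is linked to the other directly; the relationship only shows up once the phone and shop records are added to the same graph, which is the "data finds data" idea.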
 Data Visualization

 Building a portal from open data and
free services
Visualization of Data




                        Scan of the top million web
                        sites (per Alexa traffic
                        data), performed in
                        early 2010


                        Source: http://nmap.org/favicon/
Second Life
Second Life is a 3D world where everyone you see is a real person and
every place you visit is built by people just like you.
3D Visualization in SL
SL - The Opportunity for "Edutainment"

Drexel Island on Second Life:
  iSchool
  Teaching: quizzes and lectures
  Classrooms with PowerPoint
  Research center
3-D Environments

  http://www.secondlife.com/
  http://3rdrockgrid.com/
  http://www.craft-world.org
  http://www.osgrid.org/
  http://youralternativelife.com/
Visualization To Suggest New
Experiments
Building A Portal From Open Data And
 Free Services


 Freely hosted wiki service
 Google Spreadsheet
 Google Docs API / JavaScript
 Visualization and analysis services (2D, 3D)
 RDF / Semantic Web / Web services
 Cost: free or fit for the purpose
Key To Success




  Model
  Information
  Data
  Records
+ Transparency
Demonstration
 Google Docs
 Second Life
References


O'Reilly, Beautiful Data, Chapter 16:
 "Beautifying Data in the Real World"
http://techland.time.com/2011/06/01/how-big-is-the-internet-spoiler-not-as-big-as-itll-be-in-2015/
http://drexelisland.wikispaces.com/
SMILES to 3D - Second Life,
 http://www.youtube.com/watch?v=tOfhuoRbnCg&feature=player_embedded
