Spatio-temporal linkage of real and
virtual identity




  Muhammad Adnan (and Paul Longley)
  University College London
Geodemographics

• “Analysis of people by where they live [places]”
                                           (Sleight, 1993:3)

• Social similarity, not locational proximity




  [Diagram: Person, Home Address, Area]
Identity of individuals in the real world

• Name (Forename & Surname)

• Surnames have geographic concentrations

• Prospects for linkage with socio-economic data
  • e.g. analysing the socio-economic circumstances of
    different ethnic groups
An example – gbnames.publicprofiler.org




         Longley                  Cheshire
An example – Output Area Classification




  Kingston upon Hull          Hereford
A socio-economic and ethnic classification
Wu
Source: Cheshire and Longley (2011)
Courtesy: James Cheshire




Wordle.net
The European scale




                     16 countries.

                     400 million people.

                     5.95 million unique
                     surnames



                     Courtesy: James Cheshire
Onomap classification
      Forename-Surname clustering
        (based on Hanks and Tucker, 2000)

  UK Electoral Roll
  Forenames: Pablo, Juan, Rosa, Marta, ...
  Surnames:  Mateos, Garcia, Pérez, Sánchez, Rodríguez, ...

  –     Several iterations until self-contained cluster is exhausted
  –     Cluster assigned a cultural, ethnic & linguistic Onomap type
  –     Probability of ethnicity assigned to each name
                                   Mateos et al (2007) CASA Working Paper 116
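
One way to read the iterative step is as growing a cluster across a two-mode forename-surname network until no new names are reached. The Python sketch below illustrates that reading only. It is not the Onomap implementation (which presumably also weights or thresholds links), and the function name `grow_cluster` and the toy records are assumptions made for illustration.

```python
from collections import defaultdict

def grow_cluster(pairs, seed_surname):
    """Expand a name cluster from one surname until no new names appear."""
    # Index the (forename, surname) records in both directions.
    by_surname = defaultdict(set)
    by_forename = defaultdict(set)
    for forename, surname in pairs:
        by_surname[surname].add(forename)
        by_forename[forename].add(surname)

    surnames, forenames = {seed_surname}, set()
    frontier = {seed_surname}
    while frontier:  # each pass corresponds to one iteration
        # Forenames borne by people with the newly added surnames.
        new_forenames = set()
        for s in frontier:
            new_forenames |= by_surname[s]
        new_forenames -= forenames
        forenames |= new_forenames
        # Surnames borne by people with those new forenames.
        new_surnames = set()
        for f in new_forenames:
            new_surnames |= by_forename[f]
        frontier = new_surnames - surnames
        surnames |= frontier
    return forenames, surnames

# Toy records (hypothetical, not taken from the Electoral Roll).
records = [
    ("Pablo", "Mateos"), ("Pablo", "Garcia"), ("Juan", "Garcia"),
    ("Juan", "Pérez"), ("Rosa", "Pérez"), ("Rosa", "Sánchez"),
    ("Marta", "Sánchez"), ("John", "Smith"), ("Mary", "Smith"),
]
forenames, surnames = grow_cluster(records, "Mateos")
```

Starting from the seed "Mateos", the loop stops once the Spanish names are exhausted. "Smith" stays outside the cluster because it shares no forename with it.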
WorldNames CEL clusters




Source: Mateos et al (2011)
Uncertainty and virtual identity

• Identity increasingly shaped by online activities
   – Value may be gained by fusing physical and virtual
     data sources
• Data fusion and generalisation to relate physical
  and virtual properties
• Use of residence alongside activity patterns and
  social network information
Most of us have virtual identities

• Email address; social media accounts

• People use different procedures and providers to
  establish virtual identities

• Harvesting these data has interesting potential
  applications
  • Cyber crime
  • Cyber geodemographics (Facebook has already started
    this)
Most of us have virtual identities

• Facebook data mining engine
  • Analyses the words you use and tailors advertisements
    accordingly
Starting Point
http://worldnames.publicprofiler.org




• Worldnames holds data on approximately 1 billion
  people across 28 countries

• Approximately 1.6 million unique users have visited
  the website since 2008
Starting Point
http://worldnames.publicprofiler.org




• Worldnames has been archiving 'Surname search',
  'Email Address', 'Gender', and 'IP Address' for
  searches over the past 6 months
   • c. 175,000 records: email validation
   • 150,000 usable 'IP Address' entries
IP Address to Latitude/Longitude conversion

http://quova.com




An API to convert “IP addresses” to their corresponding
  latitude / longitude values
IP Address to Latitude/Longitude conversion
http://quova.com

A search for an IP Address in UCL (128.40.214.196)
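
The conversion amounts to finding the network range that contains an address and returning the coordinates stored for that range. The sketch below shows the idea with Python's standard `ipaddress` module. It does not call or reproduce the Quova API. The `GEO_TABLE` contents, the `/16` range, and the coordinates are illustrative assumptions.

```python
import ipaddress

# Hypothetical lookup table: network -> (latitude, longitude).
# The range and coordinates are illustrative assumptions, not Quova data.
GEO_TABLE = {
    ipaddress.ip_network("128.40.0.0/16"): (51.52, -0.13),  # assumed: central London
}

def locate(ip_string):
    """Return (lat, lon) for the most specific matching network, or None."""
    ip = ipaddress.ip_address(ip_string)
    matches = [net for net in GEO_TABLE if ip in net]
    if not matches:
        return None
    # Prefer the longest prefix, i.e. the narrowest range containing the IP.
    best = max(matches, key=lambda net: net.prefixlen)
    return GEO_TABLE[best]

ucl_location = locate("128.40.214.196")  # the UCL address from the slide
unknown_location = locate("10.0.0.1")    # private address: no entry, so None
```

A lookup of this kind is only as accurate as its table, which is one source of the geocoding uncertainty noted in the conclusion.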
Top Countries
The website was searched from 155 countries over the past 6 months

  Country          Count
  UNITED STATES    76708
  UNITED KINGDOM   21892
  CANADA            8154
  GERMANY           7158
  ITALY             4058
  AUSTRALIA         2978
  BRAZIL            2440
  FRANCE            2028
  ARGENTINA         1958
  SPAIN             1830
  NEW ZEALAND       1236
  NETHERLANDS       1074
  GREECE            1040
  SWITZERLAND        992
  BELGIUM            940
  POLAND             880
  AUSTRIA            874
  MEXICO             834
  IRELAND            710
  SWEDEN             630
UK and Ireland
Europe
North America
South America
India, China, Japan, Singapore
Popular Surname Searches

  Surname    Count
  SMITH        708
  JONES        306
  JOHNSON      258
  ANDERSON     224
  WILLIAMS     222
  MILLER       218
  MARTIN       202
  WILSON       194
  BROWN        194
  MOORE        188
  THOMAS       178
  TAYLOR       170
  CLARK        164
  LEE          160
  ROBERTS      156
  DAVIS        152
  CAMPBELL     144
  LEWIS        138
  HARRIS       138
  MITCHELL     136
Popular Email Domains

  Domain           Count
  GMAIL.COM        31842
  HOTMAIL.COM      22098
  YAHOO.COM        15542
  AOL.COM           5550
  COMCAST.NET       2696
  HOTMAIL.CO.UK     1948
  MSN.COM           1624
  WEB.DE            1522
  YAHOO.CO.UK       1290
  GMX.DE            1260
  SBCGLOBAL.NET     1246
  BTINTERNET.COM     860
  HOTMAIL.IT         844
  VERIZON.NET        798
  GOOGLEMAIL.COM     742
  LIVE.COM           742
  COX.NET            708
  ATT.NET            632
  MAILINATOR.COM     616
  LIBERO.IT          616
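
A ranking like this one can be produced by splitting each address at its last '@' and tallying the domains. The sketch below is one minimal way to do it, not the authors' code. The helper names `domain_of` and `top_domains` are assumptions, and the sample list mixes two addresses quoted elsewhere in the slides with made-up entries.

```python
from collections import Counter

def domain_of(email):
    """Return the upper-cased domain of an address, or None if malformed."""
    local, sep, domain = email.strip().rpartition("@")
    if not sep or not local or not domain:
        return None
    return domain.upper()

def top_domains(emails, n=20):
    """Return the n most frequent domains as (domain, count) pairs."""
    counts = Counter(d for d in map(domain_of, emails) if d is not None)
    return counts.most_common(n)

sample = [
    "abbie.harper@hotmail.com",
    "helmut.kempe@inode.at",
    "x@HOTMAIL.COM",   # hypothetical entry: case is normalised before counting
    "not-an-email",    # hypothetical entry: skipped as malformed
]
ranking = top_domains(sample)
```

Normalising case before counting matters because the same provider appears in mixed case in raw form input.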
Popular Email Domains by Surnames

Smith (English)   Jones (Welsh)    Johnson (English)
GMAIL.COM         GMAIL.COM        GMAIL.COM
YAHOO.COM         HOTMAIL.COM      HOTMAIL.COM
HOTMAIL.COM       YAHOO.COM        YAHOO.COM
AOL.COM           COMCAST.NET      MSN.COM
MAILINATOR.COM    GOOGLEMAIL.COM   VERIZON.NET



Perez (Spanish)   Gupta (Indian)   Meyer (German)
GMAIL.COM         GMAIL.COM        GMAIL.COM
HOTMAIL.COM       HOTMAIL.COM      HOTMAIL.COM
YAHOO.ES          YAHOO.COM        YAHOO.COM
CHARTER.NET       GOOGLEMAIL.COM   AOL.COM
GRANDECOM.NET     INDIATIMES.COM   GMX.DE
Popular Email Domains by Country

UK              USA            France
GMAIL.COM       GMAIL.COM      HOTMAIL.FR
HOTMAIL.COM     YAHOO.COM      GMAIL.COM
HOTMAIL.CO.UK   HOTMAIL.COM    HOTMAIL.COM
YAHOO.CO.UK     AOL.COM        YAHOO.FR
YAHOO.COM       COMCAST.NET    LAPOSTE.NET



Germany         Brazil         Japan
WEB.DE          HOTMAIL.COM    YAHOO.COM
GMX.DE          GMAIL.COM      YAHOO.CO.JP
T-ONLINE.DE     YAHOO.COM.BR   GMAIL.COM
YAHOO.DE        IG.COM.BR      HOTMAIL.COM
GMAIL.COM       BOL.COM.BR     MSN.COM
Top GoogleMail.com users

Top Surnames
BINDER
WATKINS
WHITE
WOODS
ROBINSON
SLEEMAN
BENNETT
RITCHIE
SHARP
ROLLINGS
GoogleMail.com users
• Surname 'Binder'




   Germany             Switzerland
GoogleMail.com users
• Surname 'Blackbourn'




         New Zealand
Who uses their surname as part of their email address?
 • Approximately 40% of users have their surname
   as part of their email address
   • abbie.harper@hotmail.com (Surname: Harper)
   • helmut.kempe@inode.at (Surname: Kempe)
 • Top Countries
   [Bar chart: top countries, vertical axis 0 to 50]
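
A share like this can be estimated by testing whether each searched surname occurs in the part of the address before '@'. The sketch below assumes a simple case-insensitive substring match, which may differ from the rule the authors used. The function names and the third record are hypothetical.

```python
def surname_in_email(surname, email):
    """True if the surname appears in the part of the address before '@'."""
    local = email.rpartition("@")[0]
    return bool(surname) and surname.lower() in local.lower()

def share_with_surname(records):
    """Fraction of (surname, email) records whose address contains the surname."""
    if not records:
        return 0.0
    hits = sum(surname_in_email(s, e) for s, e in records)
    return hits / len(records)

records = [
    ("Harper", "abbie.harper@hotmail.com"),
    ("Kempe", "helmut.kempe@inode.at"),
    ("Smith", "carl@valleywisp.net"),  # hypothetical pairing for illustration
]
share = share_with_surname(records)
```

Substring matching over-counts short surnames (for example "Lee" inside "kathleen") and misses transliterated or accented spellings, so any such figure is approximate.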
Who uses long email addresses?
• Grand mean email address length of 8 characters
   • Length = number of characters on the left side of '@'
   • United Kingdom, USA, Canada, and European countries


• People from South American countries and India have long
  email addresses (average length: 13 characters)

   BRAZIL      ANA.ARAUJO3909@CREASP.ORG.BR (14 characters)
   CHILE       BYRON.DELGADO.INOSTROZA@HOTMAIL.COM (23 characters)
   URUGUAY     DIEGOJAVIERZEBALLOS@GMAIL.COM (19 characters)
   INDIA       GANGULYDEEPANJAN@HOTMAIL.COM (16 characters)
   ARGENTINA   AGUSTINAREYNOZO@GMAIL.COM (15 characters)


• South Indians have longer email addresses than North Indians
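
The length measure used here, the number of characters to the left of '@', is straightforward to compute and average per country. The sketch below shows one way to do it. The function names are assumptions, and the records reuse three of the example addresses from the slide.

```python
from collections import defaultdict
from statistics import mean

def local_length(email):
    """Number of characters to the left of the last '@'."""
    return len(email.rpartition("@")[0])

def mean_length_by_country(records):
    """Average local-part length for each country in (country, email) records."""
    lengths = defaultdict(list)
    for country, email in records:
        lengths[country].append(local_length(email))
    return {country: mean(vals) for country, vals in lengths.items()}

records = [
    ("BRAZIL", "ANA.ARAUJO3909@CREASP.ORG.BR"),
    ("CHILE", "BYRON.DELGADO.INOSTROZA@HOTMAIL.COM"),
    ("INDIA", "GANGULYDEEPANJAN@HOTMAIL.COM"),
]
averages = mean_length_by_country(records)
```

Dots and digits count towards the length, so separators such as "first.last" inflate the measure relative to concatenated names.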
What else we can infer from email addresses
• Internet service provider
   •   A.GOODEVE@AOL.COM
   •   BERRYMANL@BTINTERNET.COM
   •   CARL@VALLEYWISP.NET (Person lives in a rural area of northeast Oregon)


• Country of origin
   •   A.HAKIM26@YAHOO.FR
   •   CBARNES@MEDIAWORKS.CO.NZ


• Probable temporal aspects
   •   ABBY527@OPTONLINE.NET
   •   BERZINSKY102@YAHOO.COM
   •   C.JOHNSTON2@BTINTERNET.COM
What else we can infer from email addresses
• Probable forename of a person
   •   BEVERLY.RICHARDS@YAHOO.COM
   •   BJORN.SOBRY@HOTMAIL.COM
   •   BRANDAN.HOLMES@HOTMAIL.COM


• How up to date someone is with technology
   •   ALEXANDER.BREUSCH@GMAIL.COM
   •   WILLIAM.NEALON@GOOGLEMAIL.COM


• Professional Affiliations
   •   CHRIS@IEEE.ORG
What else we can infer from email addresses
• Work Locations
  •   DOUG.GOODMAN@FOUNDATION.ORG.UK
  •   GRL@KCS.ORG.UK
  •   ERM43@CAM.AC.UK


• Studying
  •   RTRIPOLI@STUDENT.UMASS.EDU
  •   CBALIN01@STUDENTS.BBK.AC.UK
  •   KATHERINE.LITTEN@STUDENT.KIRKWOOD.EDU
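
Several of the cues on these slides (country-code domains, student and academic domains, trailing digits) can be read off an address mechanically. The sketch below collects them as weak hints. The rules and the name `infer_hints` are illustrative assumptions rather than the authors' method, and every hint is a guess about a person, not a fact.

```python
import re

def infer_hints(email):
    """Return a dict of weak, heuristic hints read off one email address."""
    local, _, domain = email.lower().rpartition("@")
    labels = domain.split(".")
    hints = {}
    # A two-letter final label is a country-code top-level domain.
    if len(labels[-1]) == 2:
        hints["country_code"] = labels[-1]
    # A leading "student" or "students" label suggests a student account.
    if labels[0] in {"student", "students"}:
        hints["student"] = True
    # Academic namespaces: .edu, or .ac.<cc> as in .ac.uk.
    if labels[-1] == "edu" or (len(labels) >= 3 and labels[-2] == "ac"):
        hints["academic"] = True
    # Trailing digits may encode a year, a birthday, or a de-duplication counter.
    match = re.search(r"(\d+)$", local)
    if match:
        hints["trailing_digits"] = match.group(1)
    return hints

examples = {
    address: infer_hints(address)
    for address in (
        "A.HAKIM26@YAHOO.FR",
        "CBALIN01@STUDENTS.BBK.AC.UK",
        "RTRIPOLI@STUDENT.UMASS.EDU",
        "CHRIS@IEEE.ORG",
    )
}
```

A country-code domain indicates where the mail provider is registered, not necessarily where the user lives, which is why such hints are better combined with the IP-based location than used alone.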
Conclusion and future work

• The study of email addresses reveals some interesting
  patterns
   •   some problems remain (accuracy of geocoding techniques)


• Prospect of linking data coded to unit postcode level
   •   cluster analysis and data mining techniques


• Future work may involve the data mining of Facebook and
  Twitter data
   •   issues of generalisation



• Visualisation of the data
Thanks for Listening

Any Questions?
A research agenda




1 Acquire relevant real and virtual data sources and devise DBMS
2 Devise GB-wide classification of NICT usage at neighbourhood
   scale
3 Devise GB-wide classification of social network traffic
4 Develop enhanced worldnames site to harvest real and virtual
   user data
5 Undertake text analysis of worldnames user data and use to link
   classifications (2) and (3)
6 Devise, implement and analyse social networking application and
   cybergeodemographic classification

