WikiLeaks and the Myth of (Data-Driven) Citizen Journalism
Catalina Iorga, Research MA in Media Studies ’11, University of Amsterdam
The Benefits of Open Govt. Data
 External Contributions

  •   Specialised skills, local knowledge, ‘professional
        amateurs’ (Jennifer Bell, VisibleGovernment.ca, 2009)

 Citizen Empowerment

  •   “these applications arm citizens with the information
        they need to make decisions every day” (data.gov,
        2010)

 Software Innovation

  •   Open licenses, existing technologies and communities
       (Jonathan Gray, Open Knowledge Foundation, 2010)
Data-driven Journalism
 Large datasets available online

  •   The Afghan War Diary 2004 - 2010 (WikiLeaks,
        2010)

 Information visualisation tools

  •   Guardian Data Explorer (Tony Hirst)

 Narratives powered by the Web

  •   Using the Web “to tell a story, not just as a
        delivery medium” (Alan Maclean, The New York
        Times, 2010)
Data-driven Journalism




Image source: http://upload.wikimedia.org/wikipedia/commons/4/48/Data_driven_journalism_process.jpg
Research Question
What kind of stories do (data-driven)
citizen journalists tell about the war in
Afghanistan by referencing WikiLeaks
documents?
Intended WikiLeaks
‘Assange himself has stated that WikiLeaks has
deliberately moved away from the "egocentric"
blogosphere and assorted social media and
nowadays collaborates only with professional
journalists and human rights activists.’

(Geert Lovink, ‘Twelve Theses on WikiLeaks’, 2010)
Distinction
data-driven mainstream journalism
                               vs.
    data-driven citizen journalism
Why Links?
• Forms of citation
• Indicators of (citizen / user) engagement
Afghan War Diary
• “an extraordinary secret
    compendium of over 91,000 reports
    covering the war in Afghanistan
    from 2004 to 2010.”

• “the most significant archive about
    the reality of war to have ever been
    released during the course of a
    war.”

(WikiLeaks, 2010)
Image source: http://wardiary.wikileaks.org
Afghan War Diary - Der Spiegel




“Deadly Toll”

(Der Spiegel, 2010: http://www.spiegel.de/international/world/bild-708314-114716.html)
Afghan War Diary - The Guardian




“Afghanistan war logs: IED attacks on civilians, coalition and Afghan troops”

(The Guardian, 2010: http://www.guardian.co.uk/world/datablog/interactive/2010/jul/26/ied-afghanistan-war-logs)
Methodology
• Observe the common root of all Afghan War Diary 2004 - 2010
  document URLs ('http://wardiary.wikileaks.org/afg/event').

• Query Google with the Google Scraper to get the first 1000
  results which contain this common root as a textual
  component.

• Submit the top 100 results to the Link Ripper to extract all
  outlinks to specific Afghan War Diary 2004 - 2010 document
  pages.

• Insert the Link Ripper output into the Harvester to remove
  textual descriptions and alphabetize the obtained URL list.

• Manually clean the output by searching for
  'http://wardiary.wikileaks.org/afg/event' and produce a list of
  Afghan War Diary 2004 - 2010 document URLs.

• Select all documents that receive at least two links and
  compile a final list of the 'most mentioned' warlogs.
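
The filtering and counting steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Digital Methods Initiative tools themselves: it assumes the HTML of each top search result has already been fetched, it stands in for the Link Ripper and Harvester with the standard-library HTML parser, and it counts a document once per linking page (the slides do not say how repeated links on one page were treated). The function names, example pages and document paths are hypothetical.

```python
from collections import Counter
from html.parser import HTMLParser

# Common root of Afghan War Diary document URLs, as given in the methodology.
WARLOG_ROOT = "http://wardiary.wikileaks.org/afg/event"


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value.strip())


def warlog_outlinks(html):
    """Return the distinct warlog document URLs linked from one page."""
    parser = LinkCollector()
    parser.feed(html)
    return {href for href in parser.hrefs if href.startswith(WARLOG_ROOT)}


def most_mentioned(pages, min_links=2):
    """Count linking pages per warlog and keep those with >= min_links.

    `pages` maps a search-result URL to its already-downloaded HTML.
    Assumption: each page counts at most once per document.
    The result is an alphabetized list of (document URL, count) pairs.
    """
    counts = Counter()
    for html in pages.values():
        counts.update(warlog_outlinks(html))
    return sorted((url, n) for url, n in counts.items() if n >= min_links)


if __name__ == "__main__":
    # Hypothetical pages and document paths, for illustration only.
    pages = {
        "http://example.org/a": (
            '<a href="http://wardiary.wikileaks.org/afg/event/x1.html">log</a>'
            '<a href="http://example.com/">other</a>'
        ),
        "http://example.org/b": (
            '<a href="http://wardiary.wikileaks.org/afg/event/x1.html">log</a>'
            '<a href="http://wardiary.wikileaks.org/afg/event/x2.html">log</a>'
        ),
    }
    print(most_mentioned(pages))
```

A prefix match on the common root replaces the manual cleaning step; links written with a different host name or through a redirect would be missed, which matches the limitation that inlinks with different descriptions are hard to find.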
Limitations
• Searching for inlinks with Yahoo! Site Explorer
  or Google yielded similar results.

• Finding inlinks with different descriptions is
  very difficult.
Types of Accounts
Local Interest Descriptive Lists




(James Barlow, Jul 26 2010: http://jamesbarlow.co.uk/british-entries-afghan-war-diaries)
Types of Accounts
‘Connect the Dots’ / Conspiratorial Reasoning




(Peak of Elephants, Jul 26 2010: http://peakofelephants.posterous.com/post/861912878)
Most Linked Logs
  Idaho Soldier Captured in Afghanistan

• WTOP (news radio, Washington, DC)
• KomoNews (news radio, Seattle, WA)
• SF Examiner (daily paper, San Francisco, CA)
• Newser (US-based news site)
• Lebanon Daily News (daily paper, Lebanon County, PA)
• Las Vegas Sun (daily paper, Las Vegas, NV)
• Yahoo! News
• AP (press agency)

• 8 mentions
• Only mainstream stories
Most Linked Logs
Four Canadians Killed in Friendly Fire


• UberVu
• Ottawa Forums
• Wikipedia
• CyberPresse

• 4 mentions
• Overlap of mainstream and alternative comments
Conclusion

Data-Driven Citizen Journalism = Absent
Possible Reasons

 •   too much data

 •   technical military terms

 •   mainstream media filters
Credits
Research done by:

 • Camilo Cristancho (PhD candidate in Political Science at the
   Universitat Autònoma de Barcelona)

 • Matteo Cernison (PhD candidate in Social and Political Science at
   European University Institute, Florence)

 • Catalina Iorga

Wiki: https://wiki.digitalmethods.net/Dmi/DataDrivenUserJournalism
P.S.
Go to http://wikileaks.ch/ (instead of the official website)
for:

• Guantanamo Files: http://wikileaks.ch/gitmo/
• Cablegate: http://wikileaks.ch/cablegate.html
• Iraq and Afghanistan War Logs: http://wikileaks.ch/iraq/diarydig/
Thank you for your time and attention!

E-mail: catalina.iorga@gmail.com

Web: http://catalinaiorga.wordpress.com
