What I’ve learned
from studying
Wikipedia bias
Heather Ford
University of Technology Sydney | wikihistories project
As an activist And then as a
researcher
2005
2007
2009
Wikipedia is biased!
• Only about 20% of Wikipedia’s 6.2 million biographies are of
women;
• About 85% of Wikipedia editors are male;
• Wikipedia’s articles, topics, and contributors are more likely
to represent the United States and Western Europe with large
parts of the developing world underrepresented.
3 myths
about
Wikipedia’s
biases
• Wikipedia mirrors the world’s
biases;
• Bias on Wikipedia originates from
people because people are
inherently biased;
• That bias can be resolved by
volunteers “filling in gaps”.
The “mirror theory”
Ford 2022
3 myths
about
Wikipedia’s
biases
• Wikipedia mirrors the world’s biases;
• Wikipedia is also the source of systemic bias.
• Bias on Wikipedia originates from people
because people are inherently biased;
• Wikipedia has developed socio-technical
systems that embed discriminatory practices.
• That bias can be resolved by volunteers
“filling in gaps”;
• Resolving bias requires more than the
notification of its existence.
What should the role of research be… given
the predominance of such myths?
The answer has been a continuation of gaps measurement on a
global scale to find more and more patterns of content gaps
(e.g. going beyond gender to examine issues of race)
Gaps? What gaps?
Knowledge gaps: “disparities in content coverage
or participation of a specific group of readers or
contributors” (Redi et al., 2021: 4)
A Taxonomy of Knowledge Gaps for Wikimedia Projects (Second Draft) (Redi et al., 2021)
The measurement of gaps is held up as the research
community’s key contribution to solving
Wikipedia’s inclusion problems.
But…
1. Measuring gaps
demonstrates the
outcomes rather
than the sources
of inequalities.
2. Measuring gaps
doesn’t enable us to
understand
Wikipedia’s specific
role in the
production of
inequality and
exclusionary
behaviour.
3. The
measurement of
gaps is often
established on the
basis of flawed
assumptions
calculate gaps → notify editors → editors fill in
Beyond notification:
Filling gaps in peer
production projects
Ford, H., Pensa, I., Devouard,
F., Pucciarelli, M., Botturi, L.
(2018)
https://en.wikipedia.org/wiki/Talk:Herero_and_Namaqua_genocide/Archive_2#Wikipedia_Primary_School_announcement
4. Gaps
measurement
emphasises
existing and easily
accessible
sociological
categories
(namely cis
gender) that can
actually hide
exclusions that
aren’t easily
mapped
5. Gaps
measurement rarely
accounts for the
context necessary
for enabling local
action.
What should our focus be?
What do we need to know?
Instead of asking “What are the gaps?”, we need to support
research that asks “What are the sources of gaps?”
“there needs to be a much
greater focus on the practice of
sorting and classifying
knowledge and the role it plays
in both new and old forms of
subjugation.” (Tkacz, 2015, p.12)
[1]
Instead of more global studies of bias, we
need local/national studies that are able to
account for local contexts where there are
resources to enable change i.e. “What are
the sources of bias in the representations
of this country/region?”
[2]
And then collaborative syntheses in
national or regional contexts
[3]
i.e. What does our research tell us about
what our focus needs to be for
community/institutional responses?
Movement strategy is an opportunity…
A change in focus: from filling in gaps to including
people who “have been left out by structures of power
and privilege” (Wikimedia Movement Strategy 2030).
What will the role of research be in supporting this
strategy?
wikihistories.net

Editor's Notes

  • #3: I became a volunteer for CC in 2002. I helped organise the first Wikipedia Academy on the continent in 2005 as Executive Director of iCommons. It was hosted by CIDA City Campus in Johannesburg, attended by Jimmy Wales, Ndesanjo Macha and organised by the iCommons team including Kerryn McKay, Rebecca Kahn and Daniela Faris. As an Advisory Board member of the Wikimedia Foundation between 2007 and 2009,
  • #4: The year is 2005. I had become executive director of iCommons, an international NGO started by Creative Commons to drive the international movement for free and open source software and open content. We organised a Wikipedia Academy in Johannesburg with Jimmy Wales, Ndesanjo Macha and others.
  • #5: I helped to shape Majority World (Global South) participation in the open content and free and open source software movement. I worked with my colleagues at iCommons to collect memories from people in Johannesburg: we sat in a shopping mall and asked people to bring their old photos of Johannesburg to be scanned for Wikimedia Commons. I was filled with the passionate belief that Wikimedia and the FLOSS movement could enable Africans to participate.
  • #6: https://geography.oii.ox.ac.uk/wikipedias-global-geography/#single/0 In 2009 I went back to graduate school, where I encountered my first map of Wikipedia’s gaps.
  • #7: Big shock. In the next few years we learned through gaps research that
  • #8: From these early days, many myths about Wikipedia’s biases were surfaced and remain predominant today.
  • #9: But this assumes that Wikipedians are disinterested actors and that the platform is a neutral space in which facts are technically accreted over time
  • #10: In my experience and research, I’ve realized that…
  • #16: And measurement of gaps was really important to denote that we have a problem!
  • #17: Measuring gaps demonstrates the outcomes rather than the sources of inequalities. The question is framed as: “What are the gaps?” rather than “What are the sources of gaps?” The answer we receive from studies that articulate Wikipedia’s gaps in content doesn’t demonstrate causal relationships. For example, women may be less well represented than men in our analyses, but we don’t know whether that is because a) women are less well represented (or deemed less socially important) in the literature or in sources, or b) Wikipedia has determined that, despite the existence of reliable sources and external markers of notability, it will not grant them notable status inside Wikipedia (as Francesca Tripodi (2021) recently found is happening to women’s biographies on English Wikipedia).
  • #18: Measuring gaps doesn’t enable us to understand Wikipedia’s specific role in the production of inequality and exclusionary behaviour. Research on Wikipedia’s gaps shows how Wikipedia corresponds to existing social inequalities such as gender, geography and now race or ethnicity. This helps us to understand that Wikipedia can exacerbate existing inequalities but not how it uniquely produces such inequalities as a result of its own policies, norms, technologies, economic structures, position in the internet ecosystem etc. Gaps measurement research supports what I’ve called a “mirror theory” of bias – that Wikipedia is a mirror of the world’s biases, not a source of them – a theory that is prominent in Wikipedia culture and that doesn’t enable a recognition of the role that Wikipedia is playing specifically in excluding people and their knowledges.
  • #19: The measurement of gaps is often established on flawed assumptions. If measuring gaps in Wikipedia’s coverage is the solution, the problem is being framed as an inability of Wikipedians to see the imbalances that they are producing through their editing practice, and therefore surfacing gaps is seen as the best way to solve the problem. The role of research in this theory of change is limited to measuring gaps, with the assumption that editors only need to know what the gaps are in order to fill them. This illusion of transparency as an automatic vehicle of accountability is well documented in internet policy and media literature and borne out by evidence in my own experience.
  • #20: In 2018, I worked on a South African primary school project with Iolanda Pensa, Florence Devouard, and others. The premise of the project was that we would work with South African educators to devise a list of Wikipedia articles related to the South African primary school curriculum that needed to be improved or created. Once the list was created, the team set about sending notifications of the need to improve the articles to the related WikiProjects. The Wikipedia literature identified notification as a strong motivator for the creation of articles. But we found that notification strategies failed to engage interest from either the existing editing community or readers who might be turned into editors. Most of our requests to WikiProjects were simply not considered by any editors, even when the WikiProject was clearly active. Instead, the team turned to an unconventional method, what has been called “expert reviews”. This required contacting local academic experts and asking them to conduct an expert review of the article. The reviews were uploaded to the talk pages of the articles and the team responded by making the edits suggested by the academic expert.
  • #23: Gaps measurement requires looking for evidence of disparities according to existing (and easily accessible) sociological categories (namely cis gender) that can hide gaps that aren’t easily mapped (or are impossible to map at scale because they are only meaningful at the local level). The gender gap is the most widely studied by far, and predominantly in terms of cis gender because that is the data most readily available. Gender is also widely studied because it is a relatively stable concept globally. A handful of studies examine other gaps in coverage and participation, including in terms of geography, language, sexual orientation, age, cultural background and socio-economic status. If this becomes the dominant means of evaluating progress towards equity and inclusion goals, we risk hiding and setting back groups that can’t be easily mapped (e.g. Indigenous people who aren’t well represented in demographic data or are too small to count), as well as sources of inequality in editing practices such as article deletion: even when successful or notable women are added in response to content drives, they are more likely to be deleted by Wikipedians who determine that they are non-notable (Tripodi, 2021).
  • #24: Finally, gap measurement rarely accounts for context necessary for enabling action. It results in large-scale, global studies that do not facilitate action by those working on filling gaps in the local/national communities to solve problems of representation where they matter and where they can be solved.
  • #26: Instead of asking “What are the gaps?”, we need to ask “What are the sources of gaps?” We don’t have to start from scratch in answering this question. Decades of scholarship in the history of knowledge can tell us that Wikipedia is a knowledge institution that, like any other, is shaped by questions of power, access and representation. The source of bias on Wikipedia is in relations of unequal power and authority in the socio-technical system that has baked in norms and rules that determine who gets to decide what can be included and what must be excluded on Wikipedia. It is the practice of sorting (encyclopedic content from unencyclopedic content, knowledge from opinion, NPOV from POV, reliability from unreliability, original research from non-original research) that we must focus on. As Tkacz writes in “Wikipedia and the Politics of Openness”: “there needs to be a much greater focus on the practice of sorting and classifying knowledge and the role it plays in both new and old forms of subjugation.” (Tkacz, 2015, p.12)
  • #27: Finally, we need research that empowers those who have the resources, power and expertise to act on recommendations. Even gaps measurement, when it is focused on the national or local level, is useful if it provides tools that can be used by editors to make change (e.g. Tripodi’s gender study of articles for deletion or our wikihistories study on how English Wikipedians responded to external notability cues in the Order of Australia awards across all genders). If we are to measure gaps, we need longitudinal studies with data that is meaningful at a national level, because it is the chapters that have the resources to respond to local imbalances (a key component of the current strategy).