Quality assessment of Wikipedia and its sources
Dr. Włodzimierz Lewoniewski
Languages of the world in 2020
• 7,117 languages are spoken.
• 2,926 languages are endangered.
• Just 23 languages account for more than ½ of the world’s population.
• Wikipedia articles have been created in 314 languages.
Source: ethnologue.com, meta.wikimedia.org
[Figure: The top 10 most spoken languages (in millions, axis 0 to 1,400), showing native speakers and total number of speakers for English, Mandarin Chinese, Hindi, Spanish, French, Standard Arabic, Bengali, Russian, Portuguese, and Indonesian.]
Motivation – enrichment of multilingual information
Source: Lewoniewski, W. (2018). The method of comparing and enriching information in multilingual wikis based on the analysis of their quality. PhD thesis.
Quality matters!
Quality in Multilingual Wikipedia
• Wikipedia can be edited in each language independently
• The same subject can be described differently in each language
• To compare these descriptions, a user usually needs to understand those languages
• Information quality depends on the language edition of Wikipedia
• Each language edition defines its own rules and standards
• Standards may change over time
• Reliable sources are important
• Assessment of the same source depends on the language edition of Wikipedia
• Reliability of the same source may change over time
Related works
• Lewoniewski, W., Węcel, K., Abramowicz, W. (2020). Modeling Popularity and Reliability of Sources in Multilingual Wikipedia. Information, 11(5), 263.
• Lewoniewski, W., Węcel, K., Abramowicz, W. (2019). Multilingual ranking of Wikipedia articles with quality and popularity assessment in different topics. Computers, 8(3), 60.
• Lewoniewski, W. (2019). Measures for quality assessment of articles and infoboxes in multilingual Wikipedia. In International Conference on Business Information Systems (pp. 619-633). Springer, Cham.
• Lewoniewski, W. (2018). The method of comparing and enriching information in multilingual wikis based on the analysis of their quality. PhD thesis.
• Lewoniewski, W., Węcel, K., Abramowicz, W. (2017). Relative quality and popularity evaluation of multilingual Wikipedia articles. Informatics 2017, 4(4), 43.
• Lewoniewski, W. (2017). Enrichment of information in multilingual Wikipedia based on quality analysis. In International Conference on Business Information Systems (pp. 216-227). Springer, Cham.
• Lewoniewski, W., Węcel, K., Abramowicz, W. (2017). Analysis of references across Wikipedia languages. In International Conference on Information and Software Technologies (pp. 561-573). Springer, Cham.
Quality classes in Wikipedia languages
Source: Lewoniewski, W. (2017). Enrichment of information in multilingual Wikipedia based on quality analysis. In International Conference on Business Information Systems (pp. 216-227). Springer, Cham.
Quality dimensions
Source: Lewoniewski, W. (2019). Measures for quality assessment of articles and infoboxes in multilingual Wikipedia.
Significance of measures depending on language
Source: Węcel, K., Lewoniewski, W. (2015). Modelling the quality of attributes in Wikipedia infoboxes.
Significance of measures depending on language (2)
Source: Lewoniewski, W. (2018). The method of comparing and enriching information in multilingual wikis based on the analysis of their quality. PhD thesis
Distribution of measures in quality classes
Source: Lewoniewski, W., Węcel, K., Abramowicz, W. (2017). Relative quality and popularity evaluation of multilingual Wikipedia articles. Informatics 2017, 4(4), 43.
Normalized measures average (NMA):
$$NMA = \frac{1}{c} \sum_{i=1}^{c} m_i$$
where $m_i$ is the $i$-th normalized measure and $c$ is the number of measures.
Article quality score - synthetic quality measure
Additionally, we need to take into account the number of quality flaw templates (QFT) to measure the quality score:
$$QualityScore = NMA \cdot (1 - 5\% \cdot QFT)$$
Source: Lewoniewski, W., Węcel, K., Abramowicz, W. (2019). Multilingual ranking of Wikipedia articles with quality and popularity assessment in different topics. Computers, 8(3), 60.
Normalization of each measure $m_i$ was conducted according to the following rule:
• if the value of a given feature in a given language exceeded the median value of the best articles in the same language version, it was set to 100 points;
• otherwise, its value was linearly scaled to reflect the ratio of the value to that median.
For example, if the median for the number of references in Polish Wikipedia was 97:
• any article with a larger number of references would score 100 for this feature;
• an article with 59 references would score proportionally 60.82 points (59/97 × 100) after normalizing.
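The normalization rule, the NMA, and the quality score can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the WikiRank implementation: the function names, the dictionary layout, and the `length` measure with its median of 20,000 are hypothetical, and only the 97-reference median and the 5% penalty per quality flaw template come from the slides.

```python
from typing import Dict


def normalize(value: float, best_median: float) -> float:
    """Scale a raw measure to 0-100 points relative to the median of the best articles."""
    if best_median <= 0:
        raise ValueError("best_median must be positive")
    if value >= best_median:
        return 100.0  # at or above the median of the best articles
    return 100.0 * value / best_median  # linear scaling below the median


def quality_score(measures: Dict[str, float],
                  best_medians: Dict[str, float],
                  qft: int) -> float:
    """NMA of the normalized measures, reduced by 5% per quality flaw template."""
    normalized = [normalize(measures[name], best_medians[name])
                  for name in best_medians]
    nma = sum(normalized) / len(normalized)
    return nma * (1 - 0.05 * qft)


# Hypothetical article: 59 references, 30,000 characters, one flaw template.
medians = {"references": 97, "length": 20000}   # "length" median is made up
article = {"references": 59, "length": 30000}
print(round(normalize(59, 97), 2))               # the slide's example: 60.82
print(quality_score(article, medians, qft=1))
```

Note that the formula as given becomes negative for more than 20 flaw templates; the sketch does not clamp the result because the slides do not say how that case is handled.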
Quality score – an example of implementation
Implementation of the quality score on WikiRank.net
Wikipedia references
Over 200 million references in 55 languages
[Figure: Number of references and unique references per language (in millions, doubling axis from 1 to 64).]
The calculation is based on Wikimedia dumps as of March 2020 using complex extraction of references. More languages: Lewoniewski, W., Węcel, K., Abramowicz, W. (2020). Modeling Popularity and Reliability of Sources in Multilingual Wikipedia. Information, 11(5), 263.
Wikipedia references – complex extraction
Source: Lewoniewski, W., Węcel, K., Abramowicz, W. (2020). Modeling Popularity and Reliability of Sources in Multilingual Wikipedia. Information, 11(5), 263.
Templates in references on English Wikipedia
Source: Lewoniewski, W., Węcel, K., Abramowicz, W. (2020). Modeling Popularity and Reliability of Sources in Multilingual Wikipedia. Information, 11(5), 263.
[Figure: The most commonly used names in the publisher parameter of citation templates.]
Wikipedia references with special identifiers
The calculation is based on Wikimedia dumps as of March 2020 using complex extraction of references. More languages and identifiers: Lewoniewski, W., Węcel, K., Abramowicz, W. (2020). Modeling Popularity and Reliability of Sources in Multilingual Wikipedia. Information, 11(5), 263.
[Figure: References with special identifiers (in percentages, axis 0% to 9%) for the language editions de, en, es, fr, it, ja, nl, pl, pt, ru, sv, uk, zh. Series: DOI, unique DOI, ISBN, unique ISBN, ISSN, unique ISSN, PMID, unique PMID.]
Popularity and reliability models
Source: Lewoniewski, W., Węcel, K., Abramowicz, W. (2020). Modeling Popularity and Reliability of Sources in Multilingual Wikipedia. Information, 11(5), 263.
• F model: based on the frequency (F) of source usage.
• P model: based on the cumulative pageviews (P) of the article in which the source appears.
• PR model: based on the cumulative pageviews (P) of the article in which the source appears, divided by the number of references (R) in that article.
• PL model: based on the cumulative pageviews (P) of the article in which the source appears, divided by the article length (L).
• Models Pm, PmR, and PmL are modified versions that use the median of daily pageviews.
• Models A, AR, and AL use the number of authors.
Publishers in references on English Wikipedia
Position in rankings of publishers in English Wikipedia depending on popularity and reliability model in February 2020.
Source: own calculation based on Wikimedia dumps using complex extraction and using only values from the publisher parameter of citation templates in references.
More publishers: Lewoniewski, W., Węcel, K., Abramowicz, W. (2020). Modeling Popularity and Reliability of Sources in Multilingual Wikipedia. Information, 11(5), 263.
Position in the ranking depending on model:
Source                             F   P  PR  PL  Pm PmR PmL   A  AR  AL
AllMusic                           8  28   8   8  26   8   9  14   6   7
BBC                                3   4   3   3   5   4   3   2   3   3
BBC News                          10   5   7   5   6   7   5   7   8   8
BBC Sport                          4  11  15  12  16  17  13   5   7   5
Cambridge University Press         5   3   2   2   3   2   2   3   4   4
CBS Interactive                   20   9  10   7   9  10   8  12  15  10
CNN                               22   2   9   6   2   9   6   6  16  12
ESPN                              13   8  17  14   8  19  16  10  13  14
IGN                               32  37  29  24  34  29  23  22  17  15
National Park Service              7  94  38  48  89  47  58  60  12  11
Official Charts Company           16  30  24  18  31  26  18  18  21  18
Oxford University Press            2   1   1   1   1   1   1   1   2   1
Routledge                          6   6   4   4   4   3   4   4   5   6
Springer                          19  12   6   9  10   5   7  16  10  13
United States Census Bureau        1  27   5  11  24   6  12   9   1   2
University of California Press    24  16  19  19  12  18  17  15  19  22
Yale University Press              9  13  28  28  13  28  29  21  33  28
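The ranking above uses only values from the publisher parameter of citation templates. As a rough illustration of what pulling such values out of wikitext can look like, here is a simplified regular-expression sketch. It is a stand-in, not the "complex extraction" used for the published results: the function name and patterns are assumptions, and nested templates, name variants of the same publisher, and references defined outside `<ref>` tags are not handled.

```python
import re
from collections import Counter

# Body of a <ref>...</ref> element; self-closing <ref name="x" /> is skipped.
REF_RE = re.compile(r"<ref(?:\s[^>/]*)?>(.*?)</ref>", re.DOTALL | re.IGNORECASE)
# "| publisher = value", captured up to the next "|" or "}".
PUBLISHER_RE = re.compile(r"\|\s*publisher\s*=\s*([^|}]+)", re.IGNORECASE)


def publishers_in_wikitext(wikitext: str) -> Counter:
    """Count publisher parameter values found inside references."""
    counts: Counter = Counter()
    for ref_body in REF_RE.findall(wikitext):
        for value in PUBLISHER_RE.findall(ref_body):
            name = value.strip().strip("[]").strip()  # drop [[wikilink]] brackets
            if name:
                counts[name] += 1
    return counts


# Made-up wikitext fragment for illustration.
sample = (
    "Text<ref>{{cite book |title=X |publisher=Oxford University Press "
    "|year=2001}}</ref> more<ref name=\"b\">{{cite web |url=http://bbc.co.uk "
    "|publisher = [[BBC]]}}</ref> again<ref name=\"b\" />"
)
print(publishers_in_wikitext(sample))
```

Counts produced this way correspond to the frequency-based (F) view of a publisher; the other models would additionally need pageviews, reference counts, or article length per article.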
BestRef – analysis of sources on Wikipedia
Implementation of popularity and reliability models on BestRef.net
BestRef – analysis of sources on Wikipedia (2)
Source: BestRef.net
Position of selected sources in the rankings, by language edition and model:

nytimes.com
Lang.     F    PR    AR
all       7     3     5
ar        6    11    15
de       10    10    11
es       24    10    20
fr       14    18    18
it       12    11    13
ja       25    46    49
nl       23    17    23
pl       31    25    38
pt       10    11    22
ru       11    14    18
sv       46    23    32
uk       41    40    41
zh       13    27    32

spiegel.de
Lang.     F    PR    AR
all     125    47    51
ar      270   423   415
de        2     1     2
es      413   457   584
fr      401   471   392
it      329   359   387
ja      692   992   899
nl      117   113   135
pl      534   392   421
pt      501   497   543
ru      355   333   324
sv      269   250   216
uk      342   501   340
zh      469   787   689

lemonde.fr
Lang.     F    PR    AR
all     111    67    76
ar      343   556   418
de      341   393   355
es      280   398   506
fr        5     1     3
it      315   275   328
ja      701  1381  1244
nl      203   345   342
pl      730   689   675
pt      438   514   597
ru      622   753   875
sv      663  1023   630
uk      771  1522  1013
zh      526  1339  1128

elpais.com
Lang.     F    PR    AR
all     100    56    62
ar      265   852   400
de      312   445   346
es       21     3     3
fr      109   248   199
it      248   281   256
ja      630  2096  1236
nl      351   620   435
pl      577  1018   785
pt      102    67   101
ru      635   832   741
sv      577   963   758
uk      720  1537   916
zh      371  1051   729
Browser extensions for quality assessment of Wikipedia
• Articles - WikiRank
• Chrome: chrome.google.com/webstore/detail/wikirank/cnomlnphfhgijoghjcbdpmmhfgeooabd
• Firefox: addons.mozilla.org/en-US/firefox/addon/wikirank
• presentation: youtube.com/watch?v=jJdKw2gf1aA
• Infoboxes
• Chrome: chrome.google.com/webstore/detail/infoboxes/njjjplipinhcglgjopmnnlphmlhdpkko
• presentation: youtube.com/watch?v=HCfvx0wQ5oM
• Sources - BestRef
• Chrome: chrome.google.com/webstore/detail/bestref/bnlfiilmigfboedocmjaejbklgbdmmio
• presentation: youtube.com/watch?v=FXnfaAIaixc
Further applications
• Quality models can help to enrich various language editions of Wikipedia and other knowledge bases with information of better quality.
• Some of the approaches are planned to be implemented on global.dbpedia.org.
[Figure: GlobalFactSync data flow.]
Source: commons.wikimedia.org/wiki/File:GFS.png
Thank you
• E-mail: wlodzimierz.lewoniewski@ue.poznan.pl
• WWW: kie.ue.poznan.pl
