Multilingual Ranking of Wikipedia Articles with Quality and Popularity Assessment in Different Topics
Włodzimierz Lewoniewski, Krzysztof Węcel, Witold Abramowicz
Abstract: On Wikipedia, articles about various topics can be created and edited independently in each language version. Therefore, the quality of information about the same topic depends on the language. Any interested user can improve an article, and that improvement may depend on the popularity of the article. The goal of this study is to show which topics are best represented in different language versions of Wikipedia, using the results of quality assessment for over 39 million articles in 55 languages. In this paper, we also analyze how popular selected topics are among readers and authors in various languages. We used two approaches to assign articles to topics. First, we selected 27 main multilingual categories and analyzed all their connections with sub-categories, based on information extracted from over 10 million categories in 55 language versions. To assign each article to one of the 27 main categories, we took into account over 400 million links from articles to over 10 million categories and over 26 million links between categories. In the second approach, we used data from DBpedia and Wikidata. We also show how the results of the study can be used to build local and global rankings of Wikipedia content.
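The first approach described in the abstract, linking articles to a small set of main categories through the sub-category graph, can be illustrated with a minimal sketch. This is not the authors' procedure: the abstract does not state how conflicts between several reachable main categories are resolved, so the sketch assumes the nearest main category wins (fewest sub-category hops, ties broken alphabetically). The function name `nearest_main_category`, the input dictionaries, and the sample data are all hypothetical.

```python
from collections import deque


def nearest_main_category(article_cats, subcat_links, main_cats):
    """Assign each article to the main category closest in the category graph.

    article_cats: article title -> list of categories it links to
    subcat_links: category -> list of its direct sub-categories
    main_cats:    list of main (root) categories
    Assumption (not from the paper): the nearest root wins, ties go to the
    alphabetically smaller root name.
    """
    # Multi-source breadth-first search from all main categories at once.
    # dist[c] = (hops from the nearest main category, that main category)
    dist = {m: (0, m) for m in main_cats}
    queue = deque(main_cats)
    while queue:
        cat = queue.popleft()
        hops, root = dist[cat]
        for sub in subcat_links.get(cat, ()):
            if sub not in dist:  # first visit is the shortest path in BFS
                dist[sub] = (hops + 1, root)
                queue.append(sub)

    assignment = {}
    for article, cats in article_cats.items():
        reachable = [dist[c] for c in cats if c in dist]
        if reachable:  # articles with no reachable category stay unassigned
            assignment[article] = min(reachable)[1]
    return assignment


# Hypothetical toy data, for illustration only.
subcat_links = {
    "Science": ["Physics"],
    "Physics": ["Quantum mechanics"],
    "Sports": ["Football"],
}
article_cats = {
    "Schrödinger equation": ["Quantum mechanics"],
    "FIFA World Cup": ["Football"],
    "Orphan page": ["Unlinked category"],
}
result = nearest_main_category(article_cats, subcat_links, ["Science", "Sports"])
```

On real data, the category graph contains cycles and very long chains, so a hop limit or a different tie-breaking rule would likely be needed; the sketch leaves those choices open.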
|Journal series||MDPI Computers, ISSN 2073-431X, (N/A 20 pts)|
|Publication size in sheets||1.5|
|Conference||24th International Conference on Information and Software Technologies (ICIST 2018), 04-10-2018 - 06-10-2018, Vilnius, Lithuania|
|Keywords in Polish||Wikipedia, jakość informacji, popularność, identyfikacja tematów, Wikidata, DBpedia, WikiRank|
|Keywords in English||Wikipedia, information quality, popularity, topics identification, Wikidata, DBpedia, WikiRank|
|Score||= 20.0, 09-04-2020, ArticleFromConference|
|Publication indicators||= 0|
|Citation count*||6 (2020-08-04)|
|Notes||Special Issue: Selected Papers from the 24th International Conference on Information and Software Technologies (ICIST 2018)|
* The presented citation count is obtained through analysis of information available on the Internet and is close to the number calculated by the Publish or Perish system.