Comparative Analysis of the Informativeness and Encyclopedic Style of the Popular Web Information Sources
Nina Khairova, Włodzimierz Lewoniewski, Krzysztof Węcel, Mamyrbayev Orken, Mukhsina Kuralai
Abstract
Decision making today often relies on information found in various Internet sources. Texts written in an encyclopedic style, which contain mostly factual information, are preferred. We propose combining the logic-linguistic model with the universal dependency treebank to extract facts of various quality levels from texts. Using Random Forest as the classification algorithm, we identify the types of facts and types of words that most strongly affect the encyclopedic style of a text. We evaluate our approach on four corpora based on Wikipedia, social media, and mass media texts. Our classifier achieves an F-measure of over 90%.
|Publication size in sheets||0.55|
|Book||Abramowicz Witold, Paschke Adrian (eds.): Business Information Systems 21st International Conference, BIS 2018, Berlin, Germany, July 18-20, 2018, Proceedings, Lecture Notes in Business Information Processing, vol. 320, 2018, Springer, ISBN 978-3-319-93930-8, [978-3-319-93931-5], 426 p., DOI:10.1007/978-3-319-93931-5|
|Keywords in English||encyclopedic, informativeness, universal dependency, random forest, facts extraction, Wikipedia, mass media|
|Score||= 70.0, 11-09-2020, ChapterFromConference|
|Citation count*||7 (2020-09-16)|
* The presented citation count is obtained through Internet information analysis and is close to the number calculated by the Publish or Perish system.