Named Entity Disambiguation for Maritime-related Data Retrieved from Heterogenous Sources

Jacek Małyszko, Witold Abramowicz, Milena Stróżyna


The article concerns the integration and disambiguation of data related to the maritime domain. It describes a system that collects and merges data about several kinds of maritime-related entities (vessels, vessel types, ports, companies, etc.) retrieved from different internet sources and feeds the data into a single database. This process is not trivial, and a few challenges must be addressed to carry it out successfully. Firstly, different sources may refer to the same entity in different ways, for example by using different text strings. Additionally, some of these references may be ambiguous, i.e. a reference may point to more than one entity. To enable efficient analysis of data coming from different sources, such ambiguities must be resolved automatically as a preprocessing step, before the data is uploaded to the database and used in further computations. The aim of the disambiguation process is to assign an artificial, unique identifier to each entity and then, where possible, automatically attach that identifier to each data item related to the entity. The article discusses the methods developed for resolving such ambiguities and presents their evaluation.
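The idea of assigning artificial identifiers and attaching them to incoming references only when the match is unambiguous can be illustrated with a minimal sketch. It is not the method from the article: the `normalize` function, the `EntityRegistry` class, the `E1`-style identifiers and the sample names are all hypothetical, and simple string normalization stands in for whatever matching the authors actually use.

```python
import re
from collections import defaultdict


def normalize(name: str) -> str:
    """Lowercase, replace punctuation with spaces, collapse whitespace."""
    cleaned = re.sub(r"[^a-z0-9 ]+", " ", name.lower())
    return " ".join(cleaned.split())


class EntityRegistry:
    """Hypothetical registry mapping name variants to artificial identifiers."""

    def __init__(self):
        self._next_id = 1
        # normalized alias -> set of entity identifiers sharing that alias
        self._index = defaultdict(set)

    def register(self, *aliases: str) -> str:
        """Create a new entity with a fresh identifier and index its aliases."""
        entity_id = f"E{self._next_id}"
        self._next_id += 1
        for alias in aliases:
            self._index[normalize(alias)].add(entity_id)
        return entity_id

    def resolve(self, reference: str):
        """Return the identifier only if exactly one entity matches."""
        candidates = self._index.get(normalize(reference), set())
        if len(candidates) == 1:
            return next(iter(candidates))
        return None  # unknown or ambiguous reference: left unresolved


registry = EntityRegistry()
port_id = registry.register("Rotterdam", "Port of Rotterdam")
registry.register("Sea Star")   # one vessel (assumed sample name)
registry.register("SEA-STAR")   # a different vessel with a colliding name

print(registry.resolve("port of ROTTERDAM"))  # resolves to the port's identifier
print(registry.resolve("Sea Star"))           # ambiguous, so None
```

Returning `None` for ambiguous references, rather than picking one candidate, keeps unresolved items visible so that additional attributes (for vessels, for instance, an MMSI number) could be used to decide later.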
Authors:
- Jacek Małyszko (WIiGE / KIE), Department of Information Systems
- Witold Abramowicz (WIiGE / KIE), Department of Information Systems
- Milena Stróżyna (UEP), Poznań University of Economics and Business
Journal series: TransNav - The International Journal on Marine Navigation and Safety of Sea Transportation, ISSN 2083-6473, e-ISSN 2083-6481 (B, 12 points)
Issue year: 2016
Publication size in sheets: 0.6
Keywords in English: Maritime-Related Data, Heterogenous Sources, Disambiguation of Data, Data Sensors, Data Source, Common Operating Picture (COP), Maritime Domain Awareness (MDA), Maritime Mobile Service Identity (MMSI)
Language: en (English)
Score (nominal): 12
Score source: journalList
Score: Ministerial score = 12.0, 13-12-2019, ArticleFromJournal
Ministerial score (2013-2016) = 12.0, 13-12-2019, ArticleFromJournal
Publication indicators: WoS Citations = 1
Citation count*: 4 (2021-07-22)

* presented citation count is obtained through Internet information analysis and it is close to the number calculated by the Publish or Perish system.