Tools for automatic summarization of texts in Polish. State of research and implementation work

Piotr Glenc


The goal of this publication is to present the state of research and implementation work carried out in Poland on automatic text summarization. The author describes the principal theoretical and methodological issues related to automatic summary generation and then outlines selected works on the automatic abstracting of Polish texts. The author also presents three examples of IT tools that generate summaries of texts in Polish (Summarize, Resoomer, and NICOLAS) and characterizes them on the basis of an experiment that included a quality assessment of the generated summaries using ROUGE-N metrics. Both the literature review and the experiment revealed a shortage of tools for automatically summarizing Polish texts, especially tools based on the abstractive approach. Most of the proposed solutions rely on the extractive method, which builds the summary from parts of the original text. There is also a shortage of tools that generate a single common summary of multiple text documents, as well as specialized tools that summarize documents from specific subject areas. Moreover, work on creating corpora of Polish-language text summaries needs to be intensified so that computer scientists can use them to evaluate newly developed tools.
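The ROUGE-N metrics mentioned above compare the n-grams of a generated summary with those of a reference summary. The following is a minimal sketch of that idea, not the evaluation code used in the experiment: the function names `ngrams` and `rouge_n` are hypothetical, tokenization is simplified to lowercasing and whitespace splitting, and no stemming or stop-word handling is applied.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count all contiguous n-grams in a list of tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n(candidate, reference, n=1):
    """Return ROUGE-N recall, precision, and F1 for two plain-text summaries.

    Assumption: tokens are obtained by lowercasing and splitting on whitespace.
    """
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)

    # Clipped overlap: each n-gram counts at most as often as it occurs in both texts.
    overlap = sum((cand & ref).values())
    ref_total = sum(ref.values())
    cand_total = sum(cand.values())

    recall = overlap / ref_total if ref_total else 0.0
    precision = overlap / cand_total if cand_total else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"recall": recall, "precision": precision, "f1": f1}


if __name__ == "__main__":
    generated = "the cat sat on the mat"
    reference = "the cat lay on the mat"
    print(rouge_n(generated, reference, n=1))
    print(rouge_n(generated, reference, n=2))
```

Recall is the share of the reference summary's n-grams that also appear in the generated summary; reporting precision and F1 alongside it is a common convention and is an assumption here, not something the source specifies.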

Keywords: text summarization, Natural Language Processing, text documents, Polish language processing, automation of knowledge acquisition


Glenc, P. (2021). Narzędzia do automatycznego streszczania tekstów w języku polskim. Stan badań naukowych i prac wdrożeniowych. e-mentor, 2(89), 67-77.