Thesaurus of Modern Slovene 2.0

Authors

  • Špela Arhar Holdt
  • Polona Gantar
  • Iztok Kosem
  • Eva Pori
  • Marko Robnik-Šikonja
  • Simon Krek

Keywords:

Thesaurus of Modern Slovene, responsive dictionary, automated lexicography, user involvement, post-editing lexicography

Abstract

This paper describes the improvement of the Thesaurus of Modern Slovene from version 1.0 to 2.0. The Thesaurus is a digitally-born, automatically created resource that provides fast access to open data on modern language use and is gradually improved through editing and user participation. The initial version 1.0 lacked metadata, dictionary labels, and semantic information, but was well received by users. A user study identified priorities for improvement, which were addressed in an upgrade funded by the Slovenian Ministry of Culture in 2021-2022. The project aimed to upgrade the dictionary interface design, establish protocols for labeling negative vocabulary, pilot the automatic extraction of antonyms, and supplement the dictionary with semantic indicators for 2,000 entries. This paper presents the upgraded Thesaurus, the methodology for each enhancement, and the challenges and solutions of lexicographic work. The Thesaurus serves as an example of lexical data reuse, interconnectivity, and user involvement, with insights useful for other language communities pursuing similar initiatives.

Published

2023-06-29