Collocations Dictionary of Modern Slovene 2.0

Authors

  • Iztok Kosem
  • Špela Arhar Holdt
  • Polona Gantar
  • Simon Krek

Keywords:

collocations dictionary, responsive dictionary, crowdsourcing, examples, post-editing lexicography

Abstract

In this paper, we present the Collocations Dictionary of Modern Slovene 2.0, a substantial upgrade of the first version in terms of both content and interface. The dictionary contains 81,445 headwords, nearly 4.5 million collocations, and more than 17 million examples. Its content has been improved using relevant findings from user studies and related research, together with a new methodology for the automatic extraction of collocations from syntactically parsed corpus data. The interface has undergone important changes, such as an immediate view of all the collocations in an entry and three easy-to-understand levels of entry completion. In terms of data storage, a crucial development has been the combination of the Digital Dictionary Database, which allows data to be shared among the various resources produced at the Centre for Language Resources and Technologies at the University of Ljubljana, with a data warehouse, where all the automatically extracted collocations and additional metadata are stored.

Published

2023-06-29