Automating derivational morphology for Slovenian

Authors

  • Tomaž Erjavec
  • Marko Pranjić
  • Boris Kern
  • Senja Pollak
  • Andraž Pelicon
  • Irena Stramljič Breznik

Keywords:

derivational morphology, word formation, automated morphological segmentation, derivational dictionary, morphological chains

Abstract

In this paper, we focus on computational approaches to supporting the analysis of derivational word formation in Slovenian. The main contributions are twofold: first, we derive word-formation rules and chains from the examples in the trial volume of a derivational dictionary and apply them to larger lexicons from two Slovenian resources; second, we propose the first morphological segmenter for Slovenian. More specifically, from the digitised trial volume (words starting with ) of the derivational dictionary of Slovenian, we extracted suffixal word-formation rules and applied them to two lexicons of Slovenian, Sloleks and a lexicon extracted from the metaFida corpus, to acquire new word-formation instances for each chaining rule. The study of word-formation chains is relevant because it gives insight into word-formation mechanisms and productivity. The results show that when the derived chaining rules were applied to Sloleks, 21.95% to 31.58% of the resulting derivational chains were correct. In contrast, when the chaining rules were applied to the metaFida lexicon, the results were very noisy, with an extremely low percentage of correct chains. Next, motivated by the fact that morphological segmentation is a prerequisite for determining the structure of word-formation chains, and by the need for more general analysis at the level of morphemes, we implemented the first automated morphological segmentation models for Slovenian. The supervised model is based on BiLSTM-CRF and achieves an F1 score of 83.98%, which is significantly higher than the scores of the two unsupervised baselines we implemented for comparison, Morfessor and MorphoChain.

Published

2023-06-29