paper-24

A multilingual trilogy: Developing three multi-language lexicographic datasets

Authors: Ilan Kernerman

Abstract:
This paper offers a brief overview of three multilingual developments by K Dictionaries, highlighting the main editorial procedures involved and the technical tools applied. The first is an English multilingual dictionary that brings together 43 language versions of the Password semi-bilingual dictionary. The second stems from the first, semi-automatically generating multilingual glossaries from any one of those languages to all the others via detailed bilingual L2-English indexes. The third is part of the Global series and consists of monolingual datasets for over 20 languages that serve to create various bilingual and multilingual versions and multi-layered combinations. Further steps are anticipated to interlink and unify the different resources and processes, for example by associating translations in one lexicographic set with corresponding entries in others, and thereby with more translations in other languages, and by converting the data to RDF format for interoperability with Linked Data and Semantic Web technologies.

Keywords: multilingual; dictionary; dataset; semi-automatic generation; linked data

Reference: In Kosem, I., Jakubíček, M., Kallas, J., Krek, S. (eds.) Electronic lexicography in the 21st century: linking lexical data in the digital age. Proceedings of the eLex 2015 conference, 11-13 August 2015, Herstmonceux Castle, United Kingdom. Ljubljana/Brighton: Trojina, Institute for Applied Slovene Studies/Lexical Computing Ltd., pp. 372-383.

URL: https://elex.link/elex2015/proceedings/eLex_2015_24_Kernerman.pdf

Published: 2015