Towards a lexical database of Dutch taboo language

Authors

  • Gerhard B van Huyssteen
  • Carole Tiberius

Keywords:

Dutch, lexical database, swearword, taboo language

Abstract

Over the past 45 years, at least eighteen paper-based Dutch dictionaries of taboo language (or taboo-related language) have been published (i.e., as visible works of lexicography). However, none of these is available as (linked) lexical data that could be integrated into natural language processing (NLP) tools and applications (i.e., as invisible works of lexicography). In this paper, we describe the development of TaboeLex, a comprehensive lexical database of taboo language (LDTL) for Dutch that can be integrated into such tools and applications. TaboeLex will be made available as open data, i.e., as a freely available, structured, annotated lexicon that can be linked to other data in the future. The paper focuses on the first phase of the project, namely defining and designing TaboeLex.

Published

2023-06-29