The Kosh Suite: A Framework for Searching and Retrieving Lexical Data Using APIs

Authors

  • Francisco Mondaca
  • Philip Schildkamp
  • Felix Rau
  • Luke Günther

Abstract

This paper presents the Kosh Suite, an API-centric framework for managing and accessing lexical data efficiently. It addresses the challenges of working with XML and lexical data by providing a flexible and customizable solution. The backend is powered by Elasticsearch, which handles data management and retrieval, and exposes two APIs per dataset: a REST API and a GraphQL API. The frontend is a React-based user interface designed to be user-friendly and adaptable to various use cases. Deployment specifications are described for the backend, with reference implementations for FreeDict and the Cologne Digital Sanskrit Dictionaries (CDSD). Planned enhancements include asynchronous request handling using FastAPI, integration with CSV files, and leveraging advances in large language models (LLMs). These improvements could significantly enhance the system’s performance and accessibility and promote the integration of underrepresented languages into mainstream LLMs.

Published

2023-06-29