Lecture on lexical packages in Hildesheim

Face-to-face scientific meetings, such as conferences, seminars and lecture series, remain essential forums for specialists in a particular field or technique to exchange ideas and discuss progress and challenges. In many cases, the feedback received at these meetings is invaluable in improving or redirecting the workflow of a research project. In the field of lexicography and natural language processing (NLP), one such forum is undoubtedly the European Master in Lexicography (EMLex), which carries the prestigious Erasmus Mundus label.

Presentation on Lexical Packages

As a lecturer in the 2024 summer semester of this Master’s programme, organised by the University of Hildesheim, Carlos Valcárcel (University of Vigo) presented the semi-automatic creation of lexical packages. These packages are an essential element of the projects developed over the years within PORTLEX, the R&D structure led by María José Domínguez of the University of Santiago de Compostela. Originally designed to feed generators of nominal phrases and their sentence contexts, they have several other uses, including language teaching and the creation of glossaries and lexicons.

Reusing Lexical Packages for Semantic Tagging

Reusing these lexical packages to create a semantic tagger is the aim of ESMAS-ES+, the current PORTLEX project. A semantic tagger is a digital resource that classifies the words in a text according to their meaning. For human readers, such a tagger can aid comprehension by highlighting difficult words and indicating their language level. For machines, it supports the interpretation of human-generated text and can yield more accurate results in information retrieval. With these goals in mind, PORTLEX is now reusing data from previous projects to build the tagger.
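As a rough illustration of the idea, a lexicon-based semantic tagger can be sketched in a few lines of Python. This is only a minimal sketch under stated assumptions: the lexicon entries, the tag names, the language levels and the function `tag_text` are all hypothetical and do not reproduce the PORTLEX data or the ESMAS-ES+ design.

```python
import re

# Hypothetical mini-lexicon: word -> (semantic tag, language level).
# The entries are invented for illustration only.
LEXICON = {
    "doctor": ("person.profession", "A1"),
    "hospital": ("place.building", "A2"),
    "diagnosis": ("abstract.medical", "B2"),
}

def tag_text(text):
    """Return (token, tag, level) triples; unknown tokens get (None, None)."""
    tokens = re.findall(r"\w+", text.lower())
    return [(tok, *LEXICON.get(tok, (None, None))) for tok in tokens]

tagged = tag_text("The doctor confirmed the diagnosis")
for token, tag, level in tagged:
    print(token, tag, level)
```

A real tagger would also need lemmatisation and word-sense disambiguation, which this sketch leaves out.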

In Hildesheim, Carlos Valcárcel not only detailed the process of creating and semantically annotating lexical packages, but also showed the ontology of semantic tags developed in parallel to classify all these files. This resource, which will also be used to develop the tagger for the ESMAS-ES+ project, consists of hundreds of hierarchically organised tags.
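To make the notion of hierarchically organised tags more concrete, the following Python sketch stores each tag together with its parent and walks up the hierarchy. The tag names and the helpers `path_to_root` and `is_a` are assumptions made for illustration; they are not taken from the PORTLEX ontology.

```python
# Hypothetical fragment of a tag hierarchy: tag -> parent tag (None marks the root).
PARENT = {
    "entity": None,
    "living_being": "entity",
    "person": "living_being",
    "profession": "person",
    "animal": "living_being",
}

def path_to_root(tag):
    """Return the chain of tags from the given tag up to the root."""
    path = []
    while tag is not None:
        path.append(tag)
        tag = PARENT[tag]
    return path

def is_a(tag, ancestor):
    """True if `ancestor` is the tag itself or lies above it in the hierarchy."""
    return ancestor in path_to_root(tag)

print(path_to_root("profession"))
print(is_a("animal", "person"))
```

Storing only the parent of each tag keeps the structure compact, and a word annotated with a specific tag can still be retrieved under any of its more general tags.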

Reception and Future Work

The presentation was very well received by the audience, which consisted of EMLex Master’s students and professors from the University of Hildesheim. Particularly valuable were the comments made by Professors Ulrich Heid and Gertrud Faaß, two leading experts in different areas of applied linguistics. These comments should prove useful for improving the revision and semantic annotation of the existing lexical packages, as well as for developing new ones.