Multilingual generation of noun valency patterns and automated retrieval of syntactic-semantic data.
The main goal of the research project MultiGenera is to develop a tool for the automated generation of nominal phrases in Spanish, German and French. The tool automatically produces the argument realizations of nouns, enriched with semantic information. This project is funded by the BBVA Foundation.
This multilingual simulator draws on multiple sources: dependency grammar and lexicography (Engel, Mel’čuk, etc.), work on ontologies and semantic networks such as WordNet, and other Natural Language Processing (NLP) tools. On this basis, MultiGenera develops a conceptual, onomasiological analysis of different lexical-conceptual fields in both monolingual and multilingual contexts, together with a comparative and contrastive study of nouns representative of those fields.
In order to develop this multi-generator, it was necessary to create a combined method for the collection and analysis of the paradigmatic and syntagmatic relationships of nouns.
This method consists essentially of the automated extraction of data from NLP resources (corpus analysis, co-occurrence databases and wordnets) and the evaluation of the data returned by MultiGenera itself for Spanish, German and French.
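The idea of combining a noun's valency pattern with semantically classified fillers can be illustrated with a minimal Python sketch. It is not the MultiGenera implementation: the pattern for the Spanish noun "viaje", the semantic classes, the filler lists and the function name generate_phrases are all hypothetical and chosen only for illustration. In the project itself such data would come from the resources described above, not from hand-written lists.

```python
from itertools import product

# Hypothetical valency pattern for the Spanish noun "viaje" (trip).
# Each argument slot pairs a preposition with the semantic class of its filler.
PATTERN = {
    "head": "el viaje",
    "slots": [
        {"prep": "de", "sem_class": "person"},
        {"prep": "a", "sem_class": "place"},
    ],
}

# Toy filler lexicon keyed by semantic class (illustrative assumption only).
FILLERS = {
    "person": ["la profesora", "los estudiantes"],
    "place": ["Lisboa", "Berlín"],
}


def generate_phrases(pattern, fillers):
    """Return every noun phrase obtained by filling each slot of the pattern."""
    # For each slot, build the list of possible "preposition + filler" strings.
    slot_options = [
        [f"{slot['prep']} {filler}" for filler in fillers[slot["sem_class"]]]
        for slot in pattern["slots"]
    ]
    # The Cartesian product yields one filler choice per slot, in slot order.
    return [" ".join([pattern["head"], *combo]) for combo in product(*slot_options)]


phrases = generate_phrases(PATTERN, FILLERS)
for phrase in phrases:
    print(phrase)
```

The sketch keeps the syntagmatic information (slots and prepositions) separate from the paradigmatic information (fillers grouped by semantic class), so either side can be replaced independently, for example by a pattern for a German or French noun.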
The results of this project are twofold: a study of the viability of the prototype simulator with respect to the combination of methods and the interoperability of resources, and the development of the MultiGenera tool itself.
How to cite:
María José Domínguez Vázquez / Carlos Valcárcel Riveiro / David Lindemann: Multilingual Generation of Noun Valency Patterns for Extracting Syntactic-Semantical Knowledge from Corpora (MultiGenera). In Jaka Čibej / Vojko Gorjanc / Iztok Kosem / Simon Krek: Proceedings of the XVIII EURALEX International Congress: Lexicography in Global Contexts (ISBN 978-961-06-0097-8), Ljubljana, Slovenia: Ljubljana University Press, 847-854.