MODELING SEMANTIC RELATIONSHIPS OF AVIATION TERMS: VECTOR SPACES AND LANGUAGE MODELS

Authors: Ryabchenko I., Anayatova R., Tulekova G., Koshekov A., Kuanov Y.
IRSTI 16.31.61

Abstract. This article examines methods for modeling semantic relationships between aviation terms using the BERT and RoBERTa language models. The relevance of the study lies in its use of a previously prepared and annotated corpus of aviation terms that conform to international practice and are drawn from documents of international regulatory bodies. The corpus provides the basis needed to assess the semantics of aviation terminology in the context of real aircraft operation. The methodology involves fine-tuning the language models on the aviation corpus and evaluating them with cosine similarity, rank correlation, and clustering metrics. The experiments reveal the main differences between the two models in capturing synonymy, term variation, and shifts in aviation discourse. The results show that fine-tuning improves the models' ability to cluster related terms, distinguish closely related but distinct concepts, and agree with expert judgments. These findings provide a methodological basis for developing aviation terminology resources and for applying transformer models to lexicography and ontology construction.

Keywords: semantic proximity, aviation terminology, language models, corpus linguistics, transformers, embeddings, natural language processing.