
dc.contributor.advisor: Rigo, Sandro José
dc.contributor.author: Martins, Mikaela Luzia
dc.date.accessioned: 2023-06-22T14:39:58Z
dc.date.accessioned: 2024-02-28T18:55:41Z
dc.date.available: 2023-06-22T14:39:58Z
dc.date.available: 2024-02-28T18:55:41Z
dc.date.issued: 2023-03-01
dc.identifier.uri: https://hdl.handle.net/20.500.12032/126348
dc.description.abstract: This work investigates lexical variation in Portuguese and English in the term alignment and lexical substitution steps of Natural Language Processing (NLP), within the specialized domain of retail. Our theoretical foundation is an interdisciplinary interface that draws on the postulates of Computing and Linguistics. We therefore offer a theoretical overview of the use of semantic information in the development of NLP systems and demonstrate ways of implementing semantic information in computational lexical databases such as WordNet, FrameNet and FrameNet Brasil. With regard to Linguistics, we rely on the definitions of Murphy (2003, 2010), L'Homme (2020) and Croft & Cruse (2004) regarding semantic relations as applied to specialized terminology. We also take into account León-Araúz & Faber's (2014) classifications and inferences regarding lexical variation and translation equivalents within the scope of Terminology. Our methodology is grounded in the principles of Corpus Linguistics and uses the Sketch Engine tool to analyze English and Portuguese corpora intended to represent the terminology of the domain. The pairs of terms chosen for the lexical substitution exercise are “plant” – “site” and “material” – “article”. The terminology used in the monolingual analysis stage comes from the predictions generated by three lexical substitution models: the first takes into account synonymy between terms, the second adds a layer of information in the form of word embeddings, and the third adds a layer of information that retrieves semantic frames. The terminology used in the multilingual analysis stage comes from the corpus and from a collection of retail terminology databases.
Our monolingual analysis classifies the models' predictions according to semantic relations and yields a categorization of terms following the definitions of terminological variation by León-Araúz & Faber (2014). The bilingual analysis, in turn, classifies the translation equivalents of the term pairs according to the translation problem they represent and the types of equivalence listed by León-Araúz & Faber (2014). Finally, based on these semantic-terminological analyses, our results point to improvements in lexical substitution and machine translation models that take semantic information and terminological classification categories into account in order to improve the quality and linguistic accuracy of their output. [en]
dc.description.sponsorship: CAPES - Coordenação de Aperfeiçoamento de Pessoal de Nível Superior [pt_BR]
dc.language: pt_BR [pt_BR]
dc.publisher: Universidade do Vale do Rio dos Sinos [pt_BR]
dc.rights: openAccess [pt_BR]
dc.subject: Terminologia [pt_BR]
dc.subject: Terminology [en]
dc.title: The lexicon as a possibility: the contribution of semantic-terminological information to lexical substitution tasks in natural language processing [pt_BR]
dc.type: Dissertação [pt_BR]


Files in this item

File: Mikaela Martins_PROTEGIDO.pdf
Size: 2.211Mb
Format: application/pdf


