Building of parallel and comparable cybersecurity corpora for bilingual terminology extraction

Utka, Andrius; Rackevičienė, Sigita; Rokas, Aivaras; Mockienė, Liudmila; Laurinaitis, Marius; Bielinskienė, Agnė

doi:https://doi.org/10.3384/ecp18912

Collection:

Scientific publications

Document Type:

Book part

Language:

English

Title:

Building of parallel and comparable cybersecurity corpora for bilingual terminology extraction

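Authors:

Utka, Andrius; Rackevičienė, Sigita; Rokas, Aivaras; Mockienė, Liudmila; Laurinaitis, Marius; Bielinskienė, Agnė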

In the Book:

Selected papers from the CLARIN Annual Conference 2021, Virtual Event, 2021, 27-29 September. P. 126-138. Linköping: Linköping University Electronic Press, 2022

Subject terms:

Bilingualism; Terminology; Germanic languages; Lexicography.

Summary / Abstract:

The paper presents English-Lithuanian corpora for bilingual term extraction (BiTE) in the cybersecurity domain within the framework of the DVITAS project. It argues that a system of parallel, comparable, and training corpora for BiTE is particularly useful for less-resourced languages, as it makes it possible to combine the strengths of comparable and parallel resources efficiently while avoiding their weaknesses. Special focus is given to the availability of sources in the cybersecurity domain and to issues related to copyright-protected publications, as well as to the data curation performed when building the corpora and depositing them in the CLARIN-LT repository. Keywords: Bilingual Terminology Extraction, Parallel Corpus, Comparable Corpus.

DOI:

10.3384/ecp18912

Subject area:

Linguistics

Related Publications:

Automatic extraction of Lithuanian cybersecurity terms using deep learning approaches. Human language technologies - the Baltic perspective: proceedings of the 9th international conference, Baltic HLT, Kaunas, Vytautas Magnus University, Lithuania, 22-23 September 2020. Amsterdam: IOS Press, 2020. p. 39-46.
Lithuanian-English cybersecurity termbase: principles of data collection and structuring. Rasprave: Časopis Instituta za hrvatski jezik i jezikoslovlje 2023, 49, 2, 439-461.
Methodological framework for the development of an English-Lithuanian Cybersecurity Termbase. Kalbų studijos 2021, 39, 85-92.
Preferences of Lithuanian cybersecurity synonymous terms in different user groups. Kalbų studijos 2024, 44, 107-122.

Permalink:

https://www.lituanistika.lt/content/103991

Updated:

2026-04-17 07:55:28
