This work describes the author's previously published and ongoing research on Lithuanian grammar, the need for which arises from the computerisation of the language. In developing the "Lithuanian Grammar Information System", the aim is to avoid the shortcomings observed in earlier work. One of them is the insufficient coverage of words in both corpora and dictionaries. To generate all theoretically possible derivatives and compounds of Lithuanian words, the derivational morphemes have to be analysed. For this reason, a comprehensive analysis of prefixes was carried out at the Institute of the Lithuanian Language (Lietuvių kalbos institutas, hereinafter LKI). Another problem arose from the inconsistent treatment of parts of speech across sources. Because the status assigned to the participle in the "Lithuanian Grammar Information System" differs from the one given in the academic "Lietuvių kalbos gramatika" (Grammar of the Lithuanian Language), an entire subsection of this study, written on the basis of an article on the participle, is devoted to justifying that choice. Another subsection prepared on the basis of an article deals with the digital grammar of the Lithuanian language. The remainder of the study also draws on material from other articles. [From the Preface]
As soon as they first appeared, computers began to spread into most areas of human life, and language is no exception. Much has been accomplished worldwide, and in Lithuania as well. This study describes efforts to computerise grammar, including corpus annotation, morphological analysers and parsers, digital grammar, and a grammar information system. Each chapter covers one of these subjects. The first attempts to combine language with numbers date back to the Middle Ages, but only with the advent of the computer did these ideas become feasible to implement. The latest technology, neural networks, has produced decent results in some areas, yet 100 percent accuracy is still out of reach. Once they started annotating corpora, researchers discovered that tags are strongly affected by the diversity of languages, which is why no uniform annotation tag set has been developed yet. In addition to English tags, the morphologically annotated corpus of the Lithuanian language also provides Lithuanian tags. Many formats have been developed for syntactic annotation, but not all of them have spread equally widely. The Prague Markup Language (PML), designed by Prague researchers, was probably the most popular. It is also used to annotate Lithuanian sentences. The morphological analyser developed by Vytautas Magnus University (VMU) on the basis of the 'Hunspell' platform takes a rule-based approach. 
Morphological analysis of the Lithuanian language based on statistical methods is performed by the 'UDPipe' Lithuanian language module. Developed by the Institute of Mathematics and Informatics, the first morphemic database contains detailed information about morphemes, including their types, yet its contents are not freely accessible. The VMU online morphemic database returns words split into morphemes by hyphens but gives no information about morpheme type. The rule-based parser was accessible on VMU's website until February 2020, and no updated version has been made available yet. The 'UDPipe' module is a parser of the Lithuanian language that relies on statistical methods. The syntactically annotated Lithuanian corpus ALKSNIS was used for the machine learning. Inaccuracies in parsing are primarily caused by sentences that display features specific to Lithuanian, such as word orders not typically found in English. It appears that methods that work well for English cannot always be used with other languages. For now, only a pilot sample of a digital grammar of the Lithuanian language and a limited Sketch grammar are available. The purpose of developing an information system for the grammar of the Lithuanian language is to document Lithuanian grammar. The system stores two types of information: data intended for the general public, presented in an accessible and comprehensive form, and data in a computer-friendly format for research purposes. The website contains both morphological and morphemic data, with the morpheme type indicated. The structure of the word (the lemma and, for derivatives, the underlying words) is shown as well. All inflectional forms can be viewed by clicking the OTHER FORMS button. 
The information on the website is available in seven languages: Lithuanian, English, German, French, Italian, Russian, and Japanese. Only the model for the morphological component is available at this time; the syntactic component is planned for future development. The development of the "Lithuanian Grammar Information System" (LIGIS) has highlighted phenomena that linguists have not covered before: some words lack certain paradigmatic forms because of their semantics, or rather because of the gap between their lexical meaning and their grammatical meaning. For instance, verbs that denote a group action cannot have a singular form, and passive participles formed from intransitive verbs can only have neuter forms. [From the publication]
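The summary contrasts a word that is merely split into morphemes by hyphens with a record that also names the type of each morpheme. The sketch below illustrates that difference. It is only an illustration under stated assumptions: the class `Morpheme`, the helper functions, the type labels, and the segmentation of the example verb "perrašyti" are hypothetical and are not taken from the VMU database, the Institute of Mathematics and Informatics database, or LIGIS.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Morpheme:
    """One morpheme together with its type (hypothetical record layout)."""
    form: str
    kind: str  # e.g. "prefix", "root", "suffix", "ending"


def split_hyphenated(segmented: str) -> list[str]:
    """Split a hyphen-segmented word into its bare morpheme strings."""
    return [part for part in segmented.split("-") if part]


def attach_types(forms: list[str], kinds: list[str]) -> list[Morpheme]:
    """Pair each morpheme string with a type label supplied by the caller."""
    if len(forms) != len(kinds):
        raise ValueError("each morpheme needs exactly one type label")
    return [Morpheme(form, kind) for form, kind in zip(forms, kinds)]


def surface(morphemes: list[Morpheme]) -> str:
    """Rebuild the unsegmented word from its morphemes."""
    return "".join(m.form for m in morphemes)


# Hyphen-only segmentation: morpheme boundaries, but no morpheme types.
forms = split_hyphenated("per-raš-y-ti")

# Typed segmentation: the labels are assumed here for illustration only.
typed = attach_types(forms, ["prefix", "root", "suffix", "ending"])
```

With only the hyphenated string, a program can find morpheme boundaries but cannot tell a prefix from a root; the typed record keeps that distinction available for generating derivatives or filtering by morpheme type.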