Reflections of syntactic structures in nonautoregressive language models
Плетенев С. А.
In book
Issue 20. Russian State University for the Humanities, 2021
Kutuzov A. B., Никишина И. А., in : Analysis of Images, Social Networks and Texts. 8th International Conference, AIST 2019, Lecture Notes in Computer Science, Revised Selected Papers. Vol. 11832.: Cham : Springer, 2019. P. 3-8.
Double-blind peer reviewing has been proved to be a pretty effective and fair way of academic work selection. However, to the best of our knowledge, nobody has yet analysed the effects caused by its introduction at the Russian NLP conferences. We investigate how the double-blind peer reviewing influences gender and location (according to authors’ affiliations) ...
Added: January 20, 2020
Chirkova N., Lobacheva E., Vetrov D., in : Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing. : Association for Computational Linguistics, 2018. P. 2910-2915.
In natural language processing, many tasks are successfully solved with recurrent neural networks, but such models have a huge number of parameters. The majority of these parameters are often concentrated in the embedding layer, whose size grows proportionally to the vocabulary length. We propose a Bayesian sparsification technique for RNNs which allows ...
Added: September 5, 2018
Afanasev I. / NRU HSE. Series WP BRP "Linguistics". 2021.
The article considers a lemmatiser that is developed specifically for Old Church Slavonic (OCS). The introduction underlines the problem of the lack of lemmatisers that might deal with different datasets of the OCS. The review gives a short description of previous attempts and current trends in lemmatisation. The lemmatiser is hybrid-based and uses the advantages ...
Added: December 28, 2021
Association for Computational Linguistics, 2018
Added: September 5, 2018
Pletnev Sergey, in : Crowd Science Workshop: Trust, Ethics, and Excellence in Crowdsourced Data Management at Scale (CSW 2021). : Copenhagen, Denmark : CEUR Workshop Proceedings, 2021. Ch. 1. P. 15-20.
Most speech-driven systems on the first step convert audio to text through an automatic speech recognition (ASR) model and then pass the text to any downstream natural language processing (NLP) modules. However, these ASR models can lead to system failure or undesirable output when being exposed to natural language perturbation or variation in practice. In ...
Added: December 13, 2021
Kutuzov A. B., Velldal E., Øvrelid L., in : Proceedings of The 20th SIGNLL Conference on Computational Natural Language Learning. : Berlin : Association for Computational Linguistics, 2016. P. 115-125.
This paper studies how word embeddings trained on the British National Corpus interact with part of speech boundaries. Our work targets the Universal PoS tag set, which is currently actively being used for annotation of a range of languages. We experiment with training classifiers for predicting PoS tags for words based on their embeddings. The ...
Added: November 12, 2016
Marseille : Association pour le Traitement Automatique des Langues, 2014
Following the first TALAf workshop, held on 8 June 2012 in Grenoble during the JEP-TALN-RECITAL 2012 conference (see the proceedings: http://aclweb.org/anthology//W/W12/#1300), we propose a new edition of this workshop at the TALN 2014 conference on 1 July in Marseille.
This second edition shows the value of a French-language workshop on the processing ...
Added: March 26, 2015
Lyashevskaya O., Droganova K., Zeman D. et al. / NRU HSE. Series WP BRP "Linguistics". 2016. No. 44.
This paper presents the Universal Dependencies tagset (UD v1) as a new annotation scheme for Russian treebanks. The universal list of dependency relations was adopted and extended to comply with certain language-specific syntactic constructions. The tagset was validated, converting two Russian treebanks into the UD format, UD-Russian-SynTagRus and UD-Russian-Google. ...
Added: December 14, 2016
Wohlgenannt G., von Waldenfels R., Toldova S. et al., Manchester : EasyChair, 2019
The EPiC Series in Language and Linguistics publishes high quality collections of papers in language, linguistics and related areas. ...
Added: September 9, 2019
Braslavski P., Markov I., Pardalos P. M. et al., ACM SIGIR Forum 2016 Vol. 49 No. 2 P. 72-79
This paper provides the reader with a report on the 9th Russian Summer School in Information Retrieval (RuSSIR 2015). ...
Added: February 27, 2017
Пономарева М. А., Дроганова К. А., Smurov I. et al., Florence : Association for Computational Linguistics, 2019
This paper provides a comprehensive overview of the gapping dataset for Russian that consists of 7.5k sentences with gapping (as well as 15k relevant negative sentences) and comprises data from various genres: news, fiction, social media and technical texts. The dataset was prepared for the Automatic Gapping Resolution Shared Task for Russian (AGRR-2019) - a ...
Added: September 5, 2019
The Use of Khislavichi Lect Morphological Tagging to Determine its Position in the East Slavic Group
Afanasev I., in : Proceedings of Tenth Workshop on NLP for Similar Languages, Varieties and Dialects (VarDial 2023). : Association for Computational Linguistics, 2023. P. 174-186.
The study of low-resourced East Slavic lects is becoming increasingly relevant as they face the prospect of extinction under the pressure of standard Russian while being treated by academia as an inferior part of this lect. The Khislavichi lect, spoken in a settlement on the border of Russia and Belarus, is a perfect example of ...
Added: May 15, 2023
Association for Computational Linguistics, 2019
This document describes the findings of the Third Workshop on Neural Generation and Translation, held in concert with the annual conference of the Empirical Methods in Natural Language Processing (EMNLP 2019). ...
Added: January 7, 2021
Berlin : Association for Computational Linguistics, 2016
The 2016 Conference on Computational Natural Language Learning is the twentieth in the series of annual meetings organized by SIGNLL, the ACL special interest group on natural language learning. CoNLL 2016 will be held on August 11-12, 2016, and is co-located with the 54th annual meeting of the Association for Computational Linguistics (ACL) in Berlin, ...
Added: November 12, 2016
Dereza O., in : Artificial Intelligence and Natural Language, 7th International Conference, AINL 2018, St. Petersburg, Russia, October 17–19, 2018, Proceedings. Issue 930.: Switzerland : Springer, 2018. P. 35-47.
Lemmatisation, which is one of the most important stages of text preprocessing, consists in grouping the inflected forms of a word together so they can be analysed as a single item. This task is often considered solved for most modern languages regardless of their morphological type, but the situation is dramatically different for ancient languages. Rich inflectional system and ...
Added: November 14, 2018
Kutuzov A. B., Kuzmenko E., Marakasova A., in : Proceedings of the Workshop on Language Technology Resources and Tools for Digital Humanities (LT4DH). : Osaka : [s.n.], 2016. P. 26-34.
We present an approach to detect differences in lexical semantics across English language registers, using word embedding models from the distributional semantics paradigm. Models trained on register-specific subcorpora of the BNC corpus are employed to compare lists of nearest associates for particular words and draw conclusions about their semantic shifts depending on the register in which they ...
Added: November 12, 2016
Zhukov L. E., Sukharev J., Popescul A., in : Proceedings of 14th International Conference on Data Mining (ICDM 2014). : NY : IEEE Computer Society, 2014. P. 995-1000.
Record linkage, or entity resolution, is an important area of data mining. Name matching is a key component of systems for record linkage. Alternative spellings of the same name are a common occurrence in many applications. We use the largest collection of genealogy person records in the world together with user search query logs to ...
Added: March 18, 2015
Marseille : European Language Resources Association (ELRA), 2022
The proceedings are organised on the basis of the 22 Tracks of the Conference on Language Resources and Evaluation (LREC) held in Marseille, France, from 20 to 25 June 2022. Major topics include corpora and annotation (including tools, systems, treebanks), information extraction and information retrieval (including NER, QA, text mining, document classification, text categorisation), applications involving LRs and evaluation (including ...
Added: February 22, 2023
Trnavac R., Poldvere N., Corpus Pragmatics 2024
The present corpus study, which is grounded in Appraisal Theory, investigates evaluative language use in fake news in English. The primary aim is to find out how and why, if at all, evaluative meanings are construed differently in fake news compared to genuine news. The secondary aim is to explore potential differences between types of ...
Added: October 26, 2023
Kirill Maslinsky, in : TALN-RECITAL 2014 Workshop TALAf 2014 : Traitement Automatique des Langues Africaines (TALAf 2014: African Language Processing). : Marseille : Association pour le Traitement Automatique des Langues, 2014. P. 114-122.
This article provides a brief overview of the Daba software package created in the course of building corpora for Manding languages. Key software features are motivated by the tasks and problems characteristic of many African languages. The corpus-building model proposed here was initially developed for the Bambara Reference Corpus, which is available online and is freely accessible. ...
Added: March 26, 2015
Kirina M., Human Being: Image and Essence. Humanitarian Aspects 2023
The article focuses on the application of opinion mining techniques to evaluate user experience on the Hyperskill educational platform, using Python, Java, and Kotlin programming projects as the basis of analysis. The study utilizes sentiment analysis and keyword extraction methods to gauge users' attitudes towards the platform, learning process, and topics covered. To achieve this, ...
Added: December 9, 2023
NY : Springer, 2014
This book constitutes the proceedings of the Third International Conference on Analysis of Images, Social Networks and Texts, AIST 2014, held in Yekaterinburg, Russia, in April 2014. The 11 full and 10 short papers were carefully reviewed and selected from 74 submissions. They are presented together with 3 short industrial papers, 4 invited papers and ...
Added: November 4, 2014
Karpov N., in : Modern Problems of Informatization in the Analysis and Synthesis of Technological and Software-Telecommunication Systems: Collected Papers. Issue 17.: Voronezh : Nauchnaya Kniga, 2012. P. 264-266.
Added: November 7, 2012
Copenhagen, Denmark : CEUR Workshop Proceedings, 2021
The second workshop on Crowd Science is organized in conjunction with the 47th International Conference on Very Large Data Bases (VLDB 2021). This workshop is the second in a series of events that has the goal of helping crowdsourcing “transition” from art to science, and tackles the research challenges that we face to make crowdsourcing ...
Added: December 13, 2021