A Simple Approach for Handling Out-of-Vocabulary Identifiers in Deep Learning for Source Code

N. Chirkova; S. Troshin

doi:10.18653/v1/2021.naacl-main.26

P. 278–288.

Language: English

Keywords: Transformer source code processing

In book

2021 Annual Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL 2021)

Association for Computational Linguistics, 2021.

Empirical Study of Transformers for Source Code

Chirkova N., Troshin S., in: ESEC/FSE 2021: Proceedings of the 29th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering. Association for Computing Machinery (ACM), 2021. P. 703–715.

Initially developed for natural language processing (NLP), Transformers are now widely used for source code processing, due to the format similarity between source code and text. In contrast to natural language, source code is strictly structured, i.e., it follows the syntax of the programming language. Several recent works develop Transformer modifications for capturing syntactic information ...

Added: August 31, 2021

Empirical Study of Transformers for Source Code

Chirkova N., Troshin S. / Series arXiv "CS". 2020.

Initially developed for natural language processing (NLP), Transformers are now widely used for source code processing, due to the format similarity between source code and text. In contrast to natural language, source code is strictly structured, i.e., it follows the syntax of the programming language. Several recent works develop Transformer modifications for capturing syntactic information ...

Added: October 19, 2020

On the Embeddings of Variables in Recurrent Neural Networks for Source Code

Chirkova N., in: 2021 Annual Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL 2021). Association for Computational Linguistics, 2021. P. 2679–2689.

Source code processing heavily relies on the methods widely used in natural language processing (NLP), but involves specifics that need to be taken into account to achieve higher quality. An example of this specificity is that the semantics of a variable is defined not only by its name but also by the contexts in which ...

Added: August 31, 2021

LIORI at the FinCausal 2020 Shared task

Gordeev D., Davletov A., Rey A. et al., in: Proceedings of the 1st Joint Workshop on Financial Narrative Processing and MultiLing Financial Summarisation. COLING, 2020. P. 45–49.

In this paper, we describe the results of team LIORI at the FinCausal 2020 Shared task held as a part of the 1st Joint Workshop on Financial Narrative Processing and MultiLingual Financial Summarisation. The shared task consisted of two subtasks: classifying whether a sentence contains any causality and labelling phrases that indicate causes and consequences. ...

Added: December 7, 2020

Gorynych Transformer at SemEval-2020 Task 6: Multi-task Learning for Definition Extraction

Davletov A., Arefyev N., Shatilov A. et al., in: Proceedings of the 14th International Workshop on Semantic Evaluation (SemEval-2020). Association for Computational Linguistics, 2020. P. 487–493.

This paper describes our approach to the “DeftEval: Extracting Definitions from Free Text in Textbooks” competition held as part of SemEval 2020. The task was devoted to finding and labeling definitions in texts. DeftEval was split into three subtasks: sentence classification, sequence labeling and relation classification. Our solution ranked 5th in the first subtask and ...

Added: December 7, 2020