NB-MLM: Efficient Domain Adaptation of Masked Language Models for Sentiment Analysis

Arefyev N.; Kharchev D.; Shelmanov A.

P. 9114–9124.

While Masked Language Models (MLM) are pre-trained on massive datasets, additional training with the MLM objective on domain- or task-specific data before fine-tuning for the final task is known to improve the final performance. This is usually referred to as the domain or task adaptation step. However, unlike the initial pre-training, this step is performed for each domain or task individually and is still rather slow, requiring several GPU days compared to the several GPU hours required for the final task fine-tuning. We argue that the standard MLM objective leads to inefficiency when it is used for the adaptation step because it mostly learns to predict the most frequent words, which are not necessarily related to the final task. We propose a technique for more efficient adaptation that focuses on predicting words with large weights of the Naive Bayes classifier trained for the task at hand, which are likely more relevant than the most frequent words. The proposed method provides faster adaptation and better final performance for sentiment analysis compared to the standard approach.
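The idea of steering masking toward words with large Naive Bayes weights can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify how weights are turned into masking decisions, so the smoothed log-ratio weight, the softmax with a `temperature` parameter, and the function names `nb_weights`, `masking_distribution`, and `sample_mask_positions` are all assumptions made for illustration.

```python
import math
import random
from collections import Counter


def nb_weights(docs, labels, alpha=1.0):
    """Assumed weight: |log P(w | positive) / P(w | negative)| with add-alpha
    smoothing, using document-level presence counts and binary labels (0/1)."""
    pos, neg = Counter(), Counter()
    for tokens, y in zip(docs, labels):
        (pos if y == 1 else neg).update(set(tokens))
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    weights = {}
    for w in set(pos) | set(neg):
        p = (pos[w] + alpha) / (n_pos + 2 * alpha)
        q = (neg[w] + alpha) / (n_neg + 2 * alpha)
        weights[w] = abs(math.log(p / q))
    return weights


def masking_distribution(tokens, weights, temperature=1.0):
    """Softmax over token weights; unseen tokens get weight 0 (assumption)."""
    if not tokens:
        return []
    scores = [weights.get(t, 0.0) / temperature for t in tokens]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sample_mask_positions(tokens, weights, n_mask, temperature=1.0, rng=None):
    """Sample distinct positions to mask, favouring high-weight tokens."""
    rng = rng or random.Random(0)
    probs = masking_distribution(tokens, weights, temperature)
    positions = list(range(len(tokens)))
    chosen = []
    for _ in range(min(n_mask, len(tokens))):
        idx = rng.choices(range(len(positions)), weights=probs, k=1)[0]
        chosen.append(positions.pop(idx))
        probs.pop(idx)
    return sorted(chosen)


# Toy labelled corpus (made up for illustration): 1 = positive, 0 = negative.
docs = [
    ["great", "movie"],
    ["the", "movie", "was", "awful"],
    ["great", "the", "acting"],
    ["awful", "the", "plot"],
]
labels = [1, 0, 1, 0]
weights = nb_weights(docs, labels)
sentence = ["the", "movie", "was", "great"]
probs = masking_distribution(sentence, weights)
masked = sample_mask_positions(sentence, weights, n_mask=2)
```

In this toy corpus "movie" appears equally often in both classes and gets weight zero, while "great" appears only in positive documents and receives the largest masking probability. Uniform random masking, by contrast, would select the frequent but uninformative "the" as often as "great".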

Language: English

Keywords: masked language models

Publication based on the results of:

Development of Mathematical Models and Methods for Recommender Systems and Natural Language Processing (2020)

In book

Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing (EMNLP)

Association for Computational Linguistics, 2021.

RuSentEval: Linguistic Source, Encoder Force!

Mikhailov V., Taktasheva E., Sigdel E. S. et al., in: Proceedings of the 8th Workshop on Balto-Slavic Natural Language Processing. Association for Computational Linguistics, 2021. P. 43–65.

The success of pre-trained transformer language models has brought a great deal of interest in how these models work, and what they learn about language. However, prior research in the field is mainly devoted to English, and little is known regarding other languages. To this end, we introduce RuSentEval, an enhanced set of 14 probing ...

Added: September 27, 2021

Artificial Text Detection via Examining the Topology of Attention Maps

Kushnareva L., Cherniavskii D., Mikhailov V. et al., in: Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing (EMNLP). Association for Computational Linguistics, 2021. P. 635–649.

Added: September 27, 2021