Neural Networks Compression for Language Modeling
P. 351-357.
In this paper, we consider several compression techniques for the language modeling problem based on recurrent neural networks (RNNs). It is known that conventional RNNs, e.g., LSTM-based networks in language modeling, are characterized by either high space complexity or substantial inference time. This problem is especially crucial for mobile applications, in which the constant interaction with the remote server is inappropriate. By using the Penn Treebank (PTB) dataset we compare pruning, quantization, low-rank factorization, and tensor train decomposition for LSTM networks in terms of model size and suitability for fast inference.
Popova A. S., Rassadin A. G., Ponomarenko A. A., in : Proceedings of the XXIV International Scientific and Technical Conference "Information Systems and Technologies-2018". : [s.n.], 2018. P. 1083-1089.
The paper addresses automatic classification of emotions in a digital audio signal. It examines and verifies an approach in which an audio fragment is classified by a recurrent neural network with long short-term memory. Mel-frequency cepstral coefficients were used as features. A numerical experiment was carried out on the open Ravdess dataset, which covers 8 different emotions: "neutral", "calm", "happy", "sad", "angry", "fearful", "disgust", "surprised" ...
Added: October 21, 2018
Losev Ivan, Compositio Mathematica 2017 Vol. 153 No. 12 P. 2445-2481
In this paper we study categories O over quantizations of symplectic resolutions admitting Hamiltonian tori actions with finitely many fixed points. In this generality, these categories were introduced by Braden, Licata, Proudfoot and Webster. We establish a family of standardly stratified structures (in the sense of the author and Webster) on these categories O. We ...
Added: October 15, 2017
Natalia Sizykh, Said Dandamaev, Dmitry Sizykh, in : 16th International Conference Management of large-scale system development (MLSD). IEEE, 2023. P. 1-5.
Forecasting data and research on cryptocurrency price forecasting methods are increasing in importance. So far, methods based on LSTM deep learning architecture have shown the best results in forecasting cryptocurrency prices. In order to improve the accuracy of forecasting data, this paper investigates the application of a multivariate multistep forecasting method based on the LSTM ...
Added: December 22, 2023
Pereskokov A., Lipskaya A. V., Vestnik Moskovskogo energeticheskogo instituta (Bulletin of the Moscow Power Engineering Institute) 2010 No. 6 P. 99-109
Radially symmetric solutions of a Hartree-type equation containing both a Coulomb potential and an integral nonlinearity with a Yukawa interaction potential are considered. Equations for the self-consistent potential are derived and studied in the semiclassical approximation. A Bohr-Sommerfeld-type quantization rule is written out. Asymptotic eigenvalues and eigenfunctions are found. ...
Added: December 16, 2012
Vologodsky V., Finkelberg M. V., Bezrukavnikov R., Cambridge Journal of Mathematics 2014 Vol. 2 No. 2 P. 163-190
Marc Haiman has reduced Macdonald Positivity Conjecture to a statement about geometry of the Hilbert scheme of points on the plane, and formulated a generalization of the conjecture where the symmetric group is replaced by the wreath product of S_n and Z/rZ. He has proven the original conjecture by establishing the geometric statement about the ...
Added: December 17, 2015
Cham : Birkhäuser, 2020
The book consists of articles based on the XXXVIII Białowieża Workshop on Geometric Methods in Physics, 2019. The series of Białowieża workshops, attended by a community of experts at the crossroads of mathematics and physics, is a major annual event in the field. The works in this book, based on presentations given at the workshop, ...
Added: November 3, 2021
Neklyudov K. O., Molchanov D., Ashukha A. et al., in : Advances in Neural Information Processing Systems 30 (NIPS 2017). Montreal : Curran Associates, 2017. P. 6776-6785.
Dropout-based regularization methods can be regarded as injecting random noise with pre-defined magnitude to different parts of the neural network during training. It was recently shown that Bayesian dropout procedure not only improves generalization but also leads to extremely sparse neural architectures by automatically setting the individual noise magnitude per weight. However, this sparsity can ...
Added: January 29, 2018
Sofia : Avangard Prima, 2018
Added: January 31, 2018
Manakhov P., Kovshov E. E., Prikladnaya informatika (Applied Informatics) 2012 No. 3(39) P. 71-81
The article examines the issue of developing models of the text input methods. The urgency of this matter is dictated by the reduction of financial costs of designing new input methods and upgrading existing ones. The article suggests a modeling method, which is verified by a series of experiments. Also the article gives recommendations on ...
Added: January 17, 2015
Grachev A., Ignatov D. I., Savchenko A., Applied Soft Computing Journal 2019 Vol. 79 P. 354-362
Recurrent neural networks have proved to be an effective method for statistical language modeling. However, in practice their memory and run-time complexity are usually too large to be implemented in real-time offline mobile applications. In this paper we consider several compression techniques for recurrent neural networks including Long Short-Term Memory models. We pay particular attention ...
Added: June 12, 2019
Beknazarov N., Jin S., Poptsova M., Scientific Reports 2020 Vol. 10 P. 19134
Computational methods to predict Z-DNA regions are in high demand to understand the functional role of Z-DNA. The previous state-of-the-art method Z-Hunt is based on statistical mechanical and energy considerations about B- to Z-DNA transition using sequence information. Z-DNA ChIP-seq experiment results showed little overlap with Z-Hunt predictions implying that sequence information only is not ...
Added: December 11, 2020
Eduard Gorbunov, Kovalev D., Makarenko D. et al., in : Advances in Neural Information Processing Systems 33 (NeurIPS 2020). Curran Associates, Inc., 2020. P. 20889-20900.
Added: December 7, 2020
Klyshinskiy E., Rysakov S. V., Novye informatsionnye tekhnologii v avtomatizirovannykh sistemakh (New Information Technologies in Automated Systems) 2016
The article introduces the reader to the basic concepts of parametric optimization. It describes the probability approximation model that was developed, counter functions, and correlation coefficients. Some attention is given to the exhaustive search method, which yielded new accuracy figures. The article closes with a modification of the homonymy resolution method developed by the authors. ...
Added: June 14, 2016
Le Gouic T., Paris Q., Electronic journal of statistics 2018 Vol. 12 No. 2 P. 4239-4263
In this paper, we define and study a new notion of stability for the k-means clustering scheme building upon the field of quantization of a probability measure. We connect this definition of stability to a geometric feature of the underlying distribution of the data, named absolute margin condition, inspired by recent works on the subject. ...
Added: November 9, 2018
Kusakin I. K., Fedorets O. V., Romanov A., Nauchno-tekhnicheskaya informatsiya. Seriya 2: Informatsionnye protsessy i sistemy (Scientific and Technical Information. Series 2: Information Processes and Systems) 2022 Vol. 12 P. 6-9
This paper discusses modern approaches to natural language processing and the application of artificial intelligence technologies to the task of classifying scientific texts in Russian. The report contains an analysis of implementations of text vectorization methods, a description of experiments with training various classifier models: from classical machine learning algorithms to neural network transformer architectures. ...
Added: January 31, 2023
Sergeev A., European Mathematical Society Publishing house, 2014
This book is based on a lecture course given by the author at the Educational Center of the Steklov Mathematical Institute in 2011. It is designed for a one-semester course for undergraduate students familiar with basic differential geometry and complex and functional analysis.
The universal Teichmüller space T is the quotient of the space of quasisymmetric ...
Added: April 9, 2015
I. K. Kusakin, Fedorets O. V., A. Y. Romanov, Scientific and Technical Information Processing 2023 Vol. 50 No. 3 P. 176-183
This paper discusses modern approaches to natural language processing and the application of machine learning models to the task of classifying short scientific texts in Russian. This study is devoted to the analysis of methods for vectorization of textual information, selection of a model for scientific paper classification, and training of linguistic model BERT ...
Added: November 4, 2023
Vukovic D., Romanyuk K., Ivashchenko S. et al., Expert Systems with Applications 2022 Vol. 194 No. May 2022 Article 116553
This paper investigates the forecasting performance for credit default swap (CDS) spreads by Support Vector Machines (SVM), Group Method of Data Handling (GMDH), Long Short-Term Memory (LSTM) and Markov switching autoregression (MSA) for daily CDS spreads of the 513 leading US companies, in the period 2009–2020. The goal of this study is to test the forecasting performance of ...
Added: February 4, 2022
Feigin B. L., Russian Mathematical Surveys 2017 Vol. 72 No. 4 P. 707-763
This paper discusses the main known constructions of vertex operator algebras. The starting point is the lattice algebra. Screenings distinguish subalgebras of lattice algebras. Moreover, one can construct extensions of vertex algebras. Combining these constructions gives most of the known examples. A large class of algebras with big centres is constructed. Such algebras have applications ...
Added: November 5, 2020
M. : Russian State University for the Humanities, 2019
The book includes 61 reports of the International conference on computer and intellectual technology "Dialogue-2019", representing a wide range of theoretical and applied research in the field of natural language description, modeling of language processes, creating practically applicable computer linguistic technologies. For specialists in the field of theoretical and applied linguistics and intellectual technologies. ...
Added: June 12, 2019
Takasaki K., Takebe T., Teoreticheskaya i matematicheskaya fizika (Theoretical and Mathematical Physics, Russian Federation) 2012 Vol. 171 No. 2 P. 683-690
We briefly review a recursive construction of ħ-dependent solutions of the Kadomtsev-Petviashvili hierarchy. We give recurrence relations for the coefficients X_n of an ħ-expansion of the operator X = X_0 + ħ X_1 + ħ^2 X_2 + ... for which the dressing operator W is expressed in the exponential form W = exp(X/ħ). The wave ...
Added: June 22, 2012
Tikhonova M., Mikhailov V., Dina Pisarevskaya et al., Natural Language Engineering 2022 P. 1-30
Recent research has reported that standard fine-tuning approaches can be unstable due to being prone to various sources of randomness, including but not limited to weight initialization, training data order, and hardware. Such brittleness can lead to different evaluation results, prediction confidences, and generalization inconsistency of the same models independently fine-tuned under the same experimental setup. ...
Added: May 21, 2022
Kodryan M., Grachev A., Ignatov D. I. et al., in : Proceedings of the 4th Workshop on Representation Learning for NLP (RepL4NLP-2019). Issue W19-43. Association for Computational Linguistics, 2019. P. 40-48.
Reduction of the number of parameters is one of the most important goals in Deep Learning. In this article we propose an adaptation of Doubly Stochastic Variational Inference for Automatic Relevance Determination (DSVI-ARD) for neural network compression. We find this method to be especially useful in language modeling tasks, where a large number of parameters in ...
Added: November 1, 2019