Tensorizing neural networks

Novikov A., Podoprikhin D., Osokin A., Vetrov D.

Deep neural networks currently demonstrate state-of-the-art performance in several domains. At the same time, models of this class are very demanding in terms of computational resources. In particular, the commonly used fully-connected layers require a large amount of memory, which makes the models hard to use on low-end devices and prevents further increases in model size. In this paper we convert the dense weight matrices of the fully-connected layers to the Tensor Train format, so that the number of parameters is reduced by a huge factor while the expressive power of the layer is preserved. In particular, for the Very Deep VGG networks we report a compression factor of up to 200,000 times for the dense weight matrix of a fully-connected layer, leading to a compression factor of up to 7 times for the whole network.
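To make the parameter savings concrete, the following is a minimal sketch, not the authors' implementation, of how a weight matrix stored in the Tensor Train format can be represented and its parameters counted. It assumes the usual TT-matrix layout, in which a matrix of size (m_1·…·m_d) × (n_1·…·n_d) is stored as d four-dimensional cores of shape (r_{k-1}, m_k, n_k, r_k) with r_0 = r_d = 1. The function names, the mode factorizations and the ranks are illustrative assumptions and do not come from the abstract.

```python
import numpy as np


def tt_param_count(row_modes, col_modes, ranks):
    """Number of stored values in a TT-matrix with the given modes and ranks."""
    # Core k has shape (ranks[k], row_modes[k], col_modes[k], ranks[k + 1]).
    return sum(
        ranks[k] * m * n * ranks[k + 1]
        for k, (m, n) in enumerate(zip(row_modes, col_modes))
    )


def random_tt_cores(row_modes, col_modes, ranks, seed=0):
    """Random TT cores (hypothetical helper); ranks must start and end with 1."""
    rng = np.random.default_rng(seed)
    return [
        rng.standard_normal((ranks[k], m, n, ranks[k + 1]))
        for k, (m, n) in enumerate(zip(row_modes, col_modes))
    ]


def tt_to_dense(cores):
    """Contract the cores back into the dense matrix (small sizes only)."""
    # `result` holds the partial product with shape (rows, cols, rank).
    result = np.ones((1, 1, 1))
    for core in cores:
        _, m, n, r_next = core.shape
        rows, cols, _ = result.shape
        # Sum over the shared rank index, then merge the new row and
        # column modes into the running row and column indices.
        result = np.einsum("abr,rmns->ambns", result, core)
        result = result.reshape(rows * m, cols * n, r_next)
    return result[:, :, 0]


if __name__ == "__main__":
    # Toy sizes chosen for illustration: a 6 x 20 matrix with TT-rank 2.
    row_modes, col_modes, ranks = (2, 3), (4, 5), (1, 2, 1)
    cores = random_tt_cores(row_modes, col_modes, ranks)
    dense = tt_to_dense(cores)
    print("dense entries:", dense.size)
    print("TT parameters:", tt_param_count(row_modes, col_modes, ranks))
```

As a larger illustration under the same assumptions, a hypothetical rank-4 factorization of a 25088 × 4096 matrix with modes (2, 7, 8, 8, 7, 4) and (4, 4, 4, 4, 4, 4) stores 2016 values according to tt_param_count, against 102,760,448 entries in the dense matrix. The ratio achieved in practice depends on the chosen modes and ranks.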

Language: English

Keywords: neural networks

In book

Advances in Neural Information Processing Systems 28 (NIPS 2015)

NY: Curran Associates, 2015.

Neural Network for Real-Time Object Detection on FPGA

Rzaev E., Khanaev A., Amerikanov A., in: 2021 International Conference on Industrial Engineering, Applications and Manufacturing (ICIEAM). IEEE, 2021. P. 719–723.

Added: July 4, 2021

On a model of adaptive management of complex organizational structures

Akopov A. S., Аудит и финансовый анализ 2010 No. 3 P. 310–317

The paper presents a model for adaptive management of vertically integrated companies. The model is based on a systems approach and supports operational management within a single strategic planning cycle, running in accelerated time. To find optimal values of the control parameters, special algorithms of a class ...

Added: September 28, 2012

Advances in Neural Computation, Machine Learning, and Cognitive Research VII

Magaj G., Soroka A., Studies in Computational Intelligence, 2023.

The basis of transfer learning methods is the ability of deep neural networks to use knowledge from one domain to learn in another domain. However, another important task is the analysis and explanation of the internal representations of deep neural network models in the process of transfer learning. Some deep models are known to be ...

Added: October 25, 2023

Artificial Intelligence. RCAI 2021. Lecture Notes in Computer Science

Springer, 2021.

This book constitutes the proceedings of the 19th Russian Conference on Artificial Intelligence, RCAI 2021, held in Moscow, Russia, in October 2021. The 19 full papers and 7 short papers presented in this volume were carefully reviewed and selected from 80 submissions. The conference deals with a wide range of topics, categorized into the following topical ...

Added: October 28, 2021

[Re] “Towards Understanding Grokking”

Alexander Shabalin, Sadrtdinov I., Evgeniy Shabalin, in: ML Reproducibility Challenge 2022. [s.n.], 2023.

Scope of Reproducibility. In this work, we attempt to reproduce the results of the NeurIPS 2022 paper "Towards Understanding Grokking: An Effective Theory of Representation Learning". This study shows that the training process can happen in four regimes: memorization, grokking, comprehension and confusion. We first try to reproduce the results on the toy example described in ...

Added: November 2, 2023

Trusted artificial intelligence: Strengthening digital protection

Avdoshin S. M., Elena Yu. Pesotskaya, Business Informatics 2022 Vol. 16 No. 2 P. 62–73

Added: June 23, 2022

Language barriers in metaverses: the power of neural networks in translation

Osipov D., Евразийский филологический вестник 2023 No. 2 P. 21–39

The metaverse is a shared, virtual space, accessible to users worldwide, offering a platform for global interaction. The physical barriers of geographical location and time are non-existent, allowing for seamless connectivity and interaction. Language barriers within and between metaverses present substantial impediments to fluid interaction and collaboration. Failure to address this linguistic divergence can stifle ...

Added: March 19, 2024

On the structure of idealized cognitive models in acts of transfer

Pushkarev E., Вестник Южно-Уральского государственного университета. Серия: Лингвистика 2015 Vol. 12 No. 4 P. 56–60

The paper theorizes on the general architectonics of idealized cognitive models (ICMs) and their involvement in metonymy and metaphor. The article posits that an ICM's structure should reflect the architecture of the neural network(s) engaged in processing a given concept. The ICM nodes, or cogs, form complex, hierarchically organized neural connections, with the ...

Added: December 8, 2015

The emergence of new objects of legal protection in the digital economy

Kirsanova E., Юрист 2018 No. 11 P. 19–24

The article analyzes the legal status of information. The main interpretations of this term are discussed. An attempt is made to identify the problems of regulating self-learning programs and to offer a classification of existing approaches to determining their legal status. ...

Added: September 13, 2022

15th European Conference, Munich, Germany, September 8-14, 2018, Proceedings

Springer, 2018.

The sixteen-volume set comprising the LNCS volumes 11205-11220 constitutes the refereed proceedings of the 15th European Conference on Computer Vision, ECCV 2018, held in Munich, Germany, in September 2018. The 776 revised papers presented were carefully reviewed and selected from 2439 submissions. The papers are organized in topical sections on learning for vision; computational photography; human analysis; ...

Added: October 30, 2018

Voting: a machine learning approach

Clemens Puppe, Burka D., Szepesváry L. et al., Series ISSN 2190-9806 "KIT Working paper in Economics". 2020. No. 145.

Voting rules can be assessed from quite different perspectives: the axiomatic, the pragmatic, in terms of computational or conceptual simplicity, susceptibility to manipulation, and many other aspects. In this paper, we take the machine learning perspective and ask how ‘well’ a few prominent voting rules can be learned by a neural network. To address this ...

Added: October 31, 2021

Retail investor sentiment in explaining differences in the trading characteristics of Russian market stocks

Teplova T., Sokolova T., Tomtosov A. et al., Журнал Новой экономической ассоциации 2022 Vol. 1 No. 53 P. 53–84

In our paper, we examine for the first time the influence of private investors' sentiment in social networks on the trading characteristics of stocks in the Russian market. Monthly return rates and trading volumes are analyzed, controlling for financial indicators and indicators of the quality of corporate governance of stock ...

Added: April 5, 2022

Segmenting Prostate Cancer on TRUS Images with a Small Dataset: A Comprehensive Methodology

Lyutkin D., Romanov A., Nasonov D., in: 2023 International Russian Smart Industry Conference (SmartIndustryCon), 27-31 March 2023. Sochi: IEEE, 2023. P. 454–459.

The use of mathematical algorithms for disease identification has gained traction in recent years and has paved the way for the creation of novel tools that can swiftly and accurately detect pathologies. In particular, modern machine learning techniques have garnered significant attention in this domain and are currently among the most widely used algorithms. Despite ...

Added: July 30, 2023

The influence of the tonality of CEO letters on a company's financial indicators

Fedorova E., Осетров Р. А., Демин И. С. et al., Российский журнал менеджмента 2017 Vol. 15 No. 4 P. 441–462

The paper is devoted to the analysis of CEO letters as an instrument for influencing the expectations of shareholders and potential investors. The aim of the research is to analyze empirically the influence of semantic characteristics of CEO letters on financial indicators of the company. The authors suggested that CEO letter’s tonality, its length and ...

Added: October 23, 2018

Averaging Weights Leads to Wider Optima and Better Generalization

Izmailov P., Garipov T., Подоприхин Д. А. et al., in: Proceedings of the international conference on Uncertainty in Artificial Intelligence (UAI 2018). [s.n.], 2018. P. 876–885.

Deep neural networks are typically trained by optimizing a loss function with an SGD variant, in conjunction with a decaying learning rate, until convergence. We show that simple averaging of multiple points along the trajectory of SGD, with a cyclical or constant learning rate, leads to better generalization than conventional training. We also show that ...

Added: February 27, 2019

Proceedings of the NeurIPS 2021 Competitions and Demonstrations Track

PMLR, 2022.

Added: July 27, 2022

Non-invasive monitoring of blood glucose by means of wearable tracking technology

Kascheev N. I., Kozyrev O., Leykin M. et al., in: Proceedings of XV IEEE East-West Design & Test Symposium (EWDTS'2017). Piscataway: IEEE, 2017. P. 1–4.

The outcome of our investigation is the development of a new monitoring service for glucose control related to diabetes. It is based on the main results of the research: 1) A new wearable sensor that performs non-invasive measurement of glucose level. The sensor uses several independent technologies simultaneously: radio-frequency with different signal levels, ultrasonic, electromagnetic and thermal; ...

Added: February 20, 2018

Big Transformers for Code Generation

Arutyunov G.A., Avdoshin S. M., Proceedings of the Institute for System Programming of the RAS 2022 Vol. 34 No. 4 P. 79–88

The IT industry has been thriving over the past decades. Numerous new programming languages, architectural patterns and software development techniques have emerged. The tools involved in the process ought to evolve as well. One of the key principles of the new generation of instruments for software development would be the ability of the tools to learn using ...

Added: December 26, 2022

Modelling grain crop yields in agricultural regions using computer vision technologies

Arkhipova M., Экономика региона 2022 Vol. 18 No. 2 P. 581–594

The article examines new methodologies for modelling crop yield in agricultural regions of Russia based on the use of remote capabilities to get information on the field state. The proposed approach can be applied to develop indicator systems and create methodological platforms and models necessary to obtain more accurate estimates. In comparison with the traditional ...

Added: January 12, 2023

How to use neural network and web technologies in modeling complex technical systems

Semenenko M. G., Kniazeva I. V., Beckel L. S. et al., in: IOP Conference Series: Materials Science and Engineering. Vol. 537. Issue 3. Institute of Physics Publishing (IOP), 2019.

Added: October 20, 2021

A Deep Learning Method Study of User Interest Classification

Malafeev A., Nikolaev K., in: Analysis of Images, Social Networks and Texts. 8th International Conference, AIST 2019, Kazan, Russia, July 17–19, 2019, Revised Selected Papers. Communications in Computer and Information Science. Vol. 1086. Springer, 2020. P. 154–159.

In this paper, a deep learning method study is conducted to solve a new multiclass text classification problem, identifying user interests by text messages. We used an original dataset of almost 90 thousand forum text messages, labeled for ten interests. We experimented with different modern neural network architectures: recurrent and convolutional, as well as simpler ...

Added: November 7, 2019

Artificial Intelligence in Music, Sound, Art and Design: 12th International Conference, EvoMUSART 2023, Held as Part of EvoStar 2023, Brno, Czech Republic, April 12–14, 2023, Proceedings

Cham: Springer, 2023.

This book constitutes the refereed proceedings of the 12th European Conference on Artificial Intelligence in Music, Sound, Art and Design, EvoMUSART 2023, held as part of Evo* 2023, in April 2023, co-located with the Evo* 2023 events, EvoCOP, EvoApplications, and EuroGP. The 20 full papers and 7 short papers presented in this book were carefully reviewed ...

Added: April 4, 2023

Forecasting euro and dollar exchange rate quotes using artificial neural networks

Nazarova V., Ульзутуева Б. Д., Управление финансовыми рисками 2016 No. 1 (45) P. 42–57

The first part of the paper gives general information about the foreign exchange market (FOREX) and reviews approaches to forecasting foreign exchange rates. In addition, we consider a new model of nonlinear analysis, the artificial neural network (ANN), to give the research a broader theoretical basis. The nonlinear analysis and the ANN is still ...

Added: February 13, 2017

An experience of medium-term forecasting of changes in sea ice extent in the Northern Hemisphere based on calculations of incoming solar radiation and neural network modelling

Bukharov O., Bogolyubov D., Федоров В. М. et al., Криосфера Земли 2016 No. 3 P. 43–50

Medium-term forecasting of the sea ice extent has been carried out by determining the relationship between incoming solar radiation and the sea ice extent in the Northern Hemisphere. Different methods of statistical and neural network modeling have been used. The forecast shows that the main factor determining the variation of the maximum and minimum ...

Added: August 18, 2016