Effective post-training quantization of neural networks for inference on a low-power neural accelerator
Deploying neural networks to a target environment is a challenging task, largely because of the heavy burden that DNN models place on computational capabilities and power consumption. For low-power edge devices, such as the GNA neural co-processor, quantization becomes the only way to make deployment possible. This paper draws attention to post-training quantization for low-power devices and demonstrates that this approach is effective in practice. We propose a novel quantization algorithm capable of reducing DNN precision to 16-bit or 8-bit integers with a negligible drop in accuracy (less than 0.1 percent). The elaborated approach is demonstrated on a set of speech recognition networks trained in the Kaldi framework, with the OpenVINO framework as an inference backend that supports quantization and GNA as the target. The influence of quantization on the original topologies was rigorously measured and analyzed.
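For intuition only, the sketch below illustrates generic symmetric per-tensor linear quantization of a float32 weight tensor to int8, followed by dequantization to estimate the introduced error. The scale computation and function names are illustrative assumptions and do not represent the paper's actual algorithm or the OpenVINO/GNA implementation.

```python
import numpy as np

def quantize_int8(weights: np.ndarray):
    """Symmetric per-tensor linear quantization to int8 (illustrative only)."""
    max_abs = float(np.max(np.abs(weights)))
    scale = max_abs / 127.0 if max_abs > 0 else 1.0   # map [-max_abs, max_abs] -> [-127, 127]
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover an approximate float32 tensor for accuracy comparison."""
    return q.astype(np.float32) * scale

# Example: measure the quantization error on a random weight matrix.
w = np.random.randn(256, 256).astype(np.float32)
q, scale = quantize_int8(w)
print("max abs error:", np.max(np.abs(w - dequantize(q, scale))))
```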