
 


ALOE: Boosting Large Language Model Fine-Tuning with Aggressive Loss-Based Elimination of Samples

P. 3980–3986.
Demidovskij A., Trutnev A. I., Tugaryov A. M., Salnikov I. G.

Modern neural network training and fine-tuning require substantial computational resources, so there is strong demand for novel, specialized algorithms that make training efficient and cost-effective. Aggressive Loss-based Elimination of Samples (ALOE) is a method that selects training samples based on losses obtained from the model currently being trained or from a pre-trained one. ALOE is designed to accelerate the fine-tuning of Large Language Models (LLMs) and integrates with the state-of-the-art Parameter-Efficient Fine-Tuning method LoRA. ALOE is a two-stage acceleration method; its stages are called offline and online. The method is based on the idea that reducing the number of samples according to a certain rule decreases the number of training steps and thus the overall LLM fine-tuning time. This reduction makes it possible either to obtain a fine-tuned model faster or to perform more training iterations within the same time as the fine-tuning baseline (ALOE Max). ALOE (Offline) reduces the dataset before fine-tuning starts, while ALOE (Online) selects training samples from each training batch during fine-tuning. Results demonstrate an average acceleration of 45.6% across six models (GPT-2 S, GPT-2 M, DeBERTa-V2-XL, LLaMA-7B, LLaMA-2-7B, LLaMA-2-13B), with an average accuracy improvement of 5.91% compared with fine-tuning results obtained with LoRA. ALOE (Offline) accelerates GPT-2 M fine-tuning on E2E-NLG by up to 92% with a 1.2% BLEU improvement.
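The abstract does not state the exact elimination rule, so the following is only a minimal sketch of loss-based sample selection under stated assumptions. It assumes, hypothetically, that the samples with the highest loss are kept and the rest are dropped; the function names (`select_by_loss`, `offline_reduce`, `online_batches`, `training_steps`), the `keep_ratio` parameter, and the toy loss function are all illustrative and are not taken from the paper. In a real setting `loss_fn` would be a forward pass of the pre-trained or currently trained model.

```python
import math
from typing import Callable, Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def select_by_loss(samples: Sequence[T], losses: Sequence[float],
                   keep_ratio: float) -> List[T]:
    """Keep the `keep_ratio` fraction of samples with the highest loss.

    Assumption: "highest loss wins" is a hypothetical rule chosen for
    illustration; the abstract only says selection is loss-based.
    """
    if len(samples) != len(losses):
        raise ValueError("samples and losses must have the same length")
    if not 0.0 < keep_ratio <= 1.0:
        raise ValueError("keep_ratio must be in (0, 1]")
    n_keep = max(1, math.ceil(len(samples) * keep_ratio))
    # Rank indices by loss, highest first, then restore the original order.
    ranked = sorted(range(len(samples)), key=lambda i: losses[i], reverse=True)
    kept = sorted(ranked[:n_keep])
    return [samples[i] for i in kept]


def offline_reduce(dataset: Sequence[T], loss_fn: Callable[[T], float],
                   keep_ratio: float) -> List[T]:
    """Offline stage (sketch): shrink the whole dataset once, before training."""
    losses = [loss_fn(x) for x in dataset]
    return select_by_loss(dataset, losses, keep_ratio)


def online_batches(dataset: Sequence[T], batch_size: int,
                   loss_fn: Callable[[T], float],
                   keep_ratio: float) -> Iterator[List[T]]:
    """Online stage (sketch): select samples inside each batch during training."""
    for start in range(0, len(dataset), batch_size):
        batch = dataset[start:start + batch_size]
        losses = [loss_fn(x) for x in batch]
        yield select_by_loss(batch, losses, keep_ratio)


def training_steps(n_samples: int, batch_size: int, epochs: int = 1) -> int:
    """Number of optimizer steps needed to pass over n_samples."""
    return epochs * math.ceil(n_samples / batch_size)


if __name__ == "__main__":
    data = list(range(100))              # toy "samples"

    def toy_loss(x: int) -> float:       # stand-in for a model forward pass
        return float(x)

    reduced = offline_reduce(data, toy_loss, keep_ratio=0.5)
    print(training_steps(len(data), batch_size=10))     # steps on full data
    print(training_steps(len(reduced), batch_size=10))  # steps after reduction
```

With these toy numbers, halving the dataset offline halves the number of steps per epoch, which is the effect the abstract describes. In this sketch the online stage keeps the number of batches unchanged and shrinks each batch instead; whether the actual method regroups the selected samples is not specified in the abstract.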

Language: English
Keywords: artificial neural networks, fine-tuning acceleration

In book

Frontiers in Artificial Intelligence and Applications: 27th European Conference on Artificial Intelligence, 19–24 October 2024, Santiago de Compostela, Spain
Vol. 392. IOS Press Ebooks, 2024.
Similar publications
Hebb-Inspired Low Rank Adapters for Large Language Models Fine-Tuning
Alexander Demidovskij, Artyom Tugaryov, Igor Salnikov et al., in: PRICAI 2025: Trends in Artificial Intelligence: 22nd Pacific Rim International Conference on Artificial Intelligence, PRICAI 2025, Wellington, New Zealand, November 17–21, 2025, Proceedings, Part III. Vol. 16453. Springer, 2026. P. 603–612.
Backpropagation is the predominant method for pre-training and fine-tuning Large Language Models. At the same time, it is considerably demanding in terms of memory and hardware. Therefore, it makes fine-tuning and pre-training very expensive, harmful for the environment due to the large carbon footprint, and raises the blocks for the development of ...
Added: April 21, 2026
PRICAI 2025: Trends in Artificial Intelligence: 22nd Pacific Rim International Conference on Artificial Intelligence, PRICAI 2025, Wellington, New Zealand, November 17–21, 2025, Proceedings, Part III
Springer, 2026.
These proceedings contain the papers presented at the 22nd Pacific Rim International Conference on Artificial Intelligence (PRICAI), held on November 17–21, 2025 in Wellington, New Zealand. PRICAI 2025 was co-hosted with the 40th International Conference on Image and Vision Computing New Zealand (IVCNZ 2025) and the annual conference of the New Zealand Artificial Intelligence Researchers ...
Added: April 21, 2026
Semi-automatic annotation of brain vessels in magnetic resonance angiography images
Bernadotte A., Elfimov N., Menshikov I., Scientific Data 2025 Vol. 13 No. 41
Accurate segmentation of brain vessels in magnetic resonance angiography (MRA) is essential for surgical procedures. Neural networks are powerful tools for medical image segmentation, but their development requires well-annotated datasets. However, publicly available MRA datasets with detailed vessel annotations are scarce. We present a dataset of 100 manually annotated brain MRA images from the IXI ...
Added: February 25, 2026
Tests as Assessment Tools in Universities: Challenges and Solutions
Antipkina I., Ivanushchenko A. V., Kalabina I. A. et al., Мир психологии. Научно-методический журнал 2025 No. 4(123) P. 295–316
Low-quality test items pose significant risks of biased and inaccurate assessment in higher education. In this study, multi-disciplinary test banks were examined, first, using classical test theory and then using a Large Language Model (Grok). Our findings reveal a number of problems in university test items due to methodological shortcomings rather than content inaccuracies. Based ...
Added: January 22, 2026
Performance Study of Modern Zeroth-Order Optimization Methods for LLM Fine-Tuning
A. V. Demidovskij, A. I. Trutnev, Optical Memory and Neural Networks (Information Optics) 2025 Vol. 34 No. Suppl. 1 P. S16–S29
Large Language Models (LLMs) are widely employed across a broad range of applications due to their versatility and state-of-the-art performance. However, as usage scenarios grow, there is a pressing demand for task-specific adaptation of LLMs through fine-tuning. While full fine-tuning (FT) remains the most preferred in terms of quality, its high memory and computation requirements ...
Added: December 22, 2025
On the Influence of Layer Importance on LLM Fine-Tuning Acceleration and Quality
Demidovskij A., Irina Novikova, Artyom Tugaryov et al., in: Frontiers in Artificial Intelligence and Applications: 28th European Conference on Artificial Intelligence, 25-30 October 2025, Bologna, Italy. Vol. 413. IOS Press Ebooks, 2025. P. 4233–4240.
Large Language Models (LLMs) have become central advancements in artificial intelligence, particularly in machine learning, natural language processing, and computer vision. Their ability to understand and generate human-like text has made them crucial in applications ranging from automated translation to text generation. Despite the vast capabilities of pre-trained LLMs, their deployment in specialized domains often ...
Added: October 23, 2025
Going Beyond LoRA Fine-Tuning with Hebb Learning: Blazingly Fast and Accurate
Demidovskij A., Igor Salnikov, Olga Frolova et al., in: Frontiers in Artificial Intelligence and Applications: 28th European Conference on Artificial Intelligence, 25-30 October 2025, Bologna, Italy. Vol. 413. IOS Press Ebooks, 2025. P. 2426–2433.
Modern Multimodal Large Language Models place increased demands on the computational resources required for both pretraining and fine-tuning. This challenge is primarily attributed to the backpropagation step, because computing gradients is time-consuming and memory-intensive. This paper aims to alleviate these issues and introduces a novel fine-tuning strategy. Low-Rank Adaptation with Hebb Rapid Optimization ...
Added: October 23, 2025
Formulating Requirements for the Technological Parameters of Serial Production Using a Neural Network Approach
Yasnitsky L., Goldobin M. A., Прикладная информатика 2025 Vol. 20 No. 3(117) P. 85–100
Currently, artificial intelligence methods are widely used in the practice of serial production enterprises. They are used to detect defects, classify and eliminate them, identify the causes of defects, predict the quality and properties of the resulting product, select optimal parameters of the production process, and identify and study its patterns. However, outside the field ...
Added: July 10, 2025
Economic and Social Aspects of Nuclear Energy amid the Development of Artificial Intelligence Technologies
Podchufarov A., Galkina A. N., Vanina S. S. et al., Экономика и управление: проблемы, решения 2025 Vol. 5 No. 4 P. 61–74
Under modern conditions, the introduction of artificial intelligence technologies is becoming a significant factor in the development of high-tech industries. The article presents the results of a study of the prospects for the use of intelligent analytical systems in nuclear energy. The experience of foreign countries is analyzed and the features of successful projects using ...
Added: June 5, 2025
Comprehensive Weight Decomposition Analysis of Modern Parameter-Efficient Methods
A.V. Demidovskij, I.G. Salnikov, A.M. Tugaryov et al., Optical Memory and Neural Networks (Information Optics) 2024 Vol. 33 No. 3 P. S513–S522
Fine-tuning of Large Language Models is an essential part of modern artificial intelligence systems that solve numerous tasks, such as natural language processing and computer vision. Among the various fine-tuning strategies, the most prominent approach for Large Language Model fine-tuning is Parameter-Efficient Fine-Tuning (PEFT), as it makes it possible to achieve state-of-the-art performance on multiple tasks while minimizing ...
Added: March 12, 2025
Where Do Large Learning Rates Lead Us?
Sadrtdinov I., Kodryan M., Pokonechny E. et al., in: 38th Conference on Neural Information Processing Systems (NeurIPS 2024). [s.n.], 2024. P. 58445–58479.
Added: February 19, 2025
Big Data Analytics Approach with Multiple Text Types: The Case of the Computer Gaming
Aleksandr Belov, Zakharov F., Litvinenko E. et al., in: International IoT, Electronics and Mechatronics Conference, Volume 2. Proceedings of IEMTRONICS 2024. LNEE, volume 1228. Springer Publishing Company, 2025. P. 275–287.
Added: January 26, 2025
Artificial Neural Networks as a Natural Tool in Solution of Variational Problems in Hydrodynamics
Litvinenko N., IEEE Access 2024
Added: December 9, 2024
Frontiers in Artificial Intelligence and Applications: 27th European Conference on Artificial Intelligence, 19–24 October 2024, Santiago de Compostela, Spain
IOS Press Ebooks, 2024.
The field of AI has grown enormously since 1974, when a summer conference on Artificial Intelligence and Simulation of Behaviour was held in Brighton, UK. This milestone in the history of AI has since come to be thought of as the 1st European Conference on Artificial Intelligence (ECAI). This book presents the proceedings of ECAI-2024, the ...
Added: November 5, 2024
The Complex Neural Network Model for Mass Appraisal and Scenario Forecasting of the Urban Real Estate Market Value That Adapts Itself to Space and Time
Leonid N. Yasnitsky, Yasnitsky V., Aleksander O. Alekseev, Complexity 2021 Vol. 2021 Article 5392170
In the modern scientific literature, there are many reports about the successful application of neural network technologies for solving complex applied problems, in particular, for modeling the urban real estate market. There are neural network models that can perform mass assessment of real estate objects taking into account their construction and operational characteristics. However, these ...
Added: February 10, 2024
Modelling the Residential Real Estate Markets of Russia's Largest Cities
Yasnitsky L., Yasnitsky V. L., Alekseev A., Экономика региона 2022 Vol. 18 No. 2 P. 609–622
The existing mass appraisal models and mathematical tools for predicting the market value of residential property have a number of disadvantages, as they are developed for individual regions. Without considering the constantly changing economic environment, these models quickly become outdated and require constant updating. Thus, they are not suitable for construction business optimisation. The study ...
Added: February 10, 2024
Data Preprocessing and Neural Network Architecture Selection Algorithms in Cases of Limited Training Sets—On an Example of Diagnosing Alzheimer’s Disease
Alekseev A., Kozhemyakin L., Nikitin V. et al., Algorithms 2023 Vol. 16 No. 5 Article 219
This paper aimed to increase the accuracy of an Alzheimer’s disease diagnostic function that was obtained in a previous study devoted to the application of decision roots to the diagnosis of Alzheimer’s disease. The obtained decision root is a discrete switching function of several variables applied to aggregation of a few indicators to one integrated assessment presents ...
Added: February 10, 2024
Neural Networks for Speech Synthesis of Voice Assistants and Singing Machines
Pantiukhin D., in: Integral Robot Technologies and Speech Behavior. Newcastle upon Tyne: Cambridge Scholars Publishing, 2024. Ch. 9 P. 281–296.
Added: December 10, 2023
Selected Papers from the XXV International Conference on Neuroinformatics, October 23-27, 2023, Moscow, Russia. Advances in Neural Computation, Machine Learning, and Cognitive Research VII (NEUROINFORMATICS 2023)
Frankfurt: Springer, 2023.
Reports on advanced theories and applications of artificial neural networks. Focuses on problems in neuroscience, systems biophysics, cognitive research, and adaptive control. Merges topics in neurobiology, machine learning, and evolutionary programming ...
Added: November 1, 2023
Latent Stochastic Differential Equations for Change Point Detection
Ryzhikov A., Hushchyn M., Derkach D., IEEE Access 2023 Vol. 11 P. 104700–104711
Automated analysis of complex systems based on multiple readouts remains a challenge. Change point detection algorithms aim to locate abrupt changes in the time series behaviour of a process. In this paper, we present a novel change point detection algorithm based on Latent Neural Stochastic Differential Equations (SDE). Our method learns a non-linear deep ...
Added: October 5, 2023
Real-time low latency estimation of brain rhythms with deep neural networks
Ilia Semenkov, Nikita Fedosov, Makarov I. et al., Journal of Neural Engineering 2023 Vol. 20 No. 5 Article 056008
Objective. Neurofeedback and brain-computer interfacing technology open the exciting opportunity for establishing interactive closed-loop real-time communication with the human brain. This requires interpreting the brain's rhythmic activity and generating timely feedback to the brain. A lower delay between neuronal events and the appropriate feedback increases the efficacy of such interaction. Novel, more efficient approaches capable of tracking brain ...
Added: September 9, 2023
2023 IX International Conference on Information Technology and Nanotechnology (ITNT)
IEEE, 2023.
Added: June 13, 2023