ALOE: Boosting Large Language Model Fine-Tuning with Aggressive Loss-Based Elimination of Samples


P. 3980–3986.

Demidovskij A., Trutnev A. I., Tugaryov A. M., Salnikov I.

Because training and fine-tuning modern neural networks require substantial computational resources, there is strong demand for specialized algorithms that make training more efficient and cost-effective. Aggressive Loss-based Elimination of Samples (ALOE) is a method that selects training samples based on losses obtained from either the model currently being trained or a pre-trained one. ALOE is designed to accelerate the fine-tuning of Large Language Models (LLMs) and integrates with LoRA, a state-of-the-art Parameter-Efficient Fine-Tuning method. ALOE has two stages, offline and online. The method rests on the idea that eliminating samples according to a certain rule decreases the number of training steps and thus the overall LLM fine-tuning time. This reduction makes it possible either to obtain a fine-tuned model faster or to perform more training iterations within the same time budget as the fine-tuning baseline (ALOE Max). ALOE (Offline) reduces the dataset before fine-tuning starts, while ALOE (Online) selects training samples from each training batch during fine-tuning. Results demonstrate an average acceleration of 45.6% across six models (GPT-2 S, GPT-2 M, DeBERTa-V2-XL, LLaMA-7B, LLaMA-2-7B, LLaMA-2-13B), with an average accuracy improvement of 5.91% compared with the results of LoRA fine-tuning. ALOE (Offline) accelerates GPT-2 M fine-tuning on E2E-NLG by up to 92% with a 1.2% BLEU improvement.
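The two-stage idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not state the elimination rule, so the sketch assumes that the lowest-loss samples are dropped, and the names `keep_highest_loss`, `offline_reduce`, `online_select`, `num_steps`, and the `keep_ratio` parameter are all hypothetical. The per-sample loss function is supplied by the caller and stands in for a pre-trained model (offline) or the model currently being trained (online).

```python
import math
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


def keep_highest_loss(samples: Sequence[T], losses: Sequence[float],
                      keep_ratio: float) -> List[T]:
    """Keep the `keep_ratio` fraction of samples with the highest loss.

    ASSUMPTION: the abstract does not give the elimination rule; dropping
    low-loss samples is only one plausible choice.
    """
    if len(samples) != len(losses):
        raise ValueError("samples and losses must have the same length")
    if not 0.0 < keep_ratio <= 1.0:
        raise ValueError("keep_ratio must be in (0, 1]")
    if not samples:
        return []
    n_keep = max(1, math.ceil(len(samples) * keep_ratio))
    # Rank indices by descending loss, then restore the original order.
    ranked = sorted(range(len(samples)), key=lambda i: losses[i], reverse=True)
    return [samples[i] for i in sorted(ranked[:n_keep])]


def offline_reduce(dataset: Sequence[T], loss_fn: Callable[[T], float],
                   keep_ratio: float) -> List[T]:
    """Offline stage (sketch): score the whole dataset once, before fine-tuning."""
    return keep_highest_loss(dataset, [loss_fn(s) for s in dataset], keep_ratio)


def online_select(batch: Sequence[T], loss_fn: Callable[[T], float],
                  keep_ratio: float) -> List[T]:
    """Online stage (sketch): score one batch during fine-tuning and keep a subset."""
    return keep_highest_loss(batch, [loss_fn(s) for s in batch], keep_ratio)


def num_steps(n_samples: int, batch_size: int, epochs: int = 1) -> int:
    """Number of optimizer steps needed to pass over the data `epochs` times."""
    return math.ceil(n_samples / batch_size) * epochs


if __name__ == "__main__":
    toy_losses = {"a": 0.1, "b": 0.9, "c": 0.5, "d": 0.7}  # made-up values
    print(offline_reduce(list(toy_losses), toy_losses.get, keep_ratio=0.5))
    # Halving a 1000-sample dataset halves the steps at batch size 32.
    print(num_steps(1000, 32), num_steps(500, 32))
```

In this sketch the two stages share one selection routine and differ only in scope: the offline stage shrinks the dataset, which directly lowers the step count computed by `num_steps`, while the online stage filters each batch as training proceeds.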

Language: English

In book

Frontiers in Artificial Intelligence and Applications: 27th European Conference on Artificial Intelligence, 19–24 October 2024, Santiago de Compostela, Spain

Vol. 392. IOS Press Ebooks, 2024.

Hebb-Inspired Low Rank Adapters for Large Language Models Fine-Tuning

Alexander Demidovskij, Artyom Tugaryov, Igor Salnikov et al., in: PRICAI 2025: Trends in Artificial Intelligence: 22nd Pacific Rim International Conference on Artificial Intelligence, PRICAI 2025, Wellington, New Zealand, November 17–21, 2025, Proceedings, Part III. Vol. 16453. Springer, 2026. P. 603–612.

The backpropagation method is the predominant method for pre-training and fine-tuning of Large Language Models. At the same time, it is highly demanding in terms of memory and hardware. It therefore makes fine-tuning and pre-training very expensive and harmful to the environment due to the large carbon footprint, and raises barriers to the development of ...

Added: April 21, 2026

PRICAI 2025: Trends in Artificial Intelligence: 22nd Pacific Rim International Conference on Artificial Intelligence, PRICAI 2025, Wellington, New Zealand, November 17–21, 2025, Proceedings, Part III

Springer, 2026.

These proceedings contain the papers presented at the 22nd Pacific Rim International Conference on Artificial Intelligence (PRICAI), held on November 17–21, 2025 in Wellington, New Zealand. PRICAI 2025 was co-hosted with the 40th International Conference on Image and Vision Computing New Zealand (IVCNZ 2025) and the annual conference of the New Zealand Artificial Intelligence Researchers ...

Added: April 21, 2026

Semi-automatic annotation of brain vessels in magnetic resonance angiography images

Bernadotte A., Elfimov N., Menshikov I., Scientific Data 2025 Vol. 13 No. 41

Accurate segmentation of brain vessels in magnetic resonance angiography (MRA) is essential for surgical procedures. Neural networks are powerful tools for medical image segmentation, but their development requires well-annotated datasets. However, publicly available MRA datasets with detailed vessel annotations are scarce. We present a dataset of 100 manually annotated brain MRA images from the IXI ...

Added: February 25, 2026

Tests as Assessment Tools in Universities: Challenges and Solutions

Antipkina I., Ivanushchenko A. V., Kalabina I. A. et al., Мир психологии. Научно-методический журнал 2025 No. 4(123) P. 295–316

Low-quality test items pose significant risks of biased and inaccurate assessment in higher education. In this study, multi-disciplinary test banks were examined, first, using classical test theory and then using a Large Language Model (Grok). Our findings reveal a number of problems in university test items due to methodological shortcomings rather than content inaccuracies. Based ...

Added: January 22, 2026

Performance Study of Modern Zeroth-Order Optimization Methods for LLM Fine-Tuning

A. V. Demidovskij, A. I. Trutnev, Optical Memory and Neural Networks (Information Optics) 2025 Vol. 34 No. Suppl. 1 P. S16–S29

Large Language Models (LLMs) are widely employed across a broad range of applications due to their versatility and state-of-the-art performance. However, as usage scenarios grow, there is a pressing demand for task-specific adaptation of LLMs through fine-tuning. While full fine-tuning (FT) remains the most preferred in terms of quality, its high memory and computation requirements ...

Added: December 22, 2025

On the Influence of Layer Importance on LLM Fine-Tuning Acceleration and Quality

Demidovskij A., Irina Novikova, Artyom Tugaryov et al., in: Frontiers in Artificial Intelligence and Applications: 28th European Conference on Artificial Intelligence, 25-30 October 2025, Bologna, Italy. Vol. 413. IOS Press Ebooks, 2025. P. 4233–4240.

Large Language Models (LLMs) have become central advancements in artificial intelligence, particularly in machine learning, natural language processing, and computer vision. Their ability to understand and generate human-like text has made them crucial in applications ranging from automated translation to text generation. Despite the vast capabilities of pre-trained LLMs, their deployment in specialized domains often ...

Added: October 23, 2025

Going Beyond LoRA Fine-Tuning with Hebb Learning: Blazingly Fast and Accurate

Demidovskij A., Igor Salnikov, Olga Frolova et al., in: Frontiers in Artificial Intelligence and Applications: 28th European Conference on Artificial Intelligence, 25-30 October 2025, Bologna, Italy. Vol. 413. IOS Press Ebooks, 2025. P. 2426–2433.

Modern Multimodal Large Language Models have increased the demand for computational resources required for both pretraining and fine-tuning procedures. This challenge is primarily attributed to the backpropagation step, because the computation of gradients is time-consuming and memory-intensive. This paper aims to alleviate these issues and introduces a novel fine-tuning strategy. Low-Rank Adaptation with Hebb Rapid Optimization ...

Added: October 23, 2025

Formation of Requirements for Technological Parameters of Serial Production Based on a Neural Network Approach

Yasnitsky L., Goldobin M. A., Прикладная информатика 2025 Vol. 20 No. 3(117) P. 85–100

Currently, artificial intelligence methods are widely used in the practice of serial production enterprises. They are used to detect defects, classify and eliminate them, identify the causes of defects, predict the quality and properties of the resulting product, select optimal parameters of the production process, and identify and study its patterns. However, outside the field ...

Added: July 10, 2025

Economic and Social Aspects of Nuclear Energy amid the Development of Artificial Intelligence Technologies

Podchufarov A., Galkina A. N., Vanina S. S. et al., Экономика и управление: проблемы, решения 2025 Vol. 5 No. 4 P. 61–74

Under modern conditions, the introduction of artificial intelligence technologies is becoming a significant factor in the development of high-tech industries. The article presents the results of a study of the prospects for the use of intelligent analytical systems in nuclear energy. The experience of foreign countries is analyzed and the features of successful projects using ...

Added: June 5, 2025

Comprehensive Weight Decomposition Analysis of Modern Parameter-Efficient Methods

A.V. Demidovskij, I.G. Salnikov, A.M. Tugaryov et al., Optical Memory and Neural Networks (Information Optics) 2024 Vol. 33 No. 3 P. S513–S522

Fine-tuning Large Language Models is an essential part of modern artificial intelligence systems that solve numerous tasks, such as natural language processing and computer vision. Among the various fine-tuning strategies, the most prominent approach for Large Language Model fine-tuning is Parameter-Efficient Fine-Tuning (PEFT), as it makes it possible to achieve state-of-the-art performance on multiple tasks while minimizing ...

Added: March 12, 2025

Where Do Large Learning Rates Lead Us?

Sadrtdinov I., Kodryan M., Pokonechny E. et al., in: 38th Conference on Neural Information Processing Systems (NeurIPS 2024). [s.n.], 2024. P. 58445–58479.

Added: February 19, 2025

Big Data Analytics Approach with Multiple Text Types: The Case of the Computer Gaming

Aleksandr Belov, Zakharov F., Litvinenko E. et al., in: International IoT, Electronics and Mechatronics Conference, Volume 2. Proceedings of IEMTRONICS 2024. LNEE, Vol. 1228. Springer Publishing Company, 2025. P. 275–287.

Added: January 26, 2025

Artificial Neural Networks as a Natural Tool in Solution of Variational Problems in Hydrodynamics

Litvinenko N., IEEE Access 2024

Added: December 9, 2024

Frontiers in Artificial Intelligence and Applications: 27th European Conference on Artificial Intelligence, 19–24 October 2024, Santiago de Compostela, Spain

IOS Press Ebooks, 2024.

The field of AI has grown enormously since 1974, when a summer conference on Artificial Intelligence and Simulation of Behaviour was held in Brighton, UK. This milestone in the history of AI has since come to be thought of as the 1st European Conference on Artificial Intelligence (ECAI). This book presents the proceedings of ECAI-2024, the ...

Added: November 5, 2024

The Complex Neural Network Model for Mass Appraisal and Scenario Forecasting of the Urban Real Estate Market Value That Adapts Itself to Space and Time

Leonid N. Yasnitsky, Yasnitsky V., Aleksander O. Alekseev, Complexity 2021 Vol. 2021 Article 5392170

In the modern scientific literature, there are many reports about the successful application of neural network technologies for solving complex applied problems, in particular, for modeling the urban real estate market. There are neural network models that can perform mass assessment of real estate objects taking into account their construction and operational characteristics. However, these ...

Added: February 10, 2024

Modeling the Residential Real Estate Markets of Russia's Largest Cities

Yasnitsky L., Yasnitsky V. L., Alekseev A., Экономика региона 2022 Vol. 18 No. 2 P. 609–622

The existing mass appraisal models and mathematical tools for predicting the market value of residential property have a number of disadvantages, as they are developed for individual regions. Without considering the constantly changing economic environment, these models quickly become outdated and require constant updating. Thus, they are not suitable for construction business optimisation. The study ...

Added: February 10, 2024

Data Preprocessing and Neural Network Architecture Selection Algorithms in Cases of Limited Training Sets—On an Example of Diagnosing Alzheimer’s Disease

Alekseev A., Kozhemyakin L., Nikitin V. et al., Algorithms 2023 Vol. 16 No. 5 Article 219

This paper aimed to increase the accuracy of an Alzheimer’s disease diagnosing function that was obtained in a previous study devoted to the application of decision roots to the diagnosis of Alzheimer’s disease. The obtained decision root is a discrete switching function of several variables applied to the aggregation of a few indicators into one integrated assessment presents ...

Added: February 10, 2024

Neural Networks for Speech Synthesis of Voice Assistants and Singing Machines

Pantiukhin D., in: Integral Robot Technologies and Speech Behavior. Newcastle upon Tyne: Cambridge Scholars Publishing, 2024. Ch. 9. P. 281–296.

Added: December 10, 2023

Selected Papers from the XXV International Conference on Neuroinformatics, October 23-27, 2023, Moscow, Russia. Advances in Neural Computation, Machine Learning, and Cognitive Research VII (NEUROINFORMATICS 2023)

Frankfurt: Springer, 2023.

Reports on advanced theories and applications of artificial neural networks. Focuses on problems in neuroscience, systems biophysics, cognitive research, and adaptive control. Merges topics in neurobiology, machine learning, and evolutionary programming ...

Added: November 1, 2023

Latent Stochastic Differential Equations for Change Point Detection

Ryzhikov A., Hushchyn M., Derkach D., IEEE Access 2023 Vol. 11 P. 104700–104711

Automated analysis of complex systems based on multiple readouts remains a challenge. Change point detection algorithms aim to locate abrupt changes in the time series behaviour of a process. In this paper, we present a novel change point detection algorithm based on Latent Neural Stochastic Differential Equations (SDE). Our method learns a non-linear deep ...

Added: October 5, 2023

Real-time low latency estimation of brain rhythms with deep neural networks

Ilia Semenkov, Nikita Fedosov, Makarov I. et al., Journal of Neural Engineering 2023 Vol. 20 No. 5 Article 056008

Objective. Neurofeedback and brain-computer interfacing technology open up the exciting opportunity of establishing interactive closed-loop real-time communication with the human brain. This requires interpreting the brain's rhythmic activity and generating timely feedback to the brain. A lower delay between neuronal events and the appropriate feedback increases the efficacy of such interaction. Novel more efficient approaches capable of tracking brain ...

Added: September 9, 2023

2023 IX International Conference on Information Technology and Nanotechnology (ITNT)

IEEE, 2023.

Added: June 13, 2023