Lion's sign noise can make training more stable

Elistratov S., Podivilov A., Iuzhakov T., Vetrov D.

Lion is a novel optimization method that has outperformed traditional optimizers like Adam across a variety of tasks. Despite its empirical success, the reasons behind Lion's superiority remain unclear. In this paper, we investigate the mechanisms contributing to Lion's enhanced performance, focusing on the structured noise introduced by the use of the sign function in gradient updates. We characterize this noise by the angle of rotation between the true gradient and its signum. By injecting this noise as a random rotation of a fixed angle into normalized updates, we analyze how the performance of this method corresponds to that of Lion. We demonstrate that this method has a stronger performance than Lion in our setting. This approach reveals a relationship between the rotation angle and the learning rate in Lion, providing insights into its improved performance metrics. Additionally, we identify an effect called "momentum tracing" in neural networks with normalization layers and ReLU activations, which can significantly destabilize the training process. Our analysis demonstrates that the rotation noise inherent in Lion mitigates the negative impact of "momentum tracing", leading to more stable learning. These findings offer theoretical justification for Lion's effectiveness and suggest avenues for developing more robust optimization algorithms.
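
The angle-based view of sign noise described in the abstract can be illustrated with a short NumPy sketch. This is not the authors' code: the function names `sign_angle` and `rotate_by_fixed_angle` are hypothetical, and drawing the rotation direction uniformly at random from the subspace orthogonal to the gradient is an assumption, since the abstract does not say how the random rotation is sampled.

```python
import numpy as np


def sign_angle(g):
    """Angle in radians between a nonzero vector g and its elementwise sign."""
    s = np.sign(g)
    cos = (g @ s) / (np.linalg.norm(g) * np.linalg.norm(s))
    # Clip to guard against rounding slightly outside [-1, 1].
    return float(np.arccos(np.clip(cos, -1.0, 1.0)))


def rotate_by_fixed_angle(g, theta, rng):
    """Unit vector at angle theta from g, in a random orthogonal direction.

    Assumption: the direction of rotation is sampled isotropically in the
    subspace orthogonal to g. Requires a 1-D vector g with at least 2 entries.
    """
    u = g / np.linalg.norm(g)          # normalized update direction
    r = rng.standard_normal(g.shape)   # random direction
    r -= (r @ u) * u                   # keep only the part orthogonal to u
    v = r / np.linalg.norm(r)
    return np.cos(theta) * u + np.sin(theta) * v
```

Under these assumptions, `sign_angle(g)` measures how far the sign update deviates from the gradient direction, and `rotate_by_fixed_angle(g, theta, rng)` with `rng = np.random.default_rng()` yields a normalized update that deviates from the gradient by exactly `theta`, which is the kind of controlled noise the abstract compares against Lion.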

Language: English
Keywords: optimization, deep learning, Lion

In book

NeurIPS 2024 Optimization for ML Workshop
[s.n.], 2025.
Similar publications
NeurIPS 2024 Optimization for ML Workshop
[s.n.], 2025.
Added: February 5, 2026
Method of Critical Set construction for Successive Cancellation List Decoder of Polar Codes Based on Deep Learning of Neural Networks
Котов Ф. И., Timokhin I., Ivanov F., in: 2023 XVIII International Symposium Problems of Redundancy in Information and Control Systems (REDUNDANCY). IEEE, 2023.
The Successive Cancellation List (SCL) algorithm is a widely used decoding technique in communication systems. However, constructing the critical set for SCL decoding is a challenging task, as it requires a large number of computations and can lead to significant decoding delays. In this paper, a new approach to critical set construction for SCL decoding ...
Added: January 26, 2026
Method of Automated Dataset Collection for Microwave Filters Synthesis
Arinin O. V., Bakhmach D. M., Katsnelson A. et al., in: 2025 Systems of Signals Generating and Processing in the Field of on Board Communications. IEEE, 2025. P. 1–5.
This research discusses the method of dataset collection automatization for microwave filter synthesis by integrating machine learning techniques, thus reducing development time. Utilizing the 3D electromagnetic analysis software package, the study involves simulation and collecting geometric parameters and amplitude-frequency characteristics from three variants of passband highly selective microstrip two-resonator combined filters with stepped impedance resonators. ...
Added: December 6, 2025
Physics-Informed Bayesian Optimization for Conformational Ensemble Augmentation
Medvedev M., Journal of Chemical Information and Modeling 2025 Vol. 65 No. 12
Added: November 12, 2025
Artificial Neural Networks and Machine Learning. ICANN 2025 International Workshops and Special Sessions: 34th International Conference on Artificial Neural Networks, Kaunas, Lithuania, September 9–12, 2025, Proceedings, Part V
Cham: Springer, 2025.
This book constitutes the refereed proceedings of 34th International Workshops which were held in conjunction with the 34th International Conference on Artificial Neural Networks and Machine Learning, ICANN 2025, held in Kaunas, Lithuania, September 9–12, 2025.   The 20 full papers and 8 abstracts included in this workshop volume were carefully reviewed and selected from 42 submissions. ...
Added: September 29, 2025
Optimization of Multi-Currency Deposit Structure by Two Indicators (Income and Risk) under Uncertainty
Molostvov V., Advances in Systems Science and Applications 2025 Vol. 25 No. 1 P. 1–11
A two-criteria vector optimization problem – finding Pareto-optimal solutions in linear systems with interval uncertainty of coefficients – is considered. The problem of resource allocation to multiple activities is investigated. The uncertainty-adjusted income is a bilinear function, linear by strategy under fixed uncertainty and by uncertain parameters under fixed strategy. Guaranteed income is a linear ...
Added: August 26, 2025
Deep learning deciphers the related role of master regulators and G-quadruplexes in tissue specification
Artem B., Andreasyan A., Konovalov D. et al., Scientific Reports 2025 Vol. 15 Article 23119
G-quadruplexes (GQs) are non-canonical DNA structures encoded by G-flipons with potential roles in gene regulation and chromatin structure. Here, we explore the role of G-flipons in tissue specification. We present a deep learning-based framework for the genome-wide G-flipon predictions across 14 human tissue types. The model was trained using high-confidence experimental maps of GQ-forming sequences ...
Added: August 8, 2025
AI in drug development: advances in response, combination therapy, repositioning, and molecular design
Shaitan A., Science China Information Sciences 2025 Vol. 68 No. 7 Article 170102
Artificial intelligence (AI) is revolutionizing the field of drug development, particularly in addressing key challenges such as drug response prediction, drug combination design, drug repositioning, and drug molecule generation. Traditional drug discovery is hindered by long timelines, high costs, and low success rates, necessitating innovative technologies to accelerate the process. AI technologies, such as deep ...
Added: June 25, 2025
An Approach to Finding a Robust Deep Learning Model
Boldyrev A., Ratnikov F., Shevelev A., IEEE Access 2025 Vol. 13 P. 102390–102406
The rapid development of machine learning (ML) and artificial intelligence (AI) applications requires the training of a large number of models. This growing demand highlights the importance of training models without human supervision, while ensuring that their predictions are reliable. In response to this need, we propose a novel approach for determining model robustness. This approach, supplemented with a ...
Added: June 15, 2025
Economic and Social Aspects of Nuclear Energy in the Context of the Development of Artificial Intelligence Technologies
Podchufarov A., Galkina A. N., Ванина С. С. et al., Экономика и управление: проблемы, решения 2025 Vol. 5 No. 4 P. 61–74
Under modern conditions, the introduction of artificial intelligence technologies is becoming a significant factor in the development of high-tech industries. The article presents the results of a study of the prospects for the use of intelligent analytical systems in nuclear energy. The experience of foreign countries is analyzed and the features of successful projects using ...
Added: June 5, 2025
Numerical Optimization of an LDPC Code Parity-Check Matrix for Use in a Quantum Key Distribution Protocol Using Highly Parallel Computing
Morozov V., Башара В. О., Емельяненко М. В., in: Parallel Computational Technologies: XIX All-Russian Scientific Conference with International Participation, ПаВТ’2025, Moscow, April 8–10, 2025. Short Papers and Poster Descriptions. Chelyabinsk: South Ural State University Publishing Center, 2025. P. 193–210.
Error correction in the secret key is a mandatory step in quantum key distribution (QKD) protocols. Usually, modern error-correcting codes are used for its implementation. Imperfections of the hardware used in QKD systems lead to bit flipping errors in the channel. Moreover, such systems are characterized by an asymmetric distribution of such errors. Taking into ...
Added: June 3, 2025
Proceedings of The 28th International Conference on Artificial Intelligence and Statistics, 3-5 May 2025, Splash Beach Resort in Mai Khao, Thailand, PMLR: vol. 258
PMLR, 2025.
Added: May 18, 2025
Deep learning for customs classification of goods based on their textual descriptions analysis
Ryzhova A., Sochenkov I., in: Proceeding 2019 Ivannikov Ispras Open Conference (ISPRAS). IEEE Computer Society, 2019. P. 60–67.
Added: May 1, 2025
Distilling Normalizing Flows
Walton S., Klyukin V., Artemev M. et al., in: 2025 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW). IEEE, 2025. P. 3328–3337.
Explicit density learners are becoming an increasingly popular technique for generative models because of their ability to better model probability distributions. They have advantages over Generative Adversarial Networks due to their ability to perform density estimation and having exact latent-variable inference. This has many advantages, including: being able to simply interpolate, calculate sample likelihood, and ...
Added: April 1, 2025
2025 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)
Derkach D., Artemev M., IEEE, 2025.
Added: April 1, 2025
Deep learning captures the effect of epistasis in multifactorial diseases
Perelygin V., Kamelin A., Syzrantsev N. et al., Frontiers in Medicine 2025 Vol. 11 Article 1479717
Polygenic risk score (PRS) prediction is widely used to assess the risk of diagnosis and progression of many diseases. Routinely, the weights of individual SNPs are estimated by the linear regression model that assumes independent and linear contribution of each SNP to the phenotype. However, for complex multifactorial diseases such as Alzheimer’s disease, diabetes, cardiovascular ...
Added: March 4, 2025
TabReD: Analyzing Pitfalls and Filling the Gaps in Tabular Deep Learning Benchmarks
Ivan Rubachev, Nikolay Kartashev, Gorishniy Y. et al., in: Proceedings of the 13th International Conference on Learning Representations (ICLR 2025). ICLR, 2025. P. 53831–53867.
Advances in machine learning research drive progress in real-world applications. To ensure this progress, it is important to understand the potential pitfalls on the way from a novel method's success on academic benchmarks to its practical deployment. In this work, we analyze existing tabular deep learning benchmarks and find two common characteristics of tabular data ...
Added: March 1, 2025
Editorial
Panos Pardalos, Valery Kalyagin, Mario R. Guarracino, Computational Management Science 2024 Vol. 21 No. 1 Article 35
Big data has become an integral part of modern networks. With the increasing amount of data generated by devices, machines, and applications, networks are constantly being challenged to handle and process this data in a timely and efficient manner. The size, complexity, and variety of data in networks are increasing rapidly, which requires new approaches ...
Added: February 22, 2025
Weight Perturbations for Simulating Virtual Lesions in a Convolutional Neural Network
W. Joseph MacInnes, Zhozhikashvili N., Feurra M., in: First International Conference, AIiH 2024, Swansea, UK, September 4–6, 2024, Proceedings, Part II. Artificial Intelligence in Healthcare. LNCS, Vol. 14976. Springer, 2024. P. 221–234.
Convolutional Neural Networks (CNNs) match human performance in many visual tasks like the classification of images, however they may not simulate the underlying biological processes. We implemented a CNN to try to replicate results from an object inversion experiment with Transcranial Magnetic Stimulation (TMS). After training on upright faces, the CNN model went through three stages ...
Added: January 28, 2025
TabR: Tabular Deep Learning Meets Nearest Neighbors
Yury Gorishniy, Ivan Rubachev, Nikolay Kartashev et al., in: Proceedings of the 12th International Conference on Learning Representations (ICLR 2024). ICLR, 2024.
Deep learning (DL) models for tabular data problems (e.g. classification, regression) are currently receiving increasingly more attention from researchers. However, despite the recent efforts, the non-DL algorithms based on gradient-boosted decision trees (GBDT) remain a strong go-to solution for these problems. One of the research directions aimed at improving the position of tabular DL involves ...
Added: January 22, 2025
Deep Learning Approaches for LHCb ECAL Reconstruction
Boldyrev A., Derkach D., Ratnikov F. et al., EPJ Web of Conferences 2024 Vol. 295 Article 09008
Calorimeters are a crucial component for most detectors mounted on modern colliders. Their tasks include identifying and measuring the energy of photons and neutral hadrons, recording energetic hadronic jets, and contributing to the identification of electrons, muons, and charged hadrons. To fulfill these many tasks while keeping costs reasonable, the calorimeter construction requires good and ...
Added: January 8, 2025