To Stay or Not to Stay in the Pre-train Basin: Insights on Ensembling in Transfer Learning

P. 15936–15964.
Sadrtdinov I., Pozdeev D., Vetrov D. P., Lobacheva E.

Transfer learning and ensembling are two popular techniques for improving the performance and robustness of neural networks. Due to the high cost of pre-training, ensembles of models fine-tuned from a single pre-trained checkpoint are often used in practice. Such models end up in the same basin of the loss landscape, which we call the pre-train basin, and thus have limited diversity. In this work, we show that ensembles trained from a single pre-trained checkpoint may be improved by better exploring the pre-train basin; however, leaving the basin results in losing the benefits of transfer learning and in degradation of the ensemble quality. Based on the analysis of existing exploration methods, we propose StarSSE, a more effective modification of Snapshot Ensembles (SSE) for the transfer learning setup, which results in stronger ensembles and uniform model soups.
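
The abstract contrasts two ways of combining fine-tuned models: an ensemble, which averages the members' predictions, and a uniform model soup, which averages their weights into a single model. The following is a minimal sketch of that distinction only, not the authors' StarSSE procedure. It assumes a toy linear softmax classifier in NumPy, and the "fine-tuned" members are hypothetical stand-ins created by adding small noise to a shared checkpoint; every name in it is illustrative.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over the last axis.
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def predict(weights, x):
    """Class probabilities of a toy linear classifier."""
    return softmax(x @ weights)

def ensemble_predict(weight_list, x):
    """Ensemble: average the members' predicted probabilities."""
    return np.mean([predict(w, x) for w in weight_list], axis=0)

def uniform_soup(weight_list):
    """Uniform soup: average the members' weights into one model."""
    return np.mean(weight_list, axis=0)

rng = np.random.default_rng(0)
pretrained = rng.normal(size=(4, 3))  # shared "pre-trained" checkpoint (assumed)
# Hypothetical stand-ins for fine-tuned models: small perturbations of the
# checkpoint, so all members stay close to each other in weight space.
members = [pretrained + 0.1 * rng.normal(size=pretrained.shape) for _ in range(5)]
x = rng.normal(size=(8, 4))  # a batch of 8 toy inputs

ens_probs = ensemble_predict(members, x)         # 5 forward passes per input
soup_probs = predict(uniform_soup(members), x)   # 1 forward pass per input
```

The ensemble needs one forward pass per member at inference time, while the soup needs only one in total. Weight averaging is commonly expected to work only when the members lie close together in the loss landscape, which is why the abstract's point about staying in the pre-train basin is relevant to soups as well as ensembles.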

Language: English
Keywords: neural network ensembles, transfer learning, loss landscape, model soups

In book

Advances in Neural Information Processing Systems 36 (NeurIPS 2023)
Curran Associates, Inc., 2023.
Similar publications
Extraction of properties of anisotropic spin model by deep transfer learning methods
D. D. Sukhoverkhova, L. N. Shchur, in: Параллельные вычислительные технологии – XIX всероссийская конференция с международным участием, ПаВТ'2025. Короткие статьи и описания плакатов.: Издательский центр ЮУрГУ, 2025. P. 82–89.
We apply supervised deep machine learning techniques to extract properties of the anisotropic Ising model. We consider two cases of anisotropy: orthogonal and diagonal. From the predictions of the neural network, we obtained phase probability functions, from which we measured two quantities: the critical temperature and the critical exponent of the correlation length. We estimated ...
Added: December 4, 2025
Machine Learning Domain Adaptation in Spin Models with Continuous Phase Transitions
Chertenkov V., Shchur L., Physical Review E - Statistical, Nonlinear, and Soft Matter Physics 2025 Vol. 112 No. 3 Article 034104
The main question raised in the article is whether a neural network trained on a spin lattice model in one universality class can be used to test a model in another universality class. The quantities of interest are the critical phase transition temperature and the correlation length exponent. In other words, the question of ...
Added: August 12, 2025
Supervised and Transfer Learning for Phase Transition Research
Chertenkov V., Shchur L., Lecture Notes in Computer Science 2025 Vol. 15406 P. 434–449
Machine learning is a new tool for investigating physical models. One possible application is the study of phase transitions by analyzing the distribution of spins on regular lattices using a supervised learning approach. A new question is the applicability of transfer learning, in which a network trained on a particular model is used to infer information about another model. The ...
Added: February 10, 2025
Transfer Machine Learning of an Anisotropic Model
D. D. Sukhoverkhova, L. N. Shchur, Lobachevskii Journal of Mathematics 2025 Vol. 46 No. 1 P. 528–534
We investigate the possibility of extracting features of second-order phase transitions using transfer machine learning. We have performed supervised machine learning for binary classification of snapshots of the spin distribution of the isotropic Ising model. The binary classification is performed in ferromagnetic and paramagnetic phases using a known critical temperature. The trained network is used ...
Added: January 13, 2025
The Influence of Anisotropy on the Study of the Critical Behavior of Spin Models by Machine Learning Methods
Sukhoverkhova D., Shchur L., Письма в Журнал экспериментальной и теоретической физики 2024 Vol. 120 No. 8 P. 644–649
In this paper, we applied a deep neural network to study the issue of knowledge transferability between statistical mechanics models. The following computer experiment was conducted. A convolutional neural network was trained to solve the problem of binary classification of snapshots of the location of spins of the Ising model on a two-dimensional ...
Added: September 25, 2024
Applying the Transfer Learning Method to Machine Translation for the Russian-Khakas Language Pair
Lebedeva A. Yu., in: Eleventh International Conference on Computer Processing of Turkic Languages "TurkLang 2023".: Kazan: Издательство Академии наук Республики Татарстан, 2023. P. 460–471.
Added: March 6, 2024
Unsupervised domain adaptation methods for cross-species transfer of regulatory code signals
Pavel Latyshev, Fedor Pavlov, Herbert A. et al., in: Proceedings of 11th Moscow Conference on Computational Molecular Biology MCCMB'23.: IITP RAS, 2023.
Added: December 1, 2023
Multilingual hope speech detection: A Robust framework using transfer learning of fine-tuning RoBERTa model
Malik M. S., Nazarova A., Mona M. J. et al., Journal of King Saud University - Computer and Information Sciences 2023 Vol. 35 No. 8 Article 101736
Hope Speech Detection (HSD) from social media is a new direction for promoting and supporting positive content to encourage harmony and positivity in society. As users of social media belong to different linguistic communities, hope speech detection is rarely studied as a multilingual task considering low-resource languages. Moreover, prior studies explored only monolingual techniques, and the ...
Added: November 22, 2023
Advances in Neural Computation, Machine Learning, and Cognitive Research VII
Magaj G., Soroka A., Studies in Computational Intelligence, 2023.
The basis of transfer learning methods is the ability of deep neural networks to use knowledge from one domain to learn in another domain. However, another important task is the analysis and explanation of the internal representations of deep neural networks models in the process of transfer learning. Some deep models are known to be ...
Added: October 25, 2023
Unsupervised Domain Adaptation Methods for Cross-Species Transfer of Regulatory Code Signals
Pavel Latyshev, Fedor Pavlov, Herbert A. et al., Frontiers in Big Data 2023 Vol. 6 Article 1140663
Due to advances in NGS technologies, whole-genome maps of various functional genomic elements have been generated for a dozen species; however, experiments are still expensive and are not available for many species of interest. Deep learning methods have become the state-of-the-art computational methods for analyzing the available data, but the focus is often only on the ...
Added: June 8, 2023
Ensemble Distribution Distillation
Malinin A., Mlodozeniec B., Gales M., in: Proceedings of the 8th International Conference on Learning Representations (ICLR 2020).: ICLR, 2020.
Added: November 1, 2021
Irony detection via sentiment-based transfer learning
Zhang S., Zhang X., Chan J. et al., Information Processing and Management 2019 Vol. 56 No. 5 P. 1633–1644
Irony as a literary technique is widely used in online texts such as Twitter posts. Accurate irony detection is crucial for tasks such as effective sentiment analysis. A text's ironic intent is defined by its context incongruity. For example in the phrase "I love being ignored", the irony is defined by the incongruity between the ...
Added: October 29, 2020
On Power Laws in Deep Ensembles
Lobacheva E., Chirkova N., Kodryan M. et al., in: Advances in Neural Information Processing Systems 33 (NeurIPS 2020).: Curran Associates, Inc., 2020. P. 2375–2385.
Added: October 29, 2020
Research of heuristic approaches for determining the tonality of text messages in natural language processing problems
Polyakov E. V., Polyakov S. V., Abramov P., in: Proceedings of 2019 XVI International Symposium "Problems of Redundancy in Information and Control Systems" (REDUNDANCY).: IEEE, 2019. P. 159–164.
Determining the tonality of the text is a difficult task, the solution of which essentially depends on the context, the field of study and the amount of text data. The analysis shows that the authors in their works do not jointly use the full range of possible transformations on the data and their combinations. The ...
Added: September 20, 2020
Generalized approach to sentiment analysis of short text messages in natural language processing
Polyakov E. V., Voskov L., Abramov P. et al., Informatsionno-upravliaiushchie sistemy [Information and Control Systems] 2020 No. 1 P. 2–14
Introduction: Sentiment analysis is a complex problem whose solution essentially depends on the context, field of study and amount of text data. Analysis of publications shows that the authors often do not use the full range of possible data transformations and their combinations. Only a part of the transformations is used, limiting the ways to ...
Added: February 20, 2020
A Method for Correcting Classification Errors of Recognized Characters
Breyman A., Yakovlev I. A., Прикаспийский журнал: управление и высокие технологии 2014 No. 1 (25) P. 102–112
Optical recognition of text documents is an inevitably error-prone process. To identify and correct these errors, systems use post-processing techniques that are usually based on dictionary search. Using dictionaries can bring an acceptable quality of recognition for Latin, Cyrillic and other phonetic alphabets, but is of little use for the languages in which the selection of individual ...
Added: February 27, 2014