
Learning to Run with Potential-Based Reward Shaping and Demonstrations from Video Data

P. 286–291.
Shpilman A., Malysheva A., Kudenko D.

Learning to produce efficient movement behaviour for humanoid robots from scratch is a hard problem, as has been illustrated by the “Learning to run” competition at NIPS 2017. The goal of this competition was to train a two-legged model of a humanoid body to run in a simulated race course with maximum speed. All submissions took a tabula rasa approach to reinforcement learning (RL) and were able to produce relatively fast, but not optimal running behaviour. In this paper, we demonstrate how data from videos of human running (e.g. taken from YouTube) can be used to shape the reward of the humanoid learning agent to speed up the learning and produce a better result. Specifically, we use the positions of key body parts at regular time intervals to define a potential function for potential-based reward shaping (PBRS). Since PBRS does not change the optimal policy, this approach allows the RL agent to overcome sub-optimalities in the human movements that are shown in the videos. We present experiments in which we combine selected techniques from the top ten approaches from the NIPS competition with further optimizations to create a high-performing agent as a baseline. We then demonstrate how video-based reward shaping improves the performance further, resulting in an RL agent that runs twice as fast as the baseline in 12 hours of training. We furthermore show that our approach can overcome sub-optimal running behaviour in videos, with the learned policy significantly outperforming that of the running agent from the video.
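
The reward-shaping idea in the abstract can be illustrated with a minimal sketch. It uses the standard PBRS shaping term F(s, s') = γΦ(s') − Φ(s); the specific potential below (negative summed distance between the agent's key body parts and reference positions taken from a video frame), the discount value, and all function names are assumptions made for illustration, not the authors' actual definitions.

```python
import math

GAMMA = 0.99  # discount factor; an assumed value, not taken from the paper


def potential(agent_parts, reference_parts):
    """Hypothetical potential: the closer the agent's key body parts are
    to the reference positions (e.g. from one video frame), the higher
    (less negative) the potential."""
    return -sum(math.dist(a, r) for a, r in zip(agent_parts, reference_parts))


def shaping_term(phi_s, phi_next, gamma=GAMMA):
    """Standard PBRS term F(s, s') = gamma * Phi(s') - Phi(s)."""
    return gamma * phi_next - phi_s


def shaped_reward(env_reward, phi_s, phi_next, gamma=GAMMA):
    """Environment reward plus the potential-based shaping term."""
    return env_reward + shaping_term(phi_s, phi_next, gamma)


def discounted_shaping_sum(phis, gamma=GAMMA):
    """Discounted sum of shaping terms along a trajectory of potentials.
    It telescopes to gamma**T * phis[-1] - phis[0], so it depends only on
    the first and last potentials, not on the path taken between them."""
    return sum(
        gamma**t * shaping_term(phis[t], phis[t + 1], gamma)
        for t in range(len(phis) - 1)
    )
```

The telescoping sum is the reason the shaping does not alter which policy is optimal: over a trajectory, the extra reward collapses to a term fixed by the start and end potentials, so the agent can still discover behaviour better than the demonstration.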

Language: English
Keywords: optimization, training, humanoid robots, task analysis

In book

2018 15th International Conference on Control, Automation, Robotics and Vision (ICARCV)
IEEE, 2018.