Task Planning in “Block World” with Deep Reinforcement Learning

Blokhin A., Kalev V., Pusev R. S. et al., in: 2024 IEEE International Multi-Conference on Engineering, Computer and Information Sciences (SIBIRCON). Novosibirsk: IEEE, 2024. P. 25–30.

Added: December 18, 2024

Generative Flow Networks as Entropy-Regularized RL

Tiapkin D. N., Morozov N. V., Naumov A. A. et al., in: Proceedings of The 27th International Conference on Artificial Intelligence and Statistics (AISTATS 2024), 2-4 May 2024, Palau de Congressos, Valencia, Spain. Vol. 238. Valencia: PMLR, 2024. P. 4213–4221.

Added: June 22, 2024

Model-free Posterior Sampling via Learning Rate Randomization

Tiapkin D. N., Belomestny D. V., Calandriello D. et al., in: Advances in Neural Information Processing Systems 36 (NeurIPS 2023). Curran Associates, Inc., 2023. P. 73719–73774.

Added: February 17, 2024

Reinforcement Procedure for Randomized Machine Learning

Yuri S. Popkov, Dubnov Yu. A., Alexey Yu. Popkov, Mathematics 2023 Vol. 11 No. 17 Article 3651

This paper is devoted to problem-oriented reinforcement methods for the numerical implementation of Randomized Machine Learning. We have developed a scheme of the reinforcement procedure based on the agent approach and Bellman’s optimality principle. This procedure ensures strictly monotonic properties of a sequence of local records in the iterative computational procedure of the learning process. ...

Added: February 5, 2024

Fast Rates for Maximum Entropy Exploration

Tiapkin D. N., Belomestny D. V., Calandriello D. et al., in: Proceedings of the 40th International Conference on Machine Learning, 23-29 July 2023, Honolulu, Hawaii, USA. Vol. 202. PMLR, 2023. P. 34161–34221.

Added: December 1, 2023

Sharp Deviations Bounds for Dirichlet Weighted Sums with Application to analysis of Bayesian algorithms

Tiapkin D. N., Belomestny D. V., Naumov A. A. et al., Working papers by Cornell University. Series math "arxiv.org" 2023 Article 2304.03056

In this work, we derive sharp non-asymptotic deviation bounds for weighted sums of Dirichlet random variables. These bounds are based on a novel integral representation of the density of a weighted Dirichlet sum. This representation allows us to obtain a Gaussian-like approximation for the sum distribution using geometry and complex analysis methods. Our results generalize ...

Added: June 28, 2023

Variance Reduction for Policy-Gradient Methods via Empirical Variance Minimization

Belomestny D. V., Kaledin M. L., Golubev A. 2022.

Added: April 14, 2023

A note on observational equivalence of micro assumptions on macro level

Ponomarenko A. A., Economics: The Open-Access, Open-Assessment E-Journal 2020 Vol. 14 P. 1–15

The author sets up a simplistic agent-based model in which agents learn by reinforcement while observing an incomplete set of variables. The model is employed to generate an artificial dataset that is used to estimate standard macro econometric models. The author shows that the results are qualitatively indistinguishable (in terms of the signs and significances of the ...

Added: March 28, 2023

Ambiguous tDCS: variability of the transcranial direct current stimulation effects in a reinforcement learning task

Anastasia Grigoreva, Aleksei Gorin, Valeriy Klyuchnikov et al., Brain Stimulation 2023 Vol. 16 No. 1 P. 273

Added: March 1, 2023

Optimistic Posterior Sampling for Reinforcement Learning with Few Samples and Tight Guarantees

Tiapkin D. N., Belomestny D. V., Calandriello D. et al., in: Thirty-Sixth Conference on Neural Information Processing Systems: NeurIPS 2022. Curran Associates, Inc., 2022. P. 10737–10751.

Added: February 3, 2023

Massive MIMO Adaptive Modulation and Coding Using Online Deep Learning Algorithm

Bobrov E., Kropotov Dmitry, Lu H. et al., IEEE Communications Letters 2022 Vol. 26 No. 4 P. 818–822

Added: October 26, 2022

Primal-Dual Stochastic Mirror Descent for MDPs

Tiapkin D. N., Alexander Gasnikov, in: Proceedings of The 25th International Conference on Artificial Intelligence and Statistics, 28-30 March 2022, A Virtual Conference. Vol. 151. PMLR, 2022. P. 9723–9740.

Added: October 16, 2022

From Dirichlet to Rubin: Optimistic Exploration in RL without Bonuses

Tiapkin D. N., Belomestny D. V., Moulines E. et al., in: Proceedings of the 39th International Conference on Machine Learning. Vol. 162. PMLR, 2022. P. 21380–21431.

Added: July 11, 2022

A Survey of Neural Network Methods for Code Analysis and Generation

S. M. Avdoshin, G. A. Arutyunov, Informatsionnye Tekhnologii (Information Technologies) 2022 Vol. 28 No. 7 P. 378–391

During the pandemic, the shortage of personnel in the information technology sector has become more pressing than ever. According to analysts' estimates, in 2021 Russia was short of between 500 thousand and 1 million IT specialists. Training that many specialists and bringing them to the market may take years. The question of optimizing the process of creating IT solutions is very acute, including through the development of ...

Added: June 11, 2022

Learning under social versus nonsocial uncertainty: A meta-analytic approach

Martinez-Saito M., Gorina E., Human Brain Mapping 2022 Vol. 43 No. 13 P. 4185–4206

Added: May 27, 2022