Analysis of a Company Model in Conditions of Unstable Demand Using Reinforcement Learning Methods
Profit is one of the most important economic indicators of a company’s performance, and every company must allocate its resources so as to obtain the maximum possible profit. Profit maximization is usually a dynamic optimization problem. This article discusses an approach to solving the production expansion problem using reinforcement learning (RL) methods. The task is to determine the company’s long-term strategy: how to use its resources to expand production so as to maximize profit in the long term. To examine whether RL algorithms can be applied to problems of this kind, the paper compares the analytical solution obtained with classical methods of optimal control theory against the results of RL models. Both approaches are also applied to the production expansion problem under unstable demand, and the RL methods are shown to converge to the analytical solution. The comparison between the RL algorithms and the classical methods is based on the final profit that the company receives over a 10-year horizon.