Model-Free Reinforcement Learning for Financial Portfolios: A Brief Survey. (arXiv:1904.04973v2 [q-fin.PM] UPDATED)

Sun, 05 May 2019 19:40:22 GMT

Financial portfolio management is one of the most frequently encountered
problems in the investment industry. Nevertheless, it is not widely
recognized that both the Kelly Criterion and Risk Parity collapse into Mean
Variance under some conditions, which implies that a universal solution to the
portfolio optimization problem could potentially exist. In fact, the process of
sequential computation of optimal component weights that maximize the
portfolio's expected return subject to a certain risk budget can be
reformulated as a discrete-time Markov Decision Process (MDP) and hence as a
stochastic optimal control problem, where the system being controlled is a portfolio
consisting of multiple investment components, and the control is its component
weights. Consequently, the problem could be solved using model-free
Reinforcement Learning (RL) without knowing specific component dynamics. By
examining existing methods of both value-based and policy-based model-free RL
for the portfolio optimization problem, we identify some of the key unresolved
questions and difficulties that today's portfolio managers face in applying
model-free RL to their investment portfolios.
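
To make the MDP reformulation described above concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the survey's own formulation: the class `PortfolioEnv`, the helpers `run_episode` and `equal_weight`, the choice of state (a rolling window of past component returns), the long-only constraint, and the log-growth reward are all hypothetical. The risk budget mentioned in the abstract is not modeled here; it would have to enter through the reward or through a constraint on the weights.

```python
import math
import random


class PortfolioEnv:
    """Toy discrete-time portfolio MDP (illustrative assumption only)."""

    def __init__(self, returns, window=3):
        # returns[t][i] is the simple return of component i over period t.
        if len(returns) <= window:
            raise ValueError("need more return periods than the window length")
        self.returns = returns
        self.window = window
        self.t = window

    def reset(self):
        # Start at the first period that has a full window of history.
        self.t = self.window
        return self._state()

    def _state(self):
        # State: the most recent `window` return vectors (a hypothetical choice).
        return self.returns[self.t - self.window:self.t]

    def step(self, weights):
        # Action (the control): long-only component weights that sum to 1.
        period = self.returns[self.t]
        if len(weights) != len(period):
            raise ValueError("one weight per component is required")
        if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
            raise ValueError("weights must be non-negative and sum to 1")
        portfolio_return = sum(w * r for w, r in zip(weights, period))
        # Reward: one-period log growth of portfolio value (assumed reward).
        reward = math.log1p(portfolio_return)
        self.t += 1
        done = self.t >= len(self.returns)
        next_state = None if done else self._state()
        return next_state, reward, done


def run_episode(env, policy):
    """Roll out one episode and return the cumulative log return."""
    state = env.reset()
    total, done = 0.0, False
    while not done:
        state, reward, done = env.step(policy(state))
        total += reward
    return total


def equal_weight(state):
    # Baseline policy: ignore the state and split capital evenly.
    n = len(state[-1])
    return [1.0 / n] * n


if __name__ == "__main__":
    rng = random.Random(0)  # synthetic returns, not market data
    data = [[rng.gauss(0.0005, 0.01) for _ in range(3)] for _ in range(50)]
    print(run_episode(PortfolioEnv(data), equal_weight))
```

A model-free agent would replace `equal_weight` with a learned policy that maps the observed state to weights, using only the sampled rewards and never an explicit model of the component dynamics.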