Journal of Computational Finance

Risk.net

Hedging of financial derivative contracts via Monte Carlo tree search

Oleg Szehr

  • We interpret derivatives hedging as a “game with the world”, where one player (the Investor) bets on what will happen and the other player (the Market) decides what will happen.
  • Inspired by the success of Monte Carlo Tree Search in specific games and general game play, we introduce this algorithm as a method for hedging in the presence of risk and market friction.
  • We show that Monte Carlo Tree Search is capable of maximizing the utility of the investor’s terminal wealth in a setting where no external pricing information is available and rewards are granted only as a result of contractual cash flows.
  • We observe superior performance compared with the deep Q-network algorithm and comparable performance to “deep-hedging” methods.

The construction of replication strategies for the pricing and hedging of derivative contracts in incomplete markets is a key problem in financial engineering. We interpret this problem as a “game with the world”, where one player (the investor) bets on what will happen and the other player (the market) decides what will happen. Inspired by the success of Monte Carlo tree search (MCTS) in a variety of games and stochastic multiperiod planning problems, we introduce this algorithm as a method for replication in the presence of risk and market friction. Unlike model-free reinforcement learning methods (such as Q-learning), MCTS makes explicit use of an environment model. A market simulator plays the role of this model. Such simulators are frequently used even in the training of model-free methods, but in MCTS the simulator additionally allows the algorithm to plan for the consequences of decisions before actions are executed. We conduct experiments with the AlphaZero variant of MCTS on toy examples of simple market models and derivatives with simple payoff structures. We show that MCTS is capable of maximizing the utility of the investor’s terminal wealth in a setting where no external pricing information is available and rewards are granted only as a result of contractual cash flows. In this setting, we observe that MCTS has superior performance compared with the deep Q-network algorithm and comparable performance to “deep-hedging” methods.
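
To make the “game with the world” concrete, the following is a minimal sketch and not the paper's implementation. It uses plain UCT-style MCTS with random rollouts, whereas the AlphaZero variant used in the experiments replaces rollouts with a neural network. Everything named here is an illustrative assumption: the binomial market simulator, the three allowed stock holdings, the proportional transaction cost, the exponential utility, and the exploration constant `c`. The investor is short one call option, starts with zero wealth, and receives a reward only at maturity, as the utility of terminal wealth after the contractual cash flow.

```python
import math
import random

# Hypothetical toy market (all parameters are illustrative assumptions).
S0, UP, DOWN, P_UP = 100.0, 1.1, 0.9, 0.5
STRIKE, STEPS = 100.0, 3
ACTIONS = (0.0, 0.5, 1.0)     # allowed stock holdings for the investor
COST = 0.01                   # proportional transaction cost (market friction)
RISK_AVERSION = 0.05

def utility(wealth):
    """Exponential utility of terminal wealth (an assumed choice)."""
    return -math.exp(-RISK_AVERSION * wealth)

def step(state, action, rng):
    """Market simulator: the investor picks a holding, the market picks the move."""
    t, price, holding, wealth = state
    wealth -= COST * abs(action - holding) * price        # cost of rebalancing
    new_price = price * (UP if rng.random() < P_UP else DOWN)
    wealth += action * (new_price - price)                # gain on the stock position
    t += 1
    if t == STEPS:
        wealth -= max(new_price - STRIKE, 0.0)            # short call pays out at maturity
    return (t, new_price, action, wealth)

def rollout(state, rng):
    """Play random actions until maturity and return the terminal utility."""
    while state[0] < STEPS:
        state = step(state, rng.choice(ACTIONS), rng)
    return utility(state[3])

def simulate(state, history, tree, rng, c):
    """One MCTS iteration: selection, expansion, rollout, backup."""
    if state[0] == STEPS:
        return utility(state[3])
    if history not in tree:                               # expansion of a new node
        tree[history] = {a: [0, 0.0] for a in ACTIONS}    # action -> [visits, mean value]
        return rollout(state, rng)
    stats = tree[history]
    total = sum(n for n, _ in stats.values())

    def ucb(a):
        n, mean = stats[a]
        if n == 0:
            return math.inf                               # try every action once
        return mean + c * math.sqrt(math.log(total) / n)

    action = max(ACTIONS, key=ucb)
    next_state = step(state, action, rng)                 # market responds
    up = next_state[1] > state[1]
    value = simulate(next_state, history + ((action, up),), tree, rng, c)
    stats[action][0] += 1                                 # backup: running mean
    stats[action][1] += (value - stats[action][1]) / stats[action][0]
    return value

def plan(state, iterations=2000, c=1.0, seed=0):
    """Run MCTS from `state`; return the most visited action and the root statistics."""
    rng = random.Random(seed)
    tree = {}
    for _ in range(iterations):
        simulate(state, (), tree, rng, c)
    root = tree[()]
    return max(ACTIONS, key=lambda a: root[a][0]), root

best_holding, root_stats = plan((0, S0, 0.0, 0.0))
print("suggested initial holding:", best_holding)
```

The exploration constant `c` is not scaled to the range of the utility here, so in practice it would need tuning. Because the tree is indexed by the full history of actions and market moves, the sketch only suits very short horizons.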
