Learning To Trade

Deep Hedging
Hans Buehler
Here is a summary, with links, of my talk Learning to Trade at the AI/ML Summit at QuantMinds in Barcelona, December 2021; of the Bachelier Conference talk, June 2022; and of the TU Munich material, summer 2022.

The ambition of learning to trade is to move away from classic model-driven pricing and hedging methodologies and to focus instead on the real-life performance of our now data-driven models, under transaction cost and trading restrictions across time. Our methods no longer rely on greeks as the primary risk management tool. Putting ML as a tool aside, intuitively we try to ask what would have worked in the past, with due consideration of risk vs return.

The core of our approach is not new - indeed, there is a wide range of literature on how to move beyond the classic assumption of full replication of a derivative and assess the impact of transaction cost and market frictions; here are some thoughts from my two university supervisors. What is new is that we now have the computational tools to solve such problems efficiently.

Essentially, we propose to solve the original hedging problem using data and AI. In some cases, it is sufficient to use rather straightforward 'regression' methods, which we call Statistical Hedging. For products which have no observable market prices, or for portfolios with path dependence, we have developed Deep Hedging. Vanilla Deep Hedging requires simulators of the market (e.g. stock and implied volatility) to generate sufficient synthetic data. Such simulators are once again built with modern quant finance and AI methods.
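As a minimal sketch of the 'regression' idea, assuming toy simulated data and a plain variance objective (the function name, parameters and data below are illustrative assumptions, not taken from the papers): regress the one-step P&L of the position on the one-step P&L of the hedging instruments, and hold the negative of the fitted coefficients.

```python
import numpy as np

def regression_hedge(dV: np.ndarray, dH: np.ndarray) -> np.ndarray:
    """Return the hedge h minimising the sample variance of dV + dH @ h.

    dV: (n,)   one-step P&L of the position to be hedged.
    dH: (n, k) one-step P&L of one unit of each of the k hedging instruments.
    """
    # Centre both sides so that the least-squares fit minimises variance.
    dVc = dV - dV.mean()
    dHc = dH - dH.mean(axis=0)
    beta, *_ = np.linalg.lstsq(dHc, dVc, rcond=None)
    return -beta  # hold the opposite of the fitted sensitivity

# Toy data (assumption): the position moves with sensitivity 0.6 to the
# hedging instrument, plus unhedgeable noise.
rng = np.random.default_rng(0)
n = 10_000
dS = rng.normal(0.0, 1.0, size=n)
dV = 0.6 * dS + rng.normal(0.0, 0.2, size=n)

h = regression_hedge(dV, dS[:, None])
hedged = dV + dS * h[0]
print("hedge position:", h[0])
print("std unhedged:", dV.std(), "std hedged:", hedged.std())
```

Variance is used here only because it has a closed-form least-squares solution; other objective functions would need a numerical optimiser.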

Market data, simulated or played back, will naively pick up the historic drifts in our spot and option markets. However, the drift is well known to be subject to model uncertainty. Unless this is explicitly modelled (e.g. model alpha), we propose to increase robustness by finding a close martingale measure, and then finding an optimal hedging strategy using Deep Hedging.
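To illustrate what "finding a close martingale measure" can mean, here is a one-period, single-asset toy sketch. It assumes an exponential utility and simulated Gaussian returns, and it is not the multi-period method of the papers; the function name and all parameters are hypothetical. The samples are reweighted by an exponential tilt chosen so that the weighted mean return is zero.

```python
import numpy as np

def remove_drift(dS: np.ndarray, n_iter: int = 50, tol: float = 1e-12):
    """Reweight samples of dS so that their weighted mean is zero.

    Minimises mean(exp(-a * dS)) over the scalar a with Newton's method.
    At the minimiser a*, the first-order condition mean(dS * exp(-a* dS)) = 0
    says that dS has zero mean under weights proportional to exp(-a* dS).
    """
    a = 0.0
    for _ in range(n_iter):
        w = np.exp(-a * dS)
        grad = -np.mean(dS * w)      # derivative of the objective in a
        hess = np.mean(dS**2 * w)    # second derivative, strictly positive
        step = grad / hess
        a -= step
        if abs(step) < tol:
            break
    q = np.exp(-a * dS)
    q /= q.sum()                     # normalise to probability weights
    return a, q

# Toy returns with a positive drift under the statistical measure (assumption).
rng = np.random.default_rng(1)
dS = rng.normal(0.05, 0.2, size=50_000)

a_star, q = remove_drift(dS)
print("mean under original weights:", dS.mean())
print("mean under new weights:     ", np.sum(q * dS))
```

In this toy setting the tilt parameter is also the optimal position of an exponential-utility investor trading on the drift, so removing the drift amounts to removing the incentive for that trade.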

Our approach is rather generic and applicable to most commonly traded derivatives. It is not limited to equities markets.

The original Deep Hedging approach essentially uses a Monte Carlo "periodic policy search" algorithm. That means that we have to re-train our agent networks any time our portfolio or the market changes sufficiently. The working paper Deep Bellman Hedging addresses this by expanding on our patent application beyond the entropy, and, importantly, by providing an implementable and realistic representation for "any" portfolio of derivatives. The latter is one of the most challenging aspects of dynamic programming for Deep Hedging.
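As a sketch of the kind of objective such a policy search optimizes, under notation that is assumed here for illustration and not defined in this summary:

```latex
% Notation (assumed for illustration only):
%   Z      terminal payoff of the portfolio to be hedged
%   H_t    prices of the hedging instruments at time t
%   a_t    position held over [t, t+1), with a_{-1} = 0, given by a
%          network a_t = a_\theta(s_t) of the observable state s_t
%   c_t    transaction cost of changing the position at time t
%   U      monetary utility, e.g. the entropy with risk aversion \lambda > 0
\sup_{\theta} \; U\!\Big( Z + \sum_{t=0}^{T-1} \big[ a_t \cdot (H_{t+1} - H_t) - c_t(a_t - a_{t-1}) \big] \Big),
\qquad
U(X) = -\frac{1}{\lambda} \log \mathbb{E}\big[ e^{-\lambda X} \big].
```

In a Monte Carlo implementation the expectation is replaced by an average over simulated paths and the parameters are fitted by stochastic gradient methods. Because the portfolio payoff and the initial state enter the objective directly, the fitted parameters are specific to them, which is why re-training is needed when either changes.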

Lecture Notes

The following lecture notes summarize the overall Learning to Trade topic:

GitHub

GitHub repository for Deep Hedging


  • Statistical Hedging, 2013: initial research on "regression"-based local P&L hedging, suitable objective functions, and the link to greek hedging. It also discusses statistical arbitrage.
    Statistical Hedging is a robust, model-free approach to creating automated hedging using existing (classic) valuation models. It no longer uses the Greeks, and naturally captures co-movements of relevant market parameters. If you read nothing else, start with the section "Where did my Greeks go".

  • Deep Hedging, 2019: extension from local hedging to full path hedging; validation that the method recovers theoretically known results, and that machine learning tools are very well suited for this approach. A recent patent also covers a 'Bellman' approach.

  • Generating financial markets with signatures, 2020: focus on efficient market simulation for paths based on small data environments, with signatures.

  • Learning to Simulate Equity Option Markets, 2019, and our more recent Multi-Asset Spot and Option Market Simulation, 2021: building market simulators for the implied volatility surface, based on our Discrete Local Volatility parametrization.

  • Deep hedging: learning to remove the drift, 2022: market simulators under the statistical measure will have statistical arbitrage, which means trading strategies constructed under such a simulator will be a mix of hedging and arbitrage strategies. This paper shows how to find a close risk-neutral measure to remove statistical arbitrage and improve the robustness of the hedge.
    This also means that it allows constructing a Stochastic Implied Volatility model from a machine-learned model of implied volatility.
    The arXiv version is here.

  • Deep Bellman Hedging, 2022 is a first version of a dynamic programming approach to hedging a portfolio of arbitrary financial products with derivatives, under transaction cost, with continuous state space.


Podcasts and presentations


Publicly available Presentations

Dr. Hans Buehler (connect via linkedin or have a look at my website)