Partial differential equations (PDEs) are a dominant modelling paradigm ubiquitous throughout science, from fluid dynamics and quantum mechanics to mathematical biology and quantitative finance. Solving high-dimensional PDEs is a major challenge in scientific computing, mainly because the cost of classical numerical methods, such as finite differences and finite elements, grows exponentially with the dimension of the problem when executed on fine computational grids.
Recent advances in Machine Learning (ML) have enabled the development of novel computational techniques for tackling PDE-based problems considered unsolvable with classical methods. Physics-informed neural networks, together with the Deep Galerkin and Deep BSDE methods, are among the most popular Deep Learning-based PDE solvers recently proposed in the field. Kernel methods, and their connections to neural networks, provide a complementary set of tools for solving challenging PDEs, as well as a convenient framework for analysing their theoretical properties, in particular consistency, stability, and convergence rates.
The aim of this workshop is to consolidate the academic links established on the occasion of the first edition of the workshop and to continue discussing recent advances in machine learning for PDEs, at both the practical and theoretical level, as well as promising opportunities for future research. The inter-disciplinary nature of the network provides an all-in-one approach to building new numerical schemes, analysing their theoretical properties, and investigating their optimal implementation on recent computer hardware advances, such as GPUs and NISQ-type quantum computers.
The workshop will take place in the Translation and Innovation Hub Building (I-Hub), White City campus, in room CR1/CR2 (ground floor) on Monday 3 April from 13:00 to 18:00 and in room IX5 (5th floor) on Tuesday 4 April from 9:00 to 18:00.
Organisers
Antoine Jacquier, Panos Parpas, Johannes Ruf, Cris Salvi.
Please contact the organisers if you are interested in attending the workshop.
Schedule
Monday 3 April
12:45-13:00
Welcome Speech
13:00-13:45
Christa Cuchiero: Global universal approximation of functional input maps on weighted spaces
13:45-14:30
Deqing Jiang: Convergence of Deep Galerkin Method’s Limit
14:30-15:15
Zan Zuric: A random neural network approach to pricing SPDEs for rough volatility
Coffee Break
15:45-16:30
Yufei Zhang: Convergence of policy gradient methods for finite-horizon stochastic linear-quadratic control problems
16:30-17:15
Tamara Grossmann: Can Physics-informed Neural Networks beat the Finite Element Method?
Tuesday 4 April
09:00-09:45
Michael Giegrich: Nearest-Neighbour Resampling for Off-Policy Policy Evaluation in Stochastic Environments
09:45-10:30
Camilo Garcia Trillos: A Deep Neural Network solver for backward stochastic Volterra integral equations
Coffee Break
11:00-11:45
Sam Cohen: Optimal Control with Online Learning
11:45-12:30
Lisa Maria Kreusser: Generalised eikonal equations on graphs with applications to semi-supervised learning
Lunch Break
13:45-14:30
Alexandre Pannier: Solving path-dependent PDEs with signature kernels
14:30-15:15
Alexander Lobbe: Deep Learning Algorithm for the Nonlinear Stochastic Filtering Problem: A Case Study for the Benes Filter
Coffee Break
15:30-16:15
Ariel Neufeld: Deep Learning based algorithm for nonlinear PDEs in finance and gradient descent type algorithm for non-convex stochastic optimization problems with ReLU neural networks
16:15-17:00
James Rowbottom: Physics Inspired Graph Neural Networks
19:00-22:00
Workshop Dinner (by invitation only)
Titles and abstracts
Yufei Zhang (London School of Economics)
Title: Convergence of policy gradient methods for finite-horizon stochastic linear-quadratic control problems
Abstract: Recently, policy gradient (PG) methods have attracted substantial research interest. Much of the attention and success, however, has been for the discrete-time setting. Convergence of PG methods for controlled diffusions remains a challenging and open problem. This work proves the convergence of PG methods for finite-horizon linear-quadratic control problems. We consider a continuous-time Gaussian policy whose mean is linear in the state variable and whose covariance is state-independent. We propose geometry-aware gradient descents for the mean and covariance of the policy using the Fisher geometry and the Bures-Wasserstein geometry, respectively. The policy iterates are shown to converge globally to the optimal policy with a linear rate. We further propose a novel PG method with discrete-time policies. The algorithm leverages the continuous-time analysis, and achieves a robust linear convergence across different action frequencies. A numerical experiment confirms the convergence and robustness of the proposed algorithm.
Christa Cuchiero (University of Vienna)
Title: Global universal approximation of functional input maps on weighted spaces
Abstract: We introduce so-called functional input neural networks defined on an infinite dimensional weighted space. To this end, we use an additive family as hidden layer maps and a non-linear activation function applied to each hidden layer. Relying on Stone-Weierstrass theorems on weighted spaces, we can prove a global universal approximation result for generalizations of continuous functions going beyond the usual approximation on compact sets. This then applies in particular to approximation of (non-anticipative) path space functionals via functional input neural networks, but also via linear maps of the signature of the respective paths. As an application, we use functional input neural networks to learn the solution operator of certain PDEs corresponding to pricing operators of financial derivatives. The talk is based on joint work with Philipp Schmocker and Josef Teichmann.
Camilo Garcia Trillos (University College London)
Title: A Deep Neural Network solver for backward stochastic Volterra integral equations
Abstract: Backward Stochastic Volterra Integral Equations (BSVIEs) are extensions of Backward Stochastic Differential Equations (BSDEs). They have played an important role in the solution of several problems in stochastic control (particularly connected to time-inconsistent control) and finance (associated with dynamic risk measures and arbitrage-free pricing in some specific settings). Despite their theoretical relevance, few works have been devoted to finding and studying a fully implementable solution to BSVIEs, particularly to the so-called Type-II BSVIEs. In the talk, we will discuss a fully implementable method based on deep neural networks.
James Rowbottom (University of Cambridge)
Title: Physics Inspired Graph Neural Networks
Abstract: In this talk, we explore recent developments in graph neural networks (GNNs) inspired by physics and dynamical systems. We present Graph Neural Diffusion (GRAND), which approaches deep learning on graphs as a continuous diffusion process and treats GNNs as discretizations of an underlying PDE. We also propose a novel class of GNNs based on the discretized Beltrami flow and Graph-Coupled Oscillator Networks (GraphCON) based on a second-order system of ordinary differential equations. Lastly, we propose a GNN framework that parametrizes and learns an energy functional and takes the GNN equations to be the gradient flow of such energy. The physics inspiration provides a useful inductive bias that can overcome common GNN problems such as oversmoothing and heterophily, and our approaches achieve state-of-the-art or competitive results on several standard graph benchmarks.
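As a hedged illustration of the continuous-diffusion viewpoint behind GRAND (the graph, features, step size, and horizon below are invented for this sketch, not taken from the talk), one can integrate the graph diffusion ODE dx/dt = (A_norm - I)x with explicit Euler and observe the features approaching consensus:

```python
import numpy as np

# Toy adjacency matrix (illustrative, not from the talk)
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
deg = A.sum(axis=1)
A_norm = A / deg[:, None]                   # random-walk normalisation

x = np.array([[1.0], [0.0], [0.0], [0.0]])  # scalar feature per node
dt, steps = 0.1, 200
for _ in range(steps):
    x = x + dt * (A_norm @ x - x)           # Euler step of dx/dt = (A_norm - I) x

# Diffusion drives the features towards a consensus value; left unchecked, this
# is exactly the oversmoothing that the physics-inspired models aim to control.
spread = float(x.max() - x.min())
```

Running the diffusion to large time makes the spread of node features vanish, which is why these models couple diffusion with additional structure (oscillators, learned energies) rather than using it raw.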
Lisa Maria Kreusser (University of Bath)
Title: Generalised eikonal equations on graphs with applications to semi-supervised learning
Abstract: Many computational methods for semi-supervised and unsupervised classification are based on variational models and PDEs. Since shortest path graph distances are widely used in data science and machine learning, it is natural to introduce the concept of information propagation to data classification and semi-supervised learning. The success of eikonal equations in the continuum setting motivates the development of similar tools on graphs. We propose and unify classes of different models for information propagation over graphs, and prove equivalences between them. Motivated by the connection between the first arrival time model and the eikonal equation in the continuum setting, we derive mean field limits for graphs based on uniform grids in Euclidean space under grid refinement. For a specific parameter setting, we demonstrate that the solution on the grid approximates the Euclidean distance. Finally, we illustrate the use of front propagation on graphs for semi-supervised learning.
Alexandre Pannier (Université Paris-Cité)
Title: Solving path-dependent PDEs with signature kernels
Abstract: The goal of this paper is to develop a kernel framework for solving linear and non-linear path-dependent PDEs (PPDEs), which include pricing PDEs under rough volatility models. This approach leverages a recently introduced class of kernels indexed on pathspace, called signature kernels: these allow one to solve an optimal recovery problem, approximating the solution of a PPDE by the element of minimal norm in the (signature) reproducing kernel Hilbert space (RKHS) constrained to satisfy the PPDE at a finite collection of collocation points. In the linear case, by the representer theorem, it can be shown that the optimisation has a unique, analytic solution expressed entirely in terms of simple linear algebra operations. In the non-linear case, the optimal recovery problem can be reformulated as a two-level optimisation that can be solved by minimising a quadratic objective subject to nonlinear constraints. The proposed method comes with convergence guarantees as the number of collocation points increases and is amenable to rigorous error analysis. Finally, we will discuss some motivating examples and present preliminary numerical results for pricing PDEs under the rough Bergomi model.
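The signature-kernel machinery operates on path space, but the optimal-recovery recipe itself can be illustrated in a finite-dimensional toy setting. The sketch below is a minimal analogue, not the paper's method: it applies symmetric kernel collocation and the representer theorem to -u'' = f on (0,1) with zero boundary conditions, with an assumed Gaussian kernel, lengthscale, and collocation grid.

```python
import numpy as np

l = 0.1  # kernel lengthscale (assumed)

def k(x, y):             # Gaussian kernel
    return np.exp(-(x - y) ** 2 / (2 * l ** 2))

def Lk(x, y):            # L k(x, .) with L = -d^2/dy^2 (equals -d^2/dx^2 by symmetry)
    r = x - y
    return (1 / l ** 2 - r ** 2 / l ** 4) * k(x, y)

def LLk(x, y):           # L applied in both arguments of k
    r = x - y
    return (3 / l ** 4 - 6 * r ** 2 / l ** 6 + r ** 4 / l ** 8) * k(x, y)

xi = np.linspace(0.05, 0.95, 19)   # interior collocation points (PDE constraint)
xb = np.array([0.0, 1.0])          # boundary points (Dirichlet constraint)
# Manufactured data: the true solution is u(x) = sin(pi x), so f = pi^2 sin(pi x)
rhs = np.concatenate([np.pi ** 2 * np.sin(np.pi * xi), [0.0, 0.0]])

# Gram matrix of the constraints; by the representer theorem, the minimal-norm
# RKHS element satisfying them is a linear combination of the constraint kernels
G = np.block([
    [LLk(xi[:, None], xi[None, :]), Lk(xi[:, None], xb[None, :])],
    [Lk(xb[:, None], xi[None, :]), k(xb[:, None], xb[None, :])],
])
c = np.linalg.lstsq(G, rhs, rcond=None)[0]  # representer coefficients

def u(x):
    return np.concatenate([Lk(x, xi), k(x, xb)]) @ c

err = abs(u(0.5) - 1.0)   # compare with the exact value sin(pi/2) = 1
```

In the linear case described in the abstract, the whole solve is exactly this kind of Gram-matrix linear algebra, with the Gaussian kernel replaced by a signature kernel on paths.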
Tamara Grossmann (University of Cambridge)
Title: Can Physics-informed Neural Networks beat the Finite Element Method?
Abstract: Partial differential equations play a fundamental role in the mathematical modelling of many processes and systems in physical, biological and other sciences. To simulate such processes and systems, the solutions of PDEs often need to be approximated numerically. The finite element method, for instance, is a standard methodology for doing so. The recent success of deep neural networks at various approximation tasks has motivated their use in the numerical solution of PDEs. These so-called physics-informed neural networks and their variants have been shown to successfully approximate a large range of partial differential equations. So far, physics-informed neural networks and the finite element method have mainly been studied in isolation of each other. In this talk, we compare the methodologies in a systematic computational study. To this end, we employ both methods to numerically solve various linear and nonlinear partial differential equations: Poisson in 1D, 2D, and 3D, Allen-Cahn in 1D, semilinear Schrödinger in 1D and 2D. We then compare computational costs and approximation accuracies.
Ariel Neufeld (Nanyang Technological University)
Title: Deep Learning based algorithm for nonlinear PDEs in finance and gradient descent type algorithm for non-convex stochastic optimization problems with ReLU neural networks
Abstract: In this talk, we first present a deep-learning based algorithm which can solve nonlinear parabolic PDEs in up to 10’000 dimensions with short run times, and apply it to price high-dimensional financial derivatives under default risk. Then, we discuss a general issue arising when training neural networks, namely that it typically involves non-convex stochastic optimization. To that end, we present TUSLA, a gradient descent type algorithm (or, more precisely, a stochastic gradient Langevin dynamics algorithm) for which we can prove that it solves non-convex stochastic optimization problems involving ReLU neural networks. This talk is based on joint works with C. Beck, S. Becker, P. Cheridito, A. Jentzen, and D.-Y. Lim, S. Sabanis, Y. Zhang, respectively.
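As a rough illustration of the stochastic gradient Langevin dynamics idea underlying algorithms of this type (the objective, step size, and inverse temperature below are invented for the sketch and this is not TUSLA itself), injected Gaussian noise lets gradient iterates explore a non-convex landscape instead of stalling in the first basin they reach:

```python
import numpy as np

rng = np.random.default_rng(0)

def grad(theta):
    # Gradient of the non-convex objective f(x) = (x^2 - 1)^2 + 0.3 x,
    # which has two local minima near x = -1 and x = +1 (illustrative choice)
    return 4 * theta * (theta ** 2 - 1) + 0.3

theta = 1.0              # start in the shallower of the two basins
lam, beta = 1e-2, 5.0    # step size and inverse temperature (assumed values)
for _ in range(20000):
    # Langevin step: plain gradient descent plus scaled Gaussian noise
    theta = theta - lam * grad(theta) + np.sqrt(2 * lam / beta) * rng.normal()
```

At low temperature (large beta) the iterates concentrate near minima of the objective; the convergence guarantees mentioned in the abstract quantify this behaviour for the ReLU-network losses treated in the papers.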
Zan Zuric (Imperial College London)
Title: A random neural network approach to pricing SPDEs for rough volatility
Abstract: We propose a novel machine learning-based scheme for solving partial differential equations (PDEs) and backward stochastic partial differential equations (BSPDEs) stemming from the option pricing equations of Markovian and non-Markovian models, respectively. The use of the so-called random weighted neural networks (RWNN) allows us to formulate the optimisation problem as a linear regression, thus immensely speeding up the training process. Furthermore, we analyse the convergence of the RWNN scheme and are able to specify error estimates in terms of the number of hidden nodes. The scheme’s performance is tested on the rBergomi model and shown to have superior training times with accuracy comparable to existing deep learning approaches.
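A minimal sketch of the random-weights idea, assuming a one-hidden-layer tanh network on a toy regression task (the architecture, weight distribution, and data are illustrative, not those of the paper): with the hidden layer sampled once and frozen, training the output layer reduces to a closed-form ridge regression.

```python
import numpy as np

rng = np.random.default_rng(0)

def rwnn_fit(X, y, n_hidden=200, reg=1e-6):
    """Sample and freeze random hidden weights, then solve a ridge
    regression for the output layer (the only trainable part)."""
    d = X.shape[1]
    W = rng.normal(scale=2.0, size=(d, n_hidden))  # frozen random weights
    b = rng.normal(size=n_hidden)                  # frozen random biases
    H = np.tanh(X @ W + b)                         # random feature map
    beta = np.linalg.solve(H.T @ H + reg * np.eye(n_hidden), H.T @ y)
    return W, b, beta

def rwnn_predict(X, W, b, beta):
    return np.tanh(X @ W + b) @ beta

# Toy regression task (illustrative, unrelated to the pricing (B)SPDEs)
X = rng.uniform(-1, 1, size=(500, 1))
y = np.sin(3 * X[:, 0])
W, b, beta = rwnn_fit(X, y)
err = float(np.max(np.abs(rwnn_predict(X, W, b, beta) - y)))
```

Replacing iterative gradient training with a single linear solve is what gives the scheme its training-time advantage, at the cost of keeping the hidden representation random.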
Alexander Lobbe (Imperial College London)
Title: Deep Learning Algorithm for the Nonlinear Stochastic Filtering Problem: A Case Study for the Benes Filter
Abstract: The filtering stochastic partial differential equations (SPDEs) arise as the evolution equations for the conditional distribution of an underlying signal given partial, and possibly noisy, observations. In this talk we present a numerical method based on a mesh-free neural network representation of the density of the solution of the filtering model achieved by deep learning. Based on the classical SPDE splitting method, the algorithm includes a recursive normalisation procedure to recover the normalised conditional distribution of the signal process. Within the analytically tractable setting of the Benes filter, we present a study of the algorithm. We also discuss a domain adaptation procedure, through which the domain of resolution moves together with the support of the posterior to preserve the numerical resolution.
Michael Giegrich (University of Oxford)
Title: Nearest-Neighbour Resampling for Off-Policy Policy Evaluation in Stochastic Environments
Abstract: In reinforcement learning, off-policy policy evaluation deals with the problem of estimating the value of a target policy with observations generated from a different sampling policy. We propose a novel procedure for off-policy policy evaluation based on a nearest-neighbour resampling algorithm in environments with continuous state-action spaces, system-inherent stochasticity effected by control decisions, and potential risk-sensitivity. Assuming continuity in the rewards and the state transitions, the procedure exploits that similar state/action pairs (in a metric sense) are associated with similar rewards and state transitions. This enables the procedure to tackle the counterfactual estimation problem underlying off-policy policy evaluation. The resampling algorithm mimics a Monte Carlo method without explicitly assuming a parametric model. Furthermore, the Monte Carlo-like behaviour also allows for the estimation of path-dependent quantities of interest often related to risk quantification. Being based on a nearest-neighbour search, the algorithm primarily relies on the metric structure of the underlying problem; it does not require optimization and can be efficiently implemented via parallelization. Viewing the algorithm as subsampling a local averaging regression and leveraging a random dynamical system formulation, one can establish a consistency result for the algorithm in an episodic setting. These theoretical findings can be considered of independent interest as they provide general consistency conditions for locally weighted regression approaches for model-based off-policy policy evaluation and the first theoretical results for a more general class of nearest-neighbour bootstrapping algorithms for the simulation of random dynamical systems. Finally, we conduct numerical experiments to show the effectiveness of the algorithm in environments with a variety of control mechanisms, metric structures and fields of application.
Joint work with Roel Ohmen and Christoph Reisinger.
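A toy sketch of the nearest-neighbour resampling idea (the dynamics, policies, reward, and horizon are all invented for illustration): trajectories under the target policy are stitched together from logged transitions whose state/action pairs are closest, in a metric sense, to the current pair.

```python
import numpy as np

rng = np.random.default_rng(1)

def step(s, a):
    # True (unknown to the evaluator) dynamics: a noisy scalar linear system
    return 0.9 * s + 0.1 * a + 0.05 * rng.normal()

behaviour = lambda s: -0.5 * s + 0.3 * rng.normal()  # logging policy
target = lambda s: -s                                 # policy to evaluate

# Log one-step transitions (state, action, reward, next state) under behaviour
S = rng.uniform(-1, 1, size=5000)
A = np.array([behaviour(s) for s in S])
R = -(S ** 2 + 0.1 * A ** 2)                          # quadratic cost as reward
S1 = np.array([step(s, a) for s, a in zip(S, A)])

def resampled_return(s0, horizon=10):
    """Estimate the target policy's return from s0 by repeatedly jumping to
    the logged transition nearest to the current (state, target action)."""
    s, total = s0, 0.0
    for _ in range(horizon):
        a = target(s)
        j = int(np.argmin((S - s) ** 2 + (A - a) ** 2))  # nearest logged pair
        total += R[j]
        s = S1[j]                                        # reuse its transition
    return total

starts = rng.uniform(-1, 1, size=50)
est = float(np.mean([resampled_return(s0) for s0 in starts]))
```

Because whole resampled paths are produced rather than just expectations, path-dependent and risk-sensitive quantities can be estimated from the same output, as the abstract notes.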
Sam Cohen (University of Oxford)
Title: Optimal Control with Online Learning
Abstract: If one takes a Bayesian view, optimal control with model uncertainty can be theoretically reduced to classical optimal control. The key difficulty is that the state space for the control problem is typically very large, leading to numerically intractable problems. In this talk, we will see that this view is nevertheless productive, as one can then exploit asymptotic expansions for the control problem to yield a computationally efficient and flexible algorithm, which performs well in practice. We will consider applications of this approach to multiarmed bandit problems, which include controlled learning as a key part of the optimal control problem, and can be used as models of interesting problems in finance.
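One concrete instance of the Bayesian view of learning-while-controlling mentioned here is Thompson sampling for a Bernoulli multi-armed bandit (the arm probabilities and horizon below are illustrative, and this is a standard baseline rather than the talk's algorithm): the agent maintains a Beta posterior per arm and acts greedily on a posterior sample.

```python
import numpy as np

rng = np.random.default_rng(0)

p_true = np.array([0.3, 0.5, 0.7])   # unknown arm success probabilities
alpha = np.ones(3)                    # Beta(1, 1) prior parameters per arm
beta = np.ones(3)
pulls = np.zeros(3, dtype=int)

for _ in range(5000):
    sample = rng.beta(alpha, beta)    # one posterior sample per arm
    arm = int(np.argmax(sample))      # act greedily on the sample
    reward = float(rng.random() < p_true[arm])
    alpha[arm] += reward              # conjugate Bayesian update
    beta[arm] += 1.0 - reward
    pulls[arm] += 1

best_share = pulls[2] / pulls.sum()   # fraction of pulls on the best arm
```

Here the posterior parameters play the role of the enlarged state space the abstract refers to: even for this simple bandit the "state" is a belief, which is what makes exact Bayesian control expensive in richer models.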
Deqing Jiang (University of Oxford)
Title: Convergence of Deep Galerkin Method’s Limit
Abstract: A variety of deep-learning-based PDE-solving methods have been developed in the past decade. The deep Galerkin method (DGM) is one of the most commonly used benchmarks among these methods. In this session, we analyze the behavior of the approximator trained by DGM. We prove that as the number of hidden units goes to infinity, the approximator’s trajectory converges to an infinite-dimensional ordinary differential equation (ODE). We show that, despite the lack of a spectral gap in the kernel function, the PDE residual of the approximator’s wide limit converges to zero as training time goes to infinity. Under mild assumptions, this convergence result suggests that the approximator’s limit converges to the solution of the PDE.