Research Projects


Optimisation, Control & Reinforcement Learning

In the context of Process Engineering, there are problems that suffer from three conditions: 1) no precise model of the process under consideration is known (plant-model mismatch), leading to inaccurate predictions and convergence to suboptimal solutions, 2) the process is subject to disturbances, and 3) the system is risk-sensitive, so exploration is inconvenient or dangerous. Chemical reactors and aircraft are simple examples. To address these problems there are two main approaches: knowledge-based schemes (where models are derived from physical, biological and chemical information) and data-driven optimization algorithms. On one hand, knowledge-based methods work very well for problems with disturbances and, under some assumptions, present convergence guarantees; however, they struggle when plant-model mismatch is present (which is the case with most models). On the other hand, data-driven optimization algorithms can handle plant-model mismatch and process disturbances in practice, but they lack convergence guarantees, are data-hungry, and easily violate constraints due to their exploratory nature. Under this predicament we develop new efficient algorithms that combine both approaches through hybrid modelling, statistical modelling (e.g. Gaussian processes) and reinforcement learning.
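To make the hybrid-modelling idea concrete, here is a minimal sketch (not the group's actual algorithm) in which a Gaussian process learns the residual between a simplified knowledge-based model and noisy "plant" measurements, so that corrected predictions account for plant-model mismatch. The rate expressions, noise level and kernel choice are illustrative assumptions.

```python
# Minimal hybrid-modelling sketch: knowledge-based model + GP correction.
# Rate constants, noise level and kernel hyper-parameters are illustrative.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(0)

def model_rate(c):
    # Simplified knowledge-based model: first-order kinetics.
    return 0.5 * c

def plant_rate(c):
    # "True" plant: extra inhibition term plus measurement noise (the mismatch).
    return 0.5 * c - 0.08 * c**2 + rng.normal(0.0, 0.01, size=np.shape(c))

# Collect a few plant measurements and compute the residual w.r.t. the model.
c_data = np.linspace(0.1, 2.0, 15).reshape(-1, 1)
residual = plant_rate(c_data).ravel() - model_rate(c_data).ravel()

# GP on the residual: RBF captures the smooth mismatch, WhiteKernel the noise.
gp = GaussianProcessRegressor(kernel=RBF(length_scale=1.0) + WhiteKernel(1e-4),
                              normalize_y=True)
gp.fit(c_data, residual)

# Hybrid prediction = knowledge-based model + GP correction, with uncertainty.
c_test = np.linspace(0.1, 2.0, 50).reshape(-1, 1)
corr_mean, corr_std = gp.predict(c_test, return_std=True)
hybrid_pred = model_rate(c_test).ravel() + corr_mean
```

The GP posterior standard deviation (corr_std) is what makes such corrections useful for cautious optimization: regions where the mismatch is poorly known can be explored more conservatively.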

[Figure: Model predictive control (MPC) scheme]

Reinforcement Learning (RL) is a subfield of Artificial Intelligence (AI) which trains Machine Learning models to make optimal decisions. This is done in such a way that the model (or agent; or controller in a process engineering domain) learns how to take optimal actions as it explores the environment in which it resides.
RL has received a lot of attention, notable examples are “machines” learning to play board games such as Go, or videogames such as DOTA 2 or Starcraft. Furthermore, game-playing is not all that RL is useful for. In previous work we have already optimized chemical processes under this same philosophy.
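The agent-environment loop described above can be illustrated with a toy example: a tabular Q-learning agent learns, by trial and error, which of three heating levels keeps a simplified "reactor" temperature near a setpoint. The dynamics, rewards and hyper-parameters below are placeholders for illustration, not taken from the publications listed further down.

```python
# Toy agent-environment loop: tabular Q-learning on a discretised temperature.
import numpy as np

rng = np.random.default_rng(1)
n_states, n_actions = 20, 3          # discretised temperature bins, {cool, hold, heat}
Q = np.zeros((n_states, n_actions))
alpha, gamma, eps = 0.1, 0.95, 0.2   # learning rate, discount factor, exploration rate
setpoint = 10                        # target temperature bin

def step(state, action):
    # Toy stochastic dynamics: the action shifts the temperature, plus a disturbance.
    drift = action - 1                            # -1, 0 or +1 temperature bins
    noise = rng.integers(-1, 2)                   # process disturbance
    next_state = int(np.clip(state + drift + noise, 0, n_states - 1))
    reward = -abs(next_state - setpoint)          # penalise deviation from the setpoint
    return next_state, reward

for episode in range(500):
    state = int(rng.integers(0, n_states))
    for t in range(50):
        # Epsilon-greedy exploration: mostly exploit, occasionally try a random action.
        if rng.random() < eps:
            action = int(rng.integers(0, n_actions))
        else:
            action = int(np.argmax(Q[state]))
        next_state, reward = step(state, action)
        # Standard Q-learning update.
        Q[state, action] += alpha * (reward + gamma * Q[next_state].max() - Q[state, action])
        state = next_state
```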

Reinforcement learning was designed to address the optimisation of stochastic dynamic systems. In reality, chemical processes are stochastic (due to process disturbances) and in many cases operated in dynamic mode. The difference from traditional RL as applied in mainstream AI is that RL is data-hungry and does not consider constraints. This is a major drawback given that processes generally have much less data available, while unbounded exploration (without constraints) can be dangerous or costly. This project aims to design new RL algorithms that can be used to optimize complex chemical and biochemical processes that remain unresolved today.
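One common way to make an RL objective constraint-aware, sketched below under stated assumptions, is to penalise the expected return whenever sampled trajectories violate a path constraint more often than an allowed probability (a Monte-Carlo surrogate of a chance constraint). The trajectory format, temperature limit and penalty weight are hypothetical placeholders, not the specific formulation used in the publications below.

```python
# Hedged sketch of a chance-constraint surrogate for an RL objective.
import numpy as np

def chance_penalised_return(trajectories, temp_limit=350.0, alpha=0.05, weight=100.0):
    """trajectories: array of shape (n_samples, horizon, 2) holding [temperature, reward]."""
    temps = trajectories[..., 0]
    rewards = trajectories[..., 1]
    mean_return = rewards.sum(axis=1).mean()
    # Empirical probability that a trajectory violates the temperature limit at any step.
    violation_prob = np.mean(np.any(temps > temp_limit, axis=1))
    # Penalise only the excess over the allowed violation probability alpha.
    return mean_return - weight * max(violation_prob - alpha, 0.0)
```

A policy-search method can then maximise this penalised objective instead of the raw return, which discourages policies whose exploration drives the process outside its safe operating envelope.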

Selected Publications:

  1. Reinforcement Learning for Batch Bioprocess Optimization, 2020 (preprint)
  2. Chance Constrained Policy Optimization for Process Control and Optimization, 2020
  3. Constrained Model-Free Reinforcement Learning for Process Optimization, 2020 (NeurIPS workshop video)
  4. Modifier-Adaptation Schemes Employing Gaussian Processes and Trust Regions for Real-Time Optimization, 2019