Abstract:

In this talk, Will Grathwohl will discuss two of his works on Energy-Based Models (EBMs). EBMs are a class of probabilistic generative models, alongside normalizing flows, variational auto-encoders, autoregressive models, and GANs. EBMs explicitly parameterize the distribution’s unnormalized log-probability function, ignoring the normalizing constant entirely. This gives immense flexibility when designing EBMs, but it comes at the (very high) cost of making likelihood computation and sampling generally intractable.
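
To make the parameterization concrete, here is a minimal sketch (the network and notation below are illustrative assumptions, not material from the talk): any scalar-valued network f(x) defines an EBM with log p(x) = f(x) - log Z, where the normalizing constant Z = ∫ exp(f(x)) dx is the intractable part.

    import torch
    import torch.nn as nn

    # Hypothetical energy network: any architecture mapping an input to a
    # single scalar defines an EBM via log p(x) = f(x) - log Z. The
    # normalizing constant Z = integral of exp(f(x)) dx is intractable.
    class EnergyNet(nn.Module):
        def __init__(self, dim):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(dim, 256), nn.ReLU(), nn.Linear(256, 1))

        def forward(self, x):
            return self.net(x).squeeze(-1)  # unnormalized log-probability f(x)

    f = EnergyNet(dim=784)
    x = torch.randn(32, 784)
    unnormalized_log_prob = f(x)  # exact log p(x) would also require -log Z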

The first work he will discuss is titled “Your Classifier Is Secretly an Energy-Based Model and You Should Treat It Like One” (published at ICLR 2020). This work demonstrates how the flexibility of EBMs allows us to reinterpret existing neural network classifier architectures as generative models, imbuing them with new properties and yielding benefits in out-of-distribution detection, uncertainty quantification, adversarial robustness, and semi-supervised learning.
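
The core reinterpretation can be sketched in a few lines (a simplified illustration under assumed notation, not the paper’s exact implementation): the logits f(x)[y] of a standard classifier already define a joint unnormalized density p(x, y) ∝ exp(f(x)[y]), so summing out y gives an unnormalized log p(x), while the usual softmax classifier is recovered as p(y | x).

    import torch

    def unnormalized_log_px(logits):
        # logits f(x)[y] of an ordinary classifier, shape (batch, num_classes);
        # treating p(x, y) as proportional to exp(f(x)[y]) gives
        # log p(x) = logsumexp_y f(x)[y] - log Z, with the -log Z term dropped.
        return torch.logsumexp(logits, dim=-1)

    def log_p_y_given_x(logits):
        # the standard softmax classifier falls out as p(y | x) = p(x, y) / p(x)
        return logits - torch.logsumexp(logits, dim=-1, keepdim=True)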

The next work, “Oops I Took a Gradient: Scalable Sampling for Discrete Distributions” (published at ICML 2021), addresses a key limitation of EBMs. Much recent work on scalable EBMs applies only to continuous data, because many EBM training methods rely on gradients of the log-likelihood with respect to the input. In discrete settings this gradient is not defined, so the scalability of these models on discrete data lags significantly behind the continuous setting. This work proposes a new (and shockingly simple) MCMC sampler for general discrete distributions that notably outperforms prior approaches while remaining widely applicable. The sampler accelerates sampling from many important discrete distributions and enables deep EBMs to be scaled to high-dimensional discrete datasets.
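
For binary data, the idea can be sketched as follows (a simplified, assumption-laden rendering, with hypothetical function names): the input gradient of log p at the current state estimates how much each bit flip would change the log-probability, those estimates define a proposal over which coordinate to flip, and a Metropolis-Hastings correction keeps the chain exact.

    import torch

    def gradient_informed_flip_step(x, log_prob):
        # x: (batch, dim) tensor of {0, 1} values; log_prob: a differentiable
        # unnormalized log-probability function, e.g. a deep EBM.
        x = x.detach().requires_grad_(True)
        grad = torch.autograd.grad(log_prob(x).sum(), x)[0]

        # First-order estimate of the change in log p from flipping each bit,
        # used as a proposal over which coordinate to flip.
        delta = -(2.0 * x - 1.0) * grad
        q_fwd = torch.distributions.Categorical(logits=delta / 2.0)
        idx = q_fwd.sample()
        flip = torch.nn.functional.one_hot(idx, x.shape[-1]).to(x.dtype)
        x_new = ((x + flip) % 2.0).detach().requires_grad_(True)

        # Reverse proposal for the Metropolis-Hastings correction.
        grad_new = torch.autograd.grad(log_prob(x_new).sum(), x_new)[0]
        delta_new = -(2.0 * x_new - 1.0) * grad_new
        q_rev = torch.distributions.Categorical(logits=delta_new / 2.0)

        log_alpha = (log_prob(x_new) - log_prob(x)
                     + q_rev.log_prob(idx) - q_fwd.log_prob(idx))
        accept = (torch.rand_like(log_alpha).log() < log_alpha).to(x.dtype)
        accept = accept.unsqueeze(-1)
        return (accept * x_new + (1.0 - accept) * x).detach()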

Biography:

Will Grathwohl is a research scientist at DeepMind in New York City. His research focuses on methods and applications for large-scale generative models. More specifically, he is interested in making theoretical and methodological improvements to current generative models that will make them more useful and easier to apply to important problems in the natural sciences and in general intelligence. Will received his PhD in Computer Science from the University of Toronto in 2021 under the supervision of David Duvenaud and Richard Zemel. Prior to graduate school, Will worked on machine learning applications in the tech industry and at Lawrence Livermore National Laboratory. Before that, he received his bachelor's degree in mathematics from MIT in 2014.