Mission statement
Bayesian inference has become a popular framework for decision-making given its consistent and flexible handling of uncertainty. In this regime, however, the statistician is subject to several surprisingly strong assumptions, which are violated in almost all modern machine learning settings. This is in fact well-understood, and has led to a range of methods which aim to retain characteristics of Bayesian uncertainty quantification without the restrictive assumptions that underpin it. Collectively, this body of work is sometimes referred to as “generalised Bayes”. This name, however, does not capture the main appeal of these conceptual frameworks: by unapologetically endorsing posteriors that lie outside the confines of Bayesian epistemology, they are intrinsically post-Bayesian. This is not a minor difference in semantics, but a major shift in outlook.
This seminar series aims to shed light on the post-Bayesian community’s ongoing work, its successes, and the challenges that lie ahead once we dare to go beyond orthodox Bayesian procedures.
Structure
The seminar will run fortnightly from the end of January onwards. The first iteration of the series will be broken into three ‘chapters’ of 4-6 talks each. Each chapter will focus on a different set of post-Bayesian ideas: generalised Bayes (led by Jeremias Knoblauch), predictive resampling-based ideas like Martingale posteriors (led by Edwin Fong), and PAC-Bayes (led by Pierre Alquier). To make this useful for the entire community, the talks in each chapter will seek to cover the key strands of the literature falling under that chapter’s theme.
The seminars take place on the second and fourth Tuesday of each month, at either 9am-10am GMT or 2pm-3pm GMT depending on speaker availability; the exact slot will be announced closer to the date. You can keep up to date by subscribing to our Google calendar.
Zoom link
Join the Zoom meeting here.
Talks will last 45-50 minutes, with 10-15 minutes for discussion. We will record all talks, and upload them to our YouTube channel. Links to these recordings will appear in the schedule following the talk.
All the information related to the seminar series will be distributed through a mailing list. To join that mailing list, click this link.
Schedule
Tell us what you want (what you really, really want)
To let us know what chapters you would like to see in the future, who you would like to see lead them, or who you would like to hear talk, submit a suggestion through this form and we’ll see what we can do!
Please select one of the chapters below to see their schedule, talk titles and abstracts, and links to recordings and slides.
Chapter 1: generalised Bayes
Chapter 2: predictive Bayes
Chapter 3: PAC-Bayes
Introduction and overview: Jeremias Knoblauch (14:00 GMT @ 11.02.2025)
This talk will serve two purposes. In the first half, I will explain why this seminar series exists, and how it is organised. In particular, I will give some of the reasons why research in statistics and machine learning has increasingly ventured beyond vanilla Bayesian procedures, and where this has led us so far, focusing particularly on generalised Bayes, PAC-Bayes, and resampling-based strategies. I will briefly characterise some of the most fruitful approaches in this area and relate them to the structure of this seminar series. In the second half of the talk, I will zoom in on what will be covered in the first 6 talks of this series: generalised Bayesian inference. I will cover the basics of these ideas, and explain some of the most important directions in the field. I will link these directions to the seminars that will be given in the subsequent weeks.
Slides YouTube
Theoretical foundations: David Frazier (09:00 GMT @ 25.02.2025)
Post-Bayesian belief updates, such as generalized Bayes and Gibbs posteriors, can deliver beliefs that differ markedly from those obtained via classical Bayesian updating. To ensure that such belief updates are useful in practice, we must therefore understand the behavior of these beliefs from a statistical standpoint. Answering questions such as how reliable the inferences obtained from post-Bayesian beliefs are, or how posterior predictives based on these beliefs perform, is integral to the adoption of these methods into the larger toolkit of machine learning and statistics. In this talk, I give a broad overview of the theoretical landscape for generalized and Gibbs posteriors, including which questions have been answered and which remain open. I also give examples of how these theoretical developments can be leveraged to answer interesting questions regarding the choice of learning rate for predictive accuracy, and the impact on inferences when loss functions must be estimated. (A schematic definition of the Gibbs posterior and its learning rate is sketched after the talk links below.)
Slides YouTube
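For readers meeting these objects for the first time, here is a minimal sketch of the Gibbs (generalised Bayes) posterior and its learning rate, in generic notation that is not taken from the talk itself: given a prior $\pi$, a loss $\ell(\theta, x)$ measuring how well parameter $\theta$ fits observation $x$, and a learning rate $\eta > 0$,

$$
\pi_\eta(\theta \mid x_{1:n}) \;\propto\; \pi(\theta)\, \exp\!\Big(-\eta \sum_{i=1}^{n} \ell(\theta, x_i)\Big).
$$

Taking $\ell$ to be the negative log-likelihood and $\eta = 1$ recovers the standard Bayesian posterior, which is why $\eta$ is usually read as controlling how aggressively the data are allowed to update the prior; its choice is the subject of the next talk.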
Learning rate selection and the power posterior: Ryan Martin (14:00 GMT @ 11.03.2025)
Bayesian inference generally works well when the model is well-specified. But model mis- or under-specification is the norm rather than the exception, so there’s good reason to consider other posterior constructions, which is precisely the motivation for this seminar series. In this installment, I’ll focus on Gibbs posteriors and, more specifically, on aspects pertaining to Gibbs posteriors’ so-called learning rate parameter. This is a challenging problem in various respects—philosophically, theoretically, and computationally—and I aim to say a bit about all of these aspects in my presentation. Time permitting, I’ll also talk briefly about situations beyond Gibbs posterior inference where a learning rate choice is involved.
Slides YouTube
Prediction-centric approaches: Chris Oates (09:00 GMT @ 25.03.2025)
Generalised Bayesian methodologies have been proposed for inference with misspecified models, but these are typically associated with vanishing parameter uncertainty as more data are observed. In the deterministic modelling context, this can have the undesirable consequence that predictions become certain, while being incorrect. Taking this observation as a starting point, we will critically review some prediction-centric alternatives to generalised Bayes.
Slides YouTube
Coarsened Bayes and applications to biomedical sciences: David Dunson (15:15 GMT @ 09.04.2025)
The standard approach to Bayesian inference is based on the assumption that the distribution of the data belongs to the chosen model class. However, even a small violation of this assumption can have a large impact on the outcome of a Bayesian procedure. We introduce a simple, coherent approach to Bayesian inference that improves robustness to perturbations from the model: rather than condition on the data exactly, one conditions on a neighborhood of the empirical distribution. When using neighborhoods based on relative entropy estimates, the resulting “coarsened” posterior can be approximated by simply tempering the likelihood, that is, by raising it to a fractional power. Thus, inference is often easily implemented with standard methods, and one can even obtain analytical solutions when using conjugate priors. Some theoretical properties are derived, and we illustrate the approach with real and simulated data, using mixture models, autoregressive models of unknown order, and variable selection in linear regression. (A schematic form of the tempered posterior is sketched after the talk links below.)
Slides YouTube
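To make the tempering idea concrete (generic notation, not taken from the talk): the coarsened posterior is approximately a power posterior in which the likelihood is raised to a fractional power $\zeta \in (0,1)$,

$$
\pi_\zeta(\theta \mid x_{1:n}) \;\propto\; \pi(\theta)\, \Big[\prod_{i=1}^{n} p(x_i \mid \theta)\Big]^{\zeta},
$$

so standard computational machinery carries over with the likelihood contribution rescaled, and conjugate priors still yield closed-form updates.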
From generalised Bayes to Martingale posteriors: Chris Holmes (13:00 GMT @ 22.04.2025)
We review the historical motivation behind the development of general Bayesian updating, leading to Martingale posteriors, and the central role played by the Bayesian Bootstrap. The talk will focus on fundamental concepts in uncertainty quantification rather than mathematical results, and on the notion of targeted learning for estimands.
Slides YouTube
Introduction and overview: Edwin Fong (09:00 GMT @ 06.05.2025)
Chapter 2 of the seminar series explores the predictive view of Bayesian inference. The revitalized focus on prediction as the cornerstone of Bayesian inference shifts attention away from the traditional likelihood-prior framework, giving rise to novel prediction-centric methods such as quasi-Bayesian approaches and the martingale posterior. This talk will provide an overview and a brief history of the key ideas underpinning the predictive Bayesian approach, setting the stage for the subsequent four talks that will delve deeper into its specifics.
Slides YouTube
Theoretical foundations of predictive Bayes: Sandra Fortini (14:00 GMT @ 20.05.2025)
There is currently a renewed interest in the Bayesian predictive approach to statistics. The talk offers a review of foundational concepts and focuses on predictive modelling, which, by reasoning directly about prediction, bypasses inferential models or may characterize them. The underlying concept is that Bayesian predictive rules formalize, through conditional probability, how we learn about future events given the available information (see the sketch after the talk links below). This concept has implications for inference in any statistical problem, from classic contexts to less explored challenges such as providing Bayesian uncertainty quantification for predictive algorithms in data science.
Slides YouTube
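As a minimal reminder of the predictive viewpoint (generic notation, not taken from the talk): in the standard Bayesian setting the one-step-ahead predictive rule is the conditional distribution of the next observation given the data, obtained by averaging the model over the posterior,

$$
p(x_{n+1} \mid x_{1:n}) \;=\; \int p(x_{n+1} \mid \theta)\, \pi(\mathrm{d}\theta \mid x_{1:n}).
$$

Prediction-centric methods such as the martingale posterior take such a sequence of predictive rules, rather than a likelihood-prior pair, as the primitive object to be specified.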
Predictive model selection and uncertainty: Vik Shirvaikar (09:00 GMT @ 03.06.2025)
How can the predictive view of Bayesian inference be extended to model selection and uncertainty? In this talk, we’ll discuss the prequential argument and proper scoring rules as foundational concepts in predictive Bayesian model comparison. We’ll then consider a prediction-driven framework for model uncertainty, where the key task of resampling the missing data is conducted through the sequential comparison of candidate models. This suggests an alternative approach to classical questions, such as density estimation and hypothesis testing, which sidesteps certain issues with standard methods such as the Bayes factor.
Slides YouTube
Recursive methods for predictive Bayes: Lorenzo Capello (15:00 GMT @ 01.07.2025)
Predictive Bayes is driving the development of new recursive methods for use in predictive resampling. In the literature, we observe both the adaptation of existing algorithms and the creation of entirely new ones tailored to this purpose. In this talk, we will explore both perspectives. First, we will review the derivation of several well-established algorithms and examine how they can inspire new proposal mechanisms. Next, we will consider the reverse situation: starting with a preferred algorithm and evaluating its suitability within this emerging framework.
Slides YouTube
This talk explores recent advances in post-Bayesian inference and their practical applications across a range of inference settings. The first part introduces a novel generalised Bayesian inference scheme based on diffusion score matching, a weighted extension of the score-matching divergence. We demonstrate its robustness to outliers, as well as its ability to retain the conjugacy property familiar from standard Bayesian inference. We then discuss its applicability to changepoint detection, Kalman filtering, and Gaussian processes. The second part presents applications of the Bayesian Nonparametric Learning (NPL) framework and the Posterior Bootstrap. These include inference for simulator-based models using maximum mean discrepancy estimators within the NPL framework, as well as measurement error problems where covariates are corrupted by noise. Finally, we briefly mention the application of such generalised notions of Bayesian posteriors to decision-making under uncertainty.
Slides YouTube
Introduction and overview: Pierre Alquier (14:00 GMT @ 23.09.2025)
The PAC-Bayesian theory provides tools to understand the accuracy of Bayes-inspired algorithms that learn probability distributions on parameters. This theory was initially developed by McAllester in the late 1990s, and has since been applied successfully to a wide range of machine learning algorithms and learning tasks. Recently, it led to tight generalization bounds for deep neural networks, an objective that could not be achieved by standard “worst-case” generalization bounds such as Vapnik-Chervonenkis bounds. In this talk, I will provide a brief introduction to PAC-Bayes bounds and explain the core ideas of the theory (a representative bound is sketched after the talk links below). I will then highlight the relevance of this approach for the post-Bayes community in general. In particular, PAC-Bayes provides a very natural theoretical analysis of variational approximations and other generalized posteriors.
Slides YouTube
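For orientation, one representative PAC-Bayes bound in the McAllester style reads as follows (a standard textbook form, given here as a sketch and not necessarily the version used in the talk). For a loss bounded in $[0,1]$, a prior $\pi$ chosen before seeing the $n$ training points, and any $\delta \in (0,1)$, with probability at least $1-\delta$ over the sample, simultaneously for all distributions $\rho$ over predictors,

$$
\mathbb{E}_{h \sim \rho}\big[L(h)\big]
\;\le\;
\mathbb{E}_{h \sim \rho}\big[\hat{L}_n(h)\big]
+ \sqrt{\frac{\mathrm{KL}(\rho \,\|\, \pi) + \ln\frac{2\sqrt{n}}{\delta}}{2n}},
$$

where $L$ and $\hat{L}_n$ denote the population and empirical risks of a predictor $h$. The KL term is what lets data-dependent “posteriors”, including Gibbs and variational posteriors, inherit generalization guarantees, which is the bridge to the rest of this chapter.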
The Size of Teachers as a Measure of Data Complexity: PAC-Bayes Excess Risk Bounds and Scaling Laws: Dan Roy (14:00 GMT @ 30.09.2025)
We study the generalization properties of neural networks through the lens of data complexity. Recent work by Buzaglo et al. (2024) shows that random (nearly) interpolating networks generalize, provided there is a small “teacher” network that achieves small excess risk. We give a short single-sample PAC-Bayes proof of this result and an analogous “fast-rate” result for random samples from Gibbs posteriors. The resulting oracle inequality motivates a new notion of data complexity, based on the minimal size of a teacher network required to achieve any given level of excess risk. We show that polynomial data complexity gives rise to power laws connecting risk to the number of training samples, like in empirical neural scaling laws. By comparing the “scaling laws” resulting from our bounds to those observed in empirical studies, we provide evidence for lower bounds on the data complexity of standard benchmarks.
Slides YouTube
Recursive PAC Bayes: Yevgeny Seldin (14:00 GMT @ 07.10.2025)
PAC-Bayesian analysis is a frequentist framework for incorporating prior knowledge into learning. It was inspired by Bayesian learning, which allows sequential data processing and naturally turns posteriors from one processing step into priors for the next. However, despite two and a half decades of research, the ability to update priors sequentially without losing confidence information along the way remained elusive for PAC-Bayes. While PAC-Bayes allows construction of data-informed priors, the final confidence intervals depend only on the number of points that were not used for the construction of the prior, whereas confidence information in the prior, which is related to the number of points used to construct the prior, is lost. This limits the possibility and benefit of sequential prior updates, because the final bounds depend only on the size of the final batch.
I will present a novel and, in retrospect, surprisingly simple and powerful PAC-Bayesian procedure that allows sequential prior updates with no information loss. The procedure is based on a novel decomposition of the expected loss of randomized classifiers: the loss of the posterior is rewritten as an excess loss relative to a downscaled loss of the prior, plus the downscaled loss of the prior itself, which is bounded recursively (see the identity sketched after the talk links below). In empirical evaluation, the recursive procedure significantly outperforms the state of the art.
If time permits, I will also present applications of PAC-Bayesian analysis to provide generalization guarantees to weighted majority votes.
Slides YouTube
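The decomposition underlying the recursive procedure can be sketched, in generic notation not taken from the talk, as the elementary identity

$$
L(\rho_t) \;=\; \big(L(\rho_t) - \gamma\, L(\pi_t)\big) \;+\; \gamma\, L(\pi_t),
\qquad \gamma \in (0,1),
$$

where $\pi_t$ is the prior at step $t$ (the posterior carried over from the previous step), $\rho_t$ is the new posterior, and $\gamma$ is a downscaling factor. The first (excess-loss) term is controlled with a PAC-Bayes bound, while the second is bounded by applying the same argument recursively, which is what preserves the confidence information accumulated in earlier steps.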
PAC-Bayes Hypernetworks: Pascal Germain (14:00 GMT @ 28.10.2025)
The PAC-Bayesian learning framework has proven instrumental for deriving tight (non-vacuous) generalization bounds for neural networks. We propose a new way to exploit these results in a meta-learning scheme, relying on a hypernetwork that takes a dataset as input and outputs the parameters of a downstream predictor. The originality of our approach lies in the investigated hypernetwork architectures, which encode the dataset before decoding the parameters, explicitly implementing a form of information bottleneck. The PAC-Bayesian encoder expresses a posterior distribution over a latent space, from which we compute generalization guarantees for the “decoded” downstream predictor. The versatility of the PAC-Bayesian learning framework drove us to explore variations of this promising neural network architecture, notably by exploiting elements of sample compression theory in conjunction with traditional PAC-Bayes methods.
Rethinking Generalisation: Beyond KL with Geometry and Comparators: Benjamin Guedj (09:00 GMT @ 04.11.2025)
Generalisation is arguably one of the central problems in machine learning and foundational AI. Generalisation theory has traditionally relied on KL-based PAC-Bayesian bounds, which, despite their elegance, often obscure geometry and limit applicability. In this talk, I will present recent advances that move beyond traditional bounds. One line of work replaces KL with Wasserstein distances, yielding high-probability bounds valid for heavy-tailed losses and leading to new, optimisable learning objectives. Another line introduces a general comparator framework, showing how optimal bounds naturally arise from convex conjugates of cumulant generating functions, unifying and extending many classical results. Together, these perspectives highlight how rethinking divergences and comparators opens new directions in both theory and practice. I will conclude by discussing links with information theory and how these ideas might shape the next generation of PAC-Bayesian learning algorithms.
PAC-Bayes Meets Variational Inference: Theory and Generalizations: Badr-Eddine Chérief-Abdellatif (14:00 GMT @ 18.11.2025)
Variational inference (VI) is a cornerstone of modern Bayesian learning, offering tractable approximations to intractable posteriors. At the same time, the PAC-Bayesian framework provides tight and interpretable generalization guarantees for randomized predictors, often formulated in terms of Gibbs posteriors. These two perspectives are deeply connected: non-exact minimization of PAC-Bayes bounds can be interpreted as a form of variational approximation, while tempered and generalized posteriors arising in PAC-Bayes lead to new insights into the theoretical properties of VI. Recent advances highlight how PAC-Bayesian analysis can establish consistency results for variational methods, extend the classical prior mass condition, and motivate divergences beyond KL in practical inference. In this talk, I will explore the interplay between PAC-Bayes and VI, emphasizing how this dual perspective informs both statistical theory and scalable algorithms.
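As a minimal illustration of the connection (generic notation, not specific to the talk): the Gibbs posterior with learning rate $\lambda > 0$ is exactly the minimizer of a typical PAC-Bayes upper bound, and variational inference can be read as minimizing the same objective over a restricted family $\mathcal{F}$,

$$
\hat{\rho}
\;=\;
\operatorname*{arg\,min}_{\rho \in \mathcal{F}}
\Big\{ \lambda\, \mathbb{E}_{\theta \sim \rho}\Big[ \sum_{i=1}^{n} \ell(\theta, x_i) \Big] + \mathrm{KL}(\rho \,\|\, \pi) \Big\}.
$$

When $\mathcal{F}$ is unrestricted, the solution is the Gibbs posterior $\rho(\theta) \propto \pi(\theta)\exp\!\big(-\lambda \sum_i \ell(\theta, x_i)\big)$; with the negative log-likelihood as loss and $\lambda = 1$, the objective is the negative ELBO, recovering standard variational inference.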
Organisers