Olivier Catoni


CREST --- Centre de Recherche en Économie et Statistique
CNRS UMR 9194
Laboratoire de Statistiques

Personal page in French (this is not a translation of my English page, but rather a complement covering information more specifically relevant to French speakers).

Olivier Catoni
Directeur de Recherche
CREST, Laboratoire de Statistiques
UMR 9194 du CNRS
bureau 3035
5, avenue Henry Le Chatelier
TSA 96642
91764 Palaiseau cedex,
FRANCE
e-mail: olivier.catoni
followed by « at » ensae.fr

Three-day meeting of statisticians in Paris I.H.P., July 18-20 2022

Another talk on k-means with updated results.

My research report (in French)

Conference on robustness and privacy 2021

You can download the slides of my talk on generalization bounds for means and k-means.

New bounds for k-means algorithms

This is a joint work with Gautier Appert. The preprint is here. Gautier's PhD is here.

About stochastic games

Olivier Catoni, Miquel Oliu-Barton and Bruno Ziliotto. « Constant payoff in zero-sum stochastic games ». In: Ann. Inst. H. Poincaré Probab. Statist. (2020). To appear, p. 1-13 pdf

Conférence en l'honneur de Robert Azencott, May 14-15 2019

I was very pleased to pay tribute to my former PhD adviser Robert Azencott at this conference dedicated to his impact on Artificial Intelligence. I gave a talk about statistical syntax analysis for signal processing.

NIPS 2017 Workshop, Long Beach, CA, USA, December 9, 2017

I gave two talks in the workshop (Almost) 50 Shades of Bayesian Learning: PAC-Bayesian trends and insights, part of NIPS 2017: one as an invited speaker, on dimension-free PAC-Bayesian bounds for vectors and matrices, and a more specific one on a shrinkage estimator of the mean of a random vector. I presented the two joint works with Ilaria Giulini that you can download below.

Dimension-free PAC-Bayesian bounds, joint work with Ilaria Giulini, December 2017

Two new papers are available:

Dimension-free PAC-Bayesian bounds for matrices, vectors and linear least squares regression
(You may experience some font problems, depending on your pdf viewer.)

Dimension-free PAC-Bayesian bounds for the estimation of the mean of a random vector
NIPS 2017 workshop: (Almost) 50 Shades of Bayesian Learning: PAC-Bayesian trends and insights.

Markov substitute processes, June 2017

I gave a talk at INRIA Lille on Markov substitute processes. It focused mainly on two results: Markov substitute processes form exponential families (or Gibbs measures, in other terms), and crossing-over dynamics can be used to compute the maximum likelihood estimator.

PAC Bayesian bounds for the Gram matrix and least squares regression with a random design

I posted a paper on the subject on arXiv. I also gave a talk at the CREST seminar in January 2016, and an earlier one in May 2015 at the SMILE seminar at ENS.

A talk on spectral clustering

Given at the Séminaire Parisien de Statistique, in October 2015.

Two video talks on PAC-Bayes learning bounds at the IFCAM Summer School on Applied Mathematics, in July 2014

Slides with sound:
An informal introduction to PAC-Bayes bounds (with a small mistake in the definition of psi, which should be equal to the log of what it is claimed to be)
PAC-Bayes Bounds for arbitrary loss functions
Bounds for binary loss functions
PAC-Bayes bounds for binary loss functions
PAC-Bayes margin bounds for Support Vector Machines

A small test catoni_videotest.avi (small video file to test if you can read the format I am using).

Slides in pdf format without sound
Lecture notes

Markov substitute models and statistical inference in linguistics

A talk, given in April 2014 at the Séminaire Parisien de Statistique.

Statistical learning of syntactic structures

Toric Grammars: a new statistical approach to natural language modeling, Olivier Catoni and Thomas Mainguy (2013) arXiv
The simulation shown in this paper was made with the following code.
Please note that this is only a demonstration code, suitable to show how the method behaves on small examples, but not optimized to scale properly with large data sets.

A talk in Moscow, at the Institute for Information Transmission Problems (Nov 29, 2012)

Unsupervised statistical learning through label aggregation, slides.

A talk in Nice (on May 12, 2011, in French)

Petites perturbations des estimateurs et bornes PAC-Bayésiennes

Some lecture notes on PAC-Bayes bounds (Statistical learning, L3, ENS)

notes of 06/12/2013
notes of 04/02/2012
notes of 09/15/2011

My talk at ENS on March 16, 2011 (in French)

Apprentissage PAC-Bayésien : de la classification à la régression

My talk in Lille on January 21, 2011

La moyenne empirique est-elle perfectible ?

My latest preprints are on arXiv (and HAL)

Challenging the empirical mean and empirical variance: a deviation study, Olivier Catoni (2010), on arXiv

High confidence estimates of the mean of heavy-tailed real random variables, Olivier Catoni (2009), on arXiv.
This one can be skipped: it is an early draft of the preceding preprint, which presents improved estimators, improved bounds and some experiments.

Robust linear least squares regression, Jean-Yves Audibert, Olivier Catoni (2010), on arXiv

Robust linear regression through PAC-Bayesian truncation, Jean-Yves Audibert, Olivier Catoni (2010), on arXiv

Risk bounds in linear regression through PAC-Bayesian truncation, Jean-Yves Audibert, Olivier Catoni (2009), on arXiv.
This one can also be skipped: it is an early draft covering the material of the two preceding preprints.

Foundations and New Trends of PAC Bayesian Learning

My talk in video.

Talk before the INRIA Project Committee --- June 4, 2009

The slides of my talk to present the CLASSIC INRIA team proposal.

Evaluation of the DMA --- January 29, 2009

The slides of my talk on the occasion of the evaluation of the DMA.

DMA start-of-year day --- October 2, 2008

You can download the slides of my presentation.

International Meeting on Empirical Processes and Asymptotic Statistics

Univ. Rennes 1, June 18-20 2007. The slides of my talk, ``Learning, information theory and thermodynamics'', pdf file.

PAC-Bayesian supervised classification (The thermodynamics of statistical learning)

This is the title of a monograph published in the Lecture Notes series of the IMS. pdf file, dvi file.

Classification

Publication list

publications
CV and report (in French)

Preprints

I moved from ENS to Paris 6 in October 1998 and back to ENS in September 2008. My older preprints and those of some of my students (Cécile Cot, Gilles Blanchard and Jean-Philippe Vert, who stayed at the ENS) can be found on the preprint server of the Laboratoire de Mathématiques de l'Ecole Normale Supérieure de Paris.

Preprints from the period October 1998 - September 2008 are on the server of the Laboratoire de Probabilités et Modèles Aléatoires, on the HAL server, or on arXiv.

You can also use the national preprint search engine of the cellule mathdoc.

The last revision of ``The loop erased exit path and the metastability of a biased vote process'', a joint paper with Dayue Chen and Jun Xie, to appear in Stochastic Processes and their Applications, is also available.

Click here for the last revision of ``Free energy estimates and deviation inequalities'', with a more precise study of the unbounded case and improved bounds for Markov chains.

The last revision of Gibbs estimators describes general integrability conditions under which it is possible to define a Gibbs estimator and to bound its risk.

Lecture notes

You can download the last revision of my lecture notes on ``Simulated Annealing Algorithms and Markov chains with Rare Transitions'', published in the Séminaire de Probabilités.

You can download here the draft of my Saint-Flour lecture notes (July 2001) on statistical learning theory and stochastic optimization. The final version of these notes is now published as Springer Lecture Notes in Mathematics Number 1851. If you liked the draft, please consider buying the book or encouraging your library to buy it, as a courtesy to Springer's efforts to make the Saint-Flour summer school notes widely available. (Authors don't get royalties on lecture notes, which is why I feel free to give you this piece of advice.)

Information theory, statistical learning and pattern recognition

A workshop on this theme was held at the CIRM in December 1998. The program of this meeting is kept here.

The Gibbs estimator in action: downloadable software for density estimation

I wrote a piece of software to illustrate a contribution I presented at Foundations of Computational Mathematics (July 13-17, 2000, Hong Kong). You can have a look at its documentation here, where you will also find download instructions.

Learning Theoretic and Bayesian Inductive Principles
19-21 July 2004 Gatsby Computational Neuroscience Unit, London

The slides of my talk.

Empirical complexity and randomized estimators

You can download here the slides (as a .dvi or .pdf file) of my talk at the workshop « Statistical Learning in Classification and Model Selection », EURANDOM, Eindhoven, The Netherlands, January 15-18, 2003, organized by Prof. dr. R.D. Gill (Universiteit Utrecht/EURANDOM), Dr. P. Grünwald (CWI), Prof. dr. A.W. van der Vaart (Vrije Universiteit Amsterdam/EURANDOM) and Dr. J. Lember (EURANDOM). The corresponding preprint is also available (as a .dvi or .pdf file), and should soon be on the PMA server as well.

DEA lectures: Classification and model selection (2003)

Lecture notes in postscript and pdf formats are available.

Théorèmes PAC Bayésiens locaux et estimateurs randomisés

Those who understand French can download the slides of this talk from my French homepage.

Back to the department homepage.