2025-2026

STATQAM seminars take place at 3:30 p.m. (Eastern Time), in person in PK-5115.

Please contact Michaël Lalancette (lalancette.michael@uqam.ca) if you would like to be added to the seminar mailing list.

Fall 2025 Session

September 25: Zinsou Max Debaly (UQAM)

Title: Learning centre’s partitions from summaries

Abstract: Multi-centre studies increasingly rely on distributed inference, where sites share only centre-level summaries. Homogeneity of parameters across centres is often violated, motivating methods that both test for equality and learn centre groupings before estimation. We develop multivariate Cochran-type tests that operate on summary statistics and embed them in a sequential, test-driven Clusters-of-Centres (CoC) algorithm that merges centres (or blocks) only when equality is not rejected. We derive the asymptotic χ²-mixture distributions of the test statistics and provide plug-in estimators for implementation. To improve finite-sample integration, we introduce a multi-round bootstrap CoC that re-evaluates merges across independently resampled summary sets; under mild regularity and a separation condition, we prove a golden-partition recovery result: as the number of rounds grows with n, the true partition is recovered with probability tending to one. We also give simple numerical guidelines, including a plateau-based stopping rule, to make the multi-round procedure reproducible. Simulations and a real-data analysis of U.S. airline on-time performance (2007) show accurate heterogeneity detection and partitions that change little with the choice of resampling scheme.
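For readers unfamiliar with Cochran-type homogeneity tests, the following is a minimal sketch of the classical univariate Cochran Q statistic computed from centre-level summaries only. It is not the multivariate test or the CoC algorithm from the talk; the function name `cochran_q` and the example numbers are hypothetical. With known variances and homogeneous centres, Q is approximately chi-square distributed with k - 1 degrees of freedom.

```python
def cochran_q(estimates, std_errors):
    """Classical univariate Cochran Q from per-centre estimates and standard errors.

    Returns (Q, pooled_estimate, degrees_of_freedom). This is an illustrative
    sketch, not the multivariate statistic described in the abstract.
    """
    # Inverse-variance weights: more precise centres count more.
    weights = [1.0 / se ** 2 for se in std_errors]
    # Precision-weighted pooled estimate across centres.
    pooled = sum(w * est for w, est in zip(weights, estimates)) / sum(weights)
    # Weighted squared deviations of each centre from the pooled estimate.
    q = sum(w * (est - pooled) ** 2 for w, est in zip(weights, estimates))
    return q, pooled, len(estimates) - 1


# Hypothetical summaries shared by three centres (made-up numbers).
q_stat, pooled_est, df = cochran_q([0.9, 1.1, 2.4], [0.20, 0.25, 0.30])
print(f"Q = {q_stat:.2f} on {df} degrees of freedom, pooled = {pooled_est:.2f}")
```

A large Q relative to the chi-square reference suggests that the centres do not share a common parameter, which is the situation in which merging them would be inappropriate.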

October 2: Josée Dupuis (Université McGill)

Title: Novel Statistical Approaches to Exploit Family History Information to Improve Power to Detect Rare Genetic Variant Associations

Abstract: The growing availability of sequencing data has enabled the investigation of the role of rare variants in disease etiology. However, detecting associations with rare variants or groups of rare variants requires large sample sizes for adequate power, especially for late-onset diseases, when the number of cases in cohorts of younger participants may be low. Family history (FH) contains information on the disease status of relatives, adding valuable information about the probands’ health problems and risk of diseases. Incorporating data from FH is a cost-effective way to improve statistical evidence in genetic studies and overcome limitations in study designs with insufficient cases. We proposed a family history aggregation unit-based test (FHAT) and optimal FHAT (FHAT-O) to exploit available FH for rare variant association analysis. We also proposed a robust version of FHAT and FHAT-O for unbalanced case-control designs. By applying FHAT and FHAT-O to the analysis of all-cause dementia and hypertension using the exome sequencing data from the UK Biobank, we show that our methods can improve significance for known regions.

October 9: Steven Golovkine (Université Laval)

Title: Functional data and sport science

Abstract: Functional data analysis (FDA) provides a powerful framework for studying data that can be represented as curves, trajectories, or other continuous functions. This talk introduces the main ideas of FDA and illustrates its potential through recent applications in sport science. The first example concerns performance analysis in cycling, where functional methods are used to examine variations in power output across the menstrual cycle, highlighting the capacity of FDA to capture subtle temporal dynamics in physiological data. The second application focuses on gait analysis of recreational Irish runners, where functional representations of stride patterns allow for a detailed comparison of biomechanical features across individuals. Finally, I will present a clustering study of NBA players based on functional representations of their shooting patterns, which reveals groups of athletes with distinct styles of play. Together, these examples demonstrate how FDA can offer new insights into sports performance by capturing information that would be missed by conventional, scalar-based analyses.

The presentation will be given in French and (hopefully) subtitled in English. The slides will be in English.

October 23: Nahid Sadr (Université de Sherbrooke)

Title: Asymptotic Behavior, Risk Measures, and Simulation of Distorted Copulas

Abstract: Distorting multivariate distributions is a useful approach for introducing flexibility and capturing model uncertainty. In particular, applying distortions to the copulas representing the underlying dependence structure allows one to generate new, flexible dependence models from existing ones. In this presentation, we investigate the extremal domain of attraction problem for Morillas-type distorted copulas. We not only establish conditions under which such copula-to-copula transformations alter the asymptotic behavior, but also discuss conditions under which the distorted copulas remain in the same domain of attraction as the initial undistorted copula. Furthermore, we discuss the effect of these distortions on multivariate risk measures, such as the lower-orthant Value-at-Risk and Range-Value-at-Risk. Finally, we propose a simulation algorithm for Morillas-type distorted copulas, addressing a gap in the literature and providing the means to utilize these modified dependence structures in practice.

October 30: Junxi Zhang (Université Concordia)

Title: Statistical Inference for Bayesian Nonparametric Models Based on Normalized Random Measures with Independent Increments

Abstract: Normalized random measures with independent increments (NRMIs) form a broad class of Bayesian nonparametric priors and are widely used. In this talk, I will address two key asymptotic problems for NRMIs: posterior consistency and the Bernstein–von Mises (BvM) theorem. For posterior consistency, which validates Bayesian nonparametric procedures under NRMI priors, I establish posterior consistency for a very general class of nonhomogeneous NRMIs under a proposed assumption, illustrating its applicability with several examples. For the BvM theorem, which connects Bayesian and frequentist inference, I present an explicit BvM result for a particularly rich subclass of NRMIs, namely the normalized generalized gamma processes (NGGPs). A key insight from our theorem is the identification of a bias term that persists even as the sample size tends to infinity. Accordingly, I propose a necessary bias correction when constructing credible sets based on the BvM theorem. This talk is based on joint work with Dr. Yaozhong Hu at the University of Alberta.

November 6: Kirill Neklyudov (Université de Montréal)

Title: Transferable Monte Carlo Methods via Generative Modeling

Abstract: Efficient equilibrium sampling of molecular conformations remains a core challenge in computational chemistry and statistical inference. Classical approaches, such as molecular dynamics or Markov chain Monte Carlo, inherently lack transferability across systems and parameters; the computational cost of sampling must be paid in full each time. The widespread success of generative models has inspired interest in overcoming this limitation through learning sampling algorithms. In this talk, I will present our recent work on Monte Carlo algorithms based on generative modeling, which demonstrate transferability across different systems and different temperatures. In the first part of the talk, I will present Progressive Inference-Time Annealing (PITA), a novel framework to learn diffusion-based samplers that combines two complementary interpolation techniques: 1) annealing of the Boltzmann distribution and 2) diffusion processes. PITA proposes an efficient Sequential Monte Carlo (SMC) algorithm that samples from the temperature-annealed marginals of the diffusion process at inference time, which allows for training a diffusion model using the samples from a high-temperature distribution and then sampling from a lower-temperature density. Progressively annealing the target Boltzmann density and re-training the diffusion model on newly generated samples, PITA enables, for the first time, equilibrium sampling of N-body particle systems, Alanine Dipeptide, and tripeptides in Cartesian coordinates with dramatically fewer energy function evaluations. In the second part of the talk, I will present PROSE, a 280-million-parameter all-atom transferable normalizing flow trained on a corpus of peptide molecular dynamics trajectories up to 8 residues in length. PROSE draws zero-shot uncorrelated proposal samples for arbitrary peptide systems, achieving the previously intractable transferability across sequence length, whilst retaining the efficient likelihood evaluation of normalizing flows. Through extensive empirical evaluation, we demonstrate the efficacy of PROSE as a proposal for a variety of sampling algorithms, finding a simple importance sampling-based finetuning procedure to achieve superior performance to established methods such as sequential Monte Carlo on unseen tetrapeptides.

November 13: Jeffrey Negrea (University of Waterloo)

Title: Follow-the-Perturbed-Leader with Between-Action Dependence

Abstract: We present a framework for analyzing Gaussian follow-the-perturbed-leader (FTPL) algorithms for full-information online learning problems when the perturbation distribution exhibits between-action dependence. Applications include FTPL algorithms for online learning for i) infinite action spaces when the adversary plays bounded Lipschitz reward functions, where the perturbations are random functions sampled from a Gaussian process; and ii) linear polyhedral games, where the perturbation is a random linear function. We demonstrate how to tightly account for dependence between actions in the FTPL analysis and present an ansatz for the selection of the perturbation distribution based on a Bayesian perspective of FTPL as a variant of Thompson sampling.
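As a rough illustration of the kind of algorithm discussed, the sketch below implements one decision step of Gaussian FTPL over a finite action set, with perturbations that are correlated across actions. It is only an assumed toy setting, not the framework from the talk: the function name `ftpl_choose`, the lower-triangular factor `chol`, and the `scale` parameter are hypothetical.

```python
import random


def ftpl_choose(cum_rewards, chol, scale, rng):
    """Choose an action by follow-the-perturbed-leader with correlated Gaussian noise.

    chol is a lower-triangular matrix L (hypothetical input): the perturbation
    z = L g with g ~ N(0, I) has covariance L L^T, so actions can share noise.
    """
    k = len(cum_rewards)
    g = [rng.gauss(0.0, 1.0) for _ in range(k)]
    # Correlated perturbation: row i of L mixes the first i + 1 independent draws.
    z = [sum(chol[i][j] * g[j] for j in range(i + 1)) for i in range(k)]
    perturbed = [r + scale * zi for r, zi in zip(cum_rewards, z)]
    # Play the leader of the perturbed cumulative rewards.
    return max(range(k), key=perturbed.__getitem__)


# Hypothetical example: actions 0 and 1 share most of their noise, action 2 is independent.
rng = random.Random(0)
chol = [[1.0, 0.0, 0.0],
        [0.9, 0.436, 0.0],
        [0.0, 0.0, 1.0]]
action = ftpl_choose([1.0, 3.0, 2.0], chol, scale=1.0, rng=rng)
print("chosen action:", action)
```

Setting `chol` to the identity recovers the usual independent-perturbation version, and setting `scale` to zero reduces the rule to plain follow-the-leader.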

November 20:

Title:

Abstract:

November 27: Philippe Boileau (Université McGill)

Title: Causal Machine Learning Methods for Heterogeneous Treatment Effect Detection

Abstract: The conditional average treatment effect (CATE) is frequently estimated to refute the homogeneous treatment effect assumption. Under this assumption, all units making up the population under study experience an identical benefit from a given treatment. Uncovering heterogeneous treatment effects through inference about the CATE, however, requires that covariates truly modifying the treatment effect be reliably collected at baseline. CATE-based techniques will necessarily fail to detect violations when effect modifiers are omitted from the data due to, for example, resource constraints. Severe measurement error has a similar impact. To address these limitations, we prove that the homogeneous treatment effect assumption can be gauged through inference about contrasts of the potential outcomes’ variances. We derive causal machine learning estimators of these contrasts and study their asymptotic properties. We establish that these estimators are doubly robust and asymptotically linear under mild conditions, permitting formal hypothesis testing about the homogeneous treatment effect assumption even when effect modifiers are missing or mismeasured. Numerical experiments demonstrate that these estimators’ asymptotic guarantees are approximately achieved in experimental and observational data alike. These inference procedures are then used to detect heterogeneous treatment effects in the re-analysis of a randomized controlled trial investigating targeted temperature management in cardiac arrest patients.
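The intuition behind variance contrasts can be sketched in a simple case. If every unit receives the same additive effect, so that Y(1) = Y(0) + c for some constant c, then the two potential outcomes have equal variances, and a non-zero variance difference signals heterogeneity. The naive plug-in below assumes a randomized trial and simply compares arm-level sample variances; it is not the doubly robust estimator from the talk, and the function name and data are hypothetical.

```python
import statistics


def variance_contrast(treated_outcomes, control_outcomes):
    """Naive plug-in contrast Var(Y | treated) - Var(Y | control).

    Under randomization these arm variances estimate Var(Y(1)) and Var(Y(0)).
    A constant additive effect implies a contrast of zero.
    """
    return statistics.variance(treated_outcomes) - statistics.variance(control_outcomes)


# Made-up outcomes: the first treated arm is the control arm shifted by 2,
# the second has the same mean shift but a treatment-induced spread.
control = [1.0, 2.0, 3.0]
shifted = [3.0, 4.0, 5.0]
spread = [0.0, 4.0, 8.0]
print(variance_contrast(shifted, control))
print(variance_contrast(spread, control))
```

In practice the contrast would need a standard error before any conclusion is drawn, which is the part the asymptotic theory in the abstract addresses.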

December 4: Roxane Turcotte (UQAM)

Title: Modelling insurance reserves in the presence of dependence

Abstract: An insurance reserve is an amount set aside to cover future costs related to claims. The amount is determined from an estimate of the distribution of the total cost. Individual reserving methods have developed substantially over the past decade, helped by better access to data. However, methods for modelling the dependence between the different coverages of a single insurance product remain limited. This seminar will detail a copula-based multivariate regression approach for jointly modelling the settlement time and the amount of insurable losses. It will also cover the estimation procedure and the impact of dependence on the modelling and on the reserve amount.

Winter 2026 Session

January 22: Miceline Mésidor (INRS)

Title: A review of causal inference methods applied to case-control studies

Abstract: Recent years have seen substantial growth in the use and development of causal inference methods in cohort studies. However, these approaches have been much less explored in the context of case-control studies. This presentation will take stock of advanced methods for estimating causal effects that are specifically adapted to this type of design. Each method will be presented in detail, highlighting its strengths and limitations. Finally, the session will point out the main current methodological gaps and the research directions to prioritize for future statistical developments.

January 29: Marouane Il Idrissi (UQAM)

Title: Interpreting black-box prediction models

Abstract: Cooperative game theory has become a pillar of interpretability in machine learning, notably through the use of the Shapley value. Yet, despite their massive adoption, Shapley-based methods often rely on axiomatic justifications whose relevance, particularly for attributing influence, remains debated. In this presentation, we revisit cooperative game theory from the perspective of interpreting prediction models and argue for a more controlled use of these tools. We will offer an intuitive reading of Shapley values, then sketch a general framework for designing interpretation methods that are richer and better aligned with the goal of studying model behaviour. The discussion will be structured around five key challenges: the choice of value function, the choice of allocation, the exploration of new quantities of interest, computational constraints, and application to critical systems. To conclude, we will stress the importance of these questions for a responsible and credible adoption of AI in high-stakes domains.
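For reference, the Shapley value of a player is the average of its marginal contributions over all orderings in which the players can join the coalition. The sketch below computes it exactly for a tiny game; the function name `shapley_values` and the toy value function are hypothetical, and enumerating all orderings is only feasible for a handful of players.

```python
from itertools import permutations


def shapley_values(players, value):
    """Exact Shapley values by averaging marginal contributions over all orderings.

    value maps a frozenset of players to the worth of that coalition.
    The cost grows factorially with the number of players.
    """
    totals = {p: 0.0 for p in players}
    orderings = list(permutations(players))
    for order in orderings:
        coalition = frozenset()
        for p in order:
            with_p = coalition | {p}
            # Marginal contribution of p when joining the current coalition.
            totals[p] += value(with_p) - value(coalition)
            coalition = with_p
    return {p: total / len(orderings) for p, total in totals.items()}


# Toy two-player game (made-up worths): the pair is worth more than the sum of its parts.
worth = {
    frozenset(): 0.0,
    frozenset({"a"}): 1.0,
    frozenset({"b"}): 2.0,
    frozenset({"a", "b"}): 5.0,
}
phi = shapley_values(["a", "b"], worth.__getitem__)
print(phi)
```

In this toy game the interaction surplus of 2 is split equally, giving 2 to `a` and 3 to `b`, and the allocations sum to the worth of the full coalition. Which value function to plug in for a prediction model is exactly the first of the design choices listed in the abstract.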

February 5: Catherine Haeck (UQAM)

Title: Age at Immigration and the Intergenerational Income Mobility of the 1.5 Generation

Abstract: We exploit longitudinal tax files linked to Census data to measure the contribution of age at immigration to the intergenerational income mobility of immigrant children. We first estimate the causal effect of children’s age at immigration on adulthood income using a siblings fixed effects model of years of exposure to the host country. Up to 10 years old, the relationship between age at immigration and income is weak, but starting at age 11, each additional year is associated with a decrease in adulthood income rank of close to half a percentile rank. We then find that adjusting the 1.5 generation’s income ranks for age at immigration results in an intergenerational rank-rank coefficient that is lower by 0.018, or 10.2% of the (unadjusted) intergenerational income transmission estimate of the 1.5 generation. Earlier immigration has the potential to improve intergenerational economic mobility.

February 12: Carlotta Pacifici (Université Bocconi)

Title:

Abstract:

February 19: Cédric Beaulac (UQAM)

Title:

Abstract:

February 26:

Title:

Abstract:

March 12: Thierry Duchesne (Université Laval)

Title:

Abstract:

March 19: Alexandra Schmidt (Université McGill)

Title:

Abstract:

March 26: Guilherme Lopes de Oliveira (Federal Center for Technological Education of Minas Gerais)

Title:

Abstract:

April 2: David Haziza (Université d’Ottawa)

Title:

Abstract:

April 9: Julie Mireille Thériault (UQAM)

Title:

Abstract:

Friday, April 24: Joanna Mills Flemming (Dalhousie University)

Title:

Abstract: