Finite-Size Corrections and Likelihood Ratio Fluctuations in the Spiked Wigner Model

Ahmed El Alaoui Statistical Theory, Theoretical ML

In this paper we study principal components analysis in the regime of high dimensionality and high noise. Our model of the problem is a rank-one deformation of a Wigner matrix where the signal-to-noise ratio (SNR) is of constant order, and we are interested in the fundamental limits of detection of the spike. Our main goal is to gain a fine understanding of the asymptotics for the log-likelihood ratio process, also known as the free energy, as a function of the SNR. Our main results are twofold. We first prove that the free energy has a finite-size correction to its limit---the replica-symmetric formula---which we explicitly compute. This provides a formula for the Kullback-Leibler divergence between the planted and null models. Second, we prove that below the reconstruction threshold, where it becomes impossible to reconstruct the spike, the log-likelihood ratio has fluctuations of constant order and converges in distribution to a Gaussian under both the planted and (under restrictions) the null model. As a consequence, we provide a general proof of contiguity between these two distributions that holds up to the reconstruction threshold, and is valid for an arbitrary separable prior on the spike. Formulae for the total variation distance, and the Type-I and Type-II errors of the optimal test are also given. Our proofs are based on Gaussian interpolation methods and a rigorous incarnation of the cavity method, as devised by Guerra and Talagrand in their study of the Sherrington--Kirkpatrick spin-glass model.

Authors: Ahmed El Alaoui, Florent Krzakala, Michael Jordan