-----------
Angelina Roche
Moved to 13/06/24
----------------
Aude Sportisse (moved to 20/06/24)
Title: Challenges raised by Missing Not At Random Data
Abstract: One of the ironies of increased data collection is that missing data are
inevitable: the more data there are, the more missing data there are. The purpose of this
presentation is to provide an overview of data Missing Not At Random (MNAR), that is,
data whose unavailability depends on the values of the data themselves. This implies that
the observed population is not representative of the general one. These missing data are
widely encountered in real data sets, but they introduce significant biases into the
samples, which most existing methods ignore. We will discuss the main difficulties raised
by MNAR data, such as the identifiability of the parameters, and we will see some examples
illustrating how to deal with them in some specific contexts: semi-supervised learning,
clustering and low-rank models.