Master2 Data Sciences


Theoretical guidelines for high-dimensional data analysis


Schedule


12/10: amphi Monge, Ecole Polytechnique, 14h-18h
09/11: amphi Carnot, Ecole Polytechnique, 14h-18h
16/11: amphi Carnot, Ecole Polytechnique, 14h-18h
23/11: amphi Carnot, Ecole Polytechnique, 14h-18h
30/11: amphi Carnot, Ecole Polytechnique, 14h-18h



Warning!

This course is not suitable as preparation for a PhD in mathematical statistics. For that purpose, you should instead take the course Statistiques en grande dimension from the Master2 MDA and MSV.

Program

Goal of the lectures: The lectures will be based on recent research papers. Attendance at the lectures is mandatory.


Lecture | Topic | Paper(s) | Slides | Further reading
1 | False discoveries, multiple testing, online issue | Paper 1 (short review) | Slides | Reliability of scientific findings? Online FDR control
2 | Strength and weakness of the Lasso | Paper 1 | Slides | No free computational lunch
3 | Adaptive data analysis | Paper 1 | Slides | Kaggle overfitting
4 | Curse of dimensionality, robust PCA, theoretical limits | Paper 1 (suppl. material) | Slides | Robust PCA
5 | Statistical / computational gap | Paper 1 | Slides |


Evaluation

The reports must be sent by email in a zip file including:
- the report in pdf format (8 to 12 pages)
- the source code for the numerical experiments