Tropical algebra, optimization and gamesStéphane Gaubert* |
Common lecture to Master Optimization, University Paris-Saclay and IPP and Master Mathématiques de la modélisation, Year 2024-2025. The lecture is in English. It will be scheduled from November to January.
The lectures will take place in Paris, at Sorbonne Université, Campus Jussieu (next to the Jussieu tube station), Lecture 1: 25 nov morning, 9h00-12h30, Room 15/25-104 Lecture 2: 2 dec au matin, idem - Room 24/34-212 Lecture 3: 9 dec au matin, idem - Room 15/25-104 Lecture 4: 16 dec au matin, idem - Room 15/25-104 Lecture 5: 6 jan au matin, idem - Room 15/25-104 Lecture 6: 20 jan au matin, idem - Room 15/25-102 Lecture 7: mercredi 22 jan afternoon, 14h00-17h30 - Room 15/25-101 Lecture 8: 27 jan au matin, 9h00-12h30 Room 15/25-104 https://www.ljll.math.upmc.fr/MathModel/info/pour_venir.html for information on how to come to Jussieu.
Further resources (including copy of earlier lectures) are available for registered students on the moodle of the lecture MAP654G. The access to the moodle site of IPP requires a registration, students not belonging to IPP should contact me at the beginning of the lecture to obtain access to this site.
The dynamic programming approach allows one to deal with deterministic and stochastic optimal control, and more generally, with repeated zero-sum two-player problems, by studying operators introduced by Bellan and Shapley, that propagate the value function. These operators are order preserving, and nonexpansive with respect to natural norms or metrics. For discrete problems, they have a rich combinatorial structure, which can be understood through tropical geometry. The term “Tropical” refers to structures of “characteristic one”, in which the addition is idempotent. The max-plus semifield, which consists of the real numbers equipped with the maximum, thought of as an additive operation, and the addition, thought of as a multiplication operation, is the prototypical example of such structures. The tropical approach provides a geometrical intepretation of repeated zero-sum games, leading to new insights.
The objective of this lecture is to present methods from tropical mathematics and from Perron-Frobenius theory (fixed point theory of nonexpansive and order preserving non-linear mappings), with applications to zero-sum repeated games. We have a special interest in problems with long horizon, in which one is interested in the growth of the income as the horizon tends to infinity, including the “mean payoff problem”. An emphasis is made on combinatorial aspects and computational complexity issues. The lecture is illustrated by applications, with examples of one or two-player games originating from population dynamics, nonnegative matrix theory, discrete event systems, or nonnegative tensors.
The lecture has relatively few prerequisites, except elements of optimisation (convexity, polyhedra) and analysis (fixed point theory). The needed results are systematically recalled. This lecture is primarily of a mathematical nature, however, it comprises a significant part concerning algorithms and modelling, which may be of interest to students interested by the applications of operations research and by games in computer science.
First two lectures.
Various examples of one and two-player games. Examples from population dynamics or discrete event systems that reduce to games. The mean payoff problem for deterministic games - a challenging open problem in computational complexity. First steps in the tropical approach: tropicalization of polynomials, log-convexity of the Laplace transform, generalizations of Cauchy bound on polynomial roots.
Some references for this part [Vir01], and the introductive chapter (sections 1.2 and 1.2 especially) of [MS15], available on line https://janos.cs.technion.ac.il/COURSES/238900-13/Tropical/MaclaganSturmfels.pdf, also section 1-2 of [AGG12].
We review the basics of dynamic programming, showing that the value function is governed by Shapley operators, or by Bellman operators in the one player case. First properties of Shapley operators.
The simplest combinatorial case: the one player deterministic problem. Asymptotics of the value function. How it is controlled by non-linear eigenvectors. Structure of the eigenspace. The turnpike property. (properties of optimal strategies when the horizon tends to infinity).
Order preserving maps on cones. Hilbert’s and Thompson’s metric. Examples of the orthant and of the cone of positive semidefinte matrices. Nonexpansiveness properties in these metrics. The ergodic non-linear eigenproblem, and its applications: Markov decision processes, combinatorial games, matrix scaling problems (entropy maximization), population dynamics.
Existence and uniqueness of nonlinear fixed-points of non-expansive mappings. Application to the non-linear eigenproblem. Nonlinear généralisations of the Perron-Frobenius theorem.
The horoboundary of a metric space. A generalized Wolff-Denjoy theorem. Application to the mean payoff problem. Collatz-Wielandt certificates.
Analysis of non-linear power algorithms (aka value iteration). The Birkhoff-Hopf contraction theorem. Dobrushin ergodicity coefficient and its nonlinear extensions. Policy iteration (à la Hoffman-Karp) for zero-sum games. Relation between mean payoff games and linear or semidefinite programming over nonarchimedean fields.
Page of the lecture:
http://www.cmap.polytechnique.fr/~gaubert/coursodjups/index.html
Ce document a été traduit de LATEX par HEVEA