Optimal Control
1 Optimal Control
This section covers fundamental approaches to optimal control, including dynamic programming, linear programming, tree-based planning, control theory, and model predictive control.
1.1 Dynamic Programming
- (book) Dynamic Programming, Bellman R. (1957).
- (book) Dynamic Programming and Optimal Control, Volumes 1 and 2, Bertsekas D. (1995).
- (book) Markov Decision Processes - Discrete Stochastic Dynamic Programming, Puterman M. (1995).
- An Upper Bound on the Loss from Approximate Optimal-Value Functions, Singh S., Yee R. (1994).
- Stochastic optimization of sailing trajectories in an upwind regatta, Dalang R. et al. (2015).
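As a point of orientation for the references above, the core dynamic programming recursion can be sketched as tabular value iteration. This is a minimal illustration and is not taken from any of the listed works: the function name `value_iteration`, the array layout (`P` of shape actions x states x states, `R` of shape states x actions), and the default discount and tolerance are all assumptions made for the example.

```python
import numpy as np


def value_iteration(P, R, gamma=0.9, tol=1e-8, max_iter=10_000):
    """Tabular value iteration (illustrative sketch, hypothetical interface).

    P: array of shape (A, S, S), P[a, s, t] = probability of s -> t under action a.
    R: array of shape (S, A), expected immediate reward for taking a in s.
    Returns the value estimate V (shape S) and a greedy policy (shape S).
    """
    n_states = R.shape[0]
    V = np.zeros(n_states)
    Q = np.array(R, dtype=float)
    for _ in range(max_iter):
        # Bellman optimality backup:
        # Q[s, a] = R[s, a] + gamma * sum_t P[a, s, t] * V[t]
        Q = R + gamma * np.einsum("ast,t->sa", P, V)
        V_new = Q.max(axis=1)
        converged = np.max(np.abs(V_new - V)) < tol
        V = V_new
        if converged:
            break
    return V, Q.argmax(axis=1)


# Toy two-state problem (assumed for illustration): action 0 stays in place,
# action 1 switches state, and landing in state 1 pays a reward of 1.
P = np.array([[[1.0, 0.0], [0.0, 1.0]],
              [[0.0, 1.0], [1.0, 0.0]]])
R = np.array([[0.0, 1.0],
              [1.0, 0.0]])
V, policy = value_iteration(P, R, gamma=0.9)
```

In this toy problem the greedy policy switches out of state 0 and stays in state 1, so both states have value 1 / (1 - 0.9) = 10.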
1.2 Linear Programming
- (book) Markov Decision Processes - Discrete Stochastic Dynamic Programming, Puterman M. (1995).
REPS
Relative Entropy Policy Search, Peters J. et al. (2010).
1.3 Tree-Based Planning
ExpectiMinimax
Optimal strategy in games with chance nodes, Melkó E., Nagy B. (2007).
Sparse sampling
A sparse sampling algorithm for near-optimal planning in large Markov decision processes, Kearns M. et al. (2002).
MCTS
Efficient Selectivity and Backup Operators in Monte-Carlo Tree Search, Rémi Coulom, SequeL (2006).
UCT
Bandit based Monte-Carlo Planning, Kocsis L., Szepesvári C. (2006).
- Bandit Algorithms for Tree Search, Coquelin P-A., Munos R. (2007).
OPD
Optimistic Planning for Deterministic Systems, Hren J., Munos R. (2008).
OLOP
Open Loop Optimistic Planning, Bubeck S., Munos R. (2010).
SOOP
Optimistic Planning for Continuous-Action Deterministic Systems, Buşoniu L. et al. (2011).
OPSS
Optimistic planning for sparsely stochastic systems, Buşoniu L., Munos R., De Schutter B., Babuska R. (2011).
HOOT
Sample-Based Planning for Continuous Action Markov Decision Processes, Mansley C., Weinstein A., Littman M. (2011).
HOLOP
Bandit-Based Planning and Learning in Continuous-Action Markov Decision Processes, Weinstein A., Littman M. (2012).
BRUE
Simple Regret Optimization in Online Planning for Markov Decision Processes, Feldman Z., Domshlak C. (2014).
LGP
Logic-Geometric Programming: An Optimization-Based Approach to Combined Task and Motion Planning, Toussaint M. (2015).
AlphaGo
Mastering the game of Go with deep neural networks and tree search, Silver D. et al. (2016).
AlphaGo Zero
Mastering the game of Go without human knowledge, Silver D. et al. (2017).
AlphaZero
Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm, Silver D. et al. (2017).
TrailBlazer
Blazing the trails before beating the path: Sample-efficient Monte-Carlo planning, Grill J. B., Valko M., Munos R. (2017).
MCTSnets
Learning to search with MCTSnets, Guez A. et al. (2018).
ADI
Solving the Rubik’s Cube Without Human Knowledge, McAleer S. et al. (2018).
OPC/SOPC
Continuous-action planning for discounted infinite-horizon nonlinear optimal control with Lipschitz values, Buşoniu L., Pall E., Munos R. (2018).
- Real-time tree search with pessimistic scenarios: Winning the NeurIPS 2018 Pommerman Competition, Osogami T., Takahashi T. (2019).
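Several of the tree-search references above build on bandit-style action selection at each node. The sketch below shows the UCB1 rule that UCT-style planners apply when choosing which child to descend into. It is an illustration under stated assumptions, not code from any listed paper: the function name `ucb1_select`, the list-based statistics, and the default exploration constant `c = sqrt(2)` are hypothetical choices for the example.

```python
import math


def ucb1_select(counts, values, c=math.sqrt(2)):
    """Pick a child index with the UCB1 rule (illustrative sketch).

    counts[i]: number of times child i has been visited.
    values[i]: mean return observed for child i.
    c: exploration constant (assumed default; tuned per problem in practice).
    """
    # Visit every child once before comparing confidence bounds.
    for i, n in enumerate(counts):
        if n == 0:
            return i
    total = sum(counts)
    # Score = empirical mean + exploration bonus that shrinks with visits.
    scores = [
        values[i] + c * math.sqrt(math.log(total) / counts[i])
        for i in range(len(counts))
    ]
    return max(range(len(counts)), key=scores.__getitem__)
```

With equal visit counts the rule reduces to picking the highest mean, while a rarely visited child can be preferred over a better-looking but heavily sampled one, which is the exploration behavior these planners rely on.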
1.4 Control Theory
- (book) The Mathematical Theory of Optimal Processes, Pontryagin L. S., Boltyanskii V. G., Gamkrelidze R. V., Mishchenko E. F. (1962).
- (book) Constrained Control and Estimation, Goodwin G. (2005).
PI²
A Generalized Path Integral Control Approach to Reinforcement Learning, Theodorou E. et al. (2010).
PI²-CMA
Path Integral Policy Improvement with Covariance Matrix Adaptation, Stulp F., Sigaud O. (2010).
iLQG
A generalized iterative LQG method for locally-optimal feedback control of constrained nonlinear stochastic systems, Todorov E. (2005).
iLQG+
Synthesis and stabilization of complex behaviors through online trajectory optimization, Tassa Y. (2012).
1.5 Model Predictive Control
- (book) Model Predictive Control, Camacho E. (1995).
- (book) Predictive Control With Constraints, Maciejowski J. M. (2002).
- Linear Model Predictive Control for Lane Keeping and Obstacle Avoidance on Low Curvature Roads, Turri V. et al. (2013).
MPCC
Optimization-based autonomous racing of 1:43 scale RC cars, Liniger A. et al. (2014).
MIQP
Optimal trajectory planning for autonomous driving integrating logical constraints: An MIQP perspective, Qian X., Altché F., Bender P., Stiller C., de La Fortelle A. (2016).