Skip to content

Calculate optimal trajectories and parameter sensitivities in a classic discrete-time Linear Quadratic Regulator problem

License

Notifications You must be signed in to change notification settings

PhilipLoewen/DiscreteLQR

Repository files navigation

DiscreteLQR

The title is the name of a Python class that encodes discrete-time Linear-Quadratic Regulator (LQR) problems. An object in this class includes all the dynamic coefficients, and comes with methods to compute the optimal trajectory for any given initial point. But that's just the beginning!

Also provided are methods that return gradients of the problem's optimal value, V, with respect to each and every one of the coefficient matrices in the system definition. Additional methods will return the gradients with respect to the system parameters for an arbitrary smooth function W defined in terms of the optimal trajectory.

Details of the design and notation are provided in the docstring for the __init__ function.

Inspiration came from OptNet: Differentiable Optimization as a Layer in Neural Networks by Brandon Amos and J. Zico Kolter, online at https://arxiv.org/pdf/1703.00443.

This code refreshes and extends original work by Thiago da Cunha Vasco, https://github.com/thiagodcv.

Dependencies

This is a pure Python module. It relies on Numpy, and on two other modules here on the same GitHub site,

  • PhilipLoewen/PrettyPrinter, to display the results in an attractive format, and
  • PhilipLoewen/TensorGradient, to double-check theoretical gradients by finite-difference approximations.

Try it Yourself

To see what the module can do, download these files and the ones in the dependencies, and then try running the testing scripts. Start with test-LTI-optim.py. Look at the docstring for the __init__ function in module DiscreteLQR for an explanation of the output this produces, then try test-LQR-optim.py and move on to test-gradV-LTI-all.py and test-gradW-LTI-all.py.

Accuracy

For a system with n state components, m control components, and T time steps, a KKT formulation of the problem will generate a square matrix with N=(2n+m)T columns. When N is not too large, the methods here work quite well. However, things go wrong in interesting ways when N is large.

Efficiency

The KKT matrix mentioned above is block-tridiagonal and sparse. This makes it possible to solve the KKT equations using an efficent two-pass method. In fact, keeping track of the KKT structure makes it possible to proceed without ever explicitly forming (or writing, or storing) the fully KKT matrix. The code takes advantage of this, but it also includes a function that will build and print the matrix. (That was useful in the early stages of testing.)

Documentation

There are lots of comments in the code. For an overview that may clarify the mathematical side of the notation, look at the slide deck named presentation.pdf in the main repo.

Alternatives

Several authors have implementations of these ideas. This one is totally independent, so it may provide a useful comparison to check the consistency of different approaches. It would be interesting to compare the results available here with automated sensitivity analysis tools like the ones built into these recent packages from DeepMind: jax and optax.

Philip D Loewen, 2024-11-27

About

Calculate optimal trajectories and parameter sensitivities in a classic discrete-time Linear Quadratic Regulator problem

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages