diff --git a/class02/class02.jl b/class02/class02.jl new file mode 100644 index 0000000..63200b9 --- /dev/null +++ b/class02/class02.jl @@ -0,0 +1,1333 @@ +### A Pluto.jl notebook ### +# v0.20.13 + +using Markdown +using InteractiveUtils + +# This Pluto notebook uses @bind for interactivity. When running this notebook outside of Pluto, the following 'mock version' of @bind gives bound variables a default value (instead of an error). +macro bind(def, element) + #! format: off + return quote + local iv = try Base.loaded_modules[Base.PkgId(Base.UUID("6e696c72-6542-2067-7265-42206c756150"), "AbstractPlutoDingetjes")].Bonds.initial_value catch; b -> missing; end + local el = $(esc(element)) + global $(esc(def)) = Core.applicable(Base.get, el) ? Base.get(el) : iv(el) + el + end + #! format: on +end + +# ╔═╡ 7b896268-4336-47e2-a8b5-f985bfde51f5 +begin + import Pkg + + try + Pkg.activate(@__DIR__) + catch + # Fallback if @__DIR__ not available in your setup + end + Pkg.instantiate() + + # Add any missing packages + for pkg in [ + "PlutoUI", + "PlutoTeachingTools", + "MarkdownLiteral", + "ForwardDiff", + "Plots" + ] + if Base.find_package(pkg) === nothing + Pkg.add(pkg) + end + end + + using Markdown, InteractiveUtils, PlutoUI, PlutoTeachingTools, MarkdownLiteral, LinearAlgebra, ForwardDiff + import Plots + const plt = Plots + plt.default(size=(700,420)) +end + + +# ╔═╡ 81ebc291-89f0-4c1e-ac34-d5715977dd86 +md"""| | | | +|-----------:|:--|:------------------| +| Lecturer | : | Arnaud Deza | +| Date | : | 29 of August, 2025 |""" + +# ╔═╡ 9543f7bc-ab36-46ff-b471-9aa3db9739e4 +ChooseDisplayMode() + +# ╔═╡ 8969e78a-29b0-46d3-b6ba-59980208fe5b +md""" +#### Reference Material + +[^cmu11]: Z. Manchester, Optimal Control (CMU 16-745) 2025 Lecture 3, Carnegie Mellon University, YouTube, 2025. [Online]. Available: [https://www.youtube.com/watch?v=f7yF0KOV-sI](https://www.youtube.com/watch?v=f7yF0KOV-sI) + +[^cmu13]: Z. Manchester, Optimal Control (CMU 16-745) 2025 Lecture 4, Carnegie Mellon University, YouTube, 2025. [Online]. Available: [https://www.youtube.com/watch?v=lIuPIlDxLNU](https://www.youtube.com/watch?v=lIuPIlDxLNU) + +[^cmu13]: Z. Manchester, Optimal Control (CMU 16-745) 2025 Lecture 5, Carnegie Mellon University, YouTube, 2025. [Online]. Available: [https://www.youtube.com/watch?v=bsBXk17rff4](https://www.youtube.com/watch?v=bsBXk17rff4) +""" + +# ╔═╡ d90e9be0-7b68-4139-b185-6cbaad0d307e +md""" +### Some imports +""" + +# ╔═╡ 342decc1-43fa-432a-9a9c-757a10ba6a5d +md""" +# Part 0: Overview of Lecture 2 + +Lecture 2 is the course’s optimization backbone: it makes explicit the idea that most control problems are optimization problems in disguise. +We set the common language (gradients/Hessians, KKT systems, globalization) and the “solver toolbox” (penalty, augmented Lagrangian, primal–dual interior-point, SQP) that shows up everywhere else: +* MPC is a QP/NLP solved online; +* trajectory optimization is an NLP with sparse structure; +* distributed control leans on operator splitting; + +and modern differentiable controllers backprop through these solvers. + +The point isn’t to prove every theorem—it’s to give you a reliable recipe for turning a control task into a well-posed QP/NLP and picking/configuring a solver that actually converges. + +Positionally, this lecture is the hinge between “dynamics & modeling” (Class 1) and everything that follows: PMP/LQR reframed via KKT and Riccati; nonlinear trajectory optimization and collocation (Class 5); distributed MPC/ADMM (Class 8); GPU-accelerated solves (Class 9); and the learning side—adjoints for Neural DEs (Class 10) and optimization-as-a-layer for PINNs/neural operators (Classes 11–13). + +Concretely, you leave with (i) a map from control formulations to QP/NLP templates, (ii) rules of thumb for choosing between ALM, IPM, and SQP, and (iii) practical globalization (regularization + line search) so demos and projects are reproducible and fast. + +### General structure of lecture + +- **Root finding:** how implicit time-stepping (Backward Euler) turns into solving $r(x)=0$ each step. +- **Unconstrained minimization:** Newton’s method and why we need *globalization* (regularization + line search). +- **Constrained minimization:** KKT conditions in action for equality constraints. +- **Interior Point Method:** IPM, augmented lagragnian method. +- **SQP (Sequential Quadratic Programming):** how to reduce a nonlinear program to a sequence of easy QPs. +""" + +# ╔═╡ fd1ad74b-cb74-49a7-80b8-1a282abfdff2 +md""" +# Part I — Unconstrained Minimization as Root Finding + +In control, *implicit* simulators and optimizers show up everywhere: + +- **Implicit time stepping** (e.g., Backward Euler) for stability on stiff dynamics. +- **KKT systems** and **Newton steps** inside constrained optimization. +- **Adjoints** in learning-to-control are solutions of linearized/adjoint equations. + +All three reduce to solving nonlinear equations of the form $r(x)=0$. This section builds the minimal, reusable toolkit to do that reliably. + +**Example we use:** one implicit Backward Euler step for $\dot x = f(x)$: +$x_{n+1} = x_n + h\,f(x_{n+1}) \;\Rightarrow\; r(x) \equiv x_n + h\,f(x) - x = 0.$ + +We compare two solvers for $r(x)=0$: a **fixed-point** iteration and **Newton’s method**, and relate what you see to the theory (contraction vs. quadratic convergence). + +## From differential equations to algebraic equations + +Why not just use explicit Euler? For many control systems (pendulum near upright, contact, power systems) the dynamics can be **stiff**. Stability and larger steps favor **implicit** schemes, but those require solving $r(x)=0$ each step. + +For Backward Euler (BE), +$r(x) = x_n + h\,f(x) - x,$ +$\partial r(x) = h\,J_f(x) - I,$ +where $J_f$ is the Jacobian of $f$. Good solvers exploit this structure: the residual is “identity minus something small” for small $h$, and the Jacobian carries the physics through $J_f$. + +## Two algorithms we’ll compare + +**Fixed-point (Picard)** defines $g(x):=x_n + h\,f(x)$ and iterates +$x_{k+1}=g(x_k).$ +It converges if $g$ is a **contraction** near the root (intuitively, $h\,\|J_f\|$ small). The local linear rate is controlled by $\rho\!\left(J_g(x^\star)\right)=\rho\!\left(h\,J_f(x^\star)\right)$. + +**Newton’s method** linearizes $r$ and solves for a correction: +$[\partial r(x_k)]\,\Delta x_k = -\,r(x_k), \quad x_{k+1}=x_k+\Delta x_k,$ +so for BE, +$\underbrace{(h\,J_f(x_k)-I)}_{\text{BE Jacobian}}\Delta x_k = -\,(x_n+h\,f(x_k)-x_k).$ + +**Stopping & diagnostics.** We track the **residual norm** $\|r(x_k)\|$ and optionally the **step norm** $\|\Delta x_k\|$. Reporting both avoids “false convergence” when steps stagnate but residuals do not. + +## What the theorems (briefly) say + +- **Banach fixed-point theorem (Picard):** If $g$ is a contraction on a neighborhood of $x^\star$ (i.e., $\|J_g(x)\|\le q<1$), then $x_{k+1}=g(x_k)$ converges **linearly** to $x^\star$. For BE, $g'(x^\star)=h\,J_f(x^\star)$, so small $h$ or well-scaled $f$ helps. + +- **Newton’s method:** If $r\in C^2$, $\partial r(x^\star)$ is nonsingular, and $x_0$ is close enough to $x^\star$, then the iterates are **quadratically** convergent: +$\|x_{k+1}-x^\star\| \le C\,\|x_k-x^\star\|^2.$ +In practice we use **damping / line search** to reach the fast local regime from farther starts. + +**Takeaway:** Picard is simple but fragile; Newton has higher per-iteration cost but far fewer iterations and better robustness for implicit steps. +""" + +# ╔═╡ 49d5b2e6-eb29-478c-b817-8405d55170b1 +begin + """ + pendulum_dynamics(x) + + x = [θ; v], with θ̇ = v, v̇ = -(g/ℓ) sin(θ). + Returns ẋ. + """ + function pendulum_dynamics(x::AbstractVector) + ℓ = 1.0 + g = 9.81 + θ, v = x[1], x[2] + return @. [v, -(g/ℓ)*sin(θ)] + end +end + +# ╔═╡ 950c61b8-f076-4b9a-8970-e5c2841d75f2 +md""" +## Residual and Jacobian + +For the implicit step, define +$r(x) = x_n + h\,f(x) - x,$ +$\partial r(x) = \dfrac{\partial}{\partial x}\big(x_n + h\,f(x) - x\big) = h\,J_f(x) - I.$ +""" + + +# ╔═╡ 92841a2e-bc0d-40f8-8344-a5c398a67275 +begin + # Residual for BE step: r(x) = x_n + h f(x) - x + residual(fun, x_n, h, x) = x_n .+ h .* fun(x) .- x + + function jac_residual(fun, x_n, h, x) + if fun === pendulum_dynamics + ℓ = 1.0; g = 9.81; θ = x[1] + Jf = [ 0.0 1.0 + -(g/ℓ)*cos(θ) 0.0 ] + return Matrix(h * Jf - I) # dense, safe for \ + else + Jf = ForwardDiff.jacobian(fun, x) + return Matrix(h * Jf - I) + end + end +end + +# ╔═╡ 8813982c-8c9a-4706-91a8-ebadf9323a4f +md""" +## Root solvers + +**Linear-algebra view.** Each Newton update solves +$(h\,J_f(x_k)-I)\,\Delta x_k = -\,r(x_k).$ +For small $h$, the matrix is close to $-I$ (well-conditioned); as $h$ grows or the dynamics stiffen, conditioning deteriorates — that is when damping and scaling matter. + + + +**Fixed point (Picard)** updates: $x_{k+1} = g(x_k) = x_n + h\,f(x_k)$. +Converges locally if \(g\) is a contraction (roughly, small enough \(h\)). + +**Newton** uses the true Jacobian \(hJ_f - I\); near a root it is typically quadratic (fewer iterations). + +We’ll stop when $\|r(x_k)\|$ is small or a max iteration budget is reached. +""" + +# ╔═╡ 4307a2f3-0378-4282-815b-9ed1fa532e1c +begin + const SAFE = x -> max(x, eps()) # for log-scale plotting later + + """ + be_step_fixed_point(fun, x_n, h; tol=1e-8, maxiter=30) + + One implicit BE step via fixed-point. Returns (x_next, residual_norms). + """ + function be_step_fixed_point(fun, x_n, h; tol=1e-8, maxiter=30) + x = copy(x_n) # natural initial guess + errs = Float64[] + for _ in 1:maxiter + fx = fun(x) + r = x_n .+ h .* fx .- x + push!(errs, norm(r)) + if errs[end] ≤ tol; break; end + x = x_n .+ h .* fx # Picard update using same fx + end + return x, errs + end + + """ + be_step_newton(fun, x_n, h; tol=1e-10, maxiter=20) + + One implicit BE step via Newton on r(x)=0. Returns (x_next, residual_norms). + """ + function be_step_newton(fun, x_n, h; tol=1e-10, maxiter=20) + x = copy(x_n) + errs = Float64[] + for _ in 1:maxiter + r = residual(fun, x_n, h, x) + push!(errs, norm(r)) + if errs[end] ≤ tol; break; end + Dr = jac_residual(fun, x_n, h, x) + Δx = - Dr \ r + x .= x .+ Δx + end + return x, errs + end +end + +# ╔═╡ a45ed97f-f7c1-4ef5-9bc7-654e827f751b +md""" +### Controls + +Initial state $x_n = [\theta,\, v]$: +- $\theta$ $(@bind θ0 Slider(-3.0:0.05:3.0, default=0.10, show_value=true)) +- $v$ $(@bind v0 Slider(-6.0:0.1:6.0, default=0.00, show_value=true)) + +Step size $h$: $(@bind hstep Slider(0.1:0.1:0.2, default=0.1, show_value=true)) +Max iterations: $(@bind iters_max Select([10, 15, 20, 30]; default=20)) + +Compare: Fixed-point $(@bind show_fp CheckBox(default=true)) +/ Newton $(@bind show_nt CheckBox(default=true)) + +$(@bind run Button("Run")) +""" + + + +# ╔═╡ 5a17f83e-751b-4244-9c15-7165645bfe29 +begin + run + + x_n = [θ0, v0] + + x_fp, e_fp = show_fp ? be_step_fixed_point(pendulum_dynamics, x_n, hstep; maxiter=iters_max) : (x_n, Float64[]) + x_nt, e_nt = show_nt ? be_step_newton(pendulum_dynamics, x_n, hstep; maxiter=iters_max) : (x_n, Float64[]) + + efp = isempty(e_fp) ? Float64[] : max.(e_fp, eps()) # safe for log scale + ent = isempty(e_nt) ? Float64[] : max.(e_nt, eps()) + + p = plt.plot(title="Single-step convergence (‖r(x_k)‖)", + xlabel="iteration k", ylabel="‖r(x_k)‖", yscale=:log10) + + if !isempty(efp) + plt.plot!(1:length(efp), efp, lw=2, marker=:circle, label="Fixed-point") + end + if !isempty(ent) + plt.plot!(1:length(ent), ent, lw=2, marker=:utriangle, label="Newton") + end + p +end + + +# ╔═╡ 12c03202-58b2-482e-a65c-b83bc1f6eed1 +md""" +**What to notice** + +- For small enough \(h\), **fixed point** often converges, but slowly (roughly linear rate). +- **Newton** typically reaches the tolerance in a handful of steps (near-quadratic rate). +- Increase \(h\) to see fixed point struggle while Newton remains robust (until the Jacobian becomes ill-conditioned). +""" + +# ╔═╡ 662d58d7-4c9c-4699-a846-cb6070c507d9 +md""" +# Part II - Unconstrained Minimization — Newton, Regularization, and Line Search + +**Learning goals** +- Apply Newton’s method to a nonconvex scalar objective. +- See why plain Newton can **increase** the objective (negative curvature). +- Stabilize Newton with **Hessian regularization**. +- Add **Armijo backtracking** to get robust progress. + +We will use the toy function +$f(x) = x^4 + x^3 - x^2 - x$ +which has both minima and maxima, so it’s perfect to show where Newton can go wrong. +""" + +# ╔═╡ 27760607-b740-47dc-a810-c332baa2bd2d +begin + #### Problem definition (scalar) #### + f(x::Real) = x^4 + x^3 - x^2 - x + ∇f(x::Real) = 4x^3 + 3x^2 - 2x - 1 + ∇²f(x::Real)= 12x^2 + 6x - 2 + + grad_f(x) = ∇f(x) + hess_f(x) = ∇²f(x) + + #### Plot helpers #### + function plot_fx(; xrange=(-1.75, 1.25), npts=1000, title="Objective") + xs = range(xrange[1], xrange[2], length=npts) + plt.plot(xs, f.(xs), lw=2, label="f(x)", xlabel="x", ylabel="f(x)", title=title) + end + + function plot_trace(xhist; xrange=(-1.75, 1.25), npts=1000, title="") + xs = range(xrange[1], xrange[2], length=npts) + p = plt.plot(xs, f.(xs), lw=2, label="f(x)", xlabel="x", ylabel="f(x)", title=title) + plt.scatter!(p, xhist, f.(xhist), marker=:x, ms=7, label="iterates") + return p + end +end + + +# ╔═╡ 7765e032-c520-4f97-b877-0d479f383f28 +md""" +## 1) Inspect the objective + +Below is the graph of \(f(x)\). The multiple critical points (max/min) make it a nice testbed for Newton. +""" + +# ╔═╡ a098a49b-a368-4929-824c-385a06b88696 +plot_fx(title="f(x) = x^4 + x^3 - x^2 - x") + +# ╔═╡ a62f1b6a-87fe-4401-bc13-42166ca0e129 +md""" +## 2) Plain Newton (no safeguards) + +Newton update: + + +$x_{k+1} = x_k - \frac{\nabla f(x_k)}{\nabla^2 f(x_k)}$. + + +If $\nabla^2 f(x_k) < 0$, the step **ascends** along \(f\). + +Use the sliders to choose a start and number of iterations. +""" + +# ╔═╡ b2ddc048-4942-43bc-8faa-1921062d8c9c +begin + # One Newton step (no regularization, no line search) + newton_step_plain(x) = -grad_f(x) / hess_f(x) + + # Driver for plain Newton + function newton_optimize_plain(x0; tol=1e-8, maxiters=10) + x = float(x0) + xhist = Float64[x] + for _ in 1:maxiters + g = grad_f(x) + H = hess_f(x) + if abs(g) <= tol || !isfinite(H) + break + end + Δx = -g / H + if !isfinite(Δx) + break + end + x += Δx + push!(xhist, x) + if abs(grad_f(x)) <= tol + break + end + end + return xhist + end +end + + +# ╔═╡ 6fd7a753-6db7-4c37-9e22-dc63dd3072c8 +begin + @bind x0_plain PlutoUI.Slider(-1.75:0.01:1.1, default=-1.50, show_value=true) +end + + +# ╔═╡ 49e0f5c3-fe14-42e1-9b3a-83c4447148a8 +begin + @bind iters_plain PlutoUI.Slider(1:20, default=8, show_value=true) +end + +# ╔═╡ 096ebc95-f133-4ca3-b942-cf735faaa42b +begin + xhist_plain_1 = newton_optimize_plain(x0_plain; maxiters=iters_plain) + plot_trace(xhist_plain_1; title = "Plain Newton from x₀ = $(round(x0_plain,digits=3)) and iters = $(iters_plain)") +end + + +# ╔═╡ fdf41c76-4405-49e0-abfa-5c5193de99f4 +md""" +## 3) Globalization: Regularization + Armijo backtracking + +Two simple fixes: + +1. **Regularization:** If $\nabla^2 f(x_k)\le 0$, replace it by $\nabla^2 f(x_k) + \beta $ with $\beta>0$ (LM-style), so the step is a **descent** direction. +2. **Armijo line search:** Scale the step by \(\alpha \in (0,1]\) until +$f(x_k+\alpha\Delta x) \le f(x_k) + b\,\alpha\,\nabla f(x_k)\,\Delta x$, +with typical $b=10^{-4}$, $c=0.5$ for backtracking. + +We’ll expose both as toggles. +""" + + +# ╔═╡ d259c1b8-3716-4f80-b462-3b3daebb444d +md""" +### Controls + +**Start x₀:** $(@bind x0_cmp Slider(-1.5:0.01:0.75, default=0.00, show_value=true)) +**Iterations:** $(@bind iters_cmp Slider(1:40, default=10, show_value=true)) + +**Regularization β₀:** $(@bind beta0 Slider(0.0:0.1:5.0, default=1.0, show_value=true)) + +**Armijo b:** $(@bind armijo_b Select([1e-4, 1e-3, 1e-2]; default=1e-4)) +**Armijo c (backtracking factor):** $(@bind armijo_c Slider(0.1:0.05:0.9, default=0.5, show_value=true)) + +**Show:** +- Plain Newton $(@bind show_plain CheckBox(default=true)) +- Regularized $(@bind show_reg CheckBox(default=true)) +- Reg + Line Search $(@bind show_rls CheckBox(default=true)) + +**Layout:** $(@bind orientation Select(["row (1×N)", "column (N×1)"]; default="row (1×N)")) +""" + + +# ╔═╡ e012d7e0-0181-49d0-bb78-995378c4f87a +md""" +## 4) Takeaways + +- **Plain Newton** is fast **if** you’re in a nice region and the Hessian is positive. +- **Regularization** turns the Newton step into a descent direction when curvature is negative or near-singular. +- **Armijo backtracking** fixes overshooting and makes progress predictable without hand-tuning step sizes. +- These building blocks generalize to higher-dimensional problems and constrained KKT systems (next section of the tutorial). + +**Further experiments** +- Change the objective (keep the same drivers). +- Try different Armijo parameters $b, c$. +- Visualize **$|\nabla f(x_k)|$** vs iteration to see convergence. +""" + + +# ╔═╡ 0a3eb748-fab3-4f8b-993e-2246c32fb6aa + + +# ╔═╡ 3989b8e1-ac1f-430d-9d72-e298ba7ae0ca + + +# ╔═╡ e466ffb2-fcc8-4c3b-9059-7fd68a4265f2 + + +# ╔═╡ 09a301c7-c1f4-4890-afe9-fbc1d7c3b905 + + +# ╔═╡ 53020b8a-4948-467a-8629-bef9496d0374 + + +# ╔═╡ 26c887ce-d95b-4e38-9717-e119de0e80ca +md""" +# Part III - Constrained Optimization (KKT) + +#### Setup (equality constraints) +We solve +```math +\min_{x\in\mathbb{R}^n} f(x) \quad \text{s.t.}\ C(x)=0,\qquad C:\mathbb{R}^n\to\mathbb{R}^m. +``` + +**Geometric picture.** At an optimum on the manifold `C(x)=0`, the gradient is orthogonal to the tangent space: + +```math +\nabla f(x^\star)\ \perp\ \mathcal{T}_{x^\star}=\{p:\ J_C(x^\star)p=0\}. +``` + +Equivalently, the gradient is a combination of the constraint normals: + +```math +\nabla f(x^\star)+J_C(x^\star)^{\!T}\lambda^\star=0,\qquad C(x^\star)=0. +``` + +**Lagrangian.** Define $L(x,\lambda)=f(x)+\lambda^{\!T}C(x)$. +""" + +# ╔═╡ a1b7f614-710a-4ce8-815b-2f94754088c4 +md""" +#### From conditions to a solver (Newton/SQP on the KKT system) + +Linearizing feasibility and stationarity gives the **saddle-point** (KKT) system: + +```math +\begin{bmatrix} +H & J_C(x)^{\!T} \\ +J_C(x) & 0 +\end{bmatrix} +\begin{bmatrix}\Delta x\\ \Delta\lambda\end{bmatrix} +=- +\begin{bmatrix} +\nabla f(x)+J_C(x)^{\!T}\lambda\\ +C(x) +\end{bmatrix}, +\qquad +H \approx \nabla^2 f(x) + \sum_{i=1}^m \lambda_i\,\nabla^2 C_i(x). +``` + +Two common choices for `H`: + +* **Full Newton (Hessian of the Lagrangian):** as written above (fast near a solution). +* **Gauss–Newton/SQP:** drop the constraint-curvature term so $H\approx\nabla^2 f(x)$ (often more robust far from the solution). + +> Practical: this system is symmetric indefinite; block elimination (Schur complement) and sparse factorizations are standard. +""" + +# ╔═╡ 6fd82e73-6fef-4f7f-b291-f94cbac0d268 +md""" +#### Inequality constraints (KKT, first-order) + +For $c(x)\ge 0$, + +```math +\begin{aligned} +&\text{Stationarity:} && \nabla f(x)-J_c(x)^{\!T}\lambda=0,\\ +&\text{Primal feasibility:} && c(x)\ge 0,\\ +&\text{Dual feasibility:} && \lambda\ge 0,\\ +&\text{Complementarity:} && \lambda^{\!T}c(x)=0\quad(\lambda_i\,c_i(x)=0\ \forall i). +\end{aligned} +``` + +**Interpretation.** Active constraints ($c_i(x)=0$) may carry nonzero multipliers; inactive ones ($c_i(x)>0$) have $\lambda_i=0$. +""" + +# ╔═╡ 52718e0b-f958-445e-b9b6-9e5baf09e81a + + +# ╔═╡ edce5b27-9af8-4010-9d9f-60681b2f427c + + +# ╔═╡ c139ccc9-0355-4c41-8d82-0c97fef50900 + + +# ╔═╡ 43c5ccb9-d51e-4d73-b7be-f8b9dd599130 +md""" +# Part III Code - Equality Constrained Minimization (KKT) + +We solve + +```math +\min_x\; f(x) \quad \text{s.t.}\; c(x)=0 +``` + +with one equality constraint. The **KKT system** for a Newton/SQP step is + +```math +\begin{bmatrix} H & J^\top \\ J & 0 \end{bmatrix} +\begin{bmatrix} \Delta x \\ \Delta \lambda \end{bmatrix} += - \begin{bmatrix} \nabla f(x)+J^\top \lambda \\ c(x) \end{bmatrix}, +\quad J=\nabla c(x)^\top,\; H\approx\nabla^2\!L(x,\lambda). +``` + +We'll compare: + +* **Gauss–Newton/SQP**: $H=\nabla^2 f$ (drop constraint curvature) ⇒ robust PD top-left. +* **Full**: $H=\nabla^2 f + \lambda \nabla^2 c$. + +We’ll also use a simple merit function $\phi(x)=f(x)+\frac{\rho}{2}\,c(x)^2$ with Armijo backtracking. +""" + +# ╔═╡ e0a7f431-61dd-4df2-b53c-d81c7a302baf +begin + # === Objective & constraint (KKT section) === + Q_kkt = Diagonal([0.5, 1.0]) + xstar_kkt = [1.0, 0.0] + + # Scalar objective (use dot() to ensure a Float64, not 1×1) + f_kkt(x::AbstractVector) = 0.5 * dot(x .- xstar_kkt, Q_kkt * (x .- xstar_kkt)) + ∇f_kkt(x::AbstractVector) = Q_kkt * (x .- xstar_kkt) + ∇²f_kkt(::AbstractVector) = Q_kkt # constant SPD 2×2 + + # One equality constraint: c(x) = x1^2 + 2x1 - x2 + c_kkt(x::AbstractVector) = x[1]^2 + 2x[1] - x[2] + ∇c_kkt(x::AbstractVector) = [2x[1] + 2.0, -1.0] # length-2 + ∇²c_kkt(::AbstractVector) = [2.0 0.0; 0.0 0.0] # constant 2×2 +end + + + +# ╔═╡ 3e3d6e18-f922-4e91-8eaf-bc26b8f91757 +begin + function landscape_plot_kkt(; xlim=(-4,4), ylim=(-4,4), nsamp=120, show_colorbar=false) + xs = range(xlim[1], xlim[2], length=nsamp) + ys = range(ylim[1], ylim[2], length=nsamp) + Z = [ f_kkt([x,y]) for x in xs, y in ys ] + + p = plt.contour(xs, ys, Z; + levels=18, + colorbar=show_colorbar, + xlabel="x₁", ylabel="x₂", + title="Objective contours & constraint (c(x)=0)", + legend=:topleft, + aspect_ratio=:equal, + xlims=(xlim[1], xlim[2]), + ylims=(ylim[1], ylim[2])) + + xc = collect(xs) + yc = @. xc^2 + 2xc + plt.plot!(p, xc, yc; lw=2, label="c(x)=0") + plt.scatter!(p, [xstar_kkt[1]], [xstar_kkt[2]]; marker=:star5, ms=8, label="x⋆") + + plt.xlims!(p, (xlim[1], xlim[2])) + plt.ylims!(p, (ylim[1], ylim[2])) + return p + end + + plot_path_kkt!(p, X) = begin + plt.plot!(p, X[1,:], X[2,:]; marker=:x, ms=6, lw=1.5, label="iterates") + plt.xlims!(p, (-4,4)); plt.ylims!(p, (-4,4)) + p + end +end + + + +# ╔═╡ 31407c3f-8f3d-4dcc-bc49-cedd8ca14013 +begin + # === KKT step + simple merit line search (KKT section) === + + ϕ_kkt(x; ρ=1.0) = f_kkt(x) + 0.5*ρ*c_kkt(x)^2 + ∇ϕ_kkt(x; ρ=1.0) = ∇f_kkt(x) .+ ρ * c_kkt(x) .* ∇c_kkt(x) + + """ + kkt_step_kkt(x, λ; method=:gn, δ=1e-8) + + Solve the 3×3 KKT system: + [ H C'; C 0 ] [Δx; Δλ] = -[ ∇f + C'λ; c ] + with C = ∇c(x) as a 1×2 row, H = ∇²f (Gauss–Newton) or ∇²f + λ∇²c (Full). + A small ridge δ is added to H on the diagonal for numerical robustness. + """ + function kkt_step_kkt(x::AbstractVector, λ::Real; method::Symbol=:gn, δ::Float64=1e-8) + g = ∇f_kkt(x) # 2 + cv = ∇c_kkt(x) # length-2 + C = reshape(cv, 1, :) # 1×2 + H = Matrix(∇²f_kkt(x)) + if method === :full + H .+= λ .* ∇²c_kkt(x) + end + # add small ridge on diagonal (avoid UniformScaling arithmetic) + @inbounds for i in 1:size(H,1); H[i,i] += δ; end + + K = [H C'; C 0.0] # 3×3 + rhs = -vcat(g .+ λ .* cv, c_kkt(x)) + Δ = try + K \ rhs + catch + pinv(K) * rhs + end + Δx = Δ[1:2]; Δλ = Δ[3] + return Δx, Δλ + end + + """ + kkt_solve_kkt(x0, λ0; iters=8, method=:gn, linesearch=true, ρ=1.0, b=1e-4, cdec=0.5) + + A few KKT iterations with optional Armijo backtracking on ϕ_kkt. + Returns (X, Λ) with X ∈ ℝ^{2×(iters+1)}. + """ + function kkt_solve_kkt(x0, λ0; iters=8, method=:gn, linesearch=true, ρ=1.0, b=1e-4, cdec=0.5) + x = copy(x0); λ = float(λ0) + X = reshape(x, 2, 1); Λ = [λ] + for _ in 1:iters + Δx, Δλ = kkt_step_kkt(x, λ; method=method) + α = 1.0 + if linesearch + φx = ϕ_kkt(x; ρ=ρ) + gφ = ∇ϕ_kkt(x; ρ=ρ) + slope = dot(gφ, Δx) + for _ in 1:20 + if ϕ_kkt(x .+ α .* Δx; ρ=ρ) <= φx + b*α*slope + break + end + α *= cdec + end + end + x .+= α .* Δx + λ += α * Δλ + X = hcat(X, x); push!(Λ, λ) + end + return X, Λ + end +end + + +# ╔═╡ a61a63b7-716e-4500-9bc6-ab2caf9062e7 +md""" +### Controls (KKT) + +**x₁** $(@bind x1_kkt Slider(-3.5:0.1:1.5, default=-1.5, show_value=true)) +**x₂** $(@bind x2_kkt Slider(-3.0:0.1:3.0, default=-1.0, show_value=true)) + +**λ₀** $(@bind λ0_kkt Slider(-2.0:0.1:2.0, default=0.0, show_value=true)) +**Iterations** $(@bind iters_kkt Select([3, 5, 8, 10, 15, 20]; default=8)) + +**Method** $(@bind method_kkt Select([:gn, :full]; default=:gn)) +**Line search** $(@bind use_ls_kkt CheckBox(default=true)) + +**ρ (merit weight)** $(@bind rho_kkt Slider(0.1:0.1:5.0, default=1.0, show_value=true)) + +$(@bind run_kkt_btn Button("Run KKT")) +""" + + +# ╔═╡ 1fda2985-ab68-4460-98be-8621e5a5f1c8 +let + # === Run & plot (KKT section) === + run_kkt_btn # recompute only when the button is clicked + + x0 = [x1_kkt, x2_kkt] + X, Λ = kkt_solve_kkt(x0, λ0_kkt; iters=iters_kkt, method=method_kkt, + linesearch=use_ls_kkt, ρ=rho_kkt) + + fig = landscape_plot_kkt() + plot_path_kkt!(fig, X) + + feas = abs(c_kkt(X[:,end])) + stat = norm( ∇f_kkt(X[:,end]) .+ (Λ[end] .* ∇c_kkt(X[:,end])) ) + + plt.annotate!(fig, -1.8, 15.3, plt.text("feas = $(round(feas,digits=4))", 9)) + plt.annotate!(fig, -1.8, 12.3, plt.text("stat = $(round(stat,digits=4))", 9)) + + fig +end + +# ╔═╡ ea0f0b1e-65b1-4c66-b9fa-7f2c428e3459 + + +# ╔═╡ b4f6cce9-c39e-4325-a0e3-ab9c38179894 +md""" +# Part IV - Log-domain Interior-Point Method (IPM): Tiny QP + +We solve a 2D quadratic program with one inequality: + +- Objective: minimize $f(x) = \tfrac{1}{2}(x - [1,0])^\top Q (x - [1,0])$ with $Q=\mathrm{diag}(0.5, 1)$. +- Constraint: $c(x) = (-1, 1)\cdot x - 1 \ge 0$ (i.e., $x_2 \ge x_1 + 1$). + +**Log-domain substitution** for inequality handling: +$s = \sqrt{\rho}\,e^{\sigma}$, $\lambda = \sqrt{\rho}\,e^{-\sigma}$ so that $s\lambda = \rho$. + +We solve the relaxed KKT system $r(z;\rho)=0$ with unknowns $z=[x;\sigma]$ using damped Newton plus Armijo backtracking on the merit function $\phi(z)=\tfrac{1}{2}\lVert r(z)\rVert^2$. +""" + +# ╔═╡ 252554be-a839-4770-b222-fbb6a32df2ef +begin + # Quadratic objective + Q = Diagonal([0.5, 1.0]) + f(x) = 0.5 * dot(x .- [1.0, 0.0], Q * (x .- [1.0, 0.0])) + ∇f(x) = Q * (x .- [1.0, 0.0]) + + # Single inequality c(x) ≥ 0 with gradient A = (-1, 1) + A = [-1.0, 1.0] # length-2 vector + b = 1.0 + c(x) = dot(A, x) - b # scalar + J(x) = A' # 1×2 (shown for reference) +end + +# ╔═╡ 62d85874-9673-4893-bbff-fa75b4462e96 +begin + # Confidence check: our analytic ∇f, ∇²f vs automatic differentiation + fd_grad(x) = ForwardDiff.derivative(f, x) + fd_hess(x) = ForwardDiff.derivative(fd_grad, x) + + xsamp = [-1.0, -0.25, 0.0, 0.5, 1.0] + grad_err = maximum(abs.(fd_grad.(xsamp) .- ∇f.(xsamp))) + hess_err = maximum(abs.(fd_hess.(xsamp) .- ∇²f.(xsamp))) + + (grad_err < 1e-10 && hess_err < 1e-10) ? + tip(md"AD check passed: analytic derivatives match ForwardDiff on sample points.") : + danger(md"AD check mismatch. Largest errors — grad: $(grad_err), hess: $(hess_err)") +end + +# ╔═╡ 22bfe0a3-c61b-4dfe-8f20-a8bf807c2e14 +begin + # Regularized Newton direction (scalar LM-style) + function newton_direction_reg(x; β0::Float64=1.0, max_tries::Int=20) + g = grad_f(x) + H = hess_f(x) + if H > 0 + return -g / H, H + end + β = β0 + Hreg = H + tries = 0 + while Hreg <= 0 && tries < max_tries + Hreg = H + β + β *= 2 + tries += 1 + end + return -g / Hreg, Hreg + end + + # Armijo backtracking (scalar) + function armijo(f, x, Δx, g; b::Float64=1e-4, c::Float64=0.5, α0::Float64=1.0, max_backtracks::Int=50) + α = α0 + fx = f(x) + rhs = b * g * Δx + for _ in 1:max_backtracks + if f(x + α*Δx) <= fx + α*rhs + return α + end + α *= c + end + return α + end + + # Unified driver with full controls + function newton_optimize(x0; tol=1e-8, maxiters=20, regularize::Bool=false, linesearch::Bool=false, + β0::Float64=1.0, b::Float64=1e-4, c::Float64=0.5, α0::Float64=1.0) + x = float(x0) + xhist = Float64[x] + for _ in 1:maxiters + g = grad_f(x) + + # Direction + Δx, Hused = if regularize + newton_direction_reg(x; β0=β0) + else + (-g / hess_f(x), hess_f(x)) + end + + if !isfinite(Δx) || abs(g) <= tol + break + end + + # Step length + α = linesearch ? armijo(f, x, Δx, g; b=b, c=c, α0=α0) : 1.0 + + x = x + α*Δx + push!(xhist, x) + if abs(grad_f(x)) <= tol + break + end + end + return xhist + end +end + +# ╔═╡ 858d4ee1-4f15-4ced-b984-c0291237d359 +begin + # Build plots for whichever methods are selected + plots = Plots.Plot[] # vector of subplots + + if show_plain + xhist_plain = newton_optimize(x0_cmp; maxiters=iters_cmp, regularize=false, linesearch=false) + push!(plots, plot_trace(xhist_plain; title="Plain")) + end + if show_reg + xhist_reg = newton_optimize(x0_cmp; maxiters=iters_cmp, regularize=true, linesearch=false, β0=beta0) + push!(plots, plot_trace(xhist_reg; title="Regularized (β₀=$(round(beta0,digits=2)))")) + end + if show_rls + xhist_rls = newton_optimize(x0_cmp; maxiters=iters_cmp, regularize=true, linesearch=true, + β0=beta0, b=armijo_b, c=armijo_c) + push!(plots, plot_trace(xhist_rls; title="Reg + LineSearch (b=$(armijo_b), c=$(round(armijo_c,digits=2)))")) + end + + if isempty(plots) + md"Select at least one method to visualize." + else + n = length(plots) + layout_val = orientation == "row (1×N)" ? (1,n) : (n,1) + plt.plot(plots..., layout=layout_val, size=(1200, orientation=="row (1×N)" ? 360 : 900)) + end +end + + +# ╔═╡ 82e2eca9-4991-4538-8e1f-455cd46f849f +begin + # Generic version + function landscape_plot(; xlim=(-4,4), ylim=(-4,4), nsamp=120, show_colorbar=false) + xs = range(xlim[1], xlim[2], length=nsamp) + ys = range(ylim[1], ylim[2], length=nsamp) + Z = [ f([x,y]) for x in xs, y in ys ] # size(Z) = (length(xs), length(ys)) + + p = plt.contour(xs, ys, Z; + levels=18, + colorbar=show_colorbar, + xlabel="x₁", ylabel="x₂", + title="Objective contours & constraint", + legend=:topleft, + aspect_ratio=:equal, + xlims=(xlim[1], xlim[2]), # hard-lock axes + ylims=(ylim[1], ylim[2])) + + xc = collect(xs) + yc = @. xc^2 + 2xc + plt.plot!(p, xc, yc; lw=2, label="c(x)=0") + plt.scatter!(p, [xstar[1]], [xstar[2]]; marker=:star5, ms=8, label="x⋆") + + # re-enforce limits after adding series (paranoid, but safe) + plt.xlims!(p, (xlim[1], xlim[2])) + plt.ylims!(p, (ylim[1], ylim[2])) + return p + end + + plot_path!(p, X) = begin + plt.plot!(p, X[1,:], X[2,:]; marker=:x, ms=6, lw=1.5, label="iterates") + plt.xlims!(p, (-4,4)); plt.ylims!(p, (-4,4)) # keep locked if path goes outside + p + end +end + + +# ╔═╡ db230e89-e8ea-4c57-8517-0dbf68cab6b7 +md""" +## Visualizing the objective and the constraint + +We plot contour lines of $f(x)$ and the line $c(x)=0$ (the feasible region is on or above that line, i.e., $c(x)\ge 0$). +""" + +# ╔═╡ d9c51abd-2488-405f-9b4c-799f496c0dd1 +begin + function plot_landscape(; xlim=(-4.0, 4.0), ylim=(-4.0, 4.0)) + xs = range(xlim[1], xlim[2], length=200) + ys = range(ylim[1], ylim[2], length=200) + Z = [f([xi, yi]) for yi in ys, xi in xs] + + p = plt.contour( + xs, ys, Z; + fill=false, + xlabel="x₁", ylabel="x₂", + title="Objective & constraint", + aspect_ratio=1, # square figure; keep if you like + xlims=xlim, ylims=ylim # << fix the visible ranges + ) + + plt.plot!(p, xs, xs .+ 1; label="c(x)=0 (x₂ = x₁ + 1)") + return p + end + + fig_test = plot_landscape() # will be limited to [-4,4] × [-4,4] +end + + +# ╔═╡ af7cf190-fa37-40f7-ac9e-dbdb8e72fff8 +md""" +## Log-domain IP residuals + +With $z=[x;\sigma]$ and parameter $\rho>0$, define +$\lambda(z,\rho)=\sqrt{\rho}\,e^{-\sigma}$ and $s(z,\rho)=\sqrt{\rho}\,e^{\sigma}$. + +Residuals: +- Stationarity (2 numbers): $\nabla f(x) - \lambda A = 0$. +- Primal feasibility (1 number): $c(x) - s = 0$. + +So $r(z;\rho) = \begin{bmatrix} \nabla f(x) - \lambda A \\ c(x) - s \end{bmatrix} \in \mathbb{R}^3$. +""" + +# ╔═╡ 6c5b33c5-94fc-4b41-8128-696a58ca9b64 +begin + function ip_residual(z::AbstractVector, ρ::Real) + x = z[1:2] + σ = z[3] + λ = sqrt(ρ) * exp(-σ) # ≥ 0 + s = sqrt(ρ) * exp(σ) # ≥ 0 + r1 = ∇f(x) .- (λ .* A) # length-2 + r2 = c(x) - s # scalar + return vcat(r1, r2) # length-3 + end +end + +# ╔═╡ 64d20ba5-d195-41ef-88b8-d2968bf87ac2 +md""" +## Newton solve for fixed $\rho$ + +We solve $r(z; \rho)=0$ with damped Newton on the merit $\phi(z)=\tfrac{1}{2}\lVert r(z)\rVert^2$. + +Directional derivative of $\phi$ along $\Delta z$ is $r(z)^\top J(z)\,\Delta z$ where $J$ is the Jacobian of $r$. +""" + +# ╔═╡ 4f9c5ddd-05f5-4826-a775-bc1af4555153 +begin + function newton_solve(z0::AbstractVector, ρ; tol=1e-10, maxiter=50, α_min=1e-12) + z = copy(z0) + R = ip_residual(z, ρ) + Zhist = reshape(z, :, 1) + + for k in 1:maxiter + if norm(R) ≤ tol + return z, Zhist + end + Jz = ForwardDiff.jacobian(zz -> ip_residual(zz, ρ), z) # 3×3 + Δz = -Jz \ R + + ϕz = 0.5 * dot(R, R) + gΔ = dot(R, Jz * Δz) # directional derivative of ϕ at z along Δz + α = 1.0 + z_new = z .+ α .* Δz + R_new = ip_residual(z_new, ρ) + ϕnew = 0.5 * dot(R_new, R_new) + c_armijo, shrink = 1e-4, 0.5 + + if gΔ > 0 + # fallback: just ensure decrease in ϕ + while ϕnew > ϕz && α > α_min + α *= shrink + z_new = z .+ α .* Δz + R_new = ip_residual(z_new, ρ) + ϕnew = 0.5 * dot(R_new, R_new) + end + else + # standard Armijo + while ϕnew > ϕz + c_armijo*α*gΔ && α > α_min + α *= shrink + z_new = z .+ α .* Δz + R_new = ip_residual(z_new, ρ) + ϕnew = 0.5 * dot(R_new, R_new) + end + end + + z, R = z_new, R_new + Zhist = hcat(Zhist, z) + end + return z, Zhist + end +end + +# ╔═╡ 1fb54858-898e-4d0d-8435-746bf3c17e65 +md""" +#### KKT sanity checks to report: +* stationarity norm $\lVert\nabla f(x) - \lambda A\rVert$ +* primal feasibility $c(x)$ +* dual feasibility $\lambda$ +* and complementarity slackness $\lambda\,c(x)$ + +These should approach zero feasibility violation and complementarity as $\rho \to 0$. +""" + + +# ╔═╡ 2070905c-d96b-4e3d-8a51-f0bf26ba98b8 +begin + function kkt_checks(z, ρ) + x, σ = z[1:2], z[3] + λ = sqrt(ρ) * exp(-σ) + (; stationarity_norm = norm(∇f(x) .- (λ .* A)), + primal_feas = c(x), # should be ≥ 0 + dual_feas = λ, # should be ≥ 0 + complementarity = λ * c(x)) + end +end + + +# ╔═╡ 08270f19-7c1c-4707-baf8-fa985519baf3 +md""" +## Demo: follow the central path by decreasing $\rho$ + +We warm-start Newton at a smaller $\rho$ and plot the iterates on the $(x_1,x_2)$ plane. +""" + + +# ╔═╡ a4c50b10-0a6f-4967-9895-e5171f44fafa +begin + # Initial guess (must be feasible or strictly interior for the barrier) + x0_init = [-2.0, 2.0] + σ0_init = 0.0 + z0_init = vcat(x0_init, σ0_init) # [x; σ] ∈ ℝ^3 + + # Central-path parameters + ρ_path1 = 1.0 + ρ_path2 = 1e-8 +end + +# ╔═╡ 71530ac2-841c-4616-98e3-2dedd63e3834 +begin + # cold-start both runs from the same z0_init + sol_rho1_cold, traj_rho1_cold = newton_solve(z0_init, ρ_path1; tol=1e-10) + sol_rho2_cold, traj_rho2_cold = newton_solve(z0_init, ρ_path2; tol=1e-10) + + fig_ipm_compare = plot_landscape(xlim=(-4,4), ylim=(-4,4)) + plt.plot!(fig_ipm_compare; legend=:bottomright, grid=true) + + # starting point + plt.scatter!(fig_ipm_compare, [z0_init[1]], [z0_init[2]]; + label="start", markershape=:star5, markersize=9, markerstrokewidth=0, color=:black) + + # path & final for ρ = 1.0 + plt.plot!(fig_ipm_compare, vec(traj_rho1_cold[1, :]), vec(traj_rho1_cold[2, :]); + label="iterates ρ=1.0", lw=2, marker=:circle, markersize=3, color=:blue) + + # path & final for ρ = 1e-8 + plt.plot!(fig_ipm_compare, vec(traj_rho2_cold[1, :]), vec(traj_rho2_cold[2, :]); + label="iterates ρ=1e-8", lw=2, marker=:utriangle, markersize=3, color=:red) + + + fig_ipm_compare +end + + + +# ╔═╡ fe6ec86b-84c1-4e82-85e0-dce1a316214a +md""" +### Optional: Jacobian of the residual at the final point + +The Jacobian $J(z)$ of $r(z;\\rho)$ at the final iterate can be inspected via eigenvalues to gauge local conditioning. +""" + +# ╔═╡ 686b3f99-11df-4cc0-b739-3ff8476d217d + + +# ╔═╡ 8aa33896-2585-4a60-86dc-7386b96feb9f + + +# ╔═╡ 3132cda5-ad6e-4ae2-afba-ea18d09cf8df +md""" +# Part V - Sequential Quadratic Programming (SQP) + +**Idea.** Solve a nonlinear constrained problem by repeatedly solving a **quadratic program (QP)** built from local models. + +- Quadratic model of the Lagrangian/objective near the current iterate. +- Linearize the constraints around the current iterate. +- Each iteration: solve a QP to get a step $d$, then update $x \leftarrow x + \alpha d$. +- Strengths: Newton-like local convergence (often superlinear), warm-start friendly. +""" + + + +# ╔═╡ 74d5c04f-3f63-41ba-818d-ce217cd18022 +md""" +## Problem & KKT recap + +We consider +$\min_{x \in \mathbb{R}^n} \ f(x) \quad \text{s.t.}\quad g(x)=0,\ \ h(x)\le 0$. + +At a candidate optimum $x^\star$, the KKT conditions require multipliers +$\lambda \in \mathbb{R}^m$, $\mu \in \mathbb{R}^p_{\ge 0}$ such that +$\nabla f(x^\star) + \nabla g(x^\star)^\top \lambda + \nabla h(x^\star)^\top \mu = 0$, +$g(x^\star)=0,\ \ h(x^\star)\le 0,\ \ \mu \ge 0,\ \ \mu \odot h(x^\star)=0$. +""" + + +# ╔═╡ 4b560ea8-8104-40ce-bf44-b5c2267132eb +md""" +## Local models that define the QP + +At iterate $x_k$ with multipliers $(\lambda_k,\mu_k)$: + +**Quadratic model** +$m_k(d) \;=\; \nabla f(x_k)^\top d \;+\; \tfrac{1}{2}\, d^\top B_k\, d,$ +with $B_k \approx \nabla^2_{xx}\mathcal{L}(x_k,\lambda_k,\mu_k)$. Common choices: exact Hessian, (L-)BFGS, or Gauss–Newton. + +**Linearized constraints** +$ g(x_k) + \nabla g(x_k)\, d = 0, \qquad h(x_k) + \nabla h(x_k)\, d \le 0.$ +""" + +# ╔═╡ 96738f09-e894-499e-9e30-15692e2434dd +md""" +## The SQP subproblem (QP) + +$ +\begin{aligned} +\min_{d \in \mathbb{R}^n}\quad & +\nabla f(x_k)^\top d \;+\; \tfrac{1}{2}\, d^\top B_k\, d \\ +\text{s.t.}\quad & +\nabla g(x_k)\, d + g(x_k) = 0, \\ +& +\nabla h(x_k)\, d + h(x_k) \le 0. +\end{aligned} +$ + +Solving this QP yields a step $d_k$ and QP multipliers $(\lambda_{k+1},\mu_{k+1})$. +Update with $x_{k+1} = x_k + \alpha_k d_k$ (line search or trust region). +""" + + + +# ╔═╡ 704f8f36-fda1-445e-9551-942fd85bff42 +md""" +## SQP (line-search flavor): 6 steps + +1. Initialize $x_0$, multipliers $(\lambda_0,\mu_0)$, and $B_0 \succ 0$. +2. Build the QP at $x_k$ using $B_k$, $\nabla g(x_k)$, $\nabla h(x_k)$, $g(x_k)$, $h(x_k)$. +3. Solve the QP $\Rightarrow$ get $d_k$ and $(\lambda_{k+1},\mu_{k+1})$. +4. Choose $\alpha_k \in (0,1]$ via globalization (merit or filter). +5. Set $x_{k+1} = x_k + \alpha_k d_k$. +6. Update $B_{k+1}$ (e.g., damped BFGS). Stop when KKT residuals are small. +""" + + + +# ╔═╡ b6acef39-d40d-42e6-b037-10713dc1254e +md""" +### Globalization: make SQP robust + +**Merit (penalty) function** for line search, e.g. +$\phi(x) \;=\; f(x) \;+\; \tfrac{\rho}{2}\,\|g(x)\|^2 \;+\; \rho\,\|h(x)_+\|_1,$ +and choose $\alpha_k$ by Armijo/backtracking so $\phi$ decreases. + +**Filter methods** accept steps that improve *either* objective *or* feasibility. + +**Trust-region SQP**: restrict $\|d\|\le \Delta_k$, compare predicted vs actual reduction, adjust $\Delta_k$. + +### Inequalities & active sets (intuition) + +- The QP contains the **linearized inequalities** $h(x_k)+\nabla h(x_k)d \le 0$. +- Its KKT system enforces complementarity via multiplier signs and active constraints. +- A small **working set** (estimated active constraints) tends to stabilize across iterations, enabling warm starts and fast solves. + +""" + + + +# ╔═╡ 34737fc8-095c-4061-8661-e776c23c7eed + + +# ╔═╡ fc35e75c-b81d-44eb-bcee-74c590e38652 + + +# ╔═╡ 32587054-655f-4f06-9ff8-500ba44bec76 + + +# ╔═╡ 3dfc4b4e-85e3-4dee-ae33-5ddb2a4eec29 +question_box(md"hello") + +# ╔═╡ e8f62342-88ab-4754-af33-e2347be2daa0 +Foldable(md"Compressed vs. separated form", md" +We can either express the values of $x$ and $u$ at $t_{k + \frac{1}{2}}$ with the expressions above, or we can set them as decision variables, and enforce the expressions above as equality constraints. +Doing the former results in *compressed form* and the latter *separated form*. +The compressed form tends to perform better with a large number of segments and the separated form a small number of segments [^kelly2017]. +") + +# ╔═╡ Cell order: +# ╟─81ebc291-89f0-4c1e-ac34-d5715977dd86 +# ╟─9543f7bc-ab36-46ff-b471-9aa3db9739e4 +# ╠═8969e78a-29b0-46d3-b6ba-59980208fe5b +# ╟─d90e9be0-7b68-4139-b185-6cbaad0d307e +# ╠═7b896268-4336-47e2-a8b5-f985bfde51f5 +# ╟─342decc1-43fa-432a-9a9c-757a10ba6a5d +# ╟─fd1ad74b-cb74-49a7-80b8-1a282abfdff2 +# ╠═49d5b2e6-eb29-478c-b817-8405d55170b1 +# ╠═950c61b8-f076-4b9a-8970-e5c2841d75f2 +# ╠═92841a2e-bc0d-40f8-8344-a5c398a67275 +# ╠═8813982c-8c9a-4706-91a8-ebadf9323a4f +# ╠═4307a2f3-0378-4282-815b-9ed1fa532e1c +# ╠═a45ed97f-f7c1-4ef5-9bc7-654e827f751b +# ╠═5a17f83e-751b-4244-9c15-7165645bfe29 +# ╟─12c03202-58b2-482e-a65c-b83bc1f6eed1 +# ╠═662d58d7-4c9c-4699-a846-cb6070c507d9 +# ╠═27760607-b740-47dc-a810-c332baa2bd2d +# ╟─7765e032-c520-4f97-b877-0d479f383f28 +# ╠═a098a49b-a368-4929-824c-385a06b88696 +# ╠═62d85874-9673-4893-bbff-fa75b4462e96 +# ╠═a62f1b6a-87fe-4401-bc13-42166ca0e129 +# ╠═b2ddc048-4942-43bc-8faa-1921062d8c9c +# ╠═6fd7a753-6db7-4c37-9e22-dc63dd3072c8 +# ╠═49e0f5c3-fe14-42e1-9b3a-83c4447148a8 +# ╠═096ebc95-f133-4ca3-b942-cf735faaa42b +# ╠═fdf41c76-4405-49e0-abfa-5c5193de99f4 +# ╠═22bfe0a3-c61b-4dfe-8f20-a8bf807c2e14 +# ╠═d259c1b8-3716-4f80-b462-3b3daebb444d +# ╠═858d4ee1-4f15-4ced-b984-c0291237d359 +# ╠═e012d7e0-0181-49d0-bb78-995378c4f87a +# ╠═0a3eb748-fab3-4f8b-993e-2246c32fb6aa +# ╠═3989b8e1-ac1f-430d-9d72-e298ba7ae0ca +# ╠═e466ffb2-fcc8-4c3b-9059-7fd68a4265f2 +# ╠═09a301c7-c1f4-4890-afe9-fbc1d7c3b905 +# ╠═53020b8a-4948-467a-8629-bef9496d0374 +# ╠═26c887ce-d95b-4e38-9717-e119de0e80ca +# ╟─a1b7f614-710a-4ce8-815b-2f94754088c4 +# ╟─6fd82e73-6fef-4f7f-b291-f94cbac0d268 +# ╠═52718e0b-f958-445e-b9b6-9e5baf09e81a +# ╠═edce5b27-9af8-4010-9d9f-60681b2f427c +# ╠═c139ccc9-0355-4c41-8d82-0c97fef50900 +# ╟─43c5ccb9-d51e-4d73-b7be-f8b9dd599130 +# ╠═e0a7f431-61dd-4df2-b53c-d81c7a302baf +# ╠═82e2eca9-4991-4538-8e1f-455cd46f849f +# ╠═3e3d6e18-f922-4e91-8eaf-bc26b8f91757 +# ╠═31407c3f-8f3d-4dcc-bc49-cedd8ca14013 +# ╠═a61a63b7-716e-4500-9bc6-ab2caf9062e7 +# ╠═1fda2985-ab68-4460-98be-8621e5a5f1c8 +# ╠═ea0f0b1e-65b1-4c66-b9fa-7f2c428e3459 +# ╠═b4f6cce9-c39e-4325-a0e3-ab9c38179894 +# ╠═252554be-a839-4770-b222-fbb6a32df2ef +# ╠═db230e89-e8ea-4c57-8517-0dbf68cab6b7 +# ╠═d9c51abd-2488-405f-9b4c-799f496c0dd1 +# ╠═af7cf190-fa37-40f7-ac9e-dbdb8e72fff8 +# ╠═6c5b33c5-94fc-4b41-8128-696a58ca9b64 +# ╠═64d20ba5-d195-41ef-88b8-d2968bf87ac2 +# ╠═4f9c5ddd-05f5-4826-a775-bc1af4555153 +# ╠═1fb54858-898e-4d0d-8435-746bf3c17e65 +# ╠═2070905c-d96b-4e3d-8a51-f0bf26ba98b8 +# ╠═08270f19-7c1c-4707-baf8-fa985519baf3 +# ╠═a4c50b10-0a6f-4967-9895-e5171f44fafa +# ╠═71530ac2-841c-4616-98e3-2dedd63e3834 +# ╠═fe6ec86b-84c1-4e82-85e0-dce1a316214a +# ╠═686b3f99-11df-4cc0-b739-3ff8476d217d +# ╠═8aa33896-2585-4a60-86dc-7386b96feb9f +# ╠═3132cda5-ad6e-4ae2-afba-ea18d09cf8df +# ╟─74d5c04f-3f63-41ba-818d-ce217cd18022 +# ╟─4b560ea8-8104-40ce-bf44-b5c2267132eb +# ╟─96738f09-e894-499e-9e30-15692e2434dd +# ╟─704f8f36-fda1-445e-9551-942fd85bff42 +# ╟─b6acef39-d40d-42e6-b037-10713dc1254e +# ╠═34737fc8-095c-4061-8661-e776c23c7eed +# ╠═fc35e75c-b81d-44eb-bcee-74c590e38652 +# ╠═32587054-655f-4f06-9ff8-500ba44bec76 +# ╠═3dfc4b4e-85e3-4dee-ae33-5ddb2a4eec29 +# ╠═e8f62342-88ab-4754-af33-e2347be2daa0