github仓库
线性指的是系统是线性的,典型的线性系统的状态方程
x ( k + 1 ) = A x ( k ) + B u ( k ) x(k+1)=Ax(k)+Bu(k)
x ( k + 1 ) = A x ( k ) + B u ( k )
二次型是指代价函数 J J J 是二次型的。
J = 1 2 x ( N ) T S x ( N ) + 1 2 ∑ k = 0 N − 1 [ x T ( k ) Q x ( k ) + u T ( k ) R u ( k ) ] J=\frac{1}{2} x(N)^TSx(N)+\frac{1}{2}\sum_{k=0}^{N-1}{[x^T(k)Qx(k)+u^T(k)Ru(k)]}
J = 2 1 x ( N ) T S x ( N ) + 2 1 k = 0 ∑ N − 1 [ x T ( k ) Q x ( k ) + u T ( k ) R u ( k ) ]
其中前一项中 N N N 表示末端时刻, x d x_d x d 是系统的参考,也就是目标,这个 J J J 是末端代价, S S S 为末端状态的权重矩阵,是一个对角阵, Q Q Q 是运行过程中的权重矩阵, R R R 是控制量的权重矩阵, S S S 和 Q Q Q 是 N × N N\times N N × N 的,但是 R R R 是 P × P P\times P P × P 的,如果对某个元素要求大时可以对应的将 R R R 增大,关键的是 S S S 和 Q Q Q 都是半正定阵, R R R 为正定阵,只有这样系统才会有最小值,即 s ≥ 0 , q ≥ 0 , r > 0 s≥0,q≥0,r>0 s ≥ 0 , q ≥ 0 , r > 0
对于一个系统来说,当 x d x_d x d 为零时,该系统就是一个调节系统,不为零就是一个跟踪系统。
LQR线性二次型调节器
对应的调节二次型的 LQR 的代价函数就是
J = 1 2 x ( N ) T S x ( N ) + 1 2 ∑ k = 0 N − 1 [ x T ( k ) Q x ( k ) + u T ( k ) R u ( k ) ] J=\frac{1}{2} x(N)^TSx(N)+\frac{1}{2}\sum_{k=0}^{N-1}{[x^T(k)Qx(k)+u^T(k)Ru(k)]}
J = 2 1 x ( N ) T S x ( N ) + 2 1 k = 0 ∑ N − 1 [ x T ( k ) Q x ( k ) + u T ( k ) R u ( k ) ]
采用逆向分级的方法来进行分析
系统的状态空间方程如下
x ( k + 1 ) = A x ( k ) + B u ( k ) k = 0 : x ( 1 ) = A x ( 0 ) + B u ( 0 ) k = 1 : x ( 2 ) = A x ( 1 ) + B u ( 1 ) = A ( A x ( 0 ) + B u ( 0 ) ) + B u ( 1 ) … k = N − 1 : x ( N ) = A k + 1 x ( 0 ) + ∑ i = 0 k A k − i B u ( i ) x(k+1)=Ax(k)+Bu(k)\\\\\\
k=0:\ x(1)=Ax(0)+Bu(0)\\\\
k=1:\ x(2)=Ax(1)+Bu(1)=A(Ax(0)+Bu(0))+Bu(1)\\\\
\dots\\\\
k=N-1:\ x(N)=A^{k+1}x(0)+\sum_{i=0}^{k}A^{k-i}Bu(i)
x ( k + 1 ) = A x ( k ) + B u ( k ) k = 0 : x ( 1 ) = A x ( 0 ) + B u ( 0 ) k = 1 : x ( 2 ) = A x ( 1 ) + B u ( 1 ) = A ( A x ( 0 ) + B u ( 0 ) ) + B u ( 1 ) … k = N − 1 : x ( N ) = A k + 1 x ( 0 ) + i = 0 ∑ k A k − i B u ( i )
其中 u ( n − 1 ) u(n-1) u ( n − 1 ) 可以使 x ( n − 1 ) x(n-1) x ( n − 1 ) 转化为 x ( n ) x(n) x ( n )
那么根据上式也可以得出对应的调节二次型的雅各比矩阵
J = 1 2 x ( N ) T S x ( N ) + 1 2 ∑ k = 0 N − 1 [ x T ( k ) Q x ( k ) + u T ( k ) R u ( k ) ] k = N : J N → N ∗ = 1 2 x ( N ) T S x ( N ) = 1 2 x ( N ) T P ( 0 ) x ( N ) ; S = P ( 0 ) k = N − 1 : J N − 1 → N ∗ = 1 2 x ( N ) T S x ( N ) + 1 2 x ( N − 1 ) T Q x ( N − 1 ) + 1 2 u ( N − 1 ) T R u ( N − 1 ) = J N → N ∗ + 1 2 x ( N − 1 ) T Q x ( N − 1 ) + 1 2 u ( N − 1 ) T R u ( N − 1 ) k = N − 2 : J N − 2 → N ∗ = 1 2 x ( N ) T S x ( N ) + 1 2 x ( N − 1 ) T Q x ( N − 1 ) + 1 2 u ( N − 1 ) T R u ( N − 1 ) + 1 2 x ( N − 2 ) T Q x ( N − 2 ) + 1 2 u ( N − 2 ) T R u ( N − 2 ) = J N − 1 → N ∗ + 1 2 x ( N − 2 ) T Q x ( N − 2 ) + 1 2 u ( N − 2 ) T R u ( N − 2 ) J=\frac{1}{2} x(N)^TSx(N)+\frac{1}{2}\sum_{k=0}^{N-1}{[x^T(k)Qx(k)+u^T(k)Ru(k)]}\\k=N:J_{N\rightarrow N}^*=\frac{1}{2} x(N)^TSx(N)=\frac{1}{2} x(N)^TP(0)x(N);\ S=P(0)\\k=N-1:J_{N-1\rightarrow N}^* = \frac{1}{2} x(N)^TSx(N)+\frac{1}{2} x(N-1)^TQx(N-1)+\frac{1}{2}u(N-1)^TRu(N-1)=J^*_{N\rightarrow N}+\frac{1}{2} x(N-1)^TQx(N-1)+\frac{1}{2}u(N-1)^TRu(N-1)\\k=N-2:J_{N-2\rightarrow N}^* = \frac{1}{2} x(N)^TSx(N)+\frac{1}{2} x(N-1)^TQx(N-1)+\frac{1}{2}u(N-1)^TRu(N-1)+\frac{1}{2} x(N-2)^TQx(N-2)+\frac{1}{2}u(N-2)^TRu(N-2)\\=J_{N-1\rightarrow N}^*+\frac{1}{2} x(N-2)^TQx(N-2)+\frac{1}{2}u(N-2)^TRu(N-2)
J = 2 1 x ( N ) T S x ( N ) + 2 1 k = 0 ∑ N − 1 [ x T ( k ) Q x ( k ) + u T ( k ) R u ( k ) ] k = N : J N → N ∗ = 2 1 x ( N ) T S x ( N ) = 2 1 x ( N ) T P ( 0 ) x ( N ) ; S = P ( 0 ) k = N − 1 : J N − 1 → N ∗ = 2 1 x ( N ) T S x ( N ) + 2 1 x ( N − 1 ) T Q x ( N − 1 ) + 2 1 u ( N − 1 ) T R u ( N − 1 ) = J N → N ∗ + 2 1 x ( N − 1 ) T Q x ( N − 1 ) + 2 1 u ( N − 1 ) T R u ( N − 1 ) k = N − 2 : J N − 2 → N ∗ = 2 1 x ( N ) T S x ( N ) + 2 1 x ( N − 1 ) T Q x ( N − 1 ) + 2 1 u ( N − 1 ) T R u ( N − 1 ) + 2 1 x ( N − 2 ) T Q x ( N − 2 ) + 2 1 u ( N − 2 ) T R u ( N − 2 ) = J N − 1 → N ∗ + 2 1 x ( N − 2 ) T Q x ( N − 2 ) + 2 1 u ( N − 2 ) T R u ( N − 2 )
其中标 * 的表示最优解
根据贝尔曼最优理论:
当 J N − 2 → N J_{N-2→N} J N − 2 → N 为最优时 J N − 1 → N J_{N-1→N} J N − 1 → N 一定为最优的
观察上个式子,对于每一步的最优解都是取决于最后的一部分,因为前一部分的最优是根据上一步的最优来的,所以对于每一步我们只需要求解最后一部分,而且由于 x ( k ) x(k) x ( k ) 是系统状态量,所以是已知的,根本不需要考虑,因此就是求出最优的 u ( k ) u(k) u ( k ) 就可以了。所以对于J k → N J_{k→N} J k → N 的最优解求解就是直接求导,导数为0就可以得到最优控制策略,证明二阶导是一个正定矩阵,所以得到的 u ( k ) u(k) u ( k ) 一定是最优的解。
下面将展示如何求解 u ( N − 1 ) u(N-1) u ( N − 1 ) ,因为其他的求解方式与之相同,不再赘述
J N − 1 → N ∗ = 1 2 x ( N ) T S x ( N ) + 1 2 x ( N − 1 ) T Q x ( N − 1 ) + 1 2 u ( N − 1 ) T R u ( N − 1 ) = J N → N ∗ + 1 2 x ( N − 1 ) T Q x ( N − 1 ) + 1 2 u ( N − 1 ) T R u ( N − 1 ) J_{N-1\rightarrow N}^* = \frac{1}{2} x(N)^TSx(N)+\frac{1}{2} x(N-1)^TQx(N-1)+\frac{1}{2}u(N-1)^TRu(N-1)=J^*_{N\rightarrow N}+\frac{1}{2} x(N-1)^TQx(N-1)+\frac{1}{2}u(N-1)^TRu(N-1)
J N − 1 → N ∗ = 2 1 x ( N ) T S x ( N ) + 2 1 x ( N − 1 ) T Q x ( N − 1 ) + 2 1 u ( N − 1 ) T R u ( N − 1 ) = J N → N ∗ + 2 1 x ( N − 1 ) T Q x ( N − 1 ) + 2 1 u ( N − 1 ) T R u ( N − 1 )
由上述可知
x ( N ) = A x ( N − 1 ) + B u ( N − 1 ) x(N)=Ax(N-1)+Bu(N-1)
x ( N ) = A x ( N − 1 ) + B u ( N − 1 )
所以对应的雅可比矩阵可以转化为
J N − 1 → N ∗ = 1 2 ( A x ( N − 1 ) + B u ( N − 1 ) ) T P ( 0 ) ( A x ( N − 1 ) + B u ( N − 1 ) ) + 1 2 x ( N − 1 ) T Q x ( N − 1 ) + 1 2 u ( N − 1 ) T R u ( N − 1 ) J_{N-1\rightarrow N}^* = \frac{1}{2} (Ax(N-1)+Bu(N-1))^TP(0)(Ax(N-1)+Bu(N-1))+\frac{1}{2} x(N-1)^TQx(N-1)+\frac{1}{2}u(N-1)^TRu(N-1)
J N − 1 → N ∗ = 2 1 ( A x ( N − 1 ) + B u ( N − 1 ) ) T P ( 0 ) ( A x ( N − 1 ) + B u ( N − 1 ) ) + 2 1 x ( N − 1 ) T Q x ( N − 1 ) + 2 1 u ( N − 1 ) T R u ( N − 1 )
对上式求导
∂ J N − 1 → N ∂ u ( N − 1 ) = 0 f r i s t p a r t : ∂ 1 2 x ( N ) T P ( 0 ) x ( N ) ∂ u ( N − 1 ) = ∂ x ( N ) ∂ u ( N − 1 ) ∗ ∂ 1 2 x ( N ) T P ( 0 ) x ( N ) ∂ x ( N ) = B T ∗ P ( 0 ) x ( N ) = B T P ( 0 ) ( A x ( N − 1 ) + B u ( N − 1 ) ) s e c o n d p a r t : ∂ 1 2 x ( N − 1 ) T Q x ( N − 1 ) ∂ u ( N − 1 ) = 0 t h i r d p a r t : ∂ 1 2 u ( N − 1 ) T R u ( N − 1 ) ∂ u ( N − 1 ) = R u ( N − 1 ) ∂ J N − 1 → N ∂ u ( N − 1 ) = B T P ( 0 ) ( A x ( N − 1 ) + B u ( N − 1 ) ) + R u ( N − 1 ) \frac{\partial J_{N-1\rightarrow N}}{\partial u(N-1)}=0\\frist\ part: \frac{\partial \frac{1}{2}x(N)^TP(0)x(N)}{\partial u(N-1)}=\frac{\partial x(N)}{\partial u(N-1)}*\frac{\partial \frac{1}{2}x(N)^TP(0)x(N)}{\partial x(N)}=B^T*P(0)x(N)=B^TP(0)(Ax(N-1)+Bu(N-1))\\second\ part:\frac{\partial \frac{1}{2} x(N-1)^TQx(N-1)}{\partial u(N-1)}=0\\third\ part:\frac{\partial \frac{1}{2}u(N-1)^TRu(N-1)}{\partial u(N-1)}=Ru(N-1)\\\frac{\partial J_{N-1\rightarrow N}}{\partial u(N-1)}=B^TP(0)(Ax(N-1)+Bu(N-1))+Ru(N-1)
∂ u ( N − 1 ) ∂ J N − 1 → N = 0 f r i s t p a r t : ∂ u ( N − 1 ) ∂ 2 1 x ( N ) T P ( 0 ) x ( N ) = ∂ u ( N − 1 ) ∂ x ( N ) ∗ ∂ x ( N ) ∂ 2 1 x ( N ) T P ( 0 ) x ( N ) = B T ∗ P ( 0 ) x ( N ) = B T P ( 0 ) ( A x ( N − 1 ) + B u ( N − 1 ) ) s e c o n d p a r t : ∂ u ( N − 1 ) ∂ 2 1 x ( N − 1 ) T Q x ( N − 1 ) = 0 t h i r d p a r t : ∂ u ( N − 1 ) ∂ 2 1 u ( N − 1 ) T R u ( N − 1 ) = R u ( N − 1 ) ∂ u ( N − 1 ) ∂ J N − 1 → N = B T P ( 0 ) ( A x ( N − 1 ) + B u ( N − 1 ) ) + R u ( N − 1 )
解得
B T P ( 0 ) ( A x ( N − 1 ) + B u ( N − 1 ) ) + R u ( N − 1 ) = 0 ⇒ u ∗ ( N − 1 ) = − ( B T P ( 0 ) B + R ) − 1 B T P ( 0 ) A x ( N − 1 ) B^TP(0)(Ax(N-1)+Bu(N-1))+Ru(N-1)=0\\\Rightarrow u^*(N-1)=-(B^TP(0)B+R)^{-1}B^TP(0)Ax(N-1)
B T P ( 0 ) ( A x ( N − 1 ) + B u ( N − 1 ) ) + R u ( N − 1 ) = 0 ⇒ u ∗ ( N − 1 ) = − ( B T P ( 0 ) B + R ) − 1 B T P ( 0 ) A x ( N − 1 )
令 F ( N − 1 ) = ( B T P ( 0 ) B + R ) − 1 B T P ( 0 ) A F(N-1)=(B^TP(0)B+R)^{-1}B^TP(0)A F ( N − 1 ) = ( B T P ( 0 ) B + R ) − 1 B T P ( 0 ) A ,原式化简为
u ∗ ( N − 1 ) = − F ( N − 1 ) x ( N − 1 ) u^*(N-1)=-F(N-1)x(N-1)
u ∗ ( N − 1 ) = − F ( N − 1 ) x ( N − 1 )
这就是一个全状态的一个反馈控制器。
由于一阶导为0只能证明是极值,所以求解二阶导来验证一下该结果是否为最小值
∂ 2 J N − 1 → N ∂ u ( N − 1 ) 2 = ( B T P ( 0 ) B ) T + R T \frac{\partial^2 J_{N-1\rightarrow N}}{\partial u(N-1)^2}=(B^TP(0)B)^T+R^T
∂ u ( N − 1 ) 2 ∂ 2 J N − 1 → N = ( B T P ( 0 ) B ) T + R T
由于 R R R 是一个正定矩阵,而 B T P ( 0 ) B B^TP(0)B B T P ( 0 ) B 是一个半正定矩阵,所以结果是正定的,因此该结果是最小值
将最优的解带入到代价函数中去,可得到最优的代价
J N − 1 → N ∗ = 1 2 x ( N − 1 ) T { ( A − B F ( N − 1 ) ) T P ( 0 ) ( A − B F ( N − 1 ) ) + F ( N − 1 ) T R F ( N − 1 ) + Q } x ( N − 1 ) J_{N-1\rightarrow N}^* = \frac{1}{2}x(N-1)^T\{(A-BF(N-1))^TP(0)(A-BF(N-1))+F(N-1)^TRF(N-1)+Q\}x(N-1)
J N − 1 → N ∗ = 2 1 x ( N − 1 ) T { ( A − B F ( N − 1 ) ) T P ( 0 ) ( A − B F ( N − 1 ) ) + F ( N − 1 ) T R F ( N − 1 ) + Q } x ( N − 1 )
令
P ( 1 ) = ( A − B F ( N − 1 ) ) T P ( 0 ) ( A − B F ( N − 1 ) ) + F ( N − 1 ) T R F ( N − 1 ) + Q P(1)=(A-BF(N-1))^TP(0)(A-BF(N-1))+F(N-1)^TRF(N-1)+Q
P ( 1 ) = ( A − B F ( N − 1 ) ) T P ( 0 ) ( A − B F ( N − 1 ) ) + F ( N − 1 ) T R F ( N − 1 ) + Q
上式可为
J N − 1 → N ∗ = 1 2 x ( N − 1 ) T P ( 1 ) x ( N − 1 ) P ( 1 ) = ( A − B F ( N − 1 ) ) T P ( 0 ) ( A − B F ( N − 1 ) ) + F ( N − 1 ) T R F ( N − 1 ) + Q J_{N-1\rightarrow N}^* = \frac{1}{2}x(N-1)^TP(1)x(N-1)\\P(1)=(A-BF(N-1))^TP(0)(A-BF(N-1))+F(N-1)^TRF(N-1)+Q
J N − 1 → N ∗ = 2 1 x ( N − 1 ) T P ( 1 ) x ( N − 1 ) P ( 1 ) = ( A − B F ( N − 1 ) ) T P ( 0 ) ( A − B F ( N − 1 ) ) + F ( N − 1 ) T R F ( N − 1 ) + Q
所以最终可以得到
{ J N − k → N ∗ = 1 2 x ( N − k ) T P ( k ) x ( N − k ) P ( k ) = ( A − B F ( N − k ) ) T P ( k − 1 ) ( A − B F ( N − k ) ) + F ( N − k ) T R F ( N − k ) + Q F ( N − k ) = ( B T P ( k − 1 ) B + R ) − 1 B T P ( k − 1 ) A \left\{\begin{aligned}&J^*_{N-k\rightarrow N}=\frac{1}{2}x(N-k)^TP(k)x(N-k)\\&P(k)=(A-BF(N-k))^TP(k-1)(A-BF(N-k))+F(N-k)^TRF(N-k)+Q\\&F(N-k)=(B^TP(k-1)B+R)^{-1}B^TP(k-1)A\end{aligned}\right.
⎩ ⎪ ⎪ ⎪ ⎪ ⎨ ⎪ ⎪ ⎪ ⎪ ⎧ J N − k → N ∗ = 2 1 x ( N − k ) T P ( k ) x ( N − k ) P ( k ) = ( A − B F ( N − k ) ) T P ( k − 1 ) ( A − B F ( N − k ) ) + F ( N − k ) T R F ( N − k ) + Q F ( N − k ) = ( B T P ( k − 1 ) B + R ) − 1 B T P ( k − 1 ) A
根据递归计算就可以得到所有的状态矩阵了,其中 F [ N − k ] F[N-k] F [ N − k ] 是反馈增益, u [ N − k ] u[N-k] u [ N − k ] 是N-k时的最优控制量,也会得到 p [ k ] p[k] p [ k ] 进行下一步的计算,最后需要不断递归计算得到 u [ 0 ] u[0] u [ 0 ] 也就是需要使用的输入量。
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 clear all; close all; clc; A = [0 1 ; -1 -0.5 ]; n = size (A, 1 ); B = [0 ; 1 ]; p = size (B,2 ); Ts = 0.1 ; x0 = [1 ; 0 ]; x = x0; u0 = 2 ; u = u0; k_steps = 100 ; x_history = zeros (n, k_steps); x_history(:, 1 ) = x; u_history = zeros (p, k_steps); u_history(:, 1 ) = u; Q = [1 0 ; 0 1 ]; S = [1 0 ; 0 1 ]; R = 1 ; p_k = S; N = k_steps; for k = 1 : N F = inv(R + B'* p_k * B)*B'* p_k * A; p_k = (A - B * F)' * p_k * (A - B * F) + (F)' * R * (F) + Q; if k == 1 F_N = F; else F_N = [F; F_N]; end end for k = 1 : k_steps u = - F_N((k-1 )*p+1 :k*p, :) * x; x = A * x + B * u; x_history(:, k+1 ) = x; u_history(:, k) = u; end subplot(2 , 1 , 1 ); for i = 1 : n plot (x_history(i , :)); hold ; end legend (num2str((1 : n)', 'x %d' ));xlim([1 , k_steps]); grid on; subplot(2 , 1 , 2 ); for i = 1 : p stairs(u_history(i , :)); hold ; end legend (num2str((1 : p)', 'u %d' ));xlim([1 , k_steps]); grid on
LQR线性二次型跟踪器
跟踪二次型系统的LQR的代价函数为
J = 1 2 [ x ( n ) − x d ( n ) ) T S ( x ( n ) − x d ( n ) ] + 1 2 ∑ k = 0 N − 1 [ x ( k ) − x d ( k ) ) T Φ ( x ( k ) − x d ( k ) ) + u ( k ) T R u ( k ) ] J=\frac{1}{2} [x(n)-x_d(n))^TS(x(n)-x_d(n)]+\frac{1}{2}\sum_{k=0}^{N-1}{[x(k)-x_d(k))^T\Phi (x(k)-x_d(k)) + u(k)^TRu(k)]}
J = 2 1 [ x ( n ) − x d ( n ) ) T S ( x ( n ) − x d ( n ) ] + 2 1 k = 0 ∑ N − 1 [ x ( k ) − x d ( k ) ) T Φ ( x ( k ) − x d ( k ) ) + u ( k ) T R u ( k ) ]
定义系统误差
e ( k ) = x ( k ) − x d ( k ) e(k)=x(k)-x_d(k)
e ( k ) = x ( k ) − x d ( k )
跟踪器就是对 e ( k ) e(k) e ( k ) 的调节器,也就是 e d ( k ) = 0 e_d(k)=0 e d ( k ) = 0
可以定义矩阵的系统状态方程
[ x ( k + 1 ) x d ( k + 1 ) ] = [ A 0 0 A 0 ] [ x ( k ) x d ( k ) ] + [ B 0 ] u ( k ) \begin{bmatrix}x(k+1)\\x_d(k+1)\end{bmatrix}=\begin{bmatrix}A&&0\\0&&A_0\end{bmatrix}\begin{bmatrix}x(k)\\x_d(k)\end{bmatrix}+\begin{bmatrix}B\\0\end{bmatrix}u(k)
[ x ( k + 1 ) x d ( k + 1 ) ] = [ A 0 0 A 0 ] [ x ( k ) x d ( k ) ] + [ B 0 ] u ( k )
一般来说跟踪的量是不会改变的,也就是 A 0 = I A_0=I A 0 = I
可以用一些符号代表矩阵来简化式子
x a ( k + 1 ) = A a x a ( k ) + B a u ( k ) x_a(k+1)=A_ax_a(k)+B_au(k)
x a ( k + 1 ) = A a x a ( k ) + B a u ( k )
由
e ( k ) = x ( k ) − x d ( k ) = [ I − I ] [ x ( k ) x d ( k ) ] = C a x a ( k ) e(k)=x(k)-x_d(k)=\begin{bmatrix}I&&-I\end{bmatrix}\begin{bmatrix}x(k)\\x_d(k)\end{bmatrix}=C_ax_a(k)
e ( k ) = x ( k ) − x d ( k ) = [ I − I ] [ x ( k ) x d ( k ) ] = C a x a ( k )
其中
{ A a = [ A 0 0 A 0 ] B a = [ B 0 ] C a = [ I − I ] \left\{\begin{aligned}&A_a=\begin{bmatrix}A&&0\\0&&A_0\end{bmatrix}\\&B_a=\begin{bmatrix}B\\0\end{bmatrix}\\&C_a=\begin{bmatrix}I&&-I\end{bmatrix}\end{aligned}\right.
⎩ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎨ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎧ A a = [ A 0 0 A 0 ] B a = [ B 0 ] C a = [ I − I ]
对应的代价函数为
J = 1 2 e ( N ) T S e ( N ) + 1 2 ∑ k = 0 N − 1 [ e T ( k ) Q e ( k ) + u T ( k ) R u ( k ) ] J=\frac{1}{2} e(N)^TSe(N)+\frac{1}{2}\sum_{k=0}^{N-1}{[e^T(k)Qe(k)+u^T(k)Ru(k)]}
J = 2 1 e ( N ) T S e ( N ) + 2 1 k = 0 ∑ N − 1 [ e T ( k ) Q e ( k ) + u T ( k ) R u ( k ) ]
带入上式为
J = 1 2 x a ( N ) T C a T S C a x a ( N ) + 1 2 ∑ k = 0 N − 1 [ x a T ( k ) C a T Q C a x a ( k ) + u T ( k ) R u ( k ) ] J=\frac{1}{2} x_a(N)^TC_a^TSC_ax_a(N)+\frac{1}{2}\sum_{k=0}^{N-1}{[x_a^T(k)C_a^TQC_ax_a(k)+u^T(k)Ru(k)]}
J = 2 1 x a ( N ) T C a T S C a x a ( N ) + 2 1 k = 0 ∑ N − 1 [ x a T ( k ) C a T Q C a x a ( k ) + u T ( k ) R u ( k ) ]
令 S a = C a T S C a S_a=C_a^TSC_a S a = C a T S C a , Q a = C a T Q C a Q_a=C_a^TQC_a Q a = C a T Q C a ,得
J = 1 2 x a ( N ) T S a x a ( N ) + 1 2 ∑ k = 0 N − 1 [ x a T ( k ) Q a x a ( k ) + u T ( k ) R u ( k ) ] J=\frac{1}{2} x_a(N)^TS_ax_a(N)+\frac{1}{2}\sum_{k=0}^{N-1}{[x_a^T(k)Q_ax_a(k)+u^T(k)Ru(k)]}
J = 2 1 x a ( N ) T S a x a ( N ) + 2 1 k = 0 ∑ N − 1 [ x a T ( k ) Q a x a ( k ) + u T ( k ) R u ( k ) ]
与 LQR 调节器求解过程相同,得最终结果为
{ J N − k → N ∗ = 1 2 x a ( N − k ) T P a ( k ) x a ( N − k ) F a ( N − k ) = ( B a T P a ( k − 1 ) B a + R ) − 1 B a T P a ( k − 1 ) A a P a ( k ) = ( A a − B a F a ( N − k ) ) T P a ( k − 1 ) ( A a − B a F a ( N − k ) ) + F a ( N − k ) T R F a ( N − k ) + Q a u ( N − k ) = − F a ( N − k ) x a ( N − k ) \left\{\begin{aligned}&J^*_{N-k\rightarrow N}=\frac{1}{2}x_a(N-k)^TP_a(k)x_a(N-k)\\&F_a(N-k)=(B_a^TP_a(k-1)B_a+R)^{-1}B_a^TP_a(k-1)A_a\\&P_a(k)=(A_a-B_aF_a(N-k))^TP_a(k-1)(A_a-B_aF_a(N-k))+F_a(N-k)^TRF_a(N-k)+Q_a\\&u(N-k)=-F_a(N-k)x_a(N-k)\end{aligned}\right.
⎩ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎨ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎧ J N − k → N ∗ = 2 1 x a ( N − k ) T P a ( k ) x a ( N − k ) F a ( N − k ) = ( B a T P a ( k − 1 ) B a + R ) − 1 B a T P a ( k − 1 ) A a P a ( k ) = ( A a − B a F a ( N − k ) ) T P a ( k − 1 ) ( A a − B a F a ( N − k ) ) + F a ( N − k ) T R F a ( N − k ) + Q a u ( N − k ) = − F a ( N − k ) x a ( N − k )
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 clear all; close all; clc; A = [0 1 ; -1 -0.5 ]; n = size (A,1 ); B = [0 ; 1 ]; p = size (B, 2 ); Ts = 0.1 ; [A, B]=c2d(A,B,Ts); x0 = [0 ; 0 ]; x = x0; xd = [1 ; 0 ]; xa = [x; xd]; u0 = 2 ; u = u0; k_steps = 100 ; x_history = zeros (n, k_steps); x_history(:, 1 ) = x; u_history = zeros (p, k_steps); u_history(:, 1 ) = u; AD = eye (n); Ca = [eye (n) -eye (n)]; Q = [1 0 ; 0 1 ]; S = [1 0 ; 0 1 ]; R = 0.01 ; Sa = Ca' * S * Ca; Qa = Ca' * Q * Ca; Aa = [A zeros (n); zeros (n) AD]; Ba = [B; zeros (n, 1 )]; p_k = Sa; for k = 1 : k_steps F = inv(R + Ba'* p_k * Ba) * Ba'* p_k * Aa; p_k = (Aa-Ba*F)'*p_k*(Aa-Ba*F)+ (F)'*R*(F)+Qa; u = -F * xa; x = A * x + B * u; xa = [x; xd]; x_history(:, k+1 ) = x; u_history(:, k) = u; end subplot(2 , 1 , 1 ); for i = 1 : n plot (x_history(i , :)); hold ; end legend (num2str((1 : n)', 'x %d' ));xlim([1 , k_steps]); grid on; subplot(2 , 1 , 2 ); for i = 1 : p stairs(u_history(i , :)); hold ; end legend (num2str((1 : p)', 'u %d' ));xlim([1 , k_steps]); grid on
LQR线性二次型跟踪器——稳态非零参考值控制
对于一个系统(这是系统的状态方程的递推形式)
x k + 1 = A x k + B u k x_{k+1}=Ax_k+Bu_k
x k + 1 = A x k + B u k
由于 LQR线性二次型跟踪器 在应对 R 过大时,也就是太过于注重节能的话,系统的输入就会为 0,这将导致系统不运作,所以有了这个控制模式
对于有些系统,处于某个目标状态并且稳定时,它所需要的输入/输出是一个非 0 的值,这个输入 u d u_d u d 将会使系统保持稳定在目标位置,也就是
x d = A x d + B u d x_d=Ax_d+Bu_d
x d = A x d + B u d
所以需要定义稳态输入误差
Δ u k = u k − u d \Delta u_k=u_k-u_d
Δ u k = u k − u d
带入到状态方程中就是
x k + 1 = A x k + B ( Δ u k + u d ) = A x k + B Δ u k + ( I − A ) x d x_{k+1}=Ax_k+B(\Delta u_k+u_d)=Ax_k+B\Delta u_k+(I-A)x_d
x k + 1 = A x k + B ( Δ u k + u d ) = A x k + B Δ u k + ( I − A ) x d
这里的话就可以得到一个增广矩阵,其中 x d x_d x d 为常数,并且定义误差矩阵
x a [ k + 1 ] = [ x k + 1 x d ] = [ A I − A 0 I ] [ x k x d ] + [ B 0 ] Δ u k ⇓ x a [ k + 1 ] = A a x a [ k ] + B a Δ u k e k = x k − x d = [ I − I ] [ x k x d ] = C a x a [ k ] x_a[k+1]=\begin{bmatrix}x_{k+1}\\x_d\end{bmatrix}=\begin{bmatrix}A&I-A\\0&I\end{bmatrix}\begin{bmatrix}x_k\\x_d\end{bmatrix}+\begin{bmatrix}B\\0\end{bmatrix}\Delta u_k\\\Downarrow\\x_a[k+1]=A_ax_a[k]+B_a\Delta u_k\\e_k=x_k-x_d=\begin{bmatrix}I&-I\end{bmatrix}\begin{bmatrix}x_k\\x_d\end{bmatrix}=C_ax_a[k]
x a [ k + 1 ] = [ x k + 1 x d ] = [ A 0 I − A I ] [ x k x d ] + [ B 0 ] Δ u k ⇓ x a [ k + 1 ] = A a x a [ k ] + B a Δ u k e k = x k − x d = [ I − I ] [ x k x d ] = C a x a [ k ]
列出代价函数,前面部分的计算都是与线性二次型跟踪器是一致的,只有第二部分是不一样的,这里的代价函数为
J = 1 2 e ( N ) T S e ( N ) + 1 2 ∑ k = 0 N − 1 [ e T ( k ) Q e ( k ) + Δ u T ( k ) R Δ u ( k ) ] J=\frac{1}{2} e(N)^TSe(N)+\frac{1}{2}\sum_{k=0}^{N-1}{[e^T(k)Qe(k)+\Delta u^T(k)R\Delta u(k)]}
J = 2 1 e ( N ) T S e ( N ) + 2 1 k = 0 ∑ N − 1 [ e T ( k ) Q e ( k ) + Δ u T ( k ) R Δ u ( k ) ]
令 S a = C a T S C a S_a=C_a^TSC_a S a = C a T S C a , Q a = C a T Q C a Q_a=C_a^TQC_a Q a = C a T Q C a ,得
J = 1 2 x a ( N ) T S a x a ( N ) + 1 2 ∑ k = 0 N − 1 [ x a T ( k ) Q a x a ( k ) + Δ u T ( k ) R Δ u ( k ) ] J=\frac{1}{2} x_a(N)^TS_ax_a(N)+\frac{1}{2}\sum_{k=0}^{N-1}{[x_a^T(k)Q_ax_a(k)+\Delta u^T(k)R\Delta u(k)]}
J = 2 1 x a ( N ) T S a x a ( N ) + 2 1 k = 0 ∑ N − 1 [ x a T ( k ) Q a x a ( k ) + Δ u T ( k ) R Δ u ( k ) ]
其中第二项就表示系统的稳态输入误差,表示输入偏离目标位置的距离
与 LQR 调节器求解过程相同,得最终结果为
{ J N − k → N ∗ = 1 2 x a ( N − k ) T P a ( k ) x a ( N − k ) F a ( N − k ) = ( B a T P a ( k − 1 ) B a + R ) − 1 B a T P a ( k − 1 ) A a P a ( k ) = ( A a − B a F a ( N − k ) ) T P a ( k − 1 ) ( A a − B a F a ( N − k ) ) + F a ( N − k ) T R F a ( N − k ) + Q a Δ u ( N − k ) = − F a ( N − k ) x a ( N − k ) u ( N − k ) = u d + Δ u ( N − k ) \left\{\begin{aligned}&J^*_{N-k\rightarrow N}=\frac{1}{2}x_a(N-k)^TP_a(k)x_a(N-k)\\&F_a(N-k)=(B_a^TP_a(k-1)B_a+R)^{-1}B_a^TP_a(k-1)A_a\\&P_a(k)=(A_a-B_aF_a(N-k))^TP_a(k-1)(A_a-B_aF_a(N-k))+F_a(N-k)^TRF_a(N-k)+Q_a\\&\Delta u(N-k)=-F_a(N-k)x_a(N-k)\\&u(N-k)=u_d+\Delta u(N-k)\end{aligned}\right.
⎩ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎨ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎧ J N − k → N ∗ = 2 1 x a ( N − k ) T P a ( k ) x a ( N − k ) F a ( N − k ) = ( B a T P a ( k − 1 ) B a + R ) − 1 B a T P a ( k − 1 ) A a P a ( k ) = ( A a − B a F a ( N − k ) ) T P a ( k − 1 ) ( A a − B a F a ( N − k ) ) + F a ( N − k ) T R F a ( N − k ) + Q a Δ u ( N − k ) = − F a ( N − k ) x a ( N − k ) u ( N − k ) = u d + Δ u ( N − k )
LQR线性二次型跟踪器——输入增量控制
上述中 x d x_d x d 是一个常数,所以这里讨论 x d x_d x d 为非常数的变化,并且设置
x d [ k + 1 ] = A d x d [ k ] x_d[k+1]=A_dx_d[k]
x d [ k + 1 ] = A d x d [ k ]
定义一个 u u u 的增量
Δ u k = u k − u k − 1 u k = Δ u k + u k − 1 \Delta u_k=u_k-u_{k-1}\\u_k=\Delta u_k+u_{k-1}
Δ u k = u k − u k − 1 u k = Δ u k + u k − 1
这个可以使得系统的输入更加平滑,输入的变化不那么剧烈。带入到系统状态方程中
x k + 1 = A x k + B Δ u k + u k − 1 x_{k+1}=Ax_k+B\Delta u_k+u_{k-1}
x k + 1 = A x k + B Δ u k + u k − 1
此时设置增广矩阵
x a [ k ] = [ x k x d [ k ] u k − 1 ] ⇓ e k = x k − x d [ k ] = [ I − I 0 ] [ x k x d [ k ] u k − 1 ] = C a x a [ k ] x_a[k]=\begin{bmatrix}x_{k}\\x_d[k]\\u_{k-1}\end{bmatrix}\\\Downarrow\\e_k=x_k-x_d[k]=\begin{bmatrix}I&-I&0\end{bmatrix}\begin{bmatrix}x_{k}\\x_d[k]\\u_{k-1}\end{bmatrix}=C_ax_a[k]
x a [ k ] = ⎣ ⎢ ⎡ x k x d [ k ] u k − 1 ⎦ ⎥ ⎤ ⇓ e k = x k − x d [ k ] = [ I − I 0 ] ⎣ ⎢ ⎡ x k x d [ k ] u k − 1 ⎦ ⎥ ⎤ = C a x a [ k ]
并且可以得到系统的递归函数
x a [ k + 1 ] = [ x k + 1 x d [ k + 1 ] u k ] = [ A 0 B 0 A d 0 0 0 I ] [ x k x d [ k ] u k − 1 ] + [ B 0 I ] Δ u k x_a[k+1]=\begin{bmatrix}x_{k+1}\\x_d[k+1]\\u_{k}\end{bmatrix}=\begin{bmatrix}A&0&B\\0&A_d&0\\0&0&I\end{bmatrix}\begin{bmatrix}x_k\\x_d[k]\\u_{k-1}\end{bmatrix}+\begin{bmatrix}B\\0\\I\end{bmatrix}\Delta u_k
x a [ k + 1 ] = ⎣ ⎢ ⎡ x k + 1 x d [ k + 1 ] u k ⎦ ⎥ ⎤ = ⎣ ⎢ ⎡ A 0 0 0 A d 0 B 0 I ⎦ ⎥ ⎤ ⎣ ⎢ ⎡ x k x d [ k ] u k − 1 ⎦ ⎥ ⎤ + ⎣ ⎢ ⎡ B 0 I ⎦ ⎥ ⎤ Δ u k
系统的代价函数可以设计为
J = 1 2 e ( N ) T S e ( N ) + 1 2 ∑ k = 0 N − 1 [ e T ( k ) Q e ( k ) + Δ u T ( k ) R Δ u ( k ) ] J=\frac{1}{2} e(N)^TSe(N)+\frac{1}{2}\sum_{k=0}^{N-1}{[e^T(k)Qe(k)+\Delta u^T(k)R\Delta u(k)]}
J = 2 1 e ( N ) T S e ( N ) + 2 1 k = 0 ∑ N − 1 [ e T ( k ) Q e ( k ) + Δ u T ( k ) R Δ u ( k ) ]
令 S a = C a T S C a S_a=C_a^TSC_a S a = C a T S C a , Q a = C a T Q C a Q_a=C_a^TQC_a Q a = C a T Q C a ,得
J = 1 2 x a ( N ) T S a x a ( N ) + 1 2 ∑ k = 0 N − 1 [ x a T ( k ) Q a x a ( k ) + Δ u T ( k ) R Δ u ( k ) ] J=\frac{1}{2} x_a(N)^TS_ax_a(N)+\frac{1}{2}\sum_{k=0}^{N-1}{[x_a^T(k)Q_ax_a(k)+\Delta u^T(k)R\Delta u(k)]}
J = 2 1 x a ( N ) T S a x a ( N ) + 2 1 k = 0 ∑ N − 1 [ x a T ( k ) Q a x a ( k ) + Δ u T ( k ) R Δ u ( k ) ]
与 LQR 调节器求解过程相同,得最终结果为
{ J N − k → N ∗ = 1 2 x a ( N − k ) T P a ( k ) x a ( N − k ) F a ( N − k ) = ( B a T P a ( k − 1 ) B a + R ) − 1 B a T P a ( k − 1 ) A a P a ( k ) = ( A a − B a F a ( N − k ) ) T P a ( k − 1 ) ( A a − B a F a ( N − k ) ) + F a ( N − k ) T R F a ( N − k ) + Q a Δ u ( N − k ) = − F a ( N − k ) x a ( N − k ) u ( N − k ) = u ( N − k − 1 ) + Δ u ( N − k ) \left\{\begin{aligned}&J^*_{N-k\rightarrow N}=\frac{1}{2}x_a(N-k)^TP_a(k)x_a(N-k)\\&F_a(N-k)=(B_a^TP_a(k-1)B_a+R)^{-1}B_a^TP_a(k-1)A_a\\&P_a(k)=(A_a-B_aF_a(N-k))^TP_a(k-1)(A_a-B_aF_a(N-k))+F_a(N-k)^TRF_a(N-k)+Q_a\\&\Delta u(N-k)=-F_a(N-k)x_a(N-k)\\&u(N-k)=u(N-k-1)+\Delta u(N-k)\end{aligned}\right.
⎩ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎨ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎧ J N − k → N ∗ = 2 1 x a ( N − k ) T P a ( k ) x a ( N − k ) F a ( N − k ) = ( B a T P a ( k − 1 ) B a + R ) − 1 B a T P a ( k − 1 ) A a P a ( k ) = ( A a − B a F a ( N − k ) ) T P a ( k − 1 ) ( A a − B a F a ( N − k ) ) + F a ( N − k ) T R F a ( N − k ) + Q a Δ u ( N − k ) = − F a ( N − k ) x a ( N − k ) u ( N − k ) = u ( N − k − 1 ) + Δ u ( N − k )
LQR线性二次调节器——系统输入线性化
这一部分实际上就是工程上比较常用的,将 LQR 控制器的输入 u 定义为 u = − K e u=-Ke u = − K e 的形式,一种线性反馈控制器,最后成为一个近似于 PD 控制器
连续系统
x ˙ = A x + B u e = x d − x x d ˙ = A d x d \dot{x}=Ax+Bu\\e=x_d-x\\\dot{x_d}=A_dx_d
x ˙ = A x + B u e = x d − x x d ˙ = A d x d
定义连续系统代价函数,由于 t → ∞ , e → 0 t→\infty, e→0 t → ∞ , e → 0
J = 1 2 e T ( t f ) S e ( t f ) + 1 2 ∫ 0 t f [ e T Q e + u T R u ] d t J=\frac{1}{2} e^T(t_f)Se(t_f)+\frac{1}{2}\int_{0}^{t_f}{[e^TQe+u^TRu]}dt
J = 2 1 e T ( t f ) S e ( t f ) + 2 1 ∫ 0 t f [ e T Q e + u T R u ] d t
e ( k ) = x ( k ) − x d ( k ) = [ I − I ] [ x ( k ) x d ( k ) ] = C a x a ( k ) e(k)=x(k)-x_d(k)=\begin{bmatrix}I&&-I\end{bmatrix}\begin{bmatrix}x(k)\\x_d(k)\end{bmatrix}=C_ax_a(k)
e ( k ) = x ( k ) − x d ( k ) = [ I − I ] [ x ( k ) x d ( k ) ] = C a x a ( k )
设计最优控制
u = − K e = − K C a x a u=-Ke=-KC_ax_a
u = − K e = − K C a x a
可以得到
x ˙ a = [ A 0 0 A d ] x a + [ B 0 ] u = [ A − B K C a 0 0 A d ] x a x ˙ a = A c x a + B c u = A a x a \dot{x}_a=\begin{bmatrix}A&0\\0&A_d\end{bmatrix}x_a+\begin{bmatrix}B\\0\end{bmatrix}u=\begin{bmatrix}A-BKC_a&0\\0&A_d\end{bmatrix}x_a\\\dot{x}_a=A_cx_a+B_cu=A_ax_a
x ˙ a = [ A 0 0 A d ] x a + [ B 0 ] u = [ A − B K C a 0 0 A d ] x a x ˙ a = A c x a + B c u = A a x a
其中
{ A a = [ A 0 0 A 0 ] B a = [ B 0 ] C a = [ I − I ] \left\{\begin{aligned}&A_a=\begin{bmatrix}A&&0\\0&&A_0\end{bmatrix}\\&B_a=\begin{bmatrix}B\\0\end{bmatrix}\\&C_a=\begin{bmatrix}I&&-I\end{bmatrix}\end{aligned}\right.
⎩ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎨ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎧ A a = [ A 0 0 A 0 ] B a = [ B 0 ] C a = [ I − I ]
对应的代价函数为
J = 1 2 e ( t f ) T S e ( t f ) + 1 2 ∫ 0 t f [ e T Q e + u T R u ] d t J=\frac{1}{2} e(t_f)^TSe(t_f)+\frac{1}{2}\int_{0}^{t_f}{[e^TQe+u^TRu]}dt
J = 2 1 e ( t f ) T S e ( t f ) + 2 1 ∫ 0 t f [ e T Q e + u T R u ] d t
带入上式为
J = 1 2 x a ( t f ) T C a T S C a x a ( t f ) + 1 2 ∫ 0 t f [ x a T C a T Q C a x a + u T R u ] d t J=\frac{1}{2} x_a(t_f)^TC_a^TSC_ax_a(t_f)+\frac{1}{2}\int_{0}^{t_f}{[x_a^TC_a^TQC_ax_a+u^TRu]}dt
J = 2 1 x a ( t f ) T C a T S C a x a ( t f ) + 2 1 ∫ 0 t f [ x a T C a T Q C a x a + u T R u ] d t
令 S a = C a T S C a S_a=C_a^TSC_a S a = C a T S C a , Q a = C a T Q C a Q_a=C_a^TQC_a Q a = C a T Q C a ,得
J = 1 2 x a T ( t f ) S a x a ( t f ) + 1 2 ∫ 0 t f [ x a T Q a x a + u T R u ] d t J=\frac{1}{2} x_a^T(t_f)S_ax_a(t_f)+\frac{1}{2}\int_{0}^{t_f}{[x_a^TQ_ax_a+u^TRu]}dt
J = 2 1 x a T ( t f ) S a x a ( t f ) + 2 1 ∫ 0 t f [ x a T Q a x a + u T R u ] d t
将 u = − K e u=-Ke u = − K e 带入到代价函数中去
J = 1 2 x a T ( t f ) S a x a ( t f ) + 1 2 ∫ 0 t f x a T ( Q a + K T R K ) x a d t J=\frac{1}{2} x_a^T(t_f)S_ax_a(t_f)+\frac{1}{2}\int_{0}^{t_f}{x_a^T(Q_a+K^TRK)x_a}dt
J = 2 1 x a T ( t f ) S a x a ( t f ) + 2 1 ∫ 0 t f x a T ( Q a + K T R K ) x a d t
第一种解法
可以假设存在一个向量 P,使得
d d t ( x a T P x a ) = − x a T ( Q a + K T R K ) x a 1 ◯ \frac{d}{dt}(x_a^TPx_a)=-x_a^T(Q_a+K^TRK)x_a~~\text{\textcircled{1}}
d t d ( x a T P x a ) = − x a T ( Q a + K T R K ) x a 1 ◯
假设系统是稳定的(实际上需要验证),带入代价函数中得到
J = 1 2 x a T ( 0 ) P x a ( 0 ) J=\frac{1}{2}x^T_a(0)Px_a(0)
J = 2 1 x a T ( 0 ) P x a ( 0 )
将 1 式左侧微分展开,并且把 x a x_a x a 微分形式替换
x ˙ a T P x a + x a T P x ˙ a + x a T ( Q a + K T R K ) x a = 0 x a T A a T P x a + x a T P A a x a + x a T ( Q a + K T R K ) x a = 0 x a T ( A a T P + P A a + Q a + K T R K ) x a = 0 \dot{x}_a^TPx_a+x_a^TP\dot{x}_a+x_a^T(Q_a+K^TRK)x_a=0\\x_a^TA_a^TPx_a+x_a^TPA_ax_a+x_a^T(Q_a+K^TRK)x_a=0\\x_a^T(A_a^TP+PA_a+Q_a+K^TRK)x_a=0
x ˙ a T P x a + x a T P x ˙ a + x a T ( Q a + K T R K ) x a = 0 x a T A a T P x a + x a T P A a x a + x a T ( Q a + K T R K ) x a = 0 x a T ( A a T P + P A a + Q a + K T R K ) x a = 0
要想式子成立,括号里必须始终为 0
A a T P + P A a + Q a + K T R K = 0 A_a^TP+PA_a+Q_a+K^TRK=0
A a T P + P A a + Q a + K T R K = 0
这就是 R i c c a t i Riccati R i c c a t i 方程
第二种解法
J = 1 2 x a T ( t f ) S a x a ( t f ) + 1 2 ∫ 0 t f [ x a T Q a x a + u T R u ] d t J=\frac{1}{2} x_a^T(t_f)S_ax_a(t_f)+\frac{1}{2}\int_{0}^{t_f}{[x_a^TQ_ax_a+u^TRu]}dt
J = 2 1 x a T ( t f ) S a x a ( t f ) + 2 1 ∫ 0 t f [ x a T Q a x a + u T R u ] d t
由于函数第一项是恒定的数值,所以只考虑第二项的最小值就好,也就是
m i n ∫ 0 t f [ x a T Q a x a + u T R u ] d t min\int_{0}^{t_f}{[x_a^TQ_ax_a+u^TRu]}dt
m i n ∫ 0 t f [ x a T Q a x a + u T R u ] d t
引入哈密顿函数
H = f ( x a , u ) + λ g ( x a , u ) = x a T Q a x a + u T R u + λ ( A c x a + B c u ) H=f(x_a,u)+\lambda g(x_a,u)=x_a^TQ_ax_a+u^TRu+\lambda(A_cx_a+B_cu)
H = f ( x a , u ) + λ g ( x a , u ) = x a T Q a x a + u T R u + λ ( A c x a + B c u )
∂ H ∂ u = ( R + R T ) u + B c T λ = 0 ⇓ u = − ( R + R T ) − 1 B c T λ \frac{\partial H}{\partial u}=(R+R^T)u+B_c^T\lambda=0\\\Downarrow\\u=-(R+R^T)^{-1}B_c^T\lambda
∂ u ∂ H = ( R + R T ) u + B c T λ = 0 ⇓ u = − ( R + R T ) − 1 B c T λ
− ∂ H ∂ x = − ( f x + λ g x ) = − [ ( Q a + Q a T ) x a + A T λ ] = λ ˙ ∂ H ∂ λ = g = A c x a + B c u = A c x − B c ( R + R T ) − 1 B c T λ = x ˙ a -\frac{\partial H}{\partial x}=-(f_x+\lambda g_x)=-[(Q_a+Q_a^T)x_a+A^T\lambda]=\dot\lambda\\\frac{\partial H}{\partial \lambda}=g=A_cx_a+B_cu=A_cx-B_c(R+R^T)^{-1}B_c^T\lambda=\dot{x}_a
− ∂ x ∂ H = − ( f x + λ g x ) = − [ ( Q a + Q a T ) x a + A T λ ] = λ ˙ ∂ λ ∂ H = g = A c x a + B c u = A c x − B c ( R + R T ) − 1 B c T λ = x ˙ a
由于我们设定 u = − K x a u=-Kx_a u = − K x a ,所以可以知道 λ = P x a \lambda=Px_a λ = P x a ,并且 − ( R + R T ) − 1 B c T P = K -(R+R^T)^{-1}B_c^TP=K − ( R + R T ) − 1 B c T P = K
带入上述的方程得到
λ ˙ = − [ ( Q a + Q a T ) x a + A c T λ ] = − ( Q a + Q a T + A c T P ) x a x ˙ a = A c x − B c ( R + R T ) − 1 B c T λ = [ A c − B c ( R + R T ) − 1 B c T P ] x a \dot\lambda=-[(Q_a+Q_a^T)x_a+A_c^T\lambda]=-(Q_a+Q_a^T+A_c^TP)x_a\\\dot{x}_a=A_cx-B_c(R+R^T)^{-1}B_c^T\lambda=[A_c-B_c(R+R^T)^{-1}B_c^TP]x_a
λ ˙ = − [ ( Q a + Q a T ) x a + A c T λ ] = − ( Q a + Q a T + A c T P ) x a x ˙ a = A c x − B c ( R + R T ) − 1 B c T λ = [ A c − B c ( R + R T ) − 1 B c T P ] x a
联立得到
λ ˙ = P ˙ x a + P x ˙ a = [ P ˙ + P ( A c − B c ( R + R T ) − 1 B c T P ) ] x a = − ( Q a + Q a T + A c T P ) x a \dot{\lambda}=\dot{P}x_a+P\dot{x}_a=[\dot{P}+P(A_c-B_c(R+R^T)^{-1}B_c^TP)]x_a=-(Q_a+Q_a^T+A_c^TP)x_a
λ ˙ = P ˙ x a + P x ˙ a = [ P ˙ + P ( A c − B c ( R + R T ) − 1 B c T P ) ] x a = − ( Q a + Q a T + A c T P ) x a
所以得到
P ˙ + P A c − P B c ( R + R T ) − 1 B c T P + Q a + Q a T + A c T P = 0 \dot{P}+PA_c-PB_c(R+R^T)^{-1}B_c^TP+Q_a+Q_a^T+A_c^TP=0
P ˙ + P A c − P B c ( R + R T ) − 1 B c T P + Q a + Q a T + A c T P = 0
令 Q 1 = Q a + Q a T , Q 2 = R + R T Q_1=Q_a+Q_a^T,Q_2=R+R^T Q 1 = Q a + Q a T , Q 2 = R + R T ,得到
P ˙ + P A c − P B c Q 2 − 1 B c T P + Q 1 + A c T P = 0 \dot{P}+PA_c-PB_cQ_2^{-1}B_c^TP+Q_1+A_c^TP=0
P ˙ + P A c − P B c Q 2 − 1 B c T P + Q 1 + A c T P = 0
最终得到一个 R i c c a t i Riccati R i c c a t i 方程
总结
输入系统线性化相当于是得到了一个线性的输入, u = − K x u=-Kx u = − K x 这个 K K K 可以通过解算 R i c c a t i Riccati R i c c a t i 方程得到解,并且将会是一个固定的常数,所以最终的控制有点像 PD 控制器
但是 R i c c a t i Riccati R i c c a t i 方程是一个由许多解的方程,所以不容易手算出来,所以在 matlab 中直接调用 lqr
函数就可以得到对应的 K K K