Explanation
Initialization (k=0)
$$
x^+_0 = E[x_0] = \frac{1}{N}\sum_{i=0}^{N}x_i
$$

$$
\boldsymbol{P}^+_0 = E \left[ (x_0 - x^+_0)(x_0 - x^+_0)^T \right]
$$
The initialization of the EKF occurs at $k=0$. The variable $x^+_0$ is defined as the mean value of $x$ over $N$ iterations. Following this, $x_0$ is derived from the IMU measurements and used to obtain $P^+_0$.
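As a minimal sketch of this step (assuming the 7-element state ordering $[q_0, q_1, q_2, q_3, b_x, b_y, b_z]$ used below, and leaving aside how each raw IMU reading is converted into a state sample), the initialization might look like:

```python
import numpy as np

def initialize_ekf(samples):
    """Initialize x_0^+ and P_0^+ from N state samples built from IMU readings.

    `samples` is an (N, 7) array: [q0, q1, q2, q3, bx, by, bz] per row.
    """
    x0 = np.mean(samples, axis=0)             # x_0^+ = E[x_0]
    dev = samples - x0                        # x_0 - x_0^+
    P0 = (dev.T @ dev) / samples.shape[0]     # P_0^+ = E[(x_0 - x_0^+)(x_0 - x_0^+)^T]
    return x0.reshape(7, 1), P0

# Example: 100 synthetic stationary samples (identity quaternion, zero bias) plus noise.
samples = np.tile([1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0], (100, 1))
samples = samples + 1e-3 * np.random.randn(100, 7)   # stand-in for sensor noise
x0, P0 = initialize_ekf(samples)              # x0 is 7x1, P0 is 7x7
```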
Prediction Step (k = 1, 2, …)
$$
\hat{x}_{k|k-1} = f(\hat{x}_{k-1|k-1}, u_k)
\qquad \rightleftharpoons \qquad
\begin{aligned}
x_k^- & = \boldsymbol{\Phi}(x^+_{k-1}, u_{k-1}) \\
& = \boldsymbol{A}_{k-1} x^+_{k-1} + \boldsymbol{B}_{k-1}u_{k-1}
\end{aligned}
$$

$$
x_k^- [7\times1] = [7\times7] [7\times1] + [7\times3][3\times1]
$$
The first step of the EKF algorithm is to produce an estimated value, $x^-_k$. The inputs for this step are the previous value, $x^+_{k-1}$, and the Gyroscope measurements for the previous time step, $u_{k-1}$, which should be in radians/second.
The $A$ and $B$ matrices are application-dependent; however, for an IMU implementation containing an Accelerometer, Gyroscope, and Magnetometer, they are shown below:
$$
A_{k-1} =
\begin{bmatrix}
I_{4\times4} & -\frac{T}{2}S(q) \\
0_{3\times4} & I_{3\times3} \\
\end{bmatrix}_{k-1}
=
\begin{bmatrix}
1 & 0 & 0 & 0 & -\frac{T}{2}(-q_1) & -\frac{T}{2}(-q_2) & -\frac{T}{2}(-q_3) \\
0 & 1 & 0 & 0 & -\frac{T}{2}(\phantom{-}q_0) & -\frac{T}{2}(-q_3) & -\frac{T}{2}(\phantom{-}q_2) \\
0 & 0 & 1 & 0 & -\frac{T}{2}(\phantom{-}q_3) & -\frac{T}{2}(\phantom{-}q_0) & -\frac{T}{2}(-q_1) \\
0 & 0 & 0 & 1 & -\frac{T}{2}(-q_2) & -\frac{T}{2}(\phantom{-}q_1) & -\frac{T}{2}(\phantom{-}q_0) \\
0 & 0 & 0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 1 \\
\end{bmatrix}_{k-1}
\qquad
x^+_{k-1} = \begin{bmatrix}
q_0 \\ q_1 \\ q_2 \\ q_3 \\ b_x \\ b_y \\ b_z \\
\end{bmatrix}
$$
$$
B_{k-1} =
\begin{bmatrix}
\frac{T}{2}S(q) \\
0_{3\times3} \\
\end{bmatrix}_{k-1}
= \frac{T}{2}
\begin{bmatrix}
-q_1 & -q_2 & -q_3 \\
\phantom{-}q_0 & -q_3 & \phantom{-}q_2 \\
\phantom{-}q_3 & \phantom{-}q_0 & -q_1 \\
-q_2 & \phantom{-}q_1 & \phantom{-}q_0 \\
0 & 0 & 0 \\
0 & 0 & 0 \\
0 & 0 & 0 \\
\end{bmatrix}_{k-1}
\qquad
u_{k-1} =
\begin{bmatrix}
w_x \\ w_y \\ w_z \\
\end{bmatrix}_{k-1}
$$
Where $S(q)$ is the system dynamics matrix expressed in terms of the quaternion.
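A minimal NumPy sketch of the state prediction, assuming the state ordering above; the function names are illustrative, not from the original text:

```python
import numpy as np

def S_of_q(q):
    """Quaternion dynamics matrix S(q) for q = [q0, q1, q2, q3]."""
    q0, q1, q2, q3 = q
    return np.array([[-q1, -q2, -q3],
                     [ q0, -q3,  q2],
                     [ q3,  q0, -q1],
                     [-q2,  q1,  q0]])

def predict_state(x_prev, u_prev, T):
    """x_k^- = A_{k-1} x_{k-1}^+ + B_{k-1} u_{k-1} (u in rad/s, T in seconds)."""
    S = S_of_q(x_prev[:4, 0])
    A = np.eye(7)
    A[:4, 4:] = -(T / 2.0) * S          # quaternion is driven by (gyro - bias)
    B = np.zeros((7, 3))
    B[:4, :] = (T / 2.0) * S
    x_pred = A @ x_prev + B @ u_prev
    return x_pred, A, B

# Example: identity quaternion, zero bias, small rotation rate about x.
x_prev = np.array([[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]]).T
u_prev = np.array([[0.1, 0.0, 0.0]]).T      # gyro measurement in rad/s
x_pred, A, B = predict_state(x_prev, u_prev, T=0.01)
```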
Next, we compute the estimated Covariance matrix, $P^-_k$, where $Q_{k-1}$ is the Process noise covariance.
$$
\boldsymbol{P}_k^- = \boldsymbol{A}_{k-1} \boldsymbol{P}^+_{k-1} \boldsymbol{A}^T_{k-1} + \boldsymbol{Q}_{k-1}
$$

$$
P_k^- [7\times7] = [7\times7][7\times7][7\times7]^T + [7\times7]
$$
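The covariance prediction is a single matrix expression. In this sketch, `Q` is an assumed diagonal process noise covariance and `A` is a placeholder for the transition matrix returned by the prediction sketch above:

```python
import numpy as np

def predict_covariance(A, P_prev, Q):
    """P_k^- = A_{k-1} P_{k-1}^+ A_{k-1}^T + Q_{k-1}."""
    return A @ P_prev @ A.T + Q

# Example with placeholder values of the correct dimensions.
A = np.eye(7)                   # stand-in for A_{k-1}
P_prev = np.eye(7) * 1e-3       # P_{k-1}^+ from the previous iteration
Q = np.eye(7) * 1e-6            # assumed process noise covariance
P_pred = predict_covariance(A, P_prev, Q)   # 7x7, matching the dimensions above
```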
Update Step (k = 1, 2, …)
The first step is to calculate the Measurement Innovation, $d_k$, which is the error between the measured values and the predicted measurements obtained by rotating the reference vectors.
$$
\tilde{y}_k = \left[ z_k - h_k(\hat{x}_{k|k-1}) \right]
\qquad \rightleftharpoons \qquad
d_k = \left[ z_k - h_k(x^-_k)V_x^{ref} \right] = \left[ z_k - d^-_k \right]
$$

$$
d_k[6\times1] = \left( [6\times1] -
\begin{bmatrix}
\left[3\times3\right] & \left[3\times1\right] \\
\left[3\times3\right] & \left[3\times1\right] \\
\end{bmatrix}
\right) = [6\times1] - [6\times1]
$$
The measurement function, $h_k(*)$, is typically a transformation from the calculated coordinate system to the measurement (body) coordinate system. Because our predicted quaternion, $x^-_k$, is calculated in the NED-Frame (Global), we must rotate from the NED-Frame to the Body-Frame (Local) using $R^b_n$.
$$
h(q_k) = h(\hat{x}_{k|k-1}) = h(x^-_k) = R^b_n
$$
We then multiply the rotation matrix by the reference vector, $V^{ref}_x$. The $V^{ref}_x$ vectors are the North and Down vectors of the NED coordinate frame, as defined below:
$$
V^{ref}_x =
\begin{bmatrix}
V_x \\ V_y \\ V_z
\end{bmatrix}
\qquad \rightarrow \qquad
V^{ref}_N =
\begin{bmatrix}
M_x \\ M_y \\ 0
\end{bmatrix}
\qquad
V^{ref}_D = g*
\begin{bmatrix}
\phantom{-}0 \\ \phantom{-}0 \\ -1
\end{bmatrix}
$$
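A sketch of the measurement function and innovation: $R^b_n$ is built from the predicted quaternion using the standard quaternion-to-rotation-matrix formula (consistent with the Jacobian given below), and the stacking order of the accelerometer and magnetometer components of $z_k$ is an assumption made here for illustration:

```python
import numpy as np

def R_body_from_ned(q):
    """Rotation matrix R^b_n (NED -> body) for q = [q0, q1, q2, q3]."""
    q0, q1, q2, q3 = q
    return np.array([
        [q0**2 + q1**2 - q2**2 - q3**2, 2*(q1*q2 + q0*q3),             2*(q1*q3 - q0*q2)],
        [2*(q1*q2 - q0*q3),             q0**2 - q1**2 + q2**2 - q3**2, 2*(q2*q3 + q0*q1)],
        [2*(q1*q3 + q0*q2),             2*(q2*q3 - q0*q1),             q0**2 - q1**2 - q2**2 + q3**2],
    ])

def innovation(x_pred, z, V_N, V_D):
    """d_k = z_k - h_k(x_k^-) V_x^ref, stacking rotated Down (accel) then North (mag)."""
    R = R_body_from_ned(x_pred[:4, 0])
    d_pred = np.vstack([R @ V_D, R @ V_N])   # predicted accelerometer and magnetometer readings
    return z - d_pred

# Illustrative values: level attitude, assumed North reference, gravity g = 9.81.
x_pred = np.array([[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]]).T
V_N = np.array([[1.0], [0.0], [0.0]])           # [Mx, My, 0] after normalization (assumed)
V_D = 9.81 * np.array([[0.0], [0.0], [-1.0]])   # g * [0, 0, -1]
z = np.vstack([9.81 * np.array([[0.0], [0.0], [-1.0]]), V_N])  # measured accel and mag (6x1)
d_k = innovation(x_pred, z, V_N, V_D)           # 6x1 innovation
```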
Next, we are able to calculate the Innovation Covariance matrix, $S_k$:
$$
\boldsymbol{S}_k = \boldsymbol{H}_k \boldsymbol{P}_{k|k-1} \boldsymbol{H}^T_k + \boldsymbol{R}_k
\qquad \rightleftharpoons \qquad
\boldsymbol{S}_k = \boldsymbol{H}_k \boldsymbol{P}^-_k \boldsymbol{H}^T_k + \boldsymbol{R}_k
$$

$$
\boldsymbol{S}_k [6\times6] = [6\times7][7\times7][7\times6] + [6\times6]
$$
Where $H_k$ is the Jacobian matrix of the measurement function, $h(*)$.
$$
H_k = \left. \frac{\partial h}{\partial x}\right\vert_{\hat{x}_{k|k-1}} = \left.\frac{\partial h (x_k, u_k)}{\partial x}\right\vert_{x_k = \hat{x}^-_k}
$$
The generalized Jacobian matrix, $H_k$, may be calculated using a reference vector as shown below. The predicted quaternion, $x^-_k$, is contained within the calculation of the Jacobian.
$$
H_k = \left.\frac{\partial h^V(q_k)}{\partial(q_k)} \right\vert_{q_{k|k-1}} = \left.\frac{\partial h(x_k,u_k)}{\partial x}\right\vert_{x_k=x^-_k}
$$
$$
H_k = 2 *
\begin{bmatrix}
\begin{bmatrix} \phantom{-}q_0 & \phantom{-}q_3 & -q_2 \end{bmatrix} V_x^{ref} &
\begin{bmatrix} \phantom{-}q_1 & \phantom{-}q_2 & \phantom{-}q_3 \end{bmatrix} V_x^{ref} &
\begin{bmatrix} -q_2 & \phantom{-}q_1 & -q_0 \end{bmatrix} V_x^{ref} &
\begin{bmatrix} -q_3 & \phantom{-}q_0 & \phantom{-}q_1 \end{bmatrix} V_x^{ref} \\[0.5em]
\begin{bmatrix} -q_3 & \phantom{-}q_0 & \phantom{-}q_1 \end{bmatrix} V_x^{ref} &
\begin{bmatrix} \phantom{-}q_2 & -q_1 & \phantom{-}q_0 \end{bmatrix} V_x^{ref} &
\begin{bmatrix} \phantom{-}q_1 & \phantom{-}q_2 & \phantom{-}q_3 \end{bmatrix} V_x^{ref} &
\begin{bmatrix} -q_0 & -q_3 & \phantom{-}q_2 \end{bmatrix} V_x^{ref} \\[0.5em]
\begin{bmatrix} \phantom{-}q_2 & -q_1 & \phantom{-}q_0 \end{bmatrix} V_x^{ref} &
\begin{bmatrix} \phantom{-}q_3 & -q_0 & -q_1 \end{bmatrix} V_x^{ref} &
\begin{bmatrix} \phantom{-}q_0 & \phantom{-}q_3 & -q_2 \end{bmatrix} V_x^{ref} &
\begin{bmatrix} \phantom{-}q_1 & \phantom{-}q_2 & \phantom{-}q_3 \end{bmatrix} V_x^{ref}
\end{bmatrix}_{k}
$$
Where $V_x^{ref}$ are the reference vectors defined above.
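A sketch of this Jacobian: `H_block` returns the $3\times4$ block above for one reference vector, and `H_full` stacks the blocks for $V^{ref}_D$ and $V^{ref}_N$ with zero columns for the three bias states, which is consistent with the $6\times7$ dimensions used for $S_k$ (the block ordering is an assumption matching the innovation sketch above):

```python
import numpy as np

def H_block(q, v_ref):
    """3x4 block of the measurement Jacobian for one reference vector v_ref (3x1)."""
    q0, q1, q2, q3 = q
    v = v_ref.ravel()
    rows = [
        [[ q0,  q3, -q2], [ q1,  q2,  q3], [-q2,  q1, -q0], [-q3,  q0,  q1]],
        [[-q3,  q0,  q1], [ q2, -q1,  q0], [ q1,  q2,  q3], [-q0, -q3,  q2]],
        [[ q2, -q1,  q0], [ q3, -q0, -q1], [ q0,  q3, -q2], [ q1,  q2,  q3]],
    ]
    return 2.0 * np.array([[np.dot(col, v) for col in row] for row in rows])

def H_full(x_pred, V_N, V_D):
    """Full 6x7 Jacobian: blocks for V_D and V_N, zero columns for the bias states."""
    q = x_pred[:4, 0]
    H = np.zeros((6, 7))
    H[0:3, 0:4] = H_block(q, V_D)   # accelerometer rows (assumed ordering)
    H[3:6, 0:4] = H_block(q, V_N)   # magnetometer rows
    return H
```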
Now we are able to calculate the Kalman Gain using the predicted Covariance matrix, $P^-_k$.
$$
\boldsymbol{K}_{k} = \boldsymbol{P}^{-}_{k} \boldsymbol{H}^{T}_{k} \left( \boldsymbol{H}_{k} \boldsymbol{P}^{-}_{k} \boldsymbol{H}^{T}_{k} + \boldsymbol{R}_k \right)^{-1}
$$

$$
\boldsymbol{K}_k[7\times6] = [7\times7][7\times6]\left([6\times7][7\times7][7\times6] + [6\times6]\right)^{-1} = [7\times6]\left( [6\times6]\right)
$$
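A sketch of the gain computation; solving the linear system instead of forming the explicit inverse of $S_k$ is a numerical detail chosen here, not something specified in the text:

```python
import numpy as np

def kalman_gain(P_pred, H, R):
    """K_k = P_k^- H_k^T (H_k P_k^- H_k^T + R_k)^{-1}; shapes: 7x6 = (7x7)(7x6)(6x6)^{-1}."""
    S = H @ P_pred @ H.T + R                            # 6x6 innovation covariance S_k
    return np.linalg.solve(S.T, (P_pred @ H.T).T).T     # equivalent to P^- H^T S^{-1}

# Example with placeholder matrices of the correct dimensions.
P_pred = np.eye(7) * 1e-3
H = np.random.randn(6, 7)
R = np.eye(6) * 1e-2
K = kalman_gain(P_pred, H, R)                           # 7x6
```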
Once the Kalman Gain has been determined, we can use it to calculate our current state estimate, $x^+_k$, using the predicted state estimate, $x^-_k$, and our measurement innovation, $d_k$.
$$
x^+_k = x^-_k + \boldsymbol{K}_k [ d_k ] = x^-_k + \boldsymbol{K}_k \left[ z_k - h_k(x^-_k)V^{ref}_x \right]
$$

$$
x^+_k[7\times1] = [7\times1] + [7\times6][6\times1]
$$
This update uses the estimated value $x^-_k$, the Kalman Gain, $K_k$, the measured values of the sensors, $z_k$, and the estimated values of the sensors based on the rotated value of the predicted quaternion, $x^-_k$.
$$
\boldsymbol{P}^+_k = \left( \boldsymbol{I} - \boldsymbol{K}_k \boldsymbol{H}_k \right) \boldsymbol{P}^-_k
$$

$$
\boldsymbol{P}^+_k[7\times7] = \left([7\times7] - [7\times6][6\times7]\right)[7\times7] = [7\times7][7\times7]
$$
Now we can calculate the updated Covariance matrix, $P^+_k$. The identity matrix, $I$, is used to limit the divergence of the EKF.
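A combined sketch of the state and covariance updates, using placeholder inputs of the correct dimensions:

```python
import numpy as np

def update(x_pred, P_pred, K, d, H):
    """x_k^+ = x_k^- + K_k d_k  and  P_k^+ = (I - K_k H_k) P_k^-."""
    x_post = x_pred + K @ d                       # 7x1 = 7x1 + (7x6)(6x1)
    P_post = (np.eye(7) - K @ H) @ P_pred         # 7x7
    return x_post, P_post

# Example with placeholder inputs of the correct dimensions.
x_pred = np.zeros((7, 1)); x_pred[0, 0] = 1.0     # identity quaternion, zero bias
P_pred = np.eye(7) * 1e-3
K = np.zeros((7, 6)); H = np.zeros((6, 7)); d = np.zeros((6, 1))
x_post, P_post = update(x_pred, P_pred, K, d, H)
```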
After updating the system state variables, $x^+_k$ and $P^+_k$, we are able to calculate the system residual error, $\varepsilon_k$.
$$
\varepsilon_k = \left[ z_k - h_k(x^+_k)V^{ref}_x \right]
$$

$$
\varepsilon_k[6\times1] = \left( [6\times1] -
\begin{bmatrix}
\left[3\times3\right] & \left[3\times1\right] \\
\left[3\times3\right] & \left[3\times1\right] \\
\end{bmatrix}
\right) = [6\times1] - [6\times1]
$$
With the system residual error, $\varepsilon_k$, we can now update the Process Noise Covariance, $Q_k$, and Measurement Noise Covariance, $R_k$, using the equations below.
$$
\boldsymbol{R}_k = \alpha\boldsymbol{R}_{k-1} + (1-\alpha)\left( \varepsilon_k \varepsilon^T_k + \boldsymbol{H}_k \boldsymbol{P}^-_{k} \boldsymbol{H}^T_k \right)
\qquad
\boldsymbol{Q}_k = \alpha\boldsymbol{Q}_{k-1} + (1 - \alpha)\left( \boldsymbol{K}_k d_k d^T_k \boldsymbol{K}^T_k \right)
$$
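A sketch of the adaptive noise update; the residual $\varepsilon_k$ and innovation $d_k$ are taken as inputs (computed as shown above), and the forgetting factor `alpha` is application-dependent, with the 0.3 used here being purely illustrative:

```python
import numpy as np

def adapt_noise(R_prev, Q_prev, eps, d, H, P_pred, K, alpha):
    """Adaptive update of R_k and Q_k from the residual eps_k and innovation d_k."""
    R_new = alpha * R_prev + (1 - alpha) * (eps @ eps.T + H @ P_pred @ H.T)
    Q_new = alpha * Q_prev + (1 - alpha) * (K @ d @ d.T @ K.T)
    return R_new, Q_new

# Example with placeholder values of the correct dimensions.
R_prev, Q_prev = np.eye(6) * 1e-2, np.eye(7) * 1e-6
eps, d = np.zeros((6, 1)), np.zeros((6, 1))
H, P_pred, K = np.zeros((6, 7)), np.eye(7) * 1e-3, np.zeros((7, 6))
R_new, Q_new = adapt_noise(R_prev, Q_prev, eps, d, H, P_pred, K, alpha=0.3)
```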
Following this, we can update our previous-value variables, $x^+_{k-1}$ and $P^+_{k-1}$.
$$
x^{+}_{k-1} = x^+_k \qquad
\boldsymbol{P}^{+}_{k-1} = \boldsymbol{P}^+_k
$$