
Adaptive Unscented Kalman Filter using Maximum Likelihood Estimation


Proceedings of the 20th World Congress of the International Federation of Automatic Control, Toulouse, France, July 9-14, 2017

IFAC PapersOnLine 50-1 (2017) 3859–3864

Zeinab Mahmoudi, Niels Kjølstad Poulsen, Henrik Madsen, John Bagterp Jørgensen

Department of Applied Mathematics and Computer Science, Technical University of Denmark, 2800 Kgs. Lyngby, Denmark (e-mail: [email protected]).

Abstract: The purpose of this study is to develop an adaptive unscented Kalman filter (UKF) by tuning the measurement noise covariance. We use the maximum likelihood estimation (MLE) and the covariance matching (CM) method to estimate the noise covariance. The multi-step prediction errors generated by the UKF are used for covariance estimation by MLE and CM. Then we apply the two covariance estimation methods on an example application. In the example, we identify the covariance of the measurement noise for a continuous glucose monitoring (CGM) sensor. The sensor measures the subcutaneous glucose concentration for a type 1 diabetes patient. The root-mean square (RMS) error and the computation time are used to compare the performance of the two covariance estimation methods. The results indicate that as the prediction horizon expands, the RMS error for the MLE declines, while the error remains relatively large for the CM method. For larger prediction horizons, the MLE provides an estimate of the noise covariance that is less biased than the estimate by the CM method. The CM method is computationally less expensive, though.

© 2017, IFAC (International Federation of Automatic Control) Hosting by Elsevier Ltd. All rights reserved.
Keywords: Unscented Kalman filter, Maximum likelihood estimation, Covariance matching technique, Adaptive filtering, Covariance estimation, Continuous glucose monitors.

1. INTRODUCTION

Identifying the uncertainties that affect a system is fundamental for monitoring and for designing an optimal and adaptive estimator. In order to have a filter that is close to optimal, we need to know the covariance of the process and measurement noise. Methods for identifying noise covariances include maximum likelihood estimation (MLE) (Zagrobelny and Rawlings, 2015b; Jørgensen and Jørgensen, 2007), covariance matching (CM) techniques (Maybeck, 1982; Weige et al., 2015; Partovibakhsh and Liu, 2015), and correlation-based approaches such as the autocovariance least-squares (ALS) method (Åkesson et al., 2008; Odelson et al., 2006a,b; Zagrobelny and Rawlings, 2015a). These methods often deal only with linear or linearized systems. The CM technique is commonly used for nonlinear systems, because it is computationally inexpensive and flexible enough to accommodate nonlinear models. It is a suboptimal covariance estimation technique, though. The literature on the use of optimization-based covariance estimation approaches for nonlinear systems is sparse.

The purpose of this study is to use an optimization-based estimator, i.e., the MLE method, for identification of the noise covariance in a nonlinear system. Furthermore, we compare the MLE approach with a suboptimal estimation method, i.e., the CM algorithm, in the context of an adaptive unscented Kalman filter (UKF). We employ the estimation of the measurement noise covariance as the basis for deriving the filter adaptation.

The paper is structured as follows. First, we present the unscented Kalman filter (UKF) for prediction, filtering, and generating the covariance matrix of the prediction errors. Then, we develop the MLE problem and the CM technique to estimate the covariance of the measurement noise. The multi-step prediction errors and their covariances generated by the UKF are used in the MLE and CM methods. We then apply the MLE and CM algorithms on an example. The example is a nonlinear metabolic model of a patient with type 1 diabetes. In the example, we estimate the noise covariance of a continuous glucose monitoring (CGM) sensor and derive the adaptive UKF to filter the sensor measurements.

⋆ This work is funded by the Danish Diabetes Academy supported by the Novo Nordisk Foundation.

2. MATERIALS AND METHODS

2.1 The unscented Kalman filter

In the UKF, a set of sigma points is deterministically chosen to represent the mean and covariance of the states. The sigma points therefore approximate the probability distribution of the states as it goes through the nonlinear transformation (Särkkä, 2007; Julier and Uhlmann, 2004). Approximating the probability distribution of the states by the sigma points in the UKF has been shown to produce less estimation bias than the linearization in the extended Kalman filter (EKF) (Simon, 2006).


Model

The model of the state space in the stochastic differential equation (SDE) form and the measurement model are of the form

$dx(t) = f(x(t), u(t), d(t))\,dt + \sigma \cdot d\omega(t)$,  (1a)
$y_k = g(x_k) + \xi_k$,  (1b)
$d\omega(t) \sim N_{iid}(0, I\,dt)$,  (1c)

in which x is the state, u is the input, d is the disturbance, and y is the measurement. We assume that ξ is a Gaussian zero-mean discrete-time measurement noise with covariance R. The stochastic noise ω is a standard Wiener process, and σ is the diffusion coefficient. I is an n × n identity matrix, where n is the number of state variables in the model.

Prediction

This section explains the multi-step prediction with the UKF. The prediction steps are j = 1, 2, ..., Np, where Np is the prediction horizon. The scaling parameter λ determines how far the sigma points are scattered away from the mean. λ and c are defined as

$\lambda = \alpha^2(n + \kappa) - n, \qquad c = \alpha^2(n + \kappa)$.  (2a)

The associated weights, W, for the 2n + 1 sigma points are given by

$W_m^{(0)} = \lambda/(n + \lambda)$,  (2b)
$W_c^{(0)} = \lambda/(n + \lambda) + (1 - \alpha^2 + \beta)$,  (2c)
$W_m^{(i)} = 1/\{2(n + \lambda)\}, \quad i = 1, \ldots, 2n$,  (2d)
$W_c^{(i)} = 1/\{2(n + \lambda)\}, \quad i = 1, \ldots, 2n$,  (2e)
$W_m = [W_m^{(0)} \ \ldots \ W_m^{(2n)}]^T$.  (2f)

A deterministic approach, based on the Cholesky factorization of the covariance P, samples the probability distribution to generate the sigma points $\hat{X}$:

$\hat{X}_{k+j-1} = [\hat{x}_{k+j-1|k} \ \ldots \ \hat{x}_{k+j-1|k}] + \sqrt{c}\,\big[0 \ \ \sqrt{P_{k+j-1|k}} \ \ -\sqrt{P_{k+j-1|k}}\big] = [m^{(0)} \ \ldots \ m^{(2n)}]$.  (3a)

The nonlinear function f propagates each of the sigma points according to

$\frac{d\hat{X}_{k+j-1}}{dt}(t) = f\big(\hat{X}_{k+j-1}(t), u(t), d(t)\big), \quad t \in [t_{k+j-1}, t_{k+j}]$,  (3b)
$\hat{X}_{k+j} = \hat{X}_{k+j-1}(t_{k+j})$.  (3c)
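For illustration, here is a minimal Python sketch of the weight and sigma-point construction in (2a)-(2f) and (3a); the function name and the NumPy interface are our own choices, and the matrix square root is taken as the lower-triangular Cholesky factor, consistent with the deterministic sampling described above.

```python
import numpy as np

def sigma_points(x_hat, P, alpha=0.01, kappa=0.0, beta=2.0):
    """Generate the 2n+1 sigma points (columns) and their weights (Eqs. 2a-2f, 3a)."""
    n = x_hat.size
    lam = alpha**2 * (n + kappa) - n                    # Eq. (2a)
    c = alpha**2 * (n + kappa)
    Wm = np.full(2 * n + 1, 1.0 / (2.0 * (n + lam)))    # Eqs. (2d)-(2e)
    Wc = Wm.copy()
    Wm[0] = lam / (n + lam)                             # Eq. (2b)
    Wc[0] = lam / (n + lam) + (1.0 - alpha**2 + beta)   # Eq. (2c), standard scaled-UT form
    sqrtP = np.linalg.cholesky(P)                       # lower-triangular square root of P
    X = np.tile(x_hat[:, None], (1, 2 * n + 1))         # Eq. (3a): repeat the mean ...
    X[:, 1:n+1] += np.sqrt(c) * sqrtP                   # ... then spread the points
    X[:, n+1:] -= np.sqrt(c) * sqrtP
    return X, Wm, Wc
```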

The parameters α, κ, and β are set to α = 0.01, κ = 0, and β = 2. The weighted average of the transformed sigma points gives the predicted mean

$\hat{x}_{k+j|k} = \hat{X}_{k+j} W_m$.  (3d)

The covariance of the estimation error is computed by propagating $P_{k+j-1}$ according to

$\frac{dP_{k+j-1}(t)}{dt} = \sum_{i=0}^{2n} W_c^{(i)} \big(m^{(i)}(t) - m_x(t)\big)\big(f(m^{(i)}(t), u(t)) - m_f(t)\big)^T + \sum_{i=0}^{2n} W_c^{(i)} \big(f(m^{(i)}(t), u(t)) - m_f(t)\big)\big(m^{(i)}(t) - m_x(t)\big)^T + \sigma\sigma^T, \quad t \in [t_{k+j-1}, t_{k+j}]$,  (3e)

where $m_x$ and $m_f$ are

$m_x(t) = \sum_{i=0}^{2n} W_m^{(i)} m^{(i)}(t), \qquad m_f(t) = \sum_{i=0}^{2n} W_m^{(i)} f(m^{(i)}(t), u(t))$.  (3f)

The propagated error covariance is then

$P_{k+j|k} = P_{k+j-1}(t_{k+j})$.  (3g)

To increase accuracy, new sigma points are generated from the predicted state mean and covariance as indicated in

$\tilde{X}_{k+j} = [\hat{x}_{k+j|k} \ \ldots \ \hat{x}_{k+j|k}] + \sqrt{c}\,\big[0 \ \ \sqrt{P_{k+j|k}} \ \ -\sqrt{P_{k+j|k}}\big] = [\tilde{m}^{(0)} \ \ldots \ \tilde{m}^{(2n)}]$.  (3h)

The measurement model transforms each of the new sigma points

$\hat{Y}_{k+j} = g(\tilde{X}_{k+j}) = [\mu^{(0)} \ \ldots \ \mu^{(2n)}]$.  (4a)
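The continuous-time propagation (3b)-(3c) can be carried out with any ODE solver; the following sketch, with names of our own choosing, uses a fixed-step Euler scheme for brevity.

```python
import numpy as np

def propagate_sigma_points(X, f, u, d, t0, t1, n_sub=10):
    """Integrate every sigma-point column of X through dX/dt = f(X, u, d)
    from t0 to t1 (Eqs. 3b-3c) with a fixed-step Euler scheme."""
    X = X.copy()
    dt = (t1 - t0) / n_sub
    for _ in range(n_sub):
        for i in range(X.shape[1]):
            X[:, i] += dt * f(X[:, i], u, d)
    return X
```

The predicted mean (3d) is then simply the weighted column average `X @ Wm`.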

The weighted average of the measurement sigma points gives the predicted measurement

$\hat{y}_{k+j|k} = \hat{Y}_{k+j} W_m$,  (4b)

and the j-step prediction error is given by

$e_{k+j|k} = y_{k+j} - \hat{y}_{k+j|k}$.  (4c)

$S^{yy}_{k+j}$ is the covariance of $\hat{Y}_{k+j}$ and is computed as

$S^{yy}_{k+j} = \sum_{i=0}^{2n} W_c^{(i)} \big(\mu^{(i)} - \hat{y}_{k+j|k}\big)\big(\mu^{(i)} - \hat{y}_{k+j|k}\big)^T$.  (4d)

$S_{k+j}$ is the covariance of $e_{k+j|k}$ and is calculated by

$S_{k+j} = S^{yy}_{k+j} + R_{k+j}$.  (4e)

Filtering

The equation set (5) describes filtering and the measurement update with the UKF. $S^{xy}_{k+1}$ is the cross-covariance of $\tilde{X}_{k+1}$ and $\hat{Y}_{k+1}$, and can be estimated as

$S^{xy}_{k+1} = \sum_{i=0}^{2n} W_c^{(i)} \big(\tilde{m}^{(i)} - \hat{x}_{k+1|k}\big)\big(\mu^{(i)} - \hat{y}_{k+1|k}\big)^T$,  (5a)

and $K_{k+1}$ is the filter gain given by

$K_{k+1} = S^{xy}_{k+1} S_{k+1}^{-1}$.  (5b)

The updated state mean is computed as

$\hat{x}_{k+1|k+1} = \hat{x}_{k+1|k} + K_{k+1}(y_{k+1} - \hat{y}_{k+1|k})$.  (5c)

The updated error covariance is given by

$P_{k+1|k+1} = P_{k+1|k} - K_{k+1} S_{k+1} K_{k+1}^T$.  (5d)
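As a concrete sketch of (4b)-(4e) and (5a)-(5d), the following Python function performs the UKF measurement update; the interface and names are ours, and a linear solve replaces the explicit inverse in (5b).

```python
import numpy as np

def ukf_update(x_pred, P_pred, X_tilde, Y_hat, Wm, Wc, y, R):
    """UKF measurement update (Eqs. 4b-4e, 5a-5d).

    X_tilde : n x (2n+1) sigma points; Y_hat : ny x (2n+1) transformed points.
    """
    y_pred = Y_hat @ Wm                      # Eq. (4b)
    dX = X_tilde - x_pred[:, None]           # sigma-point deviations from the mean
    dY = Y_hat - y_pred[:, None]
    S_yy = (Wc * dY) @ dY.T                  # Eq. (4d)
    S = S_yy + R                             # Eq. (4e)
    S_xy = (Wc * dX) @ dY.T                  # Eq. (5a)
    K = np.linalg.solve(S.T, S_xy.T).T       # Eq. (5b): K = S_xy S^{-1}
    x_new = x_pred + K @ (y - y_pred)        # Eq. (5c)
    P_new = P_pred - K @ S @ K.T             # Eq. (5d)
    return x_new, P_new
```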

Multi-step prediction error and its covariance matrix

Let $\{y_k\}_{k=1}^N$ denote the measurements and $N_p$ denote the prediction horizon. Let the time indices be k = 0, 1, ..., N − Np, and the prediction index be 1 ≤ j ≤ Np. This implies that 1 ≤ k + j ≤ N. Let $\epsilon_{k+N_p}$ denote the vector of the prediction errors in the Np-sample prediction window as

$\epsilon_{k+N_p} = \begin{bmatrix} e_{k+1|k} \\ e_{k+2|k} \\ \vdots \\ e_{k+N_p|k} \end{bmatrix} = \begin{bmatrix} y_{k+1} - \hat{y}_{k+1|k} \\ y_{k+2} - \hat{y}_{k+2|k} \\ \vdots \\ y_{k+N_p} - \hat{y}_{k+N_p|k} \end{bmatrix}$.  (6)


Analogously to the linear systems, the cross-covariances of the multi-step prediction errors may be computed by (Jørgensen and Jørgensen, 2007; Kailath et al., 2000)

$S_{k+i,k+j} = \langle e_{k+i|k}, e_{k+j|k} \rangle = \begin{cases} \hat{Y}_{k+i} W \hat{Y}_{k+j}^T & \text{if } i > j, \\ \hat{Y}_{k+i} W \hat{Y}_{k+j}^T + R_{k+j} & \text{if } i = j, \\ \hat{Y}_{k+j} W \hat{Y}_{k+i}^T & \text{if } i < j. \end{cases}$  (7)

The covariance matrix of $\epsilon_{k+N_p}$ is calculated by

$R_{k+N_p} = \langle \epsilon_{k+N_p}, \epsilon_{k+N_p} \rangle = \begin{bmatrix} S_{k+1,k+1} & S_{k+1,k+2} & \cdots & S_{k+1,k+N_p} \\ S_{k+2,k+1} & S_{k+2,k+2} & \cdots & S_{k+2,k+N_p} \\ \vdots & \vdots & \ddots & \vdots \\ S_{k+N_p,k+1} & S_{k+N_p,k+2} & \cdots & S_{k+N_p,k+N_p} \end{bmatrix}$.  (8)

The covariance $R_{k+N_p}$ is an $n_y N_p \times n_y N_p$ matrix, in which $n_y$ is the size of the measurement vector y. Under the assumption that the measurement noise ξ is zero-mean Gaussian, the prediction error $\epsilon_{k+N_p}$ is also Gaussian with the distribution $N\big([0, 0, \cdots, 0]^T, R_{k+N_p}\big)$. By having $R_{k+N_p}$, $\epsilon_{k+N_p}$, and e from the UKF, we estimate and tune the covariance of the measurement noise R by MLE and the CM method.
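A sketch of assembling (7)-(8) in Python; here we assume, as our own reading of (7), that `Y_hats[j]` holds the mean-deviated transformed sigma points of prediction step j and that `W` is the diagonal matrix of the weights $W_c^{(i)}$.

```python
import numpy as np

def assemble_prediction_error_cov(Y_hats, W, R):
    """Build the ny*Np x ny*Np covariance of the stacked prediction
    errors (Eq. 8) from the blocks of Eq. (7)."""
    Np = len(Y_hats)
    ny = Y_hats[0].shape[0]
    R_big = np.zeros((Np * ny, Np * ny))
    for i in range(Np):
        for j in range(Np):
            a, b = (i, j) if i >= j else (j, i)   # case split of Eq. (7)
            block = Y_hats[a] @ W @ Y_hats[b].T
            if i == j:
                block = block + R                 # measurement noise on the diagonal
            R_big[i*ny:(i+1)*ny, j*ny:(j+1)*ny] = block
    return R_big
```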

2.2 Maximum likelihood estimation

By taking the negative log likelihood of the multivariate normal distribution of $\epsilon_{k+N_p}$, we write the MLE optimization problem as

$\min_{R \succeq 0} \; V(R) = \ln\det(R_{k+N_p}) + \epsilon_{k+N_p}^T R_{k+N_p}^{-1} \epsilon_{k+N_p}$.  (9)

When Np is large, computing $\ln\det(R_{k+N_p})$ is challenging in terms of computational time and numerical accuracy. Alternatively, Cholesky factorization offers a faster approach. The Cholesky factorization decomposes the positive definite $R_{k+N_p}$ into $R_{k+N_p} = LL^T$, with L being a lower triangular matrix. Then

$\ln\det(R_{k+N_p}) = 2 \sum_{i=1}^{n_y N_p} \ln(L_{ii})$,  (10)

in which $L_{ii}$ are the diagonal entries of L. Computing the inverse of $R_{k+N_p}$ for calculating $\epsilon_{k+N_p}^T R_{k+N_p}^{-1} \epsilon_{k+N_p}$ is computationally heavy. To avoid the matrix inversion, we compute $\epsilon_{k+N_p}^T R_{k+N_p}^{-1} \epsilon_{k+N_p}$ by solving the system of linear equations $R_{k+N_p} Z = \epsilon_{k+N_p}$ via back substitution and finding Z. Then

$\epsilon_{k+N_p}^T R_{k+N_p}^{-1} \epsilon_{k+N_p} = \epsilon_{k+N_p}^T Z$.  (11)
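Putting (9)-(11) together, here is a minimal sketch of the objective evaluation in Python/SciPy; the function name is ours, and in the adaptive filter this value would be minimized over the candidate noise covariance R used to assemble $R_{k+N_p}$.

```python
import numpy as np
from scipy.linalg import cho_factor, cho_solve

def nll_objective(R_big, eps):
    """V(R) = ln det(R_big) + eps' R_big^{-1} eps  (Eq. 9),
    evaluated via Cholesky factorization (Eqs. 10-11)."""
    L, lower = cho_factor(R_big, lower=True)   # R_big = L L'
    logdet = 2.0 * np.sum(np.log(np.diag(L)))  # Eq. (10)
    z = cho_solve((L, lower), eps)             # solves R_big z = eps, Eq. (11)
    return logdet + eps @ z
```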

2.3 Covariance matching technique

In the prediction window, for the prediction steps j = 1, 2, ..., Np, we compute $S_{k+j}$, the theoretical covariance of $e_{k+j|k}$. The covariance matrix $S_{k+j}$ is computed as

$S_{k+j} = S^{yy}_{k+j} + R_{k+j}$,  (12a)

in which the covariance matrix $S^{yy}_{k+j}$ is calculated according to (4d). The sample covariance of $e_{k+j|k}$ is $\hat{S}_{k+j}$, which is given by

$\hat{S}_{k+j} = \frac{1}{M-1} \sum_{q=k+j-M+1}^{k+j} e_{q|q-j}\, e_{q|q-j}^T$,  (12b)

where M is the length of the data sequence used for estimating the sample covariance. We set M to 15. The estimated measurement noise covariance is

$\hat{R} = \frac{1}{N_p - 1} \sum_{j=1}^{N_p} \big(\hat{S}_{k+j} - S^{yy}_{k+j}\big)$.  (12c)

In both estimation methods, the estimated R is used in the UKF for the next w samples. w is the size of the moving step of the prediction window and is set to 50% of Np. The size of the prediction window is the same as the prediction horizon Np. The root-mean square (RMS) error of the estimated noise covariances evaluates the performance of the two covariance estimation methods. The RMS error is computed by

$V_{rms} = \sqrt{\frac{1}{N} \sum_{k=1}^{N} \| R_{true,k} - \hat{R}_k \|_2^2}$,  (13)

where N is the length of the data sequence.
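A minimal sketch of the CM update (12b)-(12c) for the scalar-measurement case of the CGM example below; the names and the array layout are our own.

```python
import numpy as np

def cm_noise_estimate(errors, S_yy, M=15):
    """Covariance matching estimate of a scalar measurement noise variance.

    errors : (Np, M) array, errors[j-1, :] holding the last M j-step
             prediction errors e_{q|q-j} of Eq. (12b)
    S_yy   : (Np,) array of the UKF output covariances S^yy_{k+j} (Eq. 4d)
    """
    Np = errors.shape[0]
    S_hat = (errors ** 2).sum(axis=1) / (M - 1)   # Eq. (12b), scalar case
    return (S_hat - S_yy).sum() / (Np - 1)        # Eq. (12c)
```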

3. EXAMPLE: IDENTIFYING THE NOISE OF A CONTINUOUS GLUCOSE MONITORING SENSOR

The CGM sensor measures interstitial glucose from the subcutaneous (SC) tissue. The sensor measurements are corrupted by random noise and artifacts originating from several sources, including the sensor electronics, miscalibration of the sensor, and biofouling.

3.1 The state space model

We used the Medtronic Virtual Patient (MVP) model in the SDE form for the state space representation of the patient's metabolism and also for simulating the CGM sensor (Kanderian et al., 2009). This model describes the pharmacokinetics (PK) of SC insulin and the insulin-glucose interaction. We also included the blood glucose-interstitial glucose dynamics in the model. The model also contains the two compartments of carbohydrate (CHO) absorption (Wilinska et al., 2010). The model is described as





$dI_{sc}(t) = \frac{1}{\tau_1}\left(\frac{ID(t)}{C_I} - I_{sc}(t)\right)dt + \sigma_{SC}\,d\omega_{SC}(t)$,  (14a)
$dI_p(t) = \frac{1}{\tau_2}\big(I_{sc}(t) - I_p(t)\big)dt + \sigma_p\,d\omega_p(t)$,  (14b)
$dI_{eff}(t) = \big(-P_2\,I_{eff}(t) + P_2\,S_I\,I_p(t)\big)dt + \sigma_{eff}\,d\omega_{eff}(t)$,  (14c)
$dG_B(t) = \big(-(GEZI + I_{eff}(t))\,G_B(t) + EGP + R_a(t)\big)dt + \sigma_G\,d\omega_G(t)$,  (14d)
$dG_I(t) = -\frac{1}{\tau_3}\big(G_I(t) - G_B(t)\big)dt + \sigma_{GI}\,d\omega_{GI}(t)$,  (14e)
$dD_1(t) = \left(q(t) - \frac{1}{\tau_m}D_1(t)\right)dt + \sigma_{D1}\,d\omega_{D1}(t)$,  (14f)
$dD_2(t) = \frac{1}{\tau_m}\big(D_1(t) - D_2(t)\big)dt + \sigma_{D2}\,d\omega_{D2}(t)$,  (14g)
$R_a(t) = \frac{1}{V_G \tau_m} D_2(t)$.  (14h)
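For reference, a direct Python transcription of the drift term f(x, u, d) of (14); the dictionary-based parameter passing is our own convention, with the parameter values listed after the meal protocol below.

```python
import numpy as np

def mvp_drift(x, u, d, p):
    """Drift f(x, u, d) of the MVP model (Eqs. 14a-14h).
    x = [Isc, Ip, Ieff, GB, GI, D1, D2]; u = ID (insulin input, muU/min);
    d = q (CHO ingestion rate, g/min); p = parameter dictionary."""
    Isc, Ip, Ieff, GB, GI, D1, D2 = x
    Ra = D2 / (p["VG"] * p["taum"])                    # Eq. (14h)
    return np.array([
        (u / p["CI"] - Isc) / p["tau1"],               # Eq. (14a)
        (Isc - Ip) / p["tau2"],                        # Eq. (14b)
        -p["P2"] * Ieff + p["P2"] * p["SI"] * Ip,      # Eq. (14c)
        -(p["GEZI"] + Ieff) * GB + p["EGP"] + Ra,      # Eq. (14d)
        -(GI - GB) / p["tau3"],                        # Eq. (14e)
        d - D1 / p["taum"],                            # Eq. (14f)
        (D1 - D2) / p["taum"],                         # Eq. (14g)
    ])
```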


ID contains the basal and bolus insulin. The patient eats breakfast at 8:00 hrs, lunch at 13:15 hrs, dinner at 18:00 hrs, and a snack at 22:00 hrs. The CHO contents of the meals are 72 g for breakfast, 131 g for lunch, 51 g for dinner, and 70 g for the snack. The insulin boluses covering the meals are 3.5 U for breakfast, 7 U for lunch, 2.5 U for dinner, and 3.5 U for the snack.

Fig. 1. Root-mean square error of covariance estimation averaged over 50 experiments. [Plot: Vrms (mg²/dL²) versus prediction horizon (min); curves: maximum likelihood estimation, covariance matching.]

ID is the SC insulin input (µU/min); Isc, Ip, and Ieff are the SC insulin concentration (mU/L), the plasma insulin concentration (mU/L), and the effect of insulin (min⁻¹), respectively. GB is the blood glucose concentration and GI is the interstitial glucose concentration, both in mg/dL. q(t) is the CHO ingestion rate (g/min), D1 and D2 are the glucose masses (mg) in the accessible and inaccessible compartments, and Ra is the glucose appearance rate (mg/dL/min). τ1 (49 min) is the time constant of the insulin movement from the administration site to the SC tissue, τ2 (47 min) is the time constant related to the insulin movement from the SC tissue to plasma, τ3 (10 min) is the time constant related to the glucose movement from plasma to the SC tissue, CI (2010 mL/min) is the insulin clearance, P2 (1.06 × 10⁻² min⁻¹) is the delayed insulin action on the blood glucose, SI (8.11 × 10⁻⁴ mL/µU/min) is the insulin sensitivity, GEZI (2.20 × 10⁻³ min⁻¹) is the glucose effectiveness at zero insulin, EGP (1.33 mg/dL/min) is the endogenous glucose production rate at zero insulin, τm (47 min) is the peak time of meal absorption, and VG (253 dL) is the volume of distribution for glucose.

Fig. 2. Deviation of the estimated covariance from the true covariance averaged over 50 experiments. [Plot: deviation (mg/dL) versus prediction horizon (min); curves: maximum likelihood estimation, covariance matching.]

3.2 The measurement model

The CGM sensor samples interstitial glucose. Therefore, the measurement model comprises the GI measurements, which are affected by the measurement noise φ as indicated by

$y_k = G_{I,k} + \phi_k$.  (15a)

The measurement noise φ has covariance $R_\phi$, and Facchinetti et al. identified it as the sum of two autoregressive processes given by (Facchinetti et al., 2014)

$\phi_k = c_k + \hat{\vartheta}_k$,  (15b)
$c_k = 1.23\,c_{k-1} - 0.3995\,c_{k-2} + \delta_{c,k}$,  (15c)
$\hat{\vartheta}_k = 1.013\,\hat{\vartheta}_{k-1} - 0.2135\,\hat{\vartheta}_{k-2} + \delta_{\vartheta,k}$,  (15d)
$\delta_{c,k} \sim N(0, 11.3), \qquad \delta_{\vartheta,k} \sim N(0, 14.45)$.

The model (1) corresponds to the model (14) with the state variables $x = [I_{sc}\ I_p\ I_{eff}\ G_B\ G_I\ D_1\ D_2]^T$, the input u = ID, the disturbance d = q, the noise ξ = φ, and the measurement y being the CGM data. In this example, the measurement model g is linear and $R_\phi$ is a scalar. For simulating the measurements y, we first simulated the model in (14) by using the Euler-Maruyama method (Higham, 2001). Then we added the noise φ to the simulated GI. We simulated one day of one-minute CGM data, which consists of 1440 measurements. The aim is to estimate the unknown $R_\phi$ and adapt the UKF accordingly.
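To make the simulation setup concrete, here is a sketch of the noise generator (15b)-(15d) and a generic Euler-Maruyama step for (1a)/(14); the names are ours, and the arguments of N(0, ·) in (15d) are taken to be variances.

```python
import numpy as np

rng = np.random.default_rng(0)

def cgm_noise(N):
    """Simulate phi_k as the sum of two AR(2) processes (Eqs. 15b-15d)."""
    c = np.zeros(N)
    v = np.zeros(N)
    dc = rng.normal(0.0, np.sqrt(11.3), N)    # delta_c,k
    dv = rng.normal(0.0, np.sqrt(14.45), N)   # delta_theta,k
    for k in range(2, N):
        c[k] = 1.23 * c[k-1] - 0.3995 * c[k-2] + dc[k]
        v[k] = 1.013 * v[k-1] - 0.2135 * v[k-2] + dv[k]
    return c + v

def euler_maruyama(f, x0, u, d, sigma, dt, n_steps):
    """Simulate dx = f(x, u, d) dt + sigma dW (Eq. 1a) on a fixed grid
    (Higham, 2001); u and d are per-step input sequences."""
    x = np.empty((n_steps + 1, x0.size))
    x[0] = x0
    for k in range(n_steps):
        dW = rng.normal(0.0, np.sqrt(dt), x0.size)   # Wiener increments
        x[k + 1] = x[k] + f(x[k], u[k], d[k]) * dt + sigma @ dW
    return x
```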

4. RESULTS AND DISCUSSION

We considered six different values for the prediction horizon Np (see Table 1). For each prediction horizon, we performed 50 experiments; each experiment consists of simulating the model in (14) to generate one day (1440 min) of CGM data. For the simulation, we set σ to 0.5% of xss, in which xss is the steady state of the model. The experiments had different realizations of the measurement noise φ and the process noise dω. Then we applied the MLE and the CM method on each experiment to estimate the covariance Rφ. We initialized the UKF from the steady state of the model, with P0 = σσ^T. Table 1 and Table 2 summarize the results, which are averaged over the 50 experiments for each prediction horizon.

Fig. 1 compares the performance of the two covariance estimation methods in terms of the root-mean square error. Fig. 2 illustrates the absolute deviation of the estimated covariance from the true covariance for the two estimation methods. Fig. 3 depicts the histogram of the estimated covariance over the 50 experiments for the prediction horizon = 200 min. Fig. 4 shows the histogram of the CPU time for covariance estimation for the 50 experiments and the prediction horizon = 200 min. Fig. 5 indicates the result of applying the MLE and CM methods on an example experiment. Fig. 1 and Fig. 2 indicate that as the prediction horizon Np expands, the bias of the covariance estimate declines. This originates from the consistency property of the MLE: because the MLE is a consistent estimator, increasing the number of measurements by expanding the prediction horizon improves the estimation precision and reduces the bias. The figures also indicate that the decrease in Vrms and bias is approximately exponential for the MLE. Fig. 3 also shows that for sufficiently large Np, the MLE has considerably less bias than the CM method.

Table 1. Estimating measurement noise covariance by maximum likelihood estimation*.

Prediction horizon (min)   Rφ,true (mg²/dL²)   R̂φ      % (|Rφ,true − R̂φ|/Rφ,true)   Vrms    CPU time (s)
5                          103.8               42.3     59.3                          118.2   378.0
20                         114.0               29.6     74.0                          102.9   284.3
40                         105.9               41.9     60.4                          78.4    277.1
80                         104.9               65.3     37.7                          66.6    289.9
100                        102.6               70.8     31.0                          59.9    301.2
200                        106.9               98.8     7.5                           52.8    378.5

* The values are the mean for one-day data, which is the average over the 50 one-day experiments.

Table 2. Estimating measurement noise covariance by covariance matching*.

Prediction horizon (min)   Rφ,true (mg²/dL²)   R̂φ      % (|Rφ,true − R̂φ|/Rφ,true)   Vrms    CPU time (s)
5                          103.8               69.9     32.6                          123.3   220.5
20                         114.0               105.5    7.4                           120.9   179.2
40                         105.9               143.3    35.4                          150.2   172.7
80                         104.9               159.6    52.1                          143.6   161.0
100                        102.6               156.7    52.6                          128.0   157.2
200                        106.9               178.2    66.7                          117.1   142.6

* The values are the mean for one-day data, which is the average over the 50 one-day experiments.


Fig. 3. Histogram of the estimated covariance compared to the true covariance based on 50 experiments and the prediction horizon = 200 min.

Fig. 4. Histogram of CPU time for covariance estimation based on 50 experiments and the prediction horizon = 200 min.

Fig. 4 implies that the CPU time for the MLE is around 1.5 times greater than that for the CM estimation. However, this CPU time seems reasonable for the example application in Section 3. If the filter were optimal, the filter innovations (the one-step-ahead prediction errors) would be white and could serve as the orthogonal basis for the LDL decomposition of $R_{k+N_p}$ (Jørgensen and Jørgensen, 2007; Kailath et al., 2000). In this case, there would be no need for the Cholesky decomposition of $R_{k+N_p}$. In addition, we could dispense with the multi-step predictions, because the filter routine is sufficient to generate the innovation sequence. This would result in a computationally more efficient MLE. However, the innovation sequence is not white, for two reasons. First, we did not assume ξ in (1b) to be white noise. Second, we do not process the data with the optimal filter. The optimal filter is unknown,

because the true noise covariance is not known in the prediction window and is to be estimated. Furthermore, the assumption of optimality for the nonlinear filters, i.e., the UKF and the EKF, is not valid in general. This is due to the fact that the UKF approximates the state probability distribution and the EKF linearizes the state-space model, and both procedures make the filter deviate from optimality. As Fig. 5(c) illustrates, the CGM data filtered with the maximum likelihood estimated covariance is closer to the ideally (known covariance) filtered CGM than the CGM data filtered with the CM estimated covariance. The improvement is modest, though. This is because the process noise covariance moderates the effect of the variation of the measurement noise covariance on the filtering. When the process noise is relatively small (small σ in (14)), the filtered measurements are close to the one-step model-predicted measurements, due to the small filter gain, without being profoundly affected by the variations of the measurement noise covariance.

Fig. 5. Covariance estimation and filtering for an example experiment and the prediction horizon = 200 min. a) Estimated covariance. b) The filtered CGM data. c) The absolute deviation of the filtered CGM data from the ideally filtered CGM. The ideal filter is the UKF with the actual measurement noise covariance.

5. CONCLUSIONS

We presented an adaptive UKF by tuning the measurement noise covariance. A method based on the MLE estimates the noise covariance. The inputs of the ML objective function are the multi-step prediction errors and their covariance matrix generated by the UKF. We also compared the method with the CM algorithm, which is a suboptimal estimation technique. The results generally show that the MLE method outperforms the CM method. However, the computational cost associated with the MLE method is somewhat larger than the computational cost of the CM method.

REFERENCES

Facchinetti, A., Del Favero, S., Sparacino, G., Castle, J., Ward, W., and Cobelli, C. (2014). Modeling the glucose sensor error. IEEE Transactions on Biomedical Engineering, 61(3), 620–629.
Higham, D.J. (2001). An algorithmic introduction to numerical simulation of stochastic differential equations. SIAM Review, 43(3), 525–546.
Jørgensen, J.B. and Jørgensen, S.B. (2007). Comparison of prediction-error-modelling criteria. In Proceedings of the American Control Conference, 5300–5306.
Julier, S. and Uhlmann, J. (2004). Unscented filtering and nonlinear estimation. Proceedings of the IEEE, 92(3), 401–422.
Kailath, T., Sayed, A.H., and Hassibi, B. (2000). Linear Estimation. Prentice Hall.
Kanderian, S., Weinzimer, S., Voskanyan, G., and Steil, G. (2009). Identification of intraday metabolic profiles during closed-loop glucose control in individuals with type 1 diabetes. Journal of Diabetes Science and Technology, 3(5), 1047–1057.
Maybeck, P.S. (1982). Stochastic Models, Estimation, and Control. Academic Press, Chapter 10.
Odelson, B.J., Lutz, A., and Rawlings, J.B. (2006a). The autocovariance least-squares method for estimating covariances: Application to model-based control of chemical reactors. IEEE Transactions on Control Systems Technology, 14(3), 532–540.
Odelson, B.J., Rajamani, M.R., and Rawlings, J.B. (2006b). A new autocovariance least-squares method for estimating noise covariances. Automatica, 42(2), 303–308.
Partovibakhsh, M. and Liu, G. (2015). An adaptive unscented Kalman filtering approach for online estimation of model parameters and state-of-charge of lithium-ion batteries for autonomous mobile robots. IEEE Transactions on Control Systems Technology, 23(1), 357–363.
Åkesson, B.M., Jørgensen, J.B., Poulsen, N.K., and Jørgensen, S.B. (2008). A generalized autocovariance least-squares method for Kalman filter tuning. Journal of Process Control, 18(7), 769–779.
Särkkä, S. (2007). On unscented Kalman filtering for state estimation of continuous-time nonlinear systems. IEEE Transactions on Automatic Control, 52(9), 1631–1641.
Simon, D. (2006). Optimal State Estimation: Kalman, H∞, and Nonlinear Approaches. John Wiley & Sons.
Weige, Z., Wei, S., and Zeyu, M. (2015). Adaptive unscented Kalman filter based state of energy and power capability estimation approach for lithium-ion battery. Journal of Power Sources, 289, 50–62.
Wilinska, M.E., Chassin, L.J., Acerini, C.L., Allen, J.M., Dunger, D.B., and Hovorka, R. (2010). Simulation environment to evaluate closed-loop insulin delivery systems in type 1 diabetes. Journal of Diabetes Science and Technology, 4(1), 132–144.
Zagrobelny, M.A. and Rawlings, J.B. (2015a). Practical improvements to autocovariance least-squares. AIChE Journal, 61(6), 1840–1855.
Zagrobelny, M.A. and Rawlings, J.B. (2015b). Identifying the uncertainty structure using maximum likelihood estimation. In Proceedings of the American Control Conference, 422–427.
