Non-Conservative Trajectory Planning for Automated Vehicles by Estimating Intentions of Dynamic Obstacles

Motion planning algorithms for urban automated driving must handle uncertainty due to unknown intention and future motion of Dynamic Obstacles (DOs). Considering a single future trajectory for each DO is not adequate, especially in urban frameworks where traffic participants exhibit very different behaviors. However, including multiple candidate trajectories representing different behaviors results in an excess of conservatism. We present a novel combination of the Interactive Multiple Model (IMM) algorithm and Stochastic Model Predictive Control (SMPC) that allows non-conservative safe motion planning in presence of DOs with unknown intention. We introduce a framework based on LQR and IMM to predict multiple candidate future trajectories, each interpreted as a high-level intention that the DO is pursuing, and dynamically estimate their probabilities. Then, the future trajectory of the automated vehicle is iteratively planned in an SMPC fashion, in which collision avoidance constraints are generated for multiple future trajectories of each DO, with a focus on the most likely. Our method improves safe motion planning fully exploiting the benefits of multi-modal predictions of the DOs, avoiding excessive conservatism. Advantages of the proposed method are discussed through simulations in the CARLA environment.

plan trajectories for an automated vehicle [2], named ego vehicle (EV) in this work. In MPC, an optimal control problem is solved iteratively, yielding a sequence of inputs to the automated vehicle. The sequence of inputs is chosen by minimizing a cost function subject to satisfaction of a set of constraints. Only the first element of the minimizing input sequence is applied to the system, and the problem is solved again over a moved (receding) horizon as soon as a new measurement of the state of the system and the environment is available. In motion planning for automated driving, the cost function is designed to penalize deviations from a desired behavior of the EV (e.g., keeping a cruise speed, minimize usage of input, maximize comfort). Compared with other motion planning approaches, constraints consisting of traffic rules, respect of the vehicle physical limitations, and safety conditions are explicitly included in the optimization.
To avoid collisions between the EV and static obstacles, motion planning algorithms design fixed forbidden areas for the EV. The same approach cannot directly be used to avoid collisions with other traffic participants, here Dynamic Obstacles (DOs), since their future motion is not known and only probabilistic models of DOs are available. Therefore, in order to consider safety requirements to prevent collisions with DOs, the MPC scheme must be able to consider uncertainty.
A common approach in MPC-based motion planning for automated driving consists of predicting the future trajectory of DOs using a nominal model with additive uncertainty, representing the unpredictable component of the motion. The additive uncertainty ultimately results in a probabilistic constraint for each DO, addressed in a Stochastic MPC (SMPC) fashion. However, in predicting the future motion of DOs, the uncertainty is twofold [3], [4]. On one hand, the high-level intention that determines the behavior of the obstacle is unknown. Depending on the DO, an intention could represent the desire to reach a target position, the plan to follow a particular gait, or vehicle maneuvers like turns at intersections or lane changes. On the other hand, one cannot assume DOs to behave entirely according to a detectable rational and deterministic strategy when performing a given intention. Even assuming this, the dynamics of the DO are generally not completely known. Thus, an additional lower-level uncertainty is considered, in the form of uncertainty around the nominal trajectory performing a particular intention. Therefore, a multitude of uncertain possible trajectories is obtained for each DO, and the challenge lies in generating safety constraints non-conservatively by taking multiple possible trajectories into account.
In this work, we propose a framework to reduce conservatism by explicitly prioritizing the most likely possible future trajectories of each DO. Multiple DO future trajectories are considered using the Interactive Multiple Model (IMM) algorithm [5] combined with several LQR-based approximations. In IMM, one generally considers multiple candidate models representing the dynamics of a moving target, and an accurate state estimate and the relative probability of each model are recursively obtained. In contrast, we use LQR to design n I closed-loop models comprising the dynamics of the DO and an internal optimal controller steering the state of the DO in a way to realize a high-level intention. Thus, in this work the IMM delivers the probability of each LQR-based trajectory representing a high-level intention. Then, n I candidate trajectories for each DO are predicted over a finite horizon, also propagating trajectory uncertainty. Finally, SMPC [6] is used to plan the trajectory of the EV, generating probabilistic safety constraints for all the predicted trajectories of the DOs in the vicinity of the EV. To reduce conservatism, probabilistic safety constraints are required to hold with a probability which depends on the estimated confidence in each candidate trajectory. In such a way, higher priority is explicitly given to those future motions that are more likely to be executed, without being excessively restricted by those that are considered unlikely.
The contributions of this work are as follows: r novel combination of IMM and SMPC for motion planning in presence of dynamic obstacles with unknown intention. Such combination allows enhanced safety resulting from taking multiple intentions of DOs into consideration, still handling them non-conservatively; r IMM-based framework for multi-trajectory prediction considering twofold uncertainty, comprising unknown intention and non-deterministic motion. The remainder of the paper is organized as follows. In Section I-A related work is discussed. Section II recalls the IMM algorithm and Section III introduces the key idea of the combination of IMM and SMPC. Section IV presents the EV dynamical model and the framework for generation of candidate trajectories of DOs encoding high-level intentions. Section V gives the details of the trajectory planning algorithm. Results from the simulations in CARLA [7] involving typical urban driving environments and a discussion on the proposed method are given in Section VI and in Section VII, respectively. Conclusive remarks and future research directions are suggested in Section VIII.

A. State of the Art
Interest around trajectory planning for automated vehicles in urban environment is increasing, specifically with a focus on the challenges posed by the unknown intentions of other traffic participants. Several methods have been proposed to plan a safe and non-conservative motion for the EV based on Machine Learning [8], [9]. However, in this work we focus on model-based approaches. MPC [10] takes the evolution of the environment over a prediction horizon into account and iteratively re-plans the trajectory accommodating changes in the traffic, including constraints in the optimization.
Robust Model Predictive Control (RMPC) was first introduced to address uncertainty in the MPC scheme through a worst-case analysis [11] and has been applied to collision avoidance tasks [12], [13], [14]. Exact constraint satisfaction is guaranteed for every possible realization of the uncertainty, thus intrinsically yielding a conservative motion for the EV. In the urban framework, where a multitude of DOs without a clear future trajectory surrounds the EV, a robust approach is likely to result in standstills more often than necessary. More recently, SMPC was presented [15], [16], and applied to automated driving [6], [17]. In SMPC the probability distribution of the uncertainty is considered and a small probability of constraint violation is allowed for sufficiently unlikely scenarios. This reduces conservatism to a large extent and our recent work [18] proposed a scheme combining safety guarantees with the optimistic planning of SMPC. Uncertainty was also included in MPC schemes through Grid-based MPC [19], multistage SMPC [20], learning-based MPC [21], and considering models of different granularity for further prediction steps [22]. An extensive summary is presented in the recent work [23]. However, non-conservatively addressing twofold uncertainty is a challenge worth further investigation.
A prediction of the future motion of other traffic participants is necessary to enforce safety constraints. Prediction models are classified into three categories [24]: physics-based, maneuverbased, and interaction aware. Physics-based models only consider dynamical properties and are not reliable in the long run, whereas interaction-aware models [25] also include the reaction of the other vehicles to the future motion of the EV, but including them in the optimization is still challenging because of the increased computational complexity. Maneuver-based models are a reasonable trade-off and current state of the art [26], [27], allowing to take the possible intentions driving the behavior of the other traffic participants into account, without excessively increasing the complexity. However, being the intention unknown (especially for pedestrians and cyclists), considering a single model for prediction is limiting. Thus multiple intentions should be considered in the prediction framework and in generating safety constraints.
Among several algorithms proposed to predict the future trajectories of traffic participants, in [27] Gaussian process regression models are used to recognize the maneuver being executed by other traffic participants computing the likelihood of successive observations to several models belonging to a pre-recorded collection of classified motions. Then, a Monte-Carlo prediction method implemented as a modified version of particle filters is used to predict the future motion, since, as the authors argue, other estimations like the Kalman Filter could only be executed after that the maneuver has been recognized. However, in this work we consider multiple candidate trajectories at the same time. Furthermore, [28] used Partially Observable Markov Decision Processes (POMDPs) regarding the (unknown) intention of other traffic participants as a hidden variable. This is a very interesting approach, accounting for the change in the prediction accuracy directly in the computation of the policy. However, rather simple transition models for other traffic participants must be adopted and the method is computationally demanding because it operates in a continuous state space. Alternatively, Inverse Optimal Control and especially Inverse Model Predictive Control [29] are adopted to model and predict the future motion of human-driven vehicles, exhibiting notable accuracy. However, a prior on the features to be included in the candidate cost functions must be available and handling multi-modal motions representing different intentions could be challenging. In this work, we use the IMM algorithm for prediction and estimation of the future motion of DOs, since it allows joint estimation of the uncertainty around estimates and assessment of the probability of each mode with low computational effort and ease of implementation. However, other estimation and prediction algorithms could also be coupled with SMPC in a similar fashion.
The IMM algorithm allows to predict the future motion of objects with changing dynamics. Given a set of candidate dynamical models, it returns an estimate of the state of the object being tracked and of the associated covariance matrix, and the probability currently assigned to each of the candidate models is also estimated. The IMM is recalled in more details in Section II.
Recent works presented approaches to include multiple possible future trajectories of vehicles, each based on a different intention, with a view at avoiding an unnecessarily conservative planning. A first approach was presented in [6], dynamically estimating the probability of several maneuver-based models to represent intention uncertainty, employing the IMM algorithm for estimation and prediction. However, the motion planning models used are overly simplified, basically assuming that each DO will maintain a constant velocity or acceleration, and most importantly only the most likely model is taken into account in the trajectory planning of the EV. Although this is an improvement, since the prediction is based on the recent behavior of the vehicles and not only on a-priori assumptions, still the most likely model can change, particularly if none of the models considered perfectly matches the real dynamics. The possible sudden and frequent change in the dominant model makes the optimization framework and the resulting planned motion unreliable. A different approach is presented in [30], in which the relative confidence of multiple candidate models for the other traffic participants is estimated using POMDP, similarly to the previously mentioned work [28]. Then, the estimated probabilities are utilized as weights to obtain a mixture Gaussian distribution resulting from the combination of the multiple models, and a probabilistic collision avoidance constraint is determined based on the mixture distribution. This approach allows to effectively consider multiple future behaviors, but relying on a single collision avoidance constraint that accounts for all candidate future motions at once requires significantly over-approximating the forbidden areas for the EV, and thus induces unnecessary conservatism. Differently, the recent work [31] considers several candidate maneuvers of DOs separately, avoiding the just-mentioned over-approximation. Still, all obtained predictions are regarded as equally likely, irrespective of the estimated probabilities. Giving equal importance to all candidate predictions also renders the planned motion of the EV unnecessarily conservative, as we discuss in Section VI.
In this work, we benefit from the estimated probabilities estimated by the IMM algorithm to explicitly prioritize the multiple possible future trajectories of the other traffic participants depending on their probability. The proposed approach allows to consider multiple possible intentions of other vehicles, cyclists, and pedestrians (generally referred to as DOs) in a non-conservative fashion. In Section III, the basic concepts of the method are outlined, and details are discussed thereafter, presenting a framework to generate intention-based candidate trajectories of the DOs, and discussing how to include them into the SMPC optimization of the EV, limiting conservatism.

B. Notation
Scalar quantities are denoted in lowercase letters and symbols, z, vectors are in bold lowercase, z, matrices in bold uppercase, M , and [M ] ij indicates entry i, j of M . For every vector z and matrix M of appropriate dimensions, z 2 M = z M z, where z is the transpose of z. z k is the k-steps ahead prediction of z and z + refers to the value of z at the next sampling time.ẑ andz denote the estimate and the prediction of z, respectively. For n 1 , n 2 ∈ Z, I(n 1 , n 2 ) = {n 1 , n 1 + 1, . . . , n 2 − 1, n 2 }. The probability of an event is denoted by P[·].

II. PRELIMINARIES
In this section, the IMM algorithm for multi-modal state estimation is recalled. Although in this work the IMM is used to consider multiple candidate trajectories of the DOs, here the original formulation for different dynamical models is given [5]. Then, Section IV-B introduces a framework for design of closedloop models representing high-level intentions in a form suitable for the IMM.
The IMM algorithm can be understood as an extension of the standard Kalman Filter [32], adapted to obtain a better estimate of the state of target objects whose dynamics are described by different models over time, and is used for example in Aerospace for tracking maneuvering targets [33], as also in application related to object motion detection in ballistic [34] or automated driving frameworks [35], [36]. Instead of using just one model to dynamically estimate the state, multiple candidate models are considered. At each sampling time, the newly-collected output measurement is used to produce not only an estimate of the state and associated covariance matrix, but also the probability that each of the considered model is currently used to update the dynamics of the object being tracked. Compared to [5], here a known and deterministic input is straightforwardly taken into account. The procedure outlined is likewise run for every DO independently. Interactions between different DOs are not considered in this work.
Assume that the aim is to track an object with switching linear dynamics, meaning that at every time the state is updated according to one of n I possible models. Let consider state z, known deterministic (and possibly time-varying) input η j , and zero-mean process noise ω with covariance matrix Σ ω . All models share the same output vector γ, and the zero-mean measurement noise ν with covariance matrix Σ ν . The generic Algorithm 1: Interacting Multiple Model Algorithm.
Output: {μ j } n I j=1 ,ẑ,P j-th model (for the given DO) is in the form: The n I multiple possible models are specified by different inputs η j and different matrices F j , G j , and H of appropriate dimensions. Let Π ∈ R n I ×n I be the (row-stochastic) state switching matrix, where each entry [Π] ij defines the a-priori probability of switching from model i to model j, with i, j ∈ I(1, n I ), at any given sampling time.
Algorithm 1 [5] shows the IMM steps to update the estimate of the system state and covariance, and the probability of the considered models. At each iteration, the algorithm receives as input the estimated probability for each model from the previous iteration {μ j } n I j=1 and the new measurement γ of the DO, and returns as output the updated model probabilities {μ j } n I j=1 , and the accurate combined state estimateẑ and associated covariance matrixP . Observe that the model-specific state estimates and covariance matricesẑ j ,P j in Algorithm 1 are "internal" variables. The iteration starts with the so-called "State Interaction", in which estimates from the previous iterations are combined, considering possible switches in the dynamics. Then, n I Kalman Filters (one for every model) are fed with the combined estimates and, based on the prediction error with respect to the newlycollected measurement γ, the model probabilities are updated. Finally, a combined state estimate and covariance matrix are obtained, weighing the individual estimates of the n I filters by their probability. Since the update of the estimates is also based on results from the previous iteration, past data are considered in the computation of the estimate with relatively small effort.
Compared to a bank of n I independent Kalman Filters each considering only one of the n I different possible models, here the initial combination of model estimates is beneficial. In case of a switch, the IMM quickly corrects the estimate, with a rate depending on the a-priori switching probabilities specified in matrix Π. Moreover, at steady state, the combined estimateẑ of the IMM algorithm performs nearly as good as a Kalman Filter based on the (currently) real model [5].

III. COMBINATION OF IMM AND SMPC
The proposed method uses a combination of the IMM algorithm and of SMPC to plan the trajectory of an automated vehicle in presence of twofold uncertainty due to the DOs in the environment. Although the same approach is not restricted to automated driving and can be adopted in other frameworks to handle similar twofold uncertainties, here we stick to the example of safe motion planning for clarity.
Instead of assuming that a DO will follow a single nominal trajectory perturbed by large uncertainty to include significantly different outcomes, one can narrow down the level of uncertainty by considering a twofold structure. On a first level, each DO is assumed to be pursuing one of the possible high-level intentions (which will be precisely defined in Section IV-B), drawn from a known set of options. Then, for every considered intention, the future motion of the DO is not deterministically known, adding the second level of uncertainty. Therefore, collision avoidance constraints, such as maintaining a minimum safety distance from the DO with respect to one of the trajectories are probabilistic in nature. Thus, we adopt an SMPC approach in which probabilistic safety constraints are generated for every considered trajectory of the DO, i.e., solving the optimal control problem resulting from the current initial condition ξ 0 and from a given input sequence U . Constraints (2c) are chance constraints, in which S j is a set representation of the collision avoidance conditions, one for each of the n I candidate future trajectories of the DO. In an SMPC approach, instead of guaranteeing constraint satisfaction for every possible outcome of the uncertainty, constraints are required to hold only up to a pre-defined level of probability, 0 ≤ β j ≤ 1.
Parameter β j , sometimes referred to as safety level, is a design parameter of the SMPC method that tunes the trade-off between safety and efficiency, and is typically chosen heuristically. If β j in safety constraints (2c) is set equal for all DO trajectories independently of their probability, the motion planned for the EV is conservative. Thus, in this work β j explicitly depends on the probability of each possible high-level intention estimated by the IMM, see Fig. 1. The prioritization of safety constraints depending on the probability of each high-level intention allows to reduce unnecessary conservativeness with respect to those trajectories that are currently considered unlikely, maintaining focus on the (possibly multiple) dominant ones. For example, during an initial transient phase in which enough data for accurate inference on the high-level intention are yet to be collected, all the candidate high-level intentions are considered and prioritized; then, as soon as the measurements show that some high-level intentions are safely ruled out, the related safety conditions are gradually relaxed, focusing the attention on the most plausible future motions. Hence, very conservative motions of the controlled system planned taking all the possible high-level intentions equally into account are avoided. Nevertheless, the probability for each trajectory is recursively updated every time new measurements are collected. Thus, if the DO changes intention and starts moving unexpectedly, the previously excluded motions are included again in the optimization with higher priority, taking full advantage of the multi-modal prediction.

IV. DYNAMICAL MODELS
In this section, the dynamical model used for prediction of the future states of the EV is given, and a framework for design of multiple candidate trajectories of DOs each resulting from a high-level intention is introduced.

A. Ego Vehicle Model
For completeness, the model of the EV adopted in the optimal control problem (2) is outlined. The well-known kinematic bicycle model [37] is used, since it allows to take into account the coupling between the longitudinal and lateral movements for a sufficiently realistic planning, without adding unnecessary details for the scope of the analysis proposed in this work, whose ultimate goal is to deliver a feasible trajectory. We assume the presence of a low-level module in charge for tracking the generated trajectory, where further details about the vehicle dynamics can be more appropriately addressed. Observe that the presented novel IMM+SMPC combination is not restricted to the use of the kinematic bicycle model, which we adopt because sufficiently accurate for the scope of our simulations. More precise models, particularly for high speeds, could also be used without compromising the efficacy of presented approach.
The nonlinear kinematic bicycle model is expressed in the road-aligned (Frenet) frame, that is with respect to a possibly curved reference path, represented in Fig Assuming that the reference path and its features are known to the EV, let κ(s) be the local curvature of the reference path expressed as a function of the curvilinear coordinate s. The differential equations of the kinematic bicycle model are obtained along the lines of [38], although adapted to represent the vehicle with respect to the center of mass (and not with respect to the center of the rear axles). The nonlinear model iṡ where α = arctan l r l f + l r tan δ .
Assumption 1: During the motion, the position (s, d) of the EV is such that d = 1 k(s) . Assumption 1 is meant to avoid singularities in the model dynamics (3) and is not restrictive, since standard curvature values for vehicle roads are small and the controller guarantees that the lateral displacement d from the reference is limited, so that typically d 1 k(s) . To predict the state of the EV within the prediction horizon, a linear and discrete-time equivalent sufficiently reliable in predicting the future evolution of the EV dynamics in the proximity of the current state is obtained, see [18]. At first, the nonlinear dynamics (3) are linearized around the current state where the Jacobian matrices are Obtaining the couple of matrices (A d , B d ) from (A l , B l ) with zero-order hold of sampling time T , the linearized model (4) is discretized as To make the computation tractable, the derivation of matrices of the discrete-time system is performed under the approximation that κ (s 0 ) = 0, consistently with the assumption that the reference path was designed sufficiently smooth. The explicit formulation of A d and B d is given in Appendix A. Model (6) is used to predict the future states at step k = 1, . . . , N, where N is the prediction horizon, based on the current state ξ 0 and on the input sequence over the prediction horizon, u k , k = 0, . . . , N − 1. Section V-B introduces the procedure to derive safety constraints preventing collisions with DOs with unknown intention, which is the main contribution of this work. However, even in absence of DOs, the EV must obey the following constraints, related to static traffic rules or to physical limitations of the vehicle. The set of constraints is where Δu k = u k − u k−1 is the input rate. (7a) is designed to ensure that the shape of the EV, of width w EV , never exceeds the allowed driving area, whose limits (in the road-aligned frame) are d min and d max . (7b) prevents the EV from driving backward or faster than the maximum allowed speed, v max . (7c) and (7d) make the trajectory planning algorithm aware of the of the actuators limits of the EV in terms of minimum and maximum acceleration and steering angle and of the minimum and maximum rate between two consecutive steps.

B. Dynamic Obstacle Models
Here we introduce closed-loop dynamical models used to describe possible behaviors resulting from the unknown intention of the other traffic participants in a suitable form to be included in the IMM algorithm. Either pedestrians, cyclists, or other vehicles are referred to as DOs. We use simplified models with the ultimate goal of representing different intended behaviors of DOs. Although more precise prediction models representing the motion of traffic participants in more detail could also be adopted without compromising the validity of the presented approach, a precise description of the motion of traffic participants is beyond the scope of this work. The aim of models here derived is to express different possible behaviors of traffic participants with a view to considering multiple future trajectories of the DOs in the EV planning in a non-conservative fashion, as addressed in the next section.
Each DO is subject to twofold uncertainty. On one hand, the intention driving the behavior of each DO is assumed unknown, and for each of the n O DOs n I possible "high level" intentions are considered. The choice of the n I candidate behaviors is worth a special discussion, which is addressed later in this work; here the set of possible behaviors is assumed to be given. On the other hand, each intention is considered to be an approximation of the real behavior of the DOs, therefore trajectory uncertainty around each candidate intention is also accounted for. For simplicity of notation, in the following a single DO is considered, although the same holds for each of the n O DOs, independently of the number.
The DO is represented by , where x and y are the absolute cartesian coordinates of the position of the DO in the world frame, and v x and v y the longitudinal and lateral velocity, respectively. Considering input u o ∈ R 2 and disturbance ω ∈ R 2 , the motion is assumed to be described as a simple point-mass model with decoupled longitudinal and lateral dynamics as in [3] Authorized licensed use limited to the terms of the applicable license agreement with IEEE. Restrictions apply.
where, defining T the sampling time, we have On top of that, the DO is assumed to behave in a way which is modeled through an LQR controller tracking one of a range of n I possible intentions. With the term intentions, we refer to specific goal behaviors, like tracking a given lateral or vertical position, or maintaining a precise speed in a direction.
Definition 1: An intention z * is defined as a goal target state for a DO.
Remark 1: Observe that intentions are not time-varying reference trajectories, but rather desired steady-state values for the states, fixed over time. For example, a lane change and acceleration maneuver would be formulated as a target lateral position (to steer the DO at the center of the desired lane) and a desired speed.
Thus, the DO applies an input u o chosen as that is, as a feedback term to steer the state z of the DO according to one of the intentions z * j , j ∈ I(1, n I ). The DO is supposed to plan its trajectory depending on some internal optimality criterion, therefore the feedback gain K j is selected as the solution of an LQR problem with weighing matrices The feedback gain K j minimizing (11) is obtained as whereP is the only positive semi-definite solution to the Algebraic Riccati Equation [39] Remark 2: For a given intention z * j , matrices Q j and R j specify the penalty to be paid for the error in each state component with respect to the others or with respect to the penalty for large inputs. Therefore, Q j and R j are used to tune the relative importance in the speed of convergence of each component to the desired state. The future behavior of a DO is fully specified only if the tuple (z * j , Q j , R j ) is given, i.e., if both the intention and the relative weighing matrices are known.
Not only is it possible to tune the relative importance in the speed of convergence, but the evolution of a specific state variable can also be completely ignored by assigning zero to the associated entries in Q j . This allows, for example, to only specify a cruise speed in x-direction. A target x-position would necessarily be time-varying or imply that the ultimate goal of the controller is to lead the DO to a fixed position, possibly contradicting the commanded cruise speed.
Substituting (10) into (8), gives where z j represents the predicted state of the DO assuming it is following trajectory j. Then, by defining the closed-loop state matrix F j := A o + B o K j and input matrix G j = B o , the evolution of each DO is formulated as in (1), where η j = −K j z * j is the input applied to the closed-loop representation of the DO assuming it is pursuing intention j. Since intentions z * j are constant, η j is also constant over the prediction horizon.
Vector γ in (1) represents the collected measurements of the position of the DO, where which is the same for every model, and ν ∼ N (0, Σ ν ) represents the measurement noise. (15) implies that the EV only relies on position measurements of the DO. This is an advantage over other approaches, since an estimate of the full state is produced by the IMM algorithm.
Observe that in using models of the form (1) to predict the future trajectory of the DO, uncertainty on the intention is accounted for by considering multiple models for the DO, and uncertainty about the execution of each given intention is represented by the process noise ω ∼ N (0, Σ ω ).
Then, for each DO an IMM is run to dynamically estimate the probability that the evolution of the trajectory will be the result of each one of the n I possible considered intentions. It is worth emphasising again that the set of possible future intentions (and weighing matrices) is assumed to be specified in advance. On one hand, this is a limitation of the method, which will be discussed in Section VII. On the other hand, this allows to derive closed-loop models for DOs completely offline, storing n I tuples (F j , G j , H, η j ) independently of the number of DOs that will be encountered. Then, any time a new DO enters the range of the sensors, a new IMM filter is created, using the stored closed-loop models.

V. TRAJECTORY PLANNING ALGORITHM
In this section the trajectory planning algorithm is outlined, discussing the multi-modal prediction of future trajectories of the DOs surrounding the EV, how non-conservative safety constraints are generated, and formulating the optimal control problem solved at every iteration of the SMPC algorithm. Finally, a short discussion of potential limitations of the generated constraints and possible improvements are proposed. Again for simplicity of notation and without loss of generality, in Sections V-A and V-B a single DO is considered.

A. Multi-Modal Prediction
For every DO, the last combined estimateẑ obtained by the IMM algorithm is used to initialize n I predictors, each iterating dynamics (1) for N steps. This yieldsz j k , that is, the predicted state of the DO k-steps ahead, if trajectory j is followed, for every considered trajectory j ∈ I(1, n I ), for every future step k ∈ I(1, N). Together with the state, the covariance matrix is also propagated along every possible trajectory of the DO, yielding the uncertainty around each state prediction. Precisely, given linear models of form (1), the covariance matrix Σ j k of the predicted statez j k of the DO at time step k if trajectory j is adopted is recursively obtained as [6] starting from the initialization Σ j 0 =P , ∀j = I(1, n O ), wherê P is the combined covariance matrix for the current step (zero) delivered from the IMM algorithm tracking the DO. Input η j in (1) is deterministic and therefore does not contribute to the propagation of the uncertainty.
Remark 3: All the predictors of the DO are initialized with the (same) combined estimateẑ and combined covariance matrix P , because this is the most accurate information available. Initializing each predictor with the estimateẑ j and covariance matrixP j resulting from each filter j in Algorithm 1 would deteriorate the performance.
As a result, for every time step k ∈ I(1, N) in the prediction horizon, n I × n O predicted states with associated covariance matrices are available, representing the n I possible future positions for each of the n O DOs surrounding the EV, see Fig. 3. Next, this information is used to design non-conservative safety constraints.

B. Safety Constraints
In this section the generation of safety constraints preventing collisions of the EV with DOs is introduced. User-defined (deterministic) conditions, such as maintaining a minimum distance from the DO, result in a set S of allowed (safe) states for the EV, so that constraints in the form (2c) are considered. Such constraints are chance constraints because the future position of the DOs is not known deterministically, and therefore also the set S of safe states for the EV is uncertain. In the following, based on the nominal predictions and propagation of the uncertainty derived in Section V-A, chance safety constraints for the trajectory planning for automated urban driving are designed and reformulated into deterministic expressions that are included in the numerical optimal control problem discussed in Section V-C.
For every predicted step k ∈ I(1, N), n I safety constraints are generated, one for every candidate trajectory of the DO. The aim is to consider multiple possible future trajectories of the DO resulting from different intentions, avoiding excessive conservatism. Thus, constraints must be designed to: r increase conservativeness for increasing confidenceμ j in trajectory j for the DO (first level of uncertainty) and for increasing uncertainty around the nominal predicted trajectoryz j k (second level of uncertainty); r vanish if trajectory j is ruled out for the DO, i.e.,μ j → 0.
In this work, the safety condition is formulated using an ellipsoidal region around the position of the DO, that the EV must not enter, as in [18]. Such a requirement needs the future position of the DO to be assessed, which is stochastic. Thus, we derive regions around each of the n I possible nominal trajectories of the DO designed to contain the (real) future position of the DO with probability β j [6]. We are only interested in the position of DO, therefore the reduced statež j k = [x j k , y j k ] is considered, which is modeled by a bivariate Gaussian distribu- x,k,j , σ 2 y,k,j ) extracted from the full nominal predictionz j k and full covariance matrix Σ j k , respectively. The ellipsoidal regions designed to contain the position of the future state of the DO with probability β j consist of all the positionš z j k satisfying ζ is the tolerance level and depends on the required level of probability through ζ(β j ) = −2 ln(1 − β j ) [18]. Observe that the regions described by (17) increase for larger uncertainty (σ 2 x,k,j and σ 2 y,k,j inΣ j k ) and for a higher required probability of constraint satisfaction β j ; furthermore, for β j → 0 these regions collapse to points, that is, the constraints are neglected in practice.
In general, the required probability of constraint satisfaction β j for a given chance constraint is a design parameter in the SMPC algorithm, which is set to balance between safety and efficiency. The larger β j , the safer and the less efficient the motion planning. Therefore, a reasonable choice of β j should allow for a small probability of constraint violation that does not compromise safety, yet reducing the conservatism typical of robust control approaches at the same time. In our method, the goal is to require stricter safety guarantees for those constraints related to the most likely future trajectories of the DO. Hence, β j is selected explicitly depending on the estimated probability of each future trajectoryμ j through the function β j = g(μ j ).
In principle, the identity β j =μ j satisfies the premises on the constraints stated at the beginning of the section and thus is a good candidate for function g(p), but other options are possible. For example, choosing g(p) = p ϕ , with 0 < ϕ < 1, the effect is that the required level of safety for relatively large probabilities μ j is increased, still allowing for a sudden loss of relevance of the constraints for very unlikely predicted trajectories, see Fig. 4. In general, the function g(p) is a design parameter which must satisfy This introduces flexibility in the method, as the rate of change of β j with respect toμ j can be shirked or stretched unequally across the interval p ∈ (0, 1). However, then values β j do not represent true probabilities anymore. Remark 4: If the confidence in a trajectory approaches 1, the safety areas grow up to the entire support of the disturbance. Therefore, for unbounded uncertainties, like the Gaussian case, a thresholding is needed to prevent the constraints from becoming excessively large.
To conclude the discussion on the safety constraints, observe that the forbidden areas defined in (17) only refer to the position of the center of the DO. However, for collision avoidance, the physical dimensions of the EV and of the DO must be included. For this reason, each safety area is expanded to account for the deterministic length l o and width w o of the EV and of the DO (independently of the prediction model used), obtaining an ellipse with semi-major axis a j k and semi-minor axis b j k defined as Remark 5: In [18], the ellipse sizes are chosen asǎ j k = σ y,k,j ζ(β j ) + l o . However, this option would not satisfy the requirement of "vanishing" constraints forμ j → 0, i.e., it would not rule out any constraint also for unlikely predictions, thus resulting in a conservative motion. Furthermore, l o and w o can be artificially increased with respect to the actual dimensions of the EV and of the DO, including additional margins, making the framework arbitrarily safe also using (19).
In summary, safety requirements deliver n I constraints as with a j k and b j k from (19) with β j = g(μ j ), fixed over the whole prediction horizon. As the EV model is referred to the roadaligned components, constraints are generated considering the projection in the Frenet frame of the predicted positions of the DO. If n O DOs surround the EV, for each step k in the prediction horizon n O × n I constraints are generated.
Remark 6: Constraints (20) are quadratic non-convex expressions. Equivalent representations linearized around the current working point would violate the requirement of neglecting the constraints if the trajectory is very unlikely (μ j → 0), and thus are not adopted.

C. Optimal Control Problem
Considering the feasible state set Ξ and input U for the EV dynamics (7), and state weighing matrices Q = Q 0 and P = P 0, and input weighing matrices R = R 0 and S = S 0, at every sampling time, a sequence of control inputs U = [u 0 , . . . , u N −1 ] for the EV is computed, where N is the prediction horizon, by solving where (21b) is a compact representation of (6), and (21e) refers to safety constraints (20). The cost function (21a) consists of terms penalizing large inputs and large rates of change in the input, where u k−1 is the last applied input, which are included to boost comfort in the motion planning. (21a) penalizes also large deviations Δξ = ξ − ξ * of the EV state from the desired reference state ξ * , consisting of a reference speed and zero lateral displacement and yaw angle with respect to the road. Regarding the computational complexity of the proposed method, the main bottleneck is the numerical solution of the optimal control problem (21) which, because of quadratic safety constraints (21e), is not a quadratic program with linear constraints. However, observe that most of the quantities in (21) are computed before the numerical optimization. The IMM algorithm and the multi-modal prediction of the future trajectories of DOs and the error propagation are relatively fast operations that are performed online when the new measurements are collected, before the optimization starts, since the tuples (F j , G j , H, η j ) describing each considered trajectory are available offline. Furthermore, since such procedure for constraints generation is run for each DO independently, parallelization is possible. Moreover, the matrices of the linear EV model in (6) are also quickly evaluated online before the optimization starts, since their structure is known analytically and only depends on the current state of the EV and on the current curvature of the reference path.

D. Improvements in the Safety Constraints
Chance constraints described in the previous section might suffer from rapid variability due to the dependency on the probability assigned to each trajectory, which can vary repeatedly and abruptly during a transient phase in which the information collected through the measurements is not enough yet. Constraints changing significantly and frequently between consecutive iterations of the SMPC algorithm are not ideal, because they result in unreliable and therefore sub-optimal planning. The main reason for this comes from the idea of using still untrustworthy information to draw conclusions. To mitigate the effect, two solutions are proposed.
A first approach to limit the variability in the constraints consists in substitutingμ j with the average value over the last t instants, i.e., β j = g( 1 t 0 h=−t+1μ j h ). By doing so, too rapid and frequent fluctuations are attenuated, making the planning more "suspicious", so that it takes longer for the EV to trust the prediction and dare to react consequently when a single high-level behavior emerges as dominant. Heuristics can be used to choose a reasonable value fort to rule the trade-off.
Alternatively, a second idea to prevent the EV from taking contradicting decisions over consecutive steps due to the rapidly changing prediction environment consists in adding a further constraint to the first predicted step. The additional constraint is also elliptical, centered in the weighted average of the positions of the first predicted states of the DO n I r it eventually vanishes when one trajectory becomes dominant, allowing the EV to dare overtaking. Observe that with this advance, when focus is on a trajectory which is currently considered as the most likely, the other options are simply temporarily given lower priority. As required by the premises of the method presented in this work, no option is permanently ruled out. As soon as uncertainty around the trajectory selection increases again, the additional constraint for the first prediction step starts indeed inducing more caution consequently.
Remark 7: The additional constraint centered in the average predicted state of the DO might also be enforced to further prediction steps. However, in principle a constraint in the first predicted step only is enough to induce caution, as in an MPC fashion only the first input of the sequence is applied.

VI. RESULTS
The behavior of the EV controlled by the proposed algorithm is simulated in CARLA, to discuss the properties and benefits of the novel combination of the IMM and SMPC. We compare the performance with two similar approaches presented in previous works: [6], in which the probability of each candidate trajectories of the DOs is estimated but only the most likely is accounted for in the motion planning of the EV, and [31], in which multiple candidate trajectories are considered and regarded as equally likely. Urban environments with significant uncertainty concerning the future trajectories of dynamic obstacles are considered. Although a simplified and high-level model of the EV (3) is used to plan the trajectory, the EV position is updated applying the determined input to a more detailed dynamical model. For a realistic physical representation of the involved agents, we rely on the CARLA library. CARLA is an open-source simulator for autonomous driving research, and we implemented our algorithm in Python using publicly available libraries. The optimal control problem (21) is solved using the optimize.minimize solver from scipy. Numerical values and further details are given in Appendix B. All simulations are shown at https://youtu.be/ZzJ_h71ccOk.

A. Interaction With Absentminded Cyclist
As simulation scenario, we consider a straight road, with a sidewalk to the right, and an intersection to the left, as depicted in Fig. 3, part of the simulation environment "Town01" in CARLA, see Fig. 5. The EV must proceed straight on the road, considering the uncertain future motion of a cyclist, initially on the sidewalk. The three considered candidate intentions of the cyclist are: A) continue on the bike lane on the sidewalk, B) move to the road and proceed straight on the right of the lane, C) invade the road to eventually make the left turn. The challenge for the EV is to accommodate possibly sudden and hazardous moves of the cyclist (a lane invasion), without inducing an excessively conservative behavior.
In the simulation, we consider a possible mismatch between the models used to predict the future trajectory of the DOs and their actual dynamical properties. Precisely, the cyclist motion is regulated by two PID controllers, one used to regulate the speed to the desired target value, one in charge for the lateral position. Therefore, all candidate future trajectories considered by the EV are imprecise. Furthermore, during the initial phase, the target lateral position of the cyclist is repeatedly changed, simulating a possibly absentminded behavior challenging to deal with, since at first it is unclear whether it is actually starting a maneuver to invade the EV lane or just proceeding erratically on the sidewalk. The method is tested for both cases, i.e., if the cyclist eventually invades the lane and if it stays on the sidewalk, showing the capability of the novel combination of IMM and SMPC in dealing with mode uncertainty avoiding excessive conservatism.

1) Efficient Planning Considering Multiple Intentions:
In the first simulation, the cyclist starts demonstrating an unclear behavior and eventually leaves the sidewalk and invades the lane. The estimated probability for the three intentions, dynamically updated by the IMM based on the position measurements of the cyclist, are depicted in Fig. 6. After an initialization with a-priori equal probability, the trajectory on the sidewalk (A) is shortly recognized as dominant, although with great fluctuations in the trajectory probability. Oscillations in the estimated probabilities are due to the fact that the actual behavior of the cyclist does not match any of the candidates trajectories considered by the EV; consequently, the confidence in each of them necessarily changes repeatedly as new measurements are collected. Thus, the EV regards a possible cut-in of the cyclist as sufficiently likely and maintains a safety distance. In fact, since the size of the safety ellipse around each candidate predicted trajectory of the cyclist is directly obtained from the estimated probability for that candidate trajectory, the safety ellipses of multiple trajectories (straight on the sidewalk and lane invasion) are considerably large. Consequently, the set of feasible positions for the EV is so restricted that the EV is forced to proceed slowly, to be able to safely remain behind the cyclist in case a lane invasion takes place. Eventually, when the cyclist actually moves to the lane (after approximately 4.2 seconds), the EV is sufficiently far and the safety distance is maintained, revealing the enhanced safety resulting from considering multiple candidate intentions of the cyclist. Observe that the estimated probabilities of the trajectories eventually converge to their steady-state value given the a-priori switching probabilities.
In the second simulation, at first the cyclist shows a similar absentminded behavior (for the first 4.2 seconds), but eventually remains on the sidewalk. Once again, as long as the intention of the cyclist is unclear, the EV is forced to a cautious behavior, so as to be ready for a possible lane incursion by the cyclist, which is regarded to have a non-negligible chance. Thus, the EV is prevented from taking risky decisions while the collected information is not reliable enough to draw conclusions. Nonetheless, at around 4.2 seconds there is a critical change: the motion of the cyclist becomes more regular and the IMM consequently starts steadily estimating that the cyclist will remain on the sidewalk, see Fig. 7. As soon as it appears clear that the cyclist will proceed straight and will not invade the lane, the other candidate trajectories (B and C) gradually exit the optimization framework of the EV, which hence dares accelerating. Therefore, the proposed combination of IMM and SMPC allows to consider multiple candidate trajectories for enhanced safety as far as they are sufficiently plausible, without resulting in unnecessary caution if one candidate trajectory is clearly dominant.
Observe that, as previously mentioned, in our simulations all models used to run the IMM and to predict the nominal trajectories of the cyclist are only approximations of real behaviors, i.e., none of the designed models perfectly matches the dynamics of the cyclist used in the CARLA simulator, which is actually controlled by two PIDs. Nevertheless, the goal of the models is to identify trends corresponding to different high-level intentions, and each trajectory obtained is considered to be a nominal realization only. Therefore, additional safety margins are added, and model discrepancies are included in the second level of uncertainty, i.e., uncertainty around the nominal predicted trajectory (for a fixed intention).
2) Comparison to State-of-the-Art Approaches: We now propose comparison simulations with two methods inspired by previous works, that do not consider multiple possible future trajectories for the DOs or do not prioritize them. In simulation 3, the EV only takes the (currently) most likely intention of the cyclist into account, with an approach similar to [6]. The same set of candidate intentions of the cyclist is considered and for each the probability is assessed. However, safety constraints are generated only with respect to the currently most likely candidate trajectory of the cyclist. In this simulation, for the first phase the EV, despite the large uncertainty in the trajectory selection, trusts the cyclist to remain on the sidewalk, because this is the dominant mode, although with highly fluctuating confidence. However, at around 4.2 seconds the cyclist starts moving toward the lane and candidate trajectory B gains focus, since the IMM detects another dominant intention. Still, now the EV is too close to the cyclist and thus reacts with an emergency braking and steering maneuver. An excess of confidence in the single (at present) most likely trajectory is limiting and multiple candidate trajectories should be accounted for.
Nevertheless, merely considering all candidate trajectories proves inefficient. In simulation 4, all candidate trajectories of the cyclist are taken into account with equally large safety areas, see Fig. 8(a). This approach is inspired by [31]. Initially, the EV behaves similarly to when controlled with the IMM+SMPC algorithm, that is, it maintains a distance and avoids overtaking at first. Yet, even when the cyclist starts moving regularly, clearly showing the intention of staying on the sidewalk, in this case the EV still does not dare accelerating and overtaking. In Fig. 9, the speed of the EV is depicted. After a short acceleration at the beginning, the EV shortly starts slowing down and gradually reaches the speed of the cyclist. As a matter of fact, if all models are taken into account irrespective of their estimated probability, the future trajectories consisting of a movement toward the lane are never ruled out and thus the EV stays in the distance, slowing down. Equally weighing all candidate future trajectories results in an overly conservative motion planning for the EV, preventing the method from application in real traffic scenarios. Using the IMM+SMPC method, trajectories of the DOs currently considered unlikely result in constraints that can Fig. 9. Speed of the EV during the comparison simulation in which the cyclist remains on the sidewalk and collision avoidance constraints are enforced for all three candidate trajectories of the cyclist irrespective of the estimated probability. be neglected in practice, without overly restricting the motion of the EV if unnecessary, see Fig. 8(b).
3) Number of Candidate Future Trajectories: Finally, we discuss the role played by the number of candidate trajectories of the cyclist considered in the IMM through additional simulations. Given that each candidate trajectory is meant to represent a specific intention of the cyclist, including more trajectories is intuitively beneficial. However, there are a few shortcomings that must be taken into account. We test the IMM+SMPC method considering 9 candidate trajectories for the cyclist. For each of the three candidate trajectories initially considered (straight on the sidewalk, lane invasion, and left turn), two more are now included, resulting from different rates of convergence to the same target state, see Appendix B.
In simulation 5, the cyclist eventually invades the lane, whereas in simulation 6 the cyclist eventually remains on the sidewalk. Considering multiple candidate trajectories to represent relatively similar motions is not beneficial because it results in unnecessary indecision about the trajectory of the cyclist. Fig. 10 shows the estimated probability for each candidate future trajectory of the cyclist during the simulation in which it eventually invades the lane. Although the behavior of the cyclist is the same as in the previous simulations, due to the many candidate trajectories considered, in this case there is persistent indecision about the future trajectory of the cyclist, with the most likely trajectory only estimated at around 0.2 even after the initial phase with the absentminded behavior. As a result, the safety regions around the predicted future positions of the Fig. 11. Speed of the EV during the simulation in which the cyclist remains on the sidewalk, considering 9 candidate trajectories of the cyclist consisting of different target speeds or different convergence rates.
cyclist are rather small, see Section V-B, and the optimization framework becomes very complex. For example, depending on the predicted future positions of the cyclist, it might happen that at a certain iteration the EV can find a feasible path in a given direction and accelerates, whereas at the next sampling time the framework is so different that the EV needs to suddenly slow down and turn. These sudden changes result in a trembling behavior for the EV, visible for example at around 4 seconds of simulation 5 and at around 6 seconds of simulation 6.
Similarly, we test the method in another setup with several candidate trajectories for the cyclist. In these simulations, further candidate trajectories with respect to the original three are designed considering different target speeds, also resulting in 9 candidate trajectories overall. Simulation 7 shows the lane invasion, whereas in simulation 8 the scenario in which the cyclist eventually remains on the sidewalk is depicted. In both simulations a similar trembling behavior can be observed. Furthermore, in this case the safety areas are so small and displaced sufficiently distant from one another that the EV always manages to find a feasible path within the areas to be avoided. Thus, the EV never even slows down. Fig. 11 shows the speed profile of the EV in case the cyclist eventually remains on the sidewalk, comparing the performance depending on the considered candidate trajectories of the cyclist. If the 9 candidate trajectories are based on different convergence rates to the same target states, the EV starts accelerating but needs to slow down shortly after 3 seconds, in order to keep a safety distance from the cyclist even in case of a possible lane invasion, until the incursion is completely ruled out (red curve in the figure). Conversely, if the 9 trajectories are based on different target speeds of the cyclist, the EV basically does not even slow down (blue curve). In the latter case, although 6 of the 9 considered trajectories of the cyclist are partially located on the lane, their safety areas are so small that the EV basically does not consider them, because none of trajectories is regarded likely enough to generate a meaningful safety area.
To allow the EV to benefit from the IMM+SMPC method even in presence of many trajectories very similar to one another, the safety area should be made artificially very large also for small probabilities. However, a saturation mechanism should then be added, avoiding unreasonably large safety areas if one model perfectly matches the data and its estimated probability increases. We compare our novel IMM+SMPC algorithm with the two comparison methods from [6], [31] considering two metrics. On one hand, we evaluate the average computation time for each iteration of the IMM+SMPC method for the discussed simulations; specifically, in each iteration, we include the update step of the IMM, the multi-modal trajectory prediction algorithm and the generation of safety collision avoidance constraints, and the solution of the optimal control problem. The average computation time is a primary concern for real-time applicability of the proposed algorithm, as an update must be performed at each sampling time, iteratively re-planning the optimal trajectory of the EV. Secondly, to quantitatively compare the performances in terms of deviations from the desired cruise speed and lateral position of the EV, and in terms of usage of the input, we evaluate the average running-cost of the optimal control problem (21). Precisely, for simulation steps t = 1, . . . , N sim , and for weighing matrices Q, R, and S chosen as in the SMPC cost function used in the simulation, we define: The cost (22) is a natural choice as a metric to rank the performances of the considered algorithms, given that are all obtained as solution of an optimization problem aiming at minimizing such a cost. Performances are summarized in Table I. Observe that, although the EV assumes the cyclist to behave stochastically, the actual cyclist motion in the simulations here is deterministic. Therefore, the reformulation of chance safety constraints in the SMPC optimal control problem (21) yields always the same constraint, thus running each scenario once is sufficient. For all considered methods, the average computation time is significantly shorter than the sampling time T = 0.2 s used. Focusing on the scenario in which the cyclist eventually invades the lane, we observe that the method from [6], which only considers the currently most likely prediction of the cyclist, yields a computation time similar to the IMM+SMPC method with the initial 3 candidate trajectories, but a noticeably smaller cost. The explanation for the lower cost lies in the fact that by considering only the most likely future motion of the cyclist, the EV basically ignores the cyclist for most of the simulation and only takes it into account when the relative distance is too small to prevent the collision (which is not considered in the computation of cost J sim ). Meanwhile, increasing the number of candidate future trajectories of the cyclist slightly reduces the cost (around 5% of reduction), at the price of considerably increasing the average computation time (between 10% and 20%).
In the second simulation scenario, in which the cyclist in the end remains on the sidewalk, the average computation time is generally shorter than in the first scenario. As a matter of fact, in this case at last the cyclist is predicted to occupy areas outside of the road boundaries, therefore the collision avoidance safety constraints are always satisfied in practice and the optimal control problem is solved quickly. Compared to our proposed IMM+SMPC method with the initial 3 candidate trajectories, the comparison method similar to [31], in which future candidate trajectories are regarded as equally likely, results in a slightly shorter average computation time, but in a 60% increase in the cost. Indeed, without prioritization of the candidate trajectories, the EV never excludes the possible lane invasion, even when the cyclist clearly shows a different intention. Consequently, the EV continues to proceed unnecessarily slow, without daring to speed up, placing excessive focus on an unlikely outcome. Furthermore, we observe the possible shortcoming in the usage of several candidate trajectories. If the 9 candidate trajectories based on different convergence rates are considered, both the average computation time and the cost J sim marginally increase. However, this is noticeably not the case if we consider the 9 candidate trajectories for the cyclist based on different target speeds. The reason for this lies in the fact that in this case the EV can always find a viable path and all multiple predicted trajectories of the cyclist are essentially disregarded, as previously discussed, compromising the efficacy of the method.

B. Overtaking Maneuver in Uncertain Traffic
Furthermore, we test our algorithm in a different setting, specifically in a traffic scenario including the interaction with two Target Vehicles (TVs). We consider a three-lane road, part of the simulation environment "Town05" in CARLA, in which the two TVs are both initially on the right-most lane, both located in front of the EV, which approaches from the center lane. The EV must eventually overtake both TVs, who are driving slower, avoiding overly conservative maneuvers when unnecessary. TV1 proceeds on the right most lane regularly, whereas TV2 exhibits a trembling behavior. Such irregular behavior of TV2 could be just an erratic driving on the right-most lane due to distraction of the driver, but could also be understood as a possible initial movement to the left lane to eventually overtake the slower TV1. For both TVs we consider three candidate future trajectories, consisting of executing a lane change to the left, keeping the current lane, and lane change to the right. Also in this case, the mismatch between the models of the DOs assumed by the IMM and the physical actuation is considered, controlling the TVs with PD controllers. The irregular behavior of TV2 is realized along the lines of that of the cyclist in the previous simulations. Observe that in this scenario the lateral motion plays a major role in the trajectory planning of the EV.
We first consider the case in which TV2, after some hesitation causing an ambiguous movement, changes lane to overtake TV1. The estimated probabilities for the two TVs in this scenario are shown in Fig. 12. The candidate maneuver lane keep for TV1 is quickly recognized as dominant and the corresponding probability is steadily the most likely. Conversely, the trembling behavior of TV2 is reflected on larger fluctuation in the estimated probabilities. Although lane keep is recognized as the most likely maneuver, the possibility of lane change left is also nonnegligible and eventually the latter is recognized as the dominant mode. Videos 9 and 10 represent this scenario both if the novel IMM+SMPC algorithm is used to control the EV and if the comparison method considering only the most likely maneuver of DOs [6] is adopted. Thanks to the IMM+SMPC combination, the EV can pre-account for a possible lane change maneuver of TV2 before TV2 finally initiates a lane change maneuver. Therefore, when the lane change finally occurs, the EV can still guarantee satisfaction of the safety conditions by moving to the left-most maneuver, without the necessity to slow down. Conversely, if the comparison algorithm from [6] is used, the EV trust TV2 to remain on the right-most lane and does not ponder a possible lane-change maneuver until too late. The difference is evident by comparing the planned trajectory of the EV in the two cases. Fig. 13 represents the traffic configuration and the planned trajectory of the EV at simulation time t = 4.40 s, that is, when the lane change has not been clearly initiated yet and TV2 still lies within the right-most lane. Although still relatively unlikely, the IMM+SMPC allows to weigh in a possible lane change maneuver and therefore the planned trajectory is shifted to the left at the end of the prediction horizon (see Fig. 13(a)), i.e., when the EV is predicted to get closer to TV2, in order to observe the safety distance in case TV2 should actually initiate to move to the left. Thus, when this actually happens, the EV can smoothly continue to shift to the left and eventually overtakes TV2 from the left lane, without decelerating. Vice versa, the comparison method from [6] only accounts for the most likely future trajectory of TV2 (lane keeping), and completely neglects a possible lane change; thus, the planned trajectory of the EV continues straight also at the end of the prediction horizon ( Fig. 13(b)). When TV2 moves to the center lane, the EV is too close to just safely move to the left-most lane, thus steers emergently.
Then, we simulated the same traffic scenario, but at the end of the ambiguous phase TV2 eventually keeps the right-most lane. The estimated probabilities for TV2 in this case are represented in Fig. 14; the behavior of TV1 is unchanged, therefore the estimated probabilities are as in Fig. 12(a). We test both the IMM+SMPC combination and the comparison algorithm from [31], which takes all candidate trajectories into account equally; the simulations are represented in videos 11 and 12, respectively. Using the IMM+SMPC combination, the planned trajectory while TV2 behaves ambiguously is consistent with the former case. At first, a possible lane change of TV2 is accounted for and thus the planned trajectory of the EV shows a temporary shift to the left for those instants in the horizon in which the EV is predicted to get close to TV2. Fig. 15 shows the planned  trajectory of the EV at time t = 4.80 s, shortly after TV2 ends the swinging behavior. The lane change of TV2 is not completely ruled out yet, so the planned trajectory of the EV includes a shift to the left later in the prediction horizon. However, as soon as TV2 exhibits a more regular behavior consistent with a lane keeping intention, the estimated probability of the maneuvers become more steady and the lane change is gradually ruled out, so that the shift to the left of the EV is not necessary anymore. As a result, the EV eventually overtakes remaining on the center lane, avoiding overly conservative behaviors. In contrast, the comparison method from [31] never rules out the lane change of TV2, since all candidate trajectories are regarded as equally plausible irrespective of the estimated probabilities. Hence, the EV motion is unnecessarily cautious and the EV moves to the left lane to overtake both vehicles, potentially slowing down the traffic flow on the left-most lane. Table II proposes a numerical evaluation of the performances and comparison of the algorithms through the same metrics previously introduced, i.e., the average computation time for each iteration of the IMM+SMPC algorithm and metrics (22). Considering the scenario in which TV2 eventually changes the lane, the performances of the comparison method from [6] are affected by the fact that the method notably fails in yielding a safe trajectory and thus the IMM+SMPC algorithm yields a considerably lower cost and average computation time. In the other scenario, in which TV2 ultimately remains on the right-most lane, the IMM+SMPC also yields a significantly lower average computation time, due to the fact that eventually the EV disregards the possible lane change of TV2 and the overly conservative lane change maneuver of the EV is avoided. Furthermore, compared to the simulations with the cyclist, we observe that increasing the number of DOs surrounding the EV does not severely affect the computation time, confirming that running a new IMM algorithm for each DO is relatively inexpensive and that the computational burden is driven by the solution of the optimal control problem (21). Conversely, comparing the cost J sim for different scenarios is not informative, due to the different setup of the simulations.

VII. DISCUSSION
In this work, the set of candidate possible trajectories resulting from high-level intentions is assumed known. Yet, the choice of such possible trajectories is an open problem. Recorded data can be used to synthesize accurate trajectories describing typical vehicles, pedestrians and cyclists behaviors using Deep Learning [40] or Inverse Optimal Control [29].
Moreover, choosing the number of possible different intentions to be considered is also a challenge, in which conflicting considerations arise. Including many trajectories is tempting, as this allows to potentially approximate more accurately an increasing number of motions. For example, in [6] many models are used to take into account different "rates" of convergence in a lane change maneuver. However, in the method proposed in this work this choice would be problematic, leading to persistent indecision in the choice of the trajectory, and therefore to a continuously changing prediction environment, particularly if none of the trajectories considered is "precise enough". Therefore, in practice, it is preferable to consider trajectories sufficiently "differentiated", i.e., representing significantly different high-level intentions. Hence, the uncertainty about the trajectory selection is limited to those cases in which the behavior is really highly unpredictable, and not caused by oscillations between trajectories resulting in a very similar motion, bearing in mind that those candidate trajectories are anyway only approximations of the real behavior.
In Section V-B it was explained how, for every prediction step, n O × n I safety constraints are generated, needed to take into account all the n I possible intentions that each of the n O DOs might be pursuing. However, if candidate trajectories are chosen consistently with the aforementioned considerations, the collected measurements allow the IMM to shortly rule out most of the trajectories, leaving just a few reasonable options for each DO. The unlikely trajectories result in very small elliptical constraints, which have little physical meaning and do not influence the motion planning in practice. However, they can considerably slow down the numerical solution of the optimal control problem. Therefore, it is recommended to set a threshold for the parameters β j below which no constraint is generated. Observe that this does not reduce the capabilities of the method in taking multiple possible trajectories into account, as the unlikely trajectories are still considered in the IMM and the related constraints would be included again in the optimization as soon as the probability should increase. Therefore, this threshold would just speed up the computation by explicitly ignoring trajectories that are currently not considered likely. As a result, several different sufficiently "differentiated" trajectories can be considered, permitting the method to consider a wide range of intentions.
Furthermore, although a new IMM filter must be implemented and run online for every DO surrounding the EV, the resulting workload remains relatively limited compared to that of the online solution of the constrained optimization problem, which dominates the overall computational demand of the proposed method, as usual for motion planning algorithms.
Finally, we emphasize that the concept of prioritization of chance constraints depending on the probability of a given outcome can be employed in other applications, possibly using multi-modal estimation algorithms other than IMM, whose implementation and computational workload are, however, undemanding. The formulation and handling of chance constraints is not limited to the procedure presented in this work, and different options are possible.

VIII. CONCLUSION
In this work a novel combination of the IMM algorithm and SMPC for motion planning for automated urban driving in presence of DOs with unknown intention is presented. Multiple candidate future trajectories of the DOs are designed considering LQR approximations, and their probability is dynamically estimated using the IMM algorithm. This allows to consider twofold uncertainty about the future motion of DOs, comprising unknown intention and non-deterministic motion. Then, the trajectory of the EV is planned iteratively solving an optimal control problem in an SMPC fashion, using the probability of each candidate trajectory of the DO to prioritize the collision avoidance constraints. Thus, multiple possible intentions of DOs are included in the motion planning, giving lower importance to possible future trajectories that are currently considered unlikely, for sake of efficiency. However, since the probability of the intentions of DOs is continuously estimated, temporarily excluded trajectories can gain focus again, fully exploiting the benefits of the multi-modal prediction. Simulations in CARLA are provided, showing the advantages of the method.
The combination of IMM and SMPC is suitable for other contexts in which twofold uncertainty is to be considered. Future research should focus on comparison and combinations with Grid-based MPC [19] and with Belief Function Theory. Furthermore, alternative formulations for safety constraints should be investigated, allowing a faster computation compared to the design proposed here. Besides, more sophisticated approaches for multi-modal prediction of the other traffic participants should be integrated. For example, considering interactions between DOs would make the predicted future trajectories more reliable.

APPENDIX B NUMERICAL VALUES USED IN THE SIMULATIONS
All simulations were run on a desktop computer with an AMD Ryzen 7 1700X eight-core processor. We implemented the CARLA client using the CARLA Python library. At each sampling time, the CARLA client receives updated information on the involved agents and runs the code yielding the control input to move the agents. The numerical value of the current position of the DOs used to run the IMM is directly obtained through the CARLA client, whereas no information about the current speed or acceleration of DOs is used. The EV and the TVs are spanned using CARLA's BMW Grandtourer blueprint, the crossbike blueprint is used for the cyclist.
Units are omitted, as all quantities are given in SI units. The server updates the world in CARLA with a frequency of 30. The EV control is updated every T = 0.2 and the prediction horizon length is N = 10. The parameters of the nonlinear bicycle model are l f = l r = 1.9, and the weighing matrices in cost function (21a) of the EV are Q = P = diag(0, 1, 1, 1),