Modelling Metropolitan-area Ambulance Mobility under Blue Light Conditions

Actions taken immediately following a life-threatening personal health incident are critical for the survival of the sufferer. The timely arrival of specialist ambulance crew in particular often makes the difference between life and death. As a consequence, it is critical that emergency ambulance services achieve short response times. This objective sets a considerable challenge to ambulance services worldwide, especially in metropolitan areas where the density of incident occurrence and traffic congestion are high. Using London as a case study, in this paper we consider the advantages and limitations of data-driven methods for ambulance routing and navigation. Our long-term aim is to enable considerable improvements to their operational efficiency through the automated generation of more effective response strategies and tactics. A key ingredient of our approach is to use a large historical dataset of incidents and ambulance location traces to model route selection and arrival times. Working on the London road network graph modified to reflect the differences between emergency and civilian vehicle traffic, we develop a methodology for the precise estimation of expected ambulance speed at the individual road segment level. We demonstrate how a model that exploits this information achieves best predictive performance by implicitly capturing route-specific persistent patterns in changing traffic conditions. We then present a predictive method that achieves a high route similarity score while minimising journey duration error. This is achieved through the combination of a technique that correctly predicts routes selected by the current navigation system of the London Ambulance Service and our best performing speed estimation model. This hybrid approach outperforms alternative mobility models.


I. INTRODUCTION
The key performance indicator for emergency ambulance services worldwide is the time taken to respond to life-threatening incidents. Indeed, a well-established clinical finding is that shorter ambulance arrival times are critical for the survival of individuals involved in high-severity emergency incidents [1]. In the United Kingdom, for example, ambulance services operating under the National Health Service (NHS) treat approximately 30,000 out-of-hospital cardiac arrest (OHCA) patients every year [2], with approximately 9% surviving to hospital discharge [3]. Short ambulance response times are a key factor in improved clinical outcomes because longer delays between collapse and the commencement of emergency life support result in significantly reduced survival rates [4]. After 10 minutes very few patients survive [5].
Accordingly, emergency ambulance services in the UK operate under the regulatory requirement to reach at least 75% of OHCA patients within 8 minutes. Yet in major metropolitan areas such as London, meeting this mandate presents considerable challenges, especially in a setting of horizontal cuts in funding and a rising number of medical emergencies: in October 2017, only 68% of patients were treated within the time frame required. To address the challenges of effective ambulance response, in this paper we introduce a data-driven methodology with a view to enabling considerable improvements to the operational efficiency of emergency services. Specifically, we investigate the routing and navigation performance of a metropolitan ambulance response system using London as our case study. Our main goal is to exploit an improved understanding of mobility patterns specific to ambulances, so that we can model and predict their movements precisely. Our longer-term goal is to contribute towards the generation of more effective response strategies and tactics through software automation.
We investigate the performance of this data-driven approach using a large historical dataset collected specifically for this study from the London Ambulance Service (LAS) over a 2-year period. This dataset was assembled by the lead author with full access to LAS telemetry (described in more detail in Section II), which captures ambulance movements in the city as the service responds to emergency calls. The dataset offers a distinct view of a metropolitan-scale emergency response system at a level of detail and at a scale not previously considered in the research literature. To the best of our knowledge, this study introduces the first data-driven methodology for algorithmic route selection and the estimation of travel time specifically tailored to emergency ambulatory vehicles responding to an incident with blue lights and sirens on.
In Section III, we describe the incident and ambulance tracking dataset. We then employ ambulance journey data to enrich emergency vehicle-specific graph representations of London's road network, reflecting the systematic spatio-temporal variations that characterise the speed patterns of emergency vehicles. We show how these speed variations capture implicit, yet essential, information about underlying traffic conditions. Taking these into account leads to better estimates of ambulance arrival times at the location of an incident, which is key in predictive modelling and dispatch simulation.
We summarise our research findings with the following key points:
• Spatio-temporal analysis of ambulance mobility: Our analysis of ambulance movement patterns in Section III reveals that, in line with London's well known polycentric urban structure, demand for the service concentrates around several urban nuclei within the metropolitan area of the city. Moreover, we identify strong temporal variations in the speed of different types of ambulatory transport vehicles. Finally, we demonstrate that ambulance mobility differs significantly from civilian traffic and thus requires a specifically tailored approach.
• Multi-layer graph representation of London's road network for Blue Lights traffic: Reflecting the fact that emergency vehicles that travel with blue lights and sirens on are exempt from traffic regulations, we curate a graph representation of London's road network in a manner that accommodates the distinctive requirements of emergency services. The Blue Lights Road Network, described in detail in Section IV, permits right turns where restrictions apply to civilian traffic, among other characteristic features. Furthermore, each road segment represented in the network is associated with a set of weights employed to provide alternative ambulance speed estimates.
• Accurate ambulance speed estimation from low-frequency GPS data: A key element of the data-driven approach presented in this paper is the calculation of accurate expected ambulance speeds on every segment of the London road network. To achieve this, in Section V we employ the Blue Lights Road Network and a map-matching method selected for its performance on coarse-grained GPS ambulance tracking data to reconstruct complete ambulance routes. Fully reconstructed routes are subsequently employed to obtain considerably improved speed estimates on a per road-segment basis, thus enabling the calculation of more precise expected ambulance journey times.
• Development of a data-driven predictive ambulance mobility model: Building on the above findings, we develop and evaluate alternative predictive ambulance mobility methods of increasing sophistication. First, we consider journey time accuracy as the key success criterion, concluding that the best performing model is adaptive to both time of week and the specific set of road segments traversed (Section VI). We then demonstrate, through in-depth route similarity analysis, that a hybrid model combining a route selection stage followed by travel time estimation can improve intelligence regarding journey times, specifically capturing the design features of the navigation system currently in place at the service (Section VII). Last but not least, we identify opportunities for improved selection of ambulance routes.

II. BACKGROUND AND RATIONALE
Setting the context for the detailed description of the dataset used in this study, in this section we provide a concise operational overview of the LAS, including a description of the mobile computing technology implemented on-board ambulances. We then proceed to demonstrate the features that differentiate emergency vehicle movement and general civilian traffic and conclude with a brief overview of relevant literature.

A. LAS OVERVIEW
The LAS is responsible for providing emergency medical care across Greater London, covering an area of approximately 1,572 km² with a population of 8
Ambulance crew can have different levels of skill and experience. Emergency Ambulance Crew (EAC) and Emergency Medical Technicians (EMT) are trained to respond quickly to out-of-hospital medical emergencies. They are able to deliver a wide range of treatments to patients such as those suffering from cardiac arrest, trauma or minor injuries, and are authorised to deliver drugs and provide immediate life support such as defibrillation and basic airway management. Paramedics are authorised to conduct invasive procedures such as advanced airway management and needle chest decompressions, as well as deliver a wider range of drugs. Ambulance crews typically work 12-hour shifts beginning at either 7am or 7pm. Each crew can take a single 30-45 minute rest break during a "rest break window."
Since April 2011, the Department of Health (DoH) has set targets for ambulance trusts such that 75% of immediately life-threatening incidents must have a first responder arrive within 8 minutes and 95% of patients must be reached within 19 minutes [6]. Note that these targets are less demanding than international standards, which typically require a sub-8 minute response for 90% of immediately life-threatening cases. Indeed, the OPALS study calls for a maximum 5 minute arrival time to 90% of Category A cases [7]. Other non-life threatening incidents have locally agreed targets, typically requiring the first responder to arrive within 20 to 30 minutes depending on the severity. Table 1 summarises target response times.

B. ON-BOARD TECHNOLOGY
AEUs and FRUs are equipped with an on-board computer known as the Mobile Data Terminal (MDT) incorporating a touch-sensitive display. Information transmitted to the MDT by the LAS Command and Control Centre is used to provide the crew with patient and incident details, and map-based search and navigation facilities. AEUs also carry extensive instrumentation that tracks their location via GPS, and monitors vehicle state including equipment temperature, hand-brake position, door open, blue lights, sirens, batteries, fuel level and so forth. This information is relayed periodically to LAS headquarters over multiple wireless pathways including at least two 2G cellular mobile telephony networks to ensure resilience and extended coverage, as well as IEEE 802.11 when an ambulance is near an ambulance station. Similar to other UK emergency services, ambulances also carry two-way TETRA transceivers which carry encrypted voice and data so that the crew can communicate directly with the LAS Command and Control Centre.
As relates to location information specifically, AEUs carry a Siemens Navigation Unit (SNU) which provides geographic positioning information and routing capabilities to the MDT. The SNU incorporates embedded GPS receivers, gyroscopes and accelerometers and receives supplementary information from external wheel speed sensors. Positioning information is recorded whenever the vehicle is en-route to an incident, A&E, fuel stop or standby point. Location updates are transmitted only when a vehicle starts to move or every 15 seconds while moving. Speed data is encoded in 5 mph increments and heading is transmitted in 15 degree increments. Approximately 95% of all data traffic is received at LAS Headquarters within 1 second of transmission. Retransmissions generally account for less than 1% of the total data volume.

C. AMBULANCE VS. CIVILIAN TRAFFIC
Patterns of ambulance movement differ from civilian traffic mainly because ambulance crews travelling with blue lights and sirens on are by law exempt from traffic regulations [8] that would otherwise impede progress to a patient. For example, ambulances responding to a call on blue lights are permitted to treat red traffic lights as a give way sign, are able to pass on the wrong side of a keep left bollard and can exceed the speed limit. To illustrate this point, we use the Google Maps Distance Matrix API (MDM) [9] as a typical example of civilian traffic models in terms of its route selection strategy and its estimation of expected arrival times. A comparison of AEU trips recorded in the LAS dataset against estimates obtained by MDM in Figure 1 suggests a tendency of the latter to overestimate trip duration by a factor of 1.4. Similar calculations for FRUs (not shown in the figure) suggest a tendency for MDM to overestimate trip duration by a factor of 1.5.
Moreover, ambulances travel for a specific purpose, that is, in response to emergency medical incidents, and the temporal and spatial characteristics of such events also follow particular patterns. Such patterns follow the temporal rhythms of urban life, for example with commuters flowing from the suburbs into the City of London in the morning and returning to their homes in the evening. These diurnal variations in the speed of ambulances are observed empirically in our dataset, as shown in Figure 2. Our analysis shows that demand for the service concentrates around several urban nuclei within the metropolitan area of the city, reflecting London's well known polycentric urban structure [10], with medical emergencies more likely to occur at specific times and places in London, as depicted in aggregate in Figure 3.
VOLUME 4, 2016

D. RELATED WORK
The study of computational techniques for the effective and efficient management of Emergency Medical Service (EMS) resources has a relatively long history [11]. The main focus of research in this domain has been on the development of models for the placement of EMS facilities, such as ambulance stations, and on strategies for resource relocation so that specific performance metrics are maximised [12], [13]. The survey [14] provides a chronological perspective on the development of research in this area.
Over the past decade, the wider availability of performance data released by EMSs worldwide has enabled the exploration of data-driven approaches, which aim to improve response efficiency by adapting to the dynamics of incident generation [15], [16] and road traffic [17], leading to a closer focus on improved clinical outcomes rather than resource allocation optimisation [18]. Research in this vein has revealed key limitations in historical dispatch strategies, for example the disadvantages of the common practice of sending the nearest available ambulance to the incident when choosing the vehicle to dispatch [19]. Indeed, recent work suggests that EMS systems can benefit from performance improvements achieved through the implementation of sophisticated strategies which take into consideration spatial and temporal factors [20].
To account for temporal and spatial variations in resource availability and incident generation, a common approach is to employ synthetic models for event-driven simulation [22]-[25]. Nevertheless, when considering the route selection problem specifically, current literature typically adopts a traditional optimisation approach [26]-[30], such as linear programming with constraints using tree-based search for shortest path calculations.
This paper adopts a data-driven methodology for the estimation of ambulance routes and arrival time at the location of an incident as a core ingredient of our approach. The development of such a data-driven methodology has become possible due to the availability of a detailed and comprehensive dataset from LAS recording the true mobility patterns of ambulances at the individual level over a prolonged period of time rather than in the aggregate. To the best of our knowledge, this is the first time that a fundamentally data-driven methodology has been applied to the high-fidelity algorithmic inference of the route followed by ambulances responding to emergency incidents with blue lights and sirens on, and to the estimation of their arrival time.

E. METHODOLOGICAL APPROACH
Our approach involves estimating the journey time for an emergency ambulatory vehicle while travelling with blue lights and sirens on. This calculation consists of three steps: First, we use historic data to estimate the average speed at which an ambulance responding to an emergency event traverses each road segment in London. Second, from these average speeds we can estimate the journey time for any route by accumulating the estimated journey time for each road segment in that route. This estimate may include a correction depending on either the number of turns or the length of the route. Finally, a standard graph theoretic algorithm is used to determine the quickest route between any start and end locations notably from any position of ambulance activation to the location of an incident.
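The three steps above can be illustrated with a minimal sketch, assuming a toy network with hypothetical link identifiers, lengths and speeds (none of these values come from the LAS dataset). The `turn_penalty_s` parameter is only a placeholder for the kind of correction mentioned above, not the correction used in this study.

```python
import heapq

def segment_time(length_m, speed_mph):
    """Traversal time in seconds for one road segment."""
    metres_per_second = speed_mph * 1609.344 / 3600.0
    return length_m / metres_per_second

def route_time(route, lengths, speeds, turn_penalty_s=0.0):
    """Accumulate per-segment times along a route (a list of link ids)."""
    total = sum(segment_time(lengths[e], speeds[e]) for e in route)
    # Hypothetical correction: a fixed penalty per transition between links.
    return total + turn_penalty_s * max(len(route) - 1, 0)

def quickest_route(graph, source, target):
    """Dijkstra's algorithm; graph maps node -> [(neighbour, link_id, seconds)]."""
    best = {source: 0.0}
    prev = {}
    queue = [(0.0, source)]
    while queue:
        cost, node = heapq.heappop(queue)
        if node == target:
            break
        if cost > best.get(node, float("inf")):
            continue  # stale queue entry
        for nxt, link_id, seconds in graph.get(node, []):
            new_cost = cost + seconds
            if new_cost < best.get(nxt, float("inf")):
                best[nxt] = new_cost
                prev[nxt] = (node, link_id)
                heapq.heappush(queue, (new_cost, nxt))
    if target not in best:
        return None, float("inf")
    links, node = [], target
    while node != source:
        node, link_id = prev[node]
        links.append(link_id)
    return links[::-1], best[target]

# Toy data (illustrative only): two short fast links versus one long slow link.
lengths = {"e1": 400.0, "e2": 300.0, "e3": 900.0}
speeds = {"e1": 30.0, "e2": 30.0, "e3": 20.0}
toy_links = [("A", "B", "e1"), ("B", "C", "e2"), ("A", "C", "e3")]
graph = {}
for u, v, e in toy_links:
    graph.setdefault(u, []).append((v, e, segment_time(lengths[e], speeds[e])))

route, seconds = quickest_route(graph, "A", "C")
print(route, round(seconds, 1))
```

In this toy case the two-link route via B is selected because its accumulated traversal time is lower than that of the single direct link.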
Central to our approach is the computation of the so-called Blue Lights Road Network (BLRN) for London. It differs from the standard road network used for civilian traffic in that it incorporates the exceptions established by law for emergency vehicles travelling with blue lights and sirens on.
The first step, estimating average road segment speeds, is not straightforward. The historic LAS dataset contains GPS locations and vehicle identifiers; these data need to be matched to the road network in order to infer the route taken by a particular vehicle. In addition, the speeds of emergency vehicles travelling with blue lights and sirens on show hourly and weekly seasonality. Considering the size of the BLRN, this implies that our historic data, although substantial, do not provide complete coverage. Consequently, five different methods for estimating the speed, denoted Metrics I to V, are investigated. These methods are broadly ordered by increasing spatio-temporal granularity; full details are provided in Section IV. Metrics I and II are the most straightforward: Metric I assumes that the estimated average speed is the same over the entire network, and Metric II that an already known speed profile exists for each road type. Metrics III and IV estimate average speeds for individual road types with increasing temporal granularity. Metric V is similar to Metric IV but at higher spatial granularity, considering individual road segments, and includes a more sophisticated method for filling in missing data.
We show that estimating the entire journey time by simple aggregation underestimates the result for Metric V; a simple correction is proposed and its effect is described. We also explore the advantages and limitations of different route selection strategies employing Metrics I to V against both route similarity, that is, the ratio of overlap between the estimated and observed paths, and arrival time error, that is, the difference between estimated and actual arrival times. Finally, we construct a so-called hybrid model that provides the best match for the routes selected by the current LAS dispatch system and LAS ambulance crews. The hybrid model employs a combination of Metric II with Nelder-Mead optimised road speeds for route selection and Metric V for arrival time estimation.

III. LAS DATASET
In this section, we introduce the datasets used in this paper: First, we describe the operational data obtained from LAS with particular emphasis on location tracking. We then proceed with some preliminary observations relating to mobility patterns discovered in the data.

A. DATASET FEATURES
The LAS dataset contains incident, activation and tracking information for the entire LAS fleet from March 2014 to December 2016 (cf. Table 2). Specifically, the following data type records are included:  of activation and the arrival time at the place of the incident.
A regular flow of data was received during the period of data collection with the notable exception of May 2014 when significantly higher volumes were observed. This behaviour was validated against LAS archives which record the weekend of 17-18th May as one of the busiest since records began, with Sunday 18th May being the sixth busiest day in its history. Further, since we are only interested in ambulance mobility when travelling with blue lights and sirens on, AVLS data were filtered for records corresponding only to vehicles en route to a Category A incident, retaining only the relevant data points as summarised in Table 2.

B. LOCATION SAMPLING
Analysis of the frequency of AVLS reporting found that the majority of data arrive at 15-second intervals, as expected. However, a small proportion of records appear to arrive at 0, 10 and 20-second intervals. Further investigation revealed that this was due to erroneous time-stamping by the MDT. Specifically, the MDT is programmed to poll the SNU only once every 15 seconds, so that when it subsequently transmits status updates, for example in response to an action by a crew member, ambulance position is reported using the last cached entry and stale GPS position information is used. AVLS records with such erroneous time-stamps were identified and excluded from the dataset.

C. TRACE AGGREGATION
The first step in the generation of road routes from raw GPS data is the aggregation of AVLS records into tracks representing individual journeys. This is achieved by grouping AVLS data by call sign, incident identifier, and vehicle type. Note that the call sign is a code used on the radio for the purposes of crew identification. This process generated approximately 2.3 million distinct journeys attending approximately 1.3 million emergency incidents (cf. Table 2 for details). The higher number of journeys reflects the fact that multiple vehicles often attend a single emergency.
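The grouping step can be sketched as follows. This is a minimal illustration: the record field names (`call_sign`, `incident_id`, `vehicle_type`, `timestamp`) and the sample values are assumptions made for the example and do not reflect the actual AVLS schema.

```python
from collections import defaultdict

def aggregate_tracks(records):
    """Group AVLS-style records into tracks keyed by
    (call sign, incident identifier, vehicle type), ordered by time."""
    tracks = defaultdict(list)
    for rec in records:
        key = (rec["call_sign"], rec["incident_id"], rec["vehicle_type"])
        tracks[key].append(rec)
    for points in tracks.values():
        points.sort(key=lambda r: r["timestamp"])
    return dict(tracks)

# Illustrative records: two vehicles attending the same (hypothetical) incident.
sample_records = [
    {"call_sign": "X101", "incident_id": 1, "vehicle_type": "AEU", "timestamp": 30},
    {"call_sign": "X202", "incident_id": 1, "vehicle_type": "FRU", "timestamp": 15},
    {"call_sign": "X101", "incident_id": 1, "vehicle_type": "AEU", "timestamp": 15},
]
tracks = aggregate_tracks(sample_records)
print(len(tracks), "journeys for 1 incident")
```

The sample yields two journeys for a single incident, mirroring the observation that the number of journeys exceeds the number of incidents.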

D. ROAD NETWORK COVERAGE
For each AVLS data point, we determine the nearest link on the road network using a naïve algorithm often referred to as GPS snapping. Snapping each GPS position to its nearest road link provides a simple way to determine the proportion of the road network that is covered by the data set, as demonstrated in Figure 4. This calculation also enables us to estimate that the entire London road network is fully covered approximately every 2-3 years, suggesting a rough measure of the period required to obtain a full dataset refresh.
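A minimal sketch of GPS snapping and the resulting coverage measure is given below. It assumes planar coordinates in metres and straight-line links given by their two end points; the link identifiers and coordinates are hypothetical, and a brute-force nearest-link search is used where a production system would likely rely on a spatial index.

```python
import math

def point_segment_distance(p, a, b):
    """Euclidean distance from point p to the line segment a-b."""
    px, py = p
    ax, ay = a
    bx, by = b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        return math.hypot(px - ax, py - ay)
    # Projection parameter of p onto the segment, clamped to [0, 1].
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def snap(point, links):
    """Return the id of the link nearest to the point."""
    return min(links, key=lambda lid: point_segment_distance(point, *links[lid]))

def coverage(points, links):
    """Proportion of links touched by at least one snapped point."""
    touched = {snap(p, links) for p in points}
    return len(touched) / len(links)

# Toy network of three links and two position fixes (illustrative only).
toy_network = {
    "l1": ((0.0, 0.0), (100.0, 0.0)),
    "l2": ((0.0, 50.0), (100.0, 50.0)),
    "l3": ((200.0, 0.0), (200.0, 100.0)),
}
fixes = [(10.0, 5.0), (60.0, 40.0)]
print(coverage(fixes, toy_network))
```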

E. PATTERNS OF AMBULANCE MOBILITY
Further to the observation that ambulance speed fluctuates with the time of day (cf. Figure 2), we note that average speeds range between 24-32 mph for AEUs and between 28-34 mph for FRUs. Traffic slows down during the morning (06:00-09:00) and evening rush-hour (16:00-19:00). Daytime speeds improve somewhat from that minimum, with significant further increases during the night. While qualitatively the diurnal pattern is similar for both vehicle types, as expected FRUs are on average quicker by 2-5 mph.
Considering the spatial distribution of emergency response trips, Figure 5 depicts their density at 3 km × 3 km cell resolution. As expected, Figure 5 confirms significant differences between centre and suburbs. It also reveals considerable localised differences in the outer regions in particular with several hot-spots of higher average incident density identified roughly corresponding to suburban loci.
Overall, these observations suggest that an effective model for ambulance route selection and arrival time prediction should be adaptive to the spatial and temporal context and take into account the specific vehicle type involved. Indeed, the approach adopted in this study and detailed next is motivated by this observation.

IV. BLUE LIGHTS ROAD NETWORK
A core data structure employed in the present work is the multi-layer directed graph $G$ representing London's road network as viewed by emergency vehicles travelling with blue lights and sirens on. $G$ forms the foundation for modelling ambulance mobility and for evaluating alternative route selection methods. Specifically, the Blue Lights Road Network (BLRN) is the graph $G = (N, E, W)$ such that:
• the nodes $N$ correspond to road intersections,
• the edges $E$ are road segments connecting intersections (we also refer to edges as road links),
• each road link $e_i \in E$ is associated with weights $W^{(j)}_i$, one for each cost-estimation metric $j$.
The calculation of $W^{(j)}$ for each road link is a key element of the present work. Specifically, we consider five alternative road link cost-estimation metrics of increasing sophistication:
• Metric I uses the fixed speed of 22.8 mph for all road links. This metric corresponds to the naïve approach assuming there are no differences in speed ratios and traffic conditions across the city. It is only used in this paper as a baseline for comparison.
• Metric II uses a standard speed profile for each road type (cf. Table 3), also adding a delay of 2.5 seconds each time a junction is crossed. Metric II corresponds to the method employed by the LAS routing engine at the time when the data set was recorded.
• Metric III uses road speed estimates adapted to ambulance position, hour-of-day and ambulance type. Specifically, the road link $e_i$ is associated with the $2 \times 24$ weight matrix $W^{(\mathrm{III})}_i$, with rows corresponding to either AEU or FRU and columns to hour-of-day. The weight $W^{(\mathrm{III})}_i(a, h)$ for a specific ambulance type $a$ during hour-of-day $h$ is calculated as the harmonic mean of the speeds reported in all AVLS records in the data set located within a 500 m × 500 m box centred at the midpoint of $e_i$ and recorded on roads of the same type during hour-of-day $h$ (i.e. irrespective of the day of the week).
• Metric IV uses road speed estimates characteristic of ambulance position and ambulance type similar to Metric III, but adapting to the hour-of-week. Hence, in this case the weight matrix $W^{(\mathrm{IV})}_i$ has size $2 \times 168$ and is calculated in a manner similar to Metric III, incorporating variations in weekly traffic conditions. Metric IV improves on Metric III in terms of time resolution and the ability to capture weekly patterns, but its accurate calculation requires a significantly larger data set to provide an adequate number of samples for each $e_i$ in the road network.
• Metric V is similar to Metric IV but operates at higher spatial granularity, considering individual road segments, and includes a more sophisticated method for filling in missing data (cf. Section II-E).
To construct the BLRN we employed the Integrated Transport Network (ITN) dataset supplied by Ordnance Survey. The ITN was modified to allow routing according to the rules pertaining to ambulances and other emergency vehicles travelling with blue lights and sirens on. Specifically, the BLRN permits:
• passing on the incorrect side of Keep Left/Right Signs, including passing on the wrong side of a keep left bollard,
• right turns where No Right Turn restrictions apply,
• use of bus lanes during operating hours,
• driving into a pedestrian precinct,
• treating red traffic lights and zebra crossings as a give way sign, and
• exceeding the speed limit.
Moreover, the BLRN replaces single edges for bi-directional roads with two edges, one for each direction. This modification is required for the effective implementation of the map-matching methods introduced in Section V, which represent a key ingredient of our data-driven methodology. Finally, the estimation of the weights $W^{(j)}$ requires the calculation of ambulance speed for each road segment in every journey reconstructed from the AVLS records. We elaborate further on this point in Section V.
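To make the structure of the Metric III weights concrete, the following sketch builds a 2 × 24 matrix of harmonic-mean speeds for a single road link. It assumes the relevant speed samples (those inside the spatial box and on the same road type) have already been selected; the sample values are invented, and discarding non-positive speeds and leaving empty cells as `None` are assumptions of this example rather than steps described above.

```python
from collections import defaultdict
from statistics import harmonic_mean

VEHICLE_TYPES = ("AEU", "FRU")  # row order of the weight matrix

def hourly_weights(samples):
    """Build a 2 x 24 matrix of harmonic-mean speeds (mph).

    samples: iterable of (vehicle_type, hour_of_day, speed_mph) tuples
    already restricted to one road link's neighbourhood and road type.
    Cells without data are left as None.
    """
    buckets = defaultdict(list)
    for vehicle_type, hour, speed in samples:
        if speed > 0:  # assumption: stationary fixes are discarded
            buckets[(vehicle_type, hour)].append(speed)
    return [
        [harmonic_mean(buckets[(v, h)]) if (v, h) in buckets else None
         for h in range(24)]
        for v in VEHICLE_TYPES
    ]

# Invented samples: two AEU fixes at 08:00 and one FRU fix at 23:00.
toy_samples = [("AEU", 8, 20.0), ("AEU", 8, 30.0), ("FRU", 23, 35.0)]
weights = hourly_weights(toy_samples)
print(weights[0][8], weights[1][23])
```

For the two AEU samples the harmonic mean is 24 mph rather than the arithmetic mean of 25 mph, which illustrates why the harmonic mean is the natural average for speeds over a fixed distance. Extending the column index from hour-of-day to hour-of-week would give the 2 × 168 layout of Metric IV.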

V. AMBULANCE ROUTE RECONSTRUCTION
As already noted, the accurate reconstruction of observed ambulance routes provides the foundation for the development of alternative road speed models for ambulances travelling under blue lights and sirens on. Because LAS ambulances record their position relatively infrequently while operating in a high-density urban setting, the GPS tracks obtained from AVLS records provide only a coarse-grain record of their movement. In particular, such tracks do not identify all the road links of the route followed even after snapping to the road network, which in turn severely limits our ability to produce accurate speed estimates. To address this problem, we adopt a map-matching approach [31] that leads us to achieve the full reconstruction of ambulance routes from relatively low-frequency GPS traces. Overall, this approach enables significant improvements to road speed models using Metrics III, IV and V.

A. MAP-MATCHING AVLS TRACKS
The map-matching process transforms a time-ordered sequence of GPS locations into the most likely path followed by the ambulance in the BLRN. We investigated the performance of several map-matching techniques based on particle filters and the Hidden Markov Model/Viterbi Paths (HMM/V) method on the LAS dataset specifically. This was deemed necessary due to the fact that each technique provides different trade-offs including sensitivity to the sampling rate, computational complexity, real-time performance and accuracy.
Two popular map-matching methods were considered in detail, one employing particle filters [32]-[34] and the second the HMM/V variant proposed by Newson and Krumm [35]. Our experiments suggest that AVLS data are best suited to the HMM/V approach, which we found to produce consistently higher fidelity routes (the details of these experiments are not included in this paper due to lack of space). Nevertheless, we note that this result is consistent with the literature [36]-[39] in three respects: HMM/V produces accurate results at relatively low sampling rates, such as the one AVLS record every 15 seconds typical of the LAS dataset; it does not require setting tight speed limits, which fits well with the fact that ambulances are exempt from such limits when travelling with blue lights and sirens on; and it adapts well to low accuracy GPS fixes, which can occur especially in urban settings.
Using HMM/V map-matching, we obtain the complete road segment-by-segment route followed by an ambulance and the time of entry to and exit from each road segment. Assuming constant travelling speed across individual road segments, vehicle speed can be estimated directly as the ratio of road segment length over traversal time.
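Under the constant-speed assumption, the per-segment calculation reduces to a single division. The sketch below illustrates it on hypothetical map-matching output given as (link id, length in metres, entry time, exit time) tuples; this tuple layout and the handling of non-positive traversal times are assumptions of the example.

```python
METRES_PER_MILE = 1609.344

def segment_speeds(traversals):
    """Estimate speed in mph for each traversed road segment.

    traversals: list of (link_id, length_m, t_enter_s, t_exit_s).
    Entries with a non-positive traversal time are skipped.
    """
    speeds = []
    for link_id, length_m, t_enter, t_exit in traversals:
        duration = t_exit - t_enter
        if duration <= 0:
            continue
        metres_per_second = length_m / duration
        speeds.append((link_id, metres_per_second * 3600.0 / METRES_PER_MILE))
    return speeds

# Illustrative route: 134.112 m covered in 10 s corresponds to 30 mph.
toy_traversals = [("e1", 134.112, 0.0, 10.0), ("e2", 50.0, 10.0, 10.0)]
estimated = segment_speeds(toy_traversals)
print(estimated)
```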

B. BLRN COVERAGE
A total of 1,910,941 journeys were processed by the HMM/V algorithm, involving 177,975,172 road link records and associated speeds. Using the fully reconstructed routes, we carry out an analysis of coverage by road type: Table 4 suggests that coverage for road types more likely to be used by an ambulance is considerably improved against GPS snapping. For example, for Type A roads (major roads providing large-scale transport links) coverage increases to 94%. Table 5 compares road speed estimates obtained from GPS against those calculated after map-matching per road type and the fixed road speeds used by the routing engine implemented by LAS as the baseline. Overall, these results highlight the significance of map-matching, which yields considerable improvements.

VI. MODELLING AMBULANCE MOVEMENT
Using the speed estimates calculated in Section V, we can compute the weights $W^{(j)}$ defined in Section IV for each road segment in the BLRN. Then, for a particular incident and initial ambulance location, we can apply a shortest path algorithm on $G$ using each of the five weight layers $W^{(j)}$ for $j = \mathrm{I}, \mathrm{II}, \ldots, \mathrm{V}$ in turn to select a route. In this section, we explore the advantages and limitations of these alternatives by comparing them against a testing set selected from the LAS dataset along two criteria: (a) travel time error, and (b) route similarity. Without loss of generality, we use the classic Dijkstra's algorithm for shortest path calculations throughout the remainder of this study.
We first consider the accuracy of estimated arrival times. For each route in the testing set we calculate the difference between the actual journey duration as recorded in the LAS dataset and the estimated arrival time using each of the alternative metrics to select a route. The distribution of the error is depicted at the top of Figure 6. Figure 6 suggests that on average all methods tend to underestimate arrival time. For example, using Metric II the error has a mode of approximately −130 seconds, with 95% of all routes estimated to take at least 45 seconds less than the actual duration. Surprisingly, Metric I outperforms Metric II even though the latter is used operationally by LAS. The more sophisticated Metric V provides the best performance, with a mode of approximately −20 seconds.
To better understand the tendency of the models to underestimate arrival times, specifically whether this is due to erroneous route selection or inaccurate speed estimates, we also calculate travel times using each model on the actual route recorded in the LAS dataset rather than on the route selected by the corresponding metric. The outcomes of this comparison, presented at the bottom of Figure 6, suggest an improvement, with Metric V still outperforming all others. Figure 7 provides a closer examination of the difference in journey time estimation for the actual (green) and estimated (blue) route taken, using only the best performing Metric V. Note that for journeys of up to 8 minutes there is only a relatively small difference in accuracy. Recall that, due to the regulatory framework under which LAS operates, a response time of 8 minutes is required for 75% of Category A incidents; as a result, the largest proportion of journeys in our dataset last less than 8 minutes. For journeys longer than 8 minutes, arrival time estimates diverge, with the mean estimated journey time using the actual route providing good accuracy for up to 20 minutes.
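The two comparisons described above can be illustrated with a minimal Python sketch. It assumes that a weight layer is available as a mapping from road link to traversal time in seconds, and that the error is taken as estimated minus recorded duration, so that negative values correspond to underestimates, consistent with the signs reported for Figure 6. The names `route_time` and `eta_errors`, the toy links and all numeric values are hypothetical and are not taken from the LAS dataset.

```python
from typing import Dict, List, Tuple

Link = Tuple[str, str]  # directed road link (from_node, to_node)


def route_time(route: List[Link], layer: Dict[Link, float]) -> float:
    """Sum the per-link traversal times (seconds) along a route."""
    return sum(layer[link] for link in route)


def eta_errors(predicted: List[Link], actual: List[Link],
               layer: Dict[Link, float], recorded_s: float) -> Tuple[float, float]:
    """Return (error on the predicted route, error on the actual route).

    Assumed sign convention: estimate minus recorded duration,
    so a negative error is an underestimate.
    """
    return (route_time(predicted, layer) - recorded_s,
            route_time(actual, layer) - recorded_s)


# Hypothetical traversal times in seconds for one weight layer.
layer = {("a", "b"): 60.0, ("b", "d"): 90.0,
         ("a", "c"): 50.0, ("c", "d"): 70.0}
predicted = [("a", "c"), ("c", "d")]  # route chosen by the metric
actual = [("a", "b"), ("b", "d")]     # route recorded for the journey

err_pred, err_actual = eta_errors(predicted, actual, layer, recorded_s=170.0)
print(err_pred, err_actual)  # -50.0 -20.0
```

In this toy case the underestimate shrinks from 50 to 20 seconds when the recorded route is used, which separates the contribution of route selection from that of the speed estimates.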

B. ROUTE SIMILARITY
To further investigate the different path choices between predicted and actual routes, we compare routes by segment as shown in Figure 8. First, each route is split into four equidistant segments q1, q2, q3 and q4. Then, for each segment, we calculate path coincidence as the percentage of road links in the actual route that are correctly identified. Path coincidence ranges between 70% and 80%. As expected, Metric II, which is used by LAS for route planning, produces the most similar routes (however, recall that this does not result in the most accurate arrival time estimation, cf. Figure 6). Moreover, the more sophisticated Metric V does not perform well in terms of route similarity. Figure 9 shows the distribution of path coincidence for Metric V per route segment: for all segments, scores between 5% and 95% appear fairly uniformly distributed. Further, in segments q1 and q4, representing the beginning and the end of the route, a peak is observed at 100% path coincidence, indicating that a large proportion of routes have been matched perfectly. In segments q2 and q3, in addition to similar peaks at 100% path coincidence, a secondary peak is observed at 0%, suggesting that an entirely different route was selected. Overall, these observations suggest that in a significant proportion of cases, Metric V picks substantially different routes towards the location of an incident. Moreover, these routes lead to considerably shorter arrival times than those suggested by Metric II.
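The path coincidence measure can be sketched in Python as follows. This is an illustration under stated assumptions rather than the implementation used in the study: the actual route is given as an ordered list of (link identifier, length) pairs, each link is assigned to the quarter that contains its midpoint instead of being split at quarter boundaries, and a link counts as correctly identified if its identifier occurs anywhere in the predicted route. The function name `quarter_coincidence` and the example data are hypothetical.

```python
from typing import List, Optional, Set, Tuple


def quarter_coincidence(actual: List[Tuple[str, float]],
                        predicted_ids: Set[str]) -> List[Optional[float]]:
    """Percentage of actual-route links matched by the prediction, per quarter.

    Quarters are equal shares of the total route length; a link belongs
    to the quarter in which its midpoint falls (an assumed simplification).
    Returns None for a quarter that contains no link midpoint.
    """
    total = sum(length for _, length in actual)
    if total <= 0:
        raise ValueError("route must have positive length")
    hits = [0, 0, 0, 0]
    counts = [0, 0, 0, 0]
    travelled = 0.0
    for link_id, length in actual:
        midpoint = travelled + length / 2.0
        quarter = min(int(4 * midpoint / total), 3)
        counts[quarter] += 1
        if link_id in predicted_ids:
            hits[quarter] += 1
        travelled += length
    return [100.0 * h / c if c else None for h, c in zip(hits, counts)]


# Hypothetical route of eight links, each 100 m long.
actual = [(f"l{i}", 100.0) for i in range(1, 9)]
predicted_ids = {"l1", "l2", "l3", "l6", "l7", "l8"}
scores = quarter_coincidence(actual, predicted_ids)
print(scores)  # [100.0, 50.0, 50.0, 100.0]
```

The toy result mirrors the pattern reported for Figure 9, where agreement is highest at the start and end of a route and lower in the middle quarters.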
The dataset employed in the present study does not permit further investigation of the causes of the route selection divergence. For example, it does not record how often the crew overrides the route selection made by the MDT rather than following the recommended route. Localised traffic load and related driving conditions are also not recorded, nor are they available via third party datasets at the required resolution. As such, without further experimentation under true emergency response circumstances it is not possible to assess whether the routes selected by the MDT routing engine are optimal specifically for ambulances travelling with blue lights and sirens on. Yet, this observation raises the possibility that ambulance arrival times could be reduced if Metric V were used for route selection. Figure 7 suggests that the potential improvement is approximately one minute for a ten-minute journey, which would represent a considerable improvement in the ability of LAS to meet mandated operational targets.

C. SPATIO-TEMPORAL VARIATION
London is a polycentric, densely populated area with a complex, roughly circular, road network [10]. AVLS data confirm that traffic density is higher, and road journeys for comparable distances are thus longer, in the centre of London and at certain times of day (cf. Figure 3). In this section, we consider how our predictions of preferred ambulance routes vary along spatio-temporal dimensions. Specifically, we investigate whether there is a relationship between journey time estimation and distance from the centre of London. To this end, we follow the commonly adopted convention of taking Charing Cross as the centre point of London. We calculate distance from the centre as the length in kilometres of the straight-line segment connecting Charing Cross and the geographic midpoint between the start and end locations of the journey. Figure 10 shows estimated journey time against distance from Charing Cross. It reveals that journey times are somewhat shorter at the centre but overall roughly consistent at 4 to 6 minutes across London. Variation is least at the centre and decreases somewhat at the outer suburbs, but note that the volume of incidents in the furthest areas is low. Because arrival times also depend on the number of resources available, this is also indicative of how close a resource is located to the incident. Finally, Figure 11 considers timing error and path coincidence: in both cases, mean prediction error and variance are higher at the centre, gradually improving further away.
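The distance measure described above can be sketched in Python as follows. The sketch makes several assumptions that are not stated in the text: the coordinates used for Charing Cross are approximate, the geographic midpoint is approximated by averaging latitudes and longitudes (reasonable over the short distances involved), and the straight-line distance is computed as a great-circle distance with the haversine formula on a spherical Earth of radius 6371 km. The function names are hypothetical.

```python
import math
from typing import Tuple

LatLon = Tuple[float, float]

CHARING_CROSS: LatLon = (51.5073, -0.1276)  # approximate coordinates (assumed)
EARTH_RADIUS_KM = 6371.0                    # mean Earth radius (assumed)


def haversine_km(p: LatLon, q: LatLon) -> float:
    """Great-circle distance in kilometres between two (lat, lon) points."""
    lat1, lon1 = map(math.radians, p)
    lat2, lon2 = map(math.radians, q)
    dlat, dlon = lat2 - lat1, lon2 - lon1
    a = (math.sin(dlat / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def distance_from_centre_km(start: LatLon, end: LatLon) -> float:
    """Distance from Charing Cross to the midpoint of a journey."""
    # Averaging coordinates approximates the geographic midpoint
    # for journeys that span only a few kilometres.
    midpoint = ((start[0] + end[0]) / 2, (start[1] + end[1]) / 2)
    return haversine_km(CHARING_CROSS, midpoint)


# Hypothetical journey whose midpoint lies 0.1 degrees of latitude
# due north of the assumed centre point.
d = distance_from_centre_km((51.5973, -0.1276), (51.6173, -0.1276))
print(round(d, 2))  # about 11.12
```

Journeys can then be grouped into distance bands from the centre before summarising journey time, timing error or path coincidence for each band.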
Considering temporal variations, Figure 12 displays hour-by-hour error during the day. Performance is relatively uniform irrespective of the time of day with only two noticeable drops, coinciding with the change of crew shifts.

VII. HYBRID PREDICTIVE MODEL
In Section VI-A we observed a general tendency to underestimate arrival times, including when using Metric V. To address this limitation, a naïve approach would be to simply adjust arrival times by applying the bias function suggested by Figures 13 and 14, which maps the original estimated journey time t_β to the corrected time t_χ in seconds. The result of this adjustment is shown in Figure 15, indicating a significant improvement in accuracy, with a mean error of less than 60 seconds for journeys lasting up to 14 minutes, which account for 90% of all journeys completed by LAS in response to a Category A incident. However, while the adjustment described above considerably reduces arrival time error, it does not improve route similarity scores. Recall from Figure 8 that Metric II achieves a similarity score of 80%, which compares favourably with the 73% obtained for Metric V. This suggests that a hybrid approach, using Metric II to select the route and subsequently Metric V to predict journey duration, can yield increased accuracy in both arrival times and path similarity. Our objective in this case is to match the current performance of LAS rather than achieve the shortest possible arrival time.
(Figure caption: boxes represent the 25th and 75th percentiles; whiskers represent the 10th and 90th percentiles.)
To this end, first we optimise the static BLRN weights W (II) used by Metric II to maximise route similarity. The Nelder-Mead algorithm [40] is employed on approximately 200 routes selected from the LAS dataset, with initial parameters set to the speeds employed by LAS (cf. Table 5) and allowing a maximum perturbation of 20 mph so that a suitably wide range of alternative speeds can be explored. The resulting road speeds are displayed in Table 6. Using Metric II with the Nelder-Mead optimised speeds we achieve a path coincidence of 84%, which represents the best performance. Following route selection using Metric II with the Nelder-Mead road speeds, we proceed to calculate the estimated journey time using Metric V. Figure 16 shows that this approach (green) outperforms Metric V after correction with the bias function (blue), achieving an error of less than 60 seconds for journeys of up to 15 minutes while maintaining the best path coincidence of 84%. We refer to the method of route selection using Metric II with Nelder-Mead road speeds and subsequent journey time estimation using Metric V as the hybrid model, which obtains the best match to actual LAS performance.
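The two-stage structure of the hybrid model can be illustrated with a short Python sketch. It is a simplified illustration and not the implementation used in the study: each weight layer is assumed to be a static mapping from directed road link to traversal time in seconds, the first layer stands in for Metric II with tuned road speeds (the Nelder-Mead tuning itself is not shown), and the second layer stands in for Metric V. The function names, the toy network and all values are hypothetical.

```python
import heapq
from typing import Dict, List, Tuple

Link = Tuple[str, str]
Layer = Dict[Link, float]  # directed link -> traversal time in seconds


def shortest_route(layer: Layer, source: str, target: str) -> List[Link]:
    """Dijkstra's algorithm; returns the minimum-cost route as a list of links."""
    adjacency: Dict[str, List[Tuple[str, float]]] = {}
    for (u, v), cost in layer.items():
        adjacency.setdefault(u, []).append((v, cost))
    best = {source: 0.0}
    parent: Dict[str, str] = {}
    queue = [(0.0, source)]
    while queue:
        cost, node = heapq.heappop(queue)
        if node == target:
            break
        if cost > best.get(node, float("inf")):
            continue  # stale queue entry
        for nxt, weight in adjacency.get(node, []):
            new_cost = cost + weight
            if new_cost < best.get(nxt, float("inf")):
                best[nxt] = new_cost
                parent[nxt] = node
                heapq.heappush(queue, (new_cost, nxt))
    if target not in best:
        raise ValueError("target is unreachable from source")
    route: List[Link] = []
    node = target
    while node != source:  # walk back from target to source
        route.append((parent[node], node))
        node = parent[node]
    return route[::-1]


def hybrid_estimate(route_layer: Layer, time_layer: Layer,
                    source: str, target: str) -> Tuple[List[Link], float]:
    """Select the route on one layer, then time it with another layer."""
    route = shortest_route(route_layer, source, target)
    return route, sum(time_layer[link] for link in route)


# Hypothetical layers over the same four links.
route_layer = {("a", "b"): 40.0, ("b", "d"): 40.0,
               ("a", "c"): 30.0, ("c", "d"): 60.0}
time_layer = {("a", "b"): 70.0, ("b", "d"): 95.0,
              ("a", "c"): 50.0, ("c", "d"): 70.0}

route, eta = hybrid_estimate(route_layer, time_layer, "a", "d")
print(route, eta)  # [('a', 'b'), ('b', 'd')] 165.0
```

In this toy network the second layer on its own would select the route through node c, with an estimated 120 seconds, whereas the hybrid keeps the route chosen by the first layer and only borrows the second layer's timings.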
Using the hybrid model, the choropleth map in Figure 17 depicts the spatial variation of the prediction error per London Borough. Figure 17 implies that the error is lower in the suburbs, where there is a tendency to overestimate arrival times, while the reverse is true in the centre. The specific causes of these variations as well as the possibility of building area-specific mobility models will be investigated in future work.

VIII. DISCUSSION AND CONCLUSIONS
This paper introduces a novel data-driven methodology for the accurate prediction of the route followed by an ambulance responding to an emergency incident travelling with blue lights and sirens on, and the precise estimation of its expected arrival time. Key ingredients of our approach include the comprehensive reconstruction of ambulance journeys from coarse location tracking data using map-matching; the development of a graph representation of the London road network specifically tailored to the particularities of emergency vehicles travelling with blue lights and sirens on; the estimation of several alternative edge-cost metrics for this network; the assessment of their performance characteristics; and the use of the best performing metrics for the development of a hybrid model that achieves the highest route similarity while minimising arrival time error. This model offers consistent performance across both spatial and temporal dimensions. This methodology has considerable implications for ambulance services aiming to improve the accuracy and fidelity of their emergency response emulations, which are the main instrument employed in practice for the investigation of more effective and more efficient operational policy. Specifically, the ability to trace closely the true movements of ambulances and to estimate their expected arrival time enables the exploration of the advantages and limitations of alternative strategies through emulation of realistic scenarios. In addition to forecasting, data-driven decision making can offer significant improvements in planning for key service requirements such as ambulance staffing levels and resource placement, and enables services to balance strategy and tactics through the accurate assessment, for example, of the effects of particular dispatch tactics on meeting their strategic objectives.
Moreover, the methodology can be applied in real-time at emergency service operating centres for planning, to determine the ability of the service to cover its area of responsibility as well as to quantify the effects on its ability to maintain a high level of response resulting from specific vehicle dispatch or relocation decisions or as a result of permitting crews to go on a break.
One feature of this work is that our methodology considers only historical data collected internally by the emergency ambulance service. In future work, we aim to explore potential improvements that can be achieved using real-time information as well as traffic and related context information retrieved from external systems. Although extending the methodology to cater for such data sources appears relatively straightforward, a major challenge relates to assessing the quality of such third-party data sources and validating their accuracy on the fly, especially considering the effects on loss of life that erroneous or intentionally misleading information may have.
Finally, our analysis suggests that a bespoke routing and navigation engine tailored specifically to ambulances travelling with blue lights and sirens on, rather than one developed for general civilian traffic as is currently the case, can lead to significant reductions in crew arrival times at the site of an incident. Specifically, the use of Metric V introduced in this paper appears to select preferable ambulance routes that suggest a faster journey to the site of an incident. Clearly, extensive further work under true operational conditions is required to assess whether concrete performance benefits can be achieved following this approach.