Misper-Bayes: A Bayesian Network Model for Missing Person Investigations

Bayesian Networks are probabilistic graph models that can be used for classification, prediction, diagnosis and parameter learning. Probabilities can be inferred from the models and missing values can be imputed, based on probability theory. Missing person cases place a strain on the already overstretched resources of Police Forces. Such cases predominantly come from at risk groups such as children in care and people suffering from depression or dementia. Current approaches for dealing with such cases are manual and rely upon empirical studies and domain knowledge. This paper proposes the use of a Bayesian Network model, which can be used to predict the likely location of a missing person (misper) for a number of at risk groups. The model is evaluated using a set of misper cases and results compare very favourably with those of the manual processes currently used by UK Police forces. The novel approach described provides both a theoretical foundation and a practical framework for the future development of a decision support system. In addition to the model, a contribution is made through guidelines, which recount experiences in learning a Bayesian Network from data.


I. INTRODUCTION
According to National Guidelines set out for UK Police Forces [1], a missing person is defined as: ''Anyone whose whereabouts cannot be established and where the circumstances are out of character or the context suggests the person may be subject of crime or at risk of harm to themselves or another.'' When someone is categorized as missing, the police will investigate their disappearance and try to find and safeguard them.
In 2013 the Guidelines introduced a second absent person category, defined as: ''A person not at a place where they are expected or required to be'' and perceived to be ''not at any apparent risk''.
When someone is categorised as absent, no police response is required except to monitor and review the situation.
Typically absent cases involve individuals who go missing frequently (often referred to as frequent fliers). They are likely to be designated a missing person for the first few times that they are missing, but, if they return unharmed, thereafter they may be designated absent.
The associate editor coordinating the review of this manuscript and approving it for publication was Claudio Zunino.
Dealing with missing person cases consumes a large proportion of Police time and resources, particularly in urban areas.
This paper distinguishes a missing person (colloquially referred to as a misper) from a person who has lost their way, perhaps due to getting lost during a hiking expedition. Fig. 1 highlights the scale of misper cases facing UK Police Forces and the volume of calls generated from dealing with such cases.
Mispers come from a spectrum of the population. Many are children (typically teenagers) who go missing from care homes; others are adults with mental illness or depression. Cases also include elderly people suffering from dementia-related conditions. Murder (homicide) cases, manslaughter cases and death by misadventure often start out as misper cases until a body is located. Current practice relies on heuristics and localized domain knowledge. Social science studies will often interview mispers in the hope of eliciting knowledge about the misper's thought processes while they were missing. Typically, Police rely heavily on historical data and behavioural patterns. For example, many teenagers who go missing are found in local parks, where teenagers are known to congregate. Elderly people suffering from dementia may travel to a location associated with a past event in their lives.
Bayesian Networks (BN) are directed graphical models, which have been used extensively in the fields of cognitive science and artificial intelligence throughout the latter half of the 20th and early 21st centuries. The models are based on the theorem of Thomas Bayes [2], which allows probabilities to be updated in light of new evidence. BN have been used for some time within the AI community and more recently amongst the machine learning community [3]. This paper describes the use of a BN model to capture the causal relationships that exist in a misper case and goes on to show how the model can be used to impute missing values, such as the likely distance travelled and the likely location where the person may be found.
The paper makes a novel contribution through the development of the BN model, which is learned from data. The work also makes a contribution through the guidance proposed for determining the structure of the model. Current procedures, as used by UK Police forces, rely on iFIND [4], which is a PDF document based on an empirical study. The BN model allows likely locations to be imputed from input data and serves as the basis of a computer-based system.

II. RELATED WORK
Before describing the development of the model, related work of relevance to this paper is first reviewed.

A. BAYESIAN NETWORKS FOR SEARCH AND RESCUE
To date no other work has been found that is concerned with the use of BN to predict outcomes for at-risk misper groups. There is, however, some notable research concerned with the use of BN for Search And Rescue (SAR), in relation to people (and ''things'') who have got lost. The distinction of course is that at-risk misper groups have intentionally gone missing, whilst SAR cases have unintentionally got lost. The most notable use of Bayesian inference for search techniques was the search for Air France Flight AF 447, which crashed into the Atlantic on 1st June 2009 [5]. After two years of unsuccessful searching, the team used a Bayesian procedure developed for search planning to produce the posterior target location distribution. The distribution was used to guide the search and the wreckage was located within a week.
Reference [6] describes a Bayesian approach to modeling lost person behaviors based on terrain features in Wilderness Search and Rescue. The approach uses a first-order Markov transition matrix for generating a temporal, posterior predictive probability distribution map. The approach also uses a Bayesian χ² test for goodness-of-fit and goes on to show that the model closely fits a synthetic dataset. Reference [7] provides a thorough study of missing person behavior in Australia. The study, conducted by Victoria Police as part of the SARBayes project, considers a large dataset of parameters, some of which are more significant than others. Terrain plays an important role and the range of activities relating to the missing person is also considered (e.g. climbing, canoeing, hunting etc.).
These works make valid contributions to knowledge within the field, but they differ from this work in that they deal with cases of entities (people, planes) that go missing by accident. This work is concerned with people who largely go missing intentionally, which reinforces the choice of BN to help understand the structure of causality relations.

B. MACHINE LEARNING AND FORMAL APPROACHES
There are also several other machine learning related approaches for dealing with missing person cases. For example, [8] compares the use of neural networks and rule-based systems for missing person cases in Australia. Later work [9] considers the use of J48 to derive rules, based on the popular C4.5 decision tree generator.
In previous work conducted by the authors [10], a missing person model was developed based on Situation Calculus. The approach represented the state changes that take place over time, whilst the person is missing. The formalisms help to provide a consistent means to represent the uncertainty present in such investigations.

C. EMPIRICAL STUDIES
Two notable related empirical approaches are those of the UK booklet 'Missing Persons: Understanding, Planning and Responding' (colloquially referred to as the Grampian Study) [11] and the iFIND System [4], which is currently used by a number of UK Police forces.
The Grampian Study considers a similar set of at-risk groups to this work. For each group the study provides a number of tables, which portray useful information, such as likely time periods of missing, distance travelled and likely places for a misper to be found. The Grampian Study also translates data into useful search ranges, which can be superimposed on a map. iFIND follows a similar structure, but is based on more recent data to provide a more thorough coverage. iFIND provides more detail in terms of likely locations. Both Grampian and iFIND place emphasis on Time, Distance and Likely Location and these parameters also feature predominantly in this work. Table 1 shows a typical excerpt from iFIND, which highlights the places where mispers for the category were located. The majority were found outside locally, with a smaller proportion either returning home or being found at a friend's house.
The Grampian Study and iFIND are manual solutions to misper cases. They provide lookup tables that Police officers and call handlers can use to plan search strategies. The model described in this paper provides a computer-based solution, which calculates location probabilities in response to input data related to a misper case.

D. GEOSPATIAL REASONING
In previous work conducted by the authors, the CASPER System (Computer Assisted Search Prioritization and Environmental Response) [12] was developed to study the Geographies of Missing Persons. CASPER combines data analysis and GIS to develop a Google map application to assist investigative and strategic decision making. CASPER was developed to a prototype stage and demonstrated to several Police forces as a viable alternative to their existing case management systems, namely COMPACT and NICHE. CASPER (illustrated in Fig. 2) is rich in terms of the geospatial information it provides, being able to display heatmaps, places of interest and even live CCTV footage. CASPER allows the search team to overlay a range of different layers onto a map region of interest. For example, the team may choose to overlay the locations of cash machines (ATMs) if it is known that a misper may be short of money. Alternatively, suicide hotspots can be overlaid (from precompiled suicide data) when dealing with a potential suicide case. However, the algorithms used in CASPER were largely rule-based (derived from empirical data) and were not optimal.

III. PRELIMINARIES
For a set of discrete random variables $X = \{X_1, \ldots, X_n\}$, a BN is an annotated directed acyclic graph that encodes a joint probability distribution (JPD) over $X$. Formally, a BN is expressed as the pair $N = \langle G, \Theta \rangle$. The first element of $N$ is a directed acyclic graph, $G = (V, E)$, where $V$ denotes the random variables in $X$ and $E$ denotes the edges, which represent direct dependencies between the variables. The second element, $\Theta$, denotes the set of parameters, which quantify the network via conditional probability tables. Each node is annotated with a conditional probability distribution, $P(X_i \mid Pa(X_i))$, representing the conditional probability of the node $X_i$ given its parents in $G$. The network $N$ defines a unique JPD over $X$ given by:

$$P(X_1, \ldots, X_n) = \prod_{i=1}^{n} P(X_i \mid Pa(X_i)) \qquad (1)$$

In a BN, a conditional probability $P(X \mid Y)$ is the probability of an event $X$ occurring given that $Y$ occurs. A marginal probability is effectively an unconditional probability: a marginal distribution is the distribution of a subset of the variables, obtained by summing a larger joint distribution over the remaining variables. For example, for a JPD $P(X, Y)$ over binary variables, the marginal $P(X)$ is obtained by summing the joint table over all values of $Y$, once for $X = \text{False}$ and once for $X = \text{True}$. For a query on a node in a BN, the result is often referred to as the marginal for that node.
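As a worked illustration of marginalization and conditioning, consider a hypothetical joint table over two binary variables. The numbers below are assumptions chosen purely for illustration and are not taken from the Misper-Bayes model.

```latex
\[
\begin{array}{c|cc}
P(X, Y) & Y = \text{False} & Y = \text{True} \\ \hline
X = \text{False} & 0.40 & 0.20 \\
X = \text{True}  & 0.10 & 0.30
\end{array}
\]
% Marginal: sum the joint table over all values of Y.
\[
P(X = \text{True}) = 0.10 + 0.30 = 0.40, \qquad
P(X = \text{False}) = 0.40 + 0.20 = 0.60
\]
% Conditional: restrict to Y = True and renormalize.
\[
P(X = \text{True} \mid Y = \text{True})
  = \frac{P(X = \text{True}, Y = \text{True})}{P(Y = \text{True})}
  = \frac{0.30}{0.20 + 0.30} = 0.60
\]
```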
For BNs, inference refers to the computational methods for deriving answers to queries given a probability model expressed as a BN. Inference in BNs can take several different forms [13]-[15]; broadly speaking, it may be exact or approximate, depending upon the structure of the graph. Exact inference is not always possible when the number of combinations and paths is excessively large. However, it is often possible to refactor a BN graph (i.e. alter the graph structure) before resorting to approximate inference.
Let $U$ be the set of random variables. Let $U_e \subseteq U$ be the set of known (evidence) variables. Let $X_q \subseteq U \setminus U_e$ be the variables of interest (queries) and let $U_r = U \setminus (U_e \cup X_q)$ be the set of remaining variables.
VOLUME 9, 2021
The probability distribution of the evidence variables and the query variables can be calculated, via marginalization, as:

$$P(X_q, U_e) = \sum_{U_r} P(X_q, U_e, U_r)$$

The normalization may be calculated as:

$$P(U_e) = \sum_{X_q} P(X_q, U_e)$$

Then conditional probabilities may be calculated as:

$$P(X_q \mid U_e) = \frac{P(X_q, U_e)}{P(U_e)}$$

Inference can be used to ask a range of different questions, in relation to the probability distribution, depending upon the nature and context of the problem at hand. Typically one or more of the following are likely to be of interest:
• Decision-making (given a cost function)

The particular strengths and weaknesses of BN are covered well in [16]. To summarize:
• They provide a natural way to handle missing data
• They are suitable for small and incomplete datasets
• They combine different sources of knowledge
• They offer explicit treatment of uncertainty and support for decision analysis
• They give a fast response to queries

Essentially, a BN defines a unique JPD over X and computationally the JPD takes the form of a large table, constructed from the tables defined at individual nodes, in accordance with the graph links. So computationally, inference is the process of scanning the joint table to find a value (or values), which correspond to evidence E, possibly summing values along the way.
Often, the table will take the form of a sparse matrix (i.e. many zero entries) and this property can be exploited to make inference tractable, even when the number of parameters is very large. Subsequent sections in the paper will consider certain legal rearrangements of the JPD table, which can be used to marginalize out certain parameters. Such rearrangements allow queries to be satisfied in linear time by identifying a subgraph of the original graph relevant to the query [17].
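The marginalization, normalization and conditioning steps described in this section can be sketched in a few lines of Python. This is a minimal illustration only: the joint table is stored sparsely (zero entries are simply absent), and the variable names, states and probabilities are hypothetical rather than taken from the Misper-Bayes model.

```python
from collections import defaultdict

def query(joint, names, target, evidence):
    """Return P(target | evidence) by scanning a sparse joint table.

    joint    : dict mapping full assignments (tuples) to non-zero probabilities
    names    : variable names, in the same order as the assignment tuples
    target   : name of the query variable
    evidence : dict {name: observed value} for the evidence variables
    """
    idx = {name: i for i, name in enumerate(names)}
    unnormalised = defaultdict(float)
    for assignment, p in joint.items():
        # Skip rows that contradict the evidence.
        if any(assignment[idx[n]] != v for n, v in evidence.items()):
            continue
        # Summing here marginalizes out the remaining variables.
        unnormalised[assignment[idx[target]]] += p
    z = sum(unnormalised.values())  # normalization constant P(evidence)
    if z == 0:
        raise ValueError("evidence has zero probability under the joint table")
    return {value: p / z for value, p in unnormalised.items()}

# Hypothetical sparse joint table (illustrative values that sum to 1).
names = ("Age", "Time", "Dist")
joint = {
    ("child", "short", "near"): 0.30,
    ("child", "short", "far"): 0.05,
    ("child", "long", "far"): 0.05,
    ("adult", "short", "near"): 0.15,
    ("adult", "long", "near"): 0.15,
    ("adult", "long", "far"): 0.30,
}
posterior = query(joint, names, "Dist", {"Age": "adult"})
print(posterior)
```

Scanning the full table is exponential in the number of variables, which is why the graph-based rearrangements mentioned above matter for larger models.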

IV. DESIGN AND IMPLEMENTATION
A. MISPER-BAYES MODEL DEVELOPMENT
The development of a BN model requires the learning of two components: the graph topology (structure) and the parameters of each conditional probability distribution. Both structure and parameters can be learnt from data. However, learning structure is much harder than learning parameters. There are a number of established techniques available to learn both the parameters and the structure [18]. Algorithms for learning a BN structure from data have two components: a scoring metric and a search procedure. The scoring metric computes a score reflecting the goodness-of-fit of the structure to the data. The search procedure tries to identify network structures with high scores; finding the highest-scoring structure is regarded as NP-hard [19].
Typically, the Naïve Bayes classifier provides a good place to start in relation to learning BN structure for a relatively small set of variables. The Naïve Bayes classifier (and its variants) [20] provides a baseline model for many machine learning classification problems. Naïve Bayes gives surprisingly good results, provided the condition of independence amongst variables holds. Unfortunately, in this case independence does not hold. For example, the different mental health categories under consideration are age related.
After dismissing Naïve Bayes, the development process went on to develop a Generalized Bayesian Network (GBN) and chose the bnlearn Python library [21] to learn the structure, based on data from iFIND. After several iterations and variable eliminations the development process arrived at a graph similar to that of Fig. 3. bnlearn starts with an empty network structure of all variables, then proceeds by adding, removing and reversing edges between nodes to maximize the goodness of fit of the model. The final structure learnt by bnlearn contained an excessive number of edges, likely due to overfitting (i.e. the noise within the data had been represented in the model itself). These unnecessary edges were thinned out, based on the interpretation of the causal relationships between variables, to deliver the final structure of Fig. 3. Finally, bnlearn was used to learn the parameters using iFIND data compiled from the summary table (Table 2).
In summary, automated structure learning is useful for developing the initial structure, but it can lead to overfitting, and manual intervention is required to thin out unnecessary edges. The experiences gained in learning the structure and parameters of the BN model are shared as guidelines in Algorithm 1.
Most of the tables are fairly self-explanatory, with a couple of exceptions: the Cat(x) table reflects the different categories of at-risk mispers and the Loc(x) table reflects the different locations where mispers are likely to be found. Note that there is no edge connecting Cat(x) and Time(x) (although there was an edge in an earlier version of the model). It transpired that Age(x) provides a better predictor of the time spent missing than Cat(x). For example, the age of a young child or an elderly subject has a direct bearing on the time that they are missing. There were other variables that could have been included in the model, such as race, ethnicity and deprivation index, but these were seen to have a lesser effect than the variables shown in Fig. 3.
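The score-and-search idea described above can be sketched in plain Python. The sketch below is not bnlearn's implementation: it assumes a BIC score as the scoring metric and a simple first-improvement hill climb over edge additions, removals and reversals, and the toy dataset (B copies A, C is independent) is hypothetical.

```python
import math
from collections import Counter
from itertools import permutations

def bic_score(data, parents):
    """BIC of a discrete structure; `parents` maps each node to a set of parents."""
    n = len(data)
    score = 0.0
    for node, pa in parents.items():
        pa = sorted(pa)
        joint = Counter((tuple(row[p] for p in pa), row[node]) for row in data)
        pa_counts = Counter(tuple(row[p] for p in pa) for row in data)
        # Goodness of fit: maximised log-likelihood of the node given its parents.
        score += sum(c * math.log(c / pa_counts[cfg]) for (cfg, _), c in joint.items())
        # Complexity penalty: (states - 1) free parameters per parent configuration.
        n_states = len({row[node] for row in data})
        n_configs = math.prod(len({row[p] for row in data}) for p in pa)
        score -= 0.5 * math.log(n) * (n_states - 1) * n_configs
    return score

def is_acyclic(parents):
    """Depth-first check that the parent map describes a DAG."""
    visiting, done = set(), set()
    def visit(node):
        if node in done:
            return True
        if node in visiting:
            return False
        visiting.add(node)
        ok = all(visit(p) for p in parents[node])
        visiting.discard(node)
        if ok:
            done.add(node)
        return ok
    return all(visit(v) for v in parents)

def neighbours(parents):
    """Yield every structure one edge addition, removal or reversal away."""
    for a, b in permutations(parents, 2):
        cand = {v: set(pa) for v, pa in parents.items()}
        if a in cand[b]:
            cand[b].discard(a)                      # remove a -> b
            yield cand
            rev = {v: set(pa) for v, pa in cand.items()}
            rev[a].add(b)                           # reverse to b -> a
            yield rev
        elif b not in cand[a]:
            cand[b].add(a)                          # add a -> b
            yield cand

def hill_climb(data, nodes):
    current = {v: set() for v in nodes}             # start from the empty graph
    best = bic_score(data, current)
    improved = True
    while improved:
        improved = False
        for cand in neighbours(current):
            if is_acyclic(cand) and bic_score(data, cand) > best + 1e-9:
                current, best, improved = cand, bic_score(data, cand), True
                break                               # restart from the new structure
    return current, best

# Hypothetical toy data: B is a copy of A, C is independent of both.
rows = [{"A": a, "B": a, "C": c} for a in (0, 1) for c in (0, 1)] * 25
structure, score = hill_climb(rows, ["A", "B", "C"])
edges = {(p, child) for child, pa in structure.items() for p in pa}
print(edges)
```

On this toy data the penalty term keeps spurious edges to C out of the graph. With noisier real data a weaker penalty can admit extra edges, which is consistent with the manual thinning described above.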
Recalling equation (1), the JPD for the Misper-Bayes graphical model (Fig. 3) can be written as:

$$P(A, S, C, T, D, L) = P(A)\,P(S)\,P(C \mid S, A)\,P(T \mid A)\,P(D \mid T)\,P(L \mid D, C)$$

where:
$P(A)$ represents the probability of the different age groups.
$P(S)$ represents the probability of the sex types male and female.
$P(C \mid S, A)$ represents the conditional probability of the different categories, based on sex type and age group.
$P(T \mid A)$ represents the conditional probability of time missing, based on age group.
$P(D \mid T)$ represents the conditional probability of distance travelled, based on time missing.
$P(L \mid D, C)$ represents the conditional probability of the likely location, based on the different categories and the distance travelled.
In Fig. 3 the Sex(x) category values (Male/Female) do not total to 1 because for certain categories there is no distinction between gender types in the iFIND data (Table 1). This is the case for young children for whom gender is not significant.
The current model represents time missing as a discrete variable (Time(x)) and as such, the model does not provide a continuous representation of time. Section V.D alludes to misper cases for which time may prove significant and suggests alterations to the model to accommodate time as a continuum.

B. MISPER-BAYES IMPLEMENTATION AND INFERENCE
The Misper-Bayes model was implemented in Python using the pomegranate machine learning package [22], which provides an easy to use abstraction of BN modelling. Pomegranate uses a belief propagation algorithm to satisfy conditional probability queries, which gives exact marginals when the graph is a tree (i.e. has no loops), but only approximates the true marginals in cyclic (loopy) graphs. Under these circumstances the algorithm is referred to as loopy belief propagation.
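To illustrate why belief propagation is exact on a tree, the sketch below passes messages along a hypothetical three-node chain Age -> Time -> Dist and checks the result against a brute-force sum over the joint table. This is not pomegranate's implementation; the CPT values are assumptions chosen for illustration.

```python
from itertools import product

# Hypothetical CPTs for a chain Age -> Time -> Dist (binary states 0/1).
p_age = [0.6, 0.4]                        # P(Age)
p_time_given_age = [[0.8, 0.2],           # P(Time | Age=0)
                    [0.3, 0.7]]           # P(Time | Age=1)
p_dist_given_time = [[0.9, 0.1],          # P(Dist | Time=0)
                     [0.4, 0.6]]          # P(Dist | Time=1)

def forward(message, cpt):
    """Pass a message across one edge: sum_parent message[parent] * P(child | parent)."""
    n_child = len(cpt[0])
    return [sum(message[p] * cpt[p][c] for p in range(len(message)))
            for c in range(n_child)]

# Messages flow from the root towards the leaf, one edge at a time.
p_time = forward(p_age, p_time_given_age)
p_dist = forward(p_time, p_dist_given_time)

# Brute-force check: sum the full joint table over Age and Time.
brute = [0.0, 0.0]
for a, t, d in product(range(2), repeat=3):
    brute[d] += p_age[a] * p_time_given_age[a][t] * p_dist_given_time[t][d]

print(p_dist, brute)
```

Each message costs a number of operations quadratic in the domain size, whereas the brute-force sum grows exponentially with the number of nodes.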
Pomegranate provides a useful predict_proba() function, which uses loopy belief propagation in order to query probabilities for different parts of the graph and calculate marginals. For example (in the Python interpreter) a set of facts may be entered as a collection of dictionary entries to represent a realistic misper case. The beliefs map will provide probability predictions for likely Distance travelled and likely Location to be found. After some Python pretty printing, the output is as shown below.
Sex M Age T Cat.

The BN model was designed and implemented using data relating to UK misper cases. As such, the current implementation will have a bias towards UK cases, although it is felt that the model itself (Fig. 3) defines nodes and a graph structure that are sufficiently generic to apply internationally.
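The dictionary-style query described in this section can be mimicked without pomegranate by enumerating the factorized JPD directly. The sketch below is an illustration under stated assumptions: it does not use the pomegranate API, the Age -> Time edge is inferred from the text, and every state name and probability is hypothetical rather than taken from iFIND.

```python
from itertools import product

# Hypothetical binary domains for the six nodes of the model.
domains = {
    "Sex": ["M", "F"], "Age": ["teen", "adult"], "Cat": ["care", "depression"],
    "Time": ["<24h", ">24h"], "Dist": ["<5km", ">5km"], "Loc": ["home", "outside"],
}
# Parent sets following the factorization of the model (Age -> Time is assumed).
parents = {"Sex": (), "Age": (), "Cat": ("Sex", "Age"), "Time": ("Age",),
           "Dist": ("Time",), "Loc": ("Dist", "Cat")}
# P(first state | parent values); the second state receives the complement.
p_first = {
    "Sex": {(): 0.5},
    "Age": {(): 0.6},
    "Cat": {("M", "teen"): 0.8, ("M", "adult"): 0.3,
            ("F", "teen"): 0.7, ("F", "adult"): 0.2},
    "Time": {("teen",): 0.7, ("adult",): 0.5},
    "Dist": {("<24h",): 0.8, (">24h",): 0.4},
    "Loc": {("<5km", "care"): 0.3, ("<5km", "depression"): 0.6,
            (">5km", "care"): 0.1, (">5km", "depression"): 0.2},
}

def joint(assign):
    """Probability of one full assignment: product of the node factors."""
    p = 1.0
    for var, pa in parents.items():
        first = p_first[var][tuple(assign[q] for q in pa)]
        p *= first if assign[var] == domains[var][0] else 1.0 - first
    return p

def predict(evidence):
    """Return normalised marginals for every variable not fixed by the evidence."""
    names = list(domains)
    beliefs = {v: dict.fromkeys(domains[v], 0.0) for v in names if v not in evidence}
    for values in product(*(domains[v] for v in names)):
        assign = dict(zip(names, values))
        if any(assign[k] != v for k, v in evidence.items()):
            continue                      # row contradicts the evidence
        p = joint(assign)
        for v in beliefs:
            beliefs[v][assign[v]] += p
    for v, table in beliefs.items():
        z = sum(table.values())
        beliefs[v] = {k: x / z for k, x in table.items()}
    return beliefs

beliefs = predict({"Sex": "M", "Age": "teen", "Cat": "care"})
for var in ("Dist", "Loc"):
    print(var, {k: round(p, 3) for k, p in beliefs[var].items()})
```

Exhaustive enumeration is practical here only because the sketch has six binary variables; a library such as pomegranate is the sensible choice for tables of realistic size.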

A. MODEL EVALUATION
The Misper-Bayes model was evaluated using a series of queries with a set of misper cases. For each query, the results of the model were cross-checked against the results of the iFIND system. The result comparisons are summarized in Table 3, which shows the three most likely locations for each misper case (other locations with very low probabilities are omitted for brevity). The Misper-Bayes column has probabilities rounded to 2 decimal places. As can be seen, the majority of results give the same value as iFIND, and several results that are not the same are accurate to within ±1%. This indicates that the model converges to the results provided in iFIND. Details of how the iFIND results were rounded are unknown. In addition, in the majority of iFIND Location tables there is a final row termed 'Individual Cases' for which no numerical values are given. These 'Individual Cases' could explain the very slight variation between our own Misper-Bayes results and the results of iFIND.
Although data from iFIND was used in the development of the model, it is important to stress that the Location values for Misper-Bayes were imputed from the model, based on the conditional relationships of Fig. 3. Typically, a Police Officer using iFIND would need to perform a manual lookup to locate the appropriate table in order to determine the likely Location.
As shown in Table 3, the Misper-Bayes model (Fig. 3) provides a very similar set of results to those of iFIND. The model was examined further to see if it could be improved or made more computationally efficient. As mentioned previously in Section III, there are certain legal rearrangements of the joint probability distribution (JPD) table through which certain parameters may be marginalized out of a graph, and such rearrangements allow queries to be satisfied in linear time.
There is a large body of work concerned with the manipulation and transformation of probabilistic graphical models to improve structured learning, inference and information storage and representation. There are various techniques, rooted in graph theory, such as moralization [23], d-separation [17] and tree-decompositions [24]. Of particular interest to this work is the idea of a tree approximation to the original model and determining how good this approximation actually is, based on the use of cross-entropy. Further still, what is the optimum approximation and how well does it match the results of the original model?
As mentioned previously, tree structures are desirable in BNs because belief propagation is exact. A tree with $n$ vertices (and $n - 1$ edges) only requires $(d - 1) + d(d - 1)(n - 1)$ parameters, where $d$ is the domain size. The actual complexity of inference in a BN is governed by its tree-width [25], which measures how closely the network resembles a tree. The further examination of the model considered the Chow Liu theorem [26], which relies on Kullback-Leibler divergence (KL-divergence or cross entropy), which is expressed in (6).

$$D(P \,\|\, Q) = \sum_{x} P(x) \log \frac{P(x)}{Q(x)} \qquad (6)$$

The KL-divergence quantifies how much one probability distribution differs (or diverges) from another probability distribution ($\|$ is the divergence operator, indicating that $P$ deviates from $Q$). KL-divergence expresses the amount of information lost when $Q$ is used to approximate $P$, proof of which is provided in [16]. The Chow Liu theorem is stated below.
Lemma: For a JPD $P(X = x)$ and a tree structure $T$, the best approximation $Q(X = x)$ (i.e., the $Q(X = x)$ that minimizes $D(P \,\|\, Q)$) satisfies:

$$Q(x) = \prod_{i=1}^{n} P(x_i \mid x_{pa(i)})$$

where $x_{pa(i)}$ is the value of the parent of $X_i$ in $T$ (for the root, the factor is simply $P(x_i)$). Such $Q(x)$ is called the projection of $P(x)$ on $T$.

Theorem (Chow Liu):
For a JPD $P(x)$, the KL-divergence $D(P \,\|\, Q)$ is minimized by projecting $P(x)$ on a Maximum-Weight Spanning Tree (MST) over nodes in $X$, where the weight on the edge $(X_i, X_j)$ is defined by the mutual information measure:

$$I(X_i, X_j) = \sum_{x_i, x_j} P(x_i, x_j) \log \frac{P(x_i, x_j)}{P(x_i)\,P(x_j)}$$

Algorithm 2 (Chow Liu)
1. From the given distribution $P(x)$, compute the joint distribution $P(x_i, x_j)$ for all $i \neq j$.
2. Using the pairwise distributions from step 1, compute the mutual information for each pair of nodes and assign it as the weight to the corresponding edge.
3. Compute the maximum-weight spanning tree (MST):
3.1 Start from the empty tree over $n$ variables.
3.2 Insert the two largest-weight edges.
3.3 Find the next largest-weight edge and add it to the tree if no cycle is formed; otherwise, discard the edge and repeat this step.
3.4 Repeat step 3.3 until $n - 1$ edges have been selected (a tree is constructed).
4. Select an arbitrary root node, and direct the edges outwards from the root.
5. The tree approximation $Q(x)$ can be computed as a projection of $P(x)$ on the resulting directed tree (using the product form of $Q(x)$).

If the Chow Liu algorithm is applied to the previous Misper-Bayes model, the model is transformed into the polytree structure shown in Fig. 4.
Effectively, the algorithm produces an approximation, which is always a tree. It works by computing the weight $I(X_i, X_j)$ of each edge between nodes $X_i$, $X_j$ and finding the MST. When the algorithm is applied to the graph of Fig. 3, the MST eliminates the edge connecting nodes 'Dist.' and 'Loc.'.
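The weighting and spanning-tree steps of the Chow Liu procedure can be sketched as follows. This is a minimal illustration rather than the implementation used in this work: mutual information is estimated from sample counts, the spanning tree is built with a Kruskal-style union-find, and the dataset is a hypothetical noisy chain X0 -> X1 -> X2. Rooting the tree and projecting the distribution onto it are omitted.

```python
import math
from collections import Counter
from itertools import combinations

def mutual_information(data, i, j):
    """Empirical mutual information between columns i and j of the data."""
    n = len(data)
    pij = Counter((row[i], row[j]) for row in data)
    pi = Counter(row[i] for row in data)
    pj = Counter(row[j] for row in data)
    return sum((c / n) * math.log(c * n / (pi[a] * pj[b]))
               for (a, b), c in pij.items())

def chow_liu_edges(data, n_vars):
    """Return the n_vars - 1 undirected edges of a maximum-weight spanning tree."""
    weighted = sorted(((mutual_information(data, i, j), i, j)
                       for i, j in combinations(range(n_vars), 2)), reverse=True)
    root = list(range(n_vars))            # union-find forest for cycle detection
    def find(v):
        while root[v] != v:
            root[v] = root[root[v]]
            v = root[v]
        return v
    tree = []
    for _, i, j in weighted:              # heaviest edges first
        ri, rj = find(i), find(j)
        if ri != rj:                      # the edge joins two components: no cycle
            root[ri] = rj
            tree.append((i, j))
    return tree

# Hypothetical data: X1 copies X0 and X2 copies X1, each with a 10% flip rate.
data = []
for x0 in (0, 1):
    for flip1, n1 in ((0, 9), (1, 1)):
        for flip2, n2 in ((0, 9), (1, 1)):
            data += [(x0, x0 ^ flip1, x0 ^ flip1 ^ flip2)] * (n1 * n2)

tree = chow_liu_edges(data, 3)
print(sorted(tree))
```

On this data the indirect pair (X0, X2) carries the least mutual information, so it is the edge left out of the tree.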
To evaluate the revised polytree approximation, the same set of queries from Table 3 was executed. Numerically, the marginals were almost identical (barring some rounding errors).
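One way to put a number on how close two sets of marginals are is the KL-divergence discussed earlier. The sketch below is illustrative only: the two location distributions are hypothetical values, not results from Table 3, and the paper itself compares the marginals directly.

```python
import math

def kl_divergence(p, q):
    """D(P || Q) = sum_x P(x) * log(P(x) / Q(x)) for discrete distributions."""
    total = 0.0
    for x, px in p.items():
        if px == 0.0:
            continue                  # a zero-probability term contributes nothing
        qx = q.get(x, 0.0)
        if qx == 0.0:
            return math.inf           # P puts mass where Q has none
        total += px * math.log(px / qx)
    return total

# Hypothetical location marginals from an original model and a tree approximation.
original = {"outside": 0.60, "home": 0.25, "friend": 0.15}
polytree = {"outside": 0.58, "home": 0.26, "friend": 0.16}

divergence = kl_divergence(original, polytree)
print(f"D(original || polytree) = {divergence:.6f} nats")
```

A value close to zero indicates that little information is lost by using the approximation; the measure is not symmetric, so the order of the arguments matters.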

C. MODEL ASSESSMENT (QUANTITATIVE)
Given that the results of the queries were largely the same, the execution times for queries on the polytree model were then compared with those for the original acyclic graph. The execution times for the polytree model were only marginally lower than those of the original acyclic graph model. On average a query for the original graph model took 0.036 seconds, whereas the same query for the polytree model took 0.03 seconds, executed on an Intel Core i7, 16GB RAM (no GPU). As mentioned previously, pomegranate uses a loopy belief propagation algorithm for the implementation of the predict_proba() function, which is an inexact algorithm that converges to the exact solution on BNs which have a tree structure.
The additional time for the original graph model (0.006 seconds) is likely to be down to the time taken for convergence. By realistic BN standards both of the models are relatively small in terms of the number of nodes and edges. However, the models contain relatively large, sparse conditional probability tables (e.g. Loc(x) is 700 × 4 elements). The literature provides specialised algorithms for dealing with sparseness in BN, which restrict the search space based on heuristics that either bound the search space by limiting the degree of nodes within the network or by limiting the set of possible edges [27]. Many BN models typically consist of a large number of variables (e.g. medical applications may use several hundred variables) with small probability tables. In contrast, the model is based on a small number of variables with large probability tables.
Computationally, the latter type of model is beneficial for satisfying queries. For models with many variables (i.e. nodes), more effort is expended traversing the nodes as opposed to the values in a probability table associated with a given node. To improve query performance the model was transformed into a polytree using the Chow Liu algorithm, which is based on mutual information. The Chow Liu algorithm is known to reduce sparseness [27] but results showed only a minimal performance improvement, which suggests that sparseness in models with few variables yet large probability tables does not have a major impact on performance.
After examining the revised polytree model, it was concluded that the original acyclic graph is preferable as it is semantically richer. For example, the location of a missing person is likely to depend on how far a misper has travelled. If a misper has travelled a considerable distance, they may be less likely to frequent certain locations than they would be if they had stayed local.

D. MODEL ASSESSMENT (QUALITATIVE)
The model performs well for cases that fit the discrete categories and exhibit normal behavior within that category. Results of such cases compare very closely to those of iFIND. However, it is unknown how well the model performs for cases that are a combination of categories. For example, if an individual is categorized, equally, as suffering from ADHD, but with a history of schizophrenia then, manually, one could impute the outcome for each case and look for common ground between the outcomes to inform the search strategy. However, for many cases, one category will be more dominant than the other.
There was no data available to consider such cases because UK Police Forces record data based on the discrete categories considered in the paper. Hybrid cases are, at present, largely dealt with based on the domain knowledge of the officers involved in the investigation. However, to consider a sample of such cases the tables in the Python implementation were modified to allow values to be imputed for queries with two categories of equal (50:50) weighting. This did not require any alteration to the model itself (Fig. 3), only the implementation. Based on this revised implementation, queries for two categories could be issued as below. The results for a small sample of queries are shown in Table 4.
The Combined Category results are comparable with those imputed from real data in the Individual Category (Misper-Bayes) column. For the Combined Category results the four most likely locations for each case are shown. It is important to stress that the Combined Category results were not arrived at by simply taking an average of the Individual Category results; they were imputed from the tables of the Misper-Bayes BN model. These results cannot be compared against any tangible results used in Police cases; they serve only to show the versatility of the model, in that it is amenable to change to satisfy specific requirements.
One drawback with this approach is that it is difficult to estimate the balance between the two (or more) categories. Ratios of 50:50 were chosen, although the balance could equally be 60:40 or 30:70. Police officers alone would not be able to judge the balance; it would require input from a Psychologist, which adds strength to the argument that misper cases require input from many sources, not just Police. Typically, misper cases require a multi-agency approach with input from Police, Psychologists, Health-care professionals, family members and friends, to name but a few.
Inevitably, some misper cases prove difficult, particularly those in which the subject endeavors to stay missing. Under such circumstances the model will only provide a baseline from which search strategy decisions can be made. The model was chosen for its simplicity and accuracy in predicting location and the variables defined in the model are still valid, even for the more difficult cases. The success in dealing with the more difficult cases often depends on the response time of the Police officers involved and the thoroughness of their search of a particular location. These more difficult cases could be assisted by a model that accommodates changes that may occur over time. Such temporal effects exceed the scope of the current work, but this is something that will be pursued in future work.
In misper cases Police forces refer to the ''golden hour'', particularly for cases involving children. This is the time period in which the application of standardized approaches will, in theory, yield a desirable result. It is also the time period in which tangible evidence and information is abundant. As time progresses the case may become more complex and standardized approaches alone may not yield a desirable result and Police may have to think ''outside of the box''. Periodically, misper cases are reviewed to see if any further evidence has come to light to assist search strategies. The passage of time may lead to significant changes such as a misper failing to take vital medication, or falling short of money, food or accommodation. The scope of the current work was the development of a Bayesian Network model to impute likely locations for different categories of mispers. In future work, the model will be extended using Dynamic Bayesian Networks to incorporate the timeline of events associated with a misper investigation.

VI. CONCLUSION
This research has described the design and implementation of our Misper-Bayes model to assist Police forces in determining the whereabouts of a missing person. The work makes a novel contribution as it is the first computer-based solution to assist in actively dealing with misper cases.
Misper-Bayes provides a powerful tool, which can be used to good effect to whittle down the likely locations where the missing person may be found. The results of likely location queries on the Misper-Bayes model delivered very similar results to those of the iFIND system. The model was examined to see if a tree approximation provides a better alternative. It was concluded that a tree approximation is not needed, assuming the current implementation based on the Python Pomegranate package.
The strength of the model lies in its simplicity yet versatility. The model can accommodate some variation to the discrete categories through some changes to the Python implementation. When combined with a geospatial front-end (e.g. CASPER), the Misper-Bayes model can be used to very good effect to assist Police Officers with the prioritization of their search strategy. The approach demonstrated has scope to support evidence-based policing beyond that of missing person cases.
In addition to the development of the model, guidelines were provided that may prove useful for others faced with learning a BN model from data.