Bayesian Multiobjective Optimisation With Mixed Analytical and Black-Box Functions: Application to Tissue Engineering

Tissue engineering and regenerative medicine looks at improving or restoring biological tissue function in humans and animals. We consider optimising neotissue growth in a three-dimensional scaffold during dynamic perfusion bioreactor culture, in the context of bone tissue engineering. The goal is to choose design variables that optimise two conflicting objectives, first, maximising neotissue growth and, second, minimising operating cost. We make novel extensions to Bayesian multiobjective optimisation in the case of one analytical objective function and one black-box, i.e. simulation based and objective function. The analytical objective represents operating cost while the black-box neotissue growth objective comes from simulating a system of partial differential equations. The resulting multiobjective optimisation method determines the tradeoff between neotissue growth and operating cost. Our method exhibits better data efficiency than genetic algorithms, i.e. the most common approach in the literature, on both the tissue engineering example and standard test functions. The multiobjective optimisation method applies to real-world problems combining black-box models with easy-to-quantify objectives such as cost.

Abstract-Tissue engineering and regenerative medicine looks at improving or restoring biological tissue function in humans and animals. We consider optimising neotissue growth in a three-dimensional scaffold during dynamic perfusion bioreactor culture, in the context of bone tissue engineering. The goal is to choose design variables that optimise two conflicting objectives: (i) maximising neotissue growth and (ii) minimising operating cost. We make novel extensions to Bayesian multi-objective optimisation in the case of one analytical objective function and one black-box, i.e. simulation-based, objective function. The analytical objective represents operating cost while the black-box neotissue growth objective comes from simulating a system of partial differential equations. The resulting multiobjective optimisation method determines the trade-off in the variables between neotissue growth and operating cost. Our method outperforms the most common approach in literature, genetic algorithms, in terms of data efficiency, on both the tissue engineering example and standard test functions. The resulting method is highly applicable to real-world problems combining black-box models with easy-to-quantify objectives like cost.
Index Terms-Bayesian optimisation, black-box optimisation, multi-objective optimisation, tissue engineering M ODERN developments in obtaining patient-specific genetic characteristics [1] open up the possibilities of personalised healthcare, which is expected to greatly improve the quality and cost-effectiveness of healthcare, as it offers more personalised and targeted therapies [2]. Bone tissue engineering develops methods for healing, improving or replacing damaged bone tissue. Engineered skeletal tissue has potential, e.g. in treating bone fractures or regenerating after surgical bone removal [3].
The keys for successful in vivo or in vitro formation of bone tissue [4], [5] are (i) a vibrant cell population with predictable in vitro and in vivo behaviour, (ii) carriers and scaffolds for the cells to grow on, (iii) growth factors that control what cell types are created, (iv) proper mechanical conditions that do not put too much stress on the cells, (v) sufficient blood supply to carry oxygen to the cells and remove waste products. Ensuring that-or understanding when-all key elements are in place is a major difficulty in tissue engineering. Neotissue (cells and their extracellular matrix) generated in vitro has been shown to hold great potential for use as scaffold material in tissue engineering [6].
Bioreactors play an important role in clinical applications of tissue engineering because bioreactors provide controllable environments for large-scale production of cells or neotissue. Standardising bioreactor processes may reduce the cost of neotissue production, a necessary condition for the future commercial viability of tissue-engineered products [3], [7].
To address problems of reproducibility and quality control in in vitro experiments, in silico models [8]- [15] can be used, as they allow us to gain insight into the complex biology occurring during neotissue growth inside the bioreactor, and to predict bioreactor settings that optimise those processes. However, the in silico models often consist of solving systems of differential equations with implementations that may be specialised to specific hardware or software. We treat these in silico models as black boxes for the purpose of optimisation when we can simulate reactor conditions but cannot efficiently derive gradient or Hessian information. Relevant black-box optimisation methods include genetic algorithms [16], trustregion methods [17], and Bayesian optimisation [18]- [21]. Black-box methods give flexibility with respect to the blackbox contents. If there are no restrictions on the black box, the optimisation methods are equally relevant to in vitro experiments as legacy computer code.
It is often difficult to define what constitutes an optimal solution to a problem as there may be multiple, conflicting objectives. The goal of the optimisation procedure is to find a good trade-off between costs, e.g. cost of materials, time and negative side-effects, and rewards, e.g. successful neotissue growth. Multi-objective optimisation (MOO) methods explicitly find the trade-offs between conflicting objectives [22], [23]. This paper considers optimising design variables for neotissue growth on a three-dimensional scaffold in a bioreactor setting. The problem has two conflicting objectives: (i) maximising neotissue growth and (ii) minimising operating cost. The first objective function is an expensive-to-evaluate black box. The second objective function is analytical and cheap to evaluate. Combining analytical and black-box methods is an active area of research [24]. This manuscript introduces a method for finding solutions to the MOO problem using a novel extension of Bayesian MOO methods that exploits the mixed analytical and black-box natures of the objective functions. We compare this method to random search and a variety of genetic algorithms (the most common approach in literature to solving MOO problems), and demonstrate that our method performs better (in terms of the required number of function evaluations) for a collection of test problems as well as for the tissue engineering application. We show how the MOO results can guide tissue engineering practice.
A limited study on the acquisition function in Sec. II-A was previously published in a conference [25]. The contribution of this article is the study of additional acquisition functions, more significant experimental results, and discussions of the applications and implications of our method.

I. BACKGROUND
In the following section, we provide background on (A) the tissue engineering application and multi-objective problem, (B) Gaussian process regression and Bayesian optimisation, and (C) multi-objective optimisation.

A. Objective Functions: Tissue Engineering Application
We consider maximising the percentage of a scaffold filled with neotissue, i.e. the filling, while minimising the amount of material used by the bioreactor, i.e. the cost.
The filling objective function is a black box, i.e. we cannot write down an analytical, differentiable expression of the filling with respect to the inputs. We have an expensive-toevaluate in silico partial differential equations (PDE) model of neotissue growth on a three-dimensional scaffold in a bioreactor setting (see Fig. 1) [10]- [12].
The PDE model assumes curvature-driven neotissue growth kinetics [10] with local shear stress effects on the growth rate [12]. The scaffold geometry and the neotissue filling level determine the curvature of the neotissue surface. As illustrated in Fig. 1c, greater curvature yields a greater growth rate and zero curvature yields zero neotissue growth velocity [12]. The medium flow through the bioreactor subjects the neotissue to shear stress. Some shear stress encourages neotissue growth, but too much shear stress inhibits it. The Navier-Stokes equation models the flow in the scaffold void space. The neotissue, modelled as a homogeneous porous medium with a given permeability, exhibits Darcy, i.e. creeping, flow. The shear stress distribution on the neotissue interface and the flow of nutrients and waste products change as the scaffold neotissue filling level changes, which in turn affects the neotissue growth kinetics.
The PDE model implementation currently only runs on one machine. Parallel model evaluations are impossible due to hardware constraints. Legacy in silico models are difficult to maintain [26], especially in research settings where longterm code maintenance typically brings few benefits to the original developer. We could replicate the existing model, but we estimate six person-months for recreating and validating the new code. Instead, we repurpose the existing model.
The PDE model considers several design variables: • Refreshment period: length of time r between changing the bioreactor medium.
• Refreshment amount: percentage a of the medium changed every time r hours.
• Flow rate: rate s at which the medium is pumped through the bioreactor.    Fig. 1a. We fix the scaffold parameters, e.g. the scaffold shape and pore size [10]. We also have a reduced-order, cheap-to-evaluate in silico ordinary differential equations (ODE) model [27]. The reduced model only takes the refreshment period and refreshment amount as design variables, and assumes a fixed flow rate s. One model evaluation takes on the order of milliseconds. The reduced-order model will be used for testing our optimisation method and comparing its performance to competing methods. The full PDE and reduced-order ODE models show agreement  Data PDE model ODE model Figure 2. Experimental data and model predictions [27].
with each other and the experimental data (see Fig. 2). The cost objective function (the cost of the bioreactor medium) scales linearly with the amount of medium used. We assume that the medium has an associated cost c per ml, with V ≈ 10 ml of the medium in the system at any time, and that the cost of an experiment is given solely by the amount of medium used. The medium contains growth factors inherently present in the foetal bovine serum [28], which, with its cost of around $0.11/ml (Sigma Aldrich website, 8 March 2017), is the most expensive component.
However, foetal bovine serum is an animal product, and therefore problematic in the context of clinical applications. Chemically defined media are animal-free alternative growth media consisting of cocktails of various well-known (recombinant versions of) growth factors under well-defined concentrations, generally leading to a cost increase by at least one order of magnitude. Table II lists some of the required human growth factors [29], their approximate costs (Sigma Aldrich website, 8 March 2017) and approximations of the concentrations needed, based on growth factor concentrations studied in [30]- [33]. Often the growth factor concentrations used in in vitro experiments will be at least a factor of 2 higher. The scaffold shown in Fig. 1 is 6 mm in diameter and 6 mm in length [11], and one experiment can require more than 100 ml of medium. Thus, the cost of materials for growing a human bone graft of less than 1 cm 3 could be hundreds of dollars. This expense clearly motivates using optimisation to mitigate high costs.
The PDE and ODE models simulate 504-hour (21 days) experiments. Given a refreshment period r, the medium is changed a discrete number ( 504/r ) of times. The cost of one experiment is the sum of the initial cost V ·c and the cost to change the medium by the refreshment amount a. In total, c(r, a) = V · c · (1 + 504/r · a). For simplicity, the constant V · c is ignored, such that c(r, a) = 1 + 504/r · a ∈ [1,43].
We could create a single-objective optimisation problem by setting a monetary value on the neotissue produced in the bioreactor, and optimise the design variables by maximising the difference between the monetary reward of the neotissue and the production cost, e.g. as in growing red blood cells [34]. This would be equivalent to scalarisation, which is described later in this section. However, the commercial viability of bone tissue engineering depends on growing high-quality bone grafts. Therefore, we also want to maximise the filling of the scaffold. Hence, an MOO approach using a Pareto frontier is more appropriate.

B. Bayesian Optimisation and Gaussian Processes
We begin by considering the minimisation of a single objective function f . Bayesian optimisation is a global blackbox optimisation method [18]. It uses function evaluations to construct a probabilistic surrogate model of the objective function, and does not require the gradients of the objective function with respect to its input. By using a probabilistic surrogate model, we can compute the optimal choice for the next function evaluation point. This makes Bayesian optimisation more data efficient, i.e. it requires fewer function evaluations, than many other methods. It is therefore practical to use with objective functions that are expensive to evaluate. Bayesian optimisation has proven very useful, e.g. in robotics [35], sensor networks [36] and biology [37]. The typical surrogate model is a Gaussian process. A Gaussian process is a collection of random variables, any finite subset of which are jointly Gaussian distributed [38]. The random variables are the values of the objective function. We place a Gaussian process prior GP(µ(·), k(·, ·)) on the unknown function f , where µ(·) and k(·, ·) are the mean and covariance functions, respectively. The Gaussian process prior is fully specified by the choice of µ(·) and k(·, ·).
The Gaussian process prior implies that any finite number of function values f = [f (x 1 ), . . . , f (x n )] at points X = [x 1 , . . . , x n ] are jointly Gaussian distributed with the function value f (x * ) at a query point x * : Assume we have observed the function values f at locations X. Given training data f , X and an unseen test location x * , Gaussian process regression uses Bayes' theorem to compute the posterior predictive distribution noise terms, then the posterior predictive mean and variance are µ * = µ(x * ) + k * (K + σ 2 η I) −1 (y − µ) and σ 2 * = k(x * , x * ) − k * (K + σ 2 η I) −1 k * , respectively. We assume the black-box objective functions are smooth. A common choice of covariance function k for smooth latent functions is the radial basis function with automatic relevance detection (RBF-ARD) [38]: where ρ 2 is the signal variance and Λ = diag(λ 2 1 , . . . , λ 2 D ), x ∈ R D , are the length scales. The hyperparameters (ρ 2 , Λ, σ 2 η ) are commonly learnt by maximising the marginal likelihood p(y | X, ρ 2 , Λ, σ 2 η ). One limitation of Gaussian process models lies in retraining the hyperparameters after new observations, and inverting the matrix K (or K + σ 2 η I), e.g. by using its Cholesky decomposition [38]. Retraining the model scales as O(n 3 ). Only adding an additional observation without retraining the hyperparameters scales as O(n 2 ). Computing the Gaussian process posterior predictive distribution also requires several matrix multiplications and scales as O(n 2 ). Sec. IV will discuss how our proposed methodology depends on the tradeoff between evaluating the black-box function, evaluating the posterior distribution, and updating the Gaussian process surrogate model.
In Bayesian optimisation, the optimal next point to query (in the search for the minimum of the objective function) is computed by maximising an acquisition function [39]. The acquisition function has a built-in trade-off between exploration and exploitation, i.e. trying to find potential optima in unexplored regions versus exploiting already existing information about where a minimum might be.
Assume the conditional distribution f (x * )|y, X ∼ N (µ * , σ 2 * ) is given for a point x * and letμ = (y min −µ * )/σ * , where y min = min i y i is the best value of f observed so far. Common acquisition functions for single-objective Bayesian optimisation are confidence bounds [40], probability of improvement [18] where φ(·) is the zero-mean Gaussian probability density function with unit variance and Φ(·) the corresponding cumulative distribution function. Fig. 3 shows an example of the expected improvement acquisition function, where the posterior predictive distributions are computed given three observations of an unknown objective function.
We optimise the acquisition function through multi-start gradient descent: The acquisition function is evaluated at a large number, in our case 10 6 , of random initial locations. Out of the initial set of locations, the locations that generated the best acquisition function values form a smaller subset of locations that are chosen as starting locations for gradient descent. The objective function is evaluated at the location of the optimum found through optimising the acquisition function, after which we update the Gaussian process surrogate model with the new data point.

C. Multi-Objective Optimisation
MOO aims to compromise multiple conflicting objectives. Assume a D-dimensional input space and n f conflicting objective functions f i : R D → R, i = 1, . . . , n f , where we are interested in finding the optimal input arg min There are two different ways of finding an optimal trade-off between the objective functions [22]: Scalarisation, where the trade-off is made a priori, and the Pareto method, where the trade-off is made a posteriori.

1) Scalarisation:
The other method for dealing with multiple, conflicting objectives is to create a single, aggregated objective function f s . A common example is a weighted [42]. This function can then be minimised to find a trade-off solution to the original problem. Scalarisation is easy to employ. However, the choice of weight coefficients {ω i } introduces a new problem as they have to be chosen a priori, often with incomplete understanding of the system. Additionally, an optimal solution for a specific set of weights yields little or no information about other possible optimal solutions for different sets of weights.
2) Pareto method: This method aims to finds the frontier of Pareto-optimal points in objective space. Pareto-optimality is a state in which the value of one objective function cannot be improved without impairing the value of another [43]. The Pareto frontier is commonly approximated using the set of non-dominated 2 function observations (see Fig. 4a). Once an approximate Pareto frontier has been computed, the user decides which optimum, i.e. what trade-off, best suits the particular application.
Weighted sum scalarisation suffers from the limitation that it can only converge on trade-offs lying on convex 3 sections of the Pareto frontier (see Fig. 4b) [44]. If the entire Pareto frontier is concave, only the extreme points at the boundaries can be found using weighted sum scalarisation.
A problem with the Pareto method is the difficulty to visualise the Pareto frontier for more than three objective functions (n f > 3), which makes it difficult for the user to choose an optimal trade-off. An advantage of the Pareto method over scalarisation is that restrictions imposed on the range of one objective function immediately translate into 1 Also known as a linear combination, or convex combination if i ω i = 1, ∀ i ω i ≥ 0). 2 Given two observations y (1) , y (2) ∈ R n f , we say that y (1) is dominated by y (2) if for all j = 1, . . . , n f it holds that y (1) j ≥ y (2) j , and for at least one j it holds that y restrictions on the range of optimal values that can be achieved for the other objective(s). This paper uses the Pareto method.

II. METHOD
The neotissue growth application has two conflicting objectives: maximising filling, i.e. minimising negative filling, and minimising cost. We can compute the probability of an input x resulting in an observation y = (y 1 , y 2 ). The cost function f 1 (x) is deterministic, whereas the filling function is modelled by a Gaussian process. This means the joint probability is For notational convenience we drop the dependency on x of f 1 , µ 2 and σ 2 2 from here on. We also define the functionŝ , as differences between zero-mean, unit variance probability density and cumulative distribution functions, respectively. We assume that an approximated Pareto frontier P a = {r j } = {(r j,1 , r j,2 )} from n p non-dominated observations is given, and that P α is sorted, i.e. r j,1 < r j+1,1 and r j,2 > r j+1,2 for all j = 1, . . . , n p .
There are many different acquisition functions for Bayesian MOO, e.g. expected hyper-volume improvement (EHVI) [45], expected maximin improvement (EMmI) [46], probability of improvement and expected Euclidean improvement [47], predictive entropy search [48], and uncertainty reduction [49]. This section presents novel closed-form expressions, given one black-box and one analytical objective function, for the EHVI and EMmI acquisition functions. Similar expressions may be derived for other acquisition functions. Table III summarises some mathematical notation.

A. Expected Hyper-Volume Improvement (EHVI)
The EHVI [45] E HI (x) for input x is defined as: where the function I V (y, P α ) = Vol(P α ∪y)−Vol(P α ) is the Lebesgue volume improvement that a point y would yield to the area bounded by P α (see Fig. 5a), calculated with respect to some reference point. Note that I V (y, P α ) = 0 if y is dominated by any point in P α . For the two-dimensional case, Eq. (1) is the expected area improvement. We divide the region below P α into rectangles C ij defined by corners (r i,1 , r j,2 ) and (r i+1,1 , r j+1,2 ), as in Fig. 5b. Surrogate points r 0 and r np+1 act as rectangle corner points on the available objective space. The reference point for calculating Vol(·) becomes (r np+1,1 , r 0,2 ). Now the EHVI in Eq. (1) can be rewritten as a sum over all rectangles, i.e.: Since y ∈ C ij is in the region below P α , it will never be dominated by any point in P α , but will instead dominate existing observations. The set of non-dominated points in (P α ∪ y) is {r 0 , . . . , r i , y, r j+1 , . . . , r np+1 }. Now let V n = n k=1 (r np+1,1 − r k,1 )(r k−1,2 − r k,2 ) denote the area bounded by the first n observations in P α , with V 0 = 0, and V np = V np+1 = Vol(P α ). This way, Vol(P α ∪ y) becomes:  and we see that the area improvement I V (y, P α ) for y ∈ C ij can be rewritten as: In our application, one objective function is deterministic, hence p(y|x) = 0 for all y 1 = f 1 . Let h be the integer index such that 0 ≤ h ≤ n p and r h,1 < f 1 ≤ r h+1,1 , which means Eq. (2) simplifies to: Inserting the expression in Eq. (3) into Eq. (4) and integrating out y 2 yields the final expression: Eq. (5) is differentiable with respect to f 1 , µ 2 and σ 2 , so we can use gradient-based optimisation to maximise the EHVI.

B. Expected Maximin Improvement (EMmI)
The modified maximin fitness function [50], developed for MOO using genetic algorithms, is defined as: It is a measure of the distance from the point f (x) to the approximated Pareto frontier, and for best performance requires that the scales of the different objective functions are comparable. Fig. 6a illustrates the modified maximin improvement function. For Bayesian MOO, Svensson and Santner [46] propose using the (truncated) EMmI: as the acquisition function. They show that for the case of two objective functions (n f = 2), Eq. (7) is equivalent to: where the function I(x, i, j) is the integral: The integration limits, illustrated in Fig. 6b, are y up (i) = r h(i,j),k(i) − r j,i + y i and y low (i) = r j,k(i) − r j,i + y i , where k(1) = 2, k(2) = 1, h(1, j) = j − 1, h(2, j) = j + 1 and r np+1,1 = r 0,2 = ∞.
The probability of improvement may not encourage uniform Pareto frontier exploration since PI(x) does not consider the expected amount of improvement. The expected Euclidean improvement E Euc. (x) tries to overcome this [47]:     where r * = arg min r∈Pα [f 1 (x * ), µ 2 (x * )] − r 2 , with x * = arg max PI(x), and: For one analytical and one black-box objective function, Eq. (13) can be simplified usingŷ 1 (x) = f 1 /PI(x) and: For predictive entropy search [48], the acquisition function is in general intractable, even for single-objective optimisation. But similar expressions combining analytical and black-box objectives may be developed for other acquisition functions.

III. RESULTS
This section presents the results of MOO for the neotissue growth problem with both the ODE and the PDE model. It also introduces performance metrics enabling a comparison of the EHVI and EMmI acquisition functions and competing MOO methods, for a set of MOO test problems.

A. Results for the Reduced-Order ODE Model
The reduced-order ODE model is not guaranteed to mimic the full PDE model but it is useful for testing possible algorithms before moving to the expensive-to-evaluate model [27].
Recall that the objective is to maximise filling and minimise cost. We compute approximate Pareto frontiers using the EHVI (Fig. 7a) and EMmI (Fig. 7b) acquisition function, with 10 random initialisation points after which 25 additional points are chosen by the optimisation algorithm. Fig. 7 shows that the query points chosen by EHVI and EMmI favour different variable space regions. EHVI selects more points with higher filling and higher cost (flat part of the frontier), whereas EMmI selects more points with a medium filling and low cost (steeply increasing part of the frontier). EHVI chooses points along the flat part of the frontier because a small filling improvement may yield a large area improvement.
We compute a probabilistic Pareto frontier [51]- [53] by mapping a grid of variable points through the cost function and the filling function posterior distribution. We determine the non-dominated sampled point set by letting a point with cost f    2 . Fig. 8 shows an example evolution of a subregion of the probabilistic Pareto frontier. In the Fig. 8 example, the optimisation starts with 10 initial data points, and further points are chosen using the EMmI acquisition function. The probabilistic Pareto frontier approaches the true Pareto frontier. As the number of data points grow, the probabilistic frontier better approximates the true Pareto frontier compared to the approximation from observations alone.

B. Performance Metrics
To compare algorithm performance, we use three performance metrics. The first is the Euclidean generational distance (GD) [54], which measures the average Euclidean distance between points r i ∈ P α to the true Pareto frontier P true : The second performance metric is the maximum Pareto frontier error (MPFE) [54], a measure of the largest approximation error introduced by using P α to approximate P true , defined as the greatest Euclidean distance between any point in P α to the true Pareto frontier P true : The third metric is the volume ratio (VR) between (i) the area (or hypervolume) bounded by the approximated Pareto frontier P α and (ii) the area bounded by the true Pareto frontier P true .
The logarithm in Eq. (16) distinguishes the methods' performance for volume ratios close to 1, which frequently occur in our tests. Increasing the ratio Vol(P a )/Vol(P true ) from 99% and 99.9% is more significant than increasing it from 19% and 19.9%. Good optimisation method performance, in terms of producing a good approximate Pareto frontier P a , is indicated by small GD and MPFE metrics and large VR metric. We approximate the true Pareto frontier P true for each problem through exhaustive grid search.

C. Test Problems
We additionally consider five test problems, based on the following criteria: • There should be exactly two objective functions for each problem, and at least one of them has to be analytical.
• The optimisation problems should be unconstrained, but with finite (hyper-rectangular) search domains.
• The Pareto frontiers should be different: concave, convex, both concave and convex, and discontinuous. The first test problem is Fonzeca & Fleming's problem [55] , where the conflicting objective functions are:  3,3], which implies output bounds f 1 (x) ∈ [0, 9] and f 2 (x) ∈ [0, 25]. Schaffer's test problem has a convex Pareto frontier. The third test problem is Kursawe's problem [57]: and called S − and S + depending on the sign in front of the sine-term in f 2 . Problem S − has two convex sections with a concave section in-between, whereas S + has two concave sections with a convex section in-between. For both S − and S + , the two-dimensional inputs have ranges x 1 ∈ [0, 10] and x 2 ∈ [0, 2], which yields outputs with approximate domains 12]. Fig. 9 illustrates the Pareto frontiers for all test problems. The acquisition functions derived in Sec. II require one analytical and one black-box objective function. Therefore, for Bayesian MOO of each test problem, we assume that f 1 is an analytical function and f 2 is a black box.

D. Comparison to Competing Methods
This section computationally compares the EHVI and EMmI acquisition functions and competing MOO methods. The first competing method is random search, where we uniformly sample input values for which to evaluate the objective functions. The second competing method is genetic algorithms, commonly used for MOO in biomedical engineering, e.g. [58], [59]. We consider the algorithms: NSGA-II, NSGA-III, EpsMOEA, GDE3, SPEA2, MOEAD, OMOPSO, SMPSO, CMAES, IBEA, PAES, PESA2, as implemented in the Python package Platypus [60]. We use default parameter values for all genetic algorithms, with 10 outer divisions for NSGA-III and = [1, 1] for EpsMOEA and OMOPSO.
For each test problem, as well as for the optimisation of the ODE model for neotissue growth, the Bayesian algorithms are given 10 random initial observations after which the acquisition functions run for 50 iterations. The genetic algorithms and random search run for 60 function evaluations. Performance metrics are computed after each function evaluation. This is repeated 25 times, with new random seeds for all methods, after which the average performance metrics for each algorithm are computed. Table IV compares the average values (with standard error) of the performance metrics at initialisation (after 10 function evaluations) and after 10, 25 and 50 additional function evaluations. For each performance metric, the first two columns are the average performance metrics for EHVI and EMmI,  respectively. The third column for each performance metric is the best average value (with standard error) achieved by any of the competing genetic algorithms and random search.
We see from Table IV that, with the exception of some instances in the MPFE column, at least one (and often both) of the Bayesian methods perform better than all competing methods (genetic algorithms and random search). In Fig. 10, we display the learning curves for the tissue engineering application using the reduced-order ODE model. Learning curves for the test problems can be found in the supplementary material. The curves in Fig. 10 are typical for the difference between the different MOO algorithms, in that the Bayesian methods perform better than the competition for the GD and VR metrics, and similar to (or better than) the competing methods for the MPFE metric.

E. Scaling Up Test Problems to Higher Dimensions
This section compares the Bayesian MOO methods versus the genetic algorithms for test problems of higher dimension D. Fonzeca & Fleming's Eq. (17) and Kursawe's Eq. (18) are defined for D ≥ 2, so we compare using these test problems.
Consider D ∈ {5, . . . , 20, 25, 30}. The data needed for a "good" Pareto frontier approximation may increase with D, so we choose to begin each Bayesian MOO experiment with N 0 = 10 + D initial, uniformly-sampled data points and then    Defining P A a,j,i as the approximate Pareto frontier produced by algorithm A in experiment j using the N 0 +i first observations, we compute the average metrics of J = 100 experiments: with M ∈ {GD, MPFE, VR} defined in Eqs. (14)- (16). To present the results, we re-define the metrics as the difference between the average performances of algorithm A compared to random search R, summed over all iterations: Note: larger, positive values for the metrics in Eqs. (19)- (21) indicate better performance than random search.
As we scale up the dimensionality for Fonzeca & Fleming's test problem, performance deteriorates for all algorithms. The performance of the Bayesian methods and the genetic algorithms are equally poor, and effectively indistinguishable from that of the random search.
For Kursawe's test problem (Fig. 11), the Bayesian methods outperform both random search and the genetic algorithms on metrics GD A and VR A . Digging deeper into Kursawe's test problem, the Bayesian methods' performance stays relatively stable on GD A and VR A for large D, while random search and the genetic algorithms degrade with increasing dimension D. Metric MPFE A has large standard errors, i.e. the mean fluctuates a lot, making it difficult to derive conclusions about Kursawe's test problem from MPFE A .

F. Result for PDE Model
Sec. III-D shows no large performance difference between the EHVI and EMmI acquisition functions for the test problems. We expect the full PDE model to have a concave Pareto frontier resembling that of the reduced-order ODE model in Figs. 7 & 8. Our interpretation is that EMmI performs better for the reduced-order ODE model, and that it chooses to explore less in the region with a high operating cost. EMmI also out-performed EHVI for the concave Fonzeca & Fleming test problem, so we choose the EMmI acquisition function for optimising the full, expensive-to-evaluate PDE.
The full PDE model takes the refreshment period, refreshment amount and flow rate as design parameters. There are 8 data points from past model evaluations (used e.g. for validating the reduced-order model) that we use as our initial data set. The EMmI acquisition function chooses 14 additional points to evaluate, with computations running from 12 December 2015 until 19 February 2016. As mentioned in Sec. I-A, despite our best efforts, parallel model evaluations are not possible (the server runs out of memory and the computations crash). Fig. 12 shows the resulting approximated Pareto frontier. Our initial data set already contained a point (top left, roughly 4 units' cost and 80 % filling) on the Pareto frontier yielding a good trade-off between cost and scaffold filling. In the bottom left of Fig. 12, the surrogate model has also explored regions where it overestimated the expected filling at a low cost.

IV. DISCUSSION
Single-objective maximisation of the scaffold filling would lead to frequent medium replacement and therefore unrealistically high operating costs. Incorporating cost minimisation shows the trade-off between scaffold filling and cost of culture. Fig. 12 clearly indicates that there is a refreshment regime beyond which the amount of filling is hardly increasing whilst the cost is increasing dramatically. Based on these in silico results, new experiments should be set up investigating the culture regimes that are close to the optimal point in the top left of the Pareto frontier and compare them to (new or historic) experiments on other Pareto frontier regimes.
This manuscript develops novel extensions of acquisition functions EHVI and EMmI for Bayesian MOO of one blackbox and one analytical objective function. The Bayesian method outperforms twelve state-of-the-art genetic algorithms for the tissue engineering application as well as for a set of test problems. These results demonstrate successful exploitation of problems combining both black-box and analytical objective functions.
Our positive results follow from the limited number of available function evaluations. Our working assumption is that the limiting step is the computational cost of evaluating the black-box objective function, i.e. that the computational cost of training the Gaussian process surrogate model and maximising the acquisition function is negligible in comparison. When maximising the acquisition function, the major computational cost comes from the large number M of Gaussian process predictions. Maximising the acquisition function scales as O(M n 2 ), with n the number of training data points and N n, whereas training the surrogate model scales as O(n 3 ). Our working assumption is valid for the full, expensive-toevaluate PDE model which inspires this paper. However, for problems where training the surrogate model and/or evaluating the posterior predictive distribution is more expensive than evaluating the black-box function, genetic algorithms and random search will be able to explore the variable space much quicker (in terms of wall clock time) than Bayesian optimisation, giving them an advantage.
Gaussian processes typically do not scale well to higher dimensions due to the curse of dimensionality -the amount of training data needed grows exponentially with the number of dimensions. But Sec. III-E shows that the Bayesian methods scaled either (i) as well (or poorly) as or (ii) better than random search and genetic algorithms with default parameter settings for two test problems with dimensions up to D = 20.
The Pareto frontier of the neotissue growth problem with the reduced-order ODE model has both convex and concave sections, making optimisation through scalarisation, e.g. by using a weighted sum, unattractive for the full PDE model, since scalarisation may not be able to capture all optima. The Pareto frontier also yields an understanding of the interaction between the conflicting objectives, making an a posteriori trade-off decision preferable to an a priori one. We showed in Sec. III-A that constructing a probabilistic surrogate model of the expensive black box also enables us to construct a probabilistic Pareto frontier. This provides a measure of confidence for selecting optima in unexplored regions of the variable space. Other MOO methods, e.g. genetic algorithms and random search, do not provide any measure of confidence for selecting unexplored optima.
Combining expensive-to-evaluate black-box models with easy-to-quantify objectives like cost is common in the life sciences and engineering. The benefit of Gaussian processes is that they act as cheap simulators and allow us to treat black boxes almost like analytical functions, whilst providing us with confidence bounds. Note that the choice of Gaussian process prior is important: Most off-the-shelf covariance functions make explicit assumptions about the smoothness of the underlying function [38], but there exist methods to overcome these assumptions, e.g. [61].
Tissue engineering has come a long way in the past decade, and tissue-engineered products are finding clinical use [62]. However, scaling up production introduces new problems, many of which have in common that they require optimisation of conflicting black-box and easy-to-quantify, analytical objectives. The Bayesian MOO method presented in this paper puts no restrictions on the contents of the black box, which makes the method equally useful for problems involving in vitro experiments as legacy computer code.

V. CONCLUSIONS
We have demonstrated a method for exploiting one blackbox and one analytical objective for Bayesian MOO, where the black-box function is assumed to be expensive to evaluate. The Bayesian method outperforms competing MOO methods for a diverse set of test problems and for a tissue engineering application. Given the Gaussian process surrogate model that we construct of the black box during the Bayesian optimisation, we infer a probabilistic Pareto frontier that allows us to choose optimal solutions all across the variable space, with some measure of their uncertainty. The results show that the Bayesian method is highly applicable to real-world problems combining black-box models with easy-to-quantify objectives like cost. The acquisition functions have been implemented in python and made available on GitHub [63]. 1

APPENDIX
Performance metric learning curves for test problems.