Linearizing Bilinear Products of Shadow Prices and Dispatch Variables in Bilevel Problems for Optimal Power System Planning and Operations

This work presents a method for linearizing bilinear terms in the upper level of bilevel optimization problems when the bilinear terms are products of the primal and dual variables of the lower level. Bilinear terms of this form often appear in energy market optimization models where the dual variable represents the market price of energy and the primal variable represents a generator dispatch decision. Prior works have linearized such bilinear terms for specific problems. This work is the first to demonstrate how to linearize these terms in the most general case and the conditions required to perform the linearization for bilevel problems with integer or continuous variable in the upper level. The method is provided in an open source Julia module that allows researchers to write their bilevel programs in an intuitive fashion.


I. INTRODUCTION AND BACKGROUND
S INCE the restructuring of electricity markets began in the early 1980's [1] and the introduction of locational marginal pricing into large scale power markets in the 1990's researchers have been investigating electricity market design optimization problems. From a market participant point-of-view one of the most critical terms in a problem is the price signal (typically in $/MWh) from the market operator multiplied by the energy delivered (MWh) by the participant, which together represent the participant's income. When both the price signal and energy delivered are decision variables in a mathematical program then the problem becomes bilinear.
In many electricity markets the price signal to market participants (or generators) is determined as the marginal price of the load balance constraint at any given network node at any given time step. The objective of the market model is to minimize the total cost of energy, or as it is commonly known: maximizing the social welfare. The constraints of the model typically represent an approximation to the power flow equations and take participant cost functions as input.
Equilibrium models allow modeling both the electricity market and participant behavior by including the power flow constraints and multiple, competing objective functions. Bilevel or Stackelberg Game formulations are common in electricity market models that include participant objectives, which are typically to maximize profits. Ruiz et al. 2009 [2] is the earliest known example to demonstrate that bilinear terms for market price and participant dispatch can be linearized. Their model places the market participant in the upper level, which chooses its offer curve for energy generation, while the lower level models the electricity market given the other participants' offer curves. Fernandez-Blanco et al. 2016 [3] is the first to find a linearization for the same bilinear terms (products of lower level primal and dual variables) in the upper level of a bilevel program for revenue adequacy constraints. More recently, Naebi et al. 2020 [4] forms a bilevel problem to optimize the bidding strategy of a microgrid owner in a day ahead market. The upper level minimizes operating costs from the microgrid owner's perspective with the product of its exported power and the dual variable of the lower level, linear power flow load balance in its objective. (In other words, the microgrid operator knows its impact on the market price). The lower level minimizes the system operator's cost, including the payment to the microgrid owner, subject to linear power flow constraints. Xu et al. 2020 [5] proposes a bilevel model in which the upper level represents a coalition of PV system owners that can sell excess power to the grid or to other consumers. The upper level objective contains a bilinear product of the price to charge consumers and the level of excess PV production to be sold. The lower level objective is the sum of the PV owners' cost functions, which contain the benefit of selling excess PV and the cost of consuming grid power. The bilinear product of price and dispatch variables is linearized by setting the lower level primal objective equal to the dual objective. Additional problem specific examples of the linearization technique can be found in [6] and [7].
Each of the aforementioned examples presents problemspecific examples of how the bilinear products of shadow prices This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/ and dual variables can be linearized in bilevel problems using Strong Duality Theorem [8]. This paper presents a general algorithm for linearizing the bilinear terms of interest and determines the exact conditions under which the bilinear terms can be linearized in general bilevel problems. The algorithm is implemented in an open source Julia module for mathematical programming that allows researchers to write their bilevel problems in an intuitive fashion ( [9], which extends [10]). After showing the linearization algorithm we present some simple and complex use-case examples to demonstrate the value of the linearization method for power system planning research questions. In the complex use-case, with a power flow model, we show that the linearization method makes otherwise intractable problems solvable in a matter of minutes. Using the open source module other researchers can take advantage of the linearization method for any bilevel problem with bilinear products of shadow prices and dispatch variables of interest.

II. LINEARIZATION METHOD WITH INTEGER UPPER LEVEL VARIABLES
We begin by assuming that the upper level variables x are integer such that any product of integer x and the continuous variables y or λ can be made linear using binary expansion [11]. 1 The bilevel problem with bilinear terms in the upper level objective and a linear lower level is min Table I summarizes the terms in (1). Note that the method is also valid for bilinear terms of λ and y in the upper level constraints, but they are not shown for clarity.
The linearization algorithm is applicable when the upper level and/or the lower level problems are non-linear in constraints or objectives. However, the lower level constraints that include the lower level variables from the upper level bilinear terms must be linear to get an exact linearization of the upper level bilinear terms. Futhermore, we assume that the lower level problem is linear in its decision variables (given the upper level decisions) so that we can replace the lower level with its Karush Kuhn Tucker conditions to show single level problem equivalents to the non-linear bilevel problems of interest.
To linearize any λ j y n term one must combine the lower level primal and dual constraints. The dual formulation of the lower level problem is shown below for reference. max The first step is to multiply the lower level primal constraints (1e) by λ component-wise: where • denotes the Hadamard product. 2 Second, the dual constraints (2b) are multiplied with y as follows:

Set Eq Li
Note that any μ n y n can be linearized because of the upper bound The complementary slackness condition for (5) allows one to linearize μ n y n : μ n y n = μ n y n .
A similar result follows for any μ n y n . Combining the last result with the complementary slackness conditions gives: Equations (3) and (7) are then combined to produce a system of equations with the bilinear products of λ and y as the unknowns. In the following we show how to solve for a specific λ j V jn y n .
Let the i th row of (3) be defined as (P i ), which can be written: And, let the k th row of (7) be defined as (D k ), which can be written: (9) Note that the choice of P for (P i ) and D for (D k ) are intentional: P is for primal constraints and D is for dual constraints.
Algorithm 1 outlines the procedure for determining the minimum set of the (P i ) and (D k ) equations needed to linearize a given λ j y n term. Note that the algorithm refers to the indices of (P i ) as rows and (D k ) as columns because the sums over V jk in (8) and (9) are over the rows and columns of V respectively.
The first step of Algorithm 1 is to check if V jn is the only non-zero value in the n th column of V : in this case (D n ) provides the exact linearization of λ j y n (and (P i ) is unnecessary): y n λ j = 1 V jn c n y n + μ n y n − μ n y n + y n m∈M B mn x m (10) Note that (10) only applies under the condition that y n is in a single lower level primal constraint. Additionally, the bilinear products of y n and x m in (10) can be linearized since we are assuming that x is integer in this section.
In the second step of Algorithm 1 the first primal equation (P j ) is added to the set of row indices that will be returned at the end of the algorithm (where j is an input). Additionally, for all the non-zero values in the j th row of V , except V jn , the indices of the dual equations (D k ) are added to the set of column indices. In mathematical terms, this step is taking (P j ): (11) and all of the (D k ) equations for k ∈ N \ {n} in order to replace the bilinear terms of λ j and y k on the right-hand-side of (11). Each (D k ) equation can add more bilinear terms of λ and y and so step three of Algorithm 1 adds additional equations if necessary.
In the third and final step of Algorithm 1 a recursive function, Algorithm 2, is used to search the array V for non-zero, "connected" values. We use the term "connected" to indicate that one could draw horizontal and vertical paths through V to connect non-zero entries to the first entry of interest V jn , starting with a horizontal line each time. A horizontal line adds a (P i ) equation and a vertical line adds a (D k ) equation. The indices of the rows and columns are collected until a sufficient amount of equations are obtained to linearize the λ j y n term in the upper level objective.
Note that Algorithm 2 is similar to -but not the same asfinding the blocks of a block-diagonal matrix. The difference is that Algorithm 2 does not necessarily find all of the non-zero values in a block. In other words, one does not need all of the (P i ) and (D k ) equations that may be available; one only needs as many equations as unknowns (where the unknowns are products of λ and y entries). Algorithm 2 has some conditions under which it returns as error: these errors occur when the search has indicated that redundant row or column indices should be appended to the final vectors. Mathematically, these errors indicate that there are more unknowns than equations and thus the system of equations is underdetermined.
Let the indices of (P i ) and (D k ) returned from Algorithm 1 for a given (j, n) ∈ A pair be defined as J j and N n respectively.
The exact linearization of λ j y n is: which is simply a combination of (8) and (9) for all of the nonzero values of V connected to λ j y n , as demonstrated with the examples in Appendix A. Finally, using the result (12) the mixed integer linear form of (1) is: where the lower level problem has been replaced with the Karush Kuhn Tucker (KKT) conditions and the complementary constraints can be modeled as special order sets or using the "big M" method from [13]. Note that this section assumes that the x variables are integer and therefore the products of x m and y n or λ j can be linearized using binary expansion [11].
It is important to note that finding valid values for the "big M" can be difficult [14]. The open source package in which the linearization method presented in this paper is implemented includes the options to use big M (Fortuny-McCarl) constraints, special order sets, or indicator constraints to linearize the complementary conditions used to integrate the lower level problem into the upper level. The latter two methods do not require defining bounds for the dual variables, but may be more difficult to solve than the "big M" method.

III. LINEARIZATION METHOD WITH CONTINUOUS UPPER LEVEL VARIABLES
In Section II we assumed that x are integer such that all of the products of x m and y n or products of x m and λ j can be linearized using binary expansion [11]. Here we show the conditions under which a λ j y n term can be linearized when the upper level variables x are continuous.
The conditions are divided into two groups with one group less restrictive than the other. The first group of conditions is less restrictive but does not allow lower level variables y to be bilinear in both the upper level problem with λ and the lower objective with x. In mathematical terms this is when AB = Ø, where AB {(j, n) ∈ A : ∃m ∈ M such that B mn = 0}.
The second group of conditions allows a problem to be linearized when bilinear products of λ j y n are in the upper level and bilinear products of x m y n are in the lower level objective for a given n. These types of problems are particularly relevant to energy system market models, in which the upper and lower level bilinear products together represent a zero-sum game. Demonstrative examples are provided in Section IV, in which the upper level products of λ j y n represent payments to distributed generator owners and the lower level products of x m y n represent generator owner income.

A. Conditions When AB = Ø
Recall that the Algorithms 1 and 2 provide the sets J j for each λ j in the upper level objective. Let J ∪ j∈A J J j , which includes the indices of all the lower level constraints that are connected (via non-zero values of V ) to the λ j terms in the upper level objective. Therefore, in order to eliminate all bilinear terms of the form λ j U jm x m in (13a) the following condition must be met: Condition 1: U jm = 0 ∀j ∈ J ∪ , ∀m ∈ M Similar to Condition 1, let N ∪ n∈A N N n , then one could assume that to eliminate all bilinear terms of the form x m B mn y n from (13a). Under Conditions 1 and 2 the mixed integer result for (1) is The case when Condition 2 is violated and Problem (1) has bilinear terms in the upper and lower level objectives of the form λ j A jn y n and x m B mn y n , (where A jn = 0 and B mn = 0), for some n is particularly relevant to energy system market models. For example, take the case where A jn = 1 and B mn = −1 for some particular m, j, and n. Let y n represent a lower level generator dispatch decision. Then λ j represents the marginal cost of the dispatch decision y n as well as the upper level's cost of purchasing power from the lower level. And −x m y n in the lower level objective is the lower level's income for the generation y n using the price signal x m . Section IV provides an example of such a scenario. Thus it is useful to investigate the linearization of problems when AB = Ø (i.e. when Condition 2 is violated).
The problem of interest has the following structure: Note that the products of x and y in the lower level objective (15c) are linearized when the lower level problem is replaced with the KKT conditions. And the set of y n for all n ∈ N \ (A N ∪ N ∪ ) in the last sum of (15c) are the values of y that are not in the upper level objective nor connected to the y n , n ∈ A N , in the upper level objective. Recall that the connected indices are provided by Algorithm 1 and captured in N ∪ . We will show shortly that the connected y n values must not be in the lower level objective with x terms to prevent y n x m terms from showing up in the (D k ) equations needed to linearize the λ j y n in the upper level objective. Also, Condition 1 is reflected in (15f). Now, applying Condition 1 to (12) gives We wish to eliminate the y n B mn x m terms when AB = Ø.
Recall that the y n B mn x m terms in (12) and (16) come from the (D k ) equations with B mk = 0, and that the (D k ) equations for all k ∈ N ∪ are necessary to linearize the upper level λ j A jk y k terms.
Let us assume that a less restrictive version of Condition 2 holds: condition . B mn = 0 ∀m ∈ M, ∀n ∈ N ∪ \ A N Condition 2 implies that none of the lower level variables connected to the λ j A jn y n terms (provided by Algorithm 1) are in the lower level objective with y n B mn x m terms, except the lower level variables in the upper level objective (y n ∀n ∈ A N ). Applying Condition 2 to (16) gives: Let us also assume that Condition 3 holds: Condition 3 implies that the y n ∀n ∈ A N variables of interest are connected to each other via non-zero values of V . Section IV-B provides an example problem, in which the y n ∀n ∈ A N are indexed on time and connected to each other via another timeindexed variable in each equality constraint that is restricted to be no more than a certain value across all time.
Condition 3 allow us to rewrite (17) as Applying (9) to the last summation in (18) gives: This step is key to eliminating the bilinear y n B mn x m terms when AB = Ø. The next steps are to impose conditions that allow us to move the last summation in (19) to the left hand side to get a single sum of terms over the set A. Let us assume that Condition 4 holds: Condition 4 is equivalent to each y n for all n ∈ A N being in only one lower level constraint. Condition 4 implies that Note that Condition 4 requires that Step 1 of the Algorithm be skipped. The revised algorithm for Conditions 1, 2 , 3, and 4 is shown in Algorithm 3. Condition 4 allows us to write (19) as Rearranging (21) gives: c n y n + μ n y n − μ n y n , ∀(j, n) ∈ A. (22) Note that since (22) is valid for all (j, n) ∈ A it implies that N n are equal for all n ∈ A N and that J j are equal for all j ∈ A J . Lastly, we see that to replace (j,n)∈A λ j A jn y n with the last result for (j,n)∈A λ j V jn y n we must require that the two sums are equal to a proportionality constant p: With Condition 5 we can write (22) as: Substituting (23) To summarize all of the conditions under which (24) is valid: r Condition 1: U jm = 0 ∀j ∈ J ∪ , ∀m ∈ M r None of the connected constraints contain x terms.

C. A Note on Separable Lower Level Problems
It is important to note that the conditions required to linearize the bilinear products of shadow prices and primal variables can be applied to sub-matrices when the lower level problem is separable. For example, in a multi-follower Stackelberg game the lower level is likely to be separable, such as when modeling multiple distributed energy reosource (DER) owners or grid customers. In these cases Conditions 3 and 5 should be checked against the blocks of A and V corresponding to each sub-problem. An example of a separable lower level problem is provided in Section IV

IV. USE-CASE EXAMPLES
It is important to note that the use of the linearization algorithm is not limited to the problem types shown in these examples. Indeed, the conditions required for the linearization algorithm are met in each of the references mentioned in Section I. For example, other use cases include: r optimizing generator offer curves in an energy market [2]; r applying generator revenue constraints in a market clearing process [3]; r and optimal profit sharing in a community microgrid [6].

A. Simple Examples Without Time Indices
The first use-case example shows a step-by-step linearization process for a scenario in which the upper level model minimizes the total cost of power with the option to purchase power from the bulk system or from customer owned DER. The lower level can choose to meet its demand from the grid at some retail rate or invest in DER to lower its total cost.
y e ≥ y e ≥ 0 (μ e , μ e ) (25h) In example (25) the upper level (UL) can purchase power x 0 at the feeder head at the wholesale price c LMP and/or from the lower level (LL) at a price of the UL's choosing. The UL chooses x λ to set the LL's marginal cost of power λ when the LL chooses to export power y e ("e" for export). The LL considers buying power y i from grid at the retail rate c i ("i" for import) and/or purchasing the DER capacity y DER at the cost c DER to meet its demand d 1 . Constraint (25b) is the system load balance, which includes an uncontrollable demand d 2 (in practice this constraint is replaced with a power flow model). Constraint (25d) is the UL enforcement of no simultaneous export and import. 3 Constraint (25f) is the LL's load balance. The lower level dual variables, μ DER , μ e , and μ i , are show in parentheses.
Let y [y e , y i , y DER ] . The lower level has one equality constraint, making V = [−1 1 1]. In this case we wish to linearize the product λy e , making the indices of interest j = 1 for the first and only equality constraint in the lower level, and n = 1 because we put y e in the first index of y.
First, let us check the linearization conditions for this problem. Note that the set AB is not empty because we have a bilinear term in the upper and lower level objectives of the form λ j A jn y n and x m B mn y n . Thus, we must check the Conditions 1, 2 , 3, 4, and 5. To check the conditions we need the sets J ∪ , N ∪ , A and their sub-sets J j , N n , A N . With N = {1, 2, 3} for the three LL variables and J = {1} for the single LL constraint, we get: With only one pair of (j, n) in A we only need to call Algorithm 3 once with the values for V , j = 1, and n = 1. Algorithm 3 first initializes J 1 = {1} and defines the cols_to_check as {2, 3} because V 1,2 and V 1,3 are non-zero. The cols_to_check is copied to start the set N 1 and then the recursive search is started. The recursive search loops over the values in cols_to_check and calls Algorithm 2, appending the results to the sets J 1 and N 1 . In this case no new non-zero values are found (there is only one row in V ) and so Algorithm 2 returns the values that it was provided. Finally, Algorithm 3 returns the sets J 1 = {1} and N 1 = {2, 3}. Now, since we only have one pair of (j, n) in A the union sets J ∪ and N ∪ are equal to the sets J 1 and N 1 respectively. With the necessary sets defined we can use (23) to linearize the product of λ and y e in the UL objective: With this last result, we replace the lower level problem in (25) with its KKT conditions to get the mixed integer linear program: Example (25) (and its mixed-integer linear version (27)) is useful for showing how DER can benefit system operators. First, let us assume that the DER system has a relatively high x BkWh,j ≥ x SOC,j,t ∀j ∈ N B , ∀t ∈ T (28t) x SOC,j,0 = 0.5x BkWh,j ∀j ∈ N B (28u) x SOC,j,N t = 0.5x BkWh,j ∀j ∈ N B (28v)  [9,22,31,34,17]. Battery capacities are listed in order of nodes [2,7,24]. cost of 10 $/MW when compared to the other cost values, which are c LMP = 1 $/MW, c i = 1 $/MW, d 1 = 1 MW, and d 2 = 2 MW. In this case it is not in the LL's interest to buy DER and so y DER = 0 and the LL purchases all of its power d 1 = 1 from the grid. Also, the UL purchases all power from the bulk system at c LMP = 1 to meet the total demand d 1 + d 2 = 3 MW. Therefore, the UL's cost is $3 and the LL's cost is $1.
Now, let us assume that the DER system cost is equivalent to the other cost values at 1 $/MW. The LL can now meet its demand for equivalent costs from either the grid or from a DER system. However, it is in the UL's best interest for demand to be met by the LL's DER system because the UL can lower its total cost from $3 to $2 by paying the LL $2 for exporting excess DER power in to the grid to meet demand d 2 instead of meeting the total demand d 1 + d 2 from the bulk system. Therefore, the UL chooses x λ = 1 $/MW, which incentivizes the LL to purchase y DER = 3 MW. The LL meets its demand d 1 = 1 MW behind-the-meter and exports 2 MW, which meets demand d 2 = 2 MW (and x 0 = 0 MW). The LL's cost is $1 (the same as in the high DER cost scenario), but the UL reduces its cost from $3 to $2.
In summary, in this simple demonstration, only when the marginal cost of power from DER for the LL is less than (or equal to) retail rate will the LL purchase DER, which allows the UL to purchase excess power. When the LL can export excess power, and the UL can lower its total cost by purchasing DER exports, the UL will set the LL's marginal cost of power by choosing the minimum compensation rate to incentivize the LL to export the optimal amount of power that minimizes the total system cost of power. Note that in practice the decision variables are indexed on time; and, with solar PV as a DER option, there can be times when the LL has a zero marginal cost of power. Therefore, determining the DER solutions in practice are not as simple as comparing the cost coefficients.

B. Complex Example with Separable Lower Level Problem
In this planning example we have a distribution system planner in the upper level that is considering purchasing battery energy storage systems for installation at three different nodes in a distribution system in order to reduce its operating cost in a real-time energy market. The planner also accounts for purchasing exported PV power from customers and sending a time-of-use price signal to refrigerated warehouses with priceresponsive cooling systems. The upper level model is shown in (28). Tables III and IV summarize the variables, parameters and sets in (28). The objective (28a) includes three components to minimize: (1) the cost of energy purchased on the bulk market at the feeder head; (2) the cost of energy purchased from distributed, customer-owned photovoltaic (PV) systems; and (3) the capital costs of battery systems. We assume an analysis period of 20 years and a discount rate of 5%. For the bulk market price c LMP,t we use the average hourly real-time market prices from ERCOT over the year of 2019 [15]. A year of load is simulated at an hourly resolution by randomly assigning different U.S. Department of Energy Commercial Reference Building profiles to the load nodes [16]. The load nodes are defined in [17], from which we take the 38 node network model. Constraints (28c) -(28g) define a linear power flow model, commonly known as "LinDistFlow" [18]. Constraint (28h) limits the squared voltage magnitude. Constraints (28i) and (28j) define the net power injection from system operator owned battery systems. Constraints (28k) and (28l) define the net power injection from customer nodes with PV systems. Constraints (28m) and (28n) define the net power injection from nodes with price-responsive refrigerated warehouses. Constraints (28o) and (28p) define the net power injection from nodes with uncontrollable load. Constraint (28q) is structural and prevents simultaneous export and import rom nodes with PV systems. Constraints (28r) -(28v) define the operational limits of the system operator's battery systems. Finally, constraint (28w) says that the lower level decisions y must be optimal for the lower level problem (29) shown at the top of next page.
Problem (29) shows the lower level problem, with the objective to minimize the total cost of energy for all customers in the distribution system. Tables III and IV summarize the variables, parameters and sets in (29). The first half of the lower level objective (29a) represents the cost of energy for price responsive refrigerated warehouses that have a known retail rate c i,n,t and a time-varying price signal from the upper level problem x i,n,t . The second half of (29a) represents the cost of energy for customers that can install PV systems. These customers also pay the retail rate c i,n,t for imported power but can also recieve compensation for exported, excess PV power from the upper level via x e,n,t . Constraints (29b) and (29c) are the load balance constraints for the PV and warehouse customers respectively. Constraint (29d) limits the PV power used to meet load to the PV capacity times a known, normalized solar PV production y (x) = arg min y pwf t∈T n∈N W (y i,n,t [c i,n,t + x i,n,t ]) + pwf t∈T n∈N PV (c i,t y i,n,t − x e,n,t y e,n,t ) s.t. y i,n,t + y pvprod,n,t = d n,t + y e,n,t , (λ n,t ) ∀n ∈ N PV , ∀t ∈ T (29b) y i,n,t = d n,t + y HVAC,n,t /COP ∀n ∈ N W , ∀t ∈ T (29c) y pvprod,n,t ≤ y PV,n f PV,n,t ∀n ∈ N PV , ∀t ∈ T (29d) y T,n,t = y T,n,t−1 + f hr Ay T,n,t−1 + B [y HVAC,n,t , u n,t ] T ∀n ∈ N PV , ∀t ∈ T (29e) y T,n,0 = T n,t=0 ∀n ∈ N PV (29f) ; y e,n,t ≥ 0, y i,n,t ≥ 0, ∀n ∈ N PV , ∀t ∈ T (29h) factor from [19]. Constraints (29e) -(29g) define the refrigerated warehouse temperature dynamics, starting condition, and temperature limits. We assume that the warehouses are have freezing units that can at most reach zero • C but can be cooled to as low as −20 • C in order to lower their energy costs by coasting through high price periods. We assume that the distribution system is in Austin, Texas, which defines the PV production factor and the ambient temperature used as an input to the refrigerated warehouse models. By inspection the lower level model (29) is separable between the set of PV nodes N PV and the warehouse nodes N W . Because we wish to linearize the bilinear product λ n,t y e,n,t in (28a), which is only defined for the PV nodes, we only need to check the linearization conditions for the components of the lower level model (29) that are relevant to the PV nodes. A Julia module to programatically check the linearization conditons, including for separable problems like this example, is available in [20]. 4 The upper level is allowed to install battery systems at up to three nodes (2, 7, and 24) in the network while the lower level can install PV systems on up to five nodes (9, 17, 22, 31, and 34). Using the baseline values the optimal solution is for the lower level customers to install small PV systems to reduce their utility bills and it is not economical for the upper level to install storage systems. To produce more interesting results we increase the bulk energy costs by integer values from the baseline 1× to 5×. Table II summarizes the results with increasing bulk energy costs. In table II we can see some expected trends: as the bulk energy costs increase less energy is purchased on the bulk market and more PV energy is purchased from customers, which encourages larger PV systems. However, there are not clear trends in the battery sizes nor energy throughput. The lack of trends in the battery results is not surprising: batteries can serve many purposes including energy arbitrage, peak shaving, and grid services. In fact, in the 5× bulk cost scenario so much PV energy is exported that it is necessary to use the storage systems to keep the voltage within limits.
For more information regarding model results readers are encouraged to see the code available online [21]. The examples in this section are meant is demonstrative use-cases and only represent a fraction of the types of questions that might be answered using the linearization technique for power system planning. Furthermore, the price signal from the upper level to the lower level can also be used in transactive control context by removing capacity decisions, shortening the time horizon, increasing the time resolution, and using forecasts for the uncontrolled demand and PV production.

C. Solution Time Impact
Using the example from Section (IV-B) we compare solution times with and without the upper level bilinear terms λ t y e,t replaced with the linearization. In both cases the model is reformulated as a single level problem. Both the mixed integerlinear and the mixed integer-bilinear problems were solved using Gurobi 9.1 on 16-core 3.4 GHz Linux PC with 32 GB of RAM using "big M" constraints for the complementary constraints. The mixed integer-bilinear problem is (28) combined with the KKT conditions for (29). The mixed integer-linear problem is not shown due to space constraints, but is available in the public repository [21].
Both the linearized and bilinear problems have 350,405 binary variables and 2,417,777 continuous variables. The bilinear problem also has 43,800 bilinear objective terms. After 25 seconds in the presolve the linearized problem has 103,588 binary variables and 519,031 continuous variables. The linear problem solves in 128 seconds with a gap of 0.01%. After 21 seconds in the presolve the bilinear problem has 83,143 binary variables, 540,209 continuous variables, and 21,180 bilinear constraints. The bilinear model takes 8,447 seconds to get to a 57% gap, which is not improved until the operating system kills the problem at 16,032 seconds due to running out of memory. In short, the linearization method makes the otherwise intractable bilinear bilevel problem from Section (IV-B) solve in a few minutes. Similar results are expected for other large planning problems like the example in Section (IV-B).  V. CONCLUSION This work presents a method for linearizing bilinear products of lower level primal and dual variables in the upper level of bilevel optimization problems, and the conditions required for the linearization to be exact. The linearization method is especially relevant for modeling large scale energy distribution systems with many stakeholders and is therefore applicable to a growing number of problems as energy markets expand and adapt to new regulations such as FERC Order 2222 [22] and the increasing adoption of distributed energy resources [23]. By publishing this method we hope that more use cases will be discovered for the linearization technique.
For future work we intend to leverage the method in an open source mathematical programming package [9], [?]. for studying compensation mechanisms of distributed energy resources serving as power system upgrade deferrals (c.f. [17]). Another future research direction involves using the optimal price signals from one level to the other as a transactive control mechanism. We will also explore more accurate and complex power flow approximations such as the second-order cone approximation for the Branch Flow Model.

APPENDIX A EXAMPLES TO DEMONSTRATE THE ALGORITHMS
Example 1: The simplest case for linearizing a certain λ j y n term occurs when the y n in the upper level objective bilinear term is in only one lower level constraint. In this case, step 1 of Algorithm 1 returns and (10) provides the linearization of λ j y n . Note that (10) is a particular instance of (9). In this example we present the next simplest case, which is when y n is in more than one constraint but the other y variables in constraint j are in no other lower level constraints.
In Step 2 of Algorithm 1 the indices of (D k ) in the set {k ∈ N \ {n} : V jk = 0} are added to the set of column indices to check using the recursive Algorithm 2. And the set J j is initialized with {j}. Algorithm 2 then checks for non-zero values of V above and below each V jk entry for each of the column indices. If no non-zero values are found then Algorithm 2 returns the same sets that were passed to it, meaning that no more equation indices are needed to linearize the λ j y n term.
Take one particular k in N \ {n} for example. The special case that V ik = 0 ∀i ∈ J \ {j} is illustrated as follows: When y k is only in constraint j then (D k ) is: Equation (31) can then be substituted into (P j ), shown in (11), to eliminate the bilinear term of λ j and y k in the sum over k ∈ N \ {n}. A similar result follows for eliminating all of the λ j y k terms on the right hand side of (11). Example 2: Continuing from our previous example, now let us assume that the k -th column of V has one other non-zero entry. Now, additional combinations of the (P i ) and (D k ) equations are necessary to eliminate the λ j y k term in (11). This is where step three of Algorithm 1, which relies on Algorithm 2, comes in.
For this example let V i k = 0 for a particular i in J \ {j}, and let V ik = 0 ∀i ∈ J \ {j, i }. Also, let the i -th row of V contain one other non-zero value V i , and let V i = 0 ∀i ∈ J \ {i }. This case is illustrated in (32).
Algorithm 2 is passed row j and column k from step 3 of Algorithm 1. Algorithm 2 first finds all the row indices of non-zero values of V in column k (except V jk ) and checks that those rows have not already been added to the set of rows. (Recall that redundant rows or columns found by the Algorithms indicate an underdetermined system of equations). The set of rows in Algorithm 2 now contains i since V i k = 0. Finally, Algorithm 2 loops over each row found to check for non-zero values of V . If any values are found they are added to the columns set to search in another call to Algorithm 2 (hence the name "recursive_array_search"). In this case column is appended to the empty set of columns and so Algorithm 2 calls itself once with r = i, c = , rows = {i }, cols = { }, which finds no more non-zero entries in V . Thus Algorithm 2 returns rows = {i }, cols = { } to Algorithm 1, which appends the returned sets to J j and N n , making the final sets J j = {j, i } and N n = {k, }, Now (D k ) gives: Since the i -th row of V contains only one other non-zero value V i , and the other values in column of V are zero as illustrated in (32), then adding equations (P i ) and (D ) allows one to linearize the λ i V i k y k term in (33) in a similar fashion to the previous example.