Modelling the Interdependence of Multiple Electricity Markets in the Distribution System Aggregator Bidding

The volatility of market prices and the interdependence of multiple markets make it challenging for Distribution System Aggregators (DSAs) to model these prices. In this paper, a novel statistical model based on Gaussian process regression and mutual information screening technique is developed. This model is able to predict different market prices and quantify their uncertainty whilst incorporating the interdependence of different markets. The proposed model is employed to assist DSAs with market price modelling. Price scenarios for various markets generated by this model make it viable to formulate the optimal involvement of DSAs in multiple markets as a stochastic multi-step two-stage problem. Other than providing a set of scenarios that efficaciously model multiple electricity market prices, after the clearing of each market, the proposed model leverages market clearing results to improve the accuracy of price prediction of subsequent markets. Extensive simulation results on large price datasets demonstrate that the proposed methodology will result in a considerable increase in the profit of the DSA compared to state-of-the-art price prediction approaches.


Modelling the Interdependence of Multiple Electricity Markets in the Distribution System
Aggregator Bidding Mohammad Afkousi-Paqaleh , Graduate Student Member, IEEE, Mohammad Jafarian , and Andrew Keane , Senior Member, IEEE Abstract-The volatility of market prices and the interdependence of multiple markets make it challenging for Distribution System Aggregators (DSAs) to model these prices.In this paper, a novel statistical model based on Gaussian process regression and mutual information screening technique is developed.This model is able to predict different market prices and quantify their uncertainty whilst incorporating the interdependence of different markets.The proposed model is employed to assist DSAs with market price modelling.Price scenarios for various markets generated by this model make it viable to formulate the optimal involvement of DSAs in multiple markets as a stochastic multi-step two-stage problem.Other than providing a set of scenarios that efficaciously model multiple electricity market prices, after the clearing of each market, the proposed model leverages market clearing results to improve the accuracy of price prediction of subsequent markets.Extensive simulation results on large price datasets demonstrate that the proposed methodology will result in a considerable increase in the profit of the DSA compared to state-of-the-art price prediction approaches.
Index Terms-Bidding strategy, distribution system aggregator, electricity markets, Gaussian process regression, uncertainty modelling.

I. INTRODUCTION
A S MORE Distributed Energy Resources (DERs) are in- stalled in the distribution network and with the ongoing electrification of heating and transport sectors, the distribution system is increasingly encountered with more flexibility and control potentials.An underappreciated opportunity for the power systems pursuing to drive the energy transition lies in the identification of these potentials.These potentials could be aggregated by a Distribution System Aggregator (DSA) and offered to the electricity markets.DSAs should determine an optimal bidding strategy for participation in multiple energy markets to maximise their profit.Nevertheless, determining an optimal bidding strategy is difficult due to the need to account for the uncertainty of market prices and resources, the interdependence of different markets, and the network's operational constraints.

A. Literature Review
To tackle the bidding strategy challenges of an aggregator of small prosumers, [1] applies a two-stage stochastic optimisation technique to minimise the net cost of buying and selling energy.In [2], a model of the Day-Ahead (DA) bidding strategy for a load aggregator is presented, wherein Demand Response (DR) is utilised to mitigate the risk.A new distributionally robust optimisation is proposed in [3] to design a collaborative DA bidding strategy for a DSA of intermittent resources.It is shown that the collaborative bidding strategy yields a more significant financial gain than independent bidding.The authors in [4] propose a secure and effective bidding strategy and Real-Time (RT) operation of a DSA that incorporates distribution network constraints.However, the above studies, [1], [2], [3], [4], have failed to address the uncertainties of market prices.
A growing body of literature recognises the importance of factoring in the price uncertainties in the bidding strategy problem of the DSA.Integrating demand flexibility with DERs, [5] presents a bidding strategy optimisation model for DA market involvement, modelling uncertain parameters of DERs and market prices.In [6], a stochastic decision-making model of a DSA's bidding strategy for the DA market is developed, coordinating renewable resources, DR, and Electric Vehicles (EVs).Taking into account the price elasticity of the retail loads, [7] proposes a bidding strategy for the DSA of DR resources in the DA wholesale market.The bidding strategy of a small-scale microgrids aggregator for participation in the RT balancing market is introduced in [8]; their proposed model aims to achieve maximum returns while mitigating the effects of uncertainties.However, such studies, [5], [6], [7], [8], remain narrow in focus dealing only with one market.Some research studies investigated the involvement in multiple energy markets in an attempt to improve the DSA's profitability.In a study investigating the bidding strategy of a DR aggregator participating in DA and balancing markets, Vahid-Ghavidel et al. [9] report that the information-gap decision theory could be employed to gain a guaranteed predefined profit, taking into account consumers' and market prices' uncertainties.In [10], a model is provided to optimise an EV aggregator's DA and RT bidding strategy in the presence of price uncertainty.According to [11], the hybrid stochastic-interval approach could be employed to reflect the uncertainties associated with renewable resources and different market prices in the bidding strategy problem of hybrid power plants.In [12] a bidding strategy model of a DR aggregator participating in DA and RT markets is presented.The problem's uncertainties are represented by a set of scenarios based on historical data.Using satisficing theory principles, [13] models the customer behaviour for the portfolio design of a DR aggregator.The proposed model determines the optimal DR premium for participation in DA and RT markets.In all the studies reviewed so far, different markets are recognised as independent and the interdependency of markets is overlooked.In [14] a coordinated strategy for a DSA that aggregates wind power and compressed air energy storage to participate in DA, Intra-Day (ID), and balancing markets is presented.The correlation among prices of different markets is considered via a constant factor.

B. Contributions
To maximise their profits, DSAs are expected to engage in multiple electricity markets.In this study, a stochastic multi-step, two-stage bidding strategy problem is formulated and solved for DSAs participating in multiple markets.In this regard, to properly allocate their resources, DSAs should first identify and model the interdependencies of these markets as well as the price uncertainties.Although extensive research has been carried out on the bidding strategy problem of DSAs, no single study exists that models and integrates the interdependencies of multiple electricity markets into this problem.Merely considering different markets as independent is too simplistic.This assumption increases the risk of non-optimal allocation of resources in different markets, putting the DSA's profit at risk.Conversely, if the interdependence of markets is simplistically modelled by a constant correlation factor, there is the risk of under or over-estimating markets' dependence, thus diminishing the DSA's profit.
A framework based on Gaussian Process Regression (GPR) is proposed to model multiple electricity market prices.As GPR employs a kernel-based method, different data can be considered as explanatory parameters by specifying the respective kernel function.In addition, since GPR is a nonparametric Bayesian model, it is immune to data over-fitting, as is the case with linear models applied to non-linear data [15].These characteristics allow GPR to capture and model 1) the interdependence between different electricity markets and 2) the uncertainties of market prices.Contrarily, neural networks can model the former but face unresolved challenges to model the latter [16], and ARIMA can only model the latter [17].The GPR-based model complexity expands as the number of explanatory parameters increases.The Mutual Information Screening (MIS) technique is employed to limit the number of explanatory parameters to the proposed model by identifying the most relevant historical prices of all markets.In addition, a platform is developed to optimise the structure of the GPR by optimising and assigning different functions and parameters of the GPR.Finally, the trained model is implemented to generate price scenarios that incorporate the uncertainties and dependencies of various markets.These scenarios make it possible to optimise the bidding strategy problem of a price-taker DSA for participation in multiple markets.
In the bidding process, new data becomes available after each market clearing that could be exploited to increase the prediction accuracy for the upcoming markets.The DSA's bidding strategy problem is modelled as a multi-step two-stage problem to address this issue, wherein each step corresponds to a market.Following the clearing of each market, the resulting market prices are used to update the GPR-based model, thereby enhancing the accuracy of prediction for succeeding markets.
The results of solving the DSA bidding problem incorporating the multiple market price scenarios generated by the proposed GPR-based approach are compared to state-of-the-art price modelling approaches including, ARIMA, Lasso Estimated AutoRegressive (LEAR), and Long Short-Term Memory (LSTM) neural network.

A. Market Structure
The DSA's involvement in the energy markets in this research follows the market rules provided in [18].Nonetheless, the approaches outlined here could be applied to different markets with slight adjustments.In this section, only the DA and ID markets are modelled.However, the proposed approach could be easily expanded to encompass all existing markets.Fig. 1 schematically illustrates the structure of the bidding strategy problem and the mechanisms of the market.Three separate agents are illustrated: the Independent System Operator (ISO), the DSA, and the Distribution Network Operator (DNO).The ISO publishes historical market prices and releases market results following each market clearing.The DNO is responsible for ensuring that the electricity service is not imperilled while having limited monitoring infrastructure.
The DSA must submit its offers to tomorrow's DA market before 11:00.At this step, the DSA should have decided on its final DA bids and its vision of ID bids.After the DA market clearing, DSA leverages the DA market results to improve its prediction of ID prices.Finally, the DSA submits its bids for participation in the ID market.The DSA should strategically allocate its resources to different markets using a mathematical formulation that considers its resources' characteristics and price scenarios.Moreover, the DNO's provision of representative data enables DSA to build a set of equations with low computational cost to effectively construct constraints that reflect the bids' effect on the network's operational constraints.

B. Price Prediction Model
The DSA should strategise its offerings in the energy markets.To achieve this, DSA should first predict the DA and ID market prices using the ISO-supplied data.The price prediction could be accomplished using a set of scenarios that integrates the interdependencies between the two markets and represent the volatility of market prices.The proposed model for predicting prices is depicted in Fig. 2. A model based on GPR is developed to capture and integrate the interdependencies between markets to improve the accuracy of predictions.Owing to a kernel method, the GPR can model the interrelation dependency between different market prices as it accepts different types of explanatory parameters by configuring the kernel function of each parameter [19].Being able to capture the input-output mapping from empirical data and to quantify the uncertainties of the outcome, GPR has been applied for short-term power forecast of photovoltaic [20], capacity estimation of lithium-ion batteries [21], and probabilistic load forecasting [22].
GPR is a nonparametric Bayesian model where its complexity grows as data size inflates.Therefore, GPR-based model training becomes computationally intensive when all historical prices across all markets are considered.The MIS technique is applied to identify the most relevant historical prices across all markets regarding the price to be predicted to make the proposed model computationally efficient.The proposed approach consists of two phases.First, the model is trained using historical data of different markets that contain the most amount of information, identified by MIS.Next, the trained model is implemented to predict future prices.In the implementation phase, the same structure of explanatory parameters (with new data) is given to the trained GPR-based model to predict the desired market prices.
A separate GPR-based model is developed for predicting the price of each hour of the day (and hence in total, 24 models).The dependent parameter obtained from the GPR-based model for each hour is a Gaussian distribution of the price at that hour.These Gaussian distributions are combined by employing Cholesky decomposition to generate a multivariate PDF of the market prices, considering the correlations between different hours.Finally, this multivariate PDF is used to generate the 24-hour price scenarios.The dependence of electricity prices across markets is described as scenario trees.Therefore, each DA market price scenario will have several ID price offspring.In other words, the ID price scenarios represent prospective ID market prices if that DA market price materialises.A high number of scenarios are generated and later reduced to thoroughly model market prices.In the first step (before DA market closure), the GPR-based model is used to create price scenarios for both markets, while in the second step (after DA market clearing), the GPR-based model is updated with actual DA prices to improve the ID price prediction accuracy.
1) Mutual Information Screening Technique: Mutual Information (MI) is a measure of mutual dependency between two random parameters.Specifically, it reveals how much information can be gleaned for one random parameter from observing another.Using the MIS technique, historical prices of all markets containing the most information about the to-be-predicted market price are identified.The primary distinction between correlation and the MIS technique is that correlation measures linear dependence, whereas the MIS technique measures generic dependence (including non-linear reliance).Suppose (X, Y ) are two uncertain parameters (two market prices in this study) over the space X × Y, with the joint probability distribution function of Π (X,Y ) , and the marginal distributions of Π X and Π Y .The Authorized licensed use limited to the terms of the applicable license agreement with IEEE.Restrictions apply.

MI of the pair is:
where D KL (A B) is the Kullback-Leibler divergence, known as relative entropy, that measures the statistical distance of probability distribution A and reference probability distribution B. ⊗ indicates the tensor product.For discrete parameters, MI is defined as: MI could also be defined in terms of entropy.Denoting marginal entropies of X and Y by H(X) and H(Y ), and joint entropy by H(X, Y ), MI could equivalently be expressed as: The entropy of an uncertain parameter, H(X), is the average degree of unpredictability and uncertainty inherent in the parameter's potential outcomes and is expressed as follows: The joint entropy, H(X, Y ), quantifies the uncertainty when two uncertain parameters are evaluated jointly.
In other words, H(X) is the degree of uncertainty of X, while MI(X; Y ) is the degree to which knowing Y reduces X's uncertainty.A larger MI value signifies a greater reduction in uncertainty of X, which corresponds to Y being more dependable.The unit of MI in this study is Shannon (bit).
2) Gaussian Process Regression: The GPR model describes the relationship between multivariate explanatory parameters (historical prices of different markets) and a scalar-dependent parameter (market price to be predicted).GPR uses the Gaussian process and Bayesian inference to estimate the nonlinear and nonparametric model between explanatory parameters and a dependent parameter [15].Let train = {(X n , y n )|n = 1, . .., N} be the training data set, and test = {(X * m , y * m )|m = 1, . .., M} be the test data set, where X n is a vector, with the elements of (x 1 n , x 2 n , . .., x D n ) (N being the number of samples of explanatory parameters and D being the dimension of each sample of explanatory parameters) and y n a scalar dependent parameter for the nth sample of explanatory parameters.The explanatory parameters-dependent parameter relation model could be expressed as follows: where f (X) is a noise-free latent function and represents the noise with a Gaussian distribution N(0, σ 2 n ).Assuming X = [X 1 ; X 2 ; . ..; X N ] N ×D be the matrix of all samples of explanatory parameters, Y be a column vector of all dependent parameters, with the elements of (y 1 , y 2 , . .., y N ), and f (X) be a column vector of all latent function values of the training data, with the elements of (f (X 1 ), f(X 2 ), . .., f(X N )); for all the training data set it can be stated that: where ε denotes the noise vector with a Gaussian distribution N(0, σ 2 n I).The latent function f (X), is characterised by a multivariate Gaussian distribution: where K(X, X) = K N is the kernel matrix: The marginal distribution of Y, p(Y|X), given that dependent parameters are normalised to a zero mean, could be obtained: The primary objective of GPR is to infer the distributions of test data-dependent parameters, y * m , given a fresh vector of explanatory parameters, X * m .The joint distribution of the training and test data for the function values f (X n ) and f (X * m ) is given by: where K M is the kernel matrix of test samples, and K NM = K tr MN (tr denoting transpose) are the kernel matrix between training and test samples.The joint distribution of y n and y * m could be expressed as: The predicted distribution of dependent parameters in the test data set can be determined by: ) [19].For practical application, the parameter vector ϑ = [K N + σ 2 n I] −1 y n is calculated in the training phase.The number of parameters in ϑ corresponds to the number of samples in the training data (N ).Consequently, (15) could be rewritten as follows (incurring only O(NM) computing cost): To optimise the structure of the GPR-based model, a platform is developed which optimises the basis function, kernel function, kernel function's hyperparameters, kernel scale (when applicable), and sigma of GPR for each hour to improve performance.Depending on the characteristics of each hour, these parameters/functions are optimised or assigned.A function of each explanatory parameter could replace that parameter to better model the relationship between explanatory parameters and the dependent parameter.This function, known as the basis function, is affine, for which the coefficient and intercept are optimised.The kernel function is chosen from quadratic, exponential, squared exponential, Matern 5/2, Matern 3/2, or rational.The assigned kernel function's hyperparameters are optimised to achieve the best performance.Moreover, the platform searches among values within the range of explanatory parameters and values within the range of 0.0001 to 10 times the standard deviation of the dependent parameter to determine the kernel scale and sigma, respectively.Additionally, the explanatory parameters are normalised to a mean of zero and a standard deviation of one to eliminate the dependence on arbitrary scales in the explanatory parameters and enhance performance.The exact form of matrix representation used for the training phase will be used for the implementation (test) phase.The process mentioned above is done for each hour of the 24-hour ID market separately.

C. Problem Formulation
The bidding strategy for a DSA to partake in the DA and ID energy markets is modelled as a multi-step, two-stage stochastic optimisation problem aiming at profit maximisation.Each step represents a market.Prior to the closure of each market, the offerings in that market are considered first-stage variables, while the offerings in all future markets are considered second-stage variables.This problem is formulated as a Mixed Integer Linear Programming (MILP).Four types of resources for DSA are considered, including dispatchable DERs, wind farms, Battery Energy Storage (BES), and DRs.

1) First
Step Objective Function, Before DA Market Closure: Initially, the DSA should determine bids for participation in the DA market and visions of bids in the ID market.The objective function of the DSA is profit (Υ) maximisation, which is comprised of the total profits from electricity markets minus the operational expenses, could be written as follows: where Network Constraints Modelling: Untrammelled actions of DSAs could result in violation of network constraints and poor quality of service for customers.Therefore, DSAs should ensure that their bids do not violate network constraints.
where Ω is the vector of network variables, i.e., nodal voltages and line currents, indexed by i, and t is the hour index.The maximum and minimum permissible ranges of network variables are denoted by Ω Max i and Ω Min i , respectively.The proposed bidding strategy takes the network operational constraints into account using the approach provided in [4], where the DNO does not interfere with the DSA's decision-making and there is no need for constant negotiation and data sharing between the two entities.This approach matches the state-of-the-art, and futuristic operation practices [23].
Authorized licensed use limited to the terms of the applicable license agreement with IEEE.Restrictions apply.
As DSAs have no access to network data, network constraints cannot be handled using a traditional OPF formulation [4].Network sensitivity coefficients, which yield satisfactory results under different topologies and resource configurations [24], [25], [26], can be employed to measure the fluctuation in network variables (Ω i,t ), due to changes in power generation/consumption by each resource.The approach to sensitivity analysis described in [26] is utilised because it has been demonstrated to be suitable for unbalanced feeders.
The DNO executes a series of calculations summarised in [4] and produces a linear regression that describes the variables of interest with a linear model.In other words, the degree to which the network variable Ω i,t will change as a result of active and reactive power variation of resources is determined.Leveraging network variable estimates and sensitivity coefficients acquired from the DNO, DSAs can represent the magnitude of the network variables, Ω i,t , as a function of active and reactive power variation of resources.Consequently, these representations allow DSA to construct the set of network constraints, where each network variable Ω i,t is bound within its permissible range.In contrast to conventional power flow equations, DSA does not require access to the network admittance matrix and thus is not responsible for network modelling (the DNO provides the network variable estimates and sensitivities).
2) Second Step Objective Function, After DA Market Clearing: Once the DA market has been cleared, the DSA leverages the DA market prices to update the GPR-based model.The updated GPR-based model is then used to create ID price scenarios.Moreover, in this step, the DSA's objective function will change.The DA offers should be fulfilled, however, considering the simultaneous delivery of DA and ID market offers during the RT operation of the grid, only the DA offers as a whole value (P DA t ) are meaningful at this stage, and the exact utilisation of each resource in (19) and ( 20) is irrelevant and could change.Now, using all resources, the DSA should decide how to participate in the ID market, provided that the accepted DA offers are delivered.
where λ ID s,t and π s indicate the ID market prices and probability of scenario s obtained from the updated GPR-based model with actual DA prices.P ID t is determined using (23).
The ID offers are the accumulated utilisation of all resources minus the accepted DA offer.In other words, DSA may utilise the full capacity of its resources to participate in the ID market, provided that DA offers are delivered.The constraints provided in ( 25), ( 26), ( 30), ( 32), ( 34), ( 36), ( 38)- (41), and ( 21) curb the DSA's offer in the ID market.

A. Network Models
The IEEE 37 node [27] in Fig. 3, and IEEE 123-bus test feeder [28] are used as test systems.These systems operate at nominal voltages of 4.8 and 4.16 kV and include unbalanced loads.The daily load variations are modelled based on a representative of the Irish distribution system [29].The IEEE 37-node test system is upgraded with a dispatchable DER, four storage units, and a wind farm.In addition, it is presumed that all load buses (red-highlighted in Fig. 3) engage in DR programs (see subsection III-C).The energy production cost of the 500 KW DER, β g , is set at 60€ /MWh [30].Four BESs with capacities 150, 200, 100, and 80 kWh are located on buses 6, 10, 18, and 22, respectively [31].δ b Conv of all BESs is set at 25 per cent.The degradation cost of BES is modelled as 9€ /MWh-throughput, as determined by averaging the values presented in [32].Moreover, the initial and final SOCs are fixed at 50%, while the minimum and maximum SOCs are set at 10% and 90%, respectively.Characteristics of the wind farm, located at bus 7, are given in subsection III-B3.The statutory voltage limits are 0.95 and 1.05 p.u. [33].For comparative and reproducibility purposes, the IEEE 37 node test feeders' data and the optimization scripts in different cases for all approaches have been made available at the online repository [34].

B. Uncertainty Handling 1) Scenario Generation:
For each day of the test data, 12,500 scenarios are generated.Each scenario has 72 parameters; the first 24 are the DA prices, the second 24 are the ID prices, and the final 24 represent the hourly wind generation.These scenarios are modelled as scenario trees, which means that for each DA price scenario, there are multiple ID price scenarios, and for each combination of DA and ID prices, there are several wind scenarios.After the DA market clearing, to determine the ID offers, the scenarios will have 48 parameters reflecting the ID prices and wind generation.To generate price samples, first, the MIS technique is used to evaluate and identify the historical data that contains the highest amount of information regarding the next day's market prices.Fig. 5 depicts the amount of information included in the previous DA and ID prices regarding the ID prices.In Fig. 5(a), zero days ago means the DA prices of the same day after DA market clearing, and they have shown the highest degree of dependency.Similar to the findings of the correlation analysis, provided in Fig. 4, historical DA market prices contain more information regarding the expected ID prices than historical ID prices, confirming the need to model the interdependence of markets.As a threshold to restrict the number of explanatory parameters to the GPR-based, 0.3 b is used.Therefore, the  explanatory parameters to the GPR-based model consist of DA prices of all hours of the same day and all the preceding DA and ID prices that meet the threshold.
The GPR-based model is then trained using these prices (see II-B2).Next, employing the trained GPR-based model, hourly DA and ID price samples are generated, neglecting the correlations between hours.Cholesky decomposition is then used to incorporate correlations between various hours [33].
Fig. 6 displays the mean of predicted ID prices and the 95% confidence interval for ARIMA and GPR-based model for the first day of the test data.In this figure, "GPR with predicted DA prices" uses DA price predictions, whereas "GPR with actual DA prices" uses the real DA market prices following market clearing.The results of both GPR-based models show superior performance relative to the ARIMA.Moreover, when actual DA prices are used, the prediction accuracy dramatically increases, and the confidence interval shrinks.
3) Wind Generation Scenarios: The Beal Hill wind farm with a nominal power of 1.75 MW is modelled.The biascorrected reanalysis approach [36] is utilised to determine the daily power generation of the wind farm across the study horizon.The acquired wind power generation is regarded as the mean of a normal distribution, and a standard deviation of 2 per cent is used to reflect the volatility of the wind generation.
4) Scenario Reduction: Gams/Scenred2 [37] is used to reduce the sample size to 250, using a backward reduction approach.In the reduction phase, the probability of eliminated sample is added to the probability of the closest sample.

C. Demand Response
At each node and hour, it is estimated that 10% of the demand is willing to participate in the DR program.It is anticipated that customers will reduce their electricity usage in response to the incentive offered by DSA and that their demand will not shift to other hours.The self-elasticity of demand is set at −0.1 for all hours and buses, and consumers' electricity prices before and after partaking in the DR program are fixed at € 100/MWh [29].The cost function of the DR program is approximated by ten segments.The DR is assumed to represent a variety of resources, including domestic load control, small storage, EVs, and electric heat pumps.

IV. RESULTS AND DISCUSSION
In this section, the results of applying the proposed methodology are presented.Unless otherwise specified, all results pertain to the situation where network operational constraints are considered.In this regard, the results for a single day are first reported and discussed.The effect of considering network operational limitations is then evaluated.Next, the long-term analysis is provided, detailing the 201-day test data results.Finally, the implementation results of the proposed methodology on the IEEE 123-bus test system are given.All case studies are modelled and solved on a desktop computer with a 3-GHz i7 CPU and 16 GB of RAM using MATLAB R2020b and the MILP solver of Gurobi [38] in GAMS [37].

A. IEEE 37 Node Single-Day Results
As an example, the results of applying the proposed bidding strategy to the first day of test data are depicted in Fig. 7.This example illustrates how the proposed bidding strategy scheme allocates resources and determines optimal offerings in each market.These results reflect the DSA's bids following DA market clearing; thus, they are final.Fig. 7(a) shows the stochastic market prices derived using the GPR-based model.The final market offerings are depicted in Fig. 7(b).As noted before, the total resource utilisation for market participation should be a constant value across all scenarios, as shown in Fig. 7(b).This constraint is enforced by (19) for the DA market and (23) for the ID market.The DA bids result from solving (18), whereas the ID offers result from solving (22).Comparing the offers with hourly price data reveals that when DA prices are higher than ID prices, the resources are offered to the DA market and vice versa.Negative DA offers at hours 15 and 24 imply that the DSA purchases energy from the DA market to charge the BESs.The DSA also procures energy for BESs charging during other hours; however, since this energy is subtracted from other bids, it does not result in an overall negative value.This figure shows that the proposed approach incorporates the wind generation in the bids based on the model indicated in subsection III-B3.It depicts the adoption of a normal distribution with a standard deviation of a fixed per cent of the hourly mean generation to simulate the uncertainties associated with wind power generation.It is worth noting that wind spillage is zero in most scenarios and hours; however, there are a few scenarios in which wind spillage occurs at certain hours.The average utilisation of BESs for market participation is shown in Fig. 7(d).In the early hours of the day, when energy prices are relatively low, BESs are charged with energy bought from both the DA and ID markets.As electricity prices rise in the ensuing hours, the stored energy is sold back to the market, as indicated by a fall in the average SOC of BESs.Then, at hours 14-17, the BESs are recharged to be able to provide energy to the market at hours 18-20.In the last four hours, the average SOC of BESs increases as they are charged to satisfy the limit presented in (41).This figure demonstrates that the solution algorithm, across all scenarios, satisfies all constraints for the BES, including SOC Min s,t =0.1, SOC Max s,t =0.9, and SOC b s,t=24 =0.5.Fig. 7(e) displays the deployment of DER at various hours and scenarios.Since the DER has a fixed generating cost, it is beneficial to engage in the market when market prices are greater than β g , and it is financially sensible to remain inactive when market prices are lower than β g .During hours 10-11, for instance, market prices in all scenarios are higher than β g ; hence, the DER produces at its highest rate in all scenarios.However, when the difference between market prices and β g is marginal (e.g., hours 7-8 and 17), DER is deployed in scenarios that exhibit DA or ID market prices above β g .Fig. 7(f) exhibits the accumulated utilisation of DR.As its application cost quadratically increases, DR is exploited moderately when market prices are relatively low.Alternatively, DR is implemented at a high level when market prices are high and can compensate for the expenses of the DR application.

B. Effect of Network Operational Constraints
Considering the sensitivities and magnitude of network variables provided by the DNO, the DSA should allocate its resources in a way that does not compromise the network's safety.Fig. 8 exhibits the three-phase node voltages of the network with and without voltage constraints.This figure represents the results for one hour (t = 9) across all scenarios, buses, and network phases.The marginal distribution of voltages shows the number of scenarios for each voltage level.According to Fig. 8(a), the voltages exceed the safe limit when no voltage limitations are imposed.As revealed by the marginal distribution of voltages, approximately 5 per cent of all nodal voltages exceed this constraint.In contrast, Fig. 8(b) shows the constrained network voltages, demonstrating that when the voltage limitations are applied, the proposed bidding strategy manages the resources in such a way that the network voltages across all scenarios, buses and phases are under the maximum limit.The marginal distribution of voltages above the permissible range is thus zero.

C. IEEE 37 Node Long-Term Analysis
Five case studies (CSs) are defined and compared to thoroughly investigate the significance of the uncertainty and dependency modelling of different markets.CS1 represents the proposed methodology in which the DSA participate in both markets, taking into account the uncertainties and interdependencies.CS2 is similar to CS1 except that ARIMA is utilised instead of the GPR-based model for price prediction.Thus market uncertainties are modelled, but interdependence is neglected in this case.The distinction between CS3 and CS1 is that in CS3, the DSA solely engages in the DA market.Similar to CS3, the DSA in CS4 only participates in one market; however, in this case, it is the ID market.CS3 and CS4 are defined to demonstrate how participation in only one market might vastly impact the DSA's revenue.The last case, CS5, describes a situation in which a deterministic approach is used, and uncertainties associated with the bidding strategy problem are disregarded.In this case, the average of market prices and wind scenarios of CS1 are given to the proposed methodology as a single scenario with the probability of 1.
Fig. 9 compares the profit of the DSA in these five cases and the perfect prediction case.The middle dot for each case represents the median value.The perfect prediction case represents the best possible solution with 100% accurate price predictions.The values depicted in this figure for the DSA's average daily profit are derived from solving (18).However, these values represent the actual profits of the DSA, which are calculated by substituting the DA and ID price scenarios with their actual values.On certain days in some cases, the profit of the DSA is negative, indicating that the price prediction was inaccurate and that the DSA incurred a loss.The average profit of the DSA in the perfect prediction case (the highest theoretically achievable profit) is € 1,957.5/day.This value is € 1,809.1 per day for the proposed methodology reported in CS1.On average, the DSA could make € 1,737.9 per day if ARIMA is used to predict both DA and ID prices (CS2).It is apparent from this figure that participating in only one market immensely reduces the profit as the average daily profit of the DSA is € 1,526.The results demonstrate that modelling the uncertainties and interdependencies of markets significantly increases the DSA's profit.The observed 4.1% increase in the DSA's profit when the GPR-based model is used as opposed to ARIMA will equal € 14,311 over 201 days, which is considerable.The most interesting aspect of the results is that despite using a deterministic approach based on the mean of the price scenarios of CS1 in CS5 Authorized licensed use limited to the terms of the applicable license agreement with IEEE.Restrictions apply.(interdependence modelled, uncertainty neglected), the DSA makes more profit than when a stochastic approach based on ARIMA is used in CS2 (uncertainty modelled, interdependence neglected), confirming that modelling interdependence could be more important than modelling uncertainty.

D. IEEE 123-Bus Test Feeder
The proposed methodology is also implemented on the IEEE 123-bus test feeder.All data regarding this test feeder and the DERs' locations are obtained from [28].However, the DERs are quadrupled in size.All other case data are the same as the previous system.Table II summarizes the results of the case studies outlined in Table I for this test system.The network operational variables are determined by employing the approach provided in [4] to ensure they are within the permissible range for this highly unbalanced feeder with overhead lines and underground cables.
In Table II, the DSA's profit in each market is given.The obtained results for this test feeder align with the results of the IEEE 37-node test system and corroborate the superiority of the proposed methodology.When markets' interdependence is modelled (CS1), the average daily profit of the DSA rises by 4.3% as opposed to when it is neglected (CS2).Comparing the average ID profit of the DSA in CS1 and CS2 with the perfect prediction case demonstrates that the proposed GPR-based methodology makes better decisions than ARIMA in the ID markets by leveraging the DA market clearing results.Moreover, the lack of diversity in DSA's resources in this test feeder (only dispatchable DERs) makes the profit of the DSA more dependent on the interdependence and uncertainty modelling of markets.

E. Comparison to State-of-The-Art Approaches
This subsection compares the bidding strategy results obtained by implementing the proposed GPR-based approach with state-of-the-art price prediction approaches, including ARIMA, LEAR and LSTM neural networks.The same structure of input data, as proposed in [39] is utilised for ARIMA.The LEAR approach is discussed in [40].The LSTM parameters including the optimization algorithm, the number of epochs, gradient threshold, initial learning rate, and minimum batch size of LSTM are defined based on the values reported in [41].The same structure of input data as those fed to GPR (the previous prices that have the highest amount of information identified by the MIS approach) is used in LSTM.
Table III presents the values for the DSA's profit for each test system with actual market prices.The results for both test systems demonstrate the efficacy of the proposed approach in better modelling the multiple market prices and thus improving the average daily profit of the DSA.Moreover, considering that the DA and ID price predictions are the same in these two test systems, the differences in the obtained solution to the bidding strategy problem are not only related to the efficacy of the price prediction approach but also related to the mix of resources available in each test system.Nonetheless, the proposed model shows superior performance in both cases.

V. CONCLUSION
DSAs should accurately model the market uncertainties and dependencies to engage in different energy markets.This work proposed a novel methodology based on the MIS technique and GPR to capture the uncertainties and dependence of different market prices.The DSA uses this data-driven uncertainty and interdependence handling model to determine resource allocation in multiple markets.Such resource allocation follows the relative price variation of markets, indicating the overall performance of the proposed methodology.The proposed bidding strategy was tested in various operating scenarios (201 days).The longterm analysis results indicate that the revenue will reduce if the DSA participates in only one market or fails to model the interdependencies between markets.The comparison of the bidding strategy results obtained utilising the proposed GPRbased approach with the state-of-the-art approaches unequivocally indicates the efficacy of the proposed markets-dependence modelling methodology.
It is envisaged that new markets of energy and services will be incorporated into the power system.It provides an opportunity for DSAs to increase their profits which could facilitate the integration of more DERs into the distribution network.This in turn defers or eliminates the need for distribution network upgrade investments, thereby enhancing social welfare.However, to exploit the multiple-market participation opportunity, the uncertainties and interdependence of these markets should be captured and factored into the bidding strategy problem as demonstrated by the obtained results.These points were showcased and discussed at length for DA and ID markets in this paper.Moreover, as balancing markets are growing in importance and trading volumes across the world, the results of the application of the proposed method in these markets will be considered in future work.

APPENDIX
A. Resource Modelling 1) Wind Modelling: The total wind power generation is considered a stochastic parameter.Therefore, in each scenario, the upper bound is wind power available (P w,M ax s,t ) in that scenario as indicated by (24) and (25).Moreover, since the price could be negative at some hours, spillage (P w,Spill ) is introduced into the model to give the DSA the option of not providing its wind generation to the market.The spillage (P w,Spill s,t ) should be less than the maximum available wind power at each scenario and each period as imposed by (26) where Inc d t (€ /MWh) is the incentive offered to the customers for each MWh reduction in their typical demand pattern.L d t and L d,DR t are the demands (MW) of dth DR resource before and after DR programs, respectively.DR expenses are modelled using the concept of price elasticity of demand and linearised as proposed in [29].The DR provision in DA and ID markets is bounded by the upper capacity of DR resources (P In the proposed DR program, the DSA is only concerned about the aggregate demand reduction of customers when they are asked to provide DR, thus, there is no need for detailed behindthe-meter data sharing.The DR participation in this paper is in line with the approach provided in [42] to preserve the privacy of the customers. 3) DER Modelling: DERs are deemed dispatchable generators.Equations ( 31) and (32) restrict the generation of DERs; while the former calculates the total offered generation of DERs to both markets, the latter caps the DERs' generation by their maximum limits (P g,M ax ).The throughput energy of each BES at each scenario and time step is calculated using (37).The converter capacity, P b,M ax , bounds the total power provision or demand of each BES in all markets as indicated in ( 33)- (36).The converter capacity, which limits the maximum power transfer between the BES and the network at any moment, is modelled as a portion, δ b Conv , of the BES energy capacity, E b,M ax , as expressed in (38).The SOC of each BES at each time step of each scenario is calculated using (39) incorporating charging, η b ch,Conv , and discharging, η b dc,Conv efficiencies.Moreover, in all times and scenarios, the SOC of BESs should be within their acceptable range (40).It should be noted that L b s,t and P b s,t are modelled on the network side of the connection.Considering the converter's efficiencies, charging and discharging at the same hour in the same market is not profitable.Therefore, the model does not need to enforce such a constraint.To this end, only charging and discharging at the same hour in different markets should be modelled.Therefore, using this concept, the number of binary variables, i.e., U , to prevent the charging and discharging at the same hour is halved using the following equations:

ACKNOWLEDGMENT
The opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the Science Foundation Ireland.

Fig. 1 .
Fig. 1.Market structure and bidding strategy mechanism considering network constraints.

Fig. 4 .
Fig. 4. Correlation matrix of DA and ID prices.Diagonal subplots depict the histogram of the distribution of market prices, and off-diagonal subplots depict a scatter plot of a pair of market prices, while the numbers show Pearson's correlation coefficients.

2 )
Price Scenarios: The historical DA and ID market prices are obtained from the Irish SEM Operator [35].The price data from 1st October 2018 through 29th June 2021 are used.Of 1003 days, the first 802 days are used for training the model, and the remaining 201 days are considered test data.Each test day is analysed separately.It is worth noting that the Irish electricity market is highly volatile, and the price is negative in the early hours of some days due to excess wind generation.This volatility makes price prediction more challenging and complex, accentuating the necessity for price uncertainty modelling.The correlations of each day's DA (λ DA d ) and ID (λ ID d ) electricity market prices respecting each other and also the previous day's DA (λ DA d−1 ) and ID (λ ID d−1 ) market prices are illustrated in Fig. 4. It is interesting to note that the highest value of Pearson's linear correlation coefficients belongs to the correlation between λ DA d and λ ID d .This higher value of correlation shows that more information could be gleaned from the DA market clearing results regarding the ID market prices than the ID market prices of the previous day.

Fig. 5 .
Fig. 5. MI of ID prices with historical DA and ID prices.

Fig. 7 c
Fig. 7 c illustrates wind power generation at various hours.This figure shows that the proposed approach incorporates the wind generation in the bids based on the model indicated in subsection III-B3.It depicts the adoption of a normal distribution with a standard deviation of a fixed per cent of the hourly mean generation to simulate the uncertainties associated with wind power generation.It is worth noting that wind spillage is zero in most scenarios and hours; however, there are a few scenarios in which wind spillage occurs at certain hours.The average utilisation of BESs for market participation is shown in Fig.7(d).In the early hours of the day, when energy prices are relatively low, BESs are charged with energy bought from both the DA and ID markets.As electricity prices rise in the ensuing hours, the stored energy is sold back to the market, as indicated by a fall in the average SOC of BESs.Then, at hours 14-17, the BESs are recharged to be able to provide energy to the market at hours 18-20.In the last four hours, the average SOC of BESs increases as they are charged to satisfy the limit presented in(41).This figure demonstrates that the solution algorithm, across all scenarios, satisfies all constraints for the BES, including SOC Min s,t =0.1, SOC Max s,t =0.9, and SOC b s,t=24 =0.5.Fig.7(e) displays the deployment of DER at various hours and scenarios.Since the DER has a fixed generating cost, it is beneficial to engage in the market when market prices are greater than β g , and it is financially sensible to remain inactive when market prices are lower than β g .During hours 10-11, for instance, market prices in all scenarios are higher than β g ; hence, the DER produces at its highest rate in all scenarios.However, when the difference between market prices and β g is marginal (e.g., hours 7-8 and 17), DER is deployed in scenarios that

Fig. 8 .
Fig. 8. Three phase bus voltages with and without network limits.

Fig. 9 .
Fig. 9. Comparison of the profit of DSA in different cases over 201 days.
4 and € 1,580 per day in CS3 and CS4, respectively.If uncertainties are entirely disregarded, and fixed values are used for uncertain parameters (CS5), the DSA can still earn an average of € 1,776/day.
ID s,t (MW) are the DSA's DA and ID offers, in scenario s and at time step t, respectively.λDA s,t (€ /MWh) and λ ID s,t (€ /MWh) indicate the DA and ID prices of scenario s at time step t.On the second line of (18), G is the number of DERs indexed by g, and B is the number of BESs indexed by b.The power generation of DER g is denoted with P g s,t (MW), and E b,tp s,t (MWh) is the throughput energy of BES b, in scenario s and at time step t.The generation cost of DER g is signified by β g (€ /MWh), while β b (€ /MWh-throughput) denotes the degradation cost of BES b, and τ (h) symbolises the time step.Finally, the deployment cost of DR is represented by Cost DR s,t .It should be noted that the total offering at each hour of the DA market (P DA t) is a first-stage variable and constant in all scenarios, while the total offering at each hour of the ID market (P ID s,t ) could vary over scenarios.P DA are the power provision of the wind farm, DR resource, and BES in scenario s at hour t, respectively.The power used for charging BES at scenario s and period t is denoted by L b s,t .In Appendix VI-A, resource models and limitations are given.
is the number of time periods (hours) indexed by t, S is the number of scenarios indexed by s, and π s is the probability of scenario s.Moreover, P DA t (MW) and P in the above equations, superscripts DA and ID represent the DA and ID markets.W is the number of wind farms indexed by w, and D is the number of DR participants indexed by d.Moreover, P w s,t , P d s,t , and P b s,t [29]centive-based voluntary DR programs model as proposed in[29]is used to formulate the customers' behaviour.The total incentive paid to the DR resources at each hour and each scenario is calculated as follows: The constraints governing the operation and employment of BES are as follows: g s,t = P g,DA s,t+ P g,ID s,t ∀g, s, t(31)