Decentralized Energy Management of Multiagent Distribution Systems Considering the Grid Reliability and Agent Misbehavior

In recent years, the rapid expansion of independent energy sources and the development of multiagent structures have created new challenges for the efficient power management of distribution networks. In this regard, decentralized management that accounts for the operational concerns of the system will be a key factor in running future multiagent systems. Therefore, this article proposes a decentralized framework based on the alternating direction method of multipliers for managing peer-to-peer (P2P) energy trading in a multiagent distribution system while considering the technical constraints and reliability of the network. This strategy accounts for the effects of network reliability while the agents' optimization runs in a decentralized manner. Consequently, each agent tends to exchange energy with more reliable agents, which results in the resilient operation of the network. Moreover, the uncertainty of renewable energy sources is addressed using distributionally robust optimization. Additionally, to increase the security of the P2P energy market against communication errors and agents' misbehavior, an algorithm is developed to identify a problem in the market convergence and to determine how it can be mitigated. Finally, the scheme is investigated on 37- and 69-bus test systems to study its capability in running sustainable energy systems.


I. INTRODUCTION
RESTRUCTURING in energy systems, the increase in fossil fuel prices, and environmental concerns in recent years have led to the widespread proliferation of renewable energy sources in local systems. In this regard, the emergence of renewable energy sources in the distribution networks has changed them into active distribution networks with potential reliability and flexibility problems [1]. Therefore, optimal management of renewable energy sources while considering economic issues and maintaining the resiliency of the distribution network is of great importance.
The concept of multiagent structures in local systems can be used as an optimal method to manage renewable energy sources. An agent can manage its local systems by receiving the required information and processing it [2]. Moreover, thanks to the development of communication infrastructures, agents can exchange information with other agents to optimize their objectives. As a result, the expansion of multiagent systems can lead to the creation of an energy market with decentralized operation, which has major advantages over the traditional centralized methods, i.e., reducing communication requirements and computational burdens, increasing scalability, and preserving the autonomy and privacy of local systems [3].
In recent years, numerous studies have addressed the development of operational management models for multiagent systems; hence, new decentralized schemes have been presented. One of the major concepts in this field is peer-to-peer (P2P) energy trading, which allows prosumers and local systems to participate in local energy markets. A distributed market framework is proposed in [4], which allows the prosumers to trade based on the characteristics of the energy, such as location and generation technology. Still, this article has not considered the uncertainties of the network operation. In [5], a two-stage P2P energy sharing strategy is presented and a distributed approach is used to implement it. In the first stage, buildings determine their optimal amount of energy exchange with each other as well as with the retailer, while in the second stage, reciprocal prices are obtained using a noncooperative game model. In [6], game-theoretic frameworks are proposed, in which pricing between sellers is modeled with a noncooperative game, and the dynamics of buyers selecting sellers are modeled as an evolutionary game. Purage et al. [7] proposed a scalable robust optimal scheduling model for multimicrogrid (multi-MG) systems. Nevertheless, these works have not considered the operational and technical limitations of the network and have studied merely the economic factors of the P2P energy trading markets. Accordingly, the obtained results may not satisfy the operational constraints of the network; therefore, technical issues such as power flow equations need to be considered in the developed P2P energy markets.
A unified energy market framework for P2P energy trading is proposed in [8] to enable the decentralized operation of distribution systems considering network constraints. In [9], the distribution network is divided into several energy-sharing regions (ESRs) with different network fees; each of these ESRs has an energy-sharing provider. While this scheme enables the flexible operation of the system, the Stackelberg game model developed to manage the market does not enable the fully decentralized operation of the system. Li et al. [10] used Nash bargaining to operate the transactive energy (TE) trading structure with high penetration of solar energy resources. In [11], a distributed energy management system is developed to operate community MGs. In this work, a microgrid central controller is considered to manage the related distributed energy resources and energy storage systems along with the home energy management systems in local houses. Wang et al. [12] presented an iterative bilevel programming structure for the coordination of multi-MGs in the TE market and the operation of the distribution network. At each iteration, MGs take their decisions at the lower level and send their equivalent loads to the distribution system operator (DSO). Then, the DSO modifies the distribution system and calculates the TE path for the MGs. Although the authors in [8], [9], [10], [11], and [12] have respected the operational limitations of the grid, they have not considered uncertainties or the reliability of the grid in their investigations.
With the increasing share of renewable energies, the uncertainties associated with their power generation must be taken into account in the operation of the grid to increase the system reliability. For this purpose, various approaches including stochastic programming (SP), robust optimization (RO), and distributionally robust optimization (DRO) have been employed in recent works to address the uncertainties of renewable energy sources. In this regard, the authors in [13] have used SP to model the uncertainty of the power output of photovoltaic (PV) units as well as the uncertainties of energy prices and load demand. In [14], a method for scheduling multi-MG systems using SP is presented. In this approach, MGs schedule their resources for the normal and resilient operation of the network when they are disconnected from the main grid. Intraday markets are presented in [15] to enable prosumers to exchange energy with their neighbors or the grid considering their generation uncertainties via different scenarios. A bilevel risk-constrained SP for managing a TE market has been used in [16]. The uncertainties of the day-ahead market prices are handled in the first stage. In the second stage, a noncooperative game is utilized to model a market for MGs to maximize their profits. Nonetheless, the authors in [15] and [16] have not considered reliability. Fattaheian-Dehkordi et al. [17] proposed a distributed transactive framework to efficiently manage multi-MG systems. In this work, each agent optimizes its local resources by receiving TE signals, and SP is employed to model the uncertain parameters. Still, the work in [17] did not study reliability or agents' probable misbehavior.
In the RO method, unlike SP, no probability density function (PDF) is needed for the uncertain parameters, which reduces the computational burden; only an uncertainty set of the uncertain parameter is required to apply the RO model [18]. In [3], RO is used to handle the uncertainties of local resources. Wei et al. [19] employed RO to model the uncertainty of wind power output and to minimize the cost of each MG. Nevertheless, the RO approach can result in overly conservative solutions. This method considers the worst-case scenario and therefore may not be very cost-effective.
Recently, the DRO approach has attracted remarkable attention in modeling the uncertainty in optimization models. The DRO approach stands out because it does not require the exact PDF for modeling uncertain parameters, which is a limitation of SP. In addition, unlike RO, DRO takes advantage of historical data for optimization modeling, making it more data-efficient and less conservative in its decision-making process. This combination allows DRO to inherit the beneficial aspects of both SP and RO while overcoming their limitations. As a result, DRO provides a promising solution for handling uncertainty in optimization problems. In [20], a co-optimization model for energy trading is presented. In this work, the Wasserstein metric (WM)-based DRO is employed to handle uncertainties. Duan et al. [21] used a WM-based affinely adjustable distributionally robust approach to manage the uncertainty of wind power and analyzed the effect of sample size on the results obtained from DRO. Note that the authors in [3], [18], [19], [20], [21], and [22] have not taken into account the grid reliability or the data manipulation and common data failures that could lead to convergence problems during the decentralized operation of the system. In recent works, only the authors in [23] have considered the misbehavior of MGs in their distributed TE management scheme for multi-MG systems. However, their approach did not consider uncertainties and network constraints, and it was limited to a small system of only 4 MGs. As a result, their model may not fully capture the complexities and challenges that arise in real-world scenarios involving a larger number of MGs and diverse network conditions.
This article has been accepted for inclusion in a future issue of this journal. Content is final as presented, with the exception of pagination.
Based on the above-mentioned discussions, the authors believe that the decentralized optimization of multiagent distribution systems based on the P2P concept, while considering the grid reliability as well as agents' misbehavior and the uncertainties of renewable energy sources, has not been investigated in previous research works. Accordingly, a new scheme based on the TE concept is developed to enable the P2P transactions of system agents while considering the grid reliability. In other words, the network reliability can act as an important criterion for agents participating in the P2P energy management. In this regard, the developed scheme enables the agents to consider the grid reliability while optimizing their energy transactions with other entities. Moreover, most of the existing papers in this field have used SP and RO methods to handle the uncertainties in P2P energy trading. In this work, the DRO approach is used to overcome the deficiencies of the SP and RO methods in modeling the uncertainties of the power output of renewable energy sources. Finally, agents' misbehavior can affect the convergence of the P2P energy management; therefore, a new algorithm is proposed to detect agents' misbehavior and resolve the convergence issue in the system. The taxonomy table of research works in the context of P2P energy trading is presented in Table I.
In this study, we have proposed a P2P energy trading framework for interconnected agents considering the operating constraints of the network. In addition, the P2P market pricing mechanism is developed in a decentralized manner. Furthermore, a novel method is presented to model the reliability of the grid as a cost term in the optimization objective of agents. Finally, to increase the security of the P2P energy market, an algorithm is presented to identify and mitigate communication errors and convergence problems. The contributions of this article are as follows.
1) A P2P energy trading framework is proposed for operating multiagent distribution systems. As a result, this decentralized framework preserves the privacy and autonomy of agents. Also, unlike previous works in this context, this article strives to model the effect of the reliability of the grid on the operation optimization of agents. This method reduces the impacts of a contingency (i.e., fault) in the network on the results of the TE market. As a result, the resiliency-based energy management of the system is conducted in a decentralized manner.
2) The optimization problem of agents is modeled using data-driven DRO to handle the uncertainties of solar energy resources, in which the WM is used to construct the uncertainty set. Finally, an equivalent linear programming reformulation of the DRO model is derived to reduce the computational complexity.
3) An algorithm for increasing the security of the P2P energy trading market is presented. In the first step, the communication between agents is examined for errors caused by noise, misperception, or cyber-attacks. In the second step, if the transmitted information is correct, the information communicated in the market is examined for any potential problem in the convergence of the P2P market. Unlike other existing works, this algorithm strives to provide a method for identifying the existence of a convergence problem in the P2P market and then determining the misbehaving agents. As a result, it can detect and mitigate the convergence problem in a distributed manner or by means of a control entity.
The rest of this article is organized as follows. The presented P2P market framework is discussed in Section II-A. The problem formulation for each agent is described in Section II-B. In Section II-C, the WM is used to construct an ambiguity set for the uncertain parameter, and the linear programming reformulation of the DRO model is illustrated. The decentralized P2P energy trading and pricing strategy is explained in Section II-D. Different convergence problems of the P2P market are discussed in Section II-E. Section II-F presents an algorithm for identifying and mitigating the convergence problems of the P2P market. The results of implementing the proposed framework and misbehavior detection algorithm on the modified IEEE 37-bus test system, along with its usefulness, are demonstrated in Section III. Finally, the conclusion is presented in Section IV.

II. METHODOLOGY
Based on the current trend, distribution networks will be composed of multiple agents/MGs, which would also facilitate the growing integration of renewable energy sources. These self-standing entities, which are connected by the distribution grid, will have their own independently operated local resources. A simple model of an independently operated agent is presented in Fig. 1. Accordingly, system agents can independently optimize the operation of their local resources. System agents will be able to exchange energy with each other and the main grid; therefore, a market framework is required to facilitate the energy exchange between the agents. This market could increase public welfare and reduce network losses. For this purpose, in this article, the concept of multiagent systems is used to create a P2P energy trading structure that facilitates the distributed management of the multiagent energy system. As mentioned, the modeled agents aim to maximize their profits in the market. Each agent is in charge of optimizing its local resources by calculating its operational variables and estimated price signals. This process considers the operational constraints of local resources and the uncertainty associated with the power generation of renewable energy sources. In addition, the distribution network is subject to grid faults (e.g., line failure), and these faults may severely affect the optimized power exchanges determined in the energy market. Therefore, in this article, it is assumed that agents can model the reliability of the grid as a cost term and consider this term in their objective functions while optimizing their local resources. As a result, energy trading with other agents is optimized considering the failure risk of the connecting lines and the reliability of the grid. On the other hand, because this structure requires a large amount of information exchange, the market can be affected by communication errors, cyber-attacks, or misbehaving agents, which can cause problems in running the proposed distributed algorithm. Therefore, it is important to consider an approach for verifying the authenticity of information while running the market.

A. Distributed Transactive-Based Management Framework for Multiagent Systems
In this study, a distributed framework based on the TE concept is proposed to enable the operation of distribution networks with multiagent structures. In this article, system agents are responsible for optimizing their local resources and satisfying technical constraints. These agents can be divided into two subsets, sellers and buyers of energy, which can negotiate directly with each other in the developed TE market. Note that the alternating direction method of multipliers (ADMM) is employed to run the operation of the distribution network, while agents independently optimize their objective functions. In every iteration, the agents exchange information, and each agent then calculates the global variables and energy prices from the received information. It should be noted that this framework does not depend on a central coordinator, which is required in a central market model for gathering, analyzing, and optimizing the operation of local resources. Therefore, the market is managed in a completely decentralized manner.

B. Agent Scheduling
As mentioned, each agent is responsible for minimizing the operational costs of its resources according to received price signals and uncertainties of power generation by renewable energy sources. These costs depend on the available local resources and can include the cost of purchasing and selling energy with the main grid or the P2P energy trading market, the cost of storage units, as well as the cost of fuel-based distributed generation (DG) units.
1) DG Cost: The fuel-based generation units, such as distributed diesel generators or microturbines, have a nonlinear cost function that can be modeled with the quadratic cost function in (1), where the positive coefficients $\alpha_{2,m}$, $\alpha_{1,m}$, and $\alpha_{0,m}$ depend on the DG characteristics. The output power of the DG unit is bounded by its minimum and maximum active and reactive generation capacities, as stated in (2).
2) Battery Energy Storage (BES) Cost: Frequent charging and discharging of BESs reduces their lifespan. For this reason, the degradation cost function (3) is modeled as a linear function of the charge and discharge rates and the lifetime degradation coefficient of the BES (i.e., $\theta_b$) [32].
The constraints imposed on the operation of the BES are given in (4). Constraints (4a) and (4b) limit the charging and discharging rates of the BES. The state of charge (SoC) limit is enforced in (4c) to protect batteries against overcharging and undercharging. Moreover, the SoC balance equation is shown in (4d). Finally, (4e) requires the SoC at the end of the day to be equal to the SoC at the beginning of the day.
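As a minimal sketch of the DG and BES terms described above, the following Python snippet assumes a quadratic DG cost in the active power, a degradation cost proportional to the charged plus discharged energy, and an SoC balance with separate charging and discharging efficiencies. The function names, the efficiencies, and the time step are illustrative assumptions and are not taken from the article.

```python
def dg_cost(p, a2, a1, a0):
    """Quadratic fuel cost of a DG unit for active power p (assumed form)."""
    return a2 * p ** 2 + a1 * p + a0


def bes_degradation_cost(p_ch, p_dis, theta_b, dt=1.0):
    """Linear degradation cost: theta_b times the energy cycled through the BES."""
    return theta_b * sum((c + d) * dt for c, d in zip(p_ch, p_dis))


def soc_trajectory(soc0, p_ch, p_dis, capacity, eta_ch=0.95, eta_dis=0.95, dt=1.0):
    """SoC balance: charging raises the SoC, discharging lowers it.

    eta_ch and eta_dis are hypothetical efficiencies; SoC is a fraction of capacity.
    """
    soc = [soc0]
    for c, d in zip(p_ch, p_dis):
        soc.append(soc[-1] + (eta_ch * c - d / eta_dis) * dt / capacity)
    return soc
```

A daily schedule would additionally have to keep every SoC value within its limits and return to the initial SoC at the end of the horizon, as the constraints above require.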
3) Cost (Profit) of Energy Trading: Each agent can exchange energy with the main grid; therefore, the cost (profit) of agent m for buying and selling energy is given in (5). Likewise, the cost (profit) of agent m for the P2P energy trading at time t is given in (6).
4) Proposed Reliability Cost: The distribution network is exposed to various faults, such as wire breaks due to storms, fire, trees falling on power lines, etc. [33], [34]. These faults and line failures would prevent the P2P power transactions determined in the energy market, which could result in load shedding and financial loss. As a result, these conditions reduce the reliability of the P2P market. Therefore, in this article, a novel method for modeling reliability as a cost term is proposed to increase the reliability of the P2P market by reducing the effect of potential network faults on the P2P market. In this method, each agent considers a penalty coefficient for each market participant while optimizing its P2P power transactions. Agents calculate these coefficients based on the unavailability time of other peers during a year. The calculated coefficients are included in the objective function of the agents in order to affect the energy they trade with other participants in the P2P market.
For this purpose, the unavailability of different lines can be calculated based on their mean failure rates and repair times. In other words, according to the network topology, the unavailability time of different agents per year can be estimated. For normalization, these numbers are divided by 8760 h, and the cost of reliability enters the model of each agent as represented in (7). As a result, with the addition of the reliability cost term to the objective function of each agent, more energy is exchanged among agents with more reliable connections, because their unavailability time is lower than that of the others, which leads to smaller reliability costs. In (7a), $U^m_n$ stands for the calculated normalized unavailability. Moreover, $\upsilon_m$ is a weighting factor used to set the desired reliability for each agent. In addition, the value of $p^m_{n,t}$ is negative when agent m is an energy seller. Therefore, $\chi_{m,t}$ is a coefficient that ensures the value of the reliability cost is always positive: when the value of $p^m_{n,t}$ is positive and agent m is a buyer of energy, the value of $\chi_{m,t}$ is +1, and when the agent is a seller, its value is −1.
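The reliability term can be illustrated with a short Python sketch. It assumes that the lines between two agents are in series, so that the annual unavailability is approximated by the sum of failure rate times repair time over those lines, and that the cost is the weighted product of the normalized unavailability and the magnitude of the traded power. The function names and the series-path approximation are assumptions made for illustration only.

```python
HOURS_PER_YEAR = 8760.0


def normalized_unavailability(lines):
    """lines: (failures per year, repair hours) for each line on the path to a peer.

    Series approximation (assumed): annual outage hours add up along the path.
    """
    return sum(rate * repair for rate, repair in lines) / HOURS_PER_YEAR


def reliability_cost(trades, unavailability, weight):
    """Reliability cost of one agent at one time step.

    trades[peer] > 0 means buying from the peer, < 0 means selling to it;
    chi flips the sign so that every contribution is non-negative.
    """
    total = 0.0
    for peer, p in trades.items():
        chi = 1.0 if p >= 0 else -1.0
        total += weight * unavailability[peer] * chi * p
    return total
```

With such a term in the objective, a trade of the same size costs more over a less reliable connection, which is what steers agents toward peers with lower unavailability.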

5) Load Shedding Cost: The cost of load shedding should be taken into account in the operational optimization of agents. Load shedding may be the result of line loading in the distribution grid [17], [28]. In this regard, the cost of load shedding is modeled in (8) in terms of the load shedding price.
6) Uncertainty Costs: Each agent can have solar power generation units. Due to the unpredictable nature of renewable energy sources, their uncertainties should be considered in the agent's optimization model. In this regard, an ambiguity set can be created using the forecasted generation data, which is represented by $\zeta$ in this article. Also, the cost associated with the uncertainties of agent m at time t is represented by $\Phi^m_t(x^m_t, \zeta^m_t)$. The predicted data, $\zeta$, have a probability distribution (PD) of $P$. Moreover, $\mathcal{P}_N$ represents the ambiguity set including all the possible PDs of the forecasted variable, and $x^m_t$ represents the decision variables of agent m at time t. Therefore, the total cost of each agent can be split into two parts. The first part, which does not depend on the uncertainty of the power generation by renewable energy sources, includes the cost of buying and selling energy, the cost of distributed generation, the battery degradation cost, the load shedding cost, and the reliability cost. The second part is related to decisions affected by uncertainties. Hence, the objective function should minimize the operating costs and the expectation of the function $\Phi^m_t(x^m_t, \zeta^m_t)$ under the worst-case distribution, called $P_{\mathrm{worst}}$, in the ambiguity set $\mathcal{P}_N$. The resulting objective for the uncertainties is given in (9).
7) System Model: The operational modeling of the grid is presented in (10) considering the linearized DistFlow model [17]. Equations (10a) and (10b) express the active and reactive power balance at each bus of the grid. Equations (10c) and (10d) are employed to calculate the voltage and current [35]. The sets of child and parent nodes of bus i are denoted by $C_i$ and $A_i$ in this article. Furthermore, (10e) and (10f) impose the voltage and line capacity limits.
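To make the linearized DistFlow idea concrete, the sketch below evaluates a lossless variant on a small radial feeder in per unit: a backward sweep accumulates the downstream loads into line flows, and a forward sweep applies the squared-voltage drop 2(rP + xQ) along each line. This lossless form, the bus numbering convention, and the function name are assumptions; the article's model in (10) also includes a current calculation that is not reproduced here.

```python
def lin_dist_flow(parent, r, x, p_load, q_load, v0=1.0):
    """Lossless linearized DistFlow on a radial feeder (per unit).

    parent[i] is the parent bus of bus i (bus 0 is the root, parent[0] = -1);
    r[i], x[i] are the resistance and reactance of the line parent[i] -> i.
    Buses must be numbered so that parent[i] < i.
    """
    n = len(parent)
    P = list(p_load)
    Q = list(q_load)
    # Backward sweep: the flow into bus i covers its own load plus its children.
    for i in range(n - 1, 0, -1):
        P[parent[i]] += P[i]
        Q[parent[i]] += Q[i]
    # Forward sweep: the squared voltage drops along each line.
    v_sq = [v0 ** 2] * n
    for i in range(1, n):
        v_sq[i] = v_sq[parent[i]] - 2.0 * (r[i] * P[i] + x[i] * Q[i])
    return P, Q, [v ** 0.5 for v in v_sq]
```

In an optimization model the same relations appear as linear equality constraints, and the voltage and line limits become simple bounds on the resulting variables.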

C. DRO Model
1) WM-Based Ambiguity Set: Constructing the ambiguity set is central to evaluating the expectation of the random variable $\zeta$, which in principle requires the true PD of the random variable. However, in practical applications, the true PD is always ambiguous and only a limited number of historical data $\hat{\zeta} := \{\hat{\zeta}_1, \hat{\zeta}_2, \ldots, \hat{\zeta}_N\}$ is available. Hence, the exact PD of the random variable cannot be obtained and only the empirical distribution $\hat{P}_N = \frac{1}{N}\sum_{k=1}^{N} \delta_{\hat{\zeta}_k}$ can be calculated, where $\delta_{\hat{\zeta}_k}$ denotes the Dirac measure at the sample $\hat{\zeta}_k$.
In this article, we have used the WM to create the ambiguity set because it offers an out-of-sample performance guarantee, an asymptotic guarantee (as N tends to infinity, the ambiguity set converges to the true distribution), and a tractable reformulation [36].
The WM, which measures the distance between two PDs, can be formulated as (11) in our problem, where $\Xi$ is the compact support space and both PDs, $\hat{P}_N$ and $P$, belong to the collection of all PDs with compact support $\Xi$. In this regard, the empirical distribution $\hat{P}_N$ is used as an estimate of the true PD in (11) to construct the ambiguity set.
In (11), $\Pi$ is a joint distribution of the two random variables with marginals $\hat{P}_N$ and $P$. Therefore, the ambiguity set can be created as (12), which describes a Wasserstein ball with center $\hat{P}_N$ and radius $\varepsilon(N)$.
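For intuition about the distance that defines the Wasserstein ball, the following Python sketch computes the type-1 Wasserstein distance between two one-dimensional empirical distributions with the same number of samples. In that special case the optimal coupling pairs the sorted samples, so no optimization is needed. The function name and the restriction to equal sample sizes are assumptions of this illustration, not part of the article's formulation.

```python
def wasserstein_1d(samples_a, samples_b):
    """Type-1 Wasserstein distance between two equal-size 1-D empirical distributions."""
    if len(samples_a) != len(samples_b):
        raise ValueError("this sketch assumes equal sample sizes")
    a, b = sorted(samples_a), sorted(samples_b)
    # In 1-D the optimal transport plan matches the i-th smallest samples.
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)
```

Shifting every sample by a constant c moves the distribution by exactly |c| in this metric, which matches the reading of the radius as a budget for how far the true PD may lie from the empirical one.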
Note that $\varepsilon(N)$ has a great effect on the performance of the WM-based method. $\varepsilon(N)$ is a function of the confidence level $\beta$ and the sample number N. In addition, $\varepsilon(N)$ must converge to zero as N goes to infinity; its expression is given in (13) [37]. In (13), D is the diameter of the support of the random variable. From [38], D can be computed by solving the optimization problem in (14), where $\mu$ is the mean of the sample.
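A short sketch can show how such a radius behaves. It assumes the form commonly used in the WM-based DRO literature, $\varepsilon(N) = D\sqrt{(2/N)\ln(1/(1-\beta))}$; since the article's own expression is not reproduced in the text, this form, the function name, and the treatment of D as a given input are assumptions.

```python
import math


def wasserstein_radius(diameter, n_samples, beta):
    """Radius of the Wasserstein ball (assumed form), shrinking as 1/sqrt(N)."""
    if not 0.0 < beta < 1.0:
        raise ValueError("beta must lie strictly between 0 and 1")
    return diameter * math.sqrt(2.0 / n_samples * math.log(1.0 / (1.0 - beta)))
```

Under this form, a higher confidence level enlarges the ball (more conservative decisions), while more historical samples shrink it toward the empirical distribution.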
2) Problem Reformulation: To solve the DRO problem, we need to evaluate the worst-case expected costs in (9), which is a multidimensional optimization problem over PDs. This problem appears to be intractable. On the other hand, the cost function $\Phi^m_t$ is defined in such a way that any shortage or excess in the power production of solar units that causes load shedding or nontraded power is penalized. Accordingly, the general form of $\Phi^m_t$ is given in (15). Since $\gamma_2 > 0$, $\Phi(x, \zeta_t)$ is a convex quadratic function. Considering the sample set $\{\hat{\zeta}_1, \hat{\zeta}_2, \ldots, \hat{\zeta}_N\}$ of the random variable $\zeta$ bounded by $[\underline{\zeta}, \overline{\zeta}]$, the worst-case costs in (9) can be evaluated according to Lemma 1 in [38] as in (16), where $s_k$ is an auxiliary variable. However, as the historical sample set grows, the number of quadratic constraints and auxiliary variables in (16c) increases, which results in a heavy computational burden. To overcome this drawback, the value of $\lambda$ is taken as in (17), where $\Phi'(x, \zeta)$ is the derivative of $\Phi(x, \zeta)$ with respect to $\zeta$. The worst-case cost then reduces to (18). Note that the number of constraints and decision variables in (18) remains unchanged when larger sample sets are used; hence, the computational burden is significantly alleviated. It is noteworthy that the results of (16c) and (18) differ only slightly, and the difference can be ignored [38].
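The simplification described above can be sketched as follows, under the assumption that $\lambda$ is chosen as the largest slope magnitude of $\Phi$ over the support interval. For a cost that is convex in $\zeta$, this value is a Lipschitz constant, and by Kantorovich-Rubinstein duality the worst-case expectation over a type-1 Wasserstein ball is bounded by the empirical mean plus $\lambda\varepsilon$. The function name and the example cost are hypothetical, and the bound is not claimed to coincide with the article's (18).

```python
def worst_case_bound(phi, dphi, samples, lower, upper, radius):
    """Upper bound on the worst-case expected cost over a Wasserstein ball.

    phi(zeta) must be convex on [lower, upper]; dphi is its derivative.
    For a convex function the steepest slope occurs at an endpoint.
    """
    lam = max(abs(dphi(lower)), abs(dphi(upper)))
    empirical = sum(phi(z) for z in samples) / len(samples)
    return lam * radius + empirical


# Hypothetical quadratic penalty on the mismatch between a schedule x = 1.0
# and the realized solar output zeta.
example = worst_case_bound(
    phi=lambda z: 2.0 * (1.0 - z) ** 2,
    dphi=lambda z: -4.0 * (1.0 - z),
    samples=[0.5, 1.0, 1.5],
    lower=0.0,
    upper=2.0,
    radius=0.1,
)
```

The bound needs only one pass over the samples and no per-sample auxiliary variables, which is the computational advantage the text attributes to the simplified form.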

D. Decentralized P2P Energy Trading Approach Among Multiagents
Distributed methods are suitable for maintaining the privacy and autonomy of agents. Consequently, in this study, the ADMM algorithm is utilized to develop a P2P energy trading scheme in a distributed fashion. The purpose of this optimization is to minimize the operating costs of individual agents. Therefore, the overall objective function is represented in (19a). Note that the P2P energy trading payments are excluded from the presented formulation due to (19c) and (19d). Note also that (19c) is the reciprocity coupling constraint, which prohibits energy trading between two seller agents or two buyer agents. The problem in (19) is subject to constraints that include (4), (7b)-(7c), (8b), and (10).
To apply the ADMM algorithm, and therefore the pricing algorithm of the energy trading among agents, (19) is reformulated. In this regard, (19c) is reformulated using an auxiliary variable (i.e., $\hat{p}^m_{n,t}$). The augmented Lagrangian function associated with the P2P energy trading is shown in (20c), where $z^m_t$ denotes the set of auxiliary variables, i.e., $z^m_t := \{\hat{p}^m_{n,t}\}_{n \in M \setminus m}$. Note that $\lambda^m_{n,t}$ is the Lagrangian multiplier related to (20b). The algorithm iterates over the following steps, where $\tau$ is the iteration index.
1) Each agent updates its local variables $x^{m,\tau+1}_t$ by minimizing the augmented Lagrangian, as in (21).
2) After computing (21), the agents exchange the required information and the auxiliary (global) variables are updated.
3) Each agent locally updates its Lagrangian multipliers and sends $\lambda^{m,\tau+1}_{n,t}$ to agent n.
Regarding the presented decentralized algorithm, x and z are interpreted as local and global variables, respectively. Additionally, stopping criteria are defined for the iterations. The distributed algorithm for optimizing the P2P market is illustrated in Fig. 2. Note that the convergence of the multiblock ADMM algorithm shown in (19) cannot be guaranteed in general. However, by introducing the auxiliary variable $\hat{p}^m_{n,t}$, this optimization problem can be transformed into a corresponding two-block ADMM model (20c). This structure consists of the blocks related to the energy management terms and the auxiliary variables, respectively. According to [39], the two-block ADMM algorithm converges globally when applied to convex optimization problems.
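The three steps above can be sketched for a toy market of two agents. This is an illustrative two-block ADMM in scaled form, not the article's formulation: each agent is assumed to have a simple quadratic cost over its traded power, the reciprocity constraint is assumed to read $\hat{p}_0 + \hat{p}_1 = 0$, and the names `p2p_admm`, `a`, `d`, and `rho` are hypothetical.

```python
def p2p_admm(a, d, rho=1.0, tol=1e-8, max_iter=1000):
    """Two-agent P2P trade via scaled two-block ADMM (toy example).

    Agent m has cost a[m] / 2 * (x[m] - d[m]) ** 2 over its trade x[m]
    (positive = buying from the peer). Reciprocity requires z[0] + z[1] = 0.
    """
    x = [0.0, 0.0]  # local trade decisions
    z = [0.0, 0.0]  # auxiliary (global) copies
    u = [0.0, 0.0]  # scaled multipliers; prices are rho * u
    for k in range(max_iter):
        # Step 1: each agent minimizes its own augmented Lagrangian (closed form).
        x = [(a[m] * d[m] + rho * (z[m] - u[m])) / (a[m] + rho) for m in range(2)]
        # Step 2: project the exchanged values onto the reciprocity constraint.
        z_old = z
        half = 0.5 * ((x[0] + u[0]) - (x[1] + u[1]))
        z = [half, -half]
        # Step 3: each agent updates its multiplier locally.
        u = [u[m] + x[m] - z[m] for m in range(2)]
        # Stopping criteria on the primal and dual residuals.
        primal = max(abs(x[m] - z[m]) for m in range(2))
        dual = rho * max(abs(z[m] - z_old[m]) for m in range(2))
        if primal < tol and dual < tol:
            break
    prices = [rho * u[m] for m in range(2)]
    return x, prices, k + 1


trades, prices, iterations = p2p_admm(a=[1.0, 2.0], d=[4.0, -1.0])
```

In this toy case both agents end up with the same multiplier, which can be read as the trading price of the pair, and neither agent needs to reveal its cost parameters to the other.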

E. P2P Market Convergence Problems
Despite the major advantages of decentralized methods over centralized ones, decentralized management is prone to malicious attacks and misbehavior due to the lack of a central observer. In other words, a malicious agent can easily manipulate information to prevent the ADMM algorithm from converging, or an attacker can cause communication errors or disrupt the scheduling model of agents. Hence, identifying the misleading agent and the problems that prevent the P2P trading algorithm from converging to the optimal solution is another aspect of increasing the reliability of the P2P energy market. The procedure described in this section should run in parallel with the ADMM algorithm and must be applied to the output data of the ADMM algorithm to ensure the correctness of the data in each iteration. The proposed algorithm can be implemented in a distributed manner or by a control entity.
The P2P energy market faces different convergence problems. One possible problem is noise in the data received by each agent, which can be caused by a communication error or by an attack on the P2P market. In this case, the data can be modeled as $\hat{A}(k) = A(k) + \delta(k)$, where $A(k)$ is the decision vector calculated by the ADMM, $\hat{A}(k)$ is the information actually received over the communication channel, and $\delta(k)$ is a random vector that can be different in each iteration.
Another problem is the existence of a fraudulent agent.This misbehaving agent manipulates its output information in a way that maximizes its profits and reduces the profits of other market participants while following the ADMM algorithm.An attacker can also hack an agent's software and cause problems in the market convergence.Therefore, providing a process to identify and resolve these threats to the market seems to be necessary.

F. Identify and Mitigate Convergence Problems
ADMM properties are used to develop an algorithm that identifies convergence problems in the P2P market caused by agents' misbehavior. In this regard, Boyd et al. [39] show that if $C^m_t$ is closed, proper, and convex, the augmented Lagrangian (20c) has a saddle point. Therefore, if the mentioned conditions for $C^m_t$ are met, the amount of disagreement $\|p^{m,k}_{n,t} - \hat{p}^{m,k}_{n,t}\|_2^2$ might not decline monotonically. However, for large k, when there is no convergence problem, the disagreement satisfies the convergence property in (27), and this condition holds for all variables [40]. Using the convergence property shown in (27), the mean squared disagreement (MSD) between agents m and n in the kth iteration, $f^{m,k}_n$, can be calculated as in (28). In (28), for large k, if the value of $f^{m,k}_n$ decreases uniformly, there is no problem in the convergence. However, a convergence problem will occur if, for large k, $f^{m,k}_n$ does not decrease in this way.
The proposed algorithm for detecting convergence problems in the P2P market has two stages. In the first stage, each agent maintains a square matrix D of MSD values. Agent m calculates the MSD values in the mth row of D from the data it has received and calculated, and sends these values to the other agents so that all agents have access to the data. This does not compromise the agents' privacy, because D contains no information about the amount of energy exchanged between the agents. Furthermore, the matrix should satisfy $f^{m,k}_n = f^{n,k}_m$; therefore, any inconsistency indicates a mistake in the exchanged data. In this way, a communication error can be detected and addressed in the P2P market: if agent m detects such an error for any of its customers or vendors (i.e., the calculated value does not correspond to the received value), it can log a communication error. In addition, since these kinds of structures are vulnerable to communication failures such as data loss, which would affect the algorithm of this stage, a moving list must be kept to record the misbehaving agents' IDs; consequently, if these failures continue beyond a predefined limit, the energy exchange between the problematic agents is cut off.
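The first stage can be illustrated with a minimal Python sketch. It assumes that the MSD is the mean of the squared differences over the scheduling periods, which is an assumption because only the idea of the index is described here. The names `msd`, `inconsistent_pairs`, and `FailureLog`, as well as the window length and the failure limit, are hypothetical and chosen only for illustration.

```python
from collections import deque


def msd(own, received):
    """Mean squared disagreement between two equally long trade profiles
    (assumed form: average of squared differences over the periods)."""
    return sum((a - b) ** 2 for a, b in zip(own, received)) / len(own)


def inconsistent_pairs(D, tol=1e-9):
    """Return the pairs (m, n), m < n, for which D[m][n] and D[n][m] differ.
    A symmetric D is expected, so any such pair hints at a data error."""
    size = len(D)
    return [(m, n) for m in range(size) for n in range(m + 1, size)
            if abs(D[m][n] - D[n][m]) > tol]


class FailureLog:
    """Moving list of recent error flags per agent pair.
    The window length and the limit are assumed tuning parameters."""

    def __init__(self, window=5, limit=3):
        self.window = window
        self.limit = limit
        self.history = {}

    def record(self, pair, error):
        """Store the latest flag; return True when the pair should be cut off."""
        flags = self.history.setdefault(pair, deque(maxlen=self.window))
        flags.append(bool(error))
        # Cut off only when the errors inside the window exceed the limit.
        return sum(flags) > self.limit


# Illustrative data: entries (1, 2) and (2, 1) of D disagree.
D = [[0.00, 0.04, 0.01],
     [0.04, 0.00, 0.09],
     [0.01, 0.25, 0.00]]
suspect_pairs = inconsistent_pairs(D)
```

Keeping only a bounded window of flags lets an isolated data loss age out of the record, so that a transient failure does not by itself interrupt trading between two agents.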
Then, in the second stage of the proposed process, the weighted moving average of the MSD is defined with a smoothing factor $a_k \in (0, 1)$ that satisfies $\sum_{k=0}^{\infty} a_k = \infty$. As a result, each agent can calculate the confidence level of the other participants.
Algorithm steps of the central entity:
6. Central entity constructs the state transition matrix;
7. Central entity computes π(k) in a distributed manner;
10. Central entity finds the misbehaving agents;
11. Central entity updates the list of suspicious agents;
12. Else:
13. The central entity receives the misbehaving agents' IDs;
14. The central entity updates the list of agents with communication failures;
15. Central entity makes decisions;
Based on the above-mentioned discussions, each agent can calculate the square matrix B using the calculated and received information. B is a row-stochastic and irreducible matrix [41]; therefore, B has an eigenvalue of 1 corresponding to a stationary distribution π(k) that satisfies π(k)B(k) = π(k). As a result, B can be perceived as the state transition matrix of a Markov chain in which each agent tends to interact more frequently with the suspicious agents, and vice versa. Every agent can compute the stationary distribution π(k) in a distributed manner. Therefore, $r_{mis}(k) = \arg\max \pi(k)$ can be identified as the most suspicious agent in every iteration.
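As an illustration of the relation π(k)B(k) = π(k), the following Python sketch approximates the stationary distribution of a small row-stochastic matrix by power iteration and then takes the arg max. Power iteration is a swapped-in, centralized technique chosen for brevity, not the distributed computation referred to in the text, and it additionally assumes that the chain is aperiodic. The matrix entries and the function names are hypothetical.

```python
def stationary_distribution(B, iters=1000, tol=1e-12):
    """Approximate pi with pi @ B = pi for a row-stochastic matrix B
    (list of rows) by repeatedly applying pi <- pi @ B."""
    n = len(B)
    pi = [1.0 / n] * n  # start from the uniform distribution
    for _ in range(iters):
        nxt = [sum(pi[i] * B[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(nxt, pi)) < tol:
            return nxt
        pi = nxt
    return pi


def most_suspicious(pi):
    """Index of the largest entry of pi, i.e. arg max pi(k)."""
    return max(range(len(pi)), key=pi.__getitem__)


# Hypothetical 3-agent example in which every row puts most weight on agent 2.
B = [[0.2, 0.2, 0.6],
     [0.2, 0.2, 0.6],
     [0.3, 0.3, 0.4]]
pi = stationary_distribution(B)
suspect = most_suspicious(pi)
```

For this example the stationary distribution can be checked by hand: solving πB = π with the entries summing to one gives π = (0.25, 0.25, 0.5), so agent 2 is the most suspicious one.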
To prevent convergence problems in the P2P market, a supervisor entity can be introduced, which calculates π(k) in each iteration from the information received from all agents. Unlike other works such as [23], whose algorithms always suspect an agent in each iteration, this article uses a novel index based on the standard deviation (STD) of the calculated stationary distribution to detect the existence of a convergence error in the main algorithm. The central entity calculates the STD of π(k) in each iteration; when the STD rises above a certain threshold, the entity becomes suspicious and tries to find the misbehaving agents. Accordingly, the agents whose corresponding value in π(k) is higher than the average of the maximum and minimum values of π(k) are identified as misbehaving agents. If these agents are detected continuously for a certain number of iterations, the central entity sends alarm signals to them and stops them from exchanging energy with each other. It should be noted that this method can also be implemented in a distributed manner by eliminating the market supervisor.
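The supervisory rule described above can be sketched in Python as follows. The sketch assumes the population standard deviation (the text does not state which estimator is used), and the threshold, the required number of consecutive detections (`patience`), and all names are hypothetical.

```python
from statistics import pstdev


def flag_agents(pi, std_threshold):
    """Return the agents whose entry in pi exceeds the midpoint of max(pi)
    and min(pi), but only when the STD of pi is above the threshold."""
    if pstdev(pi) <= std_threshold:
        return set()
    midpoint = (max(pi) + min(pi)) / 2
    return {i for i, value in enumerate(pi) if value > midpoint}


class Supervisor:
    """Tracks for how many consecutive iterations each agent has been flagged."""

    def __init__(self, std_threshold, patience):
        self.std_threshold = std_threshold
        self.patience = patience
        self.streak = {}

    def step(self, pi):
        """Process one iteration; return the agents whose trading is stopped."""
        flagged = flag_agents(pi, self.std_threshold)
        # Agents that are not flagged in this iteration lose their streak.
        self.streak = {i: self.streak.get(i, 0) + 1 for i in flagged}
        return {i for i, count in self.streak.items() if count >= self.patience}


supervisor = Supervisor(std_threshold=0.05, patience=3)
decisions = [supervisor.step([0.25, 0.25, 0.5]) for _ in range(3)]
```

Requiring several consecutive detections before acting is what separates a persistent problem from a transient one, so that a single noisy iteration does not interrupt the energy exchange.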

III. RESULTS
To evaluate the effectiveness of the proposed P2P trading framework, two case studies were carried out. The first test system is a 37-bus multiagent distribution test system, shown in Fig. 3, while the second is a 69-bus system, shown in Fig. 4 [42]. Each network bus is considered an agent that has a combination of local resources, i.e., PV units, load demands, BESs, and microturbine units.
This article has been accepted for inclusion in a future issue of this journal. Content is final as presented, with the exception of pagination.
The proposed P2P trading framework is applied to schedule the energy exchange among the agents and the main grid for 24 h while taking into account the operating constraints of the grid. Moreover, a set of 40 samples is used to construct the DRO ambiguity set with a 95% confidence level.
The first part discusses the results obtained from the P2P energy market within the IEEE 37- and 69-bus test systems, while the second part addresses the developed algorithm for identifying convergence problems in the market in case of agent misbehavior on a test system.

A. P2P Energy Market Analysis in Multiagent Systems
1) Case Study I. In this part, the results related to the P2P market on the IEEE 37-bus system are given. One of the main purposes of this article is to model the reliability in terms of cost. Consequently, Figs. 5 and 6 show the amount of energy exchange between agents with and without modeling the cost of the grid reliability while considering the technical limitations of the grid. In this regard, Fig. 5 shows the amount of energy exchanged by agent 8 with other agents at the 10th hour, and Fig. 6 shows the amount of energy purchased by agent 20 from other agents at the 10th hour. Here, negative values in the presented results indicate that the energy is sold to other entities. To emphasize the impact of the reliability cost, an alternative scenario is explored in Fig. 7, where the failure rate of the line connecting agents 2 and 16 is deliberately increased. Fig. 7 shows that, in response to the increased failure rate, agent 20 opts to obtain energy exclusively from agents 17 and 19, which have more reliable connections. It can be seen that agents connected by reliable lines tend to exchange more energy with each other. Fig. 8 depicts the BES charging/discharging power in agent 20 at every hour of the day. Fig. 9 shows the power generated by the different types of DG units operated by agent 8 during the operational horizon. Fig. 10 illustrates the total energy that the main grid exchanged with the system agents during the operational horizon. On the other hand, the total energy produced by all agents in 24 h is shown in Fig. 11. The P2P price between agents 8 and 20 at the 10th hour is shown in Fig. 12. The results show the convergence of the P2P price during operation of the developed trading market. Note that the P2P price converges to 17.84 $/kW, which lies between the selling and purchasing prices announced by the upper-level network to the agents (i.e., 29.62 $/kW and 17.77 $/kW). As expected, for a time period like hour 10, when the agents' energy production is higher than their entire demand, the total energy required by the agents is provided within the P2P market and the surplus energy is sold to the main network. Therefore, it is anticipated that the P2P market price will be near the purchase price of the main grid.
2) Case Study II. IEEE 69-Bus System: To validate the effectiveness of the proposed method, the IEEE 69-bus system is employed for the second case study, characterized by an increased number of buses and a more sophisticated topology. This selection aims to enhance the comprehensiveness of the evaluation.
In Fig. 13, the presented data at the 10th hour illustrate the quantity of power transactions carried out by agent 27 within the P2P trading market. The comparison contrasts scenarios with and without the inclusion of reliability costs, shedding light on the amount of power acquired by agent 27 from other agents under these different conditions. Notably, the figure reveals a discernible trend wherein agents prefer to procure energy from counterparts with lower associated reliability costs, as also observed in Figs. 5 and 6.
Moving to Fig. 14, the visualization captures the impact of total solar generation on the P2P market price.Notably, when solar generation reaches its zenith, there is a corresponding decrease in the P2P market price.This phenomenon contributes to optimizing the total welfare of the market, as evidenced by the market experiencing its highest levels of overall well-being during periods of peak solar generation.Fig. 15 displays the P2P price dynamics between agents 2 and 17 at the 13th hour.The outcomes reveal a notable convergence in P2P prices, illustrating the stabilization and efficiency achieved within the developed trading market during its operational phase.

B. Results of Implementing the Convergence Problem Detection Algorithm for Agent Misbehavior
In this part, the behavior of the stationary matrix π(k) and the related standard deviation diagram in P2P markets on the 37-bus test system with/without a convergence problem is investigated. Furthermore, the introduced convergence problem detection algorithm has been implemented in the developed P2P market to mitigate the problem.
In the first case, the P2P market is operated without any convergence problem. In this case, the results related to the matrix π(k) and the standard deviation associated with each iteration are shown in Fig. 16. Based on the results presented in Fig. 16, both diagrams reach a steady state after several iterations.
In the second case, agent 4 sends random information to agent 8, leading to a convergence problem in the P2P market. Therefore, as shown in Fig. 17(a) and (b), the corresponding numbers increase in both the matrix π(k) and the corresponding standard deviation diagram. In this case, the process of "identifying and mitigating the convergence problems" by the supervisor entity is not deployed in the market; therefore, the market diverges and the convergence problem is not solved.
In the third case, agent 4 again sends incorrect information to agent 8; but in this case, the supervisor entity becomes suspicious of the P2P market when the standard deviation of π(k) in Fig. 18(b) exceeds a certain threshold value. Therefore, the supervisor entity proceeds to identify the problematic agents. Once it has confirmed that the agents have a problem and that the problem is not transient, the supervisor entity prohibits their energy exchange. As a result, their power exchange reaches zero. Finally, the effect of this action can be seen in Fig. 18(a), where the P2P market has converged.

IV. CONCLUSION
In this article, a decentralized framework is presented for operating the P2P energy market in the distribution network considering multiagent systems and network constraints. In addition, the uncertainties associated with the power generation of renewable energy units are modeled using a data-driven DRO model. As a result, each agent can exchange energy with other agents as well as the upper-level network in the developed P2P energy market. This enables the agent to minimize its costs while maintaining its privacy and autonomy. In addition, a novel method for modeling the grid reliability as a cost term while running the P2P market is proposed. The obtained results show the importance of considering the reliability of the grid while optimizing the scheduling of local resources. Finally, an algorithm is presented to identify and reduce the effect of convergence problems due to the misbehavior of agents while running the P2P market. This algorithm makes it possible to check the correctness of the communicated information as well as to prevent misbehavior by agents. The proposed scheme is implemented on the modified IEEE 37- and 69-bus test systems to illustrate the effectiveness of the proposed P2P management structure considering the operating constraints of the grid.

(20c) and $\lambda^m_t := \{\lambda^m_{n,t}\}_{n \in \mathcal{M} \setminus m}$.
1) $x^m_t$-Update: Due to the decomposability of the augmented Lagrangian and the constraints (19b), $x^m_t$ can be updated in a decentralized way. Therefore, each agent m iteratively solves problem (21) to update the local variable $x^m_t$, treating the auxiliary variables $z^{m,\tau}_t$ and the Lagrangian multipliers $\lambda^{m,\tau}_t$ as constants. Note that τ denotes the iteration index.

Fig. 2. Distributed algorithm for optimizing the P2P energy trading and agents' schedules.

Fig. 5. Total energy sold by agent 8 to other agents in the P2P market at the 10th hour with/without considering the grid reliability.

Fig. 6. Total energy purchased by agent 20 from other agents in the P2P market at the 10th hour with/without considering the grid reliability cost.

Fig. 7. Total energy purchased by agent 20 from other agents in the P2P market at the 10th hour when the failure rate of line 16 (between agents 2 and 16) is increased, considering the grid reliability cost.

Fig. 9. Power generation by PV and microturbine units in agent 8 in 24 h.

Fig. 10. Total energy the main grid traded with agents in 24 h.

Fig. 11. Total power generation by PV and DG units of all the agents in 24 h.

Fig. 13. Total energy bought by agent 27 from other agents in the P2P market at the 10th hour with/without considering the grid reliability cost.

Fig. 16. Simulation results when there is no convergence problem. (a) Evolution of the stationary distribution matrix. (b) Evolution of the standard deviation value during the iterations.

Fig. 17. Simulation results when agent 4 sends fraudulent data to agent 8, without a central entity to supervise convergence problems. (a) Evolution of the stationary distribution matrix. (b) Evolution of the standard deviation value during the iterations.

Fig. 18. Simulation results when agent 4 sends fraudulent data to agent 8, with a central entity to supervise convergence problems. (a) Evolution of the stationary distribution matrix. (b) Evolution of the standard deviation value during the iterations.
of parent node/child nodes of node i in the distribution network.
$\overline{SoC}_b/\underline{SoC}_b$ Upper/lower bound of SoC of BES.
$r_i/x_i$ Resistance/reactance of line i.
$\overline{v}_i/\underline{v}_i$ Upper/lower bound of squared voltage magnitude of node i.
$F_i$ Maximum capacity of line i.
, $P^m_{b,t}$ ($P^m_{s,t}$) shows the amount of energy purchased (sold) by agent m from (to) the main grid at time t. Moreover, $\lambda^b_t$ ($\lambda^s_t$) represents the price of purchasing (selling) energy from (to) the
in every iteration, each agent m sends $p^{m,\tau+1}$