A Robust Multi-Stage Power Consumption Prediction Method in a Semi-Decentralized Network of Electric Vehicles

A Virtual Power Plant (VPP) balances the load on a power grid by allocating power generated by various interconnected units during periods of peak demand. In addition, demand-side energy devices such as Electric Vehicles (EVs) and mobile robots can also balance energy supply and demand when effectively deployed. However, the fluctuation of energy generated by renewable resources makes balancing energy supply a challenging goal. This paper proposes a semi-decentralized robust network of electric vehicles (NoEV) integration system for power management in a smart grid platform. The proposed approach integrates an aggregator with EV fleets into a blockchain framework. The EVs execute a multi-stage algorithm to predict the power consumption based on a novel federated learning algorithm named Federated Learning for Qualified Local Model Selection (FL-QLMS). From the evaluation results, the proposed system requires 35% fewer transactions than the previous approaches under short block intervals and propagation delays, and achieves better network efficiency while maintaining a high level of security. Moreover, NoEV achieves a 5.7% lower root mean square error (RMSE) than the conventional approach for power consumption prediction, which is a significant improvement. In addition, the FL-QLMS approach outperforms state-of-the-art methods in terms of robustness to client-side attacks. The evaluation results also show that the performance of FL-QLMS is not affected when 10% to 40% of the models are manipulated.


I. INTRODUCTION
The utilization of renewable energy resources has increased significantly over the last decade. By the end of 2020, global renewable energy generation capacity reached 2799 gigawatts [1]. Meanwhile, European emission standards limit carbon dioxide emissions from regular cars to less than 95 g/km by 2020 [2]. Recently, a growing number of electric vehicles (EVs) are being integrated into smart grids to solve the problem of fluctuating renewable energy feed-in and shifting peak loads. To achieve efficient distribution and utilization of renewable energy, the concept of a virtual power plant (VPP) has been proposed as an intermediary between distributed energy resources, the power grid, controllable loads, and EVs [3]. When investigating the information exchange between an EV fleet and the VPP center, critical factors such as robustness and cost-efficiency of data storage, fast response to demand, and good scalability deserve much attention [4], [5].

The associate editor coordinating the review of this manuscript and approving it for publication was Kashif Saleem.

A. BACKGROUND AND MOTIVATION
Nowadays, modern EVs are equipped with devices for sensing, computation, communication, and data storage, providing a solution to offload cloud data centers [6]. Various efforts have been made to outsource edge computing tasks in vehicles [7], [8], and a few studies have investigated the framework of vehicle edge computing for the VPP scenario [9]-[11]. For a complicated smart-vision task in a driving environment, vehicles must be equipped with high-speed systems that process a large amount of sensor data (about 1 Gb/s) [12]. However, one of the bottlenecks of today's local devices is still limited computing power. For example, the Renesas Xtreme, the most recent automotive microcontroller

Fig. 1. In the conventional VPP architecture, the EV fleet is generally considered as a type of end consumer. A VPP aggregator monitors activity on the vehicular network. In AEBIS, where each EV participates in FL by sharing local models via the blockchain, the EV fleets form a blockchain network and the VPP aggregator is thus replaced. Compared to AEBIS, NoEV introduces a combination of VPP aggregator and EVs. The aggregator first merges the local models from EVs and then uploads the global models to the blockchain. In the proposed system, a substantial number of local models are not stored in the blockchain, which ensures a more efficient environment for collaborative learning. The colored models denote local models, and the models in black denote global models. Modified from Wang et al. [31].
Security and privacy are other concerns in vehicular edge computing (VEC), which has great significance in avoiding traffic collisions, improving road efficiency, and reducing environmental impact [14]. As a concrete example, protecting the functionality related to range anxiety is critical for EV drivers. In addition, a cyberattack on EVs or charging stations can result in a large-scale charging outage that can have a significant impact on the vehicle and the power grid. Secure data sharing and management [15]-[17] have been investigated, and various federated learning-based frameworks have been proposed for vehicular networks [18], [19]. Other privacy protection frameworks such as differential privacy attempt to deal with aggregation issues but face the challenge of achieving an optimal trade-off between data utility and data leakage [20].
As a decentralized and secure framework, blockchain is a popular solution to replace the traditional approach in edge computing. It benefits federated learning in secure energy trading, management, and protection of EVs and drivers' data privacy. Secure bidirectional energy trading (charging and discharging) [21]-[24] for EVs has been investigated using a blockchain scheme. Research in [21], [25] studied both blockchain-based energy trading and data sharing in vehicle-to-grid (V2G) networks. The works in [26]-[28] proposed blockchain-based models for information authentication and trust management in a vehicular network. Other works proposed a variety of incentive-compatible schemes to encourage EV nodes to participate in demand response [29], [30]. While the above works addressed secure blockchain-based decentralized energy trading, EV participation, and data management issues in V2G, they did not concretely investigate secure data communication between the smart grid and the vehicular network. Moreover, the overall load on the network remains a significant challenge as the number of EVs continues to increase.
In previous work, we proposed an AI-Enabled Blockchain-based Electric Vehicle Integration System (AEBIS) for power management in smart grid platforms [31], [32]. AEBIS is a fully decentralized blockchain-based architecture for EV fleet model learning integrated with the VPP platform, as shown in Fig. 1(b). The system employs EV fleets as consumers and suppliers of electrical energy within a VPP platform [31]. A mechanism for charging the batteries of electric vehicles is proposed based on predicting the power consumption of the batteries using an artificial neural network. The neural network model is trained in a federated learning scheme and mapped into a reconfigurable AI chip [31]. Besides, by introducing blockchain technology into the system, secure and transparent service is achieved at the cost of storage and latency. In Fig. 1, the solid lines indicate the communication on the blockchain, and the dashed lines indicate the other communication activities. The overall decentralized architecture, EV charging mechanism, neural network configuration, and AI-chip integration were introduced in [31]. However, the earlier approach has the following shortcomings: (1) The constant proliferation of new models on the blockchain leads to a heavy load on the network. The efficiency of the blockchain suffers greatly from this problem, making it challenging to apply in real-world scenarios. (2) The system is only designed for power consumption prediction for a local area, along with weather information at the start time. In a practical scenario where an EV travels to another city, the trained model cannot handle such a complicated case because the geographical and weather information changes during the journey. (3) The state-of-the-art federated learning approaches pay little attention to attack scenarios. If a malicious model can be uploaded in any training round, the model accuracy degrades significantly.
To the best of our knowledge, none of the previous approaches, including our work in [31], [32], has simultaneously considered system efficiency in a blockchain-based vehicular network, EV participation with power consumption prediction, and edge computation robustness for local devices.

B. CONTRIBUTION
This paper proposes a semi-decentralized Robust Network of Electric Vehicles (NoEV) integration system for power management in a smart grid platform. The proposed approach integrates an aggregator with EV fleets into a blockchain framework. Each EV in NoEV executes a multi-stage algorithm to predict its power consumption based on a novel federated learning algorithm named federated learning for qualified local model selection (FL-QLMS). The main contributions of this work are summarized as follows:
• A semi-decentralized robust network of electric vehicles (NoEV) integration system for power management in a smart grid platform. The system maintains a high security level while significantly increasing the efficiency of the blockchain network.
• A multi-stage power consumption prediction method that ensures accurate prediction performance for intra- and inter-district travel.
• A novel algorithm for robust federated learning, named federated learning for qualified local model selection (FL-QLMS).
The rest of this paper is organized as follows. In Section II, we discuss related work on the integration of blockchain and FL in edge computing, EV power consumption prediction, and client selection for federated learning. In Section III, we present a semi-decentralized blockchain-based platform, which is based on a multi-stage algorithm for power consumption prediction and a novel FL model selection mechanism. Section IV provides the performance evaluation of the proposed system. Section V provides discussion, and Section VI presents the conclusion and future work. The nomenclature used in this paper is given in Tables 1 and 2.

II. RELATED WORK
In this section, we survey related works on 1) integration of blockchain and FL in the edge, 2) EV power consumption prediction, and 3) client selection in federated learning.

A. INTEGRATION OF BLOCKCHAIN AND FL IN THE EDGE
The work in [33] discussed the communication costs, resource allocation, incentive learning, and security and privacy issues. Weng et al. [34] proposed DeepChain, a framework with a value-based incentive mechanism based on blockchain for secure collaborative training. Wang et al. [35] studied two types of Byzantine attacks in a blockchain-empowered decentralized, secure multi-party learning system. Pokhrel et al. [19] proposed a local on-vehicle machine learning (oVML) method in an autonomous blockchain-based FL design. Bao et al. [36] proposed a decentralized FL system that provides incentives and disincentives for collaborative modeling. To analyze the latency performance and robustness of the blockchain system, decentralized architectures named BlockFL and FL-Block were introduced in [37] and [38], respectively. Despite considering communication, computation costs, and incentive mechanisms, the increasing number of parties in the blockchain-based FL network poses a considerable challenge to the efficiency and applicability of the systems described in the works above.

B. EV POWER CONSUMPTION PREDICTION
Vatanparvar et al. [39] proposed a novel context-aware methodology to estimate driving behavior concerning future vehicle speeds for up to 30 seconds. In [40], a speed optimization framework is modeled for both battery life and power consumption of intelligent EVs during acceleration. Since these works focused only on the acceleration process, they are not suitable for long-trip scenarios. Ferro et al. [41] presented a detailed energy consumption model considering all aspects affecting the vehicle dynamics. Baek et al. [42] introduced a general methodology that allows predicting and optimizing the operation range of EVs. Zhao et al. [43] proposed a combined machine learning model for predicting the remaining range of EVs based on real driving data. One shortcoming of these methods is the complexity of their models. That is, the prediction for a single route requires a large amount of vehicle, route, and battery data. Moreover, careful and elaborate route planning for a terrestrial EV involves high time and data storage costs. Features such as weather conditions and geography were not investigated.
Gomez-Quiles et al. [44] proposed a novel ensemble method for predicting EV power consumption by examining the non-stationary time series of consumption. Although the algorithm is used for predictions for the next month or two, it is not suitable for specific driving activities.

C. CLIENT SELECTION IN FEDERATED LEARNING
The original FedAvg algorithm in [45] randomly selects a group of clients in each training round, which means that the communication quality and delay are difficult to evaluate. Authors in [46] researched performance degradation due to non-independently and identically distributed (non-IID) data in the FL protocol. The approach focuses on the resource constraints of clients, including data heterogeneity, computation limitation, and communication capability. In [47], the authors proposed a multicriteria-based approach for client selection in FL, which aims to group many clients in each round to reduce the communication rounds. However, none of these works considered the importance of local data that affects learning performance.
He et al. [48] proposed another scheme for data selection and resource allocation based on the importance of data in the FL system to improve the learning efficiency.
The authors in [49] identified a fundamental property of FL, namely the temporal pattern and varying significance of different learning rounds. They formulated a long-term client selection and bandwidth allocation problem under finite energy constraints and proposed a new Lyapunov-based online optimization algorithm to guarantee long-term performance. Cho et al. [50] presented a convergence analysis of FL with limited client selection and demonstrated how local loss affects convergence speed. Zhang et al. [51] proposed a weight-based client selection mechanism to recognize the non-IID degrees of local data. However, the strategies mentioned above were adopted only when the clients' reputations remained unchanged. Considering that an edge node is prone to attacks in any training round, the quality of the model decreases due to tampering. Therefore, a long-term client selection mechanism is required to achieve a robust FL model.

III. ROBUST NETWORK OF ELECTRIC VEHICLES (NoEV) INTEGRATION SYSTEM
The proposed NoEV system integrates an aggregator with EV fleets into a blockchain framework. The EVs execute a multi-stage algorithm to predict the power consumption based on a novel federated learning algorithm named federated learning for qualified local model selection (FL-QLMS).

A. SECURE SEMI-DECENTRALIZED FL-BASED FRAMEWORK
As explained in the previous section, the proposed system is based on a semi-decentralized architecture. As shown in Fig. 2, the solid black lines indicate that the local models are uploaded from clients to the aggregator. This communication is not conducted on the blockchain. Other activities, which are denoted by dashed blue lines, belong to the blockchain network. A VPP aggregator, EV fleets, and a group of miners are integrated into the blockchain network. In the proposed architecture, the miners are the vehicles themselves; we display EVs and miners separately in Fig. 2 for the sake of explanation. The overall workflow for each training round is described as follows:
c) The aggregator then creates a transaction $TX$ that contains the original transaction $TX_0$, the encrypted message, and a public key.
d) The transaction is sent from the aggregator to one of the nodes and then broadcast to all miners.
e) Each miner can then start performing validation. The miner uses the same hash function $H$ to generate the hash value of $TX_0$, which we denote by $H_1$. Since the same hash function always produces the same output, $H_1$ should be identical to $H(TX_0)$. The encrypted message is then decrypted using the public key. If the resulting value matches $H_1$, the digital signature is proven to be valid. Therefore, $TX$ is considered valid and added to each node's transaction pool. Once $TX$ is confirmed by the blockchain network, it is added to the block.
f) A block header contains a 32-byte previous block hash, a 32-byte Merkle root, a 4-byte timestamp, a 4-byte difficulty target, and a 4-byte nonce. A nonce is a 32-bit value that miners guess by solving the following equation: where $n$ is a predetermined value controlling the mining difficulty.
g) Once the nonce is found, the mined block is added to the distributed ledger.
5) Each client downloads the global model from the distributed ledger for model update.
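The validation in step e) and the nonce search in step f) can be illustrated with a minimal Python sketch. It rests on assumptions that are not taken from the paper: SHA-256 stands in for the hash function $H$, the signature uses a toy textbook RSA key pair (insecure, for illustration only), and the mining rule is assumed to be the common "hash must have $n$ leading zero bits" condition, which may differ from the equation used by the authors. All function names are hypothetical.

```python
import hashlib

# Toy textbook RSA key pair (tiny primes, illustration only, not secure).
P, Q = 61, 53
N_MOD = P * Q        # modulus 3233
E_PUB = 17           # public exponent
D_PRIV = 2753        # private exponent, since (17 * 2753) % 3120 == 1


def digest(data: bytes) -> int:
    """Hash H: SHA-256 reduced modulo the toy modulus so that it can be signed."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % N_MOD


def sign(tx0: bytes) -> int:
    """Aggregator side: encrypt H(TX0) with the private key."""
    return pow(digest(tx0), D_PRIV, N_MOD)


def validate(tx0: bytes, signature: int) -> bool:
    """Miner side: recompute H1 = H(TX0) and compare it with the decrypted message."""
    h1 = digest(tx0)
    return pow(signature, E_PUB, N_MOD) == h1


def mine(header: bytes, n_bits: int) -> int:
    """Search a 32-bit nonce so that SHA-256(header + nonce) has n_bits leading zero bits.

    Assumed difficulty rule, not the paper's equation.
    """
    target = 1 << (256 - n_bits)
    for nonce in range(2 ** 32):
        h = hashlib.sha256(header + nonce.to_bytes(4, "little")).digest()
        if int.from_bytes(h, "big") < target:
            return nonce
    raise ValueError("no valid nonce in the 32-bit range")
```

In this sketch a larger `n_bits` shrinks the target and therefore increases the expected number of guesses, which mirrors the role of $n$ as the value that controls the mining difficulty.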
The local model is transmitted and merged without blockchain support. To ensure the robustness of the model aggregation, we present a novel algorithm named Federated Learning for Qualified Local Model Selection (FL-QLMS). With this algorithm, fake models are filtered out and therefore do not affect the model aggregation. Besides, a multi-stage power consumption prediction method is proposed to improve the accuracy of the models, which will be introduced in Section III-B. We will present the FL-QLMS algorithm in Section III-C. The proposed semi-decentralized FL-based platform drastically reduces blockchain congestion while maintaining a high level of system security. A functionality comparison between the decentralized (i.e., AEBIS) and the proposed semi-decentralized (i.e., NoEV) systems is given in Table 3.

B. MULTI-STAGE POWER CONSUMPTION PREDICTION
To present the multi-stage power consumption prediction, we consider a single trip from a start city to a destination, as shown in Fig. 3(a). The start city is located in Area 1 and is denoted by $City_s$. The destination is located in Area N and is denoted by $City_d$. Each city is associated with a latitude and a longitude, e.g., $City_s$ is associated with latitude $Lat_s$ and longitude $Long_s$. The duration of driving is abbreviated as DoD. To simplify the problem, we assume that DoD takes only integer values and ranges from 1 to 12 hours. The start time is denoted by $t_s$. We also assume that the EV moves at a constant speed in a straight line. Therefore, we can calculate the position of the EV at each time $t$, $t \in \{t_s, t_s + 1, \ldots, t_s + DoD - 1\}$. Each calculated position $City_c$ is called an "equal point" because the distance between two adjacent points is the same. The equal points are marked by green dots, as shown in Fig. 3(a). These equal points divide the entire path into multiple sections. We then predict the power consumption for each section and sum up the results. For each section, we need the following features: 1) start time $t$, 2) weather information at time $t$, 3) geographic information (latitude and longitude), 4) user information, and 5) duration of driving. For each equal point, we use the weather data from the nearest weather station, which is highlighted in yellow in Fig. 3(a). Algorithm 1 describes the proposed approach to predict power consumption in detail.
First, a start city $City_s(Lat_s, Long_s)$, a destination $City_d(Lat_d, Long_d)$, DoD, and start time $t_s$ are given. The latitudes and longitudes of all cities are stored in $\{Lat_k\}_{k \in K}$ and $\{Long_k\}_{k \in K}$, respectively, where $K$ denotes the set of city IDs. The weather information is represented by $\{Weather_{k,t}\}_{k \in K, t \in T}$, including temperature, rainfall, humidity, and wind speed, where $T$ is the time period of the weather data, given at hourly resolution. User_Info contains information about the driver's gender and age. $N_{total}$ denotes the total number of cities, and $M$ is the neural network model for power consumption prediction.

In Stage 1, the empty arrays $Lat_c$ and $Long_c$ are initialized for recording equal points. $Lat_p$, $Long_p$, and City_ID are used to record the nearest city to each equal point. As shown in Lines 6 and 7, we find the coordinates of the points that divide the line segment $City_s City_d$ into multiple equal parts. The length of each array is set to DoD. The temporary variables $ED$ and $ED_{min}$ are initialized for calculating and storing distance information. An empty sample $S$ is prepared as input for model prediction. In Stage 2, the latitude and longitude of each equal point are calculated, given $Lat_s$, $Long_s$, $Lat_c$, $Long_c$, and DoD. In Stage 3, for each equal point, we traverse all cities and find the nearest one by Euclidean distance.

In Stage 4, we prepare a sample for each section and perform the prediction. We extract the hour and day of the week from time $t_s + i - 1$, $i \in \{1, \ldots, DoD\}$. We extract gender and age from User_Info. Given the weather data at time $t_s + i - 1$ and the city with City_ID[$i$], we obtain temperature, rainfall, humidity, and wind speed. We also obtain the latitude $Lat_p$ and the longitude $Long_p$. Finally, we input the sample $S$ into the model $M$ and accumulate the output, $PC_{pred} = PC_{pred} + M(S)$. When the prediction is completed for each driving section, we return the final result $PC_{pred}$.
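The stages described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' Algorithm 1: the dictionary layouts for `cities` and `weather`, the order of the 11 features, the one-hour duration per section, and all function names are hypothetical, and the model is any callable that maps one sample to a power consumption value.

```python
from datetime import datetime, timedelta
from math import hypot


def equal_points(start, dest, dod):
    """Stage 2: positions at t_s, t_s + 1, ..., t_s + DoD - 1 (straight line, constant speed)."""
    (lat_s, long_s), (lat_d, long_d) = start, dest
    return [(lat_s + (lat_d - lat_s) * i / dod,
             long_s + (long_d - long_s) * i / dod) for i in range(dod)]


def nearest_city(point, cities):
    """Stage 3: ID of the city with the smallest Euclidean distance to the point."""
    return min(cities, key=lambda k: hypot(cities[k][0] - point[0],
                                           cities[k][1] - point[1]))


def predict_trip(start, dest, dod, t_s, cities, weather, user_info, model):
    """Stage 4: build one sample per section, run the model, and sum the outputs."""
    pc_pred = 0.0
    for i, point in enumerate(equal_points(start, dest, dod)):
        t = t_s + timedelta(hours=i)
        city_id = nearest_city(point, cities)
        lat_p, long_p = cities[city_id]
        temperature, rainfall, humidity, wind_speed = weather[(city_id, t)]
        gender, age = user_info
        # Hypothetical feature layout; each section is assumed to last one hour.
        sample = [t.hour, t.weekday(), temperature, rainfall, humidity,
                  wind_speed, lat_p, long_p, gender, age, 1]
        pc_pred += model(sample)
    return pc_pred
```

Here `cities` maps a city ID to a `(latitude, longitude)` pair and `weather` maps `(city ID, datetime)` to a `(temperature, rainfall, humidity, wind speed)` tuple. Summing per-section outputs keeps the model input local to each section, so the weather and location can change along the route.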

C. FL-QLMS ALGORITHM
As we explained in section II-C, the conventional approaches (i.e., work in [45]) randomly select a group of clients in each training round, which means that the communication quality and delay are challenging to evaluate. Moreover, the approach makes the model vulnerable to client attacks, which eventually leads to severe degradation of the prediction performance (e.g., accuracy in classification or root mean squared error in linear regression). Therefore, to ensure a robust learning environment, it is necessary to always select the ''qualified'' local models for aggregation, where qualified models are considered not polluted and contribute to the performance of the global model.
In the proposed FL-QLMS algorithm, we focus on selecting a group of "qualified" local models for model aggregation. In general, if the distribution of the data is similar, the convergence trend of a local model should also be similar to that of the centralized model [53]. Thus, if the parameters of a local model are similar to those of the centralized model, that is, if the parameter diversity between the two models is low, the local model is considered to contribute to model aggregation. On the other hand, if a local model is contaminated by a malicious attack, the diversity between the contaminated model and the centralized model should be high. The diversity between two models can be expressed as follows: where $DI_{a,b}$ denotes the diversity between model parameters $P_a$ and $P_b$. Consider an FL process with $N$ clients. Each training round consists of the following six steps:
1) First, each client trains its local model using the collected local data set. In each local model, the gradient $\nabla g_L$ is calculated using the adaptive moment estimation (Adam) optimizer [54], as shown by the following formula: where $W$ denotes a set of weights, and $E(W)$ denotes the loss function with respect to $W$. $E(W)$ is used for measuring the model error and finding an optimal solution. Also, $\delta$ indicates partial derivatives.
2) Each client uploads the local model $M^i_{local}$ to the aggregator. Besides, the aggregator is informed of the local data size $D^i_{local}$ from each client, where $D^i_{local}$ denotes the local data set of client $i$, $i \in N$.
3) The aggregator selects a group of uploaded models based on the FL-QLMS algorithm. The number of selected models is determined by the parameter $\alpha$, i.e., $\alpha\%$ of all models are used for aggregation. Given a total set of $N$ models, the number of selected models is $N_{selected} = \alpha\% \cdot N$. The list of selected models is denoted by $M_{selected}$.
4) Before aggregating the models, we need to calculate the contribution of each selected model with respect to the corresponding data size [45]: where $\sum_{m}^{N_{selected}} D^m_{local}$ is the total data size with respect to the selected models.
5) The selected models are aggregated, resulting in a global model with gradient $\nabla g_L$ [45]:
6) Once the edge nodes receive the global model from the server side, they update the parameters as follows [54]:

where $W_r$ and $b_r$ denote the weights and biases in the $r$-th training round, respectively, and $\eta$ denotes the learning rate.

We present the FL-QLMS algorithm with and without an auxiliary model. Algorithm 2 describes how FL-QLMS works when an auxiliary data set is available. There are two ways to obtain a reliable auxiliary data set. One option is to pay the EV clients for the data set and obtain it on the spot. Another possibility is that the aggregator uses a group of EVs to collect data. Both methods collect the data without online data transmission, thus avoiding data leakage. The auxiliary data set is prepared on the aggregator's side. We denote the auxiliary model as $M_{aux}$. First, we store all parameters (weights and biases) of $M_{aux}$ as a one-dimensional vector, denoted by $P_{aux}$. We treat each local model $M^i_{local}$ in the same way and obtain the flattened vector $P_i$. $P_{aux}$ and $P_i$ have the same size, i.e., $|P_{aux}| = |P_i|$. Then, for each model, we calculate the diversity between $P_{aux}$ and $P_i$ using the Manhattan distance:

$$DI_{aux,i} = \sum_{j=1}^{|P_{aux}|} \left| p^j_{aux} - p^j_i \right|$$

where $p^j_{aux}$ is a parameter of $P_{aux}$, and $p^j_i$ is a parameter of $P_i$. Then, the $\alpha\% \cdot N$ models with the lowest $DI_{aux,i}$ are selected for aggregation.

Algorithm 2 FL-QLMS With Auxiliary Model
Require: Auxiliary model $M_{aux}$, local models $\{M^i_{local}\}_{i \in N}$, the total number of clients $N$, and parameter $\alpha$
Ensure: List of selected models for aggregation
1: Initialize an empty list $M_{selected}$, which is used to store the selected local models
2: Store all parameters of $M_{aux}$ as a one-dimensional array, denoted by $P_{aux}$
3: Store all parameters of each $M^i_{local}$ as a one-dimensional array, denoted by $P^i_{local}$
4: for each $i \in N$ do
5:   Calculate the diversity between $P_{aux}$ and $P^i_{local}$ using the Manhattan distance, denoted by $DI_{aux,i}$
6: end for
7: Select the $\alpha\% \cdot N$ models with the lowest $DI_{aux,i}$ and store them in the list $M_{selected}$
8: return $M_{selected}$
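Steps 3) to 5) of the training round, with Algorithm 2 as the selection rule, can be sketched in Python as follows. This is a minimal illustration under assumptions that are not taken from the paper: models are already flattened into plain lists of floats, $\alpha\% \cdot N$ is rounded down to an integer, and the aggregation uses the data-size weighting of FedAvg [45] applied directly to the flattened vectors. All function names are hypothetical.

```python
def manhattan(p_a, p_b):
    """Diversity between two flattened parameter vectors (Manhattan distance)."""
    return sum(abs(a - b) for a, b in zip(p_a, p_b))


def select_with_auxiliary(p_aux, local_params, alpha):
    """Return the indices of the alpha% of local models closest to the auxiliary model."""
    n_selected = int(alpha * len(local_params) / 100)  # assumption: round down
    diversity = [manhattan(p_aux, p) for p in local_params]
    ranked = sorted(range(len(local_params)), key=lambda i: diversity[i])
    return ranked[:n_selected]


def aggregate(local_params, data_sizes, selected):
    """Weight each selected model by its share of the selected data and sum the vectors."""
    total = sum(data_sizes[m] for m in selected)
    dim = len(local_params[selected[0]])
    return [sum(data_sizes[m] / total * local_params[m][k] for m in selected)
            for k in range(dim)]
```

Because the weights are normalized over the selected models only, a rejected model contributes nothing to the global model regardless of the data size it reports.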
Algorithm 3 describes how FL-QLMS works when an auxiliary data set is not available. For each local model $M^i_{local}$, we store all parameters (weights and biases) as a one-dimensional vector, denoted by $P^i_{local}$. We then calculate the diversity $DI_{i,j}$ between $P^i_{local}$ and each $P^j_{local}$, where $j \in N$ and $j \neq i$. The average diversity of $M^i_{local}$ can then be computed as follows:

$$\overline{DI}_i = \frac{1}{N-1} \sum_{j=1, j \neq i}^{N} DI_{i,j}$$

Algorithm 3 FL-QLMS Without Auxiliary Model
Require: Local models $\{M^i_{local}\}_{i \in N}$, the total number of clients $N$, and parameter $\alpha$
Ensure: List of selected models for aggregation
1: Initialize an empty list $M_{selected}$ used to store the selected local models
2: Store all parameters of each $M^i_{local}$ as a one-dimensional array, denoted by $P^i_{local}$
3: for each $i \in N$ do
4:   for each $j \in N$ and $j \neq i$ do
5:     Calculate the diversity between $P_i$ and $P_j$ using the Manhattan distance, denoted by $DI_{i,j}$
6:   end for
7:   $\overline{DI}_i = \frac{1}{N-1} \sum_{j=1, j \neq i}^{N} DI_{i,j}$  /* Calculate the average diversity between $P_i$ and $\{P^j_{local}\}_{j \in N, j \neq i}$ */
8: end for
9: Select the $\alpha\% \cdot N$ models with the lowest $\overline{DI}_i$ and store them in the list $M_{selected}$
10: return $M_{selected}$

A model with a lower average diversity is considered more representative. In other words, the data set associated with the model is considered to have a similar distribution to the entire data set. For this purpose, the $\alpha\% \cdot N$ models with the lowest $\overline{DI}_i$ are selected for aggregation.
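The selection without an auxiliary model can be sketched in Python in the same spirit. The sketch assumes flattened parameter lists, at least two clients, and rounding $\alpha\% \cdot N$ down to an integer; none of these details, nor the function names, are taken from the paper.

```python
def manhattan(p_a, p_b):
    """Diversity between two flattened parameter vectors (Manhattan distance)."""
    return sum(abs(a - b) for a, b in zip(p_a, p_b))


def average_diversity(local_params):
    """Average diversity of each model with respect to all other models."""
    n = len(local_params)
    return [sum(manhattan(local_params[i], local_params[j])
                for j in range(n) if j != i) / (n - 1)
            for i in range(n)]


def select_without_auxiliary(local_params, alpha):
    """Return the indices of the alpha% of models with the lowest average diversity."""
    n_selected = int(alpha * len(local_params) / 100)  # assumption: round down
    avg = average_diversity(local_params)
    ranked = sorted(range(len(local_params)), key=lambda i: avg[i])
    return ranked[:n_selected]
```

The pairwise comparison costs on the order of $N^2$ distance evaluations per round, compared with $N$ evaluations when an auxiliary model is available.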

A. EVALUATION METHODOLOGY
To show the advantage of our proposed system in terms of cost-efficiency, we studied the network load in a blockchain system and compared the proposed NoEV with AEBIS, oVML, and DeepChain. We mainly focused on the number of blocks and transactions generated in a given period. We used BlockSim, an extensible simulation tool for blockchain systems introduced in [55]. The configurations are summarized in Table 4. We simulated 63 nodes for AEBIS, oVML, and DeepChain and 63 + 1 nodes (1 additional node for the aggregator) for NoEV. We performed ten runs for each simulation, with each run lasting 6000 seconds.
As discussed previously, the data set for the power consumption prediction includes weather, geography, and user information. We collected weather data from December 2019 to November 2020 in 63 cities in Japan [56]. The start time of vehicle reservation was set from 0:00 to 23:00 and the duration of driving from 1 to 12 hours. We considered the age of drivers ranging from 21 to 69 years old. The daily power consumption was measured considering the input characteristics and the measurement model [31]. We summarize the detailed information of the data set in Table 5. The data set contains a total of 66000 samples. We compared the proposed multi-stage power consumption prediction with the original power consumption prediction (PCP). We investigated the performance of the two methods under different driving activities: (a) short-distance journey, (b) mid-distance journey, and (c) long-distance journey. We summarize our definition of the above three activities in Table 6.
We considered a set of N = 63 clients in the federated learning environment. The data set contains 63000 training samples and 3000 test samples. First, we studied the effects of the model initialization methods: (a) global initialization and (b) local initialization. We considered an independent and identically distributed (IID) setting and employed the FedAvg (Federated Averaging) algorithm [45]. Then we considered a scenario where the local data is non-IID. Finally, going a step further, we compared the robustness of different FL algorithms against client attacks. For each algorithm, the simulation was repeated 20 times. Each simulation included 50 iterations. We used the root mean square error (RMSE) to measure the model's performance.
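For reference, the RMSE metric used throughout the evaluation can be computed as in the following short Python sketch. The function name and the list-based interface are illustrative choices, not part of the evaluation code of the paper.

```python
from math import sqrt


def rmse(y_true, y_pred):
    """Root mean square error between measured and predicted values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return sqrt(sum(squared_errors) / len(squared_errors))
```

Because the errors are squared before averaging, a few large deviations raise the RMSE more than many small ones, which makes the metric sensitive to the effect of manipulated models.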

1) BLOCKCHAIN EFFICIENCY ANALYSIS
The block size was set to 1 megabyte (MB). We considered different combinations of $TI$ and $B_{delay}$, which represent the average time to generate a new block and the propagation delay of a block, respectively. In [55], the transaction size $T_{size}$ is 572.5 bytes by default. In our experiment, $T_{size}$ is larger because each transaction must additionally store a portion of a model. The total number of parameters for our fully connected network (11-8-8-1) is 157. Each parameter in floating-point format occupies 4 bytes; thus, if we extract the parameters from the model, the total size is 157 × 4 = 628 bytes. In general, the return operator (OP_RETURN), which is part of the Bitcoin script language, allows metadata to be stored on the blockchain with a maximum storage limit of 83 bytes according to release 0.12.0 [57]. Therefore, at least eight transactions are required for each model, since 628/83 ≈ 7.6. The updated transaction size $T_{size}$ is 572.5 bytes + 628/8 bytes ≈ 650 bytes. We implemented 63 nodes ($N_0$ to $N_{62}$) for the AEBIS, oVML, and DeepChain simulations, corresponding to a total of 63 EV clients. For simplicity, we consider a scenario in which each miner has the same hash power. Therefore, given 63 nodes and a total hash power of 1, each of them has a hash power of approximately 1.587%. For the NoEV simulation, the aggregator is introduced as an additional node $N_{63}$. Since $N_{63}$ is not assigned any mining task, its hash power is set to 0%. We assume that the number of transactions ($T_n$) created per second is eight in NoEV. Accordingly, $T_n$ = 8 × 63 = 504 in AEBIS since 63 nodes are considered. Table 7 summarizes the results of AEBIS, NoEV, oVML, and DeepChain on the BlockSim simulator. When the average block interval increases, the total number of blocks decreases accordingly. Moreover, as the block propagation delay increases, the number of blocks included in the main chain decreases, while the number of stale blocks increases.
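The sizing arithmetic of this configuration can be retraced with a few lines of Python. The constants are the values quoted in the text; the variable names are illustrative. The exact sum for the transaction size evaluates to 651.0 bytes, close to the 650 bytes quoted above.

```python
from math import ceil

N_PARAMS = 157            # parameters of the 11-8-8-1 network, as stated in the text
BYTES_PER_PARAM = 4       # floating-point parameter size in bytes
OP_RETURN_LIMIT = 83      # OP_RETURN storage limit in bytes
BASE_TX_SIZE = 572.5      # default BlockSim transaction size in bytes
N_MINERS = 63             # EV nodes with equal hash power

model_bytes = N_PARAMS * BYTES_PER_PARAM               # 157 * 4 = 628 bytes
tx_per_model = ceil(model_bytes / OP_RETURN_LIMIT)     # smallest count that fits the model
tx_size = BASE_TX_SIZE + model_bytes / tx_per_model    # base size plus one model portion
hash_power_percent = 100 / N_MINERS                    # equal share per miner
tx_rate_noev = tx_per_model                            # eight transactions per second
tx_rate_aebis = tx_rate_noev * N_MINERS                # every EV uploads its own model

print(model_bytes, tx_per_model, tx_size, round(hash_power_percent, 3), tx_rate_aebis)
```

The factor of 63 between the two transaction rates reflects that, in AEBIS, every client writes its local model to the chain, whereas in NoEV only the aggregated global model is written.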
Stale blocks have been successfully mined but are not included in the current best chain. Therefore, the overall rate of stale blocks increases. Compared with the other methods, NoEV generally requires the fewest transactions, especially for short $T_I$. For example, for a short block interval ($T_I$ = 30) and a short block propagation delay ($B_{delay}$ = 1), NoEV requires an average of 25166 transactions, which is 38%, 37%, and 35% fewer than AEBIS, oVML, and DeepChain, respectively. This reduction arises because NoEV requires only the global model to be transmitted on the blockchain, while the other methods require frequent local model transmission. To increase communication efficiency, DeepChain averages the model updates every 10 to 20 iterations rather than at each iteration, as is done in AEBIS and oVML. However, DeepChain and oVML still require the exchange of local models over the blockchain network.
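The transaction-size and hash-power arithmetic of this subsection can be reproduced with a few lines of Python. This is only an illustrative sketch: the constant names and the function `transaction_budget` are ours, while the numeric values are those quoted in the text. Note that the exact sum 572.5 + 78.5 is 651 bytes, which the text reports as 650 bytes.

```python
import math

N_PARAMS = 157         # parameter count of the 11-8-8-1 network, as stated in the text
BYTES_PER_PARAM = 4    # one floating-point value
OP_RETURN_LIMIT = 83   # bytes of metadata per transaction (release 0.12.0)
BASE_TX_SIZE = 572.5   # default transaction size in bytes, from [55]
N_MINERS = 63          # EV clients acting as miners
TX_PER_SECOND = 8      # transactions per second assumed for NoEV


def transaction_budget():
    """Recompute the quantities used in the blockchain efficiency analysis."""
    model_bytes = N_PARAMS * BYTES_PER_PARAM
    # Smallest number of OP_RETURN payloads that can hold the whole model.
    tx_per_model = math.ceil(model_bytes / OP_RETURN_LIMIT)
    # The model is spread evenly over these transactions.
    tx_size = BASE_TX_SIZE + model_bytes / tx_per_model
    return {
        "model_bytes": model_bytes,
        "tx_per_model": tx_per_model,
        "tx_size": tx_size,
        "hash_power_percent": 100 / N_MINERS,
        "aebis_tx_per_second": TX_PER_SECOND * N_MINERS,
    }
```

Calling `transaction_budget()` gives 628 bytes per model, eight transactions per model, and 504 transactions per second for AEBIS, in line with the values above.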

2) MULTI-STAGE POWER CONSUMPTION PREDICTION
A comparison between PCP and the proposed multi-stage PCP is illustrated in Fig. 4. The overall prediction results are shown in Fig. 4(a), where the multi-stage PCP achieves a 5.7% lower RMSE than PCP. We observed that the multi-stage PCP also performs better for short-distance journeys. This result is surprising because the original PCP mainly focuses on local driving activities and already achieves decent performance there. The most compelling case is long-distance driving. As illustrated in Fig. 4(d), the multi-stage PCP still achieves better results, with a 14.3% lower RMSE. In addition, we analyzed the variance of the performance of the two methods in each case. For medium and long distances, the variance of the RMSE of the multi-stage PCP is larger than that of PCP. This can be explained by the multi-stage approach itself. For a long trip, the multi-stage PCP first divides the journey into multiple sections and then runs the prediction model on each section. When the prediction results are summed, the errors of the individual predictions accumulate as well. Therefore, the multi-stage PCP leads to higher variability. On the other hand, for a short trip, e.g., one or two hours, the multi-stage approach has little effect, and therefore the variance of the multi-stage PCP is lower.
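The divide-and-sum idea described above can be sketched as follows. This is a simplified illustration under our own assumptions: the journey is split by distance into sections of a fixed maximum length, and `predict_section` is a hypothetical stand-in for the trained prediction model. Neither the splitting rule nor the function names are taken from the original implementation.

```python
def split_journey(total_km, max_section_km):
    """Split a trip into consecutive sections no longer than max_section_km."""
    if total_km <= 0 or max_section_km <= 0:
        raise ValueError("distances must be positive")
    sections = []
    remaining = total_km
    while remaining > 0:
        length = min(max_section_km, remaining)
        sections.append(length)
        remaining -= length
    return sections


def multi_stage_prediction(total_km, max_section_km, predict_section):
    """Predict consumption per section and sum the per-section results.

    predict_section is a hypothetical callable that maps a section
    length in km to a predicted energy consumption.
    """
    sections = split_journey(total_km, max_section_km)
    return sum(predict_section(km) for km in sections)
```

Because the final estimate is a sum, the per-section errors add up as well. If these errors were independent with equal variance, the variance of the total would grow in proportion to the number of sections, which is consistent with the higher variability observed for medium and long distances.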

3) FEDERATED LEARNING FOR QUALIFIED LOCAL MODEL SELECTION (FL-QLMS)
We considered a set of N = 63 clients for the FL schedule. We split the whole data set $D$ into the training set $D_{train}$ of 63000 samples and the test set $D_{test}$ of 3000 samples. First, we evaluated two approaches to model initialization: a) global initialization and b) local initialization. Global initialization means that the aggregator creates an initial model and distributes it to all clients. Local initialization, on the other hand, means that each client creates its own initial model and performs the training task. FedAvg is used for model aggregation. The number of local updates is set to one before each global aggregation. We randomly assigned 1000 samples to each client. Thus, each subset $D_i^{iid}$ follows an independent and identical distribution (IID), where $D_{train} = D_1^{iid} \cup D_2^{iid} \cup \cdots \cup D_N^{iid}$. Fig. 5 illustrates the impact of the two model initialization options on training performance. The red and blue shaded regions denote the performance fluctuation of local and global initialization, respectively. While local initialization leads to slower convergence in the first 20 iterations, it achieves a lower average RMSE (7.77) than global initialization at the end of training. This suggests that it is preferable to build the initial models on the client side rather than on the server side. Therefore, we implement local initialization in the following FL simulations.
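As an illustration of the aggregation step, the following sketch shows FedAvg-style averaging in plain Python, where each client model is represented as a flat list of parameters and weighted by its local sample count. The function name `fedavg` and the list-based representation are our assumptions for readability and do not reflect the original implementation.

```python
def fedavg(client_params, client_sizes):
    """Average client parameter vectors, weighted by local sample count."""
    if not client_params or len(client_params) != len(client_sizes):
        raise ValueError("need one sample count per client model")
    total = sum(client_sizes)
    dim = len(client_params[0])
    aggregated = [0.0] * dim
    for params, size in zip(client_params, client_sizes):
        if len(params) != dim:
            raise ValueError("all client models must have the same size")
        weight = size / total
        for j, value in enumerate(params):
            aggregated[j] += weight * value
    return aggregated
```

In the IID setting considered here, every client holds 1000 samples, so the weights are equal and the aggregation reduces to a plain mean over the 63 client models.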
We then considered a scenario where all local data is non-IID. We refer to this scenario as Scenario-I. We distributed the entire data set across N = 63 clients, each of which is associated with 1 to 5 start cities. In addition, each local data set $D_i^{non-iid}$ contains different reservation times, i.e., morning, afternoon, or evening. For each $D_i^{non-iid}$, $i \in \{1, \dots, N\}$, the data size ranges from 200 to 2000. Similar to the IID scenario, we have $D_{train} = D_1^{non-iid} \cup D_2^{non-iid} \cup \cdots \cup D_N^{non-iid}$. We compared the performance of FedAvg, FedCS, and the proposed FL-QLMS with and without the auxiliary model $M_{aux}$. As shown in Fig. 6, FL-QLMS with the auxiliary model performs similarly to FedAvg, while neither algorithm keeps up with FedCS, which achieves an average RMSE of 7.28. One reason is that FedAvg and FedCS are, to some extent, robust to the non-IID setting. In addition, compared to FedAvg and FL-QLMS, FedCS allows twice as many clients in each training round. We also found that the average RMSE of FL-QLMS without the auxiliary model is higher than that of the other methods, which reflects the importance of the auxiliary model during training.
We further investigated the impact of hacked clients on various FL algorithms. We refer to this scenario as Scenario-II. We assume that k% of all clients are hacked in each training round. Each hacked client uploads a malicious model whose parameters all take random values between -1 and 1. We used the same settings for data distribution and training simulation as in Scenario-I. Fig. 7 shows how each method performs against model attacks of varying severity. FL-QLMS (with $M_{aux}$) is shown to be robust when 10% to 40% of the clients are hacked, with its average performance remaining constant. In contrast, FedAvg and FedCS are highly sensitive to attacks, as the training process hardly converges when the number of manipulated models increases. FL-QLMS (without $M_{aux}$) always converges, but with slightly worse performance than FL-QLMS (with $M_{aux}$).
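The attack model of Scenario-II can be simulated with the short sketch below. It only injects malicious models and does not implement the FL-QLMS selection step. Several details are our assumptions: the random parameters are drawn from a uniform distribution, the number of hacked clients is obtained by rounding k% of the client count, and the function name `hack_clients` is hypothetical.

```python
import random


def hack_clients(client_params, k_percent, rng):
    """Replace about k% of the client models with random ones.

    Returns the attacked list of models and the set of hacked indices.
    Uniform sampling in [-1, 1] is an assumption; the text only states
    that the parameters range randomly from -1 to 1.
    """
    n_clients = len(client_params)
    n_hacked = round(n_clients * k_percent / 100)
    hacked = set(rng.sample(range(n_clients), n_hacked))
    attacked = []
    for i, params in enumerate(client_params):
        if i in hacked:
            attacked.append([rng.uniform(-1.0, 1.0) for _ in params])
        else:
            attacked.append(list(params))  # honest model, copied unchanged
    return attacked, hacked
```

Passing a seeded `random.Random` instance as `rng` makes a simulation run repeatable, and calling the function once per training round draws a fresh set of hacked clients each time, matching the assumption that k% of the clients are hacked in every round.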

V. DISCUSSION
FIGURE 6. Comparison among FedAvg, FedCS [46], and the proposed FL-QLMS (w/o the auxiliary model). In this experiment, a Non-IID setting is considered. An average RMSE is shown beside each boxplot.

A semi-decentralized FL-based architecture is proposed to integrate both an aggregator and edge nodes into a blockchain platform. Although the blockchain does not secure the transmission of local models from the client to the server side, the aggregator can apply a robust model selection strategy to counter potential model attacks during or before dispatch. In this way, we significantly reduce the communication load on the blockchain and still maintain a robust and secure network. A single simple AI model is not suitable for accurately predicting battery power consumption over a long-distance trip. Since a long journey can be divided into multiple small sections, a multi-stage algorithm helps reduce the prediction error. A shortcoming of our strategy lies in the assumption that the driving activity is a uniform linear motion, which is an idealization that does not hold in practice. To transfer the approach to a real scenario, we would prefer to create an optimal route based on the global positioning system. Moreover, the efficient division of the whole trip into several sections remains a problem to be optimized. Besides, a qualified local model selection is essential to ensure the robustness of federated learning. The FL-QLMS algorithm demonstrates robustness against model attacks during the federated process. However, the performance of the current FL-QLMS algorithm is highly dependent on a prepared auxiliary data set, which raises two critical issues. First, the auxiliary data should ideally have the same distribution as the entire data set, which is not guaranteed. In addition, since the client-side data is updated daily, the auxiliary data may become unreliable for local model selection. Second, due to privacy and security concerns, edge nodes may not share raw data with the server.

VI. CONCLUSION
This work presented a semi-decentralized Robust Network of Electric Vehicles (NoEV) integration system for power management in a smart grid platform. NoEV integrates an aggregator with EV fleets into a blockchain framework, where EVs execute a multi-stage algorithm to predict EV power consumption using a novel FL-QLMS algorithm. In addition, we evaluated the proposed semi-decentralized system regarding storage and communication efficiency in the blockchain network. Compared to the previous approaches, NoEV requires 35% fewer transactions for short block intervals and propagation delays. The comparison results show that the proposed system achieves better network efficiency while maintaining the system's security level. Moreover, the system achieves a 5.7% lower root mean square error (RMSE) than the conventional PCP approach, significantly improving power consumption prediction. In addition, the FL-QLMS approach outperforms state-of-the-art methods in terms of robustness to client-side attacks. The evaluation results demonstrate that the performance of FL-QLMS is not affected when 10% to 40% of the models are manipulated.
Nevertheless, the proposed system still has room for improvement. First, the robustness of the system relies on the high stability of the aggregator. The aggregator collects local updates and broadcasts the global model to the blockchain. If the aggregator fails, a backup server is needed to maintain the system. Second, in our blockchain proposal, EV clients are deployed to act as miners. If the local training requires a more powerful edge device, the computing device on the EV may not be sufficient. In that case, other miners need to be associated with the EVs, and the system architecture needs to be redesigned.
In our future work, we plan to investigate the mining reward mechanism by extending our work to both public car-sharing and private car services. In addition, we will also study the security issue in more complicated attack scenarios.