User Behavior Clustering Based Method for EV Charging Forecast

The increasing adoption of electric vehicles poses new problems for the electrical distribution network. For this reason, accurate electric vehicle load forecasting will be of fundamental importance for a predictive energy management system, which could greatly help the operation of the grid. This paper proposes a comprehensive novel methodology to forecast the single charging sessions of electric vehicles and the resulting cumulative energy demand of the charging infrastructure. Historical charging sessions are first clustered on the basis of similar user characteristics, and their respective probability density functions are defined. From these, every charging session is predicted as a triplet of parameters, namely the arrival time, the charging duration and the average power expected during the process. The proposed method has been evaluated on a real case study. The results show a marked improvement in accuracy with respect to the chosen benchmark, both in terms of the energy required by the station and the predicted number of charging sessions. The overall performance measured by the Skill Score is 0.37 for the year 2019.


I. INTRODUCTION
The European government recently required a set of measures aiming to decarbonize the transport sector [1]. Stronger CO2 emissions standards for cars, vans and vehicles in general will accelerate the transition to zero-emission mobility by requiring the average emissions of new cars to fall by 55% relative to 2021 levels by 2030, and by 100% by 2035. As a result, all new cars registered in 2035 will be zero-emission [2]. To ensure that Electric Vehicle (EV) drivers can charge their vehicles on a reliable network across Europe, adequate regulation will require Member States to expand charging capacity in line with zero-emission car sales.
In the United States, the Advanced Clean Cars II (ACC II) regulatory proposal [3] was recently approved by the California Air Resources Board. Starting with the year 2026, it will gradually drive the sales of Zero Emission Vehicles (ZEVs) in California up to 100% ZEVs by 2035, including Battery Electric Vehicles (BEVs), hydrogen Fuel Cell Electric Vehicles (FCEVs) and the cleanest-possible Plug-in Hybrid Electric Vehicles (PHEVs), while reducing carbon emissions from new Internal Combustion Engine Vehicles (ICEVs). For these reasons, an accurate electric load forecast of the power requested by EV charging stations is needed, and thus the topic has gained more attention recently.
The associate editor coordinating the review of this manuscript and approving it for publication was Akin Tascikaraoglu.
Electric load forecasting methods can be categorized into statistical time series models and artificial intelligence models [4]. At the very early stage of EV load forecasting studies, statistical models [5] were the most suitable choice, as the lack of real, comprehensive data about EV charging made it necessary to build plausible data scenarios through computational algorithms [6]. However, these load forecasting methods were unable to provide predictions with sufficient accuracy. The recent growth in EV adoption has led to publicly available datasets, and therefore EV load forecasting has moved from a purely probabilistic approach to a data-driven one. For example, in [7] the daily load data of the Beijing Olympic Games EV Charging Station in 2010 are employed. Recently, the main direction of research in EV load forecasting has focused on different types of Artificial Neural Networks (ANNs) and Machine Learning (ML) models [8], [9].
VOLUME 11, 2023. This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
In [4] a novel Long Short-Term Memory (LSTM)-based model is established for forecasting EV charging station load. Actual EV charging measurements are adopted for model training and validation, and an ultra-short-term load forecast at two different time scales is investigated. The LSTM is compared to a simpler ANN structure; numerical results demonstrate that the LSTM model achieves higher accuracy than the traditional ANN. In [10], the authors propose a machine learning ensemble model named EnLSTM-WPEO, based on a cluster of LSTMs, to forecast short-term traffic flow. Reference [11] is a performance comparison of four deep learning-based methods: a Deep Neural Network (DNN), a Recurrent Neural Network (RNN), an LSTM and a Gated Recurrent Unit (GRU). Each is implemented as an hourly forecast. At the end of the experimental process, during which the change in the number of hidden layers was also investigated, the one-hidden-layer GRU model outperformed the other three models. Reference [12] compared deep learning approaches for ultra-short-term charging load forecasting of Plug-in Electric Vehicles (PEVs).
The aggregate results indicate that machine learning models effectively forecast ultra-short-term PEV charging load, providing accurate prediction curves in both cases. Among the deep learning methods, the LSTM model is superior to the other methods. It is worth noting that [4], [11], [12] basically adopt the same approach applied to the same database (the data platform of a company that operates a large proportion of the EV charging stations in Shenzhen, China), with different time horizons. In [13] a model based on Wavenet is described and evaluated through a comparison with state-of-the-art models such as ARIMA, LSTM, GRU, Multi Layer Perceptron (MLP), causal 1D-Convolutional Neural Networks (1D-CNN) and ConvLSTM (Encoder-Decoder). Wavenet, originally designed for generating raw audio waveforms, uses dilated causal convolutions and skip connections to exploit long-term information. It performs better than the other architectures for 24-hour-ahead forecasting using the RTE dataset of the whole country of France [14]. However, the bibliographic review shows that a uniquely superior EV load forecasting method has not been identified.
In [15] the authors present a long-term demand forecasting Sequence to Sequence (Seq2Seq) model, which is particularly suited to EV forecasting as it takes into account the temporal correlation between historical records at different time steps. The performance is evaluated after training the model on two real-world datasets from the USA, and the benchmark models are Historical Average, ARIMA, Facebook Prophet, XGBoost, and LSTM. The researchers demonstrated that the Seq2Seq model yielded the best results; however, the sparsity of the data makes the superiority of Machine Learning (ML) models less obvious.
Finally, in [16], the authors propose a deep-learning-based method for short-term (5-minute) probabilistic EV charging demand forecasting. The model is based on Machine Theory of Mind to forecast users' behavior both in terms of habits and temporary conditions. It is composed of three networks: the habit net, the trend net, and the forecast net. Both the habit net and the trend net are built on LSTM networks, while the forecast net is a three-layer fully-connected NN. Applied to two case studies on real EV charging demand, the method improves upon the baseline methods, represented by quantile forecast models belonging to either parametric or non-parametric families. However, as these ML techniques strongly rely on data, it is evident that researchers are struggling against two main problems. First, sufficiently large datasets that could serve as a statistically representative sample of actual EV charging station operation are not available. Second, as the timing of the whole charging process may differ strongly among vehicles due to their different states of charge on arrival at the EV station, the focus has recently moved from modeling the expected values of the electrical power load time series towards forecasting specific parameters of the EV charging process. For instance, in [17] the authors aimed at minimizing the uncertainty of the charging duration and synchronizing it with the typical schedule of the Electric Car Sharing (ECS) operators. However, some problems affecting traditional load forecast approaches are still unsolved: charging power is curtailed due to the power capacity of the station, and the related EV features are not deducible from commonly available parameters. Consequently, the behavior of end users becomes predominant and must be considered.
The main objective of the present work is to develop a comprehensive and accurate EV charging session forecasting method that fully represents the power required by the charging station. Most existing works focus on the forecast and reconstruction of the aggregated power profile. Here, instead, the EV load forecast decomposes the aggregate load power curve into a summation of the individual EV charging sessions most likely to have resulted in that curve. The individual EV charging sessions are determined by clustering user behavior data, namely the time at which charging started and the energy consumed over the charging period. The resulting forecast (both in terms of load power curve and EV charging sessions) is fundamental to overcoming the traditional concept of unidirectional and non-controllable load (or "passive load"), where the only control is given by the enforcement of power availability constraints (i.e. load curtailment). Instead, the proposed methodology allows the shift to a new concept of controllable bidirectional load ("active load") through the active scheduling of the energy requested by individual EVs on the basis of the energy available from the grid and the expected end-user needs, which could also be shared by the EV owners. This enriched forecast offers more valuable and complete foresight for EV charging station operators than an aggregated load power curve forecast alone. In fact, knowing the charging session composition underlying the power demand curve makes it possible to better support the Energy and Power Management System in the optimization task: knowing the charging duration and total energy consumption of each session permits the prioritization and optimization of session charging power, subject to the infrastructure and economic constraints [18].
The paper is organized as follows: in section II the proposed novel methodology is explained and detailed; in section III the implemented metrics are shown while in section IV a comprehensive case study is presented. Finally, in section V conclusions are drawn.

II. METHODOLOGY
This section details the proposed comprehensive methodology, which provides an accurate forecast of the power trend required in the following 24 hours by Electric Vehicles at the EV charging station. The methodology workflow is fully depicted in Figure 1.
As depicted in the workflow, on the basis of historical data, the procedure is divided into two main parts: the first one, related to the dashed green box, is devoted to feature extraction and feature engineering, and the second to the actual forecast, leveraging the information previously identified.
The first step of the proposed methodology is devoted to the identification of recurring patterns through the clustering process. From the historical session data recorded during operation, the latest n days are extracted and used for the generalization and characterization of the charging sessions. Each of the EV clusters defined is further described in terms of Probability Density Functions (PDFs) in the arrival and power-duration spaces. This step can be addressed through a Gaussian Kernel Density Estimator (KDE). The expected number of occurrences (i.e. number of charging sessions) per cluster must be carefully defined, due to the outliers remaining after the clustering algorithm. The Ordering Points To Identify the Clustering Structure (OPTICS) algorithm, in fact, automatically identifies some points in the space as outliers. These, though not interesting for the characterization and definition of the arrival and power-duration PDFs, contribute significantly to the overall amount of power required by the station in the charging process. To properly address this issue, the outliers are associated with the closest cluster for the sole purpose of defining the number of expected occurrences. To this end, the obtained cluster labels are used to train a supervised classifier. Following a sensitivity analysis, a random forest classifier with 100 trees is used. Having associated each outlier with the most similar cluster, a PDF representing the number of expected occurrences can be computed.
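The outlier reassignment step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes scikit-learn is used, that cluster labels are already available with -1 marking outliers, and that the features are arrival time and duration in hours. The synthetic `sessions` array and all variable names are hypothetical.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(1)

# Hypothetical sessions: columns are arrival time [h] and duration [h].
cluster_a = rng.normal([8.0, 9.0], 0.3, size=(60, 2))
cluster_b = rng.normal([13.0, 3.0], 0.3, size=(40, 2))
outliers = np.array([[7.0, 10.5], [9.5, 8.0], [12.0, 4.5],
                     [14.5, 2.0], [10.5, 6.0]])
sessions = np.vstack([cluster_a, cluster_b, outliers])

# Labels as a density-based clustering would return them (-1 = outlier).
labels = np.array([0] * 60 + [1] * 40 + [-1] * 5)
is_outlier = labels == -1

# Train on clustered sessions only, with 100 trees as stated in the text.
clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(sessions[~is_outlier], labels[~is_outlier])

# Outliers inherit the most similar cluster, only for counting purposes.
assigned = labels.copy()
assigned[is_outlier] = clf.predict(sessions[is_outlier])

# Number of occurrences per cluster, outliers included.
counts = {int(c): int(np.sum(assigned == c)) for c in np.unique(assigned)}
print(counts)
```

In this sketch the arrival and power-duration PDFs would still be fitted on the original clustered sessions (`~is_outlier`); only the occurrence counts use `assigned`. Grouping `assigned` by calendar day would give the per-day counts from which an occurrence PDF could be built.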
The second step, on the other hand, leveraging the previously engineered information, reconstructs the expected power curve, through a Monte Carlo sampling approach. In the following sections, the exploited methodologies are presented and detailed.
As often mentioned in the literature [19], comparison among different methodologies is difficult since they are applied to private data. To overcome this problem, this work is tested on a publicly available dataset and specific evaluation metrics are defined.

A. CLUSTER ANALYSIS
User behavior can be analyzed by clustering based on similar characteristics. The clustering process is carried out considering a space defined by each user's arrival time ($t_a$) and the time the vehicle is connected to the station ($d$). This identifies the major similar characteristics among the charging sessions in terms of user behaviors. To properly identify the clusters, a density-based methodology is exploited. These algorithms are very useful for data mining, detecting regions of the space in which observations are denser and separating them from regions with low density (noise). Moreover, they do not require any prior assumptions regarding the points' statistical distribution.
In [20], a density-based clustering method is presented: Density-Based Spatial Clustering of Applications with Noise (DBSCAN). The concept is that, for each point of a cluster, the neighborhood of a given radius (ε) has to contain at least a minimum number of points (MinPts), where ε and MinPts are input parameters. However, an important property of many real data sets is that their intrinsic cluster structure cannot be characterized by global density parameters. Very different local densities may be needed to reveal clusters in different regions of the data space. To overcome this problem, an extension of the DBSCAN algorithm, named OPTICS, was presented in [21]. For each point, two values are computed and stored: the core distance and the reachability distance, which allow building a dendrogram-like representation called the reachability plot.
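As an illustration of the clustering step, the following sketch runs OPTICS on synthetic sessions. It assumes the scikit-learn implementation, which the text does not name; the data, the `min_samples` value and the variable names are hypothetical choices made for the example.

```python
import numpy as np
from sklearn.cluster import OPTICS

rng = np.random.default_rng(0)

# Hypothetical sessions: columns are arrival time [h] and duration [h].
morning = rng.normal(loc=[8.0, 9.0], scale=0.3, size=(60, 2))
midday = rng.normal(loc=[13.0, 3.0], scale=0.3, size=(40, 2))
noise = rng.uniform(low=[0.0, 0.0], high=[24.0, 12.0], size=(10, 2))
sessions = np.vstack([morning, midday, noise])

# min_samples plays the role of MinPts; no global radius is fixed.
optics = OPTICS(min_samples=10).fit(sessions)

labels = optics.labels_                    # cluster id per session, -1 = outlier
core_distances = optics.core_distances_    # core distance of each session
# Reachability distances in cluster order: the values of the reachability plot.
reachability = optics.reachability_[optics.ordering_]

n_clusters = len(set(labels.tolist()) - {-1})
n_outliers = int(np.sum(labels == -1))
print(n_clusters, n_outliers)
```

Valleys in `reachability` correspond to dense groups of sessions, and the number of clusters is an output of the algorithm rather than an input.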

B. PROBABILITY DENSITY FUNCTION DEFINITION
Through the clustering process, groups of similar characteristics were found in the arrival time vs charging duration space. Those clusters must then be characterized in terms of the main features enabling the reconstruction of the overall station power curve, which are: arrival time, duration of the charging process, the energy required and the overall number of occurrences expected for each cluster. In general, this is performed through the reconstruction of the PDFs of those characteristics, in order to define an underlying function describing the distribution. KDEs are non-parametric estimators that do not require any explicit parametric model to fit the data [22]. Given $n$ samples $x_1, \dots, x_n$, the kernel estimator may be written compactly as:
$$\hat{f}_h(x) = \frac{1}{n} \sum_{i=1}^{n} K_h(x - x_i)$$
where $K_h(t) = K(t/h)/h$, $K$ is the kernel function and $h$ is the bandwidth or smoothing parameter.
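In a multivariate domain, the previous formulation can be extended over the $n$ samples $\mathbf{x}_i$ as
$$\hat{f}_{\mathbf{H}}(\mathbf{x}) = \frac{1}{n} \sum_{i=1}^{n} K_{\mathbf{H}}(\mathbf{x} - \mathbf{x}_i)$$
where $K_{\mathbf{H}}$ is the kernel scaled by $\mathbf{H}$ which, as previously, is the bandwidth, defined as a symmetric positive definite square matrix.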
Much of the first decade of theoretical work focused on various aspects of estimation properties relating to the characteristics of a kernel. The quality of a density estimate is now widely recognized to be primarily determined by the choice of smoothing parameter, and only in a minor way by the choice of kernel. This is even more true in the multivariate case. Several practical approaches aim at processing data before applying the KDE. For example, it is common to first rescale the data to equalize sample variances across each dimension (scaling). Alternatively, a linear transformation can be applied to the data so that the covariance matrix becomes the identity (sphering). In general, those approaches can be detrimental, because the entries of the covariance matrix are usually not able to take into account the curvature in f and its orientation [23]. For multivariate kernel density estimation, the bandwidth matrix induces an orientation, which is not defined for 1D kernels. This leads to the choice of the parametrization of the bandwidth matrix. In general, three main parameterization classes can be implemented: a positive scalar multiplied by the identity matrix, a diagonal matrix with positive values on the main diagonal, and a full symmetric positive definite matrix. In the current work, Scott's factor was implemented for the evaluation of the kernel bandwidth, which is computed as the product of a scalar, dependent on the number of available samples, and the covariance matrix.
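A bandwidth of this form, a sample-size-dependent scalar times the data covariance, is what SciPy's `gaussian_kde` produces with its Scott rule. The sketch below assumes that implementation, which the text does not name, and uses hypothetical power-duration data for a single cluster.

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(2)

# Hypothetical cluster: average power [kW] and duration [h], correlated.
power = rng.normal(5.0, 1.0, size=200)
duration = 2.0 + 0.5 * power + rng.normal(0.0, 0.3, size=200)
data = np.vstack([power, duration])        # shape (d, n), as SciPy expects

kde = gaussian_kde(data, bw_method="scott")

# Scott's factor: n^(-1/(d+4)), depending only on sample count and dimension.
d, n = data.shape
scott_factor = n ** (-1.0 / (d + 4))

# Kernel covariance used by SciPy: squared factor times the data covariance.
H = scott_factor**2 * np.cov(data)

density = kde(np.array([[5.0], [4.5]]))    # PDF value at (5 kW, 4.5 h)
samples = kde.resample(5, seed=0)          # 5 draws, shape (d, 5)
print(kde.factor, density[0])
```

Because the full covariance matrix is retained, the kernel follows the orientation of the data instead of being axis-aligned, which corresponds to the third parameterization class listed above.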

C. MONTE CARLO SAMPLING AND CHARGING SESSION RECONSTRUCTION
The previous cluster analysis and resultant PDF definitions identify the most relevant user behaviour and charging characteristics from the historically recorded data. Therefore, to estimate the expected charging sessions in the forecast horizon, the desired number of Monte Carlo simulations is defined as MC and the PDFs are sampled as shown in Algorithm 1. For every cluster $c \in C$, where $C$ is the overall set of identified clusters, the expected number of charging sessions $n_c$ is estimated. Then, the arrival time, average power and duration are sampled.
The number of Monte Carlo simulations run in the current work is equal to 100, chosen as a trade-off between accuracy and added computational burden.
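The sampling loop can be sketched as below. This is an illustrative reading of the procedure rather than a transcription of Algorithm 1: it assumes SciPy Gaussian KDEs for the arrival and power-duration PDFs, draws the number of sessions $n_c$ by resampling hypothetical historical daily counts, and clips negative draws to zero. All names and data are hypothetical.

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(3)

def make_cluster(arrival_mean, duration_mean, n_hist):
    """Build hypothetical per-cluster PDFs from synthetic history."""
    arrival = rng.normal(arrival_mean, 0.4, n_hist)
    duration = rng.normal(duration_mean, 0.5, n_hist).clip(0.25)
    power = rng.normal(4.0, 0.8, n_hist).clip(0.5)
    return {
        "arrival_pdf": gaussian_kde(arrival),
        "power_duration_pdf": gaussian_kde(np.vstack([power, duration])),
        "daily_counts": rng.poisson(n_hist / 12, size=12),
    }

clusters = {0: make_cluster(8.0, 8.0, 120), 1: make_cluster(13.0, 3.0, 60)}
MC = 100  # number of Monte Carlo simulations, as in the text

def sample_day(clusters, rng):
    """One realization: a list of (cluster, t_a, duration, power) sessions."""
    sessions = []
    for c, model in clusters.items():
        n_c = int(rng.choice(model["daily_counts"]))   # sessions expected today
        if n_c == 0:
            continue
        arrivals = model["arrival_pdf"].resample(n_c, seed=rng)[0]
        powers, durations = model["power_duration_pdf"].resample(n_c, seed=rng)
        for t_a, p, dur in zip(arrivals, powers, durations):
            sessions.append((c, float(t_a), max(float(dur), 0.0), max(float(p), 0.0)))
    return sessions

realizations = [sample_day(clusters, rng) for _ in range(MC)]
print(len(realizations), len(realizations[0]))
```

Power and duration are drawn jointly so that their correlation within a cluster is preserved, while the arrival time is drawn from its own PDF.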
The result is then MC lists of charging sessions, each representative of a possible realization the station could face in the future. In Figure 2, an illustrative representation of the result of the simulations is given. For every Monte Carlo simulation, different charging sessions belonging to different clusters are visible. Importantly, this information can be highly beneficial for the optimal management of the station, since it provides more than the mere overall power profile.
From this, the cumulative power curve can be drawn. Averaging all the Monte Carlo realizations, the expected power draw at the station can be found. Furthermore, the closest single Monte Carlo realization to the average power curve can be considered the most probable EV charging composition.
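A possible way to aggregate the realizations and select the most representative one is sketched below. The 15-minute grid is inferred from the 672-sample weekly input mentioned later in the paper; the selection by Mean Absolute Error follows the description given in the results section. The random realizations and function names are hypothetical.

```python
import numpy as np

STEPS_PER_DAY = 96   # 15-minute resolution (assumed)
STEPS_PER_HOUR = 4

def power_curve(sessions):
    """Sum constant-power blocks for sessions given as (t_a, duration, power)."""
    curve = np.zeros(STEPS_PER_DAY)
    for t_a, duration, power in sessions:
        start = max(int(round(t_a * STEPS_PER_HOUR)), 0)
        stop = min(int(round((t_a + duration) * STEPS_PER_HOUR)), STEPS_PER_DAY)
        curve[start:stop] += power
    return curve

rng = np.random.default_rng(4)

def random_day():
    """Hypothetical Monte Carlo realization of one day."""
    n = int(rng.integers(5, 15))
    return [(rng.normal(8.0, 1.0), rng.uniform(2.0, 9.0), rng.uniform(2.0, 7.0))
            for _ in range(n)]

realizations = [random_day() for _ in range(100)]
curves = np.array([power_curve(day) for day in realizations])

expected = curves.mean(axis=0)                 # forecast power curve
mae = np.abs(curves - expected).mean(axis=1)   # distance of each run from it
best = int(np.argmin(mae))                     # most representative realization
most_probable_sessions = realizations[best]
print(best, round(float(mae[best]), 3))
```

Other distance definitions could be substituted for the Mean Absolute Error in the selection step without changing the rest of the sketch.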

III. EVALUATION
In order to systematically evaluate the obtained results, different metrics are used and a comparison with a benchmark methodology is given. Following an autocorrelation analysis, a seven-day horizon is set for the persistence model, used in this work as a benchmark ($P_t = P_{t-7\mathrm{d}}$).
Moreover, an ML methodology is implemented and used to further compare the obtained results and validate the proposed methodology. In particular, a recurrent neural network architecture based on an LSTM cell is implemented as detailed in [24]. Following a tuning procedure, it presents a single layer of 256 units. The input sequence is equal to 672 samples, equivalent to a full week at 15-minute resolution.
The power forecast is evaluated considering the percentage energy error committed throughout the day as follows:
$$e_E = \frac{E_f - E_M}{E_M} \cdot 100$$
where $E_f$ and $E_M$ stand for the forecast and measured energy, respectively. Moreover, in order to evaluate the quality of the proposed methodology over a longer time horizon, the Skill Score (SS) is evaluated. This metric is defined as follows:
$$SS = 1 - \frac{\mathrm{rmse}_f}{\mathrm{rmse}_b}$$
where $\mathrm{rmse}_f$ stands for the Root Mean Squared Error (rmse) computed for the proposed forecast methodology, while $\mathrm{rmse}_b$ is the rmse computed for the benchmark. This metric returns values in the range (−∞, 1], where 1 can only be achieved with a perfect forecast. Furthermore, a value of 0 or a negative value is obtained if the methodology under study is equivalent to or worse than the benchmark model, respectively. Finally, the present methodology is evaluated by comparing the forecast number of sessions with the actual number seen throughout the day.
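The metrics and the persistence benchmark can be sketched as follows. The sketch assumes a signed energy error relative to the measured energy and a 15-minute sampling step; both are assumptions, as are the function names and the synthetic data.

```python
import numpy as np

def energy_error_pct(p_forecast, p_measured, dt_hours=0.25):
    """Signed percentage energy error, relative to measured energy (assumed)."""
    e_f = np.sum(p_forecast) * dt_hours
    e_m = np.sum(p_measured) * dt_hours
    return float(100.0 * (e_f - e_m) / e_m)

def rmse(a, b):
    return float(np.sqrt(np.mean((np.asarray(a) - np.asarray(b)) ** 2)))

def skill_score(p_forecast, p_benchmark, p_measured):
    """1 for a perfect forecast, 0 when equal to the benchmark, negative if worse."""
    return 1.0 - rmse(p_forecast, p_measured) / rmse(p_benchmark, p_measured)

def persistence(series, steps_per_day=96, lag_days=7):
    """Seven-day persistence: returns (benchmark, target) aligned arrays."""
    lag = steps_per_day * lag_days
    return series[:-lag], series[lag:]

rng = np.random.default_rng(6)
weekly_pattern = rng.uniform(10.0, 100.0, size=96 * 7)
measured = np.tile(weekly_pattern, 3) + rng.normal(0.0, 10.0, size=96 * 21)

benchmark, target = persistence(measured)
forecast = np.tile(weekly_pattern, 2)      # hypothetical forecast of the target
print(round(skill_score(forecast, benchmark, target), 2))
print(round(energy_error_pct(forecast, target), 2))
```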

IV. CASE STUDY
The methodology described above is tested on real public data available online at [25], where a full and comprehensive description is available. The Adaptive Charging Network (ACN)-Data used here were collected at JPL, California. The site includes 52 EVSEs in a parking garage where access is restricted to employees only. The JPL site is representative of workplace charging. EV penetration is also quite high at JPL, which leads to high utilization of the EV charging infrastructure, further fostered by an ad-hoc program in which drivers free the charging spot once charging is completed. Here, data about each occurring charging session are collected. In Table 1 the variables most relevant to the scope of the present work are presented.
In Figure 3, the overall number of connections to the EV charging station is shown. In particular, the connections are given in blue with respect to the time they take place, while the same is provided in orange for the disconnection time. Since the station is located at a workplace, most users connect in the morning and disconnect in the afternoon, as expected.
In order to simulate a full year, only a portion of the full database is considered, spanning from October 2018 to December 31st 2019, for a total of 21259 charging sessions. The considered months of 2018 are valuable since an initial period is required to extract information about the users. In Figure 4 the number of sessions per day is shown for the first months of the simulation period. Since the EVSEs are located at a workplace, a great disparity is visible between weekdays and weekend days. A full description of the data is given at [25].
Among other parameters, each charging record contains the charging time (beginning and end) and the acquired energy in kWh. In order to reproduce the power time series with the desired granularity, the charging records are converted to time series by uniformly dividing the energy delivered to the vehicle by the time the vehicle was connected. Raw data cannot be directly used in the forecasting process and must undergo processing. In particular, one important step that must be performed is data imputation, which is the process of providing the best guess for the missing values [26]. This is particularly important when dealing with time series since it may affect ordinal properties. Since aggregated data are considered, sessions with erratic information regarding arrival time, departure time and delivered energy were discarded, with a marginal effect on the time series values.
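The conversion from a charging record to a power time series can be sketched as below, assuming a 15-minute grid over a single day. The function name, arguments and example record are hypothetical.

```python
from datetime import datetime, timedelta

def session_to_series(connect, disconnect, energy_kwh, day_start,
                      step_min=15, n_steps=96):
    """Spread the delivered energy uniformly over the connection time.

    Returns the average power [kW] in each time step of the day.
    """
    duration_h = (disconnect - connect).total_seconds() / 3600.0
    if duration_h <= 0:
        raise ValueError("disconnect must be later than connect")
    avg_power = energy_kwh / duration_h        # constant power over the session
    step = timedelta(minutes=step_min)
    series = [0.0] * n_steps
    for i in range(n_steps):
        t0 = day_start + i * step
        t1 = t0 + step
        # Seconds of this step during which the vehicle is connected.
        overlap = (min(t1, disconnect) - max(t0, connect)).total_seconds()
        if overlap > 0:
            series[i] = avg_power * overlap / step.total_seconds()
    return series

day = datetime(2019, 4, 1)
series = session_to_series(datetime(2019, 4, 1, 8, 0),
                           datetime(2019, 4, 1, 12, 0),
                           energy_kwh=20.0, day_start=day)
print(series[32], sum(series) * 0.25)
```

Summing the series of all sessions in a day gives the aggregated station power profile, and the energy of each session is preserved by construction.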
In order to account for a scenario where a) the adoption of EVs is rapidly increasing and b) their use may change throughout the year, historical data are selected with a rolling horizon approach. Following a sensitivity analysis, only the previous three months are used to generalize the user behavior and the charging characteristics, as depicted in Figure 5. If there is an abrupt change in the overall charging station power demand and users' behavior, the predictive ability may be negatively impacted. This, though, is true for every forecasting model, which may be affected by a rapid change in trend. Moreover, the presence of temporary changes in behavior (e.g. summer holidays) may affect the forecasting ability and must be accounted for, for example by not including them in the feature engineering period. In the present work all the available days are included. The peak of the autocorrelation is at seven days. For this reason, the forecast of a specific day leverages the historical information previously collected for that specific day of the week. For example, to forecast Monday 1st April 2019, the algorithm extracts information collected up to the previous Monday. For this reason, the forecast horizon is seven days. Simulations are performed on the year 2019.
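The weekly periodicity motivating the seven-day horizon can be checked with a simple autocorrelation of the daily energy. The sketch below uses a synthetic workplace-like series (high demand on weekdays, low on weekends); the numbers are hypothetical and are not taken from the dataset.

```python
import numpy as np

rng = np.random.default_rng(5)

# Hypothetical daily energy [kWh] over 20 weeks: weekdays high, weekends low.
n_days = 20 * 7
day_of_week = np.arange(n_days) % 7
daily_energy = np.where(day_of_week < 5, 900.0, 60.0) + rng.normal(0.0, 30.0, n_days)

def autocorr(x, lag):
    """Pearson correlation between the series and its lagged copy."""
    return float(np.corrcoef(x[:-lag], x[lag:])[0, 1])

# Lags from 1 to 10 days; larger multiples of 7 would peak similarly.
acf = {lag: autocorr(daily_energy, lag) for lag in range(1, 11)}
best_lag = max(acf, key=acf.get)
print(best_lag, round(acf[best_lag], 3))
```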

A. RESULTS AND DISCUSSION
Once the data pre-processing is performed, sessions are clustered [19]. The OPTICS algorithm is implemented here since, differently from k-means, the number of clusters is automatically detected and does not need to be specified a priori.
In Figure 6, an example is provided. On the lower part of the graph (a) the search space of the clustering process is visible. On the x-axis, the EV arrival time is presented, normalized between 0 and 5, where 0 represents Mondays and 5 Fridays while on the y-axis, the duration in hours is given. Above (b), the reachability plot, the output of OPTICS algorithm is shown. Here on the x-axis is the core distance and on the y-axis is the reachability distance. The valleys, representative of the clusters and separated by points with high reachability distance, identify the user behavior cluster, each one shown in a different color. In black, the automatically identified outliers are shown.
In Table 2, the number of clusters found for every day of the week is given. It is worth highlighting that, having adopted a rolling horizon approach, the number of clusters obtained is not constant but changes throughout the year of simulation. The values provided in Table 2 are hence average values. Interestingly, not all weekdays have the same number of clusters, implying different user behaviors on different weekdays.
Each of the identified clusters can be further characterized by two PDFs: the first is associated with the time of connection, while the second is associated with the power required during the session and the duration of the session. In Figure 7, an example of the latter is provided. Clearly, the derived PDFs may have a multimodal and non-trivial distribution, different for every identified cluster [27].
The outlier charging sessions are not representative of the common user behavior the station faces but, if merely neglected, would result in a substantial underestimation of the charging power forecast. For this reason, though not considered in the composition of the arrival and power-duration PDFs, they must be considered to correct the number of occurrences expected for every cluster. To this end, the labels provided by the clustering are used in a supervised classification problem to bring the outliers back to the nearest behavior. This process allows the PDFs associated with the number of occurrences expected for every cluster to be properly estimated.
Once the clusters are properly defined and their relative PDFs are retrieved, the parameters of a set of incoming vehicles, and hence of charging sessions, can be simulated for the following day through Monte Carlo sampling. In particular, the number of expected incoming vehicles per cluster can be sampled and, from this, all the other session parameters can be evaluated. Hence, every run of the Monte Carlo procedure produces a different set of charging sessions for the desired day, which finally leads to a different aggregated power profile.
In Figure 8, a simulation for the week from the 29th of April 2019 to the 5th of May 2019 is given. In particular, the overall expected power estimated by every Monte Carlo run can be seen in grey. In green, on the other hand, the average of the different Monte Carlo profiles is shown. Finally, in black, the actual measurements are depicted. As can be seen, the green forecast profiles follow the actual measurements with great accuracy. This is confirmed by the results reported in Table 3, where the energy forecast errors are compared with a seven-day persistence model (Pers. in the table) and with the ML-based algorithm (LSTM in the table). The forecast computed according to the proposed methodology (Forec. in the table) generally outperforms the two models. Worth noticing is the inaccuracy shown by the three models for Friday, May 3rd. This day of the week is particularly hard to forecast since the power significantly changes from one week to the other without any significant correlation with the available regressors. This is further confirmed by the difficulty that an RNN-based model such as the one implemented faces for this particular day. Furthermore, the expected power during weekends is particularly hard to estimate due to the small number of connections and the impact that a single charging session may have on the overall measured power. A single missed profile tends to have a high overall impact on the accuracy in these conditions.
In Figure 9 the single EV charging sessions forecast for the previously mentioned days can be seen.
Those charging profiles are obtained considering the single Monte Carlo trial closest, in terms of Mean Absolute Error, to the average of all the computed Monte Carlo trials, which is the actual forecast curve (i.e. the green line in Figure 8). Each fill color represents a unique vehicle connected to the EV charging station. The cumulative power is compared with the measured power at the station (i.e. the black curve). Interestingly, the cluster composition varies largely on a daily basis. It is worth highlighting that the inaccuracy seen every late afternoon in Figure 9 can be mitigated by changing the definition of the metric used to select the single Monte Carlo trial.
In Table 4 a comparison between the actual number of charging sessions and the forecast one is given. As visible, the proposed methodology tends to slightly underestimate the expected number of sessions. In particular, this could be due to the misallocation of the late afternoon sessions. Moreover, for Friday May 3rd the overall power underestimation is explained by the highly imprecise number of sessions.
In order to highlight the generality of the proposed approach throughout the simulated year, three more weeks are presented in Figure 10. These weeks are chosen in different periods of the year. As visible, thanks to the feature engineering and cluster analysis performed in the closest 3 months, the forecast is able to adapt to different behaviors that may occur.
As visible, the forecast is more accurate during the weekdays since, due to the underlying methodology, it tends to perform worse when a small fleet of incoming vehicles is expected. In Table 5 numerical evidence of the previous statements is given. Here the details of the error committed in the three presented weeks are given. As visible, the error committed by the proposed methodology is comparable in terms of accuracy with the LSTM methodology. The latter, though, does not provide any information concerning the composition of the power curve in terms of single charging sessions.
In Figure 11 the single charging sessions composing the overall load forecast are given for the three above-described weeks, with great accuracy on weekdays. On the weekend, though, some problematic behaviors are still observed, deriving from the small number of occurring sessions.
The number of predicted charging sessions for the three considered weeks is given in Table 6. From this the high error values for the weekends can be explained. In fact, a small mismatch and wrong forecast of the number of sessions greatly impact the overall prediction.
In Figure 12 a particular situation is depicted. Starting from the 30th of June, the model was able to predict the power curve with a high level of accuracy until the end of the 2nd of July. From the 3rd of July, due to the Independence Day holiday, the connections to the station and the resulting overall power required dropped. This resulted in a great overestimation of the required power. The quantitative information is reported in Table 7.
Though not currently included (this information was not a priori available to the authors), information regarding the presence of scheduled holidays can be properly accounted for by considering a post-process on the forecast data or proper [...] the presented methodology is proven to be beneficial with respect to the implemented benchmark. Moreover, this result is remarkable considering the seven-day time horizon considered in the present work. A further comparison with other works is not provided here, since the adoption of different databases, time horizons or resolutions may prejudice the evaluation. As for the estimated number of sessions, to the best of the authors' knowledge, no comparable work is found in the literature.

V. CONCLUSION
This research proposes a comprehensive and novel methodology to estimate the individual EV charging sessions and forecast the resulting total charging load at an EV charging station. In particular, user-behavior characteristics are extracted from historical data through unsupervised clustering analysis and, for every predicted charging session, a triplet of parameters is determined: the arrival time, the charging duration and the average power expected during the process. This information is of fundamental importance for a predictive Energy Management System (EMS), which could then reallocate the power delivered to each EV according to different optimization strategies.
On the basis of actual data, the proposed methodology achieved an SS of 0.37 with respect to the benchmark (the seven-day naive persistence) in terms of estimated energy with a seven-day forecast horizon. Moreover, it was able to predict the number of expected charging sessions with great accuracy. It must be highlighted that end-user behavior is greatly affected by unpredictable irregularities (e.g. national holidays, weather conditions, etc.), which should be handled on a case-by-case basis.
Future steps and developments should include exogenous inputs, both for the generalization of the user behavior and for the forecast post-process, which are expected to be of great benefit for the accuracy of the EV charging session forecast.

ACKNOWLEDGMENT
This manuscript reflects only the authors' views and opinions; neither the European Union nor the European Commission can be considered responsible for them.