Boosting Vehicle-to-cloud Communication by Machine Learning-enabled Context Prediction

The exploitation of vehicles as mobile sensors acts as a catalyst for novel crowdsensing-based applications such as intelligent traffic control and distributed weather forecast. However, the massive increases in Machine-type Communication (MTC) highly stress the capacities of the network infrastructure. With the system-immanent limitation of resources in cellular networks and the resource competition between human cell users and MTC, more resource-efficient channel access methods are required in order to improve the coexistence of the different communicating entities. In this paper, we present a machine learning-enabled transmission scheme for client-side opportunistic data transmission. By considering the measured channel state as well as the predicted future channel behavior, delay-tolerant MTC is performed with respect to the anticipated resource-efficiency. The proposed mechanism is evaluated in comprehensive field evaluations in public Long Term Evolution (LTE) networks, where it is able to increase the mean data rate by 194% while simultaneously reducing the average power consumption by up to 54%.


I. INTRODUCTION
While cars were seen only as a means of personal transportation in the past, they are currently evolving into mobile sensor nodes that provide crowdsensing-based services with highly up-to-date information [1]. Applications range from predictive maintenance and intelligent traffic control to road-roughness detection [2] and distributed weather forecast [3].
In addition, small-scale autonomous robots such as Unmanned Aerial Vehicles (UAVs) are expected to become native parts of Intelligent Transportation Systems (ITSs) [4]. Since the operation time of these vehicles is largely determined by the available energy, energy-efficient communication has become one of the major research fields in mobile robotics [5]. With the expected massive increase in vehicular MTC [6] and the general growth of cellular data traffic [7], the network infrastructure is facing the challenge of resource competition between human cell users and Internet of Things (IoT)-related data transmissions [8]. A promising approach to address this issue is the application of context-aware communication [9], which exploits the dynamics of the communication channel to schedule delay-tolerant transmissions opportunistically in order to increase the transmission efficiency with regard to data rate, packet loss probability and energy consumption. As a consequence, communication resources are occupied for shorter time intervals and become available to other cell users earlier, which enables a better coexistence and overall system performance [10]. In this paper, we extend and bring together the methods, results and insights of previous work [11], [12], [13], [14], [15] on context-aware car-to-cloud communication and propose a client-side opportunistic transmission scheme that applies machine learning-based data rate prediction for scheduling the transmission times of sensor data transmissions with respect to the expected resource-efficiency. Moreover, mobility prediction and connectivity maps are exploited to integrate the anticipated future channel behavior into the transmission process. The analysis focuses on resource-aware machine learning models that allow online prediction on off-the-shelf smartphones and embedded systems without causing significant additional computation overhead themselves. The remainder of the manuscript is structured as follows.
After discussing relevant state-of-the-art approaches in Sec. III, the proposed transmission scheme and its individual components are presented in Sec. IV. Afterwards, a machine learning-empowered process to assess the uplink power consumption of embedded devices is presented in Sec. V. Sec. VI gives an overview of the methodological setup for the field evaluation and finally, the achieved results of the different models are evaluated and discussed in Sec. VII.
The key contributions of this paper are the following:
• A highly configurable probabilistic model for opportunistic vehicle-to-cloud data transfer with respect to the channel properties.
• Machine learning-based uplink data rate prediction based on measured passive downlink indicators, which is applied as a metric to schedule data transmissions.
• Mobility prediction using navigation system knowledge to forecast the future vehicle position and allow exploitation of a priori information about the transmission channel by using multi-layer connectivity maps.
• A closed process for post-processing uplink power consumption analysis exploiting machine learning-based transmission power prediction from passive downlink indicators and device-specific laboratory measurements.
• The developed measurement applications and the raw measurement results are provided in an Open Access approach.
$$\mathrm{RSRQ} = 10 \log_{10}(N_{\mathrm{PRB}}) + \mathrm{RSRP} - \mathrm{RSSI}, \qquad (1)$$
where $N_{\mathrm{PRB}}$ reflects the total cell bandwidth by means of available Physical Resource Blocks (PRBs). The RSRQ indicates the current cell load and the degree of interference by neighboring cells.
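As a worked instance of Eq. 1, the following minimal Python sketch evaluates the RSRQ in the logarithmic domain. The function name and the example values (a cell with 50 PRBs, which corresponds to a 10 MHz LTE carrier) are illustrative assumptions and not measurements from this paper.

```python
import math


def rsrq_db(rsrp_dbm: float, rssi_dbm: float, n_prb: int) -> float:
    """Eq. 1: RSRQ [dB] = 10*log10(N_PRB) + RSRP [dBm] - RSSI [dBm]."""
    if n_prb <= 0:
        raise ValueError("n_prb must be positive")
    return 10.0 * math.log10(n_prb) + rsrp_dbm - rssi_dbm


# Illustrative values only (assumed, not taken from the field measurements).
example = rsrq_db(rsrp_dbm=-95.0, rssi_dbm=-68.0, n_prb=50)
print(f"RSRQ = {example:.2f} dB")
```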
Signal-to-interference-plus-noise Ratio (SINR) [dB] is defined as the difference, in the logarithmic domain, between the signal power of the received cell-specific reference signals and the combined background noise and interference power in those resource elements. Channel Quality Indicator (CQI) is an integer value in the range of 0 to 15, which is computed by the UE to inform the evolved NodeB (eNB) about the highest possible Modulation and Coding Scheme (MCS) for future downlink transmissions under the current radio conditions. The calculation of the CQI is not specified by the standard and depends on the modem manufacturer. However, according to [20], the reported CQI shall guarantee a correct reception of a transport block with an error probability ≤ 0.1.

B. Key Performance Indicators
In the result section of this paper, the performance of different transmission schemes will be compared in three dimensions. The corresponding Key Performance Indicators (KPIs) are: data rate, Age of Information (AoI) and uplink power consumption. In this paper, the data rate evaluations are performed on the application level and are considered as a measure of the transmission efficiency of data packets. Moreover, since high data rates indicate short transmission durations, data rate optimization is also related to early release of occupied spectrum resources. AoI is a metric for the freshness of information of delay-tolerant applications such as crowdsensing and data analytics. Therefore, it provides a better match with the considered crowdsensing use-case than delay measurements. In the considered definition $\mathrm{AoI} = t_{\mathrm{app}} - t_{\mathrm{gen}}$, it covers the time from the data generation $t_{\mathrm{gen}}$ (e.g., the actual measurement report of a physical sensor) to the reception time $t_{\mathrm{app}}$ of the information by the processing application and also includes the delays caused by buffering and transmission. From an application point of view, information is considered useless if a certain AoI value is exceeded. The uplink power consumption $P_{\mathrm{UL}}$ mainly depends on the actual transmission power $P_{\mathrm{TX}}$ of the device and is related to the different amplification states of the power amplifiers [21]. Since $P_{\mathrm{TX}}$ is usually not reported by embedded devices and smartphones, a mechanism for accessing this hidden parameter using machine learning and direct modem interfacing is presented in Sec. V. Energy-efficiency is an important optimization goal for energy-constrained vehicles (e.g., UAVs) and independently powered embedded devices (e.g., container tracking modules).
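The AoI definition above can be illustrated with a short Python sketch. It assumes that all samples of a buffer reach the application at the same time and that timestamps share one clock; the function names and the threshold `max_aoi` are hypothetical and only serve the example.

```python
from statistics import mean


def age_of_information(t_gen: float, t_app: float) -> float:
    """AoI = t_app - t_gen, both timestamps in seconds on the same clock."""
    if t_app < t_gen:
        raise ValueError("reception cannot precede generation")
    return t_app - t_gen


def buffer_aoi(generation_times, t_app, max_aoi):
    """Return (mean AoI, number of still useful samples) for one buffer upload."""
    ages = [age_of_information(t, t_app) for t in generation_times]
    useful = sum(1 for age in ages if age <= max_aoi)
    return mean(ages), useful


# Three samples generated at 0 s, 10 s and 20 s, delivered together at 32 s.
mean_age, useful = buffer_aoi([0.0, 10.0, 20.0], t_app=32.0, max_aoi=30.0)
```

In this example the oldest sample exceeds the assumed AoI limit of 30 s, which shows why the maximum buffering time has to be aligned with the application requirements.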

III. RELATED WORK
A detailed evaluation of the complex interdependencies of Machine-to-machine (M2M) and Human-to-human (H2H) data traffic as well as their coexistence in the same cellular network is performed in [10]. The optimization of the coexistence of these data traffic types is often addressed on the network infrastructure side, e.g., by scheduling mechanisms [22] that consider different traffic types and priorities. In [23], the authors propose a biology-inspired approach that considers M2M and H2H as populations of predators and prey in order to achieve a stable equilibrium for both traffic types. Although these optimizations might have an impact on the design of future networks, they often cannot be applied within existing networks as the involved changes could lead to incompatibilities. Moreover, the scientific evaluation of infrastructure-side optimizations is often limited to simulation scenarios due to a lack of access to the required hardware equipment and the inherent complexity of real-world scenarios. Especially for platooning, one way to reduce the crowdsensing-related cell load is to pre-aggregate the sensor data in a gateway vehicle [24] before it is actually transmitted in order to avoid the transfer of redundant information. Alternative approaches are provided by social-based forwarding [25] and offloading techniques [26]. Within this paper, we focus on optimizing the transmission behavior of individual non-coordinated vehicles. Anticipatory mobile networking aims to raise the situation-awareness of the communicating entities by integrating additional information into the decision processes in order to optimize different individual KPIs or the overall system performance [27]. The anticipatory communication paradigm is closely related to the application of machine learning, which can be exploited for the prediction of future behaviors and the consideration of hidden parameters that are not directly accessible within complex systems [28].
In [29], the authors propose a data-driven framework for optimizing the resource-efficiency of the network infrastructure by centralized and distributed predictions using control channel analysis. Within the case-study, about 95% of the overall traffic value was precisely predicted, which enabled the network operators to more than double the offered data rate using the optimization framework. While the resulting data rate of a transmission is the result of a deterministic process in theory, predicting those values proactively within a live-system is a challenging task due to the large number of involved hidden influences (e.g., scheduling, packet loss, channel stability, spectrum sharing and cross-layer interdependencies) [30]. For mobile wireless networks, the dynamics of the communication channel are highly affected by the mobility characteristics of the moving vehicle [31]. Therefore, mobility-awareness allows the explicit consideration of these impact factors, e.g., for handover optimization [32] and improved routing in vehicular ad-hoc networks [33]. While crowdsensing forms the considered application scenario for this work and provides the reason for the actual car-to-cloud data transfer, it can also be exploited itself in order to optimize the environmental awareness of the vehicles. In this context, the usage of connectivity maps for anticipatory communication [34], [35] allows exploiting a priori information about the channel quality based on previous measurements in the same geographical area.

IV. MACHINE LEARNING-ENABLED TRANSMISSION OF VEHICULAR SENSOR DATA
In this section, the machine learning-based sensor data transmission schemes and their corresponding components are presented. In the first step, the legacy Channel-aware Transmission (CAT) scheme [10] is generalized and augmented using machine learning-based data rate prediction. Afterwards, we transition from context-aware to context-predictive communication with the extended predictive CAT (pCAT) that exploits multi-layer connectivity maps and mobility prediction to consider the anticipated future network state in the transmission process. Finally, the main contributions of this paper, the transmission schemes Machine Learning CAT (ML-CAT) and Machine Learning pCAT (ML-pCAT), are derived by bringing the key insights together. The overall system architecture model of the proposed approach that operates on the application layer is shown in Fig. 1. Transmissions are performed probabilistically with respect to the network quality by exploiting connectivity hotspots that allow fast and reliable data delivery and avoiding connectivity valleys that imply high packet loss probabilities. The acquired sensor data is stored in a local buffer until a transmission decision has been made for the whole data buffer. The training phase consists of passive probing of the LTE downlink, which forms the channel context $C(t)$, as well as of active data transmission with variable payload sizes using Hypertext Transfer Protocol (HTTP) POST. The feature set of the data rate prediction is composed of the network quality indicators, the velocity and the payload size of the data packet. The resulting data rate of the active transmission is used as the label for the prediction process, which is performed with the models Artificial Neural Network (ANN), Linear Regression (LR), Random Forest (RF), M5 Decision Tree (M5T) and Support Vector Machine (SVM). Finally, the prediction performance of the different models is evaluated using 10-fold cross validation.
Additionally, the measured channel context parameters and the position information of the vehicle are utilized to create a multi-layer connectivity map that stores the cell-wise average of each indicator from multiple visits of the same geographical area. During the application phase, the context information is leveraged to calculate the channel-aware transmission probability $p_\Phi(t)$. The channel is only probed passively and the most accurate previously trained prediction model uses measurements for the channel context $C(t)$, the mobility context $M(t)$ and the application context $A(t)$ to predict the currently achievable data rate $S(t)$. The latter forms the metric for the transmission scheme that is configured with multiple system parameters (see Sec. IV-A). Alternatively, the different channel quality indicators can be used directly to serve as a transmission metric. For the context-predictive pCAT transmission scheme (see Sec. IV-B2), mobility prediction is applied to estimate the future position $P(t+\tau)$ for a defined prediction horizon $\tau$. $P(t+\tau)$ is then used to access the corresponding cell entry in the connectivity map in order to obtain an estimation for the future channel context $C(t+\tau)$. This knowledge about the anticipated channel behavior enables the calculation of the forecasted data rate $S(t+\tau)$ that is integrated into the transmission process of the proposed pCAT. An example comparison of the temporal behavior of context-predictive data transmission using SINR-based pCAT and naive fixed-interval data transfer is shown in Fig. 2. Since the periodic approach does not consider the network quality within the transmission decision, the SINR at the beginning of the transmission is uniformly distributed over the whole value range of the SINR and multiple transmissions are performed during low network quality periods.
In contrast, the proposed context-predictive approach is able to avoid those resource-inefficient transmissions by exploiting connectivity hotspots.

A. Context-aware Data Transmission with CAT
The proposed model is based on a probabilistic process with the aim to calculate the transmission probability $p_\Phi(t)$ with a fixed channel assessment interval $t_p$ based on the measured network quality indicators. While the groundwork for this idea that is presented in [10] was purely focused on the SINR for assessing the channel quality, current off-the-shelf LTE modems and smartphones provide additional indicators that allow a finer-grained analysis of the current connectivity situation (see Sec. II-A). Therefore, an abstract metric $\Phi$ is introduced, which is described by its assumed minimum value $\Phi_{\min}$ and its maximum value $\Phi_{\max}$ that implicitly define the operation range $\Phi_{\max} - \Phi_{\min}$. Each indicator contained in the channel context $C(t)$ can be mapped to a corresponding metric $\Phi_i(t)$. In order to allow the comparison of multiple metrics that are related to different value ranges (e.g., RSRP and RSRQ), the measured metric value $\Phi(t)$ is transformed into the normed current metric value $\Theta(t)$ with Eq. 2. This approach also enables the joint consideration of multiple metrics by linear combination [14].
The resulting transmission probability is then computed with Eq. 3. With $\Delta t$ being the elapsed time since the last performed transmission, $p_\Phi(t)$ is computed based on the measured channel quality if the time interval condition $t_{\min} < \Delta t < t_{\max}$ is fulfilled. $t_{\min}$ guarantees a minimum packet size and $t_{\max}$ specifies a maximum buffering delay that corresponds to the actual application requirements.
The weighting exponent $\alpha$ controls to what extent a metric should prefer values that are close to $\Phi_{\max}$. Fig. 3 shows the resulting analytical temporal behavior for different values of $\alpha$. If the time interval condition is fulfilled and $\Phi(t)$ exceeds $\Phi_{\max}$, the transmission probability is 1 and the transmission is performed in any case.
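The CAT decision process described in this subsection can be sketched in Python as follows. Since Eq. 2 and Eq. 3 are not reproduced here, the sketch assumes a linear normalization $\Theta = (\Phi - \Phi_{\min}) / (\Phi_{\max} - \Phi_{\min})$ clipped to $[0, 1]$ and a transmission probability of $\Theta^{\alpha}$ inside the time window; the handling of the exact boundaries $\Delta t = t_{\min}$ and $\Delta t = t_{\max}$, all function names and the example SINR range are assumptions as well.

```python
import random


def normalized_metric(phi: float, phi_min: float, phi_max: float) -> float:
    """Assumed form of Eq. 2: map the metric to [0, 1] over its operation range."""
    if phi_max <= phi_min:
        raise ValueError("phi_max must exceed phi_min")
    theta = (phi - phi_min) / (phi_max - phi_min)
    return min(max(theta, 0.0), 1.0)  # values beyond phi_max yield theta = 1


def cat_probability(phi, phi_min, phi_max, alpha, dt, t_min, t_max) -> float:
    """Assumed form of Eq. 3: channel-aware probability with two timeouts."""
    if dt <= t_min:   # buffer still too small: never transmit
        return 0.0
    if dt >= t_max:   # maximum buffering delay reached: always transmit
        return 1.0
    return normalized_metric(phi, phi_min, phi_max) ** alpha


def should_transmit(probability: float, rng=random.random) -> bool:
    """Draw the probabilistic transmission decision once per interval t_p."""
    return rng() < probability


# Illustrative SINR-based configuration (assumed values, in dB and seconds).
p_example = cat_probability(phi=15.0, phi_min=0.0, phi_max=30.0,
                            alpha=2.0, dt=30.0, t_min=10.0, t_max=120.0)
```

Larger values of `alpha` push the probability mass towards metric values near `phi_max`, so transmissions concentrate on the best channel periods at the cost of longer buffering.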

B. Context-predictive Data Transmission Exploiting Multilayer Connectivity Maps with pCAT
In the following, the previously presented CAT scheme is extended to the context-predictive pCAT that leverages a priori information about the channel quality along the anticipated trajectory. The aim is to optimize the data transmission scheduling further by integrating knowledge about the future network state into the transmission process. As the basic requirement to predict the future channel context $C(t+\tau)$ is the availability of a position forecast $P(t+\tau)$ for a defined prediction horizon $\tau$, pCAT is divided into a mobility prediction component and the actual transmission process.

1) Mobility Prediction:
In the following, multiple prediction approaches that exploit different sensors and information types and differ in implementation complexity are discussed. Since Global Positioning System (GPS) coordinates and the World Geodetic System 1984 (WGS84) reference frame are used in the live-system, all calculations have to be performed in the orthodromic domain [36]. Nevertheless, the formulas are presented here in a Cartesian coordinate system for better understanding. The proposed scheme focuses on the use of generic approaches that can be efficiently used in a live-system without involving a high computation overhead. More complicated approaches based on maneuver detection [37] have been proposed in the literature and will be considered for future extensions.
a) GPS-based Extrapolation: The most straightforward approach is to extrapolate the future position by using the location, direction and velocity information provided by the GPS receiver. With the north-aligned angular direction $\lambda$ and the current vehicle velocity $v$, the future position is estimated with Eq. 4.
The advantage of using extrapolation is that it can be implemented in a very simple manner by using only the currently measured GPS information. However, it assumes the direction $\lambda$ and the velocity $v$ to be constant for the duration of $\tau$. Therefore, the resulting prediction accuracy is highly reduced, especially for larger values of $\tau$, if the vehicle turns, encounters stop-and-go traffic or is influenced by traffic signals and other traffic participants.
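A minimal Python sketch of the extrapolation in a Cartesian frame is given below. Eq. 4 is not reproduced in this text, so the sketch assumes that the x-axis points east, the y-axis points north and the heading $\lambda$ is measured clockwise from north; the function name is hypothetical. A live-system would replace this with the corresponding orthodromic calculation.

```python
import math


def extrapolate_position(x: float, y: float, heading_deg: float,
                         v: float, tau: float) -> tuple:
    """Constant-heading, constant-velocity position forecast.

    x, y        current position in metres (x east, y north; assumed frame)
    heading_deg north-aligned direction, clockwise, in degrees
    v           velocity in m/s
    tau         prediction horizon in seconds
    """
    lam = math.radians(heading_deg)
    distance = v * tau
    return (x + distance * math.sin(lam), y + distance * math.cos(lam))


# Driving east at 10 m/s for 5 s moves the vehicle 50 m along the x-axis.
future = extrapolate_position(0.0, 0.0, heading_deg=90.0, v=10.0, tau=5.0)
```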
b) Leveraging Trajectory-knowledge from the Navigation System: For overcoming the limitations of the previous approach, mobility prediction based on trajectory information is now discussed. While planned trajectories might be accessible through a direct interface to the navigation system, which will likely be the case for upcoming automated vehicles, this type of information could also be derived by exploiting the regularities in human behavior itself. In fact, the analysis in [38] points out that 95% of human mobility can be predicted by exploiting people's regular movement on the same paths (e.g., the way to work or to grocery stores). With the assumption of having data for the same track from multiple trips available, the segment-wise mean trajectory is calculated in a preprocessing step with the approach presented in [39]. During the online mobility prediction, the current trip is detected by the highest matching of the measured position and direction values to all locally stored trips. $P(t+\tau)$ is then derived by virtually moving the vehicle along the anticipated path for a duration of $\tau$ in an iterative process. For each prediction, the movement potential $\tilde{D} = v \cdot \tau$ is computed and the traveled distance $D$ is initialized with $D = 0$. In each iteration $i$, $D$ is incremented by the distance $d_{i,j} = \lVert W_j - W_i \rVert$ between the consecutive waypoints $W_i$ and $W_j$ with $j = i+1$. When $D$ exceeds $\tilde{D}$, the final position is obtained from interpolation using Eq. 5.
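The iterative procedure can be sketched in Python as shown below. The sketch assumes that trip matching has already been performed, that the vehicle is currently located at waypoint `start_index` of the matched trajectory and that the final position is obtained by linear interpolation on the last segment (Eq. 5 is not reproduced here). All names are hypothetical.

```python
import math


def predict_along_path(waypoints, start_index: int, v: float, tau: float):
    """Move virtually along the waypoints until the movement potential is used up."""
    potential = v * tau          # movement potential
    traveled = 0.0               # traveled distance D, initialized with 0
    i = start_index
    while i + 1 < len(waypoints):
        (x0, y0), (x1, y1) = waypoints[i], waypoints[i + 1]
        d = math.hypot(x1 - x0, y1 - y0)   # distance between W_i and W_{i+1}
        if traveled + d >= potential:
            if d == 0.0:
                return (x0, y0)
            # Interpolate on the segment that exhausts the remaining potential.
            fraction = (potential - traveled) / d
            return (x0 + fraction * (x1 - x0), y0 + fraction * (y1 - y0))
        traveled += d
        i += 1
    return waypoints[-1]         # path ends before the potential is exhausted


path = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0)]
predicted = predict_along_path(path, start_index=0, v=5.0, tau=3.0)
```

In contrast to plain extrapolation, the forecast follows the turn at the second waypoint instead of continuing straight ahead.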
c) Lightweight Trajectory-aware Approach - Prediction based on a Reference Trace: A lightweight alternative that requires less data than the previous approach is to utilize the last measurements of the same track as a reference trace. The mobility prediction process itself is then equal to the one presented in Sec. IV-B1b, but the preprocessing stage is omitted as only a single track is utilized for the computation of the future position. Analogously, the connectivity map only contains the values of a single measurement drive per distinct track. The price to pay for the increased resource efficiency is the loss of the aggregation gain that is obtained by cell-wise averaging, which reduces the impact of outlier measurements, especially for highly dynamic metrics such as the SINR. In Sec. VII-C, the resulting error for the network quality prediction is discussed for the considered mobility prediction methods.
2) Context-predictive Transmission Process: With the predicted position $P(t+\tau)$ being available after the mobility prediction step, the future channel context $C(t+\tau)$ is looked up from the connectivity map as illustrated in Fig. 4. The cell index $m$ for the corresponding entry is obtained with Eq. 6 for a defined cell size $c$.
The predicted metric value $\Phi(t+\tau)$ is extracted from $C(t+\tau)$ and the anticipated gain $\Delta\Phi(t)$ is computed using Eq. 7.
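The map lookup and the gain computation can be sketched in Python as follows. Eq. 6 and Eq. 7 are not reproduced in this text, so the sketch assumes a regular grid in which the cell index is obtained by flooring each Cartesian coordinate divided by the cell size $c$, and an anticipated gain defined as the difference between the predicted and the current metric value. The class and method names are hypothetical.

```python
import math
from collections import defaultdict


class ConnectivityMap:
    """Multi-layer grid map storing the cell-wise average of each indicator."""

    def __init__(self, cell_size: float):
        self.cell_size = cell_size
        self._sum = defaultdict(float)
        self._count = defaultdict(int)

    def cell_index(self, x: float, y: float):
        # Assumed form of Eq. 6: floor of the position divided by the cell size.
        return (math.floor(x / self.cell_size), math.floor(y / self.cell_size))

    def add(self, x: float, y: float, indicator: str, value: float) -> None:
        key = (self.cell_index(x, y), indicator)
        self._sum[key] += value
        self._count[key] += 1

    def lookup(self, x: float, y: float, indicator: str):
        """Return the cell-wise mean, or None if the cell was never visited."""
        key = (self.cell_index(x, y), indicator)
        if self._count.get(key, 0) == 0:
            return None
        return self._sum[key] / self._count[key]


def anticipated_gain(phi_now: float, phi_future: float) -> float:
    # Assumed form of Eq. 7: positive if the channel is expected to improve.
    return phi_future - phi_now


cmap = ConnectivityMap(cell_size=25.0)
cmap.add(30.0, 40.0, "sinr", 10.0)   # two visits of the same cell
cmap.add(35.0, 45.0, "sinr", 20.0)
phi_future = cmap.lookup(26.0, 49.0, "sinr")
gain = anticipated_gain(phi_now=12.0, phi_future=phi_future)
```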
Analogously to CAT, the transmission probability $p_\Phi(t)$ is then computed with respect to the defined timeouts using Eq. 8. For the consideration of the channel quality development, the pCAT-specific exponent $\beta$ is introduced, which controls the impact of the context prediction within Eq. 9.
Fig. 4. Multi-layer connectivity maps as an enabler for anticipatory communication. With the help of mobility prediction, the current measured channel quality can be compared to its predicted future state.
If one of the prediction steps fails (e.g., due to missing GPS signal or if the map does not contain data for the predicted cell), pCAT performs a context-aware fallback by switching to the purely probabilistic CAT model. Although the necessary data collection highly benefits from the mutual synergies of a crowdsensing-based approach, the whole proposed scheme can also be implemented in a local sandbox for dealing with any privacy-related concerns.
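The following Python sketch illustrates how a context-predictive probability with a CAT fallback could look. Eq. 8 and Eq. 9 are not reproduced in this text, so the concrete form used here is an assumption and may differ from the formulas of the paper: the CAT exponent $\alpha$ is scaled by a factor $z$ that is at least 1 if an improvement is anticipated (the transmission is likely postponed) and at most 1 if the channel is expected to degrade (the transmission is likely performed now). The normalization, the boundary handling and all names are hypothetical.

```python
def _theta(phi, phi_min, phi_max):
    """Linear normalization of the metric to [0, 1] (assumed form)."""
    return min(max((phi - phi_min) / (phi_max - phi_min), 0.0), 1.0)


def exponent_scale(theta: float, delta_phi: float, beta: float) -> float:
    """Assumed scaling factor z controlled by the pCAT-specific parameter beta."""
    if delta_phi > 0:
        return max(abs(delta_phi * (1.0 - theta) * beta), 1.0)
    return 1.0 / max(abs(delta_phi * theta * beta), 1.0)


def pcat_probability(phi, phi_future, phi_min, phi_max,
                     alpha, beta, dt, t_min, t_max) -> float:
    if dt <= t_min:
        return 0.0
    if dt >= t_max:
        return 1.0
    theta = _theta(phi, phi_min, phi_max)
    if phi_future is None:
        # Prediction failed (no GPS fix or empty map cell): fall back to CAT.
        return theta ** alpha
    z = exponent_scale(theta, phi_future - phi, beta)
    return theta ** (alpha * z)


common = dict(phi=15.0, phi_min=0.0, phi_max=30.0, alpha=2.0, beta=0.5,
              dt=30.0, t_min=10.0, t_max=120.0)
p_improving = pcat_probability(phi_future=25.0, **common)   # wait for hotspot
p_degrading = pcat_probability(phi_future=5.0, **common)    # send before valley
p_fallback = pcat_probability(phi_future=None, **common)    # plain CAT
```

Under these assumptions, the three cases are ordered as `p_improving < p_fallback < p_degrading`, which reflects the intended behavior of postponing transmissions ahead of connectivity hotspots and advancing them ahead of connectivity valleys.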

C. Machine Learning-based Data Rate Prediction
Predicting the data rate is a supervised learning task. Given the features $X$ of the measurements, a prediction model $M$ that predicts the data rate $S$ is applied. Different model classes $M$ are possible, each representing a different function class $f : X \to S$ with its own set of parameters $\psi$. The learning task is to find $\arg\min_{f_\psi \in M} l(f_\psi(X), S)$ for a loss function $l$. Since the task is supervised, labeled data is available as ground truth for the data rate. The error of a particular model function $f$ may thus be assessed by comparing the predicted data rate to the known data rate. Possible loss functions $l$ are the Mean Absolute Error (MAE), which is a measure for the absolute distance between the prediction and the true label, and the Root Mean Square Error (RMSE), which measures the Euclidean error.
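Both loss functions can be written down in a few lines of Python. The sketch below uses their standard definitions; the function names and the example data rates (in MBit/s) are illustrative assumptions.

```python
import math


def mae(predicted, actual) -> float:
    """Mean Absolute Error: average absolute deviation from the label."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


def rmse(predicted, actual) -> float:
    """Root Mean Square Error: penalizes large deviations more strongly."""
    squared = sum((p - a) ** 2 for p, a in zip(predicted, actual))
    return math.sqrt(squared / len(actual))


predicted_rates = [2.0, 4.0, 6.0]
measured_rates = [1.0, 4.0, 8.0]
example_mae = mae(predicted_rates, measured_rates)
example_rmse = rmse(predicted_rates, measured_rates)
```

Since the RMSE squares the individual errors, it is never smaller than the MAE and reacts more sensitively to single outlier predictions.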
Several model classes are possible. In order to decide on a model class, not only the performance on training instances is important but also the validation on novel, yet unseen, examples. Otherwise, a model could learn all examples 'by heart' and achieve perfect prediction performance on training instances without generalizing at all; this phenomenon is known as overfitting. Possible countermeasures against overfitting are recording a larger, more diverse data set, choosing a model with less capacity, or including the capacity of the model in the objective function via a regularization term. The details on regularization are beyond the scope of this paper and can be found in [40]. The model selection is performed with the tool WEKA [41] and different model classes are tested. In this paper, the applied models are regression tree (in particular, the M5T [42]), RF [43], LR, multi-layer perceptron (ANN), and SVM [44]. All methods are discussed in [40], thus only a brief overview of selected methods is provided here.
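The idea of validating on unseen examples can be illustrated with a k-fold cross validation loop. The following Python sketch is a self-contained stand-in and does not use WEKA or its API: it fits a one-dimensional least-squares line on the training folds and reports the MAE on the held-out fold. All names and the synthetic data are assumptions.

```python
import random


def k_fold_indices(n: int, k: int, seed: int = 0):
    """Shuffle the sample indices and split them into k disjoint folds."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def fit_line(xs, ys):
    """Ordinary least squares for a single feature; returns (slope, intercept)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx if sxx else 0.0
    return slope, my - slope * mx


def cross_validated_mae(xs, ys, k: int = 10) -> float:
    """Each sample is predicted by a model that never saw it during training."""
    errors = []
    for held_out in k_fold_indices(len(xs), k):
        held = set(held_out)
        train = [i for i in range(len(xs)) if i not in held]
        slope, intercept = fit_line([xs[i] for i in train], [ys[i] for i in train])
        errors.extend(abs(slope * xs[i] + intercept - ys[i]) for i in held_out)
    return sum(errors) / len(errors)


# Synthetic, noise-free data: a linear model generalizes perfectly here.
xs = [float(i) for i in range(50)]
ys = [2.0 * x + 1.0 for x in xs]
cv_error = cross_validated_mae(xs, ys, k=10)
```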
The simplest model is LR, which fits a linear combination of the input features $X$ to the output label $S$. A more sophisticated way is to split the data set into regions and to apply different linear models within different regions. The regression tree performs exactly these splits using axis-parallel hyperplanes by comparing each feature with a threshold. This distinction of feature vectors based on thresholds of their features is captured in a tree structure. In its leaves, different linear models are applied. The RF trains not one regression tree but multiple trees and combines their predictions by averaging. The theoretical aspects of artificial neural networks and support vector machines are out of scope of this paper; additional information can be found in [40]. The evaluation of the prediction methods is performed based on more than 2500 real-world measurements of periodic and CAT-based transmissions that were performed in the context of earlier work in [14] on two different tracks (details about the measurement setup are provided in Sec. VI). The feature set is formed by RSRP, RSRQ, SINR, CQI and velocity measurements in combination with the payload size of the data packets. The label is defined as the measured data rate of the active transmissions. Tab. I shows the resulting prediction performance for the considered models and evaluation metrics. Although the absolute highest accuracy is achieved with the RF model, it only performs slightly better than the M5T approach.
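The principle of a regression tree with linear models in its leaves can be illustrated with a deliberately simplified Python sketch: a single threshold split on one feature with one least-squares line per region. This is not the M5 algorithm (there is no recursive growing, pruning or smoothing), and all names and the synthetic data are assumptions.

```python
def fit_line(points):
    """Least-squares line through (x, y) points; returns (slope, intercept)."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points)
    slope = sum((x - mx) * (y - my) for x, y in points) / sxx if sxx else 0.0
    return slope, my - slope * mx


def sse(points, model) -> float:
    """Sum of squared errors of a linear model on the given points."""
    slope, intercept = model
    return sum((slope * x + intercept - y) ** 2 for x, y in points)


def fit_model_stump(points, min_leaf: int = 2):
    """Pick the threshold whose two leaf models yield the smallest total error."""
    pts = sorted(points)
    best = None
    for i in range(min_leaf, len(pts) - min_leaf + 1):
        if pts[i - 1][0] == pts[i][0]:
            continue  # no threshold can separate identical feature values
        left, right = pts[:i], pts[i:]
        models = fit_line(left), fit_line(right)
        error = sse(left, models[0]) + sse(right, models[1])
        if best is None or error < best[0]:
            best = (error, (pts[i - 1][0] + pts[i][0]) / 2.0, models)
    if best is None:
        raise ValueError("not enough distinct feature values to split")
    _, threshold, (left_model, right_model) = best
    return threshold, left_model, right_model


def predict(stump, x: float) -> float:
    threshold, left_model, right_model = stump
    slope, intercept = left_model if x <= threshold else right_model
    return slope * x + intercept


# Synthetic data with two regimes that a single linear model cannot capture.
data = [(float(x), float(x)) for x in range(5)]
data += [(float(x), 2.0 * x + 10.0) for x in range(5, 10)]
stump = fit_model_stump(data)
```

A full regression tree applies such splits recursively over all features, and a random forest averages many trees that were trained on resampled data.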
In order to evaluate the generalizability of the learned [...]
Based on the prediction results, the M5T model is chosen as the applied prediction model within the application phase. It achieves a good overall performance, allows a very lightweight implementation and can be used for online data rate prediction without causing significant computation overhead. For simplicity, in the following, the usage of the $\Phi_{\mathrm{M5T}}$ metric for sensor data transmissions will be referred to as ML-CAT and ML-pCAT, respectively. Fig. 6 shows the resulting prediction accuracy using M5T. The upper left triangle shows the underestimation area and the lower right triangle represents the overestimation area. From the application-centric perspective, underestimations are not considered harmful, as the transmission even achieves a higher data rate than expected.

V. MACHINE LEARNING-ENABLED POST-PROCESSING UPLINK POWER CONSUMPTION ANALYSIS
Performing communication-related power consumption measurements of an embedded device is a non-trivial task that requires precise isolation of physical components and involved software. As numerous components of the system, such as modem and processor, are highly integrated into a single System on a Chip (SoC), it is not possible to measure the modem's power consumption in isolation. On the other hand, measurements of the whole system's power consumption are superimposed by any other system activities, e.g., GPS, IO operations, and background services. Consequently, such measurements are rather performed within a laboratory setup and not with a highly integrated mobile system. In order to allow the evaluation of the energy-efficiency of CAT on (mobile) embedded platforms, a closed process for isolated uplink power consumption analysis is derived in the following that combines existing models and approaches [45], [46], [15]. Fig. 7 shows the overall architecture model. Accurate laboratory measurements are used to obtain the device-specific behavior characteristics of power consumption with regard to the applied transmission power. The Context-aware Power Consumption Model (CoPoMo) [45] is a validated model, which uses device-specific characteristics obtained from laboratory evaluations to estimate the power consumption for uplink transmissions as a function of the transmission power within a state machine. Fig. 7 includes the characteristic curve of the Galaxy S5 Neo smartphone operating in two different frequency bands for a transmission power range of −10 dBm to 23 dBm [46]. For a single frequency band, e.g., the blue curve, the characteristics can be approximated by two linear functions of different slope, which are separated by a device-specific break point $\gamma$. This behavior is caused by switching between two different internal power amplifiers. CoPoMo further reduces the characteristics to a probabilistic four-state power model with negligible loss of accuracy.
Depending on the radio conditions, a UE uploads its data with Low, High, or Max power and enters Idle mode afterwards. The state transitions are modeled by the transition probabilities $\lambda$ and $\mu$ that are obtained from the corresponding data set. Calculating the equilibrium state of the Markov chain finally provides an estimate for the UE power consumption in the given scenario. Unfortunately, in the context of this paper, the model cannot be applied directly, as embedded operating systems (e.g., Android) do not provide information about the currently used transmission power and therefore prevent the determination of the current power state. As a consequence, this paper applies a novel machine learning-based approach for power estimation [15], which is based on the available LTE downlink indicators. According to the LTE standard [20], UEs choose their transmission power $P_{\mathrm{TX}}$ based on Eq. 12:
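Assuming that Eq. 12 follows the standard LTE uplink power control rule for the shared channel, and using the symbols explained in the following paragraph, it can be written as:

```latex
P_{\mathrm{TX}} = \min\left\{ P_{\max},\; P_0 + 10 \log_{10}(M) + \alpha \cdot PL + \Delta_{\mathrm{MCS}} + \delta \right\} \qquad (12)
```

As a purely illustrative example with assumed values, $P_0 = -90$ dBm, $M = 10$, $\alpha = 0.8$, $PL = 120$ dB and $\Delta_{\mathrm{MCS}} = \delta = 0$ dB give $-90 + 10 + 96 = 16$ dBm, which stays below a $P_{\max}$ of 23 dBm.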
$P_0$ is broadcast by the eNB and represents the target SINR per PRB of the received signal at the eNB. Thus, the UE at least has to compensate for the estimated path loss $PL$, which is derived from RSRP and the actual transmission power of the eNB. It is weighted by a Fractional Path Loss Compensation (FPC) factor $\alpha$, which is also configured by the base station. An additional offset $\Delta_{\mathrm{MCS}}$ ensures a sufficient SINR for the selected MCS. Finally, the transmission power needs to be increased according to the number of allocated PRBs $M$ in order to keep the received SINR constant regardless of the number of allocated resources. The closed-loop component $\delta$ is an absolute or accumulated offset, which is transmitted in Transmission Power Control (TPC) commands by the base station on the Physical Downlink Control Channel (PDCCH) together with the resource allocations. By this, the eNB can increase or reduce the output power of the UE in a feedback loop. However, the eNB never transmits an absolute $P_{\mathrm{TX}}$ value to the UE. Since $P_{\max}$, $P_0$, and $\alpha$ can be seen as network constants and $\delta$ should average to 0 for a well-configured network, only $M$, $PL$, and $\Delta_{\mathrm{MCS}}$ have to be obtained in order to estimate the transmission power $P_{\mathrm{TX}}$ at the application layer. Although these remaining variables are not directly accessible either, they are tightly related to available indicators:
• $PL$: The path loss is internally calculated from RSRP. However, the eNB's reference signal transmit power still remains unknown to the application layer.
• $M$: The number of allocated PRBs corresponds to the resulting data rate at a given MCS. Unfortunately, in case of a prediction, the data rate is not available as an indicator. However, assuming a non-congested network, the data rate follows the Transmission Control Protocol (TCP) slow start mechanism during the transmission, which depends on the upload size and the maximum achievable rate. The latter is related to $PL$, since $M$ and $\Delta_{\mathrm{MCS}}$ are capped by $P_{\max}$ for large $PL$.
• $\Delta_{\mathrm{MCS}}$: This lookup table provides MCS-dependent power offsets to ensure a proper SINR for a correct demodulation and decoding at the base station. Interference and mobility (fast fading) adversely affect the MCS, which can be indicated by RSRQ and the UE's velocity. According to the UE's power headroom, the eNB may select the highest possible MCS to maximize throughput and spectral efficiency. Hence, this value is also related to $PL$ and $M$.
However, the exact relationship of these variables is blurred by case differentiation and operator-specific configurations of the base stations, which makes analytical approaches complex and impractical. Therefore, a data-driven approach that leverages machine learning to obtain a prediction model for $P_{\mathrm{TX}}$ is presented in [15]. That work analyzes the usage of different indicators and machine learning techniques for the estimation of $P_{\mathrm{TX}}$ in detail. It also provides different prediction models for simulations, practical applications and detailed analysis, which differ in the number of available indicators for this task. The data is obtained in extensive field measurements of drive tests in public cellular networks. An overview of the covered trajectory, which includes urban, suburban, and rural environments, is shown in Fig. V. The measurements were performed using an embedded Vehicle-to-everything (V2X) platform with a direct modem interface, which allows access to the current transmission power in order to obtain the ground truth label for the prediction. The device was placed in the rear trunk of a car and periodically uploaded files of 1 MB to 5 MB to an HTTP server. The integrated LTE modem provides 31 network indicators including the current transmission power. Of the three considered machine learning models, Ridge Regression (RR), Deep Learning (DL), and RF, the latter achieved the lowest RMSE of 5 dB to 6 dB, depending on the feature set.
In addition, the absolute sum of errors shrinks as the number of predictions grows and falls below 1 dB after 28 predictions. Hence, this approach is well-suited for long-term applications and post-processing analysis of large data sets. In this paper, the RF-based approach for practical applications was applied to predict P TX for transmissions based on the indicators RSRP, RSRQ, SINR, upload size, and the vehicle's velocity. Finally, the dwell times of CoPoMo's four-state model are computed by mapping the P TX predictions to the corresponding power states, which in turn enables an estimation of the average power consumption of the UE. In conclusion, the presented process allows the energy-efficiency of the considered transmission schemes to be analyzed without dedicated measurement equipment and explicitly without knowledge of the applied transmission power. Although this approach is utilized for offline result analysis in the next section, it can also be applied for online prediction directly on the embedded device.
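The chain from transmission power to average power consumption can be sketched as follows. This is a minimal illustration, not the implementation of [15]: the network constants, the state thresholds and the per-state power values are hypothetical placeholders, and `tx_power_dbm` only restates the power control rule described above under the assumption that it follows the standard LTE form. In the approach of this paper, the P TX values fed into the state mapping come from the RF-based prediction instead.

```python
import math

# Hypothetical network constants; real values are operator-specific.
P_MAX_DBM = 23.0
P0_DBM = -80.0
ALPHA = 0.8


def tx_power_dbm(path_loss_db, num_prb, delta_mcs_db=0.0, delta_db=0.0):
    """Uplink power control rule built from the terms described in the text."""
    requested = (P0_DBM + ALPHA * path_loss_db
                 + 10.0 * math.log10(num_prb) + delta_mcs_db + delta_db)
    return min(P_MAX_DBM, requested)


def power_state(p_tx_dbm, high_threshold_dbm=10.0, max_threshold_dbm=P_MAX_DBM):
    """Map a transmission power to an active power state (hypothetical thresholds)."""
    if p_tx_dbm >= max_threshold_dbm:
        return "Max"
    if p_tx_dbm >= high_threshold_dbm:
        return "High"
    return "Low"


def average_power_w(transmissions, total_time_s, state_power_w):
    """Average power from the dwell times of the four states.

    transmissions: iterable of (P_TX in dBm, duration in s) per upload.
    state_power_w: power draw per state in W, including the "Idle" state.
    """
    dwell_s = {state: 0.0 for state in state_power_w}
    for p_tx_dbm, duration_s in transmissions:
        dwell_s[power_state(p_tx_dbm)] += duration_s
    # Time not spent transmitting is attributed to the Idle state.
    dwell_s["Idle"] = total_time_s - sum(dwell_s.values())
    return sum(dwell_s[s] * state_power_w[s] for s in dwell_s) / total_time_s
```

For example, three uploads at 5 dBm, 15 dBm and 23 dBm lasting 10 s, 20 s and 10 s within a 100 s window leave 60 s of Idle time, so the average is the dwell-time-weighted mean of the four state powers.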

VI. METHODOLOGICAL SETUP OF THE EMPIRICAL PERFORMANCE EVALUATION
In order to evaluate the properties of the proposed transmission schemes in a realistic scenario, a comprehensive empirical performance evaluation is performed in the public cellular LTE network and within a vehicular context. Tab. III provides a summary of the application-related key parameters.
Channel sensing and data transmission are handled by an Android-based application (executed on a Samsung Galaxy S5 Neo, model SM-G903F), which is provided as open source. Sensor packets of size s sensor are generated by a virtual sensor application with a sensor frequency f sensor and stored in a local buffer until a transmission decision has been made for the whole data buffer. At each interval t p , the channel is sensed and the transmission probability is computed. All data transmissions are performed in the LTE uplink to a cloud server using HTTP POST. Among the different considered metrics, CAT with the Φ SINR metric is equal to the legacy version presented in [10]; analogously, pCAT with Φ SINR has the same behavior as the scheme in [11]. The values for the weighting factor β have to be chosen with respect to the metric's value range and its granularity. In order to allow a fair comparison among the different metrics, β is chosen with respect to the definition of the Φ SINR metric and its respective value range with Eq. 13.
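The sensing and buffering loop described above can be illustrated with a short simulation. It is a sketch under stated assumptions: the probability function (a normalized metric raised to an exponent, with forced transmissions at Φ max and t max ) is a simplified stand-in for the scheme defined earlier in the paper, and the names `t_min_s` and `exponent` as well as their values are hypothetical. The buffering limit of 120 s and the sensor rate of 50 kB/s are taken from the evaluation setup.

```python
import random


def transmission_probability(phi, phi_min, phi_max, elapsed_s,
                             t_min_s, t_max_s, exponent):
    """Probability of sending the buffer at the current sensing instant."""
    if elapsed_s < t_min_s:
        return 0.0                      # too early, keep buffering
    if elapsed_s >= t_max_s or phi >= phi_max:
        return 1.0                      # forced transmission
    norm = max(0.0, (phi - phi_min) / (phi_max - phi_min))
    return norm ** exponent


def simulate(metric_trace, phi_min, phi_max, t_p_s=1.0, t_min_s=10.0,
             t_max_s=120.0, exponent=8.0, bytes_per_s=50_000, rng=None):
    """Return a list of (time in s, payload in bytes) for all transmissions."""
    if rng is None:
        rng = random.Random(0)
    elapsed_s, buffered_bytes, sent = 0.0, 0, []
    for step, phi in enumerate(metric_trace, start=1):
        elapsed_s += t_p_s
        buffered_bytes += int(bytes_per_s * t_p_s)   # new sensor data
        p = transmission_probability(phi, phi_min, phi_max, elapsed_s,
                                     t_min_s, t_max_s, exponent)
        if rng.random() < p:
            sent.append((step * t_p_s, buffered_bytes))
            elapsed_s, buffered_bytes = 0.0, 0        # buffer is flushed
    return sent
```

With a metric that stays at Φ min for the whole trace, the buffer is only flushed by the t max rule, so every transmission carries 120 s of sensor data.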
For the context-predictive pCAT, the connectivity map and the trajectory prediction utilize the data acquired in the CAT evaluation phase. The raw data of the experimental evaluation of the different transmission schemes is provided in [47], and the measurement software as well as the obtained data samples for the transmission power estimation can be accessed via [48]. Fig. 8 shows the street map with the different tracks used for the experimental performance evaluation.
• Track 1: Suburban roads with upper speed limits in the range of 50-70 km/h (14 km)
• Track 2: Highway traffic with upper speed limits of 130 km/h (9 km)
Each parameterization of the transmission schemes has been evaluated five times on each of the tracks. Overall, more than 7500 data transmissions were performed within a total driven distance of more than 2000 km. On the application layer, all data transmissions were performed successfully.
In Sec. VII, the performance of the different schemes is compared with respect to data rate, AoI and uplink power consumption. Since all transmitted data packets contain multiple measurement values that have individual times of generation, the AoI of a data packet is defined as the mean age of all contained sensor measurements.
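The packet-level AoI definition can be written down in a few lines. The sketch assumes that the age of each measurement is taken relative to the reception time of the packet; the function name and the choice of reference time are illustrative assumptions.

```python
def packet_aoi_s(generation_times_s, reception_time_s):
    """Mean age of all sensor measurements contained in one data packet."""
    ages_s = [reception_time_s - t for t in generation_times_s]
    return sum(ages_s) / len(ages_s)
```

A packet that contains measurements generated at 0 s, 10 s and 20 s and is received at 30 s has individual ages of 30 s, 20 s and 10 s, and therefore an AoI of 20 s.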
VII. EMPIRICAL RESULTS OF VEHICULAR SENSOR DATA TRANSFER
In this section, the results of the empirical performance evaluation are presented and discussed. At first, the impact of the different considered context information on the resulting uplink data rate is evaluated. Afterwards, results for data rate, age of information and uplink power consumption are presented for the context-aware transmission scheme using CAT with different single downlink indicators as transmission metrics. Then, the accuracy of the mobility prediction schemes and their impact on the predictability of the future context information is discussed and the results for the context-predictive scheme pCAT are presented. Finally, detailed measurements for the machine learning-enabled transmission methods ML-CAT and ML-pCAT are provided.

A. Correlation of Passive Downlink Indicators and Uplink Data Rate
The correlation of RSRP, RSRQ, SINR, CQI, velocity and payload size with the resulting data rate is shown in Fig. 9. Since these indicators are also the features of the data rate prediction, the analysis gives an impression of the importance of the different features for the overall prediction behavior. The plots contain the individual transmissions of the whole data set, consisting of periodic, CAT and pCAT data transfer. It should be noted that the resulting value range of the data rate exceeds the limits given by the data rate prediction shown in Fig. 6. The reason for this behavior is that the pCAT-based and the machine learning-enabled schemes, which mainly achieve these high values, were not part of the training set (see Sec. IV-C). Furthermore, the plots contain multiple forced transmissions, which are triggered for all CAT-based approaches if Φ(t) ≥ Φ max . As expected, the data points for the periodic transmission scheme are uniformly distributed over the value ranges of the respective metrics, as the transmissions are performed regardless of the channel quality. The characteristics of the RSRP can be divided into two distinct areas that are separated by a breakpoint at −85 dBm, which allows a division into cell-edge and cell-center behavior. In the cell-edge area, the RSRP is a dominant factor for the achievable data rate, which increases with higher RSRP values. Within the cell center, the dependency is decoupled since other effects (e.g., interference) have a more dominant influence on the behavior. The behavior of the CQI shows a peak for CQI=2. During the drive tests, those values occurred frequently on both tracks and without any obvious correlation to the other indicators. As discussed in Sec. II-A, the actual calculation of this indicator is not standardized for LTE and depends on the modem manufacturer.
It can be concluded that the reported CQI is of limited use as a CAT metric, which is also confirmed by the evaluations in the following sections. Although the drive tests were performed within a velocity range of 0 km/h to 140 km/h, the observed dependency of the data rate on the velocity is very low. For the payload size, multiple lobes can be identified that separate periodic transmissions, CAT, pCAT, ML-CAT and ML-pCAT. Another region mainly consists of outliers that are related to forced transmissions caused by either Φ max or t max , as well as to inaccurate measurements and high channel variances. It can be observed that the machine learning approaches are systematically able to achieve higher data rates for the same payload size. In particular, the lookahead introduced by the ML-pCAT approaches is able to proactively avoid transmissions during low channel quality periods. This observation is underlined by the correlation analysis in [14], which is based on an earlier version of the data set and does not contain the pCAT- and ML-pCAT-specific lobes. The value range for the payload size is limited to 6 MB, as t max is defined as 120 s and the sensor application generates 50 kB of data per second. The overall behavior can be characterized by two areas that have different degrees of dependency on the channel coherence time. Up to 4 MB, the data rate highly benefits from increased payload sizes, as the slow start of TCP is less dominant for the overall transmission duration and a better payload-overhead ratio is achieved. Beyond this breakpoint, the probability of low data rate transmissions is highly increased, as the channel is more likely to change its characteristics during an active transmission due to the longer transmission duration. The correlation analysis shows that no single indicator is able to provide a robust measure of the channel quality in all considered situations.
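The first of the two payload-size effects, the diminishing influence of TCP slow start, can be made tangible with a coarse round-based model. This is only an illustrative sketch: the link rate, round-trip time, segment size and initial congestion window are hypothetical values, every round is assumed to take exactly one round-trip time, and losses as well as the channel coherence effect that dominates above 4 MB are not modeled.

```python
def effective_rate_mbit_s(payload_bytes, link_rate_mbit_s=20.0, rtt_s=0.05,
                          mss_bytes=1460, init_cwnd_segments=10):
    """Mean data rate of one upload under an idealized slow start."""
    # Bytes the link can carry per round-trip time once it is saturated.
    cap_bytes_per_rtt = link_rate_mbit_s * 1e6 / 8 * rtt_s
    cwnd_bytes = init_cwnd_segments * mss_bytes
    remaining_bytes = payload_bytes
    elapsed_s = 0.0
    while remaining_bytes > 0:
        window_bytes = min(cwnd_bytes, cap_bytes_per_rtt)
        remaining_bytes -= min(window_bytes, remaining_bytes)
        elapsed_s += rtt_s          # one round per round-trip time
        cwnd_bytes *= 2             # exponential growth during slow start
    return payload_bytes * 8 / elapsed_s / 1e6
```

Under these assumptions, a 100 kB upload finishes within the first few slow start rounds and reaches only a fraction of the link rate, whereas a 4 MB upload spends most of its duration at the saturated rate.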

B. Single-metric Context-aware Transmission
The results for the context-aware transmission are shown in Fig. 10. Multiple variants of CAT are configured with each of the passive downlink indicators as a metric according to Tab. IV. The results for periodic transmission with a fixed interval of 30 s are shown as a reference. Using the context-aware approach, an average data rate gain of 55 % is achieved for the CAT-based metrics, with Φ SINR and Φ CQI showing the highest variance across the evaluation runs. Although the results for the data rate are similar for all CAT-based metrics, the AoI behavior shows significant differences, which are related to the dynamics of the corresponding network quality indicator during the drive tests. Here, high AoI values indicate longer periods of low channel quality that prevent the vehicle from transmitting its data. In the suburban scenario, the SINR values are rarely close to Φ max , so that multiple transmissions are even forced by the maximum buffering delay t max , resulting in a very high AoI. All schemes clearly exceed the AoI baseline defined by the periodic transmission approach. Yet, the up-to-dateness of sensor measurements is still sufficient for the considered crowdsensing scenario (see Sec. II). On the highway track, the average AoI is reduced, as the measured channel behavior changes frequently due to the high velocity of the vehicle, resulting in a higher transmission frequency. These aspects are further confirmed by analyzing the uplink power consumption. It can be seen that transmissions during low quality periods (here most clearly illustrated by the t max -related forced transmissions for the SINR metric) severely increase the average power consumption and reduce the energy-efficiency.

C. Impact of Mobility Prediction on the Network Quality Indicators
Before the measurement results of the context-predictive transmission schemes are presented, the accuracy of the mobility prediction mechanisms and the impact of prediction errors on the channel context estimation are discussed. An evaluation of the velocity-dependent mobility prediction accuracy as well as the implications of error-affected forecasts for the network quality assessment is provided in Fig. 11. The accuracy of the GPS extrapolation approach is significantly influenced by the probability of direction changes during the prediction horizon τ . With regard to the speed-dependency of the prediction error, three characteristic regions can be identified. Up to 70 km/h (urban/suburban roads), the error magnitude is proportional to the velocity. For higher velocities, the vehicle is more likely moving on a highway track with a low probability of direction changes. However, above 90 km/h the higher prediction distance becomes the dominant error source again. For the trajectory-aware approaches, the resulting distance error is much lower. Moreover, due to the consideration of the vehicle's turn behavior, the dependency between prediction accuracy and velocity is decoupled, allowing robust predictions even for higher values of τ . The future position P(t + τ ) is used to look up the future channel context C(t + τ ) from the connectivity map. Therefore, inaccurate forecasts may lead to situations where the connectivity map does not contain data for the (falsely) predicted cell, which is statistically captured by the Prediction Failure Ratio (PFR) metric. For the pCAT-based schemes, prediction failures trigger the CAT fallback, where the respective transmission scheme behaves identically to a pure probabilistic CAT scheme with the same metric properties. As a consequence, only the trajectory-based prediction methods are considered in the following, as the PFR is unacceptably high for the extrapolation approach.
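The extrapolation-based lookup and the PFR can be sketched as follows. The code is a simplified illustration, not the evaluated implementation: it assumes constant speed and heading over τ, a flat-earth approximation, and a connectivity map stored as a dictionary keyed by a regular grid index. The grid cell size and all function names are hypothetical.

```python
import math

EARTH_RADIUS_M = 6_371_000.0


def extrapolate_position(lat_deg, lon_deg, speed_m_s, heading_deg, tau_s):
    """Predict P(t + tau) assuming constant speed and heading."""
    distance_m = speed_m_s * tau_s
    heading_rad = math.radians(heading_deg)      # 0 deg = north, 90 deg = east
    d_north_m = distance_m * math.cos(heading_rad)
    d_east_m = distance_m * math.sin(heading_rad)
    new_lat = lat_deg + math.degrees(d_north_m / EARTH_RADIUS_M)
    new_lon = lon_deg + math.degrees(
        d_east_m / (EARTH_RADIUS_M * math.cos(math.radians(lat_deg))))
    return new_lat, new_lon


def cell_index(lat_deg, lon_deg, cell_size_deg=0.0005):
    """Grid cell of the connectivity map that contains the position."""
    return (math.floor(lat_deg / cell_size_deg),
            math.floor(lon_deg / cell_size_deg))


def predict_context(connectivity_map, lat_deg, lon_deg,
                    speed_m_s, heading_deg, tau_s):
    """Look up C(t + tau); None signals a prediction failure (CAT fallback)."""
    future = extrapolate_position(lat_deg, lon_deg, speed_m_s, heading_deg, tau_s)
    return connectivity_map.get(cell_index(*future))


def prediction_failure_ratio(predicted_contexts):
    """Share of lookups for which the map holds no data for the predicted cell."""
    failures = sum(1 for context in predicted_contexts if context is None)
    return failures / len(predicted_contexts)
```

A vehicle heading north at 20 m/s is predicted 600 m ahead for τ = 30 s; if that cell was never visited during map creation, the lookup returns `None` and counts towards the PFR.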
To assess the information added by predicting the passive downlink indicators, the resulting error has to be set in relation to the value range Φ max − Φ min of the corresponding metric Φ. This error is severely influenced by the accuracy of the position prediction. While for RSRP and RSRQ only slight differences between the two prediction approaches can be observed, SINR and CQI achieve an aggregation gain from the cell-wise averaging within the connectivity maps. As both indicators are significantly affected by short-term fading, a single reference trace is not able to provide an accurate estimate.

D. Single-metric Context-predictive Transmission
As a consequence of the prediction accuracy analysis, the context-predictive pCAT scheme uses the trajectory-based mobility prediction with connectivity maps. Fig. 12 shows the results for the considered KPIs with prediction horizon τ = 30 s. By considering the future channel behavior, pCAT adds another dimension that is sensitive to the dynamics of the channel quality and significantly changes the transmission behavior. As the vehicle proceeds on its route, the channel context is considered within a moving window with the range [t, t + τ ]. As a consequence, the transmission scheme is influenced much more by the probability of the vehicle encountering the anticipated context within the remaining time interval until t max than by having a perfect prediction for a discrete point in time. Additionally, transmissions are performed more often, resulting in a reduced AoI, which even falls below the baseline of the periodic approach on the highway track. In contrast to CAT, the resulting data rate differs significantly among the considered metrics, as it now also depends on the predictability of the metrics themselves. The highest gains are achieved with the SINR and RSRP metrics (up to 95 % on the suburban track and up to 77 % on the highway track). Due to the twofold dependency on the channel dynamics through the consideration of C(t) and C(t + τ ) and the proactive detection of connectivity valleys, transmissions are less likely to be forced by t max . As a consequence, pCAT significantly outperforms the CAT-based approach in terms of power consumption.

E. Machine Learning-enabled Context-aware and Context-predictive Transmission
Finally, the results for ML-CAT and ML-pCAT are shown in Fig. 13. The machine learning-enabled schemes exploit the correlation of the individual features with the resulting data rate, as shown in Fig. 9, with the aim of data rate optimization. Both variants are able to achieve massive boosts in the resulting data rate. The best overall performance is achieved for ML-pCAT with τ = 30 s on the highway track, where it achieves a data rate gain of 194 % while simultaneously reducing the average uplink power consumption by 54 %. In comparison to the previously discussed results for single-metric pCAT, the AoI is increased, as ML-CAT and ML-pCAT highly exploit the correlation between payload size and data rate. Therefore, these approaches actively introduce additional buffering delays to achieve higher packet sizes. This relationship is also illustrated in Fig. 14, which shows an example trace of the temporal behavior of the M5T-based data rate prediction during a drive test. The dependency of the data rate on the payload size can be observed in the step-wise linear component of the curve, which is caused by the payload growing as further sensor data packets are added over time. In comparison to the single-metric transmission schemes and with regard to the complex interdependencies of context parameters and data rate, it can be concluded that the data rate prediction provides a much better metric for channel quality assessment than the measured value of a single indicator. Machine learning enables the implicit consideration of hidden effects such as TCP slow start, channel coherence time and interdependencies between the downlink indicators within the transmission process itself.
VIII. CONCLUSION AND FUTURE WORK
In this paper, we presented the machine learning-enabled transmission schemes ML-CAT and ML-pCAT for client-side context-aware transmission of vehicular sensor data. Machine learning-based data rate prediction is used as a meaningful metric for scheduling the transmission time of delay-tolerant sensor data transmissions. By implicit consideration of hidden effects such as the interdependency of payload size and channel coherence time, the resulting data rate can be substantially improved while simultaneously reducing the required uplink transmission power of the embedded device. The latter is a crucial factor for data sensing by energy-constrained vehicles (e.g., UAVs) and embedded systems. The trade-off between achieved benefits and introduced delay due to local packet buffering can be controlled by different parameters and variants of the proposed probabilistic transmission scheme. All measurement tools and raw results of the experiments are provided in an Open Access manner in order to achieve a high level of transparency and reproducibility. In future work, we will investigate the cross-layer interdependencies of the proposed transmission scheme. Furthermore, we aim to improve the data rate prediction by integrating knowledge about the active cell users obtained from control channel analysis. Moreover, the proposed approach will be combined with methods for coordinated pre-aggregation of the data in vehicular crowds. Additionally, the mobility prediction algorithms will be evaluated with regard to their potential for predictive steering of millimeter wave (mmWave) pencil beams in a vehicular context.