Android Head Units vs. In-Vehicle ECUs: Performance Assessment for Deploying In-Vehicle Intrusion Detection Systems for the CAN Bus

Following the numerous attacks that exploited vulnerabilities of Controller Area Networks (CAN), intrusion detection systems have become a topic of prime importance for in-vehicle buses. Newer in-vehicle communication layers, such as CAN-FD, need similar attention despite their larger payloads, which can easily integrate cryptographic elements. But detecting intrusions may call for computationally demanding algorithms, while timely detection is necessary in order to process frames in real time and take the appropriate actions. In this work we evaluate the performance of several binary classifiers on traditional in-vehicle Electronic Control Units (ECUs) and compare them to modern Android devices, which have become widespread inside cars with the adoption of Android-capable infotainment systems. These modern devices benefit from higher computational and memory resources, while cloud connectivity may alleviate computational costs even further. Contrasting traditional controllers with Android devices has become necessary, and so far there has been little effort in this direction. To create a realistic testbed, we use in-vehicle CAN bus traffic collected from an SUV as well as more demanding logs from Advanced Driver-Assistance Systems (ADAS) implemented on CAN-FD, which we augment with adversarial activity.

The associate editor coordinating the review of this manuscript and approving it for publication was Parul Garg.

same is implied by recent regulations which are pushing the vehicle industry to evolve in terms of electronics by developing new technologies that will make the driving experience safer and decrease the environmental pollution or energy consumption. But as a side-effect to the increased complexity of in-vehicle electronics and interconnectivity, the number of attack surfaces will increase as well.

More than three decades after its introduction by BOSCH, the Controller Area Network (CAN) is still the most commonly used communication protocol inside vehicles which makes it one of the most important assets that requires protection against malicious attacks. But the security of CAN buses is lacking since no mechanisms were put in place at design time and CAN offers no protection against malicious adversaries. The potential of attacking in-vehicle networks

potential variations. First, the IDS can be deployed on an Android head unit which is already a common component in modern vehicles. Besides exhibiting increased computational resources, these units are also equipped with 5G communication which can be used for remote diagnosis (possibly via cloud-based services, which can be used to enhance even further the intrusion detection mechanism inside vehicles by more demanding algorithms and large data pools). Second, the IDS can be deployed on the user device, e.g., a smartphone, that collects CAN bus data by using WiFi connectivity to the OBD port as also outlined in Figure 2 (i). This would allow similar capabilities to the case of Android head units.
However, there are additional advantages since users may easily change their smartphone thus benefiting from increased computational and communication capabilities over the years (changing the Android head unit is less convenient). Also, this may open room for third-party software that may be published in Android application stores and may be acquired by users similar to existing antivirus software. An immediate disadvantage however is that Android head units or smartphones may become more easily corrupted than in-vehicle controllers. For example, the authors of [6] performed some attack experiments on real vehicles by repackaging Android commercial apps. Another demonstration of possible attacks on Android devices is made in [7], in which an Android infotainment unit is hacked and enables attackers to inject messages on the CAN bus. Further, application vulnerabilities are discussed in [8] and [9]. Another possible disadvantage in implementing IDS on Android devices is that Android smartphones are not directly connected to the CAN bus and wireless communication may induce additional delays (fortunately, these delays may not be so significant but this depends on the interface used for data collection, as we show later in the experiments). Using smartphones may also turn into an advantage from a security perspective, since the smartphone is not directly linked to the CAN bus and wireless connectivity to the bus may be implemented in a read-only fashion. Thus, a compromised phone will not be able to cause attacks on the bus.

Android platforms that may benefit from remote connectivity and increased computational power. A summary of the brief comparison between in-vehicle ECUs and Android units is given in Table 1.
As a collateral contribution, though not necessarily the main focus of our work, we also evaluate the efficiency of several machine-learning algorithms in detecting intrusions.

VMs and three representative automotive-grade microcontrollers, as well as memory requirements on the latter due to the stringent constraints on such platforms.

The rest of the paper is organized as follows. In Section II we provide some background on CAN buses and discuss related work. Section III presents the utilized in-vehicle traces, the devices that we used in our experiments and the adversary model. In Section IV we present our experimental testbed. Section V places the binary classifiers in the previously outlined setups and evaluates their performances. Finally, Section VI holds the conclusion of our work and section VI contains the list of acronyms.

were proposed, which we now discuss. However, most of these studies focus only on the detection accuracy and do not take into account the computational constraints which are crucial in the context of automotive embedded platforms; these constraints are the main focus in our work. In what follows we survey more than twenty-five papers related to the development of in-vehicle IDS, but only a small number of them, namely [18], [19], [20], [21] and [22], use embedded development boards. Also, a comparison between in-vehicle controllers and Android units that are now common in cars is missing from related works.

In [18] an IDS based on remote frames is presented. The authors measure the time interval between request frames (also known as remote frames) and response frames (also known as data frames) and show how adversarial frames cause offset variations that do not occur in an attack-free scenario. The use of Bloom filters was explored in [19] in order to detect malicious activity on the CAN bus. The proposed detection technique is based on a training stage that examines the message periodicity in order to detect replay attacks and the content of the data field in order to detect injections with random data. The authors show that the real-time classification is time-memory efficient and obtain good detection results. A graph-based IDS that models the CAN traffic is considered in [23]. Other lines of works employ entropy characteristics [24], [25]

This accounts for the use of voltage thresholds [29], clock skews [20] or signal characteristics [30].

Significant attention is also given to machine learning based approaches. A hierarchical taxonomy on these methodologies can be found in [31]. The authors from [32] provide

In this section we describe the in-vehicle traces and experimental devices that we use in our evaluation. Also, we discuss the adversarial behavior that our intrusion detection system accounts for.

In our analysis we used two real-world datasets collected by us and a dataset from [1] which we use as a reference. The collection of the CAN dataset from the cars was performed using a Vector VN1630 USB-to-CAN interface. We have implemented a Windows application using Vector XL Driver Library to interface with the VN1630 hardware. For the first CAN trace, we connected the VN1630 to the Dacia Duster in-vehicle OBD port and extracted the CAN bus traffic. Our second dataset was extracted directly from a private CAN bus on which automotive radar ECUs were connected, i.e., ADAS systems (Advanced Driver-Assistance Systems). Using data from such a system is relevant since future autonomous vehicles will directly depend on it, not to mention the increased help these systems have to offer to regular drivers.

The CAN logging procedure is graphically depicted in Figure 9 (i). Several details on these datasets are summarized as follows:
1) the first dataset comes from a Dacia Duster (Figure 4) which is a compact sport utility vehicle (SUV) which we see as representative for mid-range cars. The collected data is more limited in terms of the number of IDs, only 12 IDs are visible on the OBD port, but it is almost identical to the rest in terms of delays and entropy.

2) the second dataset comes from a high throughput CAN-FD network that accommodates ADAS systems, e.g., vehicle radars used to detect vehicles and pedestrians. This type of traffic is representative for mid to high-end cars that possess modern equipment needed for complex tasks such as autonomous driving. This dataset is more complex, containing more than 80 IDs and frames of up to 512 bits. The communication layer is the newer CAN-FD.

We also use the dataset from [1] which was recorded in a Hyundai Sonata and we keep it as a reference to compare our results with existing works. The trace contains 27 IDs making it more similar to our first dataset and less complex than the second.
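
As a rough illustration of the two per-ID statistics used above to compare the traces (delays between consecutive frames and entropy of the data field), the following Python sketch computes both for a small trace. The frame layout (timestamp, ID, data bytes) and the function names are assumptions made for this example and are not taken from our logging tools; the entropy is computed here over individual data bytes, which is only one of several possible choices.

```python
import math
from collections import Counter, defaultdict


def shannon_entropy(values):
    """Shannon entropy, in bits per symbol, of a sequence of hashable values."""
    counts = Counter(values)
    total = len(values)
    return sum(-(c / total) * math.log2(c / total) for c in counts.values())


def per_id_statistics(frames):
    """Compute the mean inter-arrival delay and data-byte entropy for each ID.

    frames: iterable of (timestamp_s, can_id, data_bytes), ordered by time.
    Returns a dict {can_id: (mean_delay_s or None, entropy_bits)}.
    """
    times = defaultdict(list)
    payload = defaultdict(list)
    for timestamp, can_id, data in frames:
        times[can_id].append(timestamp)
        payload[can_id].extend(data)

    stats = {}
    for can_id, stamps in times.items():
        # Delay between consecutive frames carrying the same ID.
        delays = [later - earlier for earlier, later in zip(stamps, stamps[1:])]
        mean_delay = sum(delays) / len(delays) if delays else None
        stats[can_id] = (mean_delay, shannon_entropy(payload[can_id]))
    return stats


if __name__ == "__main__":
    # Hypothetical four-frame trace, for illustration only.
    trace = [
        (0.000, 0x100, bytes([0, 1])),
        (0.005, 0x200, bytes([7])),
        (0.010, 0x100, bytes([0, 1])),
        (0.020, 0x100, bytes([0, 1])),
    ]
    for can_id, (delay, entropy) in per_id_statistics(trace).items():
        print(hex(can_id), delay, entropy)
```

An ID seen only once has no inter-arrival delay, so the sketch reports `None` for it rather than a misleading zero.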

A few words on the traces based on the depictions from Figure 5 are necessary. This figure depicts some statistics for one ID in each trace. We notice that in all traces the content of the datafield shows clear patterns which would make it

VOLUME 10, 2022

The first category of devices that we used in our setup comprises the Android-based devices. We used a PNI A8020 head unit whose production started in 2017 and a more recent one, Erisin ES8791V, released in 2019. Due to their high usage and capabilities, smartphones were also considered in our setup. Consequently, we chose to work with a Samsung A6, a Samsung S8 and a Samsung Note10+. In addition to smartphones, we also included a tablet in our work, namely the Samsung Galaxy Tab S7.

FIGURE 6. The two stage intrusion detection algorithm in our work.

The second category of devices that we worked with consists of automotive-grade microcontrollers. We used two devices from the Infineon AURIX family of 32-bit microcontrollers, which are meant especially for automotive and industrial applications. The first microcontroller is a TC224, belonging to the 1st generation of AURIX, while the other microcontroller is a TriCore TC397, which is part of the 2nd generation of AURIX. From the low-end sector, we chose an S12XEP microcontroller which is part of the S12XE family that provides 16-bit architecture microcontrollers having Hybrid Electric Vehicle (HEV), Tire Pressure Monitoring Systems (TPMS) or Motorcycle Engine Control Unit (ECU) as target applications in the automotive sector. All devices that we used in our experiments and their specifications are listed in Table 3.

The following three types of attacks have been commonly considered against CAN nodes. Fuzzing attacks in which an attacker modifies the data field of the genuine CAN frames and transmits the malicious frames on the bus. The injected data field is filled with random values. Replay attacks in which genuine CAN frames are intercepted by an attacker and retransmitted on the bus at a later time. In this scenario,

In addition to our own datasets, we also executed our algorithms on datasets from related work [1]

this attack is a DoS, i.e., the malicious ECU will occupy the resources allocated to the CAN bus, limiting the communication among the other ECUs. The data field of the injected messages is always set to zero. Departing from [1], in our work, the flooding attack consists of frames with IDs whose values are less than the genuine ID with the lowest value from the dataset and the data field is filled with random values.
The effect is similar, although the attack is more difficult to detect and more realistic since, with our adversary model, a DoS is not caused by ID 0x00 alone. The second type of attack, i.e. fuzzy attack, consists of sending frames with random IDs and data. This type of attack will be much easier to detect than ours since most of the random IDs will not be part of the legitimate trace (for this, machine learning algorithms are not needed since unknown IDs are easy to detect by a look-up table). Since this attack would be trivially detected by the first stage of the intrusion detection mechanism, which checks that IDs belong to genuine ECUs, we do not reproduce it in our dataset. Note that in real-world scenarios, the IDs are indeed known by manufacturers at the time of designing the in-vehicle

We investigated two options to achieve this with the Android head unit and with Android smartphones respectively.

We first used a USB to CAN adapter to connect the Android head unit to the CAN bus. The USB to CAN adapter (https://www.seeedstudio.com/USB-CAN-Analyzer-p-2888.html) is commercialized by Seeed Technology and supports both CAN 2.0A and CAN 2.0B with baudrates ranging from 5 kbit/s to 1 Mbit/s. A software application is available for Windows and Linux which may be used to work with the adapter. In addition, a document that describes the UART protocol and the way in which the device can be configured and controlled is available on Github. Therefore, as our target was to use it on the Android head unit, we implemented our own control code in Android Studio. To enable the UART communication on Android, we have used a library also hosted on Github [45].

As a second option, we used a Raspberry Pi module to wirelessly route the CAN messages to the Android head unit. This is the scenario in which the Raspberry Pi is connected to the CAN bus using the OBD port and forwards all the CAN messages from the bus to the head unit or smartphone via WiFi. The Raspberry Pi device does not feature an embedded CAN transceiver, therefore we needed to use an external one. We chose to work with the MCP2518FD click board from MikroElektronika which provides a complete CAN and CAN-FD solution. The board is equipped with the MCP2518FD CAN controller, which has an SPI interface, and the ATA6563 transceiver. Both integrated circuits are produced by Microchip. The MCP2518FD click board ensures CAN communication speeds up to 5 Mbps and can run in one of the following operating modes: normal CAN 2.0, normal CAN FD, restricted operation, sleep, listen only, internal and external loop back modes and configuration. The CAN click board is connected to the Raspberry Pi via the Pi 3 Click shield, which is designed by MikroElektronika to support a wide range of click boards.

For both scenarios we used the CANoe environment and a VN1630 hardware to replay the attack traces on the CAN bus. The replay procedure is illustrated in Figure 9 (iii). The frames were monitored, processed and classified as genuine or attack frames by the Android smartphone in one scenario or by the head unit in the other scenario. The results are discussed in the next section.

Our experimental setup with all components that we used to deploy the two scenarios is presented in Figure 10. We used a Mastech power supply for the PNI head unit, a laptop to run the CANoe application, a VN1630 hardware to connect the laptop to the CAN bus, a CAN decoder to enable the head unit to communicate on the CAN bus, a Raspberry Pi to route the CAN traffic from the CAN bus to the smartphone and finally the Samsung A6 and PNI head unit which ran the IDS procedures.

The four classifiers that we later used in our on-line analysis, i.e. AB, CART, ET and RFC, were trained in Python and the generated code was converted to C code using the sklearn-porter library [46], so that it becomes easy to adapt and use for Android devices and microcontrollers. On the Android devices, we used the Native Development Kit (NDK) which allows developers to use C and C++ code with Android applications. Therefore, we compiled the C code of

locally stored in the Security Event Memory or transmitted to the IdsR which collects the QEVs from multiple ECUs and can provide the data to Security Operation Centers for further processing. Currently there is no specification for IdsR provided by AUTOSAR. In our case, we consider that our intrusion detection mechanism should be categorized as an advanced security sensor and its deployment should be done on the application layer of the AUTOSAR architecture. This is suggested in Figure 12. The ML module would receive CAN frames from the communication (COM) stack and would report security events to IdsM if intrusions are detected.

In this section we discuss experimental results both from the off-line and on-line analysis. We also focus on computational and memory requirements and particularly highlight the importance of delays.

In order to compare the performance of the binary classifier candidates and decide which of them is suitable to be embedded in a vehicular CAN bus IDS, we used regular metrics for machine-learning algorithms.

from three different vehicles, i.e. Hyundai YF Sonata, KIA Soul, and CHEVROLET Spark. Then, the authors of [1] created for each vehicle three different traces, each of them containing one of the three attacks that they defined in their work, i.e. flooding, fuzzy and malfunction attack. In our off-line analysis, we evaluated the datasets which contained the fuzzing and malfunction attacks on the Hyundai Sonata CAN traffic. Both datasets contain approximately 60 seconds of CAN traffic. We trained the algorithms on the CAN frames from the first half of the datasets (≈30 seconds) while the second half of the datasets was used for the evaluation phase.

The results are presented in Table 4 and Table 5. The performance of the algorithms was almost perfect in detecting fuzzing attacks. The recall was the only metric whose value was 0.99 for half of the classifiers, while the other metrics were 1.00 for all classifiers. In case of malfunction attacks, the results decreased slightly, especially in precision. However, the overall performance is still good, i.e.

Table 7. Perhaps not surprisingly, as this dataset is the most complex of the ones that we evaluated, the classifiers recorded the lowest performance results on this trace. Except for the NB, which did not perform well on this dataset, for the rest of the algorithms the accuracy varied between 0.89 and 0.98, precision between 0.72 and 0.97 and specificity between 0.87 and 0.99.
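
The metrics quoted in this section (accuracy, precision, recall and specificity) all follow from the four confusion counts of a binary classifier. The Python sketch below shows these standard definitions together with a simple first-half/second-half split. It assumes that attack frames are labeled 1 (the positive class) and genuine frames 0; the function names are illustrative, and the split is done by frame count, which only approximates a split by recording time.

```python
def temporal_split(frames):
    """Split a time-ordered list: first half for training, second half for evaluation."""
    middle = len(frames) // 2
    return frames[:middle], frames[middle:]


def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn), with label 1 = attack frame and 0 = genuine frame."""
    tp = fp = tn = fn = 0
    for truth, predicted in zip(y_true, y_pred):
        if truth == 1 and predicted == 1:
            tp += 1
        elif truth == 0 and predicted == 1:
            fp += 1
        elif truth == 0 and predicted == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def classification_metrics(tp, fp, tn, fn):
    """Standard binary-classification metrics computed from the confusion counts."""

    def ratio(numerator, denominator):
        # Return 0.0 instead of failing when a class is absent from the data.
        return numerator / denominator if denominator else 0.0

    return {
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
        "precision": ratio(tp, tp + fp),
        "recall": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
    }


if __name__ == "__main__":
    # Hypothetical labels and predictions for six frames.
    y_true = [1, 1, 0, 0, 0, 1]
    y_pred = [1, 0, 0, 1, 0, 1]
    print(classification_metrics(*confusion_counts(y_true, y_pred)))
```

Precision is the metric most sensitive to false alarms on genuine frames, which is consistent with it showing the widest range in the results above.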

The results from the off-line analysis are better than the ones obtained in the on-line analysis. This is because, in the on-line evaluation, the timestamps may vary due to frame overlaps on the bus. It suggests that the off-line analysis presented in most papers may provide more optimistic results than a real-world evaluation.

One specific problem in the on-line evaluation is that the devices which we used for recording CAN bus traffic have their own imperfections which influenced the performance of the IDS. We note that the timestamps of the frames may have slight variations according to the device. In particular, the Raspberry Pi that we used over the WiFi bridge performed excellently, offering almost identical timestamps to those from the VN1630. The CAN decoder however did not perform very well, giving poor accuracy for the recorded timestamps.

The results obtained on the Duster dataset are presented in Table 8

In addition to the detection performance evaluation, we also assessed the proposed IDS mechanism in terms of runtime speed and memory requirements on several Android devices and three automotive-grade microcontrollers. It is well known that controllers employed nowadays as automotive ECUs have limited computational power and memory. On the other hand, ECUs communicate in real time inside the in-vehicle network, so the IDS algorithms have to be very efficient in terms of execution speed. Computational time and memory requirements are the main challenges in adopting IDS solutions in the automotive world.
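
To make the notion of per-frame execution time concrete, the sketch below shows one way to estimate the average classification time per frame. It is written in Python for illustration only, whereas the classifiers evaluated here run as C code on the target devices, so absolute numbers from this sketch are not comparable with the reported ones. The `toy_classifier` threshold rule and the function names are hypothetical stand-ins, not one of the evaluated algorithms.

```python
import time


def toy_classifier(frame):
    """Hypothetical stand-in rule: flag a frame whose first data byte exceeds 0x7F."""
    _can_id, data = frame
    return 1 if data and data[0] > 0x7F else 0


def mean_time_per_frame_us(classify, frames, repeats=5):
    """Best-of-`repeats` average time, in microseconds, to classify one frame."""
    if not frames:
        raise ValueError("frames must not be empty")
    best_ns = float("inf")
    for _ in range(repeats):
        start = time.perf_counter_ns()
        for frame in frames:
            classify(frame)
        elapsed_ns = time.perf_counter_ns() - start
        # Keep the fastest run to reduce the influence of scheduling noise.
        best_ns = min(best_ns, elapsed_ns / len(frames))
    return best_ns / 1000.0


if __name__ == "__main__":
    # Hypothetical workload: 10,000 frames with a varying first data byte.
    frames = [(0x100, bytes([i % 256, 0, 0, 0])) for i in range(10_000)]
    print(f"{mean_time_per_frame_us(toy_classifier, frames):.3f} us per frame")
```

Averaging over a whole batch of frames, rather than timing each call separately, keeps the cost of reading the clock out of the measurement.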

The first stage of our proposed IDS mechanism, which simply evaluates the arrival time and frame rate, is of no concern in terms of execution speed or memory consumption. Therefore we focus our evaluation on the four selected machine

this in mind, we also measured the execution time with only one JNI call. For this, we hardcoded the evaluated messages in a C file which we compiled with the application so that we can run the algorithms on all messages at the native layer. Consequently, we reduced the number of JNI calls to one. These results are presented in Table 11. With only one JNI call, the time decreases significantly for all algorithms on both head units. On the Duster dataset, the required execution time of CART, ET and RFC ranges between 0.47 µs and 3.82 µs and between 1.55 µs and 10.67 µs in case of the ADAS Systems dataset. AB is executed in less than 86 µs on the PNI head unit and in less than 72 µs on the Erisin head unit.

We further did the same evaluations on the Android smartphones and tablet. The results are presented in Tables 12 and 13. Table 12 contains the execution times evaluated with multiple JNI calls, while Table 13 lists the execution time results with one JNI call. The smartphones and the tablet prove to be somewhat faster than the head units. According to the results, the fastest algorithm is CART, which required an execution time in the range of 0.77 µs (on Samsung Galaxy Tab S7) to 5.76 µs (on Samsung A6) when classifying frames from the Duster dataset. As expected, the time

configuration. It seems that VMs running Ubuntu are somewhat faster than the Windows based VMs. However, an important aspect that needs to be considered for cloud solutions is the data transmission time, which depending on various factors (e.g.
location of the server, internet connection) can range from tens of milliseconds to hundreds of milliseconds or even more. For a better visualization, the computational results on the Android devices and cloud VMs are depicted as bar-charts in Figure 14.

Next, we evaluated the algorithms on the automotive-grade microcontrollers. In our experiments, we compiled the C code with the default compiler options for each microcontroller, which leaves room for optimization in terms of memory or execution speed, depending on the needs. For this class of devices, in addition to execution speed, we also evaluate the required code flash for each algorithm, since memory consumption is one of the most stringent limitations of the automotive microcontrollers. The results are listed in Table 16. Regarding memory consumption, the situation looks good for the algorithms that were trained on the Duster dataset. The

more than 40% of the entire available memory of this microcontroller. We were not able to include and assess CART or RFC on S12 as the compiler that we used for S12 has a 64 kB code limitation.

However, this happens only if one can avoid expensive API calls over the JNI interface and if the code is run at the native level on the ARM processor of the Android unit. This will depend on the number of JNI calls that are time consuming, i.e., when multiple calls are used, the high-end controllers will outperform low-end Android devices. This implementation detail may significantly reduce the capability of such devices. For example, when performing multiple calls from Java to the C/C++ code of the classifier, the Android head unit proved to be slower than the fastest microcontroller.
Also, we notice that the CAN decoder that was linked through the serial interface to the Android unit is not reliable enough for recording the timestamps, which further impedes the detection rates of the IDS. Moreover, the same CAN decoder was unable to cope with the frame rate from the bus and there was a consistent frame loss. Finally, the WiFi bridge performed very well, giving almost identical results in terms of timestamps compared to the industry-standard VN1630. This suggests this option as a reliable one for implementing an IDS inside vehicles. The flexibility offered by implementing an IDS on Android devices, which may take advantage of high CPU and memory resources as well as cloud support, may open the road for the deployment of more advanced IDS in future cars.

AB
Adaptive Boosting.

ADAS
Advanced Driver-Assistance Systems.