MalView: Interactive Visual Analytics for Comprehending Malware Behavior

Malicious applications are usually comprehended through two major techniques, namely static and dynamic analyses. Through static analysis, a given malicious program is parsed, and some representative artifacts (e.g., control-flow graphs) are produced without any execution, whereas the given malicious application needs to be executed when conducting dynamic analysis. These two mainstream techniques for analyzing the given software are effective in detecting certain classes of malware. More specifically, through static analysis, the patterns and signatures of the malware are exposed, helping in detecting any known malicious payload hidden in or injected into the code. On the other hand, behavioral and run-time execution patterns of software are explored through dynamic analysis. To ease the analysis process, a third approach, the visual representation of the artifacts created by both static and dynamic analysis tools, can serve as a supplementary asset for malware experts. This paper introduces MalView, an interactive visualization platform for malware analysis, by which pattern matching techniques on both signature-based and behavioral analysis artifacts can be utilized to 1) classify malware, 2) identify the intention and location of the malicious payload in the artifacts, 3) analyze unknown malware (i.e., zero-day malware) by recognizing any unusual signature or behavior, and 4) explore the time dependencies and thus the system components affected or tampered with by the underlying malware. The results of several case studies conducted in this work show that MalView offers more features and information compared to some other visualization tools, facilitating the malware analysis process.

unauthorized activities on behalf of their originators on the host machines for various reasons such as stealing advanced technologies and intellectual properties, governmental acts of revenge, and tampering with sensitive information, to name a few. Malware applications are complex software programs that are often obfuscated to disguise their main intentions and thus deceive network administrators and the underlying intrusion detection systems. Although such obfuscations can be captured, reported, and maintained in a repository as a reference for building better detection mechanisms, newer malware programs are constantly developed by professional hackers, raising the challenging problem of zero-day malware detection [1]. As a result, in order to build an effective malware detection and defense system, it is crucial to understand each malware and comprehend its behavior through rigorous analysis.

The associate editor coordinating the review of this manuscript and approving it for publication was Laxmisha Rai.

VOLUME 10, 2022. This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/

There are two conventional approaches that are widely adopted for analyzing software programs: 1) static analysis, by which the underlying software is parsed and intermediate transformations of the underlying software are generated without actually executing the software program (for instance, a control-flow graph can be created to represent the execution control of the program under test); and 2) dynamic analysis, by which the program under test is executed in a controlled environment (e.g., a sandbox) and the behavior of the program is observed under various environ-

Malware visualization systems can be categorized into three categories: Malware Forensics, Malware Comparison, and Malware Summarization [5]. The work in MalView falls under the Malware Forensics and Malware Comparison categories: assisting the understanding of the behavior of an individual malware sample for forensics. By exploring the characteristics and relationships between the process and its dependencies and mapping them to visual features, MalView provides an interactive and intuitive platform to comprehend malware behavior towards the ultimate goal of generating rules and signatures for fully-automated malware detection systems. To demonstrate the effectiveness of MalView in identifying and interpreting malicious and suspicious activities of malware, the paper reports the analysis of different families of malware, namely Remote Access Trojans (RATs), Backdoor, Ransomware, Behavioral, Email Flooder, and Hacktool. The results show that using MalView it is possible to quickly understand the main functionalities of the underlying malware without delving into a complex analysis of the static and dynamic analysis reports.

While conducting the case studies and inspecting some malware families, the authors noticed that the same malware exhibited different behavior on different operating system (OS) platforms. As a result, each malware was executed and inspected on three different Windows platforms: Windows XP, Windows 7, and Windows 10. Even though the execution of each malware was performed in a controlled environment, it was noticed that the newer Windows platforms (e.g., Windows 10) were creating more system and kernel-level processes, making it harder to thoroughly inspect and analyze the exact flow of each malware on these recent platform versions. As a remedy for this problem, it is suggested to apply additional filtering mechanisms in order to analyze each malware and its processes thoroughly.
This paper makes the following key contributions:
1) It introduces MalView, a malware visualization tool to enable analytical reasoning of malware behaviors.
2) The MalView visualization tool visualizes the output of several dynamic and static analysis tools.
3) The tool also integrates the output of many anti-virus tools using their Application Programming Interface (API) to provide additional insights for each malware.
4) The paper demonstrates the efficiency and effectiveness of MalView through several case studies conducted on a set of malware families.
5) The paper also compares the behavior of each malware when executed on three different Windows platforms (i.e., XP, 7, and 10) in order to recognize the impact of environmental settings on malware comprehension and analysis.

A. ORGANIZATION OF THE PAPER
The rest of the paper is laid out as follows: Section II gives an overview of the data collected and fed into the visualization. Section III introduces the system and visualization tasks that guide the system design. Next, section IV elaborates

The log file output from Procmon contains five major types of process activities, which are color-coded in our framework.
• Registry: Events of registry operations, such as querying and enumerating keys and values.
• File System: Events related to operations on local and remote storage and file systems.
• Network: Network activities, including TCP and UDP.
• Process: Events of process/thread, such as process creation, start, and exit.
• Profiling: Events for every process in the system in terms of memory used, kernel and user time charged, output as a log for the profile.
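
As an illustration of how the five event types listed above could be recovered from a Procmon log, the following minimal Python sketch classifies each logged operation by its name. It is not taken from the MalView implementation: the prefix rules in `classify_operation`, the helper names, and the sample rows are assumptions made for illustration, and the column names follow Procmon's usual CSV export.

```python
import csv
import io
from collections import Counter


def classify_operation(operation: str) -> str:
    """Map a Procmon operation name to one of the five event types.

    The prefix rules below are a heuristic assumed for this sketch.
    """
    op = operation.strip()
    if op.startswith("Reg"):
        return "Registry"
    if op.startswith(("TCP", "UDP")):
        return "Network"
    # Checked before the generic "Process"/"Thread" prefixes.
    if op.endswith("Profiling"):
        return "Profiling"
    if op.startswith(("Process", "Thread")) or op == "Load Image":
        return "Process"
    # Everything else (CreateFile, ReadFile, ...) is treated as file system.
    return "File System"


def count_event_types(csv_text: str) -> Counter:
    """Count events per type in CSV text that has an 'Operation' column."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter(classify_operation(row["Operation"]) for row in reader)


# Hypothetical log excerpt (invented rows, for illustration only).
SAMPLE_LOG = "\n".join([
    "Time of Day,Process Name,PID,Operation,Path,Result,Detail",
    r"10:00:01.000,sample.exe,1234,RegOpenKey,HKLM\Software\Example,SUCCESS,",
    r"10:00:01.200,sample.exe,1234,CreateFile,C:\Temp\a.tmp,SUCCESS,",
    "10:00:02.000,sample.exe,1234,TCP Connect,host:49200 -> example.test:80,SUCCESS,",
    r"10:00:02.500,sample.exe,1234,Process Create,C:\Windows\System32\netsh.exe,SUCCESS,",
    "10:00:03.000,sample.exe,1234,Process Profiling,,SUCCESS,",
])

if __name__ == "__main__":
    for event_type, count in sorted(count_event_types(SAMPLE_LOG).items()):
        print(event_type, count)
```

A color-coded view would then only need a fixed lookup from these five type names to colors.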

MalView is aimed at accelerating malware analysis and integrating visual analytics to enable interactive data exploration and malware behavior comprehension. Figure 1 depicts the architecture of MalView. The flow of information in MalView is as follows: 1) It uses a data provider in dynamic analysis, where the malware sample is executed on a host system, then the data provider logs relevant information into execution traces. 2) MalView takes in the raw data captured by the data provider, extracts the information, and maps it to visual features. 3) MalView explores the relationship between each process and its dependencies. To the best of our knowledge, previous work has not taken this feature into account: MalView considers not only the malware as an individual program but also its interactions with the system and the artifacts created.
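
Step 3 of this flow, relating each process to its dependencies, can be sketched as a simple grouping over the captured events. The code below is an assumed, simplified data model for illustration only (the `Event` record, the `build_dependencies` helper, and the sample events are hypothetical and not part of MalView):

```python
from collections import Counter, defaultdict
from typing import NamedTuple


class Event(NamedTuple):
    """One captured event (hypothetical, simplified record)."""
    time: float      # seconds since the start of the capture
    process: str     # name of the process that issued the operation
    operation: str   # operation name as logged by the data provider
    path: str        # object the process operated on (its dependency)


def build_dependencies(events):
    """Map each process to {dependency path: number of events touching it}."""
    deps = defaultdict(Counter)
    for ev in events:
        if ev.path:  # skip events that do not reference an object
            deps[ev.process][ev.path] += 1
    return {proc: dict(paths) for proc, paths in deps.items()}


# Invented events, for illustration only.
EVENTS = [
    Event(0.10, "sample.exe", "RegOpenKey", r"HKLM\Software\Example"),
    Event(0.25, "sample.exe", "RegQueryValue", r"HKLM\Software\Example"),
    Event(0.40, "sample.exe", "CreateFile", r"C:\Temp\a.tmp"),
    Event(1.00, "explorer.exe", "Process Create", r"C:\Temp\sample.exe"),
]

if __name__ == "__main__":
    for proc, paths in build_dependencies(EVENTS).items():
        print(proc, paths)
```

The resulting one-to-many mapping is the kind of structure a node-link view of a process and its dependencies could be drawn from.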

The MalView prototype provides visual representations for system and malware activities captured by the Procmon [6] utility. In the context of malware analysis, four system-level activities are of utmost importance and need to be captured, namely registry, file system, processes and threads, and network activities. These four major categories are highlighted by InfoSec [7], [8] as indications of malicious activities. The processes and events related to these four activities are captured by Procmon, filtered, and then fed to MalView.

MalView provides an analysis of linked views with interactions for users to gain comprehensive insights into malware behaviors within the system. Details of MalView visual components with their corresponding interactive features are described in the following section, MalView: Visual components.

The tool MalView is developed as a web-based application using JavaScript and the D3.js library created by M. Bostock et al. [10]. The primary goal of MalView is to provide an interactive visualization platform that demonstrates the malware behaviors and interactions within the system. The captured events are presented in multiple

FIGURE 1. The schematic overview of MalView framework for analyzing the dynamic behavior of malware. The visualization provides linked views and supports interactive features, such as filtering, highlighting, ranking, and details-on-demand.

its dependencies. To characterize the complex associations between entities within the system, the system should show the relationships caused by interactions among processes and function calls from a process to its dependencies.

• T5 Highlight critical activities in context. Here, critical is defined in context: For the timeline as a whole, MalView should allow the user to zoom into the time interval that captured the most active interactions of the malware. For malware activities in particular, the system should incorporate a filter-based feature to highlight the commonly encountered malicious types, besides the original representation.
• T6 Order the entities based on dependencies characteristics. A specific ranking order along a data dimension can be of tremendous help in the arrangement of visual components to convey important characteristics and allow the user to focus on the top essential entities.
• T7 Classify malicious vs. benign activities. Another key to understanding malware forensics is the ability to show the malicious and benign activities. The system should be able to classify the level of malice that corresponds with the malware sample captured.

Taking into account the mapping to time, the associations between processes and dependencies, and guided by the designed tasks, we designed the user interfaces of MalView. Figure 2 depicts the main modules of MalView for

Building upon the visual information-seeking mantra by Shneiderman [12], ''Overview first, zoom and filter, then details-on-demand,'' the process activity view in Figure 2(B) is designed to explore the temporal patterns from the system's low-level events along with inter-process communications. The timeline is presented horizontally from left to right, while the processes are listed vertically. Besides the operations executed by a process itself, there are interactions between two processes, such as one creating the other with its primary thread, demonstrated by the arc connecting the two. On top of panel B is an area chart showing the arc distribution, providing the overview of the function call frequencies (visualization task T1).

Each process is associated with an aligned set of events executed by the process itself. An individual event is represented by a thin vertical bar, color coded by its event type, which is introduced in section II-B and presented in panel A. These small, thin bars are presented with 50% transparency so that if multiple events appear at nearly the same time, the color will add up on display (visualization task T3); therefore, users can see that the calls are busy there and there is a chance for anomaly detection at these spots (visualization task T7).

The interaction arc starts from the parent process (the one that initializes the call, or the source) and ends at the child

In this case, the process is both the one that initializes and the target of the call.
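
The way semi-transparent event bars ''add up'' where events cluster can be made concrete with a short calculation. The sketch below assumes standard source-over alpha compositing, under which n overlapping bars of opacity a yield a combined opacity of 1 - (1 - a)^n; whether MalView's rendering follows exactly this model is an assumption, and the function name is hypothetical.

```python
def stacked_opacity(n_bars: int, alpha: float = 0.5) -> float:
    """Combined opacity of n overlapping bars, each drawn with opacity alpha.

    Assumes standard source-over compositing: every extra bar covers a
    fraction `alpha` of whatever background is still showing through.
    """
    if n_bars < 0:
        raise ValueError("n_bars must be non-negative")
    return 1.0 - (1.0 - alpha) ** n_bars


if __name__ == "__main__":
    for n in range(1, 6):
        print(n, stacked_opacity(n))
```

With 50% transparency, one bar is half opaque, two overlapping bars reach 75%, and three reach 87.5%, so dense bursts of events stand out visually while isolated events remain faint.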

MalView supports details-on-demand in terms of process detail, event call detail, filtering calls related to one specific process, and zooming in on a period (visualization task T2). The details of an event can be shown on the tooltip by

Figure 2(C) presents the classification of malicious or benign activities of the captured log file produced by Procmon (visualization task T7). Aligning with the primary aim of providing a visual analytics tool and platform to demonstrate malware's static and dynamic behavior, MalView captures the results provided by the integrated APIs and visualizes them for the end-user. MalView incorporates a number of APIs, such as the VirusTotal API, and inherently relies on the output produced by these APIs. We investigate the target domains that the network activities are connected to. The extracted information for each connected domain contains its Internet Protocol (IP) address, the detection classification results, the associated process and activities related to the domain, and lastly, the country in which the server is hosted. The API automatically scans a given malware, and its patterns are automatically compared with more than 70 servers and databases. The classification result consists of four categories: malicious, suspicious, undetected, or harmless, each indicated by the number of detections found corresponding to the targeted domain. Spring et al. [13] discussed that the malicious domains are attempts to connect with a command and control server or dropbox and are expected to behave differently from a typical phishing or a drive-by-download malicious site. In MalView, this list of connecting domains is ordered by the variety of the outcomes of each domain (visualization task T6). Figure 3 demonstrates the analysis summary of TeeracB malware on Windows 7. One malicious domain is detected, named ''maatuska.4711.se'', connected by the ''explorer.exe'' process with ''TCP Reconnect'' activity.
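
One possible reading of ordering domains ''by the variety of the outcomes'' is to rank each domain by how many of the four verdict categories received at least one detection. The sketch below implements that interpretation; the tie-breaking rule, the dictionary layout, the domain names, and all counts are assumptions invented for illustration and do not reproduce MalView's code or any real scan result.

```python
CATEGORIES = ("malicious", "suspicious", "undetected", "harmless")


def outcome_variety(stats: dict) -> int:
    """Number of verdict categories with at least one detection."""
    return sum(1 for category in CATEGORIES if stats.get(category, 0) > 0)


def rank_domains(reports: dict) -> list:
    """Order domain names by verdict variety, then by malicious count.

    Ties are finally broken alphabetically so the order is deterministic.
    """
    return sorted(
        reports,
        key=lambda name: (
            -outcome_variety(reports[name]),
            -reports[name].get("malicious", 0),
            name,
        ),
    )


# Hypothetical per-domain verdict counts (not real scan results).
REPORTS = {
    "cdn.example": {"malicious": 0, "suspicious": 0, "undetected": 10, "harmless": 60},
    "c2.example": {"malicious": 5, "suspicious": 1, "undetected": 14, "harmless": 50},
    "ads.example": {"malicious": 0, "suspicious": 2, "undetected": 8, "harmless": 60},
}

if __name__ == "__main__":
    for name in rank_domains(REPORTS):
        print(name, outcome_variety(REPORTS[name]))
```

Under this reading, domains on which the engines disagree most rise to the top of the list, which is where an analyst would likely want to look first.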

This process dependencies view (Figure 2(D)) presents an in-depth analysis of each process in the system, where one process can operate on many types of objects, as introduced in section IV-B1 and shown in panel (D1). The visualization task T4 is actualized as presenting the one-to-many relationships between the process and its dependencies. In addition, as the number of dependencies increases in cases with complex activity, we need a way to handle visual clutter by reducing the number of visual elements while preserving the structure. For these reasons, we employ 1) the force-directed layout with node-link diagram to demonstrate the relationships and 2) the node bundling technique [14] incorporated into the force-directed layout to reduce visual clutter by node aggregation. Force-directed layout has been explored in many

In an effort to provide its users with a safe and pro-

[20].

MalView can be utilized in different settings: 1) when the objective is to comprehend malware functionalities and not detection, 2) when a new malware application (zero-day malware) is developed and not detectable by any tool (due to lack of profiles and signatures), 3) when the objective is to classify a family of malware and then employ a set of generic solutions and remedies to address each class of malware, and 4) when new malware is developed, and we are interested in investigating whether it follows some existing known malicious patterns or not (i.e., labeling malware type). Accordingly, if there is an incident report about a zero-day vulnerability for which no clear patching solution has been developed, MalView can help us to analyze and comprehend the malware associated with the zero-day vulnerability and thus enable us to better identify patches or solutions.

To demonstrate the usability of MalView in analyzing malware visually, we conducted a set of case studies in which the output and behavior of the selected malware were captured. Due to the space limit, we capture and present the processes involved in seven malware types, namely 1) Backdoor, 2) RemoteAccess, 3) Behaviour, 4) Ransomware, 5) EmailFlooder, 6) Hacktool, and 7) Trojan (Info stealer). The following sections demonstrate the applications of MalView to several of these malware types.

The malware experimentation setup needs an isolated and controlled environment so that the malicious code does not propagate or infect other entities in the network. This clean and isolated environment also helps to identify the changes and possible tampering in the system due to the malicious activities of the malware specimen. For this work, we installed three different Windows systems on Oracle VirtualBox: Windows XP, Windows 7, and Windows 10. The Windows Defender services, Windows security services, firewalls, and other automatic security updates were disabled on each of the virtual OSs to prevent any interruption during the malware sample's execution and to capture all the traces of its dynamic behavior. To capture the interaction between the malware and each host system, Procmon was installed on all environments. More specifically, all the user applications on the virtual OS were closed, and the malware process name was added to the monitor filter to capture only the events of the malware executable. Then the executable was run for two minutes before the time-ordered system activities were saved from Procmon and fed to MalView.

Since MalView depends on the output of Procmon, the amount of information it visualizes depends on how long Procmon is executed. A shorter execution time also reduces the amount of data captured by Procmon. According to our experience with MalView, larger and more complex output and traces produced by Procmon make MalView less effective since the visualization needs to capture a vast number of processes and events. However, a key feature of MalView is to offer different levels of abstraction and complexity. If we

Threat Index [21] published by Check Point, RATs are ranked among the top 10 ''most wanted'' malware.

We captured the run-time behavior of RATs on different Windows versions and visualized the behavior using the visualization tool MalView. The live malware sample was downloaded from the public malware dataset VirusShare [22]. According to a multi-scan report from VirusTotal [23], this sample has a community score of 66 out of 70, i.e., out of 70 detection engines, 66 could identify it as a malicious executable. Figure 4 shows the detailed analysis performed on a RAT sample using MalView. The malicious indicators presented by MalView are as follows:

• Process: The malicious executable spawns processes like explorer.exe, wscript.exe, and svchost.exe. The execution of these processes indicates that the RAT program is trying to start a command prompt and then run some scripts to start a session to monitor the process remotely.
• Registry: The sample RAT performs a large number of registry operations, including the creation of registry keys as well as a query of the registry entries.

Besides the malware-associated events, MalView is also able to capture the recurrent pattern of periodical operations, such as the system process of the Local Security Authority Subsystem Service, lsass.exe, or VirtualBox's vboxservice.exe. The influence of the running platform will be discussed further in Section VI.

A Trojan is a type of malware that pretends to be a benign program, but after installation, it executes hidden code and then performs malicious activities such as deleting or tampering with data, stealing information, running some other scripts, and creating backdoors. In general, it enables the attacker to access the victim's system, and these types of malware are not able to replicate themselves [24].

processes. The visualization helps to discern these low-level operations from the malware to other system processes.

A backdoor is a type of malware that provides unauthorized remote access to the compromised system by exploiting security vulnerabilities. The malware works in the background while hiding from the user. Meanwhile, it enables the attacker to have access to the victim's computer, such as databases and file servers, as well as to run system-level commands. The process of injecting a backdoor is usually performed in two stages: First, a small file, called a dropper, is installed. Second, the dropper downloads the main malicious file from a remote location [26]. It is important to mention that Trojan and backdoor malware are not the same: A Trojan might contain a backdoor, but a backdoor can execute as a stand-alone program without being a part of a Trojan.

medium-sized businesses were under attack by ransomware. Often, these organizations end up paying the ransom. According to a multi-scan report from VirusTotal, the sample studied in this paper has a community score of 47 out of 72, i.e., out of 72 detection engines, 47 could identify it as a malicious executable.

Hacktool is a piece of software that malicious attackers use to gain unauthorized access to users' devices [18]. As of the time of this writing, Microsoft lists 188 active entries as Hacktools, of which 93 are severe, 80 are high, and 15 are moderate in terms of alert levels [20]. The popular attacking channel for Hacktool is via insecure Universal Serial Bus (USB) communication design and Windows Autoplay features [34]. Malicious activities for Hacktool launched from USB include 1) changing registry settings, 2) installing a backdoor, 3) stealing confidential information, and 4) reading data encryption keys. Recently, besides Trojan, Hacktool is also the second most prevalent type of malware embedded in pirated software [35].

Windows XP, 7, and 10, as depicted in Figure 13. In partic-

As described below, these features are able to detect any behavioral patterns in the set of malware studied and thus enable us to classify them according to their dynamic behavior. Instead of trying to generate patterns of interest, in this study, we show how the analysis works based on malware behavior tracing, the kind of information it entails, and how the tool can enable analysts to quickly study the interaction of malware with system internals using selections, the focus and context technique, and aggregations.

With MalView, we focus on the interactions of the malware program with other system internal processes. While Procmon, as the data provider, brings detailed information about each of the processes running in the system, the interval and log activity captured may be subject to the choices of the person performing the capture. To focus on the time interval in which we can witness the most significant amount of malware activity toward other system internal processes, called the busy interval, we applied the focus and context visualization technique in MalView to support 1) a close-up view for individual malware analysis and 2) standardization for malware comparison. To accommodate the context around the focal point, we select the interval that ensures either equal paddings from the first and last interactions to the boundaries of the interval or equal paddings around the peak of the area chart, where the highest number of interactions occurs. Patterns of Bladabindi malware behavior across platforms, (a) Windows XP, (b) Windows 7, and (c) Windows 10, all under the focus and context technique with a busy interval length of 20 seconds, are shown in Figure 15. By mousing over an arc representing a function call, a user can observe detailed information, including the type of operation and the source and target processes. A recurring pattern observed from Bladabindi is the following sequence of calls: a ProcessCreate from explorer.exe to the malware, followed by a ProcessCreate from the malware to netsh.exe. As shown in Figure 5 and Figure 7, different processes produce very different dependency connections in terms of topology, grouping, and volume. However, as presented on the right of Figure 15, the dependencies of the three Bladabindi malware processes across different platforms demonstrate many similarities: the three biggest nodes that have a degree of one are all from registry (green), file (sand color), and dll (grey). For nodes with a degree of two, having connections with both Bladabindi and netsh.exe, their categories are the same regardless of the running platform. For further analysis, these patterns can serve as indicators for such classes of malware.
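
The selection of the busy interval described above can be sketched as follows. This is one possible interpretation rather than MalView's actual procedure: it is assumed here that when all interactions fit inside the interval length, the window is centered between the first and last interaction (equal paddings on both sides), and otherwise it is centered on the densest time bin, standing in for the peak of the area chart. The function name, the one-second bin width, and the sample timestamps are hypothetical.

```python
from collections import Counter


def busy_interval(times, length=20.0, bin_width=1.0):
    """Return (start, end) of a window of `length` seconds.

    `times` are interaction timestamps in seconds since capture start.
    """
    if not times:
        raise ValueError("at least one interaction is required")
    first, last = min(times), max(times)
    if last - first <= length:
        # All interactions fit: equal paddings before the first and
        # after the last interaction.
        center = (first + last) / 2.0
    else:
        # Otherwise center on the densest bin (earliest bin on ties).
        bins = Counter(int(t // bin_width) for t in times)
        peak_bin = max(sorted(bins), key=bins.get)
        center = (peak_bin + 0.5) * bin_width
    # Clamp so the window never starts before the capture does.
    start = max(0.0, center - length / 2.0)
    return start, start + length


if __name__ == "__main__":
    print(busy_interval([30.0, 34.0, 42.0]))
    print(busy_interval([5.2, 61.1, 61.4, 61.9, 90.0]))
```

Using the same interval length for every sample is what makes the resulting close-up views comparable across malware samples and platforms.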

This section compares the features offered by MalView with the ones offered by some other malware visualization tools, including Hybrid [38] and AnyRun [39]. First, we briefly review each visualization tool and then compare their features.

[39], Hybrid [38] and MalView.

This section aims to highlight the key features of AnyRun [39] and Hybrid [38] in comparison with the fea-

Table 1 lists the features classified into these four groups. The features related to dynamic analysis are considerably diverse. As a result, each analysis tool offers its own set of unique features. Given the fact that MalView mostly visualizes the output of Procmon [6], it is primarily a dynamic analysis tool. Depending on how the underlying malware visualization tool is implemented, most of these tools are able to visualize the ''basic'' sets of dynamic data captured through Procmon or similar utilities. For instance, as Table 1 shows, most of the behavioral features are visualizable by these three tools.

The major and key feature that is unique to MalView is the exploration of ''time dependencies between processes'' (Features #15 and #16). The visualization of time and process dependencies is an important part of malware analysis in order to comprehend the nature of the underlying malware.

While static analysis is computationally efficient, its performance could be impaired by packed or encrypted malware. On the other hand, dynamic analysis analyzes actual behaviors from the malware while it is running, so it is more efficient [51]. In their work [52], Shaid and Maarof proposed a method to generate images representing malware API calls. First, the API calls are monitored in the malware behavior capturing step. These calls are then sorted from malicious to less malicious. Finally, each API call is assigned a color depending on its maliciousness level. Similarly, Kancherla et al. [51] proposed to convert the malware into a gray-scale image called byteplot. They then used machine learning (ML) methods (e.g., Support Vector Machines) to analyze the low-level features (e.g., intensity and textures) extracted from the resulting images. Regarding ML approaches, LeDoux and Lakhotia [53] argued that ML has a natural fit with malware analysis, where ML operates by rapidly learning and discovering inherent patterns and similarities in the corpus.

hierarchical structure of the Pythagoras tree is characterized as the distance between nodes at each lower depth of the tree is reduced by √2/2. Thus, the tree helps bring malware with higher similarities into clusters as leaf nodes with shorter distances stay close to one another.

As stated earlier, MalView is primarily a visual analytics

Anderson et al. [66] presented a malware classification system that works based on the combination of static and dynamic features. For static feature extraction, they used three sources, including 1) the binary file, 2) the disassembled binary, and 3) the control flow graph of the disassembled binary file. For dynamic feature extraction, they used the dynamic instruction sequence and the dynamic call sequence. They tested their system using a large malware dataset and achieved 98.07% accuracy with the combined static and dynamic features. They also achieved a 96.14% accuracy by using only static features. Yoo [67] designed the visualization based on the belief that malicious content in an executable file has a unique feature called SOM (Self-Organizing Map). By calculating the SOM and visualizing a specific executable file, the potential portion of the malicious content can be determined, and by checking the generated pattern, the malware family can be detected.

Saxe et al. [68] developed an interactive visualization system for comparing malware samples in a dataset using the extracted features. Based on the presence of the system call sequence, the similarity matrix for the malware dataset is generated. This system also provides a comparison view among malware samples based on their malicious activity. On mobile computing platforms such as Android devices, Jenkins

Analyzing malware through visual behaviors has been studied with the aim to observe the overall flow of a program, discover malicious patterns, and quickly assess the nature of the malware sample [5], [71]. Wagner et al. [72] proposed KAMAS, a knowledge-assisted visualization system for behavior-based malware analysis, which visualizes API call sequences gathered during the execution of malicious software. Our approach aligns with this direction, but we shift the focus to the analysis side with different malware families and the influence of operating systems on malware behavior. In particular, the design decisions and techniques in MalView are applied in the malware analysis domain and derived from visualization principles for time-series data, which is the collection of observations through repeated measurements over time, including but not limited to numerical, geolocation, and text data [73], [74], [75].

They tested their proposed approach with executable and non-executable (PDF format) malware samples.

Gregio and Santos [83] developed an interactive timeline tool for visualizing dynamic malware behavior using various techniques [48]. They ran the given malware in a controlled environment and captured its behavior using a modified version of BehEMOT [84] (a malware behavior monitoring tool).

1296
They captured high-level activities such as file write and delete, process creation and termination, registry reads and writes, mutexes and network operations, and system calls using System Service Dispatch Table (SSDT) hooking, which operates at the kernel level. In addition, they used identification labels provided by VirusTotal [23].

[85] also utilize PCAP to discover the patterns in traffic to explore the intrusive behavior from malware activities.
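
As a rough illustration of the similarity-matrix idea attributed to Saxe et al. [68] earlier in this section, the following sketch compares samples by the presence of short system-call subsequences. It is not the method of [68]: the use of the Jaccard index, the n-gram length, the function names, and the toy traces are all assumptions made here for illustration only.

```python
def call_ngrams(calls, n=2):
    """Return the set of length-n windows present in a call sequence."""
    return {tuple(calls[i:i + n]) for i in range(len(calls) - n + 1)}


def jaccard(a, b):
    """Jaccard index of two sets; two empty sets are treated as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def similarity_matrix(traces, n=2):
    """Build a pairwise similarity matrix from {sample_name: call_sequence}.

    Returns the sorted sample names and a square list-of-lists matrix whose
    entry [i][j] is the Jaccard index of the n-gram sets of samples i and j.
    """
    names = sorted(traces)
    grams = {name: call_ngrams(traces[name], n) for name in names}
    matrix = [[jaccard(grams[a], grams[b]) for b in names] for a in names]
    return names, matrix


if __name__ == "__main__":
    # Hypothetical traces; real traces would come from a sandbox log.
    toy_traces = {
        "sample_a": ["open", "read", "write", "close"],
        "sample_b": ["open", "read", "send", "close"],
        "sample_c": ["connect", "send"],
    }
    names, matrix = similarity_matrix(toy_traces)
    for name, row in zip(names, matrix):
        print(name, [round(value, 2) for value in row])
```

Using presence (sets) rather than counts keeps the measure insensitive to how often a subsequence repeats, which is one plausible reading of "based on the presence of the system call sequence"; a count-based or alignment-based measure would be an equally reasonable alternative.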

This paper introduces MalView, an interactive visualization platform for hybrid analysis and diagnosis of malware. Our approach first represents the behavioral properties of the major malware classes (such as Trojan or backdoor), aiming to capture the common visual signatures of these malicious applications. MalView implements a web-based prototype for demonstrating our approach to analyzing 60 malware samples from seven different classes.

that MalView comparatively implements most of the features offered by the other two tools. In addition, the time and process dependencies, the key features of MalView, are implemented in the prototype, making the analysis more thorough. Given the ability to process, visualize, and analyze the system activities and put them into a comprehensive view, MalView can be informative and of potential interest to developers, engineers, and practitioners outside the laboratory.

There are several lines of research that can be explored through visual analytics when complemented by conventional static and dynamic analysis. The early detection of zero-day vulnerabilities and malware is a grand challenge. There are several machine learning-based approaches for addressing this problem [1], [53]. With the capability of visual analytics facilitating explainable machine learning [86], [87], applying visual analytics techniques to detecting and analyzing unknown and zero-day malware is an interesting research approach that can be explored using MalView. The key feature of MalView is its ability to demonstrate time and process dependencies that occurred during static and dynamic analysis. A potential research direction is to model malware behavior through recurrent neural networks on the visual signatures and then predict malware behaviors or even classify suspicious programs into a particular class of malware. It would also be interesting to model malware samples through genome alignments and then model the malware classification or detection problem through deoxyribonucleic acid (DNA) or sequence matching approaches. The sequence matching might be useful in capturing the core malicious functionalities of obfuscated malware. The obfuscation techniques employed by the obfuscating tools often follow similar for all these obfuscated malicious applications share fully or partially the same core. MalView will offer a visual analytic approach to spot these similar patterns in the execution traces.

International of ASTM, and has been presented at IEEE International Conference on Big Data, IEEE VIS, and EuroVis.

MOITRAYEE CHATTERJEE received the Ph.D. degree from Texas Tech University, in 2020. She is currently an Assistant Professor at New Jersey City University.
Her research work lies at the intersection of machine/deep learning and cyber security. She has been contributing to various NSF and the U.S. Department of Education funded research projects and workshops on digital forensics, generating secure software configuration, mental model development of cyber adversaries, reverse engineering, cloud security & abuse, and malware analysis. She regularly participates in top conferences, such as BlackHat, DefCon, IEEE Bigdata, and IEEE COMPSAC, as a Workshop Organizer, a Program Committee Member, and a Presenter. She has over ten publications in various peer-reviewed journals and conferences with more than 100 citations, since 2018. Besides academic and research involvements, she has eight years of industry experience working in the software consulting industry in countries such as India and Malaysia.

AKBAR SIAMI NAMIN received the Ph.D. degree in computer science from Western University, London, Canada, in August 2008. He is currently an Associate Professor in computer science at Texas Tech University. His research interests and expertise include software engineering, testing and program analysis, software and cyber security and malware analysis, and machine and deep learning. He has coauthored over 100 research articles published in premier journals and venues. His research on cyber security research and education is funded by the National Science Foundation.

TOMMY DANG is currently an Assistant Professor in computer science at Texas Tech University, where he directs the interactive Data Visualization Laboratory (iDVL). His research on big data visualization and visual analytics has appeared in Computer Graphics Forum and IEEE Transactions on Visualization and Computer Graphics and has been presented at IEEE Information Visualization, IEEE Visual Analytics Science and Technology, and EG/VGTC Conference on Visualization. Previously, he was a Postdoctoral Researcher on a DARPA-funded project on biological network visualization at the Electronic Visualization Laboratory, University of Illinois at Chicago, which focuses on advanced virtual reality, notably the CAVE2 hybrid reality environment and the SAGE2 scalable amplified group environment.