An Analysis of Conti Ransomware Leaked Source Codes

In recent years, ransomware attacks have increased worldwide. These attacks aim to lock victims' machines or encrypt their files for ransom. Ransomware families differ in their implementation and techniques: how they spread, the vulnerabilities they leverage, the methods they use to hide their behavior from antivirus software, their encryption methods, and their performance. Conti is a sophisticated ransomware that operates as ransomware-as-a-service. It started in 2019 and had an unprecedented human impact by targeting healthcare systems and cost $45 million. This paper analyzes the Conti ransomware source code leaked on February 27, 2022, by an anonymous individual. We first look at the general code structure. Then, we analyze its flow, starting with its application programming interface disguise techniques, anti-hook mechanisms, command-line arguments, and finally, its multithreaded encryption. We also perform a static and dynamic analysis of the latest known Conti sample in an isolated environment and compare its behavior to its source code flows.

Ransomware as a service (RaaS) is a new trend in the ransomware world. It is a business model that mirrors Software as a Service (SaaS), as shown in Fig. 1. RaaS allows anyone to use pre-created ransomware tools to launch a ransomware attack. RaaS affiliates profit by taking a percentage of each successful ransom payment [5], [6]. Ryuk, Satan, Netwalker, Egregor, and many more are ransomware variants that follow the RaaS ecosystem. One of the most dangerous RaaS ransomware families is Conti, which started its operations in 2019 by targeting healthcare, first responder networks, law enforcement agencies in the U.S., and more than 400 organizations worldwide [7].

The associate editor coordinating the review of this manuscript and approving it for publication was Xiangxue Li.

The Conti ransom is usually tailored to its victims. For example, in May 2021, the backup storage vendor ExaGrid was attacked by the Conti ransomware; the Conti group demanded a $7 million ransom, and ExaGrid managed to negotiate and paid $2.6 million in the end [8]. However, the ransom can go even higher; in May 2021, the Health Services Executive (HSE) in Ireland was attacked by the Conti ransomware and asked for a $20 million ransom, which Ireland refused to pay [9]. According to the FBI, Conti ransom demands have been as high as $25 million [10], making it the most aggressive and profitable ransomware.

In Feb. 2022, the Conti group announced its full support for the Russian government after the Ukraine invasion [11]. The Conti group also threatened to deploy retaliatory measures against critical infrastructure if cyberattacks were launched against Russia [11]. This announcement led to around 60,000 messages from internal Jabber chat logs being leaked by an anonymous individual who showed their support for [...] Conti ransomware uses.
It also describes how it disguises those libraries' names and API names using API hashing, unhooking, and dynamic loading techniques. We also list all its command-line options with their descriptions. Finally, we describe how it can delete Windows shadow copies and its multithreaded encryption process for local and shared network files.

The rest of the paper is organized as follows. In Section II, we highlight some related work on ransomware analysis and on the Conti ransomware. In Section III, we present the Conti ransomware source code analysis, including many subsections based on the execution phases of the ransomware. In Section IV, we present the Conti ransomware's static and dynamic analysis in a controlled and isolated environment. Section V lists some defenses and countermeasures to protect against the Conti ransomware. Finally, in Section VI, we conclude the paper.

In the past few years, ransomware attacks have increased significantly, leading cybersecurity researchers to study these kinds of ransomware and analyze their behaviors. Many researchers suggest various methods for detecting and mitigating some ransomware attacks.

There are standard ransomware analysis techniques. These techniques consist of static analysis and dynamic analysis [13]. Static analysis focuses on analyzing ransomware files without executing them. In [13], the authors statically analyze a Portable Executable (PE) file of the Avaddon ransomware using tools such as PeStudio, x64dbg, and BinaryNinja. They succeed in extracting strings and imported functions from the PE file. These strings and functions can provide helpful information that shows the ransomware's capabilities before executing it.

Recent ransomware families usually implement obfuscation techniques to hide their data from static analysis tools or to delay the analyst [13]. They can also have an anti-debugging mechanism to hide their actual behavior when executing under a debugger [13], [14], [15], [16]. The other downside of static analysis is that the ransomware author can alter the PE files to provide false information that misleads the analyst; for instance, in [13], the authors extract the compilation time from the PE file. This field records when the PE was compiled. Ransomware authors can manually alter this field to provide a false date [13].
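To make the compilation-time field concrete, the following minimal sketch reads the COFF `TimeDateStamp` from a PE header. The function names and the synthetic header builder are hypothetical helpers introduced for illustration; they are not part of the tooling used in [13]. Because the field is just four attacker-controlled bytes, the value should be treated as a hint rather than evidence.

```python
import struct
from datetime import datetime, timezone


def pe_compile_time(data: bytes) -> datetime:
    """Return the COFF TimeDateStamp of a PE image as a UTC datetime."""
    if data[:2] != b"MZ":
        raise ValueError("not an MZ executable")
    # e_lfanew (offset 0x3C) holds the file offset of the PE signature.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("PE signature not found")
    # COFF header after the signature: Machine (2 bytes),
    # NumberOfSections (2 bytes), TimeDateStamp (4 bytes).
    (timestamp,) = struct.unpack_from("<I", data, pe_offset + 8)
    return datetime.fromtimestamp(timestamp, tz=timezone.utc)


def build_minimal_header(timestamp: int) -> bytes:
    """Build a synthetic, non-executable header just large enough to parse.

    Hypothetical helper for demonstration only; it shows how easily
    the timestamp can be set to an arbitrary value.
    """
    dos = bytearray(0x40)
    dos[:2] = b"MZ"
    struct.pack_into("<I", dos, 0x3C, 0x40)  # PE signature right after DOS header
    coff = b"PE\0\0" + struct.pack("<HHI", 0x8664, 0, timestamp)
    return bytes(dos) + coff


if __name__ == "__main__":
    forged = build_minimal_header(0)
    print(pe_compile_time(forged).isoformat())
```

Parsing a header whose timestamp was set to zero yields the Unix epoch, which illustrates why an implausible or suspiciously convenient compile date should not be trusted on its own.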

Almost all existing static analysis tools extract information from sample files without trying to decide whether the file is malware or not. However, in [17], the authors develop a static analysis tool that analyzes malware, extracts information such as the APIs it uses, and then decides whether they are adversarial or not. For example, the tool checks for API names such as SetWindowsHookEx and GetAsyncKeyState. If the analyzed sample uses those APIs, the tool categorizes it as a Keylogger, since those APIs record keyboard strokes. The tool can also identify Ransomware and Backdoor samples using the same method. However, since the tool relies mainly on API names, it produces some false-positive results; it also cannot detect malware that employs evasion techniques such as API name obfuscation and dynamic library loading.
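The idea behind such a name-based classifier, and its blind spot, can be sketched in a few lines. This is not the tool from [17]; the rule table, the function names, and the choice to require all marker APIs for a match are assumptions made for illustration. Only the Keylogger markers named above are used.

```python
# Marker APIs per category. Only the Keylogger entry comes from the text;
# the structure of the table is an illustrative assumption.
CATEGORY_MARKERS = {
    "Keylogger": {"SetWindowsHookEx", "GetAsyncKeyState"},
}

ALL_MARKERS = set().union(*CATEGORY_MARKERS.values())


def normalize(api_name: str) -> str:
    """Strip the ANSI/wide suffix (A/W) when the base name is a known marker."""
    if api_name[-1:] in ("A", "W") and api_name[:-1] in ALL_MARKERS:
        return api_name[:-1]
    return api_name


def categorize(imported_apis):
    """Return the categories whose marker APIs all appear in the import list."""
    seen = {normalize(name) for name in imported_apis}
    return sorted(
        category
        for category, markers in CATEGORY_MARKERS.items()
        if markers <= seen
    )


if __name__ == "__main__":
    # A sample that imports the marker APIs by name is flagged.
    print(categorize(["SetWindowsHookExA", "GetAsyncKeyState", "CreateFileW"]))
    # A sample that resolves APIs at runtime exposes only loader functions,
    # so a name-based rule has nothing to match.
    print(categorize(["LoadLibraryA", "GetProcAddress"]))
```

The second call shows the limitation noted above: when API names never appear in the import table, a purely name-based rule reports nothing.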

The second analysis type is dynamic analysis, also called behavior analysis. In this type, the ransomware is executed in an isolated and controlled environment. In [19], the authors claim that static and dynamic anal- [...]

The authors in [20] suggest using a Markov model and [...]

[...] a specific ransomware module, as shown in Fig. 2. Our analysis focuses on the locker folder, which is responsible for encryption operations. The locker folder contains multiple source and header files. We divide the execution into six phases: API hashing, API unhooking, mutex creation, deleting Windows shadow copies, killing running processes, and multithreaded encryption, as shown in Fig. 3.

Many kinds of ransomware use dynamic API loading and hashing to hide the libraries and API names that they use, concealing their functionality from static analysis and conventional signature-based malware scanners [22]. The Conti ransomware obfuscates all its API calls and library names and resolves them dynamically at runtime. This obfuscation technique ensures that Conti can still access all its APIs without listing them in the import table, which hides them from reverse engineers who inspect the import table.

The Conti ransomware starts execution from the WinMain function in the main.cpp file.

As shown in Fig. 4, the WinMain function starts by invoking the InitializeApiModule function located in the api.cpp file. As shown in Fig. 5, InitializeApiModule calls the GetApiAddr function, which is responsible for loading the kernel32.dll library. The kernel32.dll library provides core functionality used by Windows programs, including reading and writing files; it also exports the LoadLibraryA API function [23]. LoadLibraryA loads a given dynamic link library into the virtual memory of the ransomware and returns its address; the ransomware then uses the GetProcAddress API to access any API in any loaded library. GetProcAddress returns the address of an API given its name and the virtual memory address of its library.

The GetApiAddr function uses the API camouflage technique [24] to hide the API names resolved at runtime by hashing them with the Murmur2A algorithm, as shown in Fig. 6 [...]

[...] it provides a realistic virtual environment for malware and decreases the chance of being detected by malware.
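An analyst who meets hashed API names can often reverse them by hashing the export names of common libraries and building a lookup table. The sketch below assumes the publicly documented reference MurmurHash2A over the raw ASCII name; the seed value, the name list, and the function names are hypothetical, and the leaked code may normalize names or use a different seed, so the table must be validated against constants observed in a real sample.

```python
M = 0x5BD1E995
R = 24
MASK = 0xFFFFFFFF


def _mmix(h: int, k: int) -> int:
    """One mixing step of MurmurHash2A on 32-bit values."""
    k = (k * M) & MASK
    k ^= k >> R
    k = (k * M) & MASK
    h = (h * M) & MASK
    return h ^ k


def murmur2a(data: bytes, seed: int = 0) -> int:
    """Reference MurmurHash2A (32-bit, little-endian block reads)."""
    h = seed & MASK
    length = len(data)
    full = length - (length % 4)
    for i in range(0, full, 4):
        h = _mmix(h, int.from_bytes(data[i:i + 4], "little"))
    # The 0-3 leftover bytes are folded in as one little-endian integer.
    h = _mmix(h, int.from_bytes(data[full:], "little"))
    h = _mmix(h, length & MASK)
    h ^= h >> 13
    h = (h * M) & MASK
    h ^= h >> 15
    return h


def build_lookup(export_names, seed: int = 0):
    """Map hash -> export name so hash constants can be resolved to names."""
    return {murmur2a(name.encode("ascii"), seed): name for name in export_names}


if __name__ == "__main__":
    ASSUMED_SEED = 0  # hypothetical; the real seed must come from the sample
    table = build_lookup(["LoadLibraryA", "GetProcAddress", "CreateFileW"], ASSUMED_SEED)
    observed = murmur2a(b"GetProcAddress", ASSUMED_SEED)  # stands in for a constant from a binary
    print(hex(observed), "->", table.get(observed, "<unknown>"))
```

In practice the export list would be taken from the libraries a sample is expected to load, and unresolved constants indicate either a different seed or a modified hash routine.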

B. API-UNHOOKING MECHANISM

We explain the API hooking technique before diving into the Conti ransomware's second call, which involves an API unhooking mechanism. Many new generations of antivirus software and Endpoint Detection and Response (EDR) solutions have a real-time protection feature. This feature is a behavior-based dynamic malware analysis that monitors the activities of all executing processes in real time, and it can detect malware by its suspicious patterns of behavior. For this feature to work, the protection software must inject its code into these running processes; the injected code then hooks the targeted Windows API calls. API hooking allows the protection software to see which API function is called, along with its parameters [28]. API hooking can be implemented to be lightweight, with no effect on computer performance [29]. Unfortunately, many malware families can detect API hooking, and they will try to apply an API unhooking technique, as we will see with the Conti ransomware. We should mention that the API unhooking technique is not enough to prevent this [...]

VOLUME 10, 2022

The Debugger hook relies on a debugger that is executed alongside the target application. The debugger sets a breakpoint at the entry point of each API [35]. If the targeted application reaches a breakpoint, it throws a debug exception. The debugger catches this exception, and the exception address points to the intended API, which is how API hooking is achieved. Because the Debugger hook technique relies on a debugger, it is easy for malware to detect; it also uses breakpoints with a predictable instruction, and malware can detect such breakpoints using simple if-else statements [33].

The Inline Hook technique operates by first copying the original instructions at the entry point of a target API function to a new memory location; the copied instructions are called the Trampoline function [33]. Then, the entry point of the target API function is overwritten with new instructions that redirect its execution to a Detour function [37]. Finally, the [...]

The HandleCommandLine function is defined in the app.cpp file, as shown in Fig. 13. The ransomware accepts four command-line arguments, as shown in Table 1.

The Conti ransomware tries to delete all system shadow copies before encrypting files to maximize its damage. When the DeleteShadowCopies function in the locker.cpp file is invoked, it starts by initializing the Component Object Model (COM) library using the CoInitializeEx API. Then, using the CoInitializeSecurity API function, the ransomware changes the security levels of the COM object by passing -1 as the value of the cAuthSvc parameter. Next, the Windows Management Instrumentation (WMI) is initialized using the CoCreateInstance API function; both WMI and the WMI query language are obtained through the IWbemLocator::ConnectServer method. To avoid WMI authentication, the ransomware changes the WMI proxy security levels using the CoSetProxyBlanket API function by setting the RPC_C_AUTHZ_NONE flag. The shadow copy IDs need to be identified; this [...]

The Conti ransomware uses multiple threads to encrypt files. To determine the number of threads it needs to create, it uses the GetNativeSystemInfo API function to get the number of processors in the machine. If the encryption mode is [...]

Each created thread waits for a task in the TaskList queue; when a new task is added, the filename is extracted; if the filename is the stop marker value "stopmarker", the thread is terminated. Otherwise, if the restart manager library is loaded, the RmStartSession, RmGetList, and RmShutdown API functions are used to terminate every application process that is using the file, which makes the file available for encryption.

The ChaCha20 algorithm, a variant of the Salsa20 [40] encryption algorithm, is used for file encryption. Its implementation is publicly available online. It is stored inside the ransomware in a folder named "chacha20". When a file becomes available for encryption, the GenKey function from the locker.cpp file is first invoked to generate the required encryption keys. The CryptGenRandom API function generates a 32-byte random key and an 8-byte random initialization vector (IV), which are stored in a FileInfo structure. Next, the generated 32-byte random key is encrypted using the RSA public key. Then, the encryption method is determined based on the file extension and size, as described in Table 3. Before the encryption, the first bytes of the file are overwritten with details about the encryption method and encryption key used. Finally, the file is encrypted, and its extension is changed to .EXTEN.

The ransomware loops through all paths contained in the file passed using the -p command-line flag. First, the ransom note file "R3ADM3.txt" is written in each path. Next, the FindFirstFileW and FindNextFileW API functions are used to iterate through each directory's content. If the item name is "." or "..", it is ignored. If the item is a folder and its name is one of tmp, winnt, temp, thumb, $Recycle.Bin, $RECYCLE.BIN, System Volume Information, Boot, Windows, or Trend Micro, it is ignored. If the item is a file and its name or extension is one of .exe, .dll, .lnk, .sys, .msi, R3ADM3.txt, or CONTI_LOG.txt, it is ignored. If the item is a directory, the described process is repeated recursively for all its content. Each non-ignored file is passed to the first available thread for encryption. After finishing the paths passed using the -p command-line flag, the ransomware uses the GetLogicalDriveStringsW API function to get a list of available drives. Then, the root path is obtained for each available drive, and the process explained above is repeated for each subdirectory and file.

After encrypting local files, the ransomware tries to encrypt shared files. The EnumShares function in the network_scanner.cpp file is invoked; it uses the NetShareEnum API function to get information about shared resources. It loops through all resources; if a resource is a disk drive, a special share (IPC$ communications, ADMIN$ remote administration, administrative shares), or a temporary share, the resource share path is extracted. The process explained above is then repeated for each subdirectory and file under each path. [...]

If the IP address conforms to one of the above masks, a thread is created to scan the IP address's subnet for possible addresses from 0 to 255; TCP is used to connect to each possible address on SMB port 445; for each successful connection, the valid IP address is stored in a [...]

We list all API functions used by the Conti ransomware in Table 4.
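From a defender's point of view, the subnet sweep described above leaves a recognizable trace: one host attempting TCP connections to many distinct addresses of the same /24 on port 445. The following sketch flags that pattern in a list of connection records. The record format, the function name, and the threshold are assumptions for illustration; a real deployment would read flow logs or a packet capture and tune the threshold to the network.

```python
import ipaddress
from collections import defaultdict

SMB_PORT = 445


def find_smb_sweeps(connections, threshold=20):
    """Flag sources that contact many distinct hosts of one /24 on port 445.

    connections: iterable of (src_ip, dst_ip, dst_port) tuples.
    Returns {(src_ip, subnet): distinct_destination_count}.
    The threshold is a hypothetical default, not a recommended value.
    """
    targets = defaultdict(set)
    for src, dst, port in connections:
        if port != SMB_PORT:
            continue
        # Collapse the destination to its /24 network, e.g. 192.168.244.0/24.
        subnet = ipaddress.ip_network(f"{dst}/24", strict=False)
        targets[(src, str(subnet))].add(dst)
    return {
        key: len(dsts)
        for key, dsts in targets.items()
        if len(dsts) >= threshold
    }


if __name__ == "__main__":
    # Synthetic records: one host sweeping a /24, one host using a single share.
    sweep = [("192.168.244.10", f"192.168.244.{i}", 445) for i in range(256)]
    normal = [("192.168.244.20", "192.168.244.5", 445)] * 50
    print(find_smb_sweeps(sweep + normal))
```

Counting distinct destinations rather than raw connections keeps a busy but legitimate client of a single file server from being flagged.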

In this section, we use static and dynamic analysis tools to analyze a Conti ransomware sample file and compare its behavior to its source code flows. We obtain a copy of the latest known Conti ransomware executable file from the internet, which we use to perform the analysis.

A. STATIC ANALYSIS

We start by preparing an isolated test environment. First, we use VirtualBox to run a virtual Microsoft Windows 10 operating system. Then we install the necessary analysis tools, such as PeStudio, Process Monitor, Wireshark, and x64dbg.

Using PeStudio, we extract the malware's MD5 and SHA1 hash values, as shown in Fig. 17. Those values are considered Indicators of Compromise (IoCs). However, since the Conti group is active, they change the ransomware signatures with each version to prevent antivirus software from recognizing it and stopping it from executing.

We also extract its strings; as described in its source code, most of the strings are encrypted, but we notice that the ransom note file content is not encrypted, as shown in Fig. 18. Furthermore, the file extension that Conti appends to each file it encrypts is also not encrypted, as shown in Fig. 19. This [...]

We start by executing the Conti ransomware in a newly installed Windows 10 without any updates to the system or Windows Defender. Windows Defender discovers the attack, but too late: the ransomware has already finished encrypting the machine's files. We then try again with an older version of the Conti ransomware, and Windows Defender detects the malicious file and stops the attack.

When we execute the Conti ransomware, it starts by scanning its own network subnet and trying to connect to other devices using SMB port 445, as shown in the Wireshark capture in Fig. 21. Furthermore, as seen in its source code, Conti scans each possible IP address that matches the pattern of our default gateway, 192.168.244.*. Fig. 22 [...] we create a shared folder on our host machine, and Conti manages to encrypt its content. [...]

The Conti ransomware has three different encryption routines for files based on their size and type. We create three text files to inspect the Conti encryption routines: small, medium, and large. The small file size is 4 bytes, the medium file size is 1790082 bytes (1.70 MB), and the large file size is 8950410 bytes (8.53 MB). The first encryption routine is Full Encryption, which targets files that are smaller than 1.04 MB or have one of the extensions listed in Table 3. In the Full Encryption mode, Conti generates a random encryption key for the ChaCha20 encryption algorithm. It uses this key to encrypt the entire file content and encrypts this encryption key using a hard-coded RSA public key, shown in Fig. 27. Finally, it writes the encrypted content back to the file, followed by the encryption key, the encryption mode value (24 for Full Encryption), and the original file size. The small text file we created is encrypted, as illustrated in Fig. 28.

The second encryption routine is Header Encryption, which targets files with a size between 1.04 MB and 5.24 MB. In this encryption mode, Conti encrypts only the first 1 MB of the file and then writes the encrypted content back to the file, followed by the rest of the unencrypted file content, followed [...]

Conti extracts the encryption key from each file and then decrypts it using the RSA private key. Next, it extracts the file size and uses it to extract the encrypted file content correctly. Finally, it extracts the encryption mode value and uses it alongside the encryption key to decrypt each file accordingly.
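For triage during incident response, the size rules above can be turned into a small helper that predicts which routine a file of a given size would fall under. The sketch assumes that the quoted 1.04 MB and 5.24 MB boundaries correspond to 1 MiB (1,048,576 bytes) and 5 MiB (5,242,880 bytes) and that the upper boundary is inclusive; both are assumptions. It also ignores the extension rules of Table 3 and labels anything larger simply as "other", since only the size thresholds are taken from the text.

```python
MIB = 1024 * 1024
FULL_LIMIT = 1 * MIB     # assumed to be the "1.04 MB" boundary
HEADER_LIMIT = 5 * MIB   # assumed to be the "5.24 MB" boundary


def expected_routine(size_bytes: int) -> str:
    """Predict the size-based encryption routine for a file.

    Extension-based overrides (Table 3) are not modeled here.
    """
    if size_bytes < FULL_LIMIT:
        return "full"
    if size_bytes <= HEADER_LIMIT:  # inclusive upper bound is an assumption
        return "header"
    return "other"


if __name__ == "__main__":
    # The three test file sizes used in the dynamic analysis.
    for size in (4, 1790082, 8950410):
        print(size, expected_routine(size))
```

Applied to the three test files, the helper places the 4-byte file under Full Encryption, the 1790082-byte file under Header Encryption, and the 8950410-byte file under the remaining routine, which matches the small, medium, and large roles the files play in the experiment.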

The Conti ransomware spreads using many tactics and techniques, and we can protect our systems from such attacks by knowing those tricks. [...] updates is essential to protect against ransomware attacks.

Unfortunately, the Conti group knows that many users do not patch their systems regularly and wait for weeks or even months, making their systems vulnerable and easy targets.

The Conti ransomware can also encrypt files over an SMB connection, as seen in its source code and dynamic analysis. [...]

The leaked Conti ransomware source code shows us that this ransomware is, without a doubt, modern and sophisticated, with unique techniques. In this paper, we analyzed the Conti ransomware source code and illustrated its methods of disguising itself from antivirus software and its unique multithreaded encryption. We also listed its API obfuscation tactics and all of its API function calls.

Unfortunately, we believe that many less mature ransomware groups will take advantage of this leak to enhance their ransomware tools, and many Conti-like ransomware variants will start to emerge shortly.

As future work, we plan to analyze the other leaked Conti files. Those files consist of internal logs, Jabber chat messages, and additional source code for some web applications the Conti group uses to manage its business. By analyzing those files, we can gain insight into how such a group works and understand its hierarchy and operations. We also plan to design a system with a detection mechanism to detect Conti-family ransomware. The system should be tailored around the techniques and tricks used by this ransomware that we identified in this paper.