Utilizing Cyber Threat Hunting Techniques to Find Ransomware Attacks: A Survey of the State of the Art

Ransomware is one of the most harmful types of cyber attacks and causes major concern on a global scale. It makes victims’ resources unusable by encrypting data or locking systems in order to extort ransom payments. Ransomware families continue to evolve into new variants. Moreover, cybercriminals use advanced techniques to develop ransomware, making it harder for anti-malware systems to detect. Ransomware solutions need timely and effective detection and response capabilities to discover uncommon behavior before sensitive data is lost. Cyber threat hunting (CTH) is a novel proactive malware detection approach that combines cyber threat intelligence (CTI) methods with data analysis methods. However, most present CTH solutions depend on internal data sources and reactive techniques to detect unusual activities. An effective CTI technique is required to obtain knowledge from external data sources and combine it with internal sources to enhance hunting capabilities. The CTH approach then needs a suitable data analysis technique to obtain valuable insights into abnormal patterns in running activities at an early stage. In this study, we investigate practical CTI approaches and different CTH models. Subsequently, we discuss ransomware research directions for detecting known and unknown ransomware attacks. We also discuss the available ransomware datasets used in present ransomware studies.


I. INTRODUCTION
In 2020, ransomware attacks against healthcare systems increased during the COVID-19 pandemic. Healthcare institutions face disruption of medical services and long-term consequences because of ransomware attacks [1]. Ransomware attacks also target both individuals and organizations for financial gain [53]. In 2021, 66 percent of surveyed companies were attacked by ransomware, up from 37 percent in 2020 [2]. Ransomware is a form of malware that encrypts a user's files or locks the system. The aim of a ransomware attack is to obtain payment from the victim in exchange for unlocking the system or decrypting the victim's data files [3]. The first ransomware was created in 1989 by Joseph Popp, who launched an attack called AIDS, also known as PC Cyborg. He distributed floppy disks containing the malicious scripts to several AIDS researchers [4].
Subsequently, ransomware attacks have continued to evolve using different tactics and techniques. Crypto-ransomware and locker-ransomware are the two main types of ransomware. In a crypto-ransomware attack, the attacker encrypts the victim's valuable data using robust encryption methods, such as Rivest-Shamir-Adleman (RSA) or Advanced Encryption Standard (AES), and keeps the data locked until the victim pays a ransom. In contrast, instead of encrypting data files, locker-ransomware locks the victim's system and requests a ransom payment to unlock it. Attackers primarily design ransomware attacks to extort money from victims. Pre-paid vouchers, premium rate SMS or calls, and online purchases are examples of early ransom payment techniques. Cryptocurrencies or virtual currencies, such as Bitcoin, are currently among the most widely used ransom payment methods [48].
In recent years, security researchers have been investigating and tracking the evolution of various ransomware types. Famous ransomware families include TeslaCrypt, CryptoWall, Locky, Cerber, and WannaCry [5]. CryptoWall appeared in 2014 as locker ransomware that spreads by phishing emails, exploit kits, and infected attachments. In 2015, TeslaCrypt was distributed by exploit kits, and it used an AES encryption algorithm to encrypt all user data. Moreover, Cerber is frequently distributed by exploit kits and exchanged on hacker forums as ransomware-as-a-service (RaaS). Cerber starts by encrypting user data using the AES algorithm without connecting to the command and control (CC) server. Locky ransomware emerged in 2016 and used macros embedded in Microsoft Office documents. As a custom method, Locky uses encrypted communication for Tor and Bitcoin payments. WannaCry was one of the most severe ransomware attacks in 2017, affecting more than 300,000 computers in over 100 countries [6]. WannaCry employs the EternalBlue exploit tool set to exploit the SMB vulnerability in Microsoft Windows and uses the AES algorithm to encrypt data files [7]. Ransomware can attack various platforms, including PCs, mobile devices, and Internet of Things (IoT) devices. Ransomware attacks on mobile devices have increased since 2017, and they have a variety of impacts, such as stealing important data or locking mobile devices. Ransomware attacks on IoT devices have recently become a challenge [8]. Currently, adversaries do not need to develop their own ransomware; instead, they can purchase it from another adversary, a practice known as RaaS. RaaS makes it easier for inexperienced actors to create and launch ransomware attacks.
Most existing ransomware solutions are reactive. File hashes, IP addresses, and DNS records are examples of known indicators used in reactive approaches [9]. Adopting reactive methods to identify ransomware can result in data and system damage, because the threat is recognized only after it is already known. Employing a proactive defense strategy is therefore the safer alternative against ransomware attacks. Proactive approaches use indicators and behavioral artifacts to identify malicious threats. Registry paths, system calls, user and authentication records, DNS queries and responses, and other run-time activities are captured by behavioral indicators [10] [11].
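The limitation of reactive indicators can be illustrated with a minimal sketch of file-hash matching. The sample names, payload bytes, and the `known_bad` indicator set are hypothetical and chosen only for illustration; the sketch assumes SHA-256 as the hash function, which the surveyed works do not prescribe.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of a byte string as hex."""
    return hashlib.sha256(data).hexdigest()


def match_known_hashes(samples: dict, known_bad: set) -> list:
    """Return the names of samples whose hash appears in the indicator set."""
    return sorted(
        name for name, data in samples.items() if sha256_hex(data) in known_bad
    )


# Hypothetical indicator feed containing one previously seen payload.
known_bad = {sha256_hex(b"malicious-payload-v1")}

# Hypothetical files; the second differs from the known payload by one byte.
samples = {
    "invoice.exe": b"malicious-payload-v1",
    "invoice_v2.exe": b"malicious-payload-v2",
    "notes.txt": b"hello",
}

print(match_known_hashes(samples, known_bad))
```

Only the exact known payload is flagged; the one-byte variant passes unnoticed, which is the gap that behavioral indicators are meant to close.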
Cyber threat hunting (CTH) is a proactive approach used to secure critical assets. CTH is performed proactively in the environment, without waiting for threat alerts [12]. CTH's major purpose is to identify hidden threats, disable them, and establish policies to avoid them in the future. It integrates cyber threat intelligence and data analysis methods to find evidence of a threat in a network. Cyber threat intelligence (CTI) is the process of seeking and collecting information beyond what is readily available, such as event logs [13]. Evidence-based knowledge from outside the security logs is necessary to take proactive steps and to support the decision-making process.
Ransomware has received much attention recently due to the rise in ransomware attacks on individuals, businesses, and governments worldwide. Ransomware attacks are constantly changing and becoming more sophisticated. From this perspective, this study reviews the literature on CTI and CTH for both malware and ransomware, together with its limitations and gaps. It also investigates current CTH techniques and their use of CTI techniques. Related studies and available datasets are reviewed to highlight the main trends. In addition, potential research directions for ransomware studies are described.
The remainder of this paper is organized as follows: Section 2 provides a summary of ransomware studies and the existing CTI and CTH techniques. Section 3 provides an overview of cyber threat intelligence techniques. Section 4 presents a detailed overview of malware analysis approaches. Section 5 discusses cyber threat hunting techniques. Section 6 discusses the evolution of ransomware attacks and research directions. Section 7 discusses datasets of ransomware detection studies. Finally, Section 8 provides the conclusion of this study.

II. BACKGROUND
Ransomware targets computer, mobile, cloud-based, IoT, ICS, and other systems as an extortion-based cyber threat [14] [15]. Researchers have developed several taxonomies to help understand how ransomware operates. Specific countermeasures should be implemented to secure different digital assets. Ransomware is classified into two categories based on the confiscated resources: locker-ransomware and crypto-ransomware. In a locker-ransomware attack, the victim cannot reach system services; however, data is not compromised. Locker-ransomware is classified by the type of non-data resources it locks, such as operating systems, applications, services, user interfaces, and other utilities. Crypto-ransomware encrypts data resources and requests a ransom payment from users. Crypto-ransomware is classified into three types based on the encryption process: symmetric, asymmetric, and hybrid.
A deep understanding of the ransomware attack steps is required to discover an effective solution. Infection, installation, communication, execution, extortion, and emancipation are the most common ransomware attack phases. Figure 1 depicts the steps involved in a typical ransomware attack.
• Infection phase: This phase begins when the malicious ransomware code enters the victim's system. Different infection vectors for ransomware attacks include affiliate programs, exploit kits, and email-based malvertising campaigns [16].
• Installation phase: This phase begins following ransomware infection, when the ransomware installs itself on the system and takes control without attracting attention.
• Communication phase: This phase starts when the ransomware establishes an initial connection with the main adversary to carry out the following level of actions. The ransomware initiates a connection with a command and control (CC) server [17].
• Extortion phase: This phase starts when the ransomware notifies users that they have been attacked and must obey the attacker's instructions. A ransom note is shown to the victim, which uses social engineering techniques to persuade them to pay the ransom.
• Emancipation phase: This phase starts after receiving the ransom payment when the attacker unlocks system resources. Following ransom payment, attackers would send a link to infected victims that contains a specific decryption tool for some crypto-ransomware attacks.
This article has been accepted for publication in IEEE Access. This is the author's version which has not been fully edited and content may change prior to final publication.
Security researchers have investigated two defense approaches for ransomware attacks: signature-based and behavior-based approaches. Signature-based methods, often known as static analysis, refer to the process of examining a malicious file without its execution. Because of the growth of ransomware attacks and anti-forensic tactics such as packing and obfuscation, signature-based approaches have limitations. Behavior-based approaches, often known as dynamic analysis, refer to running a malicious program and observing its activities in the system.
Behavior-based approaches can capture the detailed characteristics of ransomware behavior. Employing a behavior-based approach as a defensive strategy is therefore more effective in preventing ransomware attacks from carrying out damaging actions.

III. LITERATURE REVIEW
Protecting data and systems from ransomware attacks requires a proactive solution. A proactive solution refers to the early recognition of the malware threat. CTI and CTH are novel techniques used to spot cyber threats in the environment.

A. CTI STUDIES
CTI is a proactive method that gathers valuable information from various sources to provide insight into the most recent cyber vulnerabilities and threats. Discovering and extracting such vital threat information is crucial for cybersecurity researchers and practitioners to improve awareness. Williams et al. [18] utilized a web crawling technique to find proactive cyber threat intelligence (CTI) in hacker forums. They implemented the Depth-First Search (DFS) technique, an incremental crawling method for collecting attachments while avoiding various popular anti-crawling measures.
Li et al. [19] relied on articles focusing on security event-related topics to build a proactive CTI. They collected 131 articles from the Internet and built an SVM model for data analytics. Samtani et al. [20] presented a methodology for implementing a more proactive CTI by mining hacker communities for source codes, tutorials, and attachments. The framework employs social network analysis methodologies and metrics to identify the key individuals behind discovered hacking assets. Ebrahimi et al. [21] focused on cyber threats hosted by the deep net market to avoid significant financial losses. They developed semi-supervised cyber threat identification, an integral part of the CTI, used to detect various types of threats and their primary data sources. They created a web crawler that used a combination of approaches to combat deep net marketplace anti-crawling mechanisms. Table 1 summarizes the proactive CTI strategies studied.

B. CTH STUDIES
Cyber threat hunting (CTH) is an approach that integrates CTI with data analysis methods to detect and respond to threats proactively. Homayoun et al. [23] developed sequential pattern mining as a ransomware hunting mechanism. They tried to hunt abnormal behavior within the first 10 seconds of ransomware execution by mining system logs of file system activities, registry, and Dynamic Link Libraries (DLL). Sequential pattern mining was implemented to discover Maximal Frequent Patterns (MFP) and combined with machine learning classification techniques to separate ransomware from benign samples and to distinguish ransomware families.
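The idea of mining early activity logs for recurring sequences can be sketched as follows. This is a simplified illustration, not the method of [23]: it counts contiguous event pairs rather than maximal frequent patterns, and the event names, timestamps, and the `min_support` threshold are assumptions made for the example.

```python
from collections import Counter


def early_events(events, seconds=10.0):
    """Keep the names of events logged within the first `seconds` of execution."""
    return [name for timestamp, name in events if timestamp <= seconds]


def frequent_patterns(traces, length=2, min_support=2):
    """Return contiguous subsequences of `length` seen in at least
    `min_support` traces, mapped to the number of traces containing them."""
    support = Counter()
    for trace in traces:
        # A set ensures each pattern is counted at most once per trace.
        seen = {tuple(trace[i:i + length]) for i in range(len(trace) - length + 1)}
        support.update(seen)
    return {p: c for p, c in support.items() if c >= min_support}


# Hypothetical (timestamp in seconds, event name) logs from three samples.
logs = [
    [(0.5, "file_read"), (1.0, "file_write"), (1.2, "file_rename"), (12.0, "net_connect")],
    [(0.2, "reg_write"), (0.9, "file_read"), (1.5, "file_write"), (2.0, "file_rename")],
    [(0.1, "file_read"), (3.0, "file_read"), (11.0, "file_write")],
]

traces = [early_events(log) for log in logs]
patterns = frequent_patterns(traces)
print(patterns)
```

The resulting pattern counts could then serve as features for a classifier, which is the role the mined patterns play in the approach described above.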
Mavroeidis et al. [24] suggested a Sysmon log-based automated threat hunting system. Sysmon refers to a Windows system monitor service for monitoring and logging system activities. The proposed solution presents an automated threat assessment system that analyzes the continuous incoming feeds from Sysmon logs to classify the system processes to different threat levels. Detection was performed based on a predefined knowledge base.
Darabian et al. [25] developed an integrated multi-view learning approach that uses multiple feature views rather than a single view to detect malware behavior on diverse platforms. A weight is added to each view to enhance the hunting approach; the views include header information, ByteCodes, API calls, OpCodes, permissions, and the attacker's intent. An SVM model was used to assign weights to the obtained views. The proposed solution was employed on Windows, Android, and IoT platforms.
Naik et al. [26] developed triaging methods as hunting techniques to determine the similarity of two ransomware samples. They applied four evaluation methods: the import hashing method (IMPHASH), the SSDEEP and SDHASH fuzzy hashing methods, and YARA rules. The results are reported as the number of detected samples and a comparison between the four methods, without further performance metrics.
FIGURE 1. Ransomware attack steps [14].
Jadidi et al. [27] proposed an industrial control system threat hunting framework (ICS-THF). The proposed framework focuses on detecting cyber threats against ICS devices. The proposed framework consists of three phases: threat hunting trigger, threat hunting, and cyber threat intelligence. The first phase includes events that could trigger the hunting phase. Then, the second phase uses a combination of the MITRE ATT&CK matrix and a diamond model of intrusion analysis to generate hunting hypotheses and predict future behavior. Finally, the third phase generates indicators of compromise (IoCs) for future threat hunting.
HaddadPajouh et al. [28] developed an IoT malware hunting method using a Long Short-Term Memory (LSTM) structure based on their OpCode sequences. Their findings demonstrated that stacked LSTM techniques could achieve high accuracy and handle input sequences of any length.
Jahromi et al. [29] developed an Extreme Learning Machine (ELM) approach that includes two hidden layers. They aimed to achieve an extremely fast learning speed, good generalization capability, straightforward implementation, and reduced human intervention.
Homayoun et al. [30] developed a system for deep ransomware threat hunting in the fog layer. They used LSTM and CNN for classification to discover ransomware attacks within the first 10 seconds of program execution.
Al-rimy et al. [31] proposed two novel techniques, incremental bagging (iBagging) and enhanced semi-random subspace selection (ESRS), which are combined into an ensemble-based detection model. The iBagging technique is used to build incremental subsets that show the evolution of crypto-ransomware behavior over various attack phases. The ESRS technique is then used to construct feature spaces and exclude weak features.
Al-rimy et al. [32] proposed a novel Redundancy Coefficient Gradual Upweighting (RCGU) technique that improves redundancy-relevancy tradeoffs during feature selection. The RCGU technique increases the weight of the redundancy term in proportion to the number of selected features. The Enhanced MIFS (EMIFS) was developed by combining the RCGU technique with the Mutual Information Feature Selection (MIFS) technique. Moreover, MM-EMIFS was developed as an improvement that incorporates the MaxMin approximation with EMIFS to prevent redundancy overestimation. The authors noted that a limitation of the proposed work is that the conditional redundancy term is not considered when calculating feature importance.
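For orientation, the relevancy-redundancy tradeoff can be written out using the classical MIFS criterion, which scores a candidate feature $f_i$ against the class $C$ and the already selected set $S$. The first line is the standard MIFS form; the second line is only one literal reading of "weight proportional to the number of selected features" and is an assumption made here for illustration, not the formula from [32]. The symbols $\beta_0$ and $J$ are notation introduced for this sketch.

```latex
% Classical MIFS score: relevancy minus weighted redundancy
J(f_i) = I(f_i; C) - \beta \sum_{f_s \in S} I(f_i; f_s)

% Illustrative assumption: redundancy weight grows with the number of selected features
\beta(|S|) = \beta_0 \, |S|
```

With a constant $\beta$, the redundancy term is weighted the same whether one or many features have been chosen; letting the weight grow with $|S|$ penalizes redundancy more strongly in later selection rounds.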
Kok et al. [33] proposed a Pre-Encryption Detection Algorithm (PEDA) that aims to discover crypto-ransomware attacks in the pre-encryption phase using two levels. The first level uses static analysis to compare the file signature with known ransomware signatures. The second level uses dynamic analysis with a learning algorithm model that analyzes the API calls generated in the pre-encryption stage.
Darem et al. [34] proposed an adaptive behavioral-based incremental batch learning malware variants detection model (AIBL-MVD) using concept drift detection and sequential deep learning.
Roy et al. [35] proposed a deep learning-based ransomware detector (DeepRans). The proposed model monitors the infected host's suspicious activity in the bare metal server network. DeepRans was developed using attention-based Bi-LSTM with Conditional Random Fields to classify the normal and infected host activities.
Pundir et al. [36] proposed a hardware-assisted ransomware detection technique using DL methods. They monitored micro-architectural events using a hardware performance counter to detect abnormal events. They showed that the proposed solution could detect ransomware in 2 milliseconds before encryption. However, hardware data processed in run-time could be corrupted.
Ullah et al. [37] proposed a ransomware detection model using online ML classifiers. The proposed model extracts the run-time features and performs ransomware detection. The model performs detection by tracing the ransomware behavior features during the execution, such as registry, network, and file systems API calls. Their proposed model used a modified decision tree, random forest, and AdaBoost classifiers.
Zhang et al. [38] proposed a deep learning-based model that uses a self-attention mechanism. The authors extracted contextual information to apply the static analysis. They used an N-gram of opcodes to identify ransomware fingerprints in the environment.
Khan et al. [39] proposed a DNAact-Ran system that uses digital DNA sequencing along with ML to detect ransomware. Naïve Bayes, random forest, and sequential minimal optimization classifiers were utilized in the proposed system. The proposed system was able to predict ransomware using DNA sequences that represent a number of features.
Poudyal et al. [40] proposed an AI-based ransomware detection framework (AIRaD). The proposed framework combines static and dynamic analysis to detect ransomware attacks. SVM, logistic regression, random forest, AdaBoost with J48, and J48 classifiers were used in the AIRaD framework. Multi-level analysis was performed on the assembly, DLL, and function calls.
Zuhair et al. [41] proposed a machine learning-based multi-layer ransomware detection system. The proposed solution consists of analysis, learning, and detection phases. The model utilized behavioral analysis to detect unknown ransomware variants. Decision tree and naïve Bayes classifiers were used in the proposed solution. The first step is ransomware detection using a decision tree, and the second step is ransomware prediction using naïve Bayes decisions. A limitation of the proposed system is the analysis time of the submitted samples, which was limited to 5 minutes.
In summary, the limitations and gaps in current CTH solutions are as follows:

IV. CYBER THREAT INTELLIGENCE TECHNIQUES
Finding reliable intelligence regarding cyber threats helps defend against current attacks in a proactive manner [43]. Many CTI techniques have been proposed for obtaining timely information from trustworthy sources. CTI can provide detailed information related to anticipated cyber attacks. For example, an email designed for phishing attacks could include various vital features such as the attack technique used, attacker information, target information, software, and tools used to launch the attack [44].
The collection and analysis of massive amounts of online threat data present new challenges; addressing them enhances the ability of CTI to mitigate or disable rising attacks [20]. Different capabilities are required to produce comprehensive CTI. These include discovering online sources, extensive data analysis, awareness of web crawling and anti-crawling mechanisms, understanding of foreign languages, knowledge of cyber world terms, and understanding of the complex structures of malicious assets. Malicious assets can be found on different online platforms, such as repositories, IRC channels, and hacker forums, where content and knowledge are exchanged.
A web crawler is a computer program that systematically browses sources on the World Wide Web to search for web content [45]. A web crawler is used for different purposes, such as searching for and extracting information or classifying web content. A crawler retrieves pages, parses their HTML tags, extracts new hyperlinks from these tags, and stores the HTML content. After the data is collected, an analysis technique is applied to the discovered information to understand the critical trends of malicious cyber assets.
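The retrieve, parse, extract, and store loop described above can be sketched with the Python standard library. This is a minimal illustration under stated assumptions: the `fetch` callable and the in-memory `SITE` dictionary with its `forum.example` URLs are hypothetical stand-ins for real network access, and no anti-crawling handling is included.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collect absolute URLs from the href attribute of <a> tags."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for key, value in attrs:
                if key == "href" and value:
                    self.links.append(urljoin(self.base_url, value))


def crawl(start_url, fetch, max_pages=10):
    """Depth-first crawl: retrieve a page, store its HTML, follow its links."""
    stored = {}
    stack = [start_url]
    while stack and len(stored) < max_pages:
        url = stack.pop()
        if url in stored:
            continue
        html = fetch(url)
        if html is None:  # page not available
            continue
        stored[url] = html
        parser = LinkExtractor(url)
        parser.feed(html)
        # Push in reverse so the first link on the page is visited first.
        stack.extend(link for link in reversed(parser.links) if link not in stored)
    return stored


# Hypothetical in-memory site used instead of real HTTP requests.
SITE = {
    "http://forum.example/": '<a href="/t/1">one</a><a href="/t/2">two</a>',
    "http://forum.example/t/1": '<a href="/">home</a><a href="/t/3">three</a>',
    "http://forum.example/t/2": "<p>no links</p>",
    "http://forum.example/t/3": '<a href="/t/2">two</a>',
}

pages = crawl("http://forum.example/", SITE.get)
print(list(pages))
```

Passing `fetch` as a parameter keeps the traversal logic separate from retrieval, so the same loop could be pointed at a real HTTP client in a controlled setting.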

V. MALWARE ANALYSIS
To detect malware, researchers use various techniques, including analyzing files with various tools, extracting static or dynamic features from the analyzed files, and categorizing features to distinguish between malicious and benign software. Malware analysis can be classified into static, dynamic, and hybrid [46]. Malware samples can be analyzed manually or automatically [47]. Automatic analysis requires advanced data science programming skills, whereas manual analysis requires domain expert knowledge.

A. STATIC ANALYSIS
Static malware analysis is applied by reverse engineering, disassembling, or dissecting a malware binary file to analyze the different structural and semantic information found in the binary file. The structure of the malware sample is identified by static analysis without actually executing malicious code. File strings, header information, and functions are examined in basic static analysis. The program commands are examined in more detail in advanced static analysis.
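As a small illustration of basic static analysis, the sketch below pulls printable strings out of a byte buffer and checks a header signature without executing anything. The byte buffer is a hypothetical stand-in for a file read from disk, and the `MZ` check covers only the first two bytes of a Windows executable, which is an assumption about the target format rather than a full header parse.

```python
import re


def extract_strings(data: bytes, min_len: int = 4) -> list:
    """Return runs of printable ASCII characters of at least `min_len` bytes."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [match.group().decode("ascii") for match in pattern.finditer(data)]


def has_mz_signature(data: bytes) -> bool:
    """Check for the two-byte 'MZ' signature that starts Windows executables."""
    return data[:2] == b"MZ"


# Hypothetical file content; a real workflow would read bytes from disk.
blob = (
    b"MZ\x90\x00\x03\x00\x00\x00\x00"
    b"kernel32.dll\x00CreateFileW\x00\x01\x02abc\x00"
    b"YOUR FILES ARE ENCRYPTED\x00"
)

print(has_mz_signature(blob))
print(extract_strings(blob))
```

Imported library names, API names, and ransom-note text recovered this way are typical inputs to the static features discussed in this section; packed or obfuscated samples may expose few such strings.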

B. DYNAMIC ANALYSIS
On the other hand, dynamic malware analysis is applied by observing or debugging a malware program's instructions to evaluate its behavior in an isolated environment. Isolated environments, such as virtual machines or sandboxes, are used to perform the dynamic analysis. API calls, memory and registry changes, parameters, information flows, and network activities are examined in dynamic analysis. Dynamic malware analysis has two levels: basic and advanced. Basic dynamic analysis uses monitoring tools to examine the malware's behavior, whereas advanced dynamic analysis uses debugging tools to execute each command individually and view command contents such as variables, parameters, and memory areas.

C. HYBRID ANALYSIS
In addition, hybrid malware analysis is a file analysis that combines both static and dynamic analysis aspects. It extracts the structural and semantic information of the binary file besides the run-time information.
Static analysis is easier and faster than dynamic analysis; however, static analysis cannot analyze malicious software that uses obfuscation, packing, or polymorphic techniques. For detecting unknown malware threats, dynamic analysis is more effective. Although dynamic analysis shows the malware's actual functionality, some malware variants can recognize that they are being analyzed in isolated or closed environments and hide their actual behavior.

VI. CYBER THREAT HUNTING TECHNIQUES
The concept of CTH describes combining an effective CTI method with a robust data analysis technique to detect cyber threats. Cyber attacks are evolving and becoming more sophisticated because of the advanced level of threat actors' skills [48]. Various data analysis techniques have been proposed to detect cyber threats. Machine learning-based techniques account for the great majority of current CTH methods. The main development trends of CTH are described in the following paragraphs.

A. TRADITIONAL MACHINE LEARNING APPROACHES
Machine learning (ML) is a part of artificial intelligence where machines learn from data or experience to automate the building of analytical models [49]. The efficiency of the ML model depends on the quality and performance of the chosen learning algorithm. Supervised learning, unsupervised, semi-supervised, and reinforcement learning are the four major categories of ML algorithms.
ML involves several steps such as data collection, data cleaning and preparation, model building, model evaluation, and model deployment. During data preparation, a set of features is extracted from the software to describe it and classify it as benign or malware. The features are then used to train the model to solve the specified problem. Malware can be identified based on different features categorized according to the type of malware analysis approach: static and dynamic. Figure 2 shows the malware feature taxonomy.
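The build, train, and evaluate steps can be sketched with a deliberately simple nearest-centroid classifier written in plain Python. The three features (files written per second, registry writes, network connections), their values, and the choice of classifier are assumptions made for illustration; none of them is taken from the surveyed studies.

```python
def train_centroids(samples):
    """Model building: average the feature vectors of each class."""
    sums, counts = {}, {}
    for features, label in samples:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}


def predict(centroids, features):
    """Assign the label of the closest centroid (squared Euclidean distance)."""
    def dist2(centroid):
        return sum((a - b) ** 2 for a, b in zip(features, centroid))
    return min(centroids, key=lambda label: dist2(centroids[label]))


def accuracy(centroids, samples):
    """Model evaluation: fraction of samples classified correctly."""
    hits = sum(predict(centroids, f) == label for f, label in samples)
    return hits / len(samples)


# Hypothetical prepared data: [files_written_per_sec, registry_writes, net_connections]
train_set = [
    ([50, 4, 1], "ransomware"),
    ([60, 6, 2], "ransomware"),
    ([2, 1, 1], "benign"),
    ([4, 0, 3], "benign"),
]
test_set = [([55, 5, 1], "ransomware"), ([3, 1, 2], "benign")]

model = train_centroids(train_set)
print(accuracy(model, test_set))
```

In practice the feature extraction step dominates the effort, and the tiny dataset here says nothing about real detection accuracy.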

B. DEEP LEARNING APPROACHES
Deep learning (DL) is a subset of ML that learns from data, with computation and processing performed through multilayer neural networks [50]. DL models require a large amount of data for each problem domain to construct a data-driven model. Moreover, DL algorithms require high computational capabilities to train models with a large amount of data. An essential characteristic of DL is that it decreases the time and effort required to construct the feature extractor. Supervised, unsupervised, and hybrid learning are the three major categories of DL algorithms.

C. OTHER DATA ANALYSIS APPROACHES
Other data analysis approaches that do not include artificial intelligence methods have been utilized in CTH studies. Table 4 presents a summary of other data analysis techniques for CTH.

VII. DIRECTION OF FUTURE RESEARCH ON RANSOMWARE
Most ransomware-related research works focus on different characteristics such as threat delivery, encryption algorithm and communication, associated IoCs, and behavior analysis [51]. Threat actors can change a malware's appearance to obfuscate its code; however, it is difficult to change its motivation and behavior. New ransomware variants are constantly being developed. Several detection and protection solutions rely on static analysis, which detects only earlier forms of ransomware samples. Cybercriminals apply advanced techniques to conceal the intention of the ransomware executable and avoid detection.
Ransomware can appear as a standalone crypto-worm that replicates itself to other computers to maximize the impact on the network. In addition, ransomware can appear as Ransomware-as-a-Service (RaaS), which is a distribution kit sold on the dark web. RaaS permits novice attackers with limited technical skills to launch ransomware attacks [53]. Moreover, ransomware can be deployed by threat actors who scan the Internet to find IT systems with weak protection and make them targets.
Ransomware can infect files on locally fixed, removable, or remotely shared drives. To minimize detection, attackers may attempt to sign their ransomware using code-signing technology that they buy or steal. In addition, current ransomware utilizes exploits to abuse stolen administrator privileges and elevate its privileges. After that, the ransomware starts encrypting as many files as possible to ensure that the victim pays the ransom. Files can be encrypted one at a time in a single thread or several at a time in multiple threads. Moreover, ransomware can be programmed to encrypt files in order of increasing file size or alphabetically [10].
Citation information: DOI 10.1109/ACCESS.2022.3181278
A ransomware attack has a tremendous financial impact on an organization when it encrypts the organization's mapped network drives. Restoring multiple servers from backup data takes a long time, and the restored data may not be up to date. Many organizations use only backup solutions as a critical defense against ransomware, which is a recovery solution rather than a detection solution. Ransomware attacks can also target backup files and folders, which can cause permanent damage and data loss [52].
Ransomware performs the file encryption process using one of two methods: overwrite (in-place) and copy. The overwrite method encrypts files by reading the original file, writing an encrypted version over the original file, and renaming the file. On the other hand, the copy method encrypts files by reading the original file, creating an encrypted copy, and deleting the original file. Original files cannot be recovered when the overwrite method is used. Ransomware that uses the copy method, however, needs an additional wiping action to ensure that the deleted data files are not recoverable.
Ransomware behavior follows specific patterns that include the file identification process, file encryption, and network command and control communications [54]. In most cases, ransomware uses the Windows application programming interface (API) to make function calls. The Windows API offers a collection of programming interfaces that simplify the software development process. Windows API calls can be used as behavioral features to identify abnormal patterns. Table 5 lists some Windows API call categories and examples extracted from different ransomware samples. Software API calls can be extracted from most modern devices [55]. Figure 3 shows the process of gathering API call sequences from ransomware samples.
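One simple way to turn an API call sequence into behavioral features is to count calls per category. The sketch below uses a few of the API names and categories listed for Table 5 (memory, synchronization, services); the example trace, the `other` bucket for unmapped calls, and the fixed category order are assumptions made for illustration.

```python
from collections import Counter

# Category mapping limited to API names whose category is stated for Table 5.
API_CATEGORY = {
    "VirtualAlloc": "memory",
    "CreateMutex": "synchronization",
    "OpenMutex": "synchronization",
    "OpenSCManager": "services",
    "OpenService": "services",
}

CATEGORIES = ("memory", "synchronization", "services", "other")


def category_counts(trace):
    """Count API calls per category; unmapped calls fall into 'other'."""
    return Counter(API_CATEGORY.get(call, "other") for call in trace)


def to_feature_vector(trace, categories=CATEGORIES):
    """Return the counts in a fixed category order, ready for a classifier."""
    counts = category_counts(trace)
    return [counts.get(category, 0) for category in categories]


# Hypothetical call trace recorded from one sample run in a sandbox.
trace = [
    "OpenMutex", "CreateMutex", "VirtualAlloc", "VirtualAlloc",
    "OpenSCManager", "OpenService", "ExitProcess",
]

print(to_feature_vector(trace))
```

Counting discards call order; sequence-aware features such as n-grams of calls would preserve it at the cost of a larger feature space.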

VIII. RANSOMWARE DATASETS
Datasets are essential to foster the development of an effective ransomware detection solution. The outcome of a ransomware detection solution depends on the dataset used; therefore, the accuracy of the solution is directly related to the input dataset. Datasets contain samples of both benign software and ransomware; however, one of the crucial challenges is obtaining a balanced dataset. Datasets for ML can either be privately collected or publicly available. Different ransomware studies used datasets from different repositories. Popular repositories that offer malware data include VirusTotal, VirusShare, and theZoo. Table 6 shows a summary of openly available popular datasets and repositories used for ransomware detection studies.
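The balance problem can be made concrete with a short sketch that measures class imbalance and applies random undersampling. The 90/10 split, the sample names, and the use of undersampling (rather than any technique from the surveyed studies) are assumptions chosen for illustration.

```python
import random
from collections import Counter


def imbalance_ratio(labels):
    """Ratio of the largest class size to the smallest class size."""
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values())


def undersample(samples, seed=0):
    """Randomly keep as many samples per class as the smallest class has."""
    rng = random.Random(seed)  # fixed seed for repeatable selection
    by_label = {}
    for item in samples:
        by_label.setdefault(item[1], []).append(item)
    smallest = min(len(group) for group in by_label.values())
    balanced = []
    for group in by_label.values():
        balanced.extend(rng.sample(group, smallest))
    rng.shuffle(balanced)
    return balanced


# Hypothetical dataset: 90 benign samples and 10 ransomware samples.
dataset = [(f"benign_{i}", "benign") for i in range(90)]
dataset += [(f"ransom_{i}", "ransomware") for i in range(10)]

balanced = undersample(dataset)
print(imbalance_ratio([label for _, label in dataset]))
print(Counter(label for _, label in balanced))
```

Undersampling discards benign samples, so with a small ransomware class it trades dataset size for balance; collecting more ransomware samples avoids that loss.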

IX. CONCLUSION
Ransomware is an evolving form of malware designed to block access to the system or encrypt its data. Various static and dynamic features of ransomware can be extracted and used to reveal its activities. This paper presents a systematic review of cyber threat hunting techniques for detecting ransomware attacks. Previous CTI and CTH works were investigated, and their limitations and gaps were identified. We then explained the CTI technique and provided an extensive overview of malware analysis. CTH techniques were discussed according to the data analysis method used. Ransomware evolution and research directions were highlighted. The available ransomware datasets used in previous works were listed with their data sources. In summary, ransomware attacks must be detected proactively, as shown in this study. Developing an effective ransomware CTH technique that can detect both known and unknown ransomware remains a concern. We provided a detailed review of ransomware research directions and the available ransomware datasets utilized with different data analysis methods. In our future work, we will adopt a CTI method to enhance the development of a CTH technique by collecting the latest shared information about ransomware attacks. Subsequently, the collected information will be incorporated into an effective new learning strategy model to enhance detection accuracy. We will focus closely on dynamic features to hunt ransomware attacks based on behavior classification.
TABLE 5 (continued). Windows API call categories and examples.
No. | Category | API call | Description
 |  | … | Perform an operation on a specified file.
 |  | ExitProcess | End the calling process and all its threads.
5 | memory | VirtualAlloc | Reserve, commit, or change the state of a region of memory within the virtual address space of a specified process.
6 | synchronization | CreateMutex | Create or open a named or unnamed mutex object.
 |  | OpenMutex | Get a handle to another process's mutex.
7 | services | OpenSCManager | Return a handle to the Service Control Manager.
 |  | OpenService | Open an existing service.