Scalable Static Detection of Use-After-Free Vulnerabilities in Binary Code

The number of use-after-free vulnerabilities has increased rapidly in recent years and poses a serious threat to computer systems. However, few effective mitigations exist for large-scale binary code. In this study, the authors propose a scalable static approach for detecting use-after-free vulnerabilities in binary code. First, a use-after-free feature model is proposed to guide detection. Then, the binary code of the target program is converted to an intermediate representation, and CFGs (control flow graphs) are constructed. Finally, lightweight pointer tracking is performed to identify the points at which use-after-free vulnerabilities occur. Unlike state-of-the-art approaches, this approach uses function summaries rather than a naive in-lining technique for the inter-procedural analysis. It therefore avoids the redundant repeated analysis that in-lining causes in existing approaches and reduces unnecessary performance overhead. The authors have implemented a prototype called UAFDetector and evaluated it using standard benchmarks and real-world programs. The experimental results show that this approach is effective in detecting use-after-free vulnerabilities in binary code and is more efficient and scalable than state-of-the-art static solutions.


I. INTRODUCTION
When an object is deallocated in a program, the pointer to it becomes a dangling pointer. Any dereference of a dangling pointer causes a use-after-free vulnerability that can be exploited by an attacker. Moreover, a double-free vulnerability is considered a special use-after-free vulnerability [1]. Fig. 1 shows the number of use-after-free and double-free vulnerabilities recorded in the National Vulnerability Database (NVD) [2] since 2009. These two classes of vulnerabilities are increasingly common. They are present in various applications, including Microsoft's and Google's browsers, the Windows and Linux operating system kernels, and the OpenJPEG and LibPNG image processing libraries. Attackers can exploit these vulnerabilities to read or rewrite sensitive data [3]-[6] and can even hijack the program control flow to execute arbitrary code [7]-[9]. In the famous Operation Aurora attack, the attacker exploited several 0-day vulnerabilities, including use-after-free vulnerabilities, to attack the networks of Google and Adobe, causing serious economic losses [10].
The associate editor coordinating the review of this manuscript and approving it for publication was Junaid Arshad.
Current solutions for use-after-free vulnerability detection fall into two categories: dynamic analysis and static analysis. Most solutions rely on dynamic analysis [7], [11]-[14], which generally has a high performance overhead and low code coverage. Static analysis does not suffer from such limitations. However, there are only a few studies [1], [15], [16] on static analysis for use-after-free detection, and they are not widely used in practice because of some important limitations. The solutions in [1] and [15] can only analyze programs whose source code is available, but in many practical situations the original high-level source code is unavailable. The solution GUEB in [16] is a state-of-the-art static tool for detecting use-after-free vulnerabilities and has some ability to detect vulnerabilities in binary code. In this approach, an in-lining technique is used for inter-procedural analysis, so functions that are called many times need to be analyzed repeatedly. This introduces a lot of unnecessary overhead when analyzing large-scale programs. Therefore, this solution is not scalable enough and only targets small programs.
VOLUME 8, 2020. This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
In this paper, we propose a scalable static approach for detecting use-after-free vulnerabilities in binary code. First, a use-after-free vulnerability feature model is presented through the analysis of a large number of samples, which provides a guide for detecting vulnerabilities. Then, considering indirect jumps, we construct more complete CFGs. Finally, we focus on pointer-related operations in the program to track each pointer's states and use a finite state machine (FSM) to identify use-after-free vulnerabilities. The main advantages of our approach are as follows: (1) the function summaries technique is used for an inter-procedural analysis instead of in-lining, which avoids repeated analysis of functions and improves the scalability of the analysis; and (2) pointer aliases and indirect jumps are considered to enhance the completeness of the analysis.
Based on this approach, we implement a prototype called UAFDetector and use standard benchmarks [17] and real-world programs to evaluate it. The experimental results show that our approach can detect known vulnerabilities effectively and can discover new vulnerabilities in binary code. Compared with the existing tool GUEB [16], our approach incurs less overhead and scales better.
Accordingly, we make the following contributions in this paper:
• Proposing a scalable static approach for detecting use-after-free vulnerabilities in binary code. This approach uses the function summaries technique in an inter-procedural analysis to reduce overhead and improve scalability, and it integrates a pointer alias analysis and CFG recovery to improve the completeness of detection.
• Implementing UAFDetector, which is a prototype for detecting use-after-free vulnerabilities in binary code. Using the CFGs constructed by CFGConstructor [20], UAFDetector performs the detection procedure to identify use-after-free vulnerabilities.
• Evaluating the effectiveness and performance overhead of our approach using a standard benchmark (Juliet Test Suite [17]) and real-world programs. UAFDetector is able to detect use-after-free vulnerabilities in the benchmark with a false positive rate of zero and a false negative rate of 2.39%. In the given real-world programs, it can discover most known vulnerabilities and an unknown vulnerability (CNNVD-201904-1451) despite some false positives and negatives. Furthermore, our approach has lower performance overhead and better scalability than the state-of-the-art tool GUEB [16].
The remainder of this paper is organized as follows: Section II presents the use-after-free feature model and introduces the challenges of detecting use-after-free vulnerabilities. The overview of the proposed method is described in Section III. Section IV describes the details of each part and takes a real-world vulnerability as an example to introduce the workflow. Section V presents the implementation details and evaluates the implemented prototype. The limitations, applicable conditions and improvement measures are discussed in Section VI. We summarize the related work in Section VII and provide our conclusions in Section VIII.

II. PROBLEM DESCRIPTION
A. FEATURES OF USE-AFTER-FREE VULNERABILITIES
Through the analysis of a large number of use-after-free vulnerabilities, we summarize the feature model of use-after-free vulnerabilities, as shown in Fig. 2. The model focuses on three types of pointer operations in the program:
• FREE(p): frees the object pointed to by the pointer p. This type of operation, such as a call to the function free in the C language, changes the pointer p into a dangling pointer.
• USE(p): dereferences the pointer p. FREE(p) is a special dereference.
• DESTROY(p): destroys the dangling pointer p by reassigning p to a new valid address or NULL.
Through observation, a path with a use-after-free vulnerability has the following features:
1) Creating dangling pointers. A use-after-free vulnerability is caused by dereferencing dangling pointers. Therefore, performing FREE(p) to create dangling pointers is a necessary condition for causing vulnerabilities.
2) Preserving the dangling pointer. If a dangling pointer is destroyed by DESTROY(p) before being dereferenced, no vulnerability will occur. Thus, when use-after-free vulnerabilities are discovered in practice, developers usually set the dangling pointer to NULL to prevent them.
3) Dereferencing a dangling pointer. Not all dangling pointers lead to use-after-free vulnerabilities because dangling pointers may not be dereferenced later. A use-after-free vulnerability occurs only when USE(q) is executed, where q is a dangling pointer or a dangling pointer's alias.
Accordingly, the path a → e → h → i → j in Fig. 2 contains a use-after-free vulnerability, as it satisfies the above three features, where node a creates a dangling pointer p and node j dereferences it. However, the path a → e → f → g → j has no use-after-free vulnerabilities because node f destroys the dangling pointer p. In the path a → b → c → d, the dangling pointer p is retained but not dereferenced in the following nodes. Hence, this path also does not contain vulnerabilities.
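The three features can be checked mechanically on a single path. The following is a minimal Python sketch rather than the authors' implementation. It assumes that a path is given as a list of (operation, pointer) pairs and it ignores aliases; the function name find_uaf and the sample paths are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

Op = Tuple[str, str]  # (operation, pointer name), e.g. ("FREE", "p")


def find_uaf(path: List[Op]) -> Optional[Tuple[int, int]]:
    """Return (free_index, use_index) of the first use-after-free on the path, or None."""
    freed_at: Dict[str, int] = {}  # pointer -> index of the FREE that made it dangling
    for i, (op, ptr) in enumerate(path):
        if op in ("USE", "FREE") and ptr in freed_at:
            # Feature 3: a dangling pointer is dereferenced (FREE counts as a USE).
            return freed_at[ptr], i
        if op == "FREE":
            freed_at[ptr] = i        # Feature 1: a dangling pointer is created.
        elif op == "DESTROY":
            freed_at.pop(ptr, None)  # Feature 2 no longer holds: the pointer is reassigned.
    return None


# Hypothetical paths in the spirit of Fig. 2.
vulnerable = [("FREE", "p"), ("USE", "p")]
repaired = [("FREE", "p"), ("DESTROY", "p"), ("USE", "p")]
never_used = [("FREE", "p")]
double_free = [("FREE", "p"), ("FREE", "p")]
for name, path in [("vulnerable", vulnerable), ("repaired", repaired),
                   ("never_used", never_used), ("double_free", double_free)]:
    print(name, find_uaf(path))
```

Only the first and last paths are reported, which mirrors the observation that a dangling pointer must be both preserved and dereferenced for a vulnerability to occur.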

B. CHALLENGES
Compared with detecting the well-known buffer overflow vulnerabilities [21], detecting use-after-free vulnerabilities in binary code with static analysis poses several challenges:
1) Incomplete CFG: The construction of a CFG is the basis of detecting use-after-free vulnerabilities. Many binary analysis tools, such as IDA Pro, can construct CFGs. IDA Pro constructs CFGs from assembly code by static analysis. This method cannot obtain indirect jump addresses, which leads to incomplete CFGs. Detecting use-after-free vulnerabilities using incomplete CFGs may result in false negatives.
2) Inter-procedural analysis: The operations FREE(p), USE(p) and DESTROY(p) related to use-after-free vulnerabilities may occur in different functions. Hence, vulnerability detection needs to support an inter-procedural analysis. The existing solution in [16] implements inter-procedural analysis by using a naive in-lining technique, which inserts callee functions directly into the caller function. This approach may be easier to implement but has the critical limitation that a function called many times will be analyzed repeatedly. The repeated analysis introduces a lot of redundant overhead when analyzing large-scale programs. Therefore, the method has a high performance overhead and only targets small programs. An example will be shown in Section IV-B to further clarify the limitations of existing methods.
3) Alias analysis: Pointer aliasing is a difficult problem in detecting use-after-free vulnerabilities. Fig. 3 shows an example of an alias. Two pointers, p and q, refer to the same memory block; they are then called aliases. The object pointed to by these two pointers is deallocated in line 4, and p and q become dangling pointers. DESTROY(p) in line 5 destroys the dangling pointer p, while q is still dangling. USE(q) in line 6 leads to a use-after-free vulnerability. If vulnerability detection lacks an alias analysis, this vulnerability will not be discovered.

III. OVERVIEW
Fig. 4 shows the overview of our proposed approach, in which the white modules represent the existing techniques and the gray modules represent our proposed parts. The input of the detection is the binary code of the target program, such as an executable file or a dynamic library. The outputs are error reports containing the sets of program locations involved in the use-after-free vulnerability. Our approach consists of two procedures: a pre-processing procedure and a detection procedure.
• Pre-processing procedure. The main function of this procedure is to convert binary code into an intermediate representation and construct the CFGs of the functions in the program. First, we use a disassembler to transform the binary code into assembly code. Then, the assembly code is translated into an intermediate representation by a translator. The CFGs containing indirect jumps are constructed from the intermediate representation by CFGConstructor, which combines dynamic and static analysis. Finally, this procedure outputs the intermediate code and the CFGs, which are taken as inputs of the detection procedure.
• Detection procedure. A scalable static analysis is performed to detect use-after-free vulnerabilities on the intermediate code and the CFGs. We use the function summaries technique for an inter-procedural analysis and a dedicated data flow analysis for the pointer alias. Finally, exploiting the above analyses, we track the pointer operations and transfers to identify use-after-free vulnerabilities and report program points that create and dereference dangling pointers.

A. CFG CONSTRUCTION
Constructing CFGs is the basis of vulnerability analysis. Function summaries, alias analysis and pointer tracking in use-after-free vulnerability detection all depend on CFGs.
It is important to construct complete and accurate CFGs from binary code. There are two methods for constructing CFGs from binary code: dynamic analysis and static analysis. The dynamic method in [22] constructs CFGs by recording the program execution paths, which gives high accuracy but low path coverage and efficiency. A static method, such as the one used by IDA Pro, directly analyzes the code to construct CFGs without running the program. This method has high code coverage but cannot obtain indirect jump addresses, so some code areas cannot be reached.
We use the hybrid analysis tool CFGConstructor [20], developed in our previous work, to construct CFGs from binary code. First, a static analysis is used to obtain the basic control flow without indirect jumps by dividing and connecting basic blocks. Second, test cases generated by fuzz testing are used to run the target program, during which a dynamic binary instrumentation technique is used to obtain indirect jumps. Finally, the results of these two steps are integrated to construct more complete CFGs by adding indirect jumps to the basic control flow.
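The final integration step can be illustrated with a short Python sketch. It is not CFGConstructor itself: it assumes that the static pass and the instrumented runs each yield a list of (source block, target block) edges, and the names merge_cfg and reachable as well as the block addresses are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple

Edge = Tuple[int, int]  # (source block address, target block address)


def merge_cfg(static_edges: Iterable[Edge],
              dynamic_indirect_edges: Iterable[Edge]) -> Dict[int, Set[int]]:
    """Add indirect-jump edges observed at run time to the statically recovered CFG."""
    cfg: Dict[int, Set[int]] = defaultdict(set)
    for src, dst in static_edges:
        cfg[src].add(dst)
    for src, dst in dynamic_indirect_edges:
        cfg[src].add(dst)
    return dict(cfg)


def reachable(cfg: Dict[int, Set[int]], entry: int) -> Set[int]:
    """Blocks reachable from the entry block by a depth-first traversal."""
    seen: Set[int] = set()
    stack = [entry]
    while stack:
        block = stack.pop()
        if block in seen:
            continue
        seen.add(block)
        stack.extend(cfg.get(block, ()))
    return seen


# Hypothetical function: block 0x20 ends in an indirect jump to 0x30.
static_edges = [(0x10, 0x20), (0x30, 0x40)]
dynamic_edges = [(0x20, 0x30)]
print(sorted(reachable(merge_cfg(static_edges, []), 0x10)))
print(sorted(reachable(merge_cfg(static_edges, dynamic_edges), 0x10)))
```

With static edges alone, blocks 0x30 and 0x40 are unreachable from the entry, which is the kind of gap that can hide a FREE or USE operation from the later analysis.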

B. FUNCTION SUMMARY GENERATION
Detection of use-after-free vulnerabilities requires tracking the creation and dereference of dangling pointers, which may occur in different functions. Hence, it is necessary to support the inter-procedural analysis. For this issue, the existing solution GUEB [16] in-lines callee functions directly into the caller function. This solution has the serious limitation that a function called many times needs to be analyzed repeatedly.
An example of inter-procedural analysis is shown in Fig. 5. Parts (a) and (b) show the code and the control flow of the example. The results of the inter-procedural analysis using the in-lining approach in GUEB are shown in (c). The functions foo1 and foo2 are each analyzed twice because the in-lining approach always analyzes the callee function in detail at each function call point. Although this example has only 3 functions, the total number of function analyses is 5 because of the redundant repeated analysis. The redundancy becomes more serious when analyzing large-scale programs. Therefore, the existing method has high performance overhead and is not suitable for analyzing large-scale programs.
We introduce the function summaries technique for the inter-procedural analysis to address this problem. First, the important behaviors of each function, such as creating, using and destroying dangling pointers, are summarized with an intra-procedural analysis. Then, when the analysis is extended to an inter-procedural analysis, each function call is replaced by the summary of its callee. Therefore, each function needs to be analyzed only once, even if it is called many times.

1) GENERATING A SUMMARY FOR A SINGLE FUNCTION
This section describes which functions should be summarized and how to generate a summary for a single function with an intra-procedural analysis. As described in Section II-A, use-after-free detection focuses on three kinds of pointer-related operations. We generate summaries for the functions that contain these three kinds of operations.

a: DANGLING POINTERS CREATING FUNCTION
If a function leaves an external pointer dangling after it is called, we call this function a Dangling Pointers Creating Function (DPCF). External pointers are pointers that are not local variables of the function, generally pointer-type arguments and global pointers. In static analysis, a function is identified as a DPCF when it satisfies the following conditions:
1) The function uses an external pointer ext_ptr, which may be a pointer-type argument arg_ptr or a global pointer glb_ptr.
2) The function contains operations, FREE(ext_ptr) or FREE(Alias(ext_ptr)), that free ext_ptr or its aliases.
3) There is at least one path that does not contain DESTROY(ext_ptr) from the point where ext_ptr or its aliases are freed to the exit of the function.
A DPCF may have more than one such pointer. If F is a DPCF, Dangling(F) represents the set of pointers satisfying the above conditions. Because the static analysis in our approach is an over-approximation, all pointers in Dangling(F) are considered dangling after F is called. Therefore, the set of operations that free pointers in function F is taken as the summary of the function:
$$\mathrm{Summary}(F) = \{\mathrm{FREE}(p) \mid p \in \mathrm{Dangling}(F)\}$$
In the intermediate representation, an argument is represented by its offset relative to the stack base BP of the function, and a global pointer is represented by a fixed offset.
In particular, the function free in C is a DPCF, which makes its first argument dangling. Thus, $\mathrm{Summary}(free) = \{\mathrm{FREE}(arg_0)\}$. A simple example is shown in Fig. 6 to explain how to generate a summary for a single DPCF.
The function foo is a DPCF according to the above conditions. The arguments $arg_0$ and $arg_1$ are pointer-type arguments, and global_ptr is a global pointer. These pointers are freed in lines 2, 3 and 4, respectively. There exists a path that does not contain DESTROY from $\mathrm{FREE}(arg_0)$ and FREE(global_ptr) to the exit of the function. Thus, $arg_0$ and global_ptr are considered dangling pointers after foo is called. However, $arg_1$ is not a dangling pointer because it is redefined in line 5. Therefore, the summary of foo is
$$\mathrm{Summary}(foo) = \{\mathrm{FREE}(arg_0), \mathrm{FREE}(global\_ptr)\}$$
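The summary of foo described for Fig. 6 can be reproduced with a small Python sketch. This is an illustration under stated assumptions, not the prototype's code: a function is modeled as a list of entry-to-exit paths, each path is a list of (operation, pointer) pairs, aliases are ignored, and the name dpcf_summary is hypothetical.

```python
from typing import List, Set, Tuple

Op = Tuple[str, str]   # (operation, pointer name)
Path = List[Op]        # one entry-to-exit path of a function


def dpcf_summary(paths: List[Path], external: Set[str]) -> Set[Op]:
    """Return {FREE(p)} for every external pointer that may be dangling at function exit."""
    dangling: Set[str] = set()
    for path in paths:
        freed: Set[str] = set()
        for op, ptr in path:
            if ptr not in external:
                continue            # local pointers do not affect the caller
            if op == "FREE":
                freed.add(ptr)      # condition 2: the external pointer is freed
            elif op == "DESTROY":
                freed.discard(ptr)  # a later reassignment removes the dangling pointer
        dangling |= freed           # condition 3: some path reaches the exit without DESTROY
    return {("FREE", p) for p in dangling}


# foo from Fig. 6 as described in the text: three frees, then arg1 is redefined.
foo_paths = [[("FREE", "arg0"), ("FREE", "arg1"), ("FREE", "global_ptr"), ("DESTROY", "arg1")]]
foo_external = {"arg0", "arg1", "global_ptr"}
print(sorted(dpcf_summary(foo_paths, foo_external)))
```

Taking the union over paths reflects the over-approximation stated above: a pointer is treated as dangling if it is dangling on any path.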

b: DANGLING POINTERS DESTROYING FUNCTION
Similarly, we define the Dangling Pointers Destroying Function (DPDF). If a function assigns a new valid address or NULL to an external pointer, we call this function a DPDF. It requires the following conditions to be satisfied:
1) The function uses an external pointer ext_ptr.
2) The operation DESTROY(ext_ptr) exists in all paths from the entry to the exit of the function.
Suppose F is a DPDF. The set of operations that destroy pointers in the function is taken as its summary:
$$\mathrm{Summary}(F) = \{\mathrm{DESTROY}(p) \mid p \in \mathrm{Assigned}(F)\}$$
Assigned(F) denotes the set of all external pointers assigned in function F.

c: DANGLING POINTERS USING FUNCTION
If a function dereferences an external pointer, we call this function a Dangling Pointers Using Function (DPUF). It requires the following conditions to be satisfied:
1) The function uses an external pointer ext_ptr.
2) The function contains operations, USE(ext_ptr) or USE(Alias(ext_ptr)), that dereference ext_ptr or its aliases.
3) There is at least one path that does not contain DESTROY(ext_ptr) from the entry of the function to the point where ext_ptr or its aliases are dereferenced.
The summary of a DPUF F is as follows:
$$\mathrm{Summary}(F) = \{\mathrm{USE}(p) \mid p \in \mathrm{Dereferenced}(F)\}$$
Dereferenced(F) denotes the set of all external pointers dereferenced in function F.
As can be observed above, our analysis is conservative so as to discover as many use-after-free vulnerabilities as possible. If a function has any path on which a dangling pointer is created and not destroyed, it is considered a DPCF. If a function has any path on which a pointer is dereferenced before being assigned, it is considered a DPUF. However, a function is considered a DPDF, and p is considered destroyed, only if all paths from the entry to the exit of the function contain DESTROY(p). These three kinds of functions are not mutually exclusive: a function may be a DPCF, a DPDF and a DPUF at the same time.
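The asymmetry between the any-path condition for DPUFs and the all-paths condition for DPDFs can be sketched in Python as follows. The sketch makes the same simplifying assumptions as the path model used for illustration (a function is a list of entry-to-exit paths of (operation, pointer) pairs, aliases are ignored), and the function names and sample functions are hypothetical.

```python
from typing import List, Set, Tuple

Op = Tuple[str, str]
Path = List[Op]


def dpdf_summary(paths: List[Path], external: Set[str]) -> Set[Op]:
    """{DESTROY(p)} for external pointers reassigned on every entry-to-exit path."""
    if not paths:
        return set()
    destroyed_everywhere = set(external)
    for path in paths:
        destroyed_everywhere &= {ptr for op, ptr in path if op == "DESTROY"}
    return {("DESTROY", p) for p in destroyed_everywhere}


def dpuf_summary(paths: List[Path], external: Set[str]) -> Set[Op]:
    """{USE(p)} for external pointers dereferenced before reassignment on some path."""
    dereferenced: Set[str] = set()
    for path in paths:
        destroyed: Set[str] = set()
        for op, ptr in path:
            if ptr not in external:
                continue
            if op == "DESTROY":
                destroyed.add(ptr)
            elif op == "USE" and ptr not in destroyed:
                dereferenced.add(ptr)
    return {("USE", p) for p in dereferenced}


# Hypothetical function that resets arg0 on both branches but reads it first on one of them.
reset_paths = [[("DESTROY", "arg0")], [("USE", "arg0"), ("DESTROY", "arg0")]]
# Hypothetical function that resets arg0 on only one branch.
maybe_reset_paths = [[("DESTROY", "arg0")], []]
print(dpdf_summary(reset_paths, {"arg0"}), dpuf_summary(reset_paths, {"arg0"}))
print(dpdf_summary(maybe_reset_paths, {"arg0"}), dpuf_summary(maybe_reset_paths, {"arg0"}))
```

The first function is both a DPDF and a DPUF, which illustrates that the categories overlap; the second is neither, because a reassignment on only some paths does not guarantee that the pointer is destroyed.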

2) ITERATIVE ANALYSIS
The above procedure generates the summary of a single function. However, the summaries of all functions cannot be obtained by a simple traversal. In the example in Fig. 7, foo2 is clearly a DPCF with the summary $\{\mathrm{FREE}(arg_0)\}$. When foo1 calls foo2, the function call is replaced by the summary. Then, foo1 is also a DPCF. Therefore, an iterative analysis is required to summarize all the DPCFs, DPDFs and DPUFs in the entire program.
We perform an iterative analysis along the backward direction of the call graph of the target program to generate summaries of the above three kinds of functions. Summarizing DPCFs is taken as an example to explain how the iterative analysis works. Fig. 8 shows our algorithm for the iterative analysis. Q represents the work queue of functions to be analyzed. $S_{DPCF}$ represents the set of DPCFs that have been summarized. At the beginning, the work queue Q is initialized with the DPCFs specified by the user, such as free. In each loop iteration, the algorithm removes a function f from Q and determines whether it is a DPCF. If f is a DPCF, it is summarized using the method in Section IV-B.1 and added to $S_{DPCF}$. In addition, all functions calling f are added to the work queue Q. If f is not a DPCF, it is ignored. The algorithm terminates when Q is empty. It should be noted that we ignore recursive calls.
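The work-queue iteration can be sketched in Python as follows. This is not the algorithm of Fig. 8 verbatim: the intra-procedural check is replaced by a hypothetical stand-in predicate is_dpcf, the table forwards (which callees receive an external pointer of the caller) is an assumed input, and main is a hypothetical caller of foo1 that passes only a local pointer.

```python
from collections import deque
from typing import Callable, Dict, Iterable, Set


def summarize_dpcfs(callers: Dict[str, Set[str]],
                    seeds: Iterable[str],
                    is_dpcf: Callable[[str, Set[str]], bool]) -> Set[str]:
    """Walk the call graph backwards from the seed DPCFs and collect all DPCFs."""
    queue = deque(seeds)
    s_dpcf: Set[str] = set()
    while queue:
        f = queue.popleft()
        if f in s_dpcf:
            continue                          # already summarized; also cuts recursive cycles
        if is_dpcf(f, s_dpcf):
            s_dpcf.add(f)                     # summarize f once
            queue.extend(callers.get(f, ()))  # its callers may now be DPCFs as well
    return s_dpcf


# Call chain in the spirit of Fig. 7: foo1 calls foo2, and foo2 calls free.
callers = {"free": {"foo2"}, "foo2": {"foo1"}, "foo1": {"main"}}
# Assumed input: callees to which each function passes one of its external pointers.
forwards = {"foo2": {"free"}, "foo1": {"foo2"}, "main": set()}


def is_dpcf(f: str, known: Set[str]) -> bool:
    """Stand-in for the intra-procedural check of Section IV-B.1."""
    return f == "free" or bool(forwards.get(f, set()) & known)


print(sorted(summarize_dpcfs(callers, ["free"], is_dpcf)))
```

A function that is rejected once is examined again whenever another of its callees is summarized, because every newly found DPCF puts its callers back on the queue.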
The summaries of all DPDFs and DPUFs can be obtained by the same approach. With these summaries, function calls can be handled easily in the inter-procedural analysis. If the callee of a function call belongs to one of the above three kinds of functions, the call is replaced by the summary of the callee. Otherwise, the call is ignored. Therefore, a function needs to be analyzed only once in our method, even though it may be called many times.
We apply the function summaries technique to the example in Fig. 5 and compare it with the in-lining technique used in existing solutions. The results of the analysis are shown in Fig. 9. The number following ''line'' indicates the line number in Fig. 5. The function foo1 is analyzed and summarized first: $\mathrm{Summary}(foo1) = \{\mathrm{FREE}(arg_0)\}$. When foo2 is analyzed, the call to foo1 at line 6 is replaced by foo1's summary, and foo1 does not need to be analyzed again. Similarly, foo2 needs to be analyzed only once, although it is called twice, in line 12 and line 14. Therefore, the total number of function analyses using our approach is 3 in this example, which is fewer than the 5 required by the in-lining technique. Our approach can effectively prevent redundant analysis in inter-procedural analysis. The advantage is more obvious when analyzing large-scale programs, as shown in Section V-C.
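The two counts in this example can be reproduced with a few lines of Python. The sketch assumes an acyclic call graph (recursive calls are ignored, as stated earlier) and uses main as a hypothetical name for the function that calls foo2 at lines 12 and 14.

```python
from typing import Dict, List


def inlined_analyses(calls: Dict[str, List[str]], root: str) -> int:
    """In-lining: the callee body is analyzed again at every call site."""
    return 1 + sum(inlined_analyses(calls, callee) for callee in calls.get(root, []))


def summary_analyses(calls: Dict[str, List[str]], root: str) -> int:
    """Function summaries: each reachable function is analyzed exactly once."""
    seen = set()
    stack = [root]
    while stack:
        f = stack.pop()
        if f in seen:
            continue
        seen.add(f)
        stack.extend(calls.get(f, []))
    return len(seen)


# One list entry per call site: main calls foo2 twice, foo2 calls foo1 once.
calls = {"main": ["foo2", "foo2"], "foo2": ["foo1"], "foo1": []}
print(inlined_analyses(calls, "main"), summary_analyses(calls, "main"))
```

The in-lining count multiplies along call chains, whereas the summary count is bounded by the number of functions, which is why the gap widens on large programs.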

C. ALIAS ANALYSIS
The purpose of an alias analysis is to determine which pointers point to the same memory address at a program point. As described in Section II-B, alias analysis is a challenge in use-after-free vulnerability detection and affects the detection accuracy. Although there are many solutions for alias analysis in high-level languages, there are few solutions for binary code because much information is lost during compilation. Thus, we design our own alias analysis to mitigate this problem, in the form of a dedicated data-flow analysis.

1) ABSTRACT STATE
To identify aliases, we focus on each pointer and the addresses it points to. The association between a pointer and memory addresses is represented by a pair (p, ADDR(p)), where p denotes a pointer variable and ADDR(p) denotes the set of memory addresses that p points to. In the intermediate representation, if the pointer is on the stack, it is represented as the offset relative to the stack base BP. Otherwise, if the pointer is a global variable, it is represented as a fixed offset. The symbol $addr_i$ is used to represent a memory address because the static analysis cannot obtain the specific value of the address. For each program point, AbsState contains all such pairs that associate pointers and addresses. If $\mathrm{ADDR}(p) \cap \mathrm{ADDR}(q) \neq \emptyset$ at a program point, p is considered an alias of q.

2) STATEMENT TRANSFER FUNCTION
A pointer assignment statement that changes AbsState is defined as s : p = addr. In the statement, addr represents a memory address, which may be the result returned by malloc, the address pointed to by another pointer, an immediate value or NULL.

3) BASIC BLOCK TRANSFER FUNCTION
The values of AbsState immediately before and immediately after each basic block B are denoted by IN[B] and OUT[B], respectively. Suppose a basic block B consists of the pointer assignment statements $s_1, s_2, \ldots, s_n$, in that order. We derive the transfer function $f_B$ of a basic block by composing the transfer functions of the statements in the block. Thus, $f_B = f_{s_n} \circ \cdots \circ f_{s_2} \circ f_{s_1}$. The relationship between the AbsState before and after the basic block is
$$\mathrm{OUT}[B] = f_B(\mathrm{IN}[B]).$$
The purpose of our data-flow analysis is to find all possible addresses pointed to by a pointer. Thus, the input of a basic block B is the union of the outputs of all its predecessors, i.e.,
$$\mathrm{IN}[B] = \bigcup_{P \in \mathrm{predecessors}(B)} \mathrm{OUT}[P].$$
We use the iterative algorithm in [23] to solve these equations and obtain the least fixed point. For a pointer in each program point, the results of these equations contain all the addresses that the pointer may point to. The pointer aliases can be easily identified by exploiting the results. An example given in Section IV-E explains the detailed process of the alias analysis in use-after-free vulnerability detection.
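The data-flow equations can be solved with a simple round-robin iteration, sketched below in Python. The statement transfer function used here is an assumption (an assignment overwrites the address set of its target), since only the form of the statement is fixed by the text; the statement encoding, the function names and the four-block example are hypothetical.

```python
from typing import Dict, FrozenSet, Iterable, List, Tuple

State = Dict[str, FrozenSet[str]]  # AbsState: pointer -> set of symbolic addresses
Stmt = Tuple[str, str, str]        # (target, kind, source); kind is "addr" or "copy"


def transfer_stmt(state: State, stmt: Stmt) -> State:
    """Assumed statement transfer function: the assignment overwrites ADDR(target)."""
    dst, kind, src = stmt
    new_state = dict(state)
    if kind == "addr":   # p = addr_i (e.g. the result of malloc)
        new_state[dst] = frozenset({src})
    else:                # p = q: p now points wherever q may point
        new_state[dst] = state.get(src, frozenset())
    return new_state


def join(states: Iterable[State]) -> State:
    """IN[B]: pointwise union of the OUT states of all predecessors."""
    merged: State = {}
    for s in states:
        for ptr, addrs in s.items():
            merged[ptr] = merged.get(ptr, frozenset()) | addrs
    return merged


def solve(blocks: Dict[str, List[Stmt]], preds: Dict[str, List[str]]) -> Dict[str, State]:
    """Iterate OUT[B] = f_B(IN[B]) until no OUT set changes."""
    out: Dict[str, State] = {b: {} for b in blocks}
    changed = True
    while changed:
        changed = False
        for b, stmts in blocks.items():
            state = join(out[pred] for pred in preds.get(b, []))
            for stmt in stmts:
                state = transfer_stmt(state, stmt)
            if state != out[b]:
                out[b], changed = state, True
    return out


def may_alias(state: State, p: str, q: str) -> bool:
    """p and q are aliases if their address sets intersect."""
    return bool(state.get(p, frozenset()) & state.get(q, frozenset()))


# Diamond CFG: B0 -> B1 -> B3 and B0 -> B2 -> B3.
blocks = {"B0": [("p", "addr", "addr0")], "B1": [("q", "copy", "p")],
          "B2": [("q", "addr", "addr1")], "B3": []}
preds = {"B1": ["B0"], "B2": ["B0"], "B3": ["B1", "B2"]}
result = solve(blocks, preds)
print(may_alias(result["B2"], "p", "q"), may_alias(result["B3"], "p", "q"))
```

At the merge block B3, q may point to either address, so p and q are reported as aliases there even though they are not aliases on the branch through B2. This is the expected behavior of a may-alias analysis built on set union.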

D. POINTER TRACKING AND USE-AFTER-FREE VULNERABILITY CHECKING
According to the vulnerability feature model in Section II-A, the creation of dangling pointers is a necessary condition for use-after-free vulnerabilities. In the previous sections, we obtained the summaries of DPCFs, DPDFs and DPUFs. Supported by these summaries, the inter-procedural analysis is reduced to an intra-procedural analysis by replacing function calls with the summaries of the callee functions. Therefore, we only need to perform intra-procedural pointer tracking in the functions that call DPCFs to check for use-after-free vulnerabilities. If a function calls a DPCF, it is called a Calling Freeing Function (CFF).
A finite-state machine (FSM) for use-after-free detection is established according to the vulnerability feature model, as shown in Fig. 10. The vulnerability checking maintains the state of each pointer and traces the state transitions in CFFs. The pointer's state is tracked forward along the CFGs from the entry of each CFF. The state of a pointer p is Start in the beginning and is converted to Dangling when the pointer or one of its aliases is freed by FREE(p) or FREE(Alias(p)); the state remains the same when other operations are performed. In the Dangling state, dereferencing the pointer (USE(p)) shifts the state to UAF, which indicates a use-after-free defect. The creation and dereference of the dangling pointer are then recorded and output to the error reports. Note that FREE(p) is a special kind of dereference of pointer p (see Section II-A). Thus, double-free vulnerabilities can also be detected by this FSM. Finally, if a dangling pointer is reassigned by DESTROY(p), the state is converted to End, and the pointer tracking for this pointer ends.
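The state machine can be sketched in Python for a single straight-line sequence of pointer operations. This is a simplified illustration, not UAFDetector's implementation: aliases are derived from explicit ALLOC and COPY operations instead of the data-flow analysis, calls are assumed to have already been replaced by their summaries, and the operation encoding and the name check are hypothetical.

```python
from enum import Enum
from typing import Dict, List, Optional, Tuple


class PtrState(Enum):
    START = "Start"
    DANGLING = "Dangling"
    UAF = "UAF"
    END = "End"


def check(ops: List[tuple]) -> List[Tuple[int, int, str]]:
    """Return (free_line, use_line, pointer) for every dangling pointer that is dereferenced."""
    addr: Dict[str, Optional[str]] = {}  # pointer -> symbolic address it holds
    state: Dict[str, PtrState] = {}      # pointer -> FSM state
    freed_at: Dict[str, int] = {}        # pointer -> line that made it dangling
    reports: List[Tuple[int, int, str]] = []
    for line, op, p, *src in ops:
        if op == "ALLOC":                # p = malloc(...): fresh symbolic address
            addr[p], state[p] = f"addr@{line}", PtrState.START
            freed_at.pop(p, None)
            continue
        if op == "COPY":                 # p = q: p becomes an alias of q
            q = src[0]
            addr[p] = addr.get(q)
            dangling = state.get(q) is PtrState.DANGLING
            state[p] = PtrState.DANGLING if dangling else PtrState.START
            freed_at.pop(p, None)
            if dangling:
                freed_at[p] = freed_at[q]
            continue
        addr.setdefault(p, f"unknown:{p}")   # e.g. a parameter seen for the first time
        state.setdefault(p, PtrState.START)
        if op in ("USE", "FREE") and state[p] is PtrState.DANGLING:
            state[p] = PtrState.UAF          # Dangling -> UAF (FREE is a special USE)
            reports.append((freed_at[p], line, p))
        elif op == "FREE":
            for q, a in addr.items():        # Start -> Dangling for p and all its aliases
                if a is not None and a == addr[p] and state.get(q) is PtrState.START:
                    state[q], freed_at[q] = PtrState.DANGLING, line
        elif op == "DESTROY":
            if state[p] is PtrState.DANGLING:
                state[p] = PtrState.END      # Dangling -> End: tracking of p stops
            addr[p] = None
    return reports


# Alias example in the spirit of Fig. 3 (the first two line numbers are assumed).
fig3_ops = [(1, "ALLOC", "p"), (2, "COPY", "q", "p"), (4, "FREE", "p"),
            (5, "DESTROY", "p"), (6, "USE", "q")]
print(check(fig3_ops))
```

Destroying p does not clear the Dangling state of its alias q, so the later USE(q) is still reported together with the line that created the dangling pointer.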

E. EXAMPLE
A real vulnerability, CVE-2015-5221, is taken as an example to introduce the workflow of our approach in detail. This vulnerability is a double-free vulnerability in the image manipulation software Jasper. The key code associated with the vulnerability is shown in Fig. 11. To explain the process of alias analysis in vulnerability detection, we modified the code to include more complex alias relationships.
• Generating summaries of the DPCFs. According to the algorithm in Fig. 8, the functions free, jas_free and jas_tvparser_destroy are summarized as DPCFs.
• Tracking pointers and checking vulnerabilities. The functions mif_process_cmpt, jas_tvparser_destroy and jas_free are considered CFFs. We perform pointer tracking and vulnerability checking (Section IV-D) in these functions. The pointer aliases are identified using the method in Section IV-C. The detection discovers a use-after-free vulnerability in the function mif_process_cmpt, as shown in Table 1. In Table 1, the first and second columns represent the line number and the code related to the pointers, respectively. The third column is the program abstract state. The pointer aliases are represented in the fourth column. The last column records the FSM state. In the first row, tvp1 is allocated a memory object in line 10. Thus, a new address $addr_0$ is associated with tvp1. The state of tvp1 is set to Start. Then, tvp2 points to the same address as tvp1 because of the assignment statement in line 11. The state of tvp2 is also set to Start. In line 13, tvp1 is freed by the DPCF jas_tvparser_destroy and becomes a dangling pointer. Meanwhile, tvp2 also becomes a dangling pointer because it is an alias of tvp1. Then, when tvp2 is used in line 18 (FREE is a special USE), its state is shifted to UAF. The points of creation and dereference of the dangling pointer tvp2 in the program are reported.

V. IMPLEMENTATION AND EVALUATION
A prototype called UAFDetector is implemented based on the proposed approach. This section describes the implementation details of the prototype and evaluates its effectiveness and performance overhead using Juliet Test Suite (JTS) and real-world programs.

A. IMPLEMENTATION
In order to analyze binaries from multiple architectures, we carry out vulnerability detection on the intermediate representation. We implemented the prototype based on Google's binary analysis platform BinNavi [19], which is also employed in GUEB [16]. BinNavi can be combined with IDA Pro [18] to easily translate binaries from multiple architectures into an intermediate representation REIL. In the pre-processing procedure, IDA Pro is employed to transform the binary code into assembly code. Then, this assembly code is translated into REIL by BinNavi. The CFGConstructor is employed to obtain more complete CFGs.
BinNavi also provides APIs for analyzing the binaries on the intermediate representation. In the vulnerability detection procedure, we implemented function summary generation, alias analysis and pointer tracking by utilizing the APIs. The UAFDetector maintains the state of each pointer and tracks the changes of the state according to the FSM in Fig. 10. Finally, the program locations involved in use-after-free vulnerabilities are reported.
The prototype is currently capable of analyzing binaries on x86 architecture. However, in theory, our approach can be applied to multiple architectures because it is based on intermediate representations.

B. EFFECTIVENESS
We used the standard benchmarks, Juliet Test Suite (JTS) [17], and real-world programs to evaluate the effectiveness of UAFDetector.

1) JTS
This benchmark is a collection of C/C++ programs with known vulnerabilities provided by the National Institute of Standards and Technology. We selected 1457 C/C++ programs with use-after-free or double-free vulnerabilities and compiled them into binaries. Then, UAFDetector is used to analyze these binaries and report the vulnerabilities. The analysis results are shown in Table 2. The columns in this table represent, from left to right, the type of vulnerabilities in the target programs, the language of the target programs, the number of target programs, the number of known vulnerabilities in the target programs, the number of vulnerabilities reported by UAFDetector, the number of false positives (FPs), the false positive rate (FPR), the number of false negatives (FNs) and the false negative rate (FNR).
As shown in Table 2, UAFDetector is capable of finding use-after-free or double-free vulnerabilities in JTS. UAFDetector correctly reports 2042 of the 2092 known vulnerabilities in 1457 target programs. The false negative rate is 2.39%, and the false positive rate is zero percent. We have shared the test binary programs and the experimental results at https://github.com/BinaryAnalysis/UAFDetector.
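The reported rates follow directly from the counts above, as the short Python check below shows. The definitions used here (FNR as missed vulnerabilities over known vulnerabilities, FPR as false positives over reported vulnerabilities) are assumptions, since the formulas are not stated explicitly; with zero false positives the FPR is zero under any common definition.

```python
known_vulnerabilities = 2092   # known vulnerabilities in the 1457 JTS programs
correctly_reported = 2042      # vulnerabilities correctly reported by UAFDetector
false_positives = 0

false_negatives = known_vulnerabilities - correctly_reported
reported = correctly_reported + false_positives

# Assumed definitions of the two rates, expressed as percentages.
fnr_percent = 100 * false_negatives / known_vulnerabilities
fpr_percent = 100 * false_positives / reported

print(false_negatives, round(fnr_percent, 2), round(fpr_percent, 2))
```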

2) REAL-WORLD PROGRAMS
We further evaluate the effectiveness of UAFDetector using real-world programs. UAFDetector analyzes six popular programs with known vulnerabilities and outputs error reports. We manually determine whether each item in the report is a true or false positive. The six programs analyzed are Jasper, an image processing/coding tool kit; OpenJPEG, a JPEG 2000 codec; Boolector, a satisfiability modulo theories solver; LibTIFF, a collection of tools for manipulating TIFF images; LibPNG, the official PNG reference library; and GNU cflow, a tool for charting the control flow within programs. The experimental results are shown in Table 3. The columns in this table represent, from left to right, the name of the program, the version of the program, the number of known vulnerabilities in the program, the CVE identifiers of the known vulnerabilities, the number of vulnerabilities reported by UAFDetector, the number of true positives (TPs), the number of false positives (FPs), and the number of false negatives (FNs).
The results show that UAFDetector can successfully detect 5 of the 6 known vulnerabilities in the six programs and can discover a new vulnerability in GNU cflow despite some false positives and negatives. We have submitted the new vulnerability to the public vulnerability database and obtained a vulnerability identifier CNNVD-201904-1451. Thus, our approach is effective in discovering use-after-free vulnerabilities in binary code.
Although our approach can effectively detect use-after-free vulnerabilities in binary code, it still produces false positives and false negatives. We analyze the causes of these errors and propose corresponding improvements.

3) FALSE NEGATIVES
There are several reasons for false negatives. (1) The pointer alias analysis is not completely accurate. To limit the time cost, we analyzed only pointer aliases that follow simple patterns, so complex situations, such as indirect aliases [24] caused by re-allocation, are not handled. In future work, we will improve the accuracy of the alias analysis by tracking memory allocation or by incorporating local dynamic analysis.
(2) According to the observations in [1] and [16], most use-after-free vulnerabilities are not sensitive to a particular loop iteration. Therefore, UAFDetector unrolls each loop at most once, similar to the solutions in [1], [16]. However, a few vulnerabilities in our experiments depend on executing a loop several times, and UAFDetector cannot detect them.
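The second cause can be illustrated with a minimal sketch. The loop body and the `has_uaf` helper below are hypothetical: the helper simply scans a given number of copies of a loop body for a use that follows a free, which is far simpler than the analysis performed by UAFDetector.

```python
def has_uaf(body, unroll):
    """Scan `unroll` copies of the loop body for a use that follows a free."""
    freed = set()
    for _ in range(unroll):
        for op, ptr in body:
            if op == "free":
                freed.add(ptr)
            elif op == "alloc":
                freed.discard(ptr)
            elif op == "use" and ptr in freed:
                return True
    return False


# Hypothetical loop body: p is used, then freed, and never re-allocated.
body = [("use", "p"), ("free", "p")]

once = has_uaf(body, unroll=1)   # the use precedes the free within one iteration
twice = has_uaf(body, unroll=2)  # the second iteration uses the freed pointer
```

With a single copy of the body, the use always precedes the free and nothing is reported; the dangling use only becomes visible when the body is considered a second time.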

4) FALSE POSITIVES
The main reason for false positives is that our approach uses a path-insensitive analysis. A path-insensitive analysis has the advantage of low time overhead, but it is not completely accurate. Some paths that appear feasible on the CFG during static analysis can never be executed at run time. A typical example of a false positive is shown in Fig. 12. Here, the function f2 returns 0 when it frees its argument and does not free the argument otherwise. The caller f1 checks the returned value and never uses the pointer if it has been freed. However, our analysis considers a possible path from freeing the argument (line 3) to using the argument (line 12). A path-sensitive analysis using symbolic execution can ameliorate this problem. Because symbolic execution has a high overhead, we are considering adding a lightweight local path-sensitive analysis to our approach to balance performance and accuracy.
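The effect of path insensitivity in this example can be sketched as a reachability query on a small graph. Fig. 12 is not reproduced here, so the node names, the guards on the edges, and the non-zero return value 1 are illustrative assumptions; tracking one concrete return value is also a deliberate simplification of the symbolic execution mentioned above.

```python
# Hypothetical CFG: each edge carries an optional guard on the value returned by f2.
EDGES = {
    "f2_free": [("f1_check", None)],   # f2 frees its argument and returns 0
    "f2_skip": [("f1_check", None)],   # f2 leaves its argument alone and returns 1
    "f1_check": [("f1_use", lambda ret: ret != 0),
                 ("f1_exit", lambda ret: ret == 0)],
    "f1_use": [("f1_exit", None)],
    "f1_exit": [],
}
RETURNS = {"f2_free": 0, "f2_skip": 1}


def reachable(src, dst, sensitive):
    """Depth-first search from src to dst; guards are honored only if sensitive."""
    stack = [(src, None)]
    seen = set()
    while stack:
        node, ret = stack.pop()
        ret = RETURNS.get(node, ret)  # record the return value set at this node
        if node == dst:
            return True
        if (node, ret) in seen:
            continue
        seen.add((node, ret))
        for succ, guard in EDGES[node]:
            if sensitive and guard is not None and not guard(ret):
                continue  # infeasible branch under the tracked return value
            stack.append((succ, ret))
    return False


insensitive_hit = reachable("f2_free", "f1_use", sensitive=False)
sensitive_hit = reachable("f2_free", "f1_use", sensitive=True)
```

Ignoring the guards, the use is reachable from the free and a warning is raised; honoring them shows that the path is infeasible, which is the kind of false positive a local path-sensitive check would remove.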

C. PERFORMANCE OVERHEAD
We evaluate the performance overhead of UAFDetector using real-world programs. The experiment was conducted on an Intel Core i7-6700 processor with 8 cores at 3.4 GHz and 16 GB of memory, running on a 64-bit Windows 7 operating system. Table 4 shows the test binary programs and the time overhead needed by UAFDetector to analyze them. The target programs contain dynamic libraries of different sizes, from hundreds of kilobytes to dozens of megabytes. We record the time taken in the detection procedure, which contains function summary generation, alias analysis and pointer tracking. The time in Table 4 is the average of ten repeated analyses.
The experimental results show that UAFDetector can complete the analysis in a relatively short time for binary programs of different sizes. For example, it only takes 135.15 seconds for UAFDetector to analyze libmergedlo, which is a 50 MB binary in LibreOffice. Therefore, UAFDetector has the ability to analyze large-scale real-world programs.
We use GUEB [16] and our approach to analyze the above binaries in real-world programs to compare their performance overhead. The main design difference between the two methods is that GUEB uses the in-lining technique for inter-procedural analysis, while our approach uses function summaries. However, the two approaches also differ in many implementation details; for instance, GUEB provides graphical results that are not necessary for vulnerability detection. Consequently, a direct comparison of the analysis time would not be convincing. Instead, we compare the number of functions to be analyzed by the two approaches in the detection procedure. The statistical results are shown in Table 5. In the table, F_total represents the total number of functions in the test binaries. F_GUEB and F_UAFDetector represent the number of functions to be analyzed by GUEB and UAFDetector in vulnerability detection, respectively. The results show that UAFDetector can detect vulnerabilities by analyzing only some of the functions in the program in detail because it focuses only on use-after-free vulnerability-related functions (DPCFs, DPDFs, DPUFs and CFFs) and ignores other functions. Moreover, the number of functions to be analyzed by UAFDetector is approximately 70.8% lower than that by GUEB. The reason is that UAFDetector uses function summaries instead of the in-lining technique used by GUEB to achieve inter-procedural analysis. As shown in the example in Fig. 5, the in-lining method needs to repeatedly analyze functions that are called multiple times, which results in considerable redundant overhead. The function summary method used by our approach does not have this limitation. In this method, all function calls are replaced by the summaries of the callee functions. Thus, functions are analyzed only once, even if they are called many times.
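The difference between the two inter-procedural strategies can be made concrete by counting function analyses on a toy call graph. The call graph and function names below are hypothetical and acyclic, and the counts illustrate the redundancy argument only; they are not related to the figures in Table 5.

```python
# Hypothetical acyclic call graph: caller -> list of call sites (callees).
CALLS = {
    "main": ["parse", "render", "cleanup"],
    "parse": ["release"],
    "render": ["release"],
    "cleanup": ["release", "release"],
    "release": [],
}


def inlined_analyses(fn):
    """In-lining: every call site re-analyzes the body of its callee."""
    return 1 + sum(inlined_analyses(callee) for callee in CALLS[fn])


def summarized_analyses(root):
    """Summaries: each reachable function is analyzed once and then reused."""
    summaries = {}

    def summarize(fn):
        if fn in summaries:
            return  # reuse the existing summary at this call site
        for callee in CALLS[fn]:
            summarize(callee)
        summaries[fn] = f"summary({fn})"

    summarize(root)
    return len(summaries)


inlined = inlined_analyses("main")
summarized = summarized_analyses("main")
print(f"in-lining: {inlined} analyses, summaries: {summarized} analyses")
```

In this example, `release` is analyzed four times under in-lining but only once with summaries, and the gap widens as shared callees sit deeper in the call graph.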

VI. LIMITATION AND FUTURE WORK
Although UAFDetector has been shown to detect use-after-free vulnerabilities in large-scale binaries, it still has important limitations. We briefly describe the limitations and applicable conditions of our system.
First, UAFDetector relies on static analysis alone and has no dynamic analysis capability. Therefore, its analysis results are not completely correct. As discussed in Section V-B, the incomplete pointer alias analysis and the imprecise loop handling may cause false negatives. In addition, UAFDetector uses a path-insensitive analysis, which may cause false positives.
Second, we employ some existing tools to implement UAFDetector, such as IDA Pro, BinNavi and CFGConstructor. The prototype depends on the assumption that the employed tools convert the binary code into the intermediate representation and obtain the CFGs correctly. However, this assumption is not always satisfied. For example, code transformation and code obfuscation methods may prevent IDA Pro from disassembling the binary or obtaining the CFGs correctly. In addition, IDA Pro may fail to identify function boundaries in some cases. These are fundamental problems that affect many binary analysis applications, such as code clone detection [25], [26] and malware detection [27]- [29]. Our prototype cannot detect the vulnerabilities correctly in the above cases.
In addition, the impact of using different compilers or compilation settings on binary analysis tools is often discussed. Many existing binary analysis tools rely on assumptions about specific compilers and compilation settings [25]. UAFDetector carries out the analysis on the intermediate representation and focuses on the semantics of the binaries rather than the structure of code. Therefore, different compilers or compilation settings do not impact our system in general.
Although our static analysis approach can provide good results from a scalability point of view, it is not as accurate as dynamic approaches. Therefore, our analysis is suitable as a first-level vulnerability detection step. Our results, which contain the sets of program locations involved in use-after-free vulnerabilities, can provide basic knowledge for further vulnerability detection. For example, our approach can point out which parts of the target program should be stressed by popular fuzzing techniques.
The approach will be further improved in our future work. First, a lightweight local path-sensitive analysis could be added to reduce the number of false positives. Second, we will consider more patterns of pointer aliases, such as indirect aliases [24], to improve the completeness of the alias analysis and reduce the number of false negatives. In addition, a fuzzing technique will be used as a further analysis step to filter out false positives.

VII. RELATED WORK
This section summarizes the related research on protecting against use-after-free vulnerabilities from two aspects: vulnerability detection and vulnerability mitigation.

A. USE-AFTER-FREE DETECTION
Most existing use-after-free detection solutions depend on dynamic analysis. CETS [30] inserts a runtime check when the program is compiled. When a pointer is referenced, CETS checks whether the object pointed to by this pointer is still allocated to find the dangling pointer. However, its prototype implementation lacks robustness, so a large number of complex programs cannot be compiled with it. AddressSanitizer [13] is a popular runtime detection tool. It can also dynamically detect the illegal use of pointers in programs. However, like CETS, AddressSanitizer uses compile-time instrumentation and requires the source code, which limits the practical application of the tool. Valgrind [12] and Purify [11] detect vulnerabilities by checking whether the dereferenced pointer points to valid memory. This approach is unable to detect dangling pointers that point to an object that has reused the memory. To address this problem, Undangle [7] uses a dynamic analysis approach called early detection to protect against use-after-free vulnerabilities. This method combines taint analysis and pointer tracking to effectively identify unsafe pointers that are created but not used, which improves the completeness of vulnerability detection. However, this method relies on an execution trace analysis, which has a high performance overhead and cannot analyze large-scale programs. Clause et al. [14] present a dynamic technique for detecting invalid memory access. Their approach taints both the object and the corresponding pointer using the same taint mark. The taint marks are propagated and checked every time a pointer is referenced. If the taint marks of the object and the pointer differ, then the illegal access is reported. The approach is able to work on binary code but requires hardware support to achieve an efficient taint analysis.
Although dynamic analyses achieve high detection accuracy and produce few false alarms, they require inserting runtime checks and incur high runtime and memory overheads. In addition, it is difficult to generate inputs that execute vulnerable paths, which leads to low code coverage in dynamic analyses.
Static analyses do not suffer from the above limitations. However, there are very few studies on static analyses for use-after-free vulnerability detection. UAFChecker [1] uses classic static analysis techniques, including taint analysis and symbolic execution, to detect use-after-free vulnerabilities in C/C++ code. The paper does not discuss the performance overhead of symbolic execution and taint analysis, and the method is not suitable for binary code. Tac [31] is a machine learning-guided static use-after-free vulnerability detection framework. It learns the correlations between program features and use-after-free vulnerability-related aliases by using a support vector machine and exploits this knowledge to improve the precision of the alias analysis. Nevertheless, the approach is not yet sound and requires a large number of labeled training samples. The most important limitation of the above methods is that they can only analyze programs whose source code is available. However, the source code for a large number of applications is unavailable.
The closest approach to ours is GUEB [16], which uses static analysis to detect use-after-free vulnerabilities in binary code. The solution uses a dedicated value set analysis to track heap operations and address transfers. Then, it exploits these results to statically identify use-after-free vulnerabilities and extract the subgraph for each vulnerability. The main difference between this solution and ours is that GUEB uses a naive in-lining technique for inter-procedural analysis. The functions that are called many times need to be analyzed repeatedly, which results in a high overhead for analyzing large programs. UAFDetector uses function summaries instead of an in-lining technique to improve the efficiency and scalability of vulnerability detection.

B. USE-AFTER-FREE MITIGATION
Instead of detecting use-after-free vulnerabilities, some studies focus on how to prevent the exploitation of vulnerabilities.
Cling [32], Diehard [33] and Dieharder [34] are safe memory allocators that are designed to make the exploitation of use-after-free vulnerabilities harder. These allocators restrict memory reuse by using more address space or randomizing the memory allocation. These solutions can effectively prevent exploitation of use-after-free vulnerabilities and have acceptable overhead. However, attackers can bypass these mitigations by using "heap spraying" or "heap feng shui"-like attacks.
DangNULL [4], FreeSentry [35] and DangSan [36] prevent the exploitation of use-after-free vulnerabilities by using pointer invalidation. These solutions insert runtime checks during compilation to track per-object pointers and invalidate the pointers once the object is freed. When the dangling pointer is referenced, pointer invalidation crashes the program to prevent the attacker from exploiting the vulnerability. Despite a number of optimizations, the approach still has a high performance and memory overhead and is not widely applied in practice.
Overwriting virtual table pointers is the most widely used technique to exploit use-after-free vulnerabilities. VTGuard [37], SafeDispatch [38] and VTV [39] protect against this exploitation technique to mitigate use-after-free vulnerabilities. These mitigations have low performance overhead because they only focus on protecting the virtual tables. However, these approaches are ineffective when attackers target other pointers rather than virtual table pointers.
Many vulnerabilities are caused by incorrect memory management by programmers using C/C++. Therefore, some efforts have been made on safe languages to avoid known memory corruption vulnerabilities by modifying language constructs. These languages use garbage collection instead of explicitly freeing memory to reduce the harm of dangling pointers. There are a number of safe languages [40], [41] that are as close to the C/C++ language as possible. Although they try to maintain compatibility with C/C++ programs, translating existing projects into these languages still requires a lot of effort.

VIII. CONCLUSION
Use-after-free vulnerabilities caused by dangling pointers are an increasingly serious threat to computer systems. While a number of mitigations have been proposed to address this problem, few of them are sufficiently practical for large-scale programs. In this paper, we propose a scalable static approach that combines CFG construction, alias analysis, function summaries and pointer tracking to detect use-after-free vulnerabilities in binary code. Our approach uses function summaries instead of the in-lining technique for the inter-procedural analysis and thus avoids the repeated analysis problem of existing approaches. We have implemented a prototype called UAFDetector to detect use-after-free vulnerabilities in binary code and evaluated it using standard benchmarks (JTS) and real-world programs.
The experimental results show that our approach can effectively find use-after-free vulnerabilities. UAFDetector achieves a low false negative rate (2.39%) and a zero false positive rate in the JTS. Moreover, despite some false positives and false negatives, UAFDetector finds most known vulnerabilities and one unknown vulnerability in the real-world programs tested. Compared with the existing static analysis solution GUEB for detecting use-after-free vulnerabilities, our approach can reduce the redundant overhead caused by repeated analysis. In our experiments, the number of functions to be analyzed by UAFDetector is approximately 70.8% lower than that by GUEB. Therefore, our approach has a lower overhead and better scalability for detecting use-after-free vulnerabilities in large-scale programs.