Interactive Image Inpainting of Large-Scale Missing Region

Image inpainting, the reconstruction of damaged regions in an image, is a challenging problem in photography and is increasingly important for valuable artwork. The damage is mostly caused by scratches and wear, so it cannot easily be repaired physically. Many sophisticated methods have therefore been proposed for restoring a damaged image to one that resembles the original. However, these methods do not solve the problem effectively when the missing region is large. In this paper, we focus on restoring a large missing region. Our algorithm is composed of two steps: structure propagation and color propagation. In structure propagation, we segment a large (non-homogeneous) region into several small (homogeneous) regions based on the salient structure of the missing region. Then, in color propagation, we apply a simple pixel-based inpainting method, the Fast Marching Method (FMM), to fill in the missing homogeneous regions. In the experiments, we test several kinds of large missing regions, both regular and irregular. The results show that our proposed method performs well in various conditions.


I. INTRODUCTION
Images and artworks have long been preserved by various technologies. Spots and scratches cause damage that is not easily removed. Such works are therefore converted into digital images so that they can be restored toward their original appearance. Image inpainting is one of the techniques for restoring damaged images. Previous methods work well if the damaged region is small. However, if the missing region is too large, the result is blurry.
In general, a missing region can be filled in manually with interactive tools such as Adobe Photoshop. Since such tools require accurate work by a skilled professional artist to complete the restored image smoothly, they are very challenging for beginners.
Qureshi et al. [32] described three kinds of traditional image inpainting methods: Partial Differential Equation (PDE)-based, pixel-based, and exemplar-based. The PDE-based method uses texture information in isotropic, anisotropic, linear, or nonlinear directions, and diffuses local structure information from known to unknown regions. It is employed for long, narrow areas such as lines and cracks because it gives more reliable accuracy there. The pixel-based method extracts information from the pixels around an unknown region and then fills in the damaged part. However, it does not suit large unknown texture regions: it works well only near the boundary of the unknown area, so the center remains blurry. In contrast to the PDE-based and pixel-based methods, the exemplar-based method matches similar patches between known and unknown regions. It is suitable for restoring large unknown texture regions. Unfortunately, unwanted artifacts from disconnected edges and inconsistent texture areas become visible under some conditions.
The associate editor coordinating the review of this manuscript and approving it for publication was Wei Liu.
Bertalmio et al. [1] and Oliveira et al. [3] proposed PDE-based image inpainting methods that propagate information into an unknown region from its surrounding pixels along the isophote direction. They used isophotes, traced as contours, to find the direction of minimum change. The information of the known region is then spread into the unknown region to maintain the continuity of linear structures.
Telea [5] employed a pixel-based method that propagates pixel information from the surroundings of the unknown region along the image gradient to improve accuracy. It is a fast and straightforward method for restoring small and homogeneous regions.
Reconstructing a large missing region is more challenging than reconstructing a small one because it must deal with non-homogeneous image information. The exemplar-based method is an alternative for reconstructing large missing regions. Drori et al. [2] estimated a large missing region and filled it in using robust fragments. A fragment is selected from the most similar and consistent examples of the known region at different radius scales, so performance depends on the number of available fragments. However, the method produces blurred artifacts if the missing region lies on object boundaries. Based on the concept of Drori et al. [2], Criminisi et al. [4] proposed an improved version in which the gradient at the unknown region determines the inpainting order, which reduces the running time.
Learning-based methods have also been applied to image inpainting, considering two aspects: image generation and structured prediction. For image generation, Nazeri et al. [7] used an edge generator followed by image completion. Edge information is essential for the initial outline of the objects; image completion then fills in the unknown region. Structured prediction models the structural similarity between known and unknown regions. Yang et al. [6] proposed multi-scale neural patch synthesis, which retains the structure and matches patches with the most similar features to produce the high-frequency details of unknown regions.
This paper designs a novel, fast, and unsupervised algorithm for image inpainting with a large missing region. First, our method segments a large missing region (non-homogeneous) into several small missing regions (homogeneous) by using structure propagation (the initial process of Sun et al. [8]). Then, we apply a simple pixel-based method, FMM [5], to fill in the pixel intensities of the unknown homogeneous regions. These two existing methods are not simply applied one after the other; the basic idea of our approach is how to find homogeneous regions. Our contributions are threefold: 1) An interactive and simple image inpainting technique based on traditional methods. 2) Segmentation of the large missing region (non-homogeneous) into several small missing regions (homogeneous). 3) Completion of large missing regions using a fast method that reduces the running time. This paper is organized as follows. After reviewing related work (Section II), we detail our proposed method (Section III). Next, we show the experimental results (Section IV) and discuss the implementation (Section V). Finally, we present the conclusion and future work (Section VI).

II. RELATED WORK
Most researchers have used exemplar-based methods to fill in large missing regions because they effectively generate a restored region by sampling and copying information from known areas. These methods use a preference function that first selects a patch of the unknown region and then finds a patch with similar details in the known regions. Exemplar-based methods can only be applied to images that have a simple structure and texture. Criminisi et al. [4] were the first to propose a patch-based (exemplar-based) method for recovering large missing regions by simultaneously reconstructing structure and texture. This approach employs a best-first filling strategy: a patch that lies on a strong edge and is surrounded by high-confidence pixels is given the highest priority. The best-matching patches of the known region are then propagated via structure and texture diffusion to fill in the missing patch with similar information. However, this method cannot reliably restore the structure and texture of large missing regions.
Many methods have been proposed to improve on Criminisi et al. [4] in restoring large missing regions. The improved approaches fall into two groups. The first group focuses on improving texture reconstruction [10]-[13]. Cheng et al. [10] generalized the priority function of [4] to provide more robust performance; it gives a higher weight to the data term when selecting the best matching patch from known regions. Hesabi and Mahdavi-Amiri [13] presented a modified patch measure that compares the fill-front and candidate patches to analyze the information of surrounding pixels. However, Cheng et al. [10] and Hesabi and Mahdavi-Amiri [13] could not handle the complex texture of unknown regions. The second group aims to restore the structure from the known region to the unknown region more accurately [12], [14]-[18]. Anupam et al. [12] proposed a technique that segments a large missing region into small regions and then searches for the optimal patches in the known region based on the minimum Mean Squared Error (MSE).
Sun et al. [8] presented structure propagation, which is formulated as an optimization problem that enforces structural consistency. Like Anupam et al. [12], it segments a large missing region into several small missing regions. After structure propagation is completed, it uses an exemplar-based method [4] to fill in the small unknown regions. Segmenting a large missing region and then filling in the remaining areas with an exemplar-based method is an effective idea. Nonetheless, artifacts remain visible because the exemplar-based method cannot find well-fitting patches when the texture is complicated and the unknown region is large relative to the known region.
Efros and Leung [9] employed a non-parametric model based on the assumption of spatial locality. This method fills in missing regions pixel by pixel. All pixels in a rectangular window around a given pixel are used as samples. To synthesize an unknown pixel, the method first finds all neighborhoods in the known region that are similar to the neighborhood of the unknown pixel. The center of a selected sample then becomes the newly synthesized pixel. Telea [5] also presented a simple and fast pixel-based inpainting method. This method propagates pixel gradients from the known region to fill in unknown areas. It considers three factors: a directional component, a geometric distance component, and a level set distance component. These factors determine how to paint a pixel by spreading color from a small known neighborhood around the unknown region. However, the method produces a large blurry region when recovering large unknown areas, especially in the center, and works well only near the boundary of the unknown region.
More recently, learning-based methods have been applied to inpainting problems and have obtained acceptable results [19]-[25]. Yu et al. [20] and Jo and Park [22] generated images with an interactive technique that uses user sketches as guidance to produce results closer to what the user wants. These methods have shown that such systems can restore large regions in one pass and obtain realistic results. Zheng et al. [24] and Zeng et al. [23] trained models for both regular and irregular regions. However, training models for irregular regions takes about twice as long as for regular regions.
All of the learning-based methods can restore large missing regions. Nonetheless, the trained models cannot effectively transmit information to the inner missing areas and thus produce a large blurry region between an object and its boundary edge. A learning-based method also needs considerable time to train a model, whereas an unsupervised method reaches the final restored image faster.
The goal of our proposed method is to restore a large missing region with a fast and straightforward unsupervised technique, as shown in Figure 1. We improve on the results of previous methods that employ only structure propagation [8] or only FMM [5]. Our work uses user interaction to segment a large missing region (non-homogeneous) into several small missing regions by structure propagation [8]. Because the small missing regions are homogeneous, we employ FMM [5] to fill in the unknown regions based on the information of their surrounding pixels. As a result, our method performs well compared with other methods under various conditions, including regular and irregular missing regions. However, our approach fails when the missing part lies at the peak of a curved object, because it depends only on the sample patch information along a user-defined curve in the known region. Our method therefore does not work well for a missing region whose structure differs significantly from the known structure, as seen in Figure 2.

III. OUR PROPOSED METHOD
In this section, we introduce the concept of our proposed method: (1) a description of the algorithm as a simple flowchart, (2) segmentation of the large missing region into several small missing regions by structure propagation [8], (3) filling of the small regions with the pixel-based method FMM [5], and (4) a detailed explanation of our proposed method.

A. FLOWCHART
The flowchart in Figure 3 gives a short overview of our method. In structure propagation, the input is an image with a large missing region. The user is asked to draw curves or lines manually to mark the salient structures in the large missing region. Based on these user-defined curves, the method segments the region into several small missing regions using structure propagation.
56432 VOLUME 9, 2021
The output of structure propagation will be input for color propagation. In color propagation, small missing regions will be filled in by spreading the surrounding pixel information from known to unknown regions using FMM. For more details, this method will be explained in section III-D.

B. STRUCTURE PROPAGATION
The problem of structure propagation is how to estimate a structure along a user-defined curve C in an unknown region by spreading patches from the known region. According to the theory of structure propagation [8], this process evaluates the similarity between unknown and known patches using global optimization. It also considers the consistency between two adjacent patches along the user-defined curve, measured on their overlapped regions by the Sum of normalized Squared Differences (SSD).
Dynamic programming [26] computes the minimal cost among the known patches. It accumulates the minimum cost from patch center x_1 to x_L along the curve in the unknown region, over all possible labels, to obtain the optimal unknown labels [8]. Following [8], the recurrence can be written as

$$M_{i,i+1}(x_{i+1}) = \min_{x_i} \left\{ E_1(x_i) + E_2(x_i, x_{i+1}) + M_{i-1,i}(x_i) \right\}, \quad i = 1, \dots, L-1,$$

where M_{i,i+1} is the cumulative minimal cost passed from patch center i to patch center i+1, M_{0,1} = 0, L is the current patch center, and the curve carries M patch centers in total. E_1 measures the similarity between unknown and known patches:

$$E_1(x_i) = w_s \, E_S(x_i) + w_i \, E_I(x_i),$$

where w_s and w_i are the related weights. E_S(x_i) represents the structure similarity on the curve C at each point p_i of the unknown region. E_I(x_i) describes the similarity between patches on the boundary of the unknown region and the known patches. E_2(x_i, x_j) measures the consistency between two adjacent patches. E_1(x_i) and E_2(x_i, x_j) are calculated as the Sum of normalized Squared Differences (SSD) between the overlapped regions.
Finally, the converged optimal label of patch L is computed by

$$x_L^{*} = \arg\min_{x_L} \left\{ E_1(x_L) + M_{L-1,L}(x_L) \right\}.$$
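The dynamic-programming step described above can be illustrated with a minimal sketch. It assumes that the costs E_1 and E_2 have already been evaluated (for example by SSD) and are supplied as a table and a function; the names `propagate_labels`, `unary`, and `pairwise` are hypothetical and do not come from [8] or from our implementation.

```python
def propagate_labels(unary, pairwise):
    """Pick one candidate patch per curve point with minimal total cost.

    unary[i][k]        : assumed E_1 cost of candidate patch k at curve point i
    pairwise(i, j, k)  : assumed E_2 cost between candidate j at point i-1
                         and candidate k at point i
    Returns (labels, total_cost).
    """
    cost = list(unary[0])          # cumulative cost at the first curve point
    back = []                      # back[i-1][k] = best predecessor of k at point i
    for i in range(1, len(unary)):
        new_cost, choice = [], []
        for k in range(len(unary[i])):
            # best predecessor for candidate k at point i
            best = min(range(len(cost)),
                       key=lambda j: cost[j] + pairwise(i, j, k))
            new_cost.append(unary[i][k] + cost[best] + pairwise(i, best, k))
            choice.append(best)
        cost = new_cost
        back.append(choice)
    # optimal label at the last point, then trace the predecessors backwards
    k = min(range(len(cost)), key=cost.__getitem__)
    total = cost[k]
    labels = [k]
    for choice in reversed(back):
        k = choice[k]
        labels.append(k)
    labels.reverse()
    return labels, total
```

With three curve points and two candidate patches per point, a call such as `propagate_labels([[1, 5], [4, 1], [1, 4]], lambda i, j, k: 0 if j == k else 3)` trades the per-point cost against the penalty for switching candidates between neighbors. The cost of this sketch grows linearly with the number of curve points and quadratically with the number of candidates.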

C. FAST MARCHING METHOD (FMM)
The Fast Marching Method is a numerical method for solving boundary value problems, proposed by Sethian [27]. It computes the position of a propagating front. Following Telea [5], FMM can also be used for pixel-based image inpainting. The method uses the arrival time T, which acts as a distance map from each unknown pixel to the boundary ∂Ω. Borgefors [28], [29] and Meijster et al. [30] proposed the distance transform (DT) to compute the distance map T, which is similar to FMM. However, FMM has the advantage of maintaining a narrow band, an ordered collection of pixels that separates the known and unknown regions and determines which pixel should be inpainted next. The pixels of the narrow band form the inpainting boundary ∂Ω. Every pixel has a T value, a pixel value I, and a flag f. The flag takes one of three values: BAND, KNOWN, or UNKNOWN.
Since FMM-based image inpainting treats the missing region as a boundary problem, it can lead to more robust results: information is propagated from the closest known pixels, and the method determines which pixel should be processed next. FMM calculates a weighting function w(p, q) that determines how the unknown pixel p is filled in by propagating the values of known pixels q. This function combines a directional component dir(p, q), a geometric distance component dst(p, q), and a level set distance component lev(p, q).
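As an illustration of the weighting function, the sketch below follows the definitions given by Telea [5], where w(p, q) is the product of the three components. The function names, the default constants `d0` and `t0`, and the use of a zeroth-order weighted average (the gradient term of [5] is omitted) are simplifying assumptions for this example.

```python
import math

def telea_weight(p, q, normal_p, t_p, t_q, d0=1.0, t0=1.0):
    """w(p, q) = dir(p, q) * dst(p, q) * lev(p, q), following Telea [5].

    p, q      : (x, y) coordinates of the unknown and the known pixel
    normal_p  : front normal at p (gradient of T), as a unit vector
    t_p, t_q  : T values (distance to the boundary) at p and q
    d0, t0    : reference distances, assumed to be one pixel here
    """
    dx, dy = p[0] - q[0], p[1] - q[1]
    dist = math.hypot(dx, dy)
    direction = (dx * normal_p[0] + dy * normal_p[1]) / dist  # dir(p, q)
    geometric = (d0 * d0) / (dist * dist)                     # dst(p, q)
    level_set = t0 / (1.0 + abs(t_p - t_q))                   # lev(p, q)
    return direction * geometric * level_set

def inpaint_value(p, normal_p, t_p, known):
    """Weighted average of known pixel values around p.

    known: list of (q, value, t_q) for known pixels inside the radius.
    Simplification: only pixel values are averaged, not their gradients.
    """
    num = den = 0.0
    for q, value, t_q in known:
        w = telea_weight(p, q, normal_p, t_p, t_q)
        num += w * value
        den += w
    return num / den
```

Known pixels that lie along the front normal, close to p, and on a similar level set receive the largest weights, which is what lets the color follow the direction of propagation.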

D. THE PROPOSED METHOD
In this section, we give more details of our proposed method, as shown in Algorithm 1. We are given an image I that has a large missing region. First, we segment the large missing region (non-homogeneous) into several small missing regions (homogeneous) using structure propagation [8]. Since the small missing regions are homogeneous, we use a simple image inpainting method, FMM [5], to repair them, which gives good results. The purpose of structure propagation is to segment the large missing region into small missing regions that are easier to analyze, as shown in Figure 4. We employ user interaction to initialize a user-defined curve C from the known region into the unknown region Ω (red lines). Suppose that {x_i}, i = 1, ..., M, are the patch center points along curve C in the unknown region Ω. These center points build a simple graph G = {V, E}, where V is the vertex set of the M points and E is the edge set connecting adjacent points on curve C. The patches of the known region along C are denoted by Y. Since the graph is simple, the optimal patches of Y are determined by dynamic programming and then propagated to the unknown region. Thus, we can segment the salient structures of a large missing region into small missing homogeneous regions (Ω_1, Ω_2, Ω_3, and Ω_4), as shown in Figure 4(d).

Algorithm 1 Our Proposed Method
Structure Propagation
Data: An image with the large missing region (non-homogeneous).
Result: An image with the small missing regions (homogeneous).
• Initialize a user-defined curve C from the known region into the unknown region Ω.
• Spread a set of patches Y from the known to the unknown region along curve C at each patch center point x_i.
• Determine the optimal label of patches Y(x_i) by using dynamic programming.
Color Propagation
Initialization
• Assign a flag f to all pixels of the image (KNOWN, BAND, or UNKNOWN).
• Store the boundary pixels in the narrow band as BAND, sorted in ascending order of T values.
Propagation
while narrow band != empty do
    Select the pixel (x, y) with the smallest T value in the narrow band
    Change the flag of (x, y) to KNOWN
    for each neighbor (a, b) of (x, y) do
        if (flag(a, b) = UNKNOWN) then
            Set flag(a, b) to BAND;
            Propagate neighbor pixels of a known region inside radius r;
            Update the narrow band by inserting (a, b), keeping the minimum T value first;
        else
            Continue;
        end
    end
end
Color propagation works simultaneously on every small missing region. FMM fills in each small missing region pixel by pixel, which improves on the results of previous methods. Consider a pixel p on the boundary ∂Ω of the unknown region, as shown in Figure 5(a). To fill in p, we use the information of the known pixels inside the radius area q(p). The propagation is based on the weighting function and is assisted by the four neighboring pixels, as shown in Figure 5(b). Suppose that p_1 is an initial pixel position on the boundary of an unknown region. The order of processing is determined by the minimum distance between p_1 and its adjacent boundary pixels p_i, so the closest unknown pixel is filled in first. This procedure is repeated for all boundary pixels until no unknown pixels remain.
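The propagation loop of Algorithm 1 can be sketched as follows. This is a simplified illustration rather than our implementation: T is approximated by the number of steps from the boundary instead of solving the Eikonal equation, the fill value is an inverse-squared-distance average of KNOWN pixels inside the radius (the direction and level set terms are omitted), and the names `fill_region`, `image`, and `mask` are hypothetical.

```python
import heapq

KNOWN, BAND, UNKNOWN = 0, 1, 2
NEIGHBORS = ((-1, 0), (1, 0), (0, -1), (0, 1))  # four-connected neighbors

def fill_region(image, mask, radius=2):
    """Fill masked pixels from the boundary inwards.

    image: list of rows of floats; mask[y][x] is True where a pixel is missing.
    Returns a filled copy; the input image is not modified.
    """
    h, w = len(image), len(image[0])
    out = [row[:] for row in image]
    flag = [[UNKNOWN if mask[y][x] else KNOWN for x in range(w)] for y in range(h)]
    band = []  # narrow band as a min-heap ordered by T

    # Initialization: known pixels that touch the hole form the narrow band (T = 0).
    for y in range(h):
        for x in range(w):
            if flag[y][x] == KNOWN and any(
                    0 <= y + dy < h and 0 <= x + dx < w and mask[y + dy][x + dx]
                    for dy, dx in NEIGHBORS):
                flag[y][x] = BAND
                heapq.heappush(band, (0.0, y, x))

    # Propagation: always process the band pixel with the smallest T.
    while band:
        t, y, x = heapq.heappop(band)
        flag[y][x] = KNOWN
        for dy, dx in NEIGHBORS:
            a, b = y + dy, x + dx
            if not (0 <= a < h and 0 <= b < w) or flag[a][b] != UNKNOWN:
                continue
            num = den = 0.0
            for j in range(max(0, a - radius), min(h, a + radius + 1)):
                for i in range(max(0, b - radius), min(w, b + radius + 1)):
                    d2 = (j - a) ** 2 + (i - b) ** 2
                    if 0 < d2 <= radius * radius and flag[j][i] == KNOWN:
                        num += out[j][i] / d2
                        den += 1.0 / d2
            out[a][b] = num / den       # (y, x) is KNOWN and adjacent, so den > 0
            flag[a][b] = BAND
            heapq.heappush(band, (t + 1.0, a, b))  # assumed step-count T
    return out
```

Because the structures restored by structure propagation are treated as known pixels, one pass over the remaining mask fills every small region from its own boundary, which is one way to realize the simultaneous processing described above.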

IV. EXPERIMENTS
The experiments on the following dataset indicate that our approach performs better than other image inpainting methods. The dataset is available at https://github.com/openimages/dataset, and each image is roughly 500 × 500 pixels.
The experiments employ two kinds of missing regions [19]: regular (rectangular holes) and irregular (randomly shaped holes). We apply these missing regions to images that contain line and curve objects. The sizes of the regular regions are 50×50, 100×100, and 150×150 (height×width), as shown in Figure 6. The irregular regions have widths of 30, 50, and 75 pixels, as shown in Figure 7. The results of our method are compared with those of previous methods: the original FMM [5], structure propagation and texture propagation (SP-TP) [8], and EdgeConnect [7]. We consider both visual and quantitative comparisons to validate the measurements.

A. VISUAL COMPARISON
As shown in Figures 9, 10, 11, and 12, the original FMM [5] generates artifacts and disconnects the salient structure of the missing region. FMM has more difficulty recovering large missing regions than small ones because it only propagates information from the nearest neighboring pixels of the missing region.
Moreover, we compare our results with those obtained from SP-TP [8]. The results show that our method is slightly better than SP-TP. Like our method, SP-TP uses structure propagation to restore the complex salient structure of the missing region. However, SP-TP uses exemplar-based propagation to fill in the missing regions, which depends on their size. If a missing part is large, the method takes a large exemplar from the known regions to restore it. The propagation therefore fails when no matching exemplar for the missing region exists in the known regions.
We also compare with a deep learning method, EdgeConnect [7]. This method produces a large blurry region and disconnects the salient structure of the missing region because the trained model cannot effectively propagate information to the inner part of the missing region. The edge generating model sometimes fails to depict the boundaries accurately in a large missing region.

B. QUANTITATIVE COMPARISON
Many evaluation metrics can be used to compute the accuracy of image processing outcomes. However, image inpainting cannot use simple traditional metrics such as Mean Squared Error (MSE) and Peak Signal to Noise Ratio (PSNR) because they are not well correlated with perceptual quality assessment [31], [32].
To evaluate our method, we use three types of Image Inpainting Quality Assessment (IIQA) metrics: Perceptual SSIM [34], Perceptual MSSSIM [35], and Perceptual FSIM [33]. A perceptual metric includes two evaluations: (1) an objective metric computed from an equation, and (2) a subjective assessment, that is, human judgments of perceived image quality. The objective metric should be statistically consistent with the subjective evaluation.
In our experiment, six observers with normal or corrected vision carried out the subjective evaluation. The observers rate one image at a time and give it a score. The scores that observers can select are ''not noticeable'', ''noticeable'', ''not acceptable'', ''acceptable'', ''salient'', ''very salient'', ''good'', or ''perfect'', which correspond to the values 1 to 8, respectively.

1) PERCEPTUAL SSIM COMPARISON
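Structural Similarity Index Measure (SSIM) evaluates the similarity of two images in three aspects: brightness, contrast, and structure. The range of SSIM values is [0, 1]. A higher SSIM value represents more similarity between the two images.

$$\mathrm{SSIM}(X, Y) = [l(X, Y)]^{\alpha} \, [c(X, Y)]^{\beta} \, [s(X, Y)]^{\gamma},$$

where l, c, and s are the brightness, contrast, and structure comparison terms, and α, β, and γ are the parameters that weight brightness, contrast, and structure. If we set α = β = γ = 1, then SSIM is given by

$$\mathrm{SSIM}(X, Y) = \frac{(2\mu_x \mu_y + C_1)(2\sigma_{xy} + C_2)}{(\mu_x^2 + \mu_y^2 + C_1)(\sigma_x^2 + \sigma_y^2 + C_2)},$$

where µ_x and µ_y are the average grayscale values of images X and Y, σ_x and σ_y are the standard deviations of images X and Y, and σ_xy is the covariance between images X and Y. C_1 and C_2 are small constants that prevent the denominator from being zero.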
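A minimal sketch of the simplified SSIM expression (α = β = γ = 1) is given below. It computes a single global value over two grayscale images given as flat lists, whereas practical implementations usually average SSIM over local windows. The function name and the default constants, the commonly used values for 8-bit images, are assumptions and are not taken from our experiments.

```python
def ssim_global(x, y, c1=(0.01 * 255) ** 2, c2=(0.03 * 255) ** 2):
    """Global SSIM of two equally sized grayscale images (flat lists).

    c1, c2: assumed stabilizing constants for 8-bit intensity values.
    """
    n = len(x)
    mu_x = sum(x) / n
    mu_y = sum(y) / n
    var_x = sum((a - mu_x) ** 2 for a in x) / n            # sigma_x squared
    var_y = sum((b - mu_y) ** 2 for b in y) / n            # sigma_y squared
    cov = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / n  # sigma_xy
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator
```

Comparing an image with itself gives a value of 1, and the value decreases as the restored image departs from the original in mean intensity, contrast, or structure.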

2) PERCEPTUAL MSSSIM COMPARISON
Multi-scale SSIM (MSSSIM) constructs an image pyramid for the two images and then calculates the SSIM at each layer separately. The MSSSIM evaluation is obtained by combining the calculations at the different scales:

$$\mathrm{MSSSIM}(X, Y) = [l_M(X, Y)]^{\alpha_M} \prod_{j=1}^{M} [c_j(X, Y)]^{\beta_j} \, [s_j(X, Y)]^{\gamma_j},$$

where j is the scale index and M is the total number of scales. This equation is similar to the general SSIM form in Equation (3); α_M, β_j, and γ_j give the importance of the three components (brightness, contrast, and structure).
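The combination across scales can be sketched as below, assuming that the brightness term at the coarsest scale and the contrast and structure terms at every scale have already been computed. The function and argument names are hypothetical, and the exponents in the usage are placeholders rather than the values used in our evaluation.

```python
def ms_ssim(l_coarsest, contrast, structure, alpha_m, beta, gamma):
    """Combine per-scale SSIM terms into one multi-scale score.

    l_coarsest : brightness term at the coarsest scale M
    contrast   : list of contrast terms c_j, j = 1..M
    structure  : list of structure terms s_j, j = 1..M
    alpha_m, beta, gamma : exponents for the three components
    """
    score = l_coarsest ** alpha_m
    for c_j, s_j, b_j, g_j in zip(contrast, structure, beta, gamma):
        score *= (c_j ** b_j) * (s_j ** g_j)
    return score
```

For two scales with unit exponents, `ms_ssim(0.9, [0.8, 0.5], [1.0, 1.0], 1.0, [1.0, 1.0], [1.0, 1.0])` multiplies 0.9, 0.8, and 0.5, so a low contrast term at any single scale lowers the overall score.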

3) PERCEPTUAL FSIM COMPARISON
Feature Similarity Index Measure (FSIM) is based on the premise that the Human Visual System (HVS) perceives an image mainly through its low-level features. FSIM uses two kinds of features: Phase Congruency (PC) and Gradient Magnitude (G). S_PC is used as the primary feature; it is contrast invariant and also weights each pixel's contribution to the similarity of two images. S_G is used as the secondary feature to encode the contrast information. The two features capture complementary aspects of how the HVS evaluates image quality.

$$\mathrm{FSIM} = \frac{\sum_{x \in \Omega} S_L(x) \, PC_m(x)}{\sum_{x \in \Omega} PC_m(x)},$$

where Ω is the whole image spatial domain. S_PC and S_G are combined to obtain the similarity at each pixel, S_L(x) = S_PC(x) · S_G(x). Meanwhile, PC_m(x) = max(PC_1(x), PC_2(x)) weights the importance of S_L(x) in the overall similarity between the two images.
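The pooling step of FSIM can be sketched as follows, assuming that the phase congruency and gradient magnitude maps of both images are already available as flat lists. The per-pixel forms of S_PC and S_G and the constants `t1` and `t2` are taken from the standard FSIM definition and are assumptions here, since they are not stated in this paper; the function name is hypothetical.

```python
def fsim(pc1, pc2, g1, g2, t1=0.85, t2=160.0):
    """Pool per-pixel similarities into one FSIM score.

    pc1, pc2 : phase congruency maps of the two images (flat lists)
    g1, g2   : gradient magnitude maps of the two images (flat lists)
    t1, t2   : assumed stabilizing constants of the standard FSIM definition
    """
    num = den = 0.0
    for p1, p2, a, b in zip(pc1, pc2, g1, g2):
        s_pc = (2 * p1 * p2 + t1) / (p1 * p1 + p2 * p2 + t1)  # PC similarity
        s_g = (2 * a * b + t2) / (a * a + b * b + t2)         # gradient similarity
        pc_m = max(p1, p2)                                    # pixel weight PC_m(x)
        num += s_pc * s_g * pc_m                              # S_L(x) * PC_m(x)
        den += pc_m
    return num / den
```

Pixels with high phase congruency in either image dominate the score, so differences on salient structures count more than differences in flat areas.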

4) OUR PERFORMANCE
In total, 192 images are evaluated, with different missing region sizes, missing region types, and object types. Tables 1, 2, 3, and 4 report the mean evaluations for images of the same category.
As shown in Table 1, the accuracy of our proposed method is higher than that of the other methods when the missing region is larger, regardless of the objective metric used (SSIM, MSSSIM, or FSIM). When the missing region is smaller, the score of our method is similar to that of SP-TP, because a small missing region can be restored easily whether the propagation is pixel-based or exemplar-based. Table 2 shows that the rankings given by the objective and subjective metrics are consistent. The Spearman Rank Order Correlation Coefficient (SROCC) is used to measure the correlation between the ranks of two variables (the objective and subjective metrics). Its range is from −1 to +1. A value close to +1 represents a strong correlation, a value close to −1 represents a strong disagreement, and a value of zero means there is no correlation between the two variables. As shown in Table 3, the SROCC of our method is close to +1, which indicates a strong correlation between the objective and subjective metrics. The highest values are highlighted in boldface. Table 3 also indicates that our approach recovers large missing regions better in line objects than in curved objects, because structure propagation does not handle a salient structure whose shape changes frequently. Meanwhile, Table 4 shows that a regular missing region is restored better than an irregular one, because an irregular region has a random and more complicated pattern. This case causes a significant failure when spreading patches from the known to the unknown region, as shown in Figure 8.
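For reference, SROCC can be computed as the Pearson correlation of the rank values. The sketch below assumes average ranks for tied scores and at least two distinct values in each input; the function names are hypothetical and the numbers in the usage are made-up illustrations, not our measurements.

```python
import math

def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1          # mean of positions i..j, 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def srocc(objective, subjective):
    """Spearman rank order correlation between two score lists."""
    ra, rb = average_ranks(objective), average_ranks(subjective)
    n = len(ra)
    ma, mb = sum(ra) / n, sum(rb) / n
    cov = sum((a - ma) * (b - mb) for a, b in zip(ra, rb))
    var_a = sum((a - ma) ** 2 for a in ra)
    var_b = sum((b - mb) ** 2 for b in rb)
    return cov / math.sqrt(var_a * var_b)   # assumes non-constant inputs
```

For example, `srocc([0.1, 0.4, 0.9], [2, 5, 8])` evaluates to 1.0 because both lists order the three items identically, while reversing one list gives −1.0.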

C. TIME COMPARISON
In this section, we report the average running time for images of the same category. As shown in Table 5, EdgeConnect has the fastest running time. It is a learning-based method, so filling in the missing region at test time does not take long. However, training takes around two days.
SP-TP matches exemplars between unknown and known regions quickly, and it runs fast when the patches are uncomplicated. FMM needs a longer running time than the exemplar-based method because it recovers the region pixel by pixel based on the surrounding information of the unknown region. Our computational time is shorter than that of FMM but longer than that of SP-TP, because our method performs both structure propagation and color propagation, which are based on SP-TP and FMM. However, our visual and quantitative results are better than theirs.

V. DISCUSSION
In the experiments, we use an open-access dataset that contains many image categories. We place missing regions between foreground and background areas with different colors and textures. If the missing region is large, it is difficult to restore its center because little information is available from the known surroundings. The deep learning method does not work effectively for this problem because it produces a large blurry area in the center of the restored region. To solve this problem, we employ a traditional approach that uses structure propagation to segment a large missing region into several homogeneous regions with nearly uniform color and texture. After that, we apply a simple and fast pixel-based method, FMM, to complete all of the missing homogeneous regions simultaneously. Because FMM is fast and straightforward and needs no training, our approach takes less time overall than deep learning methods once their training is taken into account.
Moreover, we propose an unsupervised, traditional approach rather than a deep learning method because a traditional approach runs faster and is more straightforward to implement. A deep learning method needs a training process that takes a long time, at least 2-3 days, to build a model from many dataset images before an input image can be tested. A computer with high specifications is sometimes required. In contrast, our method takes only 3-32 seconds to run. It needs neither a large dataset for training nor a high-specification computer to restore missing regions. The results also show that our traditional approach performs better than image inpainting based on deep learning methods.
Most comparison results show that our method is better than the original FMM for both small and large missing regions. FMM is appropriate for recovering a small area because it works pixel by pixel, but it fails significantly when propagating information into a large missing region. SP-TP restores a large missing region using the exemplar-based method. However, if no suitable exemplar for the unknown region exists in the known regions, it cannot synthesize the desired structure and pixel information.
Considering image inpainting with different missing regions, we show that a regular missing region is restored better than an irregular one. An irregular region has a more complicated shape, which adds complexity when structure propagation is applied. Moreover, we also compare line and curve objects. Our proposed method gives a better result for a missing region in a line object than in a curve object, because structure propagation does not account for a structure whose shape changes frequently.

FIGURE 11. Irregular missing regions with different sizes (width = 30, 50, and 75 pixels, respectively).
FIGURE 12. Regular missing regions with different sizes (height×width = 50 × 50, 100 × 100, and 150 × 150, respectively).

Since structure propagation is used to restore large missing regions, it needs a user-defined curve from the known to the unknown region to determine the salient structure of the missing region. With such a curve, image inpainting works well and obtains better results. However, structure propagation with user interaction takes more time to process, and the user is required to draw the curve precisely. If the missing region is large, the user needs to analyze more structure areas in the known region. These areas are used to initialize the sample patches that are then propagated to the unknown region.

VI. CONCLUSION AND FUTURE WORKS
This paper has presented an interactive approach for restoring a damaged image with a large missing region using a fast and straightforward method. We use structure propagation to segment a large missing region into small homogeneous regions. Because these regions are homogeneous, we can employ a fast and straightforward method for color propagation, filling in the small missing regions using FMM. Our method performs well on various large missing regions, including regular and irregular shapes. Moreover, most comparisons show that our proposed approach is better than previous unsupervised methods and even than the learning-based technique.
In the future, we plan to implement an automatic process for structure propagation that gives better results than user interaction. We will also consider a simple texture propagation to produce a more natural-looking result. Furthermore, we will apply the method to images with complex content, which presents interesting opportunities.