Incomplete Gamma Kernels: Generalizing Locally Optimal Projection Operators

We present incomplete gamma kernels, a generalization of Locally Optimal Projection (LOP) operators. In particular, we reveal the relation of the classical localized $L_1$ estimator, used in the LOP operator for point cloud denoising, to the common Mean Shift framework via a novel kernel. Furthermore, we generalize this result to a whole family of kernels that are built upon the incomplete gamma function, each of which represents a localized $L_p$ estimator. By deriving various properties of the kernel family concerning distributional, Mean Shift induced, and other aspects such as strict positive definiteness, we obtain a deeper understanding of the operator's projection behavior. From these theoretical insights, we illustrate several applications ranging from an improved Weighted LOP (WLOP) density weighting scheme and a more accurate Continuous LOP (CLOP) kernel approximation to the definition of a novel set of robust loss functions. These incomplete gamma losses include the Gaussian and LOP loss as special cases and can be applied to various tasks including normal filtering. Furthermore, we show that the novel kernels can be included as priors into neural networks. We demonstrate the effects of each application in a range of quantitative and qualitative experiments that highlight the benefits induced by our modifications.


INTRODUCTION
Digital 3D scene models have become a crucial prerequisite for numerous applications in entertainment, advertisement, design, architecture, autonomous systems, and cultural heritage. In this context, the accurate digitization of real-world objects and scenes is of great relevance and offers new opportunities regarding a variety of tasks, including AR/VR-based inspection and the collection of realistic training data for tasks in robotics, autonomous driving, and aerial or satellite surveys. Aside from professional scanning campaigns with expensive laser scanning equipment, there has also been an increasing trend towards more practical scene capture with consumer-grade hardware, such as passive, purely image-based scene scanning using Structure-from-Motion and Multi-view Stereo approaches, or respective cheaper active time-of-flight depth sensors that have meanwhile even been integrated into numerous mobile devices. However, the use of passive scene scanning or active scanning based on cheap hardware with low sensor quality and low sensor resolution induces noise in both the capture process and the 3D reconstruction procedure. This results in noisy point clouds with a low number of points that might not preserve finer geometric details in the reconstruction, which, in turn, may lead to registration artifacts. Furthermore, the limited accessibility of capture conditions as well as occlusions induce holes in the reconstructed models. These challenges result in an increasing interest in robust surface reconstruction techniques capable of handling noise, outliers, registration artifacts, and missing data.
Among others, the Locally Optimal Projection (LOP) operator [1] has gained a lot of attention in recent years due to its benefit of not relying on a well-defined surface parametrization or a piecewise planar approximation and, meanwhile, there has been a whole series of further extensions of this approach [2], [3], [4], [5]. Furthermore, many learning-based approaches also aim at projecting the noisy data onto a (latent) denoised manifold [6], [7]. Therefore, investigations towards the unification of traditional approaches with their respective regularization techniques might be of great relevance for future learning-based approaches as well. Even further, traditional techniques such as Mean Shift clustering [8], [9], [10] become more and more relevant in modern deep learning methods. Besides their application to structure the latent space representation of the data within encoder-decoder approaches [11], there is even a direct relation between the Mean Shift approach and denoising autoencoders [12]. In particular, as the output of an optimal denoising autoencoder corresponds to the local mean of the true data density [13], the autoencoder loss can be interpreted as a Mean Shift vector [12]. However, to the best of our knowledge, this observation has not yet been explored in the context of surface reconstruction and denoising. Hence, relating traditional concepts to modern deep learning methods might not only lead to a more explainable behavior of the latter but also allow increasing the resulting performance. In turn, this relies on a better understanding of the relationships between previous (traditional) techniques.
In this paper, we investigate the theoretical relationship of projection-based surface reconstruction approaches with their respective properties and show that these are unified within the common Mean Shift framework.

[Fig. 1. Relation between LOP and Mean Shift at the example of the 2D Fish model. Minimizing the localized $L_1$ attraction energy with the Gaussian kernel $K_{\mathrm{Gaussian}}$ (left) results in the same trajectory $q^{(t)}$ as applying Mean Shift on a global kernel density estimate with the kernel $K_{\mathrm{LOP}}$ (right).]

In particular, the key contributions of our work are:

• We reveal the relation of the classical localized $L_1$ estimator used in LOP to the Mean Shift framework via a novel kernel $K_{\mathrm{LOP}}$ and introduce the family of incomplete gamma kernels $K_\Gamma$ as a generalization of this result where each kernel represents a localized $L_p$ estimator (see Section 3).

• We derive various properties of the kernel family concerning distributional, Mean Shift induced, and other aspects such as strict positive definiteness to obtain a deeper understanding of the operator's projection behavior (see Section 4).

• We demonstrate that leveraging the derived theoretical insights enables several applications including an improved Weighted LOP (WLOP) density weighting scheme, a more accurate Continuous LOP (CLOP) kernel approximation as well as the derivation of incomplete gamma losses, a set of novel robust loss functions (see Section 5).
In our evaluation, we demonstrate the benefits induced by our modifications in a range of quantitative and qualitative experiments. Furthermore, the theoretical insights of our investigations with their proven effect may also be of great relevance for future learning-based approaches.

RELATED WORK
In the following, we provide a review of geometric and learning-based denoising approaches. Furthermore, we also review seminal work regarding the theory and application of the Mean Shift framework due to its relationship to LOP approaches, which we will demonstrate later.

Geometric Denoising Approaches
Following early approaches such as the local fitting of tangent planes [14] or using radial basis functions [15], respective developments particularly focused on projection-based methods, sparsity-based methods, and non-local methods.

Projection-based Methods. These approaches rely on the assumption of an underlying smooth surface and the projection of noisy data points onto the estimated local surface. For this purpose, respective approaches apply moving least squares (MLS) based methods [16], [17], [18], [19], robust principal component analysis (RPCA) [20] and moving robust principal component analysis (MRPCA) [21], or locally optimal projection based operators, where the LOP operator [1] has been extended in terms of Weighted LOP (WLOP) [2], Feature LOP (FLOP) [3], Continuous LOP (CLOP) [5], Edge-Aware Resampling (EAR) [4] and a Gaussian mixture model inspired projection operator [22]. The latter has been demonstrated to be capable of resampling point clouds while preserving features due to the additional guidance of filtered normals.
Sparsity-based Methods. This class of approaches relies on the assumption that objects can be represented in terms of piecewise smooth surfaces with sparse features. Respective denoising techniques include $L_0$-norm [23], [24] and $L_1$-norm minimization [21], [25], [26], sparse dictionary learning [27] as well as patch-based or feature-based graph Laplacian regularization [28], [29], [30], graph-based point cloud denoising based on jointly leveraging geometry and color information [31], guided filtering based on normal information followed by an $L_1$-medial skeleton extraction to get the sharp structure of the surface [32] as well as leveraging gravitational feature functions [33]. In the context of denoising dynamic point clouds, Hu et al. [34] explored the temporal coherence of spatio-temporal graphs with respect to the underlying surface, where a respective manifold-to-manifold distance has been introduced. Furthermore, data-driven exemplar priors have been used for surface reconstruction [35], where the sparsity of local shapes from a collection of 3D objects has been explored.

Non-local Methods.
In contrast to the previous classes, these approaches rely on the assumption that geometric statistics are (approximately) shared by certain surface patches of a 3D model, i.e., local surface denoising is conducted based on collected neighborhoods with similar geometry [36], [37], [38], [39]. However, the definition of a suitable metric as well as the regular representation of local surface structures remain challenging. Furthermore, density-based point cloud denoising has been approached by first applying particle-swarm based optimization for kernel density estimation, followed by a Mean Shift clustering-based outlier removal and a final bilateral mesh filtering [40].

Learning-based Denoising Approaches
Recent works more and more leverage deep learning for surface reconstruction from point clouds as well as point cloud denoising. Examples include approaches for point cloud consolidation and resampling such as PointNet [41], PointNet++ [42], patch-based progressive point cloud upsampling [43] as well as the unification of the considerations of densifying, denoising, and completing point clouds [44]. Other approaches followed the principles of initially projecting the points onto coarse-level local reference planes and applying a subsequent refinement [45] or the initial removal of outliers before conducting the denoising [46]. Further approaches include edge-aware point cloud consolidation [47], adversarial defense [48], graph-convolutional methods [49], unsupervised approaches such as Total Denoising [50], gradient field based denoising [51], [52], [53], differentiable approaches [54], [55], [56] as well as manifold learning based on encoder-decoder architectures [6], [7]. Non-local self-similarities have also been considered to define neural self-priors that capture geometric repetitions [57], capture semantically related non-local features [58], or apply self-correction by allowing the model to capture structural and contextual information from initially disorganized parts [59]. Furthermore, normalizing flows have been applied to learn the distribution of noisy points and disentangle noise from the latent space [60]. In addition, the feature-aware recurrent point cloud denoising network (RePCD-Net) [61] combines a recurrent network architecture for noise removal with multi-scale feature aggregation and propagation and a feature-aware Chamfer distance loss.

Mean Shift Approaches
The Mean Shift approach [8] is a well-studied local mode-seeking method with diverse applications including data clustering [9], [10], [62], [63], image filtering [10], segmentation [10], [64], denoising [12], [65], and object tracking [64]. Tremendous effort has been spent to study its convergence behavior [9], [10], [66], [67], [68], [69], which culminated in a rigorous set of properties proven by Yamasaki and Tanaka [70]. Recently, Mean Shift clustering has also been applied in the latent space of neural encoder-decoder approaches to achieve a better structured data representation [11]. Furthermore, the connection between the Mean Shift approach and denoising autoencoders [71] has been revealed by Bigdeli et al. [12], who leveraged the observation that the output of an optimal denoising autoencoder (DAE) is a local mean of the true data density [13] to show that the autoencoder loss is a Mean Shift vector and to use the respective magnitude to define a prior for image restoration.

BACKGROUND
Before deriving our proposed kernel family as a generalization of LOP in the context of Mean Shift, we first provide a brief introduction to the concepts of both approaches.

Mean Shift
The basic objective of Mean Shift [8] is to find the modes of a density function $f$ which has been observed by a (sparse) set of points $P = \{p_i \in \mathbb{R}^d\}$ and is modeled by a kernel density estimate function

$$\hat{f}_{P,K}(x) = \frac{1}{n h^d} \sum_{i=1}^{n} K\!\left(\frac{x - p_i}{h}\right).$$

Here, $h$ denotes the kernel window size and $K$ a kernel that is non-negative ($K(x) \geq 0$), normalized ($\int_{\mathbb{R}^d} K(x)\,dx = 1$), and radially symmetric ($K(x) = c_K\, k(\|x\|^2)$). The function $k$ defined in the symmetry constraint along with the normalization constant $c_K$ is called the kernel profile of $K$ and plays an important role in the analysis of Mean Shift [70]. Furthermore, the gradient of the kernel density estimate can be derived using the kernel $G(x) = c_G\, g(\|x\|^2)$ and its corresponding profile $g(x) = -\frac{d}{dx} k(x)$. Based on these two functions, Mean Shift finds the modes by iteratively applying the Mean Shift vector

$$m_{P,G}(x) = \frac{\sum_i g\!\left(\left\|\frac{x - p_i}{h}\right\|^2\right) p_i}{\sum_i g\!\left(\left\|\frac{x - p_i}{h}\right\|^2\right)} - x,$$

which describes the gradient vector normalized with respect to the kernel $G$ and is the main component in the algorithm.
In particular, it directly defines the update step of the corresponding fixed-point iteration $q^{(t+1)} = q^{(t)} + m_{P,G}(q^{(t)})$, which performs gradient ascent on the kernel density estimate function $\hat{f}_{P,K}$.
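As a concrete illustration, the fixed-point iteration above can be sketched in a few lines. This is our own minimal re-implementation with a Gaussian profile, not the authors' code; the point set and bandwidth are arbitrary examples:

```python
import numpy as np

np.random.seed(0)

def mean_shift_step(q, P, h):
    """One fixed-point update q <- q + m_{P,G}(q) for a Gaussian profile g."""
    d2 = np.sum((P - q) ** 2, axis=1) / h ** 2   # ||(q - p_i) / h||^2
    g = np.exp(-d2 / 2.0)                        # profile weights g(.)
    return g @ P / g.sum()                       # weighted mean = q + m_{P,G}(q)

def mean_shift(q, P, h, iters=100, tol=1e-8):
    """Iterate until the Mean Shift vector (the update step) vanishes."""
    for _ in range(iters):
        q_next = mean_shift_step(q, P, h)
        if np.linalg.norm(q_next - q) < tol:
            break
        q = q_next
    return q

# Two well-separated clusters: a query point converges to the nearer mode.
P = np.vstack([np.random.randn(200, 2) * 0.1,
               np.random.randn(200, 2) * 0.1 + [5.0, 0.0]])
mode = mean_shift(np.array([0.5, 0.0]), P, h=1.0)
```

Since the Gaussian profile satisfies $g(x) = -k'(x) \propto k(x)$, this iteration performs gradient ascent on the corresponding kernel density estimate.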

Locally Optimal Projection
The central tendency of a set of data points, often measured in terms of the geometric mean, is a crucial and desirable property used in many applications, but its definition and computation have been a challenging research question for decades. Although the geometric mean can be easily evaluated, it is highly sensitive to outliers. In contrast, the $L_1$ median, also called the geometric median, is a more robust quantity and can be computed by the iterative Weiszfeld algorithm [72]. In addition to many other contexts, it has been applied to 3D surface reconstruction to define a robust projection operator [1]. Given a set of noisy target points $P = \{p_i \in \mathbb{R}^d\}$ sampled from a surface $S$, where $d = 3$ is usually considered, the task consists in projecting an additional set of projection points $Q = \{q_j \in \mathbb{R}^d\}$ onto $S$ based on the observations $P$. This can be expressed in terms of an energy formulation based on an attraction and a repulsion term, where $\theta(x) = e^{-x^2/(h/4)^2}$ denotes a compact localization kernel and $\eta$ a decreasing regularization function penalizing small distances between projection points. Common choices of $\eta$ include the originally proposed function as well as the less rapidly decreasing function $\eta_{\mathrm{WLOP}}(x) = -x$ [2]. Both energy terms are balanced by weights $\lambda_j$ which are chosen such that they only depend on a single, global parameter $\mu \in [0, 1/2)$. Based on the Weiszfeld algorithm, the solution to this optimization problem can be obtained by a fixed-point iteration with kernels $\alpha(x) = \theta(x)/x$ and $\beta(x) = \frac{\theta(x)}{x} \left|\frac{d}{dx}\eta(x)\right|$.
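For intuition, the (global) Weiszfeld iteration underlying this operator can be sketched as follows; the localized LOP attraction step additionally weights the $1/\|q - p_i\|$ terms with $\theta$. This is an illustrative example of ours with hypothetical data:

```python
import numpy as np

def weiszfeld(P, iters=200, eps=1e-12):
    """Geometric (L1) median of the rows of P via the Weiszfeld iteration."""
    q = P.mean(axis=0)                       # start at the (non-robust) mean
    for _ in range(iters):
        dist = np.linalg.norm(P - q, axis=1)
        w = 1.0 / np.maximum(dist, eps)      # alpha-like weights 1 / ||q - p_i||
        q = w @ P / w.sum()                  # reweighted mean
    return q

# Three inliers and one gross outlier: the L1 median stays near the inliers,
# while the (geometric) mean is dragged far away.
P = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [100.0, 100.0]])
median = weiszfeld(P)
mean = P.mean(axis=0)
```

The contrast between `median` and `mean` on this toy data illustrates the robustness property exploited by LOP.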

Generalization via Incomplete Gamma Kernels
Although Mean Shift and Locally Optimal Projection were developed to solve different problems, we can link both concepts by rewriting the update step. This reveals that the LOP operator is a combination of standard Mean Shift with respect to the target set $P$ and reverse Blurring Mean Shift with respect to the moving source set $Q^{(t)} = \{q_j^{(t)}\}$. Therefore, we can interpret the localized $L_1$ attraction energy minimization with a Gaussian kernel as a global density maximization with respect to a different kernel $K_{\mathrm{LOP}}$. An example of this relation is shown in Fig. 1.
To derive this corresponding kernel, we consider the profile of the involved kernel which follows a gamma distribution $f_\Gamma(x \mid a, b) \propto x^{a-1} e^{-x/b}$ with support $x \in (0, \infty)$ and parameters $a > 0$, $b > 0$. The profile of the actual kernel then follows the distribution $\bar{F}_\Gamma(x \mid a, b) \propto \Gamma(a, x/b)$, which is the complementary CDF of $f_\Gamma$ and based on the upper incomplete gamma function $\Gamma(a, x) = \int_x^\infty t^{a-1} e^{-t}\,dt$. Since we want to define a $d$-dimensional kernel $K_\Gamma$, we also need to compute the respective normalization constant. For this, we switch the integration domain to $d$-dimensional spherical coordinates and substitute $s = r^2/b$. Due to radial symmetry, both integrals can be solved independently. The former one describes the surface area of the $d$-dimensional unit sphere and has the closed form $2\pi^{d/2}/\Gamma(d/2)$, while the latter one can be expressed in terms of the ordinary gamma function. We can also apply the recursive relation of the gamma function $\Gamma(a+1) = a\,\Gamma(a)$ and conclude that $1/c_{K_\Gamma} = (\pi b)^{d/2}\,\Gamma(d/2 + a)/\Gamma(d/2 + 1)$. Finally, we change the parametrization by setting $a = p/2$, $b = 2\sigma^2$ to obtain the final kernel

$$K_\Gamma(x \mid p, \sigma^2) = \frac{\Gamma(d/2 + 1)}{(2\pi\sigma^2)^{d/2}\,\Gamma((d + p)/2)}\,\Gamma\!\left(\frac{p}{2}, \frac{\|x\|^2}{2\sigma^2}\right).$$

These incomplete gamma kernels span a family of Mean Shift kernels corresponding to $L_p$ estimators of the attraction energy localized by a Gaussian kernel. An important special case of this family is the LOP kernel $K_{\mathrm{LOP}}$, for which we choose $p = 1$, $\sigma^2 = 1/32$ and apply the identity $\Gamma(1/2, x) = \sqrt{\pi}\,\mathrm{erfc}(\sqrt{x})$, where erfc denotes the complementary error function. Another special case is the corresponding Gaussian kernel $K_{\mathrm{Gaussian}}$ obtained by setting $p = 2$, which is a common choice in Mean Shift and has been extensively analyzed as the localized $L_2$ estimator of the geometric mean. Fig. 2 shows an interpolation between these kernels by varying the $p$-norm.

[Fig. 3. Comparison of LOP and Gaussian kernels in spatial and frequency domain. Filtering with the LOP kernel $K_{\mathrm{LOP}}$ better preserves higher frequency information.]
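A direct way to evaluate this kernel numerically is via SciPy's regularized upper incomplete gamma function; this is a sketch under our reading of the normalization above (`gammaincc(a, x)` computes $\Gamma(a, x)/\Gamma(a)$):

```python
import numpy as np
from scipy.special import gamma, gammaincc

def k_gamma(x, p, sigma2, d):
    """Incomplete gamma kernel K_Gamma(x | p, sigma^2) for x of shape (..., d)."""
    r2 = np.sum(np.square(x), axis=-1)
    c = gamma(d / 2 + 1) / ((2 * np.pi * sigma2) ** (d / 2) * gamma((d + p) / 2))
    # gammaincc is regularized, so multiply by gamma(p/2) to undo the division.
    return c * gamma(p / 2) * gammaincc(p / 2, r2 / (2 * sigma2))

sigma2 = 1 / 32
x = np.array([0.3, -0.2, 0.1])
lop = k_gamma(x, p=1, sigma2=sigma2, d=3)       # LOP kernel
gauss = k_gamma(x, p=2, sigma2=sigma2, d=3)     # Gaussian kernel
```

For $p = 2$ the expression collapses to the standard multivariate Gaussian density, which serves as a convenient consistency check of the normalization constant.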

KERNEL PROPERTIES
In the following, we derive several theoretical properties of the family of incomplete gamma kernels which are summarized in Table 1.

Characteristic Function and Fourier Transform
In order to gain a deeper understanding of the proposed kernel family $K_\Gamma$, we are interested in its characteristic function $\varphi_\Gamma$, which can also be interpreted as the Fourier transform $\mathcal{F}$. First, we can apply the relation between the $d$-dimensional Fourier transform of a radially symmetric function $f(x)$ and the Hankel transform of order $d/2 - 1$ of the function $x^{d/2-1} f(x)$ [74] to reduce the dimensionality of the integral to the radial component, where $J_q(x)$ denotes the Bessel function of the first kind of order $q$. This integral has the closed-form solution (see the Appendix for a more detailed derivation)

$$\varphi_\Gamma(\omega \mid p, \sigma^2) = {}_1F_1\!\left(\frac{d + p}{2}, \frac{d + 2}{2}, -\frac{\sigma^2 \|\omega\|^2}{2}\right).$$

Therefore, the characteristic function of the incomplete gamma kernel can be written in terms of the confluent hypergeometric function of the first kind $_1F_1$. Fig. 3 shows a comparison between the Gaussian kernel ($p = 2$) and the LOP kernel ($p = 1$) both in the spatial and in the frequency domain.
If we consider the special case $_1F_1(a, a, x) = e^x$, we can observe that this result is consistent with the Fourier transform of the Gaussian kernel. Furthermore, as $d \to \infty$, the entire family of localized $L_p$ kernel estimators converges to the $L_2$ estimator since distances become increasingly similar in higher dimensions due to the curse of dimensionality.
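This consistency can be checked numerically. The following is our own sanity-check sketch for $d = 1$ and the Gaussian case $p = 2$, where Kummer's function with equal parameters collapses to the exponential:

```python
import numpy as np
from scipy.special import hyp1f1
from scipy.integrate import quad

sigma2 = 1 / 32
omega = 3.0
d = 1

# Characteristic function of the 1D Gaussian kernel (p = 2) by quadrature.
gauss = lambda x: (2 * np.pi * sigma2) ** -0.5 * np.exp(-x * x / (2 * sigma2))
phi_num, _ = quad(lambda x: gauss(x) * np.cos(omega * x), -np.inf, np.inf)

# 1F1 with equal parameters: 1F1(a, a, z) = e^z, the Gaussian Fourier transform.
phi_ref = hyp1f1((d + 2) / 2, (d + 2) / 2, -sigma2 * omega ** 2 / 2)
```

Both values agree with $e^{-\sigma^2 \omega^2 / 2}$, the well-known characteristic function of a centered Gaussian.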

Moment-generating Function
Another closely related and useful quantity to consider is the moment-generating function $M_\Gamma$ of the kernel $K_\Gamma$, which can be used to compute the mean $\mu_\Gamma$ and covariance matrix $\Sigma_\Gamma$. Although there is a direct connection to the characteristic function in terms of $M_\Gamma(\omega) = \varphi_\Gamma(-i\omega)$, the moment-generating function does not necessarily exist in general, so we have to prove this property for all $\omega \in \mathbb{R}^d$. For this purpose, we repeatedly apply the comparison theorem of calculus to derive a finite upper bound of the integral. After switching to spherical coordinates, we bound $\cos(\angle(\omega, x)) \leq 1$ to decouple the radial component from the angular one. For brevity, we put finite terms into constants $c_i$. Next, we combine the two individual upper bounds and apply them after splitting the integral at $r_0 = \max(1, p/2)$. Furthermore, we simplify the expression by completing the square in the exponential term. Since $r \in [1, \infty)$, the remaining polynomial can be bounded by a higher order $k = \max(0, p + d - 3) \in \mathbb{N}_0$. Therefore, this integral describes the (incomplete) $k$-th raw moment of a 1D normal distribution and is finite for any $k \in \mathbb{N}_0$ which, in turn, proves that $M_\Gamma(\omega) < \infty$ exists.

Mean
We can now use the moment-generating function $M_\Gamma$ to directly compute all raw moments of the kernel $K_\Gamma$ by evaluating the respective derivative at $\omega = 0$. For the mean, we consider the first-order derivative and get $\mu_\Gamma = \frac{\partial}{\partial \omega} M_\Gamma(0) = 0$ as the expected result for a radially symmetric kernel.

Covariance
Similarly, we compute the second-order derivative and obtain the covariance matrix of the kernel $K_\Gamma$ using the first-order and second-order raw moments as

$$\Sigma_\Gamma = \frac{d + p}{d + 2}\,\sigma^2\, I.$$
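The complementary-CDF structure of the kernel also yields a simple sampler that can be used for a Monte Carlo check: expanding $\Gamma(p/2, \|x\|^2/(2\sigma^2))$ as a mixture of uniform balls suggests drawing $T \sim \mathrm{Gamma}((p+d)/2)$ and then a uniform point in the ball of radius $\sqrt{2\sigma^2 T}$. This is our own derivation sketch, under the assumption that the covariance evaluates to $\frac{d+p}{d+2}\sigma^2 I$ (consistent with the Gaussian case $p = 2$):

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_k_gamma(n, p, sigma2, d):
    """Draw n samples from K_Gamma via the uniform-ball mixture representation."""
    t = rng.gamma((p + d) / 2, size=n)                  # mixing variable T
    u = rng.standard_normal((n, d))                     # uniform directions ...
    u /= np.linalg.norm(u, axis=1, keepdims=True)
    r = np.sqrt(2 * sigma2 * t) * rng.uniform(size=n) ** (1 / d)  # ... in the ball
    return u * r[:, None]

p, sigma2, d = 1, 1 / 32, 3
X = sample_k_gamma(200_000, p, sigma2, d)
cov = X.T @ X / len(X)                                  # empirical covariance
expected = (d + p) / (d + 2) * sigma2                   # (d+p)/(d+2) * sigma^2
```

The empirical diagonal should match `expected` up to Monte Carlo error, while off-diagonal entries should vanish.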

Mean Shift Properties
In addition to the distribution-specific properties above, we can get further insights into the kernel family by exploiting the comprehensive theory that has been developed for the Mean Shift algorithm [70]. This requires proving several additional properties, including that the kernel $K_\Gamma$ is bounded and analytic and that its profile $k_\Gamma$ is differentiable, strictly decreasing, and convex.

Differentiability, Monotonicity and Convexity
In order to show that the profile is strictly decreasing, we consider its first-order derivative

$$\frac{d}{dx} k_\Gamma(x \mid p, \sigma^2) = -\frac{1}{2\sigma^2} \left(\frac{x}{2\sigma^2}\right)^{p/2 - 1} e^{-x/(2\sigma^2)},$$

which is defined for all $x \in (0, \infty)$ as well as for $x = 0$ if $p \in [2, \infty)$. Since the involved polynomial and exponential terms are always positive, it follows that the derivative must be negative, that is $\frac{d}{dx} k_\Gamma(x \mid p, \sigma^2) < 0$, and the profile strictly decreasing.
Similarly, we see that the second-order derivative is given by

$$\frac{d^2}{dx^2} k_\Gamma(x \mid p, \sigma^2) = -\frac{1}{(2\sigma^2)^2} \left(\frac{x}{2\sigma^2}\right)^{p/2 - 2} e^{-x/(2\sigma^2)} \left(\frac{p}{2} - 1 - \frac{x}{2\sigma^2}\right),$$

where, in order to ensure that $\frac{d^2}{dx^2} k_\Gamma(x \mid p, \sigma^2) > 0$, the latter term must be non-positive, which is equivalent to the condition $x \geq (p - 2)\,\sigma^2$. Since this should hold for all $x \in (0, \infty)$, convexity is only guaranteed for kernels with $p \in (0, 2]$ which, in particular, includes the Gaussian kernel ($p = 2$) as well as the LOP kernel ($p = 1$).

Boundedness and Analyticity
Instead of showing both properties for the kernel $K_\Gamma$ itself, it is sufficient to show them for its profile $k_\Gamma$. Since $k_\Gamma$ is non-negative and monotonically decreasing, we only have to consider the case $x = 0$. For this value, $\Gamma(a, x)$ reduces to the gamma function $\Gamma(a)$, which is finite for $a > 0$. Furthermore, analyticity directly follows from the fact that $\Gamma(a, x)$ is holomorphic in $x \in (0, \infty)$ for any fixed $a > 0$.

Consequences for the LOP operator
The aforementioned properties have several direct implications [70] for the behavior of the LOP operator (with zero repulsion) as well as for Mean Shift applied with the incomplete gamma kernel $K_\Gamma$ for $p \in (0, 2]$. With the exception of the finite set of target points $P$, where singularities are introduced in the kernel $G_\Gamma$, the following properties hold:

Non-zero Gradient. The gradient of the kernel density estimate $\nabla \hat{f}_{P,K_{\mathrm{LOP}}}$ is non-zero outside the convex hull of the target point set $P$. This implies that all solutions must lie within the convex hull.
Plateau-free Density. In addition to non-zero gradients, the kernel density estimate function has no plateaus on the set $\mathbb{R}^3 \setminus P$. Since the set of target points $P$ is finite, we can extend this property to the full space $\mathbb{R}^3$.
Non-decreasing Density Estimate. Another interesting subset to consider is the improvement ball $I(q_j^{(t)})$, which denotes a $d$-dimensional sphere centered at the point $q_j^{(t)} + m_{P,G_{\mathrm{LOP}}}(q_j^{(t)})$ with radius $\|m_{P,G_{\mathrm{LOP}}}(q_j^{(t)})\|$. In case of the LOP operator, it follows that all points $x$ within the improvement ball have non-decreasing kernel density estimates $\hat{f}_{P,K_{\mathrm{LOP}}}(x) \geq \hat{f}_{P,K_{\mathrm{LOP}}}(q_j^{(t)})$.

Convergence of Density Estimate Sequence. As a consequence of the above property, the sequence of kernel density estimates $\{\hat{f}_{P,K_{\mathrm{LOP}}}(q^{(t)})\}$ obtained via the fixed-point iteration $q^{(t+1)} = q^{(t)} + m_{P,G_{\mathrm{LOP}}}(q^{(t)})$ is non-decreasing. Furthermore, this sequence always converges.
Convergence of Mode Estimate Sequence. Finally, we can conclude that the mode estimate sequence $\{q^{(t)}\}$ converges to a single point. Depending on the window size $h$ and the distribution of the target points $P$, this solution can either be a point $p \in P$ due to the singularity (for very small window sizes) or a different point $p \in \mathbb{R}^3 \setminus P$ in the corresponding convex hull (for larger window sizes).

Further Properties
Besides the theoretical results of our proposed kernel family concerning basic distribution-related aspects as well as insights into the behavior and structure of the Mean Shift algorithm, we want to derive further properties that open up a broader set of applications such as solving systems of linear equations.

Complete Monotonicity
For this purpose, we show that the kernel profile $k_\Gamma$ is completely monotonic, that is, $(-1)^n \frac{d^n}{dx^n} k_\Gamma(x \mid p, \sigma^2) \geq 0$ for all $n \in \mathbb{N}_0$ and $x \in (0, \infty)$. From the derivation of the Mean Shift properties, we already know that $k_\Gamma(x \mid p, \sigma^2) > 0$ and $\frac{d}{dx} k_\Gamma(x \mid p, \sigma^2) < 0$ hold for all $x \in (0, \infty)$. Since $x^a$ with $a \leq 0$ as well as $e^{-x}$ are both completely monotonic and the product of two completely monotonic functions retains that property [76], we can conclude that this also holds for $k_\Gamma(x \mid p, \sigma^2)$ with $p \in (0, 2]$.

Strict Positive Definiteness
A direct consequence of the complete monotonicity of its profile is that the kernel $K_\Gamma$ must be a strictly positive definite function [77]. Therefore, for any set of points $P$, the matrix $C$ with entries $C_{ij} = K_\Gamma(p_i - p_j \mid p, \sigma^2)$ is symmetric and positive definite for $p \in (0, 2]$, so any linear system with respect to $C$ has a unique solution which can be computed by, e.g., conjugate gradient solvers. This also directly extends to any truncated version of $K_\Gamma$, where the matrix $C$ becomes sparse and more efficient to solve, as vanishing derivatives of the truncated profile do not affect complete monotonicity.
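As a small numerical illustration (our own sketch; the random points and parameters are hypothetical), the Gram matrix of the kernel is indeed symmetric positive definite for $p = 1$, so a conjugate gradient solver applies directly:

```python
import numpy as np
from scipy.special import gamma, gammaincc
from scipy.sparse.linalg import cg

rng = np.random.default_rng(1)

def k_gamma(r2, p, sigma2, d):
    """Incomplete gamma kernel evaluated on squared distances r2."""
    c = gamma(d / 2 + 1) / ((2 * np.pi * sigma2) ** (d / 2) * gamma((d + p) / 2))
    return c * gamma(p / 2) * gammaincc(p / 2, r2 / (2 * sigma2))

P = rng.uniform(size=(50, 3))
r2 = np.sum((P[:, None, :] - P[None, :, :]) ** 2, axis=-1)
C = k_gamma(r2, p=1, sigma2=1 / 32, d=3)     # Gram matrix C_ij = K(p_i - p_j)

eigmin = np.linalg.eigvalsh(C).min()         # > 0: strictly positive definite
sol, info = cg(C, np.ones(len(P)))           # converges for SPD C (info == 0)
```

The smallest eigenvalue being positive confirms strict positive definiteness on this sample, and CG recovers the unique solution of the linear system.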

APPLICATIONS
Besides the application of other localized $L_p$ estimators for point cloud denoising via the incomplete gamma kernels $K_\Gamma$, as shown in Fig. 4, we illustrate several further applications to demonstrate the benefits of the theoretical results derived for the kernel family.

WLOP Density Weights
Although the repulsion term mitigates the clustering effect of the attraction term, the projection is still highly dependent on the distribution of the target points $P$. This has been addressed in WLOP [2] by computing weights $v_i$ for each target point $p_i$ and $v_j^{(t)}$ for each projection point $q_j^{(t)}$ based on the reciprocal and ordinary density value, respectively, with the (unnormalized) localization kernel $\theta$. However, our derived theoretical properties reveal two major limitations of this particular choice: 1) although the Gaussian localization kernel $\theta$ could be considered a reasonable approximation of the actual kernel $K_{\mathrm{LOP}}$ (see Fig. 3), high-frequency information in the density estimate is not properly handled and smoothed out; 2) taking the reciprocal to invert the density of $p_i$ ignores the dependencies between the weights, which corresponds to the assumption of constant density in a window of size $h$. In order to achieve a more accurate normalization, we propose two novel weighting schemes.
Simple Scheme. A simple extension to the WLOP weights keeps the assumption of the latter limitation and addresses only the former one by applying the actual kernel $K_{\mathrm{LOP}}$, that is, estimating the weights via the density of the point clouds $P$ and $Q^{(t)}$, respectively. This scheme can be easily integrated into existing applications of WLOP as it only involves a different kernel function in the overall weight computation.

Full Scheme. To address both limitations, we consider the kernel density estimate function $\hat{f}_{P,K_{\mathrm{LOP}}}$ with weights $v_i$ applied to each term. We want to enforce constant density at the points $p_i$, which can be formulated as a linear optimization problem in matrix form: $C v = \mathbf{1}$. This corresponds to radial basis function (RBF) interpolation and we can obtain a unique solution since $K_{\mathrm{LOP}}$ is strictly positive definite. Furthermore, we truncate the kernel at $h/2$ to drastically reduce the memory requirements of the matrix and use a sparse conjugate gradient solver. In case of the projection points $q_j^{(t)}$, we consider the inverse of the matrix, which leads to the same weights as in the simple scheme.
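A sketch of the full scheme (our illustration with hypothetical 2D data; the kernel profile is taken as $\mathrm{erfc}(4r/h)$, i.e., $K_{\mathrm{LOP}}$ up to scale for $\sigma^2 = 1/32$):

```python
import numpy as np
from scipy import sparse
from scipy.sparse.linalg import cg
from scipy.special import erfc

rng = np.random.default_rng(2)
h = 0.3
P = rng.uniform(size=(200, 2))

def k_lop(r, h):
    """LOP kernel profile up to scale: Gamma(1/2, x) = sqrt(pi) erfc(sqrt(x))
    with sigma^2 = 1/32 gives erfc(4 r / h)."""
    return erfc(4.0 * r / h)

# Dense distances for this small example; truncate at h/2 to obtain a sparse C.
r = np.linalg.norm(P[:, None, :] - P[None, :, :], axis=-1)
C = sparse.csr_matrix(np.where(r <= h / 2, k_lop(r, h), 0.0))

v, info = cg(C, np.ones(len(P)))   # density-equalizing weights, unique if C is SPD
density = C @ v                    # approximately constant by construction
```

For large point clouds, only the neighbor pairs within the truncation radius would be assembled, keeping both memory and solve time proportional to the number of non-zeros.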

CLOP Kernel Approximation
Both LOP and Mean Shift operate on a discrete set of points and are formulated as discrete sums over the point cloud, which may incur a significant runtime cost on large datasets. To allow for a more compact modeling of the input data, CLOP [5] replaces the point set $P$ by a smaller set of normal distributions $P_N = \{(w_i, \mu_i, \Sigma_i)\}$ with weights $w_i$, means $\mu_i$, and local covariance matrices $\Sigma_i$, and extends the attraction energy to the continuous space. However, since the integral in the respective update step cannot be directly solved, the kernel $\alpha(\|x\|)$ is approximated by a radially symmetric Gaussian mixture model $\hat{\alpha}(x) = \frac{1}{h} \sum_{k=1}^{3} \hat{w}_k \hat{c}_k\, \mathcal{N}(x/h \mid 0, \hat{\sigma}_k^2 I)$ consisting of three components with fitted parameters $\{(\hat{w}_k, \hat{\sigma}_k)\}$ and dimension-dependent constants $\hat{c}_k = |2\pi\hat{\sigma}_k^2 I|^{1/2}$. In the context of Mean Shift, this implies that the kernel $G_{\mathrm{LOP}}$ is in fact approximated, which directly allows us to derive a corresponding approximation $\hat{K}_{\mathrm{LOP}}$ of the kernel $K_{\mathrm{LOP}}$ with the same set of fitted parameters $\{(\hat{w}_k, \hat{\sigma}_k)\}$.

Kernel Fit. However, finding the optimal parameter set is highly challenging due to the singularity of $G_{\mathrm{LOP}}$ at $x = 0$ and, thereby, the unbounded ratio between the smallest and largest sampling value in the half-open fitting interval $(0, 1]$ for $h = 1$. In contrast, we directly optimize on the kernel $K_{\mathrm{LOP}}$, which does not suffer from these limitations. We fix the parameter $\hat{w}_3 = 1$ to constrain the remaining degree of freedom and obtain the solution from $10^7$ uniformly sampled points in the interval $[0, 1]$ via the Levenberg-Marquardt algorithm (see Table 2). Although the LOP operator is scale-invariant in terms of the kernel $\alpha$, we nevertheless estimate a global scaling factor for the weights $\hat{w}_k$ via Levenberg-Marquardt optimization in the interval $(0.01, 1]$ for a better comparability with CLOP.
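The described fit can be reproduced in spirit with SciPy's Levenberg-Marquardt solver. This is our simplified sketch: fewer samples, a normalized profile instead of the exact kernel constants, and hypothetical initial values:

```python
import numpy as np
from scipy.optimize import least_squares
from scipy.special import erfc

# Profile of K_LOP (up to scale) for sigma^2 = 1/32 and h = 1: erfc(4 x).
x = np.linspace(0.0, 1.0, 2000)
target = erfc(4.0 * x)

def mixture(params, x):
    """Radially symmetric 3-component Gaussian mixture, normalized to 1 at x = 0."""
    w1, w2, s1, s2, s3 = params
    w = np.array([w1, w2, 1.0])          # w3 fixed to 1 (remaining degree of freedom)
    s = np.array([s1, s2, s3])
    m = np.sum(w[:, None] * np.exp(-x[None, :] ** 2 / (2 * s[:, None] ** 2)), axis=0)
    return m / np.sum(w)

res = least_squares(lambda q: mixture(q, x) - target,
                    x0=[1.0, 1.0, 0.05, 0.12, 0.25],
                    method="lm")         # Levenberg-Marquardt, as in the text
```

Three components already approximate the erfc-shaped profile closely on $[0, 1]$, which mirrors the reported accuracy gain of fitting $K_{\mathrm{LOP}}$ instead of the singular $G_{\mathrm{LOP}}$.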
Consistent Fit. We can also see from the definition of the kernel approximation $\hat{K}_{\mathrm{LOP}}$ that its variance consists of a convex combination of the individual variances $\hat{\sigma}_k^2$. In the limit $d \to \infty$, this combination degenerates to $\hat{\sigma}_{\mathrm{LOP}}^2 \to \max_k \hat{\sigma}_k^2$, which is similar to the maximum norm $L_\infty$ being the limit of the $L_p$ norms. Therefore, we can enforce an additional consistency constraint in the parameter optimization process by fixing the parameter $\hat{\sigma}_3^2 = 1/32$ to match the expected standard deviation.

Robust Loss Functions
In the context of point cloud reconstruction, LOP formulates the projection onto the underlying surface via the localized L 1 median in a robust way.Similarly, mesh denoising applications aim to improve the quality of meshes in a twostage approach where the face normals are initially filtered and subsequently used in the second stage to estimate the original vertex positions.Obtaining a reliable estimate of a surface normal n can be performed by M-estimators in the field of robust statistics which is also related to the concept of anisotropic diffusion [78].Here, the objective function defined with a robust loss function ρ is considered and a solution can be found based on the corresponding influence function Ψ(x) = d dx ρ(x) and anisotropic weight function g(x) = Ψ(x)/x [79]: This result is closely related to the derivation of Mean Shift and shares many properties with it [10].Thus, we can LOP Gaussian Fig. 5. Comparison of LOP and Gaussian M-estimators for σ 2 = 1.Due to the close relation to Mean Shift, the robust loss functions ρ do not only share the shape of the corresponding kernels K but also have similar properties.
define the family of incomplete gamma losses along with the respective influence and anisotropic weight functions. Here, the losses $\rho_\Gamma$ are built upon the lower incomplete gamma function $\gamma(a, x) = \int_0^x t^{a-1} e^{-t}\, dt$, which is connected to the upper incomplete gamma function via the relation $\gamma(a, x) + \Gamma(a, x) = \Gamma(a)$. By choosing $p = 1$ and applying the identity $\gamma(1/2, x) = \sqrt{\pi}\, \operatorname{erf}(\sqrt{x})$, we get the LOP loss, where $\operatorname{erf}$ denotes the error function and is related to its complementary counterpart via $\operatorname{erfc}(x) = 1 - \operatorname{erf}(x)$. Considering $\sigma^2 = 1/32$, the relation $g_{\text{LOP}}(x \mid 1/32) \propto g_{\text{LOP}}(x^2)$ further highlights the close similarity to Mean Shift. Fig. 5 shows a comparison between the Gaussian M-estimator ($p = 2$) and the LOP M-estimator ($p = 1$).
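The identities above can be checked numerically with the standard library. The sketch below verifies $\gamma(1/2, x) = \sqrt{\pi}\, \operatorname{erf}(\sqrt{x})$ by quadrature and illustrates the loss/influence/weight relation for the Gaussian case ($p = 2$), where $\gamma(1, x) = 1 - e^{-x}$; global normalization constants are omitted:

```python
import math

def lower_inc_gamma_numeric(a, x, n=10000):
    """Numerically compute gamma(a, x) = integral of t^(a-1) e^(-t) over [0, x].
    Substituting t = u^2 removes the integrable singularity at t = 0
    (valid for a >= 1/2); midpoint rule on the smooth integrand."""
    s = math.sqrt(x)
    h = s / n
    total = 0.0
    for i in range(n):
        u = (i + 0.5) * h
        total += 2.0 * u ** (2.0 * a - 1.0) * math.exp(-u * u)
    return total * h

# Identity used for the LOP loss (p = 1): gamma(1/2, x) = sqrt(pi) * erf(sqrt(x))
x = 0.8
print(lower_inc_gamma_numeric(0.5, x), math.sqrt(math.pi) * math.erf(math.sqrt(x)))

# Loss / influence / weight relation for the Gaussian case (p = 2), where
# gamma(1, x) = 1 - exp(-x); normalization constants are dropped for brevity.
def rho_gaussian(x, sigma2=1.0):
    return 1.0 - math.exp(-x * x / (2.0 * sigma2))           # loss rho

def psi_gaussian(x, sigma2=1.0):
    return (x / sigma2) * math.exp(-x * x / (2.0 * sigma2))  # influence = d rho / dx

def g_gaussian(x, sigma2=1.0):
    return psi_gaussian(x, sigma2) / x                       # anisotropic weight
```

The same construction with $\gamma(1/2, \cdot)$ yields the LOP loss via the error function, matching the identity checked above.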

EXPERIMENTAL RESULTS
In the following, we demonstrate the effectiveness of our proposed extensions that are derived from the theoretical properties of the kernel family.

Evaluation of WLOP Density Weights
In order to evaluate the performance of our density weighting schemes, we measured the regularity of the point cloud $Q$ after projection onto a highly irregular target $P$ [2]. For this purpose, we sampled 74 000 target points from a 3D surface patch inversely proportional to the intensity of the mapped Bird image and took a random subset of 3700 points for projection. Then, we applied 100 iterations of the LOP operator as well as its weighted versions and computed the regularity, which is defined as the standard deviation of the nearest-neighbor distances $d(x, Y) = \min_j \|x - y_j\|$ within the point cloud $Q$. Fig. 7 shows the quantitative results for $60 \times 50$ combinations of $h$ and $\mu$. Throughout 76.7 % of all combinations, our simple weighting scheme performs better than WLOP, with a slightly lower value of $\sigma_Q$ on average. Our full scheme outperforms WLOP in 99.3 % and the simple scheme in 98.7 % of all combinations, especially in configurations with low repulsion weights $\mu \in [0, 0.2]$. These improvements in point cloud regularity directly correspond to a more evenly distributed density along the surface.

Fig. 6 depicts a comparison of $1000 \times 1000$ evenly sampled density values on the respective 3D surface patch. Both WLOP and our simple scheme normalize the lower-frequency components of the density but still retain high-frequency variations due to the independent computation of each weight. In contrast, our full scheme does not suffer from these artifacts and only leads to underestimated densities at the boundary and in sparsely sampled regions where the window size $h$ is not sufficiently large to bridge these gaps. Since Mean Shift and, thereby, LOP and its variants are scale-invariant with respect to a global normalization constant, we computed the mean of the density $f$ for each weighting scheme and used this value to normalize each density distribution for a fair comparison. Whereas this value is close to one for both of our schemes due to the correct handling of the normalization constants, we can derive a theoretical estimate of this value for WLOP, which consists of three terms: 1) the normalization constant of the kernel density estimate; 2) the missing normalization constant of the kernel $\theta$; and 3) a dimension-dependent correction factor. The last term models the different domains from which the density is accumulated, as we consider a surface patch that corresponds to a 2D subspace embedded in 3D space. Therefore, the integration domain of the density differs by one dimension, which can be accounted for by the ratio of the normalization constants of both the actual density kernel $K_{\text{LOP}}$ and the chosen kernel $\theta$ for density weight computation.
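The regularity measure itself is straightforward to compute. A brute-force sketch on hypothetical toy points (not the Bird data):

```python
import math
import statistics

def regularity(points):
    """Regularity of a point set: standard deviation of the nearest-neighbor
    distances d(x, Q \\ {x}) = min_j ||x - y_j|| (brute force, for illustration)."""
    dists = []
    for i, x in enumerate(points):
        d = min(math.dist(x, y) for j, y in enumerate(points) if j != i)
        dists.append(d)
    return statistics.pstdev(dists)

# A perfectly regular grid has zero spread in nearest-neighbor distances,
# while a jittered grid does not:
grid = [(i, j) for i in range(5) for j in range(5)]
jittered = [(i + 0.2 * ((i * 7 + j * 3) % 5) / 5, j) for i, j in grid]
print(regularity(grid), regularity(jittered))
```

A lower value of this spread corresponds to a more uniform point distribution, which is exactly what $\sigma_Q$ quantifies in the evaluation above.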

Evaluation of CLOP Kernel Approximation
We evaluated the approximation error of our fitted parameter set against the original one proposed by CLOP [5]. First, we quantified systematic errors of the kernel approximation $\hat{K}_{\text{LOP}}$ by analyzing its standard deviation $\hat{\sigma}_{\text{LOP}}$. We can see in Table 3 that both CLOP and our approximation underestimate the actual value $\sigma_{\text{LOP}}$ and that the bias increases in higher dimensions. Although these errors are significantly lower for our approximation throughout all dimensions, they may still be substantial. Our consistent approximation always overestimates the actual standard deviation and has a slightly higher error than the unconstrained variant in low dimensions up to $d = 5$. However, it becomes unbiased in the limit $d \to \infty$ and should be preferred in higher dimensions. We also considered minimizing the $L_1$ distance to the actual kernel $K_{\text{LOP}}$ by scaling the values $\hat{\sigma}_k$ with correction factors $1/b_{\text{opt}}$ to obtain an improved set of parameters, which is shown in Fig. 8. Here, the error of our approximation is significantly lower than for CLOP both before and after optimal correction. Furthermore, the optimal scaling factors are similar to the ratios of the standard deviations for $d = 1$.
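The width-correction search behind Fig. 8 can be sketched as a one-dimensional grid search over a global scale factor. The kernels below are illustrative stand-ins (plain Gaussians), not the actual $K_{\text{LOP}}$ or the fitted mixture:

```python
import math

def l1_distance(f, g, xs):
    """Approximate L1 distance between f and g on uniformly spaced samples xs."""
    h = xs[1] - xs[0]
    return sum(abs(f(x) - g(x)) for x in xs) * h

def best_scale(f_target, f_approx, xs, scales):
    """Grid search for the global width scale b minimizing the L1 distance
    between the rescaled approximation x -> f_approx(x / b) and the target."""
    return min(scales, key=lambda b: l1_distance(f_target, lambda x: f_approx(x / b), xs))

# Stand-in example: a too-wide Gaussian approximating a narrower one; the
# optimal scale then recovers the ratio of the standard deviations (0.2 / 0.3).
target = lambda x: math.exp(-x * x / (2 * 0.04))   # sigma = 0.2
approx = lambda x: math.exp(-x * x / (2 * 0.09))   # sigma = 0.3
xs = [i * 0.001 for i in range(1000)]
scales = [0.5 + 0.01 * k for k in range(100)]
print(best_scale(target, approx, xs, scales))
```

This mirrors the observation above that the optimal scaling factors are similar to the ratios of the standard deviations.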
In addition to the theoretical analysis of the kernel approximations, we also measured the reconstruction error when replacing the actual kernel $K_{\text{LOP}}$ with the respective approximation. For this purpose, we chose the Block model and uniformly sampled 50 000 target points $P$ and 25 000 projection points $Q$, respectively. We applied 100 iterations of WLOP as a smoothing operator with a large window size of $h = 25$ percent of the bounding box diagonal of $P$ and a repulsion weight $\mu = 0.4$. Then, we measured the distance of each point to the (triangulated) surface of the reference point cloud as well as the mean point-surface distance, where $t(y_j)$ denotes the $j$-th triangle of $Y$. Fig. 9 shows the results of this point cloud smoothing operation. Whereas the CLOP approximation introduces higher errors at the edges of the sampled model due to the significantly underestimated standard deviation of the kernel, our approximation does not suffer from these artifacts.

Evaluation of Robust Loss Functions
We tested the LOP M-estimator against other popular choices for normal filtering. For this, we used the Gargoyle (86 311 vertices, 172 610 faces) and Box (70 134 vertices, 140 259 faces) models and corrupted the vertices in random directions by $0.25\, l_e$ uniform noise, where $l_e$ denotes the average face edge length. Then, we applied 50 iterations of normal filtering with $\sigma = 0.3$ for each face normal $n$ within its geometric neighborhood of size $r = 1.5\, l_e$, that is, all normals whose face centers are traversable along the surface within a ball of radius $r$. To avoid the singularity of the $L_1$ and LOP losses at $x = 0$, we only considered the neighboring face normals in the initial iteration and used all normals subsequently. For the second stage of the mesh denoising framework, we used the vertex update by Zhang et al. [80] with their default parameters of 20 iterations and $w = 0.001$, which avoids the triangle flipping problem. We evaluated the reconstruction error by the mean angular distance to the face normals of the ground truth mesh. Figs. 10 and 11 show comparisons between the $L_2$, $L_1$, Gaussian, and LOP losses. Whereas the $L_2$ loss leads to a very smooth surface, its $L_1$ counterpart is less sensitive to large normal variations within the local neighborhood and better preserves features. However, sharp edges cannot be reconstructed since all collected normals are considered in a global fashion. The Gaussian and LOP loss functions can be viewed as localized versions of the former losses and do not suffer from this limitation. Finer details at a similar scale as the applied noise are hard to reconstruct and mostly smoothed out by all variants but can be partially recovered by the LOP loss.
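One iteration of the first filtering stage can be sketched as a weighted average of neighboring normals. The Gaussian weight, the neighborhood, and the example normals below are illustrative assumptions (the LOP weight $g_{\text{LOP}}$ would be swapped in analogously), not the full two-stage pipeline:

```python
import math

def normalize(v):
    n = math.sqrt(sum(c * c for c in v))
    return tuple(c / n for c in v)

def filter_normal(n_i, neighbors, sigma=0.3):
    """One filtering step for a face normal: weighted average of neighboring
    normals with the anisotropic weight g applied to the normal difference.
    Here g is the Gaussian M-estimator weight; other losses only change g."""
    g = lambda x: math.exp(-x * x / (2.0 * sigma * sigma))
    acc = [0.0, 0.0, 0.0]
    for n_j in neighbors:
        w = g(math.dist(n_i, n_j))
        for k in range(3):
            acc[k] += w * n_j[k]
    return normalize(acc)

# A noisy normal surrounded by consistent neighbors is pulled toward them,
# while a single outlier neighbor is strongly down-weighted:
n_i = normalize((0.2, 0.1, 1.0))
neighbors = [(0.0, 0.0, 1.0)] * 5 + [normalize((1.0, 0.0, 0.1))]
print(filter_normal(n_i, neighbors))
```

The down-weighting of the outlier is what lets the localized Gaussian and LOP losses preserve sharp edges, unlike the global $L_2$ and $L_1$ variants.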

CONCLUSIONS
We presented incomplete gamma kernels, a novel family of kernels generalizing LOP operators. By revisiting the classical $L_1$ estimator used in LOP, we revealed its relation to the Mean Shift framework via a novel kernel $K_{\text{LOP}}$ and generalized this result to arbitrary localized $L_p$ estimators. We derived several theoretical properties of the kernel family $K_\Gamma$ concerning distributional, Mean Shift induced, and other aspects such as strict positive definiteness to obtain a deeper understanding of the operator's projection behavior. Furthermore, we illustrated several applications, including an improved WLOP density weighting scheme, a more accurate kernel approximation for CLOP, and incomplete gamma losses $\rho_\Gamma$ as a novel set of robust loss functions, and confirmed their effectiveness in a variety of quantitative and qualitative experiments. We expect that building upon the insights provided by our work will be beneficial for future developments on surface reconstruction from noisy point clouds.

APPENDIX
INTEGRAL OF INCOMPLETE GAMMA AND BESSEL FUNCTIONS
Let $a > 0$, $b > 0$, $s \geq 0$ be real-valued constants. Then, for any real-valued $p > -1$, the relation
$$\int_0^\infty \Gamma(s, a^2 x^2)\, J_p(b x)\, x^{p+1}\, dx = \frac{b^p\, \Gamma(p+s+1)}{2^{p+1}\, a^{2p+2}\, \Gamma(p+2)}\, {}_1F_1\!\left(p+s+1;\; p+2;\; -\frac{b^2}{4a^2}\right) \quad (37)$$
holds, where $\Gamma(a, x)$ denotes the upper incomplete gamma function, $J_p$ the Bessel function of order $p$, and ${}_1F_1$ the confluent hypergeometric function of the first kind. A similar, more restricted result for the special case $s = 1/2$ has also been shown in previous work [81].

Fig. 6. Kernel density estimate $f$ for $h = 3$ of a planar target surface patch $P$ (74 000 points) that is sampled inversely proportional to the intensity of the Bird image. The regularity $\sigma_Q$ of any point set $Q$ projected onto this target directly depends on the uniformity of $f$. Whereas WLOP [2] and our simple weighting scheme cannot fully remove high-frequency variations, our full weighting scheme leads to a significantly better normalization and a more uniform density. Unit of $h$: [% BB diagonal].

Fig. 7. Regularity difference $\Delta\sigma$ with respect to WLOP over the evaluated combinations of $h$ and $\mu$.

Fig. 8. Analysis of bias in the width of the kernel approximations $\hat{K}_{\text{LOP}}$ relative to the original kernel $K_{\text{LOP}}$. A correction by scaling the parameters $\hat{\sigma}_k$ with a global factor $1/b_{\text{opt}}$ lowers the error for CLOP [5]. Nevertheless, our approximation still follows $K_{\text{LOP}}$ more closely with a significantly lower error and is almost unbiased in the 1-dimensional case.

Fig. 9. Bias of the kernel approximations $\hat{K}_{\text{LOP}}$ when applying WLOP [2] to smooth the Block model (25 000 points) with the window size $h = 25$. Whereas the CLOP [5] approximation introduces systematic errors at the edges due to the bias in the width, our variant closely resembles the behavior of the original kernel $K_{\text{LOP}}$. Unit: [% BB diagonal].

Fig. 10. Mesh denoising of the Gargoyle model (86 311 vertices, 172 610 faces) corrupted with $0.25\, l_e$ uniform noise. Although the mean angular distance $d_{\text{angle}}$ is slightly higher for the LOP loss $\rho_{\text{LOP}}$, features and finer details are better preserved. Unit: [°]. Panels: Noisy, $\rho_{L_2}$, $\rho_{L_1}$, $\rho_{\text{Gaussian}}$, $\rho_{\text{LOP}}$, Original.

Fig. 11. Mesh denoising of the Box model (70 134 vertices, 140 259 faces) corrupted with $0.25\, l_e$ uniform noise. Filtering with the LOP loss $\rho_{\text{LOP}}$ results in the lowest mean angular distance $d_{\text{angle}}$ and reconstructs fine details best. Unit: [°].

Proof: First, we plug in the definition of the incomplete gamma function and change the order of integration:
$$\int_0^\infty \Gamma(s, a^2 x^2)\, J_p(b x)\, x^{p+1}\, dx = \int_0^\infty \int_{a^2 x^2}^\infty t^{s-1} e^{-t}\, J_p(b x)\, x^{p+1}\, dt\, dx = \int_0^\infty t^{s-1} e^{-t} \int_0^{\sqrt{t}/a} J_p(b x)\, x^{p+1}\, dx\, dt \quad (38)$$
The inner integral can be solved by substituting $y = b x$ and then applying the well-known relation $\int x^{p+1} J_p(x)\, dx = x^{p+1} J_{p+1}(x)$:
$$= \int_0^\infty t^{s-1} e^{-t}\, \frac{1}{b^{p+2}} \int_0^{b\sqrt{t}/a} J_p(y)\, y^{p+1}\, dy\, dt = \int_0^\infty t^{s-1} e^{-t}\, \frac{1}{b} \left(\frac{\sqrt{t}}{a}\right)^{p+1} J_{p+1}\!\left(\frac{b \sqrt{t}}{a}\right) dt$$
We can now substitute $u = \sqrt{t}$ and obtain a closed-form solution of the remaining integral [73], which holds under the condition $2p + 2s + 2 > 0$:
$$= \frac{2}{b\, a^{p+1}} \int_0^\infty u^{2s+p}\, e^{-u^2}\, J_{p+1}\!\left(\frac{b u}{a}\right) du = \frac{b^p\, \Gamma(p+s+1)}{2^{p+1}\, a^{2p+2}\, \Gamma(p+2)}\, {}_1F_1\!\left(p+s+1;\; p+2;\; -\frac{b^2}{4a^2}\right)$$
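The antiderivative relation used for the inner integral can be checked numerically for $p = 0$, where it reads $\int_0^X x\, J_0(x)\, dx = X J_1(X)$. The series-based Bessel evaluation below is a stdlib-only sketch adequate for small arguments:

```python
import math

def bessel_j(n, x, terms=25):
    """Bessel function of the first kind J_n(x) via its power series
    J_n(x) = sum_m (-1)^m / (m! (m+n)!) * (x/2)^(2m+n); fine for small |x|."""
    total = 0.0
    for m in range(terms):
        total += (-1.0) ** m / (math.factorial(m) * math.factorial(m + n)) \
                 * (x / 2.0) ** (2 * m + n)
    return total

def integral_x_j0(X, n=20000):
    """Midpoint-rule approximation of the integral of x * J_0(x) over [0, X]."""
    h = X / n
    return sum((i + 0.5) * h * bessel_j(0, (i + 0.5) * h) for i in range(n)) * h

X = 2.0
print(integral_x_j0(X), X * bessel_j(1, X))  # the p = 0 case of the relation
```

The two values agree to high precision, supporting the step from the inner integral to $x^{p+1} J_{p+1}(x)$ in the proof above.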

TABLE 2
Parameter Sets of the Kernel Approximation $\hat{K}_{\text{LOP}}$

TABLE 3
Ratio of Standard Deviations $\hat{\sigma}_{\text{LOP}} / \sigma_{\text{LOP}}$ Between the Kernel Approximations $\hat{K}_{\text{LOP}}$ and the Original Kernel $K_{\text{LOP}}$ in $\mathbb{R}^d$