A Vectorial Approach to Unbalanced Optimal Mass Transport

Unbalanced optimal mass transport (OMT) seeks to remove the conservation of mass constraint by adding a source term to the standard continuity equation in the Benamou-Brenier formulation of OMT. In this note, we show how the addition of the source fits into the vector-valued OMT framework.


Introduction
Optimal mass transport (OMT) is a very important subject in mathematics, originating with the French civil engineer and mathematician Gaspard Monge in 1781 [15,16]. Recently, the theory has undergone a massive growth, with numerous applications to various areas including signal processing, machine learning, computer vision, meteorology, statistical physics, quantum mechanics, and network theory [1,3,11,12,14]. Several works deal with extensions of the theory to the unbalanced and vector-valued cases; see [6,9] and the references therein. In standard treatments, vector-valued and unbalanced extensions are treated separately. In the present work, we show that unbalanced OMT can fit into the model of vector OMT by taking a special set of weight parameters, which gives us fresh way to treat the problem.
In the L 2 setting, both extensions arise from the computational fluid dynamics (CFD) approach to OMT by Benamou and Brenier [2], which was a major development in optimal mass transport theory. For "square of distance" based cost functionals, the Benamou and Brenier framework is equivalent to the original one while giving an explicit interpolation of two mass distributions regarded as a geodesic on the space of probability distributions [13]. The CFD method views the optimal mass transport as the minimizer of a kinetic energy functional subject to a continuity constraint. By modifying the energy functional and the constraint, unbalanced OMT and vector OMT versions were developed [6,8,9]. More specifically, by adding a source term, unbalanced OMT can handle transport problems when the mass is not conserved. That is, mass can be created or destroyed during the process. By introducing the divergence of the flow between channels, vector OMT extends the density from scalar to vector-valued and even matrix-valued [4,5].
In this note, we indicate how to reformulate unbalanced OMT into a vector-valued OMT framework, and thus may be implemented through any existing code for vector OMT. All that needs to be done is adding a new weight matrix without changing the main structure of the code.
In this note, we focus on the connection between these two extensions. We will first give the background on the two extensions of OMT. After that we will show how to reformulate unbalanced OMT to a vector OMT problem. We will also look into reformulating unbalanced vector OMT to a common vector OMT problem. We conclude with some illustrative numerical results.

Background
In this section, we briefly introduce the Benamou and Brenier approach of OMT [2] as well as the two extensions mentioned above.

Benamou and Brenier
The original formulation of OMT due to Monge [15,16] may be expressed as follows: where c(x, y) is the cost of moving unit mass from x to y, T is the transport map, and ρ 0 , ρ 1 are two probability distributions defined on E a subdomain of R n . T # denotes the push forward of T .
As pioneered by Leonid Kantorovich [15,16], the Monge formulation of OMT may be relaxed replacing transport maps T by couplings π: where Π(ρ 0 , ρ 1 ) denotes the set of all the couplings between ρ 0 and ρ 1 (joint distributions whose two marginal distributions are ρ 0 and ρ 1 ). One may show that for c(x, y) = x − y 2 (square of distance function), the Kantorovich and Monge formulations are equivalent; see [15,16] and the references therein.
Moreover for c(x, y) = ||x − y|| 2 , the specific infimum is called Wasserstein 2 distance (W 2 ). Benamou and Brenier pointed out that this can be written in an equivalent computational fluid dynamic formulation: Thus, the latter reformulates OMT as one of minimizing a kinetic energy functional subject to a continuity constraint. The continuity constraint states that the change of density at each point over time is due to the flux of its neighborhood.
The Benamou-Brenier optimal solution may be related to the the original one of Monge. Indeed, if we have the optimal transport plan T in (1), the interpolation can be expressed as ρ(t, ·) = ((t · T + (1 − t) · id) # ρ 0 . For computational purposes, the Benamou-Brenier model is usually written as a convex optimization problem in momentum form (p = ρv):

Unbalanced OMT
In the standard formulation, ρ 0 , ρ 1 need to be balanced, i.e., have the same total mass. We enforced this by simply taking them to be probability distributions In the unbalanced case, total mass is not required to be preserved. This is very important for a number of applications in which mass may be created or destroyed.
There are many ways to extend the original setting to the unbalanced case [10]. In the present work, we will just treat the case in which a source term is added under the Fisher-Rao smooth formulation [9]. Namely, There is an extra source term s in the continuity equation. The change of density at each point over time now is not only due to the flux of its neighborhood, but also a source. Mass can be created or destroyed at any point and time. The parameter γ is a weight to control how much source one wants to use.

Vector-valued OMT
Vector-valued OMT considers vector-valued distributions instead of only the classical scalar ones [6]. With more than one channel, the change of mass is not only due to the flux of its neighborhood but also the redistribution within the given channels. Namely, we have As mentioned above, there are two kinds of flows for the vector-valued density ρ. The first is the spatial flux p and its corresponding divergence is denoted as ∇ x ·, and the second is the flux u between channels, whose discrete divergence is denoted by ∇ * F . p and u are both vectors. We note that F 1 is the all-1 part of the incident matrix F (sources of all the edges) and F 2 = F 1 − F (sinks of all the edges). As u is defined on every edge and the density is only defined on nodes, diag(F T 2 ρ) −1 + diag(F T 1 ρ) −1 assigns a density to each edge using the density of two end points so that we can compute the kinetic energy for the flows on edges.
Vector-valued OMT can be considered as a general OMT problem on E × F , where E is the space on which the vector-valued density is defined and F is a connected graph. With this understanding, we can rewrite the energy functional in the following equivalent form: whereρ is density of edge.

Unbalanced vector-valued OMT
In this section, we write down the unbalanced vector-valued version of OMT. Namely, Note that the source term is also a vector. So we need to write the energy functional for this vector term. Regarding the continuity equation, there are three possibilities for the change of mass over time: the flux of its neighborhood, flows between channels and source.

Reformulation unbalanced scalar OMT to vector-valued OMT
When rewriting vector OMT in (6), we should note the similarity with unbalanced OMT. The flow between channels may be utilized as the source. The source term describes the creation or vanishing of mass. This can be modeled via an extra layer: the mass created originates from that layer and the vanishing mass just goes to that layer.

Source layer: scalar case
Here the input source and target are regarded as two mass distributions on the subdomain E. Total mass may or may not be preserved. We add an extra source layer which is parallel to original space. For each point in E, there is an edge connecting with source layer. As now we have an extra layer, we can put the difference of mass into the new layer so that the total mass of the 2-vector new structure is preserved. The difference of mass can just be distributed uniformly to source layer or put to some area of interest for further specific applications.
Now we view the flow between channels as source term. Consider the continuity equations of (5b) and (6b): We can use ∇ * F u as the source term s. Because of the special graph structure (there is only one edge), we have: Now consider the second term in (7). Since there is only one edge, we can just omit the summation: If we further take the density of each edgeρ(t, x) = ρ(t, x), this term is exactly the same as the second term in the energy functional in unbalanced OMT setting.
Now we consider the first term of the energy functional: There are only two channels, i.e., c 1 denotes the original channel and c 2 denotes the source layer. Notice that the integral with respect to c 1 term is exactly the first term of (5). We want to get rid of the c 2 integral. This can be done by introducing a small weight parameter for that layer.

A naive generalization of vector-valued OMT: addition of weight
In the original setting, one treats the kinetic energy of different layers in an identical manner. But by simply introducing a weight, we can treat each layer differently: where w(c) is the weighting parameters. We can choose different values for different channels. For another form of vector OMT: Under this setting, if we take w(c 1 ) = 1 and w(c 2 ) very small, then the integral in the source layer is very small in that energy functional. Clearly, unbalanced OMT is almost equivalent to the extra source layer vector OMT with the latter set of weight parameters.

Implementation: scalar case
By introducing the source layer, there is an edge between two channels for which we use the flow on that edge as source, and thus there is an extra integral on that new layer. We add very small weight parameters so that two energy functionals are almost identical. See Figure 1. It is quite straightforward to implement unbalanced OMT from vector-valued OMT code. We need to only alter a few lines of code to make it work for unbalanced case: 1. Initialization: From the input, first construct two extended structures for vector OMT. The first layer is the original input and the second layer contains the mass difference. If the starting density distribution has more total mass, then the mass difference is added to the second layer of target density distribution. If the starting density distribution has less total mass, then the mass difference is added to the second layer of starting density distribution itself.
2. Set weighting parameters: Employing the code for the energy functional, just add weighting parameters to the corresponding layers.

Reformulation of unbalanced vector OMT to vector-valued OMT
The reformulation is very similar to the scalar case. The same simple idea is to add a new source layer which connects to each of the original layers. The flows on those added edges are our sources.

Source layer: vector-valued case
In the vector-valued case, the input source and target are two n-vector valued distributions. Total mass may or may not be preserved. We add an extra source layer which is connected to each of the existing layers. As before, we can put the difference of mass into the new layer so that the total mass of the (n + 1)-vector new structures is preserved.
We call the incident matrix for those newly added edges G, which is a submatrix of the new large incident matrixF. We just splitF for illustration purpose. They can be easily pieced together as in original vector setting: where u is the flow within the existing edges and v is the flow within the newly added edges.
Now we consider the continuity equations of unbalanced vector OMT and vector OMT with a new layer: We can use ∇ * G v as the source s. Now similar to the scalar case: The energy functional now becomes: With the above relationship (15), it is easy to see that the last term exactly fits the original source energy term (8a).
Again there is a new part of kinetic energy due to the newly added source layer. We can introduce weight parameters to make the extra term almost disappear. Same as adding weight technique for different layers, we can put γ and η together as a weight matrix for different edges so that the final form is more concise: where diag(w 1 ) denotes the weighting matrix for different layers and diag(w 2 ) denotes the weighting matrix for the edges we add (weight = η) and existing edges (weight= γ).ũ = u v is all the flows that between channels within the new large graphF andρ is the density assigned to each edge.

Implementation: vector-valued case
By introducing a source layer, there is an edge between the source layer and each existing channel. We use the flows on those edges as sources. As before, we add very small weight parameters for the source layer so that two energy functionals are almost identical. See Figure 2. As before, it is also very easy to implement unbalanced vector OMT from vector OMT code.
1. Initialization: From the input, first construct the extended structures for vector OMT.
Add an extra layer. Connect that new layer with each of other layers. Put the difference of total mass into that newly added layer.
2. Set weight parameters: Employing the code for the energy functional, add weighting parameters to corresponding layer and corresponding edges (existing edges and newly added edges).
With the simple change of the original vector-valued OMT, we can use the same code for unbalanced vector OMT case without even touching the main structure of the original code.

Numerical results
We tested our new formulation on several images using the numerical algorithm from [7]. Gray scale images are general mass distributions on a rectangular area while color images are vector valued distributions.

Unbalanced OMT
We tested an example of moving two Gaussian distributions. Though this example has preserved total mass, the optimal solution of OMT still uses the source term. We can look at the effect of tuning the weight parameter γ. The source image and target images are:   We also tested two images with very large total mass difference:

Unbalanced vector OMT
We tested our approach on color image data. While the interpolations of density are color, the source layer is still gray scale.

Conclusion
Vector-valued OMT is a very powerful model. In this note, we reformulated unbalanced OMT and unbalanced vector-valued OMT by adding the source as a new layer. We gave a new way to include a source term and we proposed a very simple way to implement unbalanced OMT from general vector-valued OMT code. In the present work, we only considered one kind of unbalanced formulation. Other flow-based unbalanced OMT settings fit into our model as well. We believe our generalization of vector-valued OMT has not reached its full potential. Indeed, we only give different weight parameters to different layers. But what if we give different weight parameters to different areas, different nodes or different edges? We may use different parameters according to specific applications. This will be the subject of future research.