Texture Mixing by Interpolating Deep Statistics via Gaussian Models

Recently, considerable effort has been devoted to texture synthesis using deep neural networks, because these networks excel at handling complex patterns in images. In these models, second-order statistics, such as the Gram matrix, are used to describe textures. Although these models have achieved promising results, the structure of their parametric space is still unclear; consequently, it is difficult to use them to mix textures. This paper addresses the texture mixing problem by using a Gaussian scheme to interpolate deep statistics computed from deep neural networks. More precisely, we first reveal that the statistics used in existing deep models can be unified using a stationary Gaussian scheme. We then present a novel algorithm to mix these statistics by interpolating between Gaussian models using optimal transport. We further apply our scheme to Neural Style Transfer, where we can create mixed styles. The experiments demonstrate that our method can achieve state-of-the-art results. Because all the computations are implemented in closed form, our mixing algorithm adds only negligible time to the original texture synthesis procedure.


Introduction
Texture mixing is the process of generating new texture images that possess averaged visual characteristics of a given set of exemplars [1][2][3][4][5]. It can provide visually pleasing interpolations of different textures and therefore has numerous applications in computer vision and graphics [4,6]. Besides, the ability to create smooth morphing between textures is regarded as a criterion for "good" texture synthesis algorithms [7][8].
In the sense that a texture can be modeled by a set of statistics depicting the visual properties of its samples [9][10][11], texture mixing involves "averaging" the corresponding set of statistical measures. For copy-based texture synthesis methods, e.g. [12,13], textures can be mixed by combining pixels from multiple inputs using well-designed procedures such as in [4] or the patch match scheme [14]. These methods handle complex and geometric textures satisfactorily, but they tend to produce verbatim patterns and the mixing process is not easy to interpret. In contrast, statistical parametric texture methods, e.g. [11,15,16], are more principled, and their parameters are better understood, although they are often not as good at handling structured textures. Moreover, with parametric texture models, the mixing of textures can be computed more easily by "averaging" the corresponding set of parameters, see e.g. [1,2,5,17].
A recent breakthrough in texture modelling involves the use of deep convolutional neural networks (CNNs) [18][19][20][21][22] for texture representation, extending the parametric framework of Portilla and Simoncelli [11]. This approach enables parametric models to synthesize textures containing complex patterns with quality comparable to or better than that of copy-based methods. Under this deep-CNN framework, researchers have also cast the problem of style transfer as texture transfer [20,23]. However, due to the complex structure of the parametric space of deep CNNs [18,21,23], how to mix textures or styles with these models is still unclear.
In this paper, we address the problem of mixing textures using deep CNNs. More precisely, after studying existing texture synthesis methods with deep CNNs [18][19][20][21][22], we find that the second-order statistics used in these methods (e.g. the Gram matrix, the correlation matrix and their variants) can be represented as continuous functions of a stationary Gaussian model, so the mixing of these statistics reduces to the interpolation of Gaussian models. We therefore present a scheme, illustrated in Fig. 1, for mixing the statistics of deep CNNs using an optimal transport interpolation of Gaussian models, which enables us to mix textures simply and efficiently. We further apply our scheme to neural-style morphing, where we can interpolate between different styles. We also demonstrate that our mixing algorithm is fully compatible with feed-forward CNNs [24] and instance normalization [20], generating mixed textures or stylized photos with mixed styles in a single fast forward pass. Experiments demonstrate that our method achieves state-of-the-art results. It is also worth noting that our mixing algorithm adds only negligible time to the original texture synthesis procedure, because all the mixing computations are in closed form in the Fourier domain.
The rest of this paper is organized as follows: Section 2 briefly reviews the related work, Section 3 formulates the texture mixing problem, Section 4 presents the proposed scheme for mixing deep statistics with Gaussian models, and Section 5 provides the implementation details. Section 6 compares the proposed method with state-of-the-art methods and analyzes the experimental results. Section 7 draws some concluding remarks.

Related Work
Exemplar-based texture synthesis, whose goal is to generate new texture samples from a given texture exemplar [25], is the basis of our work. Work on texture synthesis can be roughly categorized into non-parametric models such as copy-based (also known as patch-based) methods, see e.g. [12], and statistical parametric models, see e.g. [11]. Patch-based models copy pixels or patches directly from texture exemplars to synthesize new texture samples [12,13]. This approach is efficient but sometimes generates verbatim patterns, i.e. exactly the same parts of the exemplar are heavily reused in the results. In contrast, statistical parametric methods aim to find a parametric representation of textures, an approach that often allows more control over the synthesis process. Portilla and Simoncelli [11] used wavelet and pyramid decomposition to build a parametric texture model, which can synthesize many natural textures, even those containing geometric patterns. The stationary Gaussian model [5,16,26,27,28,29,30] is a simple texture model: synthesizing Gaussian textures is easy and requires few computational resources, but the model is not good at handling textures with complex geometric patterns. Gatys et al. [23] used a CNN for texture synthesis; their method achieved good performance over a wide range of natural textures, but it failed when synthesizing textures with non-local structures and sometimes suffered from degraded quality [22]. Liu et al. [19] added an extra spectrum constraint to textures, which improved the model's ability to represent non-local structures. Sendik et al. [21] further proposed using a deep correlation matrix to model non-local structures and achieved better results than Liu et al. [19]. Li et al. [22] proposed using the centred Gram matrix instead of the Gram matrix to improve the quality of outputs.
The neural style transfer algorithm builds on neural texture synthesis. The goal of style transfer is to transfer the "style" of one image to another while keeping the "content" fixed. Although Gatys' vanilla style transfer algorithm [23] can produce high-quality stylized photos, it relies on an expensive optimization-based algorithm. To accelerate this time-consuming procedure, Johnson et al. [24] proposed perceptual loss functions and a transformation network; this network can generate textures and stylized photos in a single forward pass, hundreds of times faster than Gatys' vanilla algorithm. Ulyanov et al. [20] proposed using instance normalization to improve the quality of outputs. To learn several styles in one network, Li et al. [22] and Dumoulin et al. [31] proposed new network structures.
Over the past decades, many studies have been devoted to texture mixing, which aims to generate an "averaged" texture from several textures. Ruiters et al. [14] presented a technique to interpolate between two textures based on a patch-based method. Soheil et al. [4] reported state-of-the-art texture mixing results via an image melding algorithm. For statistical parametric models, texture mixing naturally means averaging statistics from different exemplars. Peyre [1] proposed using "grouplets" for synthesizing and mixing locally parallel textures. Joseph et al. [17] proposed using wavelets and a tree structure to model and mix textures. Rabin et al. [2] used optimal transport and pyramid decomposition to mix textures, approximating the high-dimensional Wasserstein metric by a sliced one-dimensional Wasserstein metric. Optimal transport between Gaussian distributions has been studied to mix Gaussian textures [5]. These optimal-transport-based algorithms can generate homogeneous mixed textures, but they cannot handle structured textures.
It is worth noting that not all texture models are able to mix textures. For example, in their prominent work, Portilla and Simoncelli [11] mentioned that the parametric space of their model is not convex, because linear interpolation of statistics leads to poor-quality (patchwise) results. Even the state-of-the-art texture model proposed by Gatys et al. [18] suffers from a similar problem, as linear interpolation of Gram matrices only results in a similar patchwise mixture [22].
Exemplar-based texture synthesis with CNNs. Given a texture exemplar $I_{exp}$, the aim of exemplar-based texture synthesis is to produce new texture samples $I_{syn}$ that are as similar as possible to $I_{exp}$ regarding certain visual/perceptual measurements [11]. For instance, Zhu et al. [10] argued that $I_{syn}$ and $I_{exp}$ are equivalent on statistical feature sets,
$$\{F^{(1)}_{syn}, \ldots, F^{(k)}_{syn}\} \sim \{F^{(1)}_{exp}, \ldots, F^{(k)}_{exp}\},$$
where $\{F^{(i)}_{\times}\} = \mathcal{F} \circ I_{\times}$ are the sets of texture features extracted from $I_{\times}$ by a texture model $\mathcal{F}$; these models can be filter banks [10], wavelets [11] or Markovian models [12]. The image $I_{syn}$ can thus be generated by feature projection [10,32]. A survey of exemplar-based texture synthesis was recently provided in [25].
In this paper, we are interested in exemplar-based texture models using deep CNN features [18,21,23], because of their capability to synthesize textures with complex structures. These methods use a pre-trained deep CNN, $\mathcal{F}_{CNN}$, for texture description and suggest that $I_{syn}$ can be generated by matching deep features under a similarity based on the Gram matrix or the correlation matrix. More precisely, one can initialize $I_{syn}$ with random noise and pursue an optimal output by minimizing the following objective:
$$I_{syn} = \arg\min_{I} \left\| G(\mathcal{F}_{CNN} \circ I) - G(\mathcal{F}_{CNN} \circ I_{exp}) \right\|_F^2, \tag{1}$$
where $G(a)$ is the Gram measure of matrix $a$ and $\|\cdot\|_F$ denotes the Frobenius norm. The minimization problem in Eqn. (1) can be solved using back-propagation [18].
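Exemplar-based texture mixing with CNNs. Given two input texture exemplars $I_{exp0}$ and $I_{exp1}$, exemplar-based texture mixing aims to generate new textures whose visual and perceptual properties are drawn from both inputs. Denoting the deep features of the two inputs as $F_{exp0} = \mathcal{F}_{CNN} \circ I_{exp0}$ and $F_{exp1} = \mathcal{F}_{CNN} \circ I_{exp1}$, a straightforward approach is to solve
$$I_{syn} = \arg\min_{I} \left\| G(\mathcal{F}_{CNN} \circ I) - G_{\rho} \right\|_F^2,$$
where $G_{\rho} = (1-\rho)\, G(F_{exp0}) + \rho\, G(F_{exp1})$ with $\rho \in [0, 1]$, which actually finds an $I_{syn}$ with linear interpolation of the Gram measurements. As we shall discuss in Section 6, this mixing often produces results with conspicuous artifacts.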
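The linear-interpolation baseline described above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the feature maps are random stand-ins rather than CNN activations, they are stored as arrays of shape (number of pixels, number of channels), and the Gram matrix is normalized by the number of pixels. The function names `gram`, `gram_loss` and `linear_gram_target` are hypothetical.

```python
import numpy as np

def gram(F):
    """Gram matrix of feature maps F with shape (U, k); the 1/U factor is an assumption."""
    return F.T @ F / F.shape[0]

def gram_loss(F_syn, G_target):
    """Squared Frobenius distance between the Gram matrix of F_syn and a target."""
    return np.sum((gram(F_syn) - G_target) ** 2)

def linear_gram_target(F0, F1, rho):
    """Baseline target: linear interpolation of the two exemplar Gram matrices."""
    return (1.0 - rho) * gram(F0) + rho * gram(F1)

rng = np.random.default_rng(0)
F0 = rng.standard_normal((64, 4))   # stand-in for the features of exemplar 0
F1 = rng.standard_normal((64, 4))   # stand-in for the features of exemplar 1
G_half = linear_gram_target(F0, F1, 0.5)
loss_at_exemplar = gram_loss(F0, G_half)
```

In an actual synthesis pipeline the loss would be minimized over the image by back-propagation through the network; here it is only evaluated on fixed arrays.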
In what follows, we will develop a better means to interpolate the deep CNN features for mixing textures.

Deep Texture Mixing with Gaussian Models
Following the pioneering work of Gatys et al. [18], several studies have addressed texture synthesis with deep CNNs, e.g. [19,21,22,33]. In this section, we first show that the works using the Gram matrix [18], the centred Gram matrix [22], the correlation [21] and the spectrum [19] can be unified and derived from a stationary Gaussian model. We then show that this unified scheme enables us to mix textures by interpolating deep features generated from CNNs through a simple procedure.
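-Gram matrix $G$. The Gram matrix $G$ of $F \in R^{U \times k}$ is defined as
$$G = \frac{1}{|U|} \sum_{p \in U} F(p)^{T} F(p),$$
where $F(p) \in R^{1 \times k}$ is the feature vector at pixel $p$. $G$ is a $k \times k$ positive semi-definite matrix.
-Centred Gram matrix $\bar{G}$. Instead of using the Gram matrix $G$ of $F$, Li et al. [22] suggested using the centred Gram matrix $\bar{G}$ to generate textures and reported that this approach resulted in better, or at least comparable, perceptual quality:
$$\bar{G} = \frac{1}{|U|} \sum_{p \in U} (F(p) - m)^{T} (F(p) - m),$$
where
$$m = \frac{1}{|U|} \sum_{p \in U} F(p). \tag{5}$$
-Correlation $S$. Recently, Sendik et al. [21] proposed using the correlation $S$ of $F$ to synthesize textures, and reported state-of-the-art results on non-local textures. Their deep correlations $S \in R^{U \times k}$ of feature maps $F$ of spatial size $Q \times M$ are defined as
$$S^{n}(p) = w_{p} \sum_{q} F^{n}(q)\, F^{n}(q - p),$$
in which $p = (i, j)$ is the offset vector, the sum runs over the pixels $q$ for which $q - p$ lies inside the feature map, and $w_{p} = [(Q - |i|)(M - |j|)]^{-1}$.
-Modified correlation $\tilde{S}$. The correlation $S$ ignores the relations between pixels whose horizontal distance is larger than $Q/2$ or whose vertical distance is larger than $M/2$. We therefore modify the correlation matrix by adding a periodic boundary condition:
$$\tilde{S}^{n}(p) = \frac{1}{|U|} \sum_{q \in U} F^{n}(q)\, F^{n}(q + p),$$
where $1 \le n \le k$, $p \in U$, and $q + p$ is taken modulo the size of the feature maps. $\tilde{S}(p)$ is a vector of length $k$.
Notice that these two definitions are mathematically equivalent when $F$ already satisfies the periodic boundary condition. In the rest of the paper, only the modified correlation matrix $\tilde{S}$ will be used. As we will see in the experiment section, these two correlation matrices produce similar results.
-Spectrum $\tilde{F}$. The Fourier spectrum $\tilde{F}$ of $F$ has been integrated into the process of texture synthesis [19]:
$$\tilde{F} = |\hat{F}|,$$
where $\hat{a}$ denotes the Fourier transformation of $a$ and the modulus is taken componentwise.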
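The statistics listed above can be computed together from one layer of feature maps. The following NumPy sketch assumes feature maps stored as an array of shape (Q, M, k), normalization by the number of pixels, and the spectrum taken as the Fourier modulus; these conventions and the name `deep_statistics` are assumptions for illustration. The periodic correlation is obtained through the discrete Fourier transform rather than by an explicit double sum.

```python
import numpy as np

def deep_statistics(F):
    """Second-order statistics of feature maps F with shape (Q, M, k)."""
    Q, M, k = F.shape
    U = Q * M
    X = F.reshape(U, k)                        # one row per pixel
    m = X.mean(axis=0)                         # channel means, shape (k,)
    G = X.T @ X / U                            # Gram matrix, shape (k, k)
    G_centred = (X - m).T @ (X - m) / U        # centred Gram matrix
    F_hat = np.fft.fft2(F, axes=(0, 1))        # per-channel 2-D DFT
    spectrum = np.abs(F_hat)                   # Fourier modulus per channel
    # Periodic auto-correlation of each channel via the DFT:
    # S[p, n] = (1/U) * sum_q F[q, n] * F[q + p, n], indices modulo (Q, M).
    S = np.fft.ifft2(spectrum ** 2, axes=(0, 1)).real / U
    return m, G, G_centred, S, spectrum

rng = np.random.default_rng(0)
F = rng.standard_normal((8, 6, 3))             # random stand-in for one CNN layer
m, G, G_centred, S, spectrum = deep_statistics(F)
```

Computing the correlation in the Fourier domain costs one forward and one inverse transform per channel, which is what makes the periodic variant cheap in practice.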
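Stationary Gaussian model. For feature maps $F \in R^{U \times k}$, the associated stationary Gaussian model $\mu(m, C)$ can be obtained by estimating the mean $m \in R^{k}$ using Eqn. (5) and the covariance $C \in R^{U \times k \times k}$ using the following equation [5]:
$$C(p)_{i,j} = \frac{1}{|U|} \sum_{q \in U} \left(F^{i}(q) - m^{i}\right)\left(F^{j}(q + p) - m^{j}\right),$$
where $p \in U$, $1 \le i, j \le k$, $q + p$ is taken modulo the size of the feature maps, and $C(p)$ is a matrix of size $k \times k$.
The following proposition provides the connections between the above four measurements and the stationary Gaussian model $\mu(m, C)$.
Proposition 4.1. Given feature maps $F \in R^{U \times k}$, its Gram matrix $G$, centred Gram matrix $\bar{G}$, modified correlation $\tilde{S}$ and spectrum $\tilde{F}$ can be derived from a stationary Gaussian model:
$$G = C(0) + m^{T} m, \tag{10}$$
$$\bar{G} = C(0), \tag{11}$$
$$\tilde{S}(p) = \mathrm{diag}(C(p)) + m \odot m, \tag{12}$$
$$\tilde{F} \odot \tilde{F} = |U|\, \widehat{\tilde{S}}, \tag{13}$$
where $T$ is the transpose operator, $\mathrm{diag}(\cdot)$ extracts the diagonal of a matrix, $\odot$ denotes the componentwise product, and $\hat{\cdot}$ is the Fourier transformation.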
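The derivations of Eqns. (10), (11) and (12) are straightforward. Eqn. (13) holds because the correlation $\tilde{S}$ is the auto-correlation of the feature maps $F$, which gives, for each channel $n$,
$$\widehat{\tilde{S}}^{n} = \frac{1}{|U|}\, \hat{F}^{n} \odot (\hat{F}^{n})^{*},$$
with $*$ denoting the conjugate transpose.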
It is worth noting that Proposition 4.1 can explain the experimental observations reported in [18,19,21,22]: the Gram matrix and the centred Gram matrix generate comparable synthesis results, while the spectrum [19] and the correlation [21] are both effective in synthesizing non-local structures.
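The relations between the Gram-type statistics and a stationary Gaussian model can be checked numerically. The sketch below is an illustration under assumptions, not the authors' code: feature maps are a random array of shape (Q, M, k), all statistics are normalized by the number of pixels, the covariance uses periodic boundaries, and `gaussian_model` is a hypothetical helper name.

```python
import numpy as np

def gaussian_model(F):
    """Mean m (k,) and periodic covariance C (Q, M, k, k) of feature maps F (Q, M, k)."""
    Q, M, k = F.shape
    U = Q * M
    m = F.reshape(U, k).mean(axis=0)
    A_hat = np.fft.fft2(F - m, axes=(0, 1))
    # Cross-power spectrum conj(A_hat^i) * A_hat^j for every channel pair (i, j).
    cross = np.conj(A_hat)[..., :, None] * A_hat[..., None, :]
    # C[p, i, j] = (1/U) * sum_q (F[q, i] - m[i]) * (F[q + p, j] - m[j])
    C = np.fft.ifft2(cross, axes=(0, 1)).real / U
    return m, C

rng = np.random.default_rng(1)
Q, M, k = 8, 6, 3
F = rng.standard_normal((Q, M, k)) + 2.0       # non-zero mean on purpose
U = Q * M
X = F.reshape(U, k)
m, C = gaussian_model(F)

G = X.T @ X / U                                # Gram matrix
G_centred = (X - m).T @ (X - m) / U            # centred Gram matrix
power = np.abs(np.fft.fft2(F, axes=(0, 1))) ** 2
S = np.fft.ifft2(power, axes=(0, 1)).real / U  # periodic correlation

gram_ok = np.allclose(G, C[0, 0] + np.outer(m, m))
centred_ok = np.allclose(G_centred, C[0, 0])
corr_ok = np.allclose(S, np.diagonal(C, axis1=2, axis2=3) + m * m)
spec_ok = np.allclose(power, U * np.fft.fft2(S, axes=(0, 1)).real)
print(gram_ok, centred_ok, corr_ok, spec_ok)
```

The zero-offset covariance carries the Gram-type information, while the remaining offsets carry the non-local information used by the correlation and the spectrum.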

Interpolating Deep Statistics via Gaussian Model
In Gatys' texture model [18], mixing textures corresponds to interpolating the Gram matrices. Formally, the goal is to find a continuous function $G(\rho)$, $\rho \in [0, 1]$, such that $G(\rho) = G_{\rho}$ for $\rho \in \{0, 1\}$. Obviously, the solution is not unique. For instance, a simple linear interpolation satisfies this requirement; however, linear interpolation of Gram matrices does not necessarily result in Gram matrices, and it performs poorly for texture mixing (see the experiments in Figs. 4 and 5), primarily because it ignores the manifold of Gram matrices $G$.
The significance of Proposition 4.1 is that it reduces the problem of interpolating Gram matrices $G$ to the problem of interpolating Gaussian models $\mu$. Formally, Proposition 4.1 demonstrates that, given feature maps $F$ with Gram matrix $G$ and Gaussian model $\mu$, there exists a continuous function $f_{G}$ such that $G = f_{G}(\mu)$. Therefore, given feature maps $F_{0}$ and $F_{1}$ with Gram matrices $G_{0}$ and $G_{1}$ and Gaussian models $\mu_{0}$ and $\mu_{1}$ respectively, suppose we can find a continuous function $\mu(\rho) = (m_{\rho}, C_{\rho})$, $\rho \in [0, 1]$, such that $\mu(\rho) = \mu_{\rho}$ for $\rho \in \{0, 1\}$. We then obtain a continuous function as the composition of $\mu(\rho)$ and $f_{G}$:
$$G(\rho) = f_{G}(\mu(\rho)), \quad \rho \in [0, 1], \tag{15}$$
and Eqn. (15) can be used to interpolate between the Gram matrices $G_{0}$ and $G_{1}$.
Possible solutions to the problem of interpolating Gaussian models include linear interpolation, Fisher-Rao interpolation [34] and optimal transport interpolation [5]. However, the linearly interpolated $\mu(\rho)$ is no longer Gaussian, and no explicit formula is known for high-dimensional Fisher-Rao interpolation. In contrast, optimal transport interpolation provides a closed-form solution to the problem, and it can be shown that the interpolated $\mu(\rho)$ remains Gaussian [5].
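For intuition, the finite-dimensional case has a well-known closed form. Between two Gaussians $N(m_0, \Sigma_0)$ and $N(m_1, \Sigma_1)$ with positive definite covariances, the optimal transport (2-Wasserstein) interpolation is again Gaussian, with the parameters below. The symbols $\Sigma_0$, $\Sigma_1$ and $T$ are introduced here only for illustration and are not part of the notation used elsewhere in the paper.

```latex
m_{\rho} = (1-\rho)\, m_0 + \rho\, m_1, \qquad
\Sigma_{\rho} = \bigl((1-\rho) I + \rho T\bigr)\, \Sigma_0\, \bigl((1-\rho) I + \rho T\bigr),
\qquad
T = \Sigma_1^{1/2} \bigl(\Sigma_1^{1/2} \Sigma_0\, \Sigma_1^{1/2}\bigr)^{-1/2} \Sigma_1^{1/2}.
```

Here $T$ is the linear map that transports $N(0, \Sigma_0)$ onto $N(0, \Sigma_1)$, so that $T \Sigma_0 T = \Sigma_1$ and the two endpoints are recovered at $\rho = 0$ and $\rho = 1$.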
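According to [5], given feature maps $F_{0}$ and $F_{1}$ whose Gaussian models are $\mu_{0}$ and $\mu_{1}$ respectively, the feature maps $F_{t}$ of the interpolated model $\mu_{t}$ can be calculated using the following formula in the Fourier domain:
$$\forall \omega, \quad \hat{F}_{t}(\omega) = (1 - t)\, \hat{F}_{0}(\omega) + t\, \hat{G}(\omega), \qquad \hat{G}(\omega) = \hat{F}_{1}(\omega)\, \frac{\hat{F}_{1}(\omega)^{*} \hat{F}_{0}(\omega)}{\left|\hat{F}_{1}(\omega)^{*} \hat{F}_{0}(\omega)\right|},$$
where $\hat{a}$ denotes the Fourier transformation and $a^{*}$ is the conjugate transpose of $a$.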
Proposition 4.2 provides the closed-form computation of the mixed Gram matrix and correlation matrix using optimal transport.All the computations are implemented on feature maps F. A conceptual illustration is provided in Fig. 1.
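The Fourier-domain mixing of two feature maps can be sketched as follows. This is a minimal illustration under assumptions: feature maps are real arrays of shape (Q, M, k), the mixing weight is called `rho`, frequencies where the inner product vanishes keep a phase factor of 1, and the names `ot_mix_features` and `gram` are hypothetical.

```python
import numpy as np

def gram(F):
    """Gram matrix of feature maps F (Q, M, k), normalized by the number of pixels."""
    X = F.reshape(-1, F.shape[-1])
    return X.T @ X / X.shape[0]

def ot_mix_features(F0, F1, rho, eps=1e-12):
    """Mix two real feature maps of shape (Q, M, k) with weight rho in [0, 1]."""
    F0_hat = np.fft.fft2(F0, axes=(0, 1))
    F1_hat = np.fft.fft2(F1, axes=(0, 1))
    # Inner product F1_hat(w)^* F0_hat(w) over channels: one complex number per frequency.
    inner = np.sum(np.conj(F1_hat) * F0_hat, axis=-1, keepdims=True)
    mag = np.abs(inner)
    # Unit-modulus phase factor; fall back to 1 where the inner product vanishes.
    phase = np.where(mag > eps, inner / np.maximum(mag, eps), 1.0)
    G_hat = F1_hat * phase                     # F1 phase-aligned with F0
    Ft_hat = (1.0 - rho) * F0_hat + rho * G_hat
    return np.fft.ifft2(Ft_hat, axes=(0, 1)).real

rng = np.random.default_rng(0)
F0 = rng.standard_normal((8, 6, 3))            # stand-ins for two sets of CNN features
F1 = rng.standard_normal((8, 6, 3))
F_mix = ot_mix_features(F0, F1, 0.5)
G_mix = gram(F_mix)                            # mixed Gram matrix used as synthesis target
```

At weight 1 the output is a phase-shifted version of the second feature map rather than the map itself, but its Gram matrix is unchanged because the phase factor has unit modulus at every frequency.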

Implementation Details
This section presents the implementation details of our Gaussian scheme for texture mixing. For simplicity, we follow the texture synthesis pipeline proposed by Gatys et al. [18]. However, note that our method can also be used to interpolate the centred Gram matrix, the correlation matrix and the spectrum. We also apply our scheme to style morphing.

Texture Mixing
As presented in Section 4.1, interpolating the Gram matrices $G_{exp0}$ and $G_{exp1}$ can be done by mixing their feature maps $F_{exp0}$ and $F_{exp1}$. Therefore, our Gaussian scheme for exemplar-based texture mixing with CNNs can be summarized by Algorithm 1. For simplicity, we use only two styles/textures in our algorithm, but it can be extended to morph more styles/textures without difficulty.

The interpolated Gram matrix $G^{(l)}_{\rho}$ is then computed from the mixed feature maps $F^{(l)}_{\rho}$ by Equation (20). To generate new textures, as in [18], back-propagation is used to generate an image $I_{syn}$ whose Gram matrices $\{G^{(l)}_{syn}\}$ match the interpolated ones $\{G^{(l)}_{\rho}\}$.

Styles Morphing
Our scheme can also be applied to morphing the styles of two images. Given an original image $I_{ori}$ and two style images $I_{sty0}$ and $I_{sty1}$, the goal of style morphing is to interpolate between the style of $I_{sty0}$ and the style of $I_{sty1}$ while keeping the content of $I_{ori}$.
Similar to the texture model, a Gram matrix is used to parametrize the style. Denote by $F_{ori}$ and $F_{syn}$ the feature maps of $I_{ori}$ and of the synthesized image. Let $G_{sty0}$, $G_{sty1}$ and $G_{syn}$ be the Gram matrices of the style images and of the synthesized image. Such an image $I_{syn}$ can be generated by minimizing an objective that combines a content term comparing $F_{syn}$ with $F_{ori}$ and a style term comparing $G_{syn}$ with $G_{\rho}$, where $G_{\rho}$ is an interpolated Gram matrix of $G_{sty0}$ and $G_{sty1}$ with a weight $\rho \in [0, 1]$, and $\alpha$ is a parameter that controls the degree of style blending.
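One common way to write such an objective, following the style transfer formulation of [23], is sketched below. The placement of $\alpha$ on the style term and the use of squared Frobenius norms are assumptions made for illustration.

```latex
I_{syn} = \arg\min_{I} \; \left\| F_{syn} - F_{ori} \right\|_F^2
          + \alpha \left\| G_{syn} - G_{\rho} \right\|_F^2
```

With $\alpha = 0$ the minimizer simply reproduces the content features, and larger values of $\alpha$ pull the result toward the interpolated style statistics.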
Note that, because our mixing process is in closed form, it costs almost the same time as Gatys' texture synthesizing algorithm [23].

Experimental Results and Analysis
In this section, we first evaluate the performance of our modified correlation matrix on non-local textures and compare it with Sendik's results [21], showing that our modified correlation matrix produces results comparable to theirs. We then show that our texture mixing scheme is effective on both non-periodic and periodic textures, and compare the results with other state-of-the-art texture mixing algorithms. Finally, we apply our algorithm to style transfer, where we are able to produce stylized photos of high visual quality with "intermediate" styles.

Comparisons of Correlations Matrices
In order to validate the efficiency of our proposition on the correlation matrix, given in Proposition 4.1, we compare the results using our modified correlation matrix S and Sendik's correlation matrix S [21] on texture synthesis.More precisely, we follow the same experimental settings as in [21]: using layers pool1, pool2, pool3, pool4 for Gram loss, using layer pool2 for correlation loss, and using conv1 for smooth loss.Moreover, each layer of the network is equally weighted.For experiments, each input image is re-scaled to 256 × 256 pixels, and each output image is initialized as white Gaussian noise.For optimization, the L-BFGS algorithm [37] is used.
In Fig 2, one can see that our modified correlation matrix S produced comparable results to that of Sendik's, while Gatys' Gram matrix failed to capture non-local structures.

Comparisons with State-of-the-art Texture Mixing Methods
This section evaluates our texture mixing method by comparing it with several texture mixing algorithms. We also combine our algorithm with TextureNet [20], which enables us to generate mixed textures instantly. The compared mixing algorithms are listed as follows:
-GaussTexton [5]: A simple but fast texture mixing algorithm based on stationary Gaussian models.
-Image Melding [4]: An efficient texture mixing method based on the patch match algorithm.
-Our scheme + TextureNet: The combination of Improved TextureNet [20] and our method, i.e. the Gram matrices used in TextureNet are mixed using our algorithm.
Following the settings of Gatys [18], the conv1_1, pool1, pool2, pool3 and pool4 layers are selected to impose constraints. Pairs of texture exemplars used in the experiment are shown in Fig. 3; they are taken from the DTD dataset [38] or collected from the Internet.
Figure 4: Experiments on mixing micro-textures using Image Melding [4] (1st row), GaussTexton (2nd row), LIA (3rd row), LIA + TextureNet (4th row), our mixing scheme + Gram matrix (5th row), our mixing scheme + correlation (6th row), and our mixing scheme + TextureNet (7th row). Observe that our scheme, both with and without TextureNet, can smoothly interpolate between the two inputs without producing new structures. See text for more details.
Fig. 4 displays the results of mixing two micro-textures that can mostly be described by a Gaussian model. In this experiment, we compare our algorithm with GaussTexton [5], which is specifically designed for micro-texture mixing, and Image Melding [4]. Note that the shape of the grass is completely missed by GaussTexton, while our method represents it well. Image Melding can indeed generate comparable textures, but it produces new structures, i.e. vertical stripes, that are in neither of the input exemplars. The linear interpolation algorithm (LIA) always leads to poor-quality results, with or without TextureNet, in which different textures are joined together patch-wise. The combination of TextureNet and our method gives comparable results, and the output textures are generated instantly.
Fig. 5 shows the results of mixing two textures containing more geometric structures, such as sharp edges, which go far beyond Gaussian models. In this experiment, we compared our method with Image Melding, which uses patch match for morphing structural textures, and diversified feed-forward networks (DFN) [22]. For DFN, we directly used their trained 60-texture model. Because their method is based on Gram matrices, for a fair comparison we compared our Gram-based method with theirs. Observe that the mixing procedure of DFN indeed generated new textures; however, some of these textures are not visually similar to either of the input exemplars. In contrast, our method produces considerably better results, creating smooth transitions in both color and texture patterns from one input to the other. Image Melding failed to generate intermediate textures in this experiment.
Figure 7: Comparisons of mixing periodic textures using Image Melding (1st and 3rd rows) and our scheme with the modified correlation matrix (2nd and 4th rows). Note that our algorithm preserves periodicity in every mixed texture.
Fig. 6 presents more comparisons between our Gram-based algorithm and Image Melding [4]. For the pebble textures, our Gram-based mixing algorithm mixes the edges and the shapes of the pebbles simultaneously and creates smooth transitions from one texture to the other. Image Melding can also create such a transition, but it generates obviously repeated patterns, i.e. some pebbles in the mixed textures are exactly the same. For the crack textures, our algorithm creates mixed textures in which the two kinds of cracks gradually merge into each other, whereas in the results of Image Melding the structures of the first texture are mostly ignored. Our algorithm also performs better on the third pair of textures, where delicate structures are homogeneously mixed together, while Image Melding fails to represent the stripe patterns. Fig. 7 presents more comparisons between our correlation-based algorithm and Image Melding on periodic textures. For these textures, Image Melding creates intermediate textures that are no longer periodic. In contrast, our correlation-based algorithm preserves periodicity in every mixed texture.
In conclusion, interpolating deep statistics via Gaussian models provides a flexible scheme for mixing textures with different deep-network texture synthesis models, such as Gatys' method [18] and its variants [19,21], and TextureNet [20]. It can produce smooth transitions in both color and texture patterns.

Style Morphing
In this experiment, we extend our texture mixing to style morphing. Our goal is to create "intermediate" styles between different styles, in other words, to create smooth transitions between stylized photos. We use Johnson's feed-forward structure [24] together with instance normalization [20]. We set the style layers to relu1_1, relu2_1, relu3_1 and relu4_1, and the content layer to relu4_2. The style weight is set to 5. All other parameters are left at their defaults. The photo and style images are shown in Fig. 8. We compare our results with Dumoulin's algorithm [31].
In our experiments, we propose a technique called lag constraint to morph styles. Specifically, we mix feature maps at the pool1, pool2 and pool3 layers, propagate these mixed feature maps to the relu2_1, relu3_1 and relu4_1 layers respectively, and use the resulting feature maps at the "relu" layers as constraints. Results generated with and without the lag constraint are shown in Fig. 9; with the lag constraint, styles are mixed more homogeneously.
Figure 10: Comparisons on style morphing between using Dumoulin's algorithm [31] (1-st, 3rd and 5th row) and our method (2-nd, 4-th and 6-th row).Dumoulin's algorithm can indeed create smooth transitions between different styles, but it failed to represent detailed structures.On the contrary, our algorithm can preserve much more detailed structures and create smooth transitions simultaneously.

Incremental training
When one needs to mix textures/styles with a large number of different relative weights, it can be time-consuming to initialize each optimization process with random noise. We thus propose incremental training to reduce time consumption and create smoother transitions. More specifically, our goal is to synthesize $N$ images whose relative weights $\rho$ are equally spaced in the interval $[0, 1]$: $0 = \rho_1 < \rho_2 < \cdots < \rho_N = 1$. In random training, we simply initialize each image as random noise. In incremental training, we generate these images in sequence from the smallest weight to the largest: the first image is initialized with random noise, while each of the remaining images is initialized with the previously synthesized image. The convergence/stop criterion is set to 0.001 for texture mixing, and the maximum number of iterations is fixed at 10000 for each stylized photo.
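The effect of warm-starting can be illustrated on a toy problem. The sketch below replaces texture synthesis with gradient descent on a simple quadratic whose target moves linearly with the weight, so it only shows the initialization strategy and not the actual synthesis; the helpers `optimise` and `run`, the step size and the toy targets are all assumptions made for illustration.

```python
import numpy as np

def optimise(x0, target, lr=0.1, tol=1e-3, max_iter=10000):
    """Gradient descent on ||x - target||^2; returns the solution and iteration count."""
    x = x0.copy()
    for it in range(max_iter):
        if np.sum((x - target) ** 2) < tol:
            return x, it
        x -= lr * 2.0 * (x - target)           # gradient step
    return x, max_iter

def run(weights, a, b, rng, incremental):
    """Total iterations needed to reach every target (1 - rho) * a + rho * b."""
    total, prev = 0, None
    for rho in weights:
        target = (1.0 - rho) * a + rho * b
        if incremental and prev is not None:
            x0 = prev                          # warm start from the previous result
        else:
            x0 = rng.standard_normal(a.shape)  # random-noise initialization
        prev, iters = optimise(x0, target)
        total += iters
    return total

N = 11
weights = np.linspace(0.0, 1.0, N)             # equally spaced weights in [0, 1]
a, b = np.zeros(16), np.ones(16)               # toy stand-ins for two sets of statistics
iters_random = run(weights, a, b, np.random.default_rng(0), incremental=False)
iters_incremental = run(weights, a, b, np.random.default_rng(0), incremental=True)
```

Because consecutive targets are close to each other, every warm-started run begins much nearer to its solution than a run started from noise, which is the mechanism behind the reduced total iteration count.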

Conclusion
This paper proposed a novel algorithm to mix textures generated by CNNs. We showed that the statistics used in current CNN-based texture models can be written as continuous functions of a Gaussian model, and thus can be interpolated via Gaussian models. Experimental results have shown that our algorithm excels at mixing high-quality textures and at creating mixed styles that differ from the exemplar styles.
Some issues still need to be investigated further. For example, we notice that the optimization-based CNN methods [18][23] produce some low-level noise. Although in most cases one can polish the results with total variation de-noising techniques as in [24], this problem might be completely overcome by carefully padding the feature maps [31], by using upsampling and convolution instead of deconvolution as suggested in [39], or even by retraining the CNN weights on the target image. Another important aspect is the choice of the training set for feed-forward networks. Current research uses the whole ImageNet dataset as the training set, and it is very time-consuming to iterate through the whole dataset. It is still unclear how different training sets, e.g. MSCoCo [40] and ImageNet, affect the results, or whether there exists a smaller training set that is effective for the style transfer task. Finally, note that we only described the mixing of two given textures/styles, but our algorithm can be extended to mixing more textures/styles without any difficulty.

Figure 1: The proposed texture mixing scheme. (Top) Two texture exemplars are passed through the CNN. (Middle) The outputs of selected layers are mixed using a Gaussian model. (Bottom) The Gram matrices of the mixed outputs are used as constraints to generate mixed textures.

Figure 2: Comparison of different correlation matrices. From top to bottom, 1st row: input images; 2nd row: results achieved using the Gram matrix $G$ as in Gatys [18]; 3rd row: results achieved using Sendik's correlation matrix $S$ in [21]; 4th row: results achieved using our modified correlation matrix $\tilde{S}$ described in Eqn. (12). Observe that both $S$ and $\tilde{S}$ can synthesize non-local textures well, but the Gram matrix cannot.

Figure 3: Pairs of textures (in column) used for texture mixing experiments.

Figure 8: Photos and pairs of style images (in column) used for style morphing.

Figure 9: Comparisons of style morphing results using the linear algorithm (1st and 4th rows), our scheme without lag constraint (2nd and 5th rows), and our scheme with lag constraint (3rd and 6th rows). Note that our results with the lag constraint technique (see text for details) mix styles more homogeneously.
Fig. 11 illustrates the differences between these two training procedures experimentally. For texture mixing, incremental training can speed up the optimization process by offering a better initial point, and it also leads to a lower final loss. As one can see, the difference in time consumption between the two procedures increases with $N$; when $N = 100$, incremental training is about an order of magnitude faster than random training. It is also worth noting that adjacent textures created by incremental training are more similar to each other. As a result, the whole transition is smoother and more visually pleasing than with random training. Similar results can be observed in style morphing, see e.g. Fig. 12.

Figure 11: Comparison of incremental and random training for mixing wood and wool textures. Incremental training converges twice as fast as random training (Top left); furthermore, incremental training also achieves a slightly lower loss (Top middle). The difference in convergence speed is more obvious when synthesizing more images (Top right). Note that, compared with those of random training (Bottom, 2nd row), the transitions created by incremental training (Bottom, 1st row) are smoother and more visually pleasing.

Figure 12: Comparison of style morphing using incremental training (Bottom, 1st row) and random training (Bottom, 2nd row). Notice that the transitions created by incremental training are slightly smoother.