Constrained Dynamic Mode Decomposition

Frequency-based decomposition of time series data is used in many visualization applications. Most of these decomposition methods (such as Fourier transform or singular spectrum analysis) only provide interaction via pre- and post-processing, but no means to influence the core algorithm. A method that also belongs to this class is Dynamic Mode Decomposition (DMD), a spectral decomposition method that extracts spatio-temporal patterns from data. In this paper, we incorporate frequency-based constraints into DMD for an adaptive decomposition that leads to user-controllable visualizations, allowing analysts to include their knowledge into the process. To accomplish this, we derive an equivalent reformulation of DMD that implicitly provides access to the eigenvalues (and therefore to the frequencies) identified by DMD. By utilizing a constrained minimization problem customized to DMD, we can guarantee the existence of desired frequencies by minimal changes to DMD. We complement this core approach by additional techniques for constrained DMD to facilitate explorative visualization and investigation of time series data. With several examples, we demonstrate the usefulness of constrained DMD and compare it to conventional frequency-based decomposition methods.


INTRODUCTION
The explorative visualization and analysis of time series is a wide-ranging field of research. It includes, for instance, the detection of anomalies, the identification of seasonal and cyclic patterns, and forecasting. Nowadays, explorative visualizations support these tasks via a combination of various tools based on statistics, signal processing, or machine learning. In general, these tools allow little (or even no) interaction with their core analysis mechanisms. This is also the case for common spatio-temporal decomposition techniques that result in a characteristic representation with modes. One of these techniques is Dynamic Mode Decomposition (DMD), based on spectral theory and mainly used in the field of dynamical systems and fluid dynamics.
In this paper, our approach is to incorporate frequency-based constraints into DMD to obtain an adaptive decomposition. This leads to constrained Dynamic Mode Decomposition (constrained DMD), which directly addresses the problem of facilitating user interaction with the core mechanisms. The effect of frequency-based constraints is shown in Fig. 1, where the time series describes stereotypical daily data with weekly and monthly patterns. The length of the time series makes an identification of the monthly pattern difficult. This is a common problem for time series with multiple time scales. Constrained DMD solves this issue due to the incorporation of user-defined frequencies.
Hence, our main contributions can be summarized as follows:
• An equivalent reformulation of DMD is presented that enables incorporation of multiple frequency-based constraints.
• We provide an efficient algorithm that uses the frequency-based constraints in a human-in-the-loop manner.
• We present additional visualization techniques that complement the visual exploration and analysis.
In addition to our technical contributions, we demonstrate the usefulness of our approach with regard to two aspects: First, a qualitative and quantitative comparison to other common spatio-temporal decomposition techniques is provided. Second, our approach is evaluated with focus on the interaction with the frequency-based constraints.

RELATED WORK
There are numerous approaches in the literature that address the analysis of data via explorative visualization. An overview of general interactive visualization techniques is presented by Tominski and Schumann [53]. For time-oriented data, an overview is given by Aigner et al. [1]. Here, an important challenge is the identification of trend, seasonal, and cyclic patterns. This is also one of the main goals of this paper. For this task, we introduce a formulation of DMD that enables interaction with the core mechanism in a human-in-the-loop manner. Thus, the related work focuses on two aspects: spatio-temporal decompositions (such as DMD) and time-oriented visualization techniques with similar goals that integrate the human-in-the-loop. In the following, we start with a comparison to DMD-based approaches, followed by more general ones.

Fig. 2. A step-by-step application of our approach from a user perspective. The schematic visualization shows multiple applications of our constrained DMD in a human-in-the-loop manner. The user performs feedback loops on the basis of external knowledge or present observations.

Many theoretical and algorithmic concepts of DMD [45, 46] were developed in the literature and applied to time series data, particularly in the field of dynamical systems and fluid dynamics. While these approaches address either the efficiency [9, 20] or the accuracy of identified frequencies [14, 41, 54], they do not provide any mechanism to directly influence the core algorithm with respect to the computation of frequencies; we want to fill this gap with this paper.
More closely related is DMD with control (DMDc) [43], a DMD variant that integrates external control into the dynamics. This makes it possible to analyze complex systems that require additional control. However, DMDc addresses completely different problems, whereas our technique incorporates constraints that directly influence DMD.
Another strategy that is closely related to our approach is constraining the DMD-specific minimization problem. Inspired by the Perron-Frobenius operator (the dual to the Koopman operator), Goswami et al. [23] introduce a minimization problem with constraints that guarantee a positive operator with row sum equal to 1. Similarly, Cohen et al. [16] restrict the minimization problem to operators that are symmetric. Multiple other publications deal with constraints that enforce a low-rank formulation of DMD. A closed-form solution with optimal ℓ2-norm is provided by Héas and Herzet [27], who also give an overview of sub-optimal approaches [26]. Although our method constrains the DMD-specific minimization problem in principle, we directly address the spectral theoretic properties (i.e., eigenvalues and thus frequencies), whereas the approaches mentioned above address algebraic properties.
To the best of our knowledge, the publication of Krake et al. [33] is the only paper that directly integrates frequency-based constraints into DMD. They provide a DMD-based method for video foreground/background segmentation that computes a static background model. For this step, a single constraint (DMD must contain the eigenvalue λ = 1) is integrated into DMD. While our approach uses a partly similar derivation, we end up with a formulation that allows integrating multiple different eigenvalues (frequencies) into DMD. In addition to that, our formulation also enables a human-in-the-loop process.
Besides DMD-based approaches, the most common techniques for the visualization and analysis of time series are seasonal trend decomposition using LOESS (STL) [15] and singular spectrum analysis (SSA). STL decomposes a univariate time series into three components: trend, season, and residual. It is used for visual analysis, identification of anomalies [13, 28], and forecasting [39]. In contrast, SSA is a spectral decomposition that produces multiple oscillatory modes characterizing specific frequencies (although SSA is not connected to a frequency domain representation). Due to its generic formulation, SSA has various applications [2, 4, 22, 25, 31, 40]. A more recent development is the use of an HJ-biplot for an explorative visualization [17]. Besides the fact that DMD produces a conceptually different decomposition, we incorporate multiple frequency-based constraints into DMD.
The approaches mentioned before only partially take advantage of user interaction. More interaction-oriented techniques combine representations such as a calendar-based view [55], spiral view [38,57], time wheel [51], or parallel coordinates [24] with tools like autocorrelation, bias reduction, aggregation, and discrete Fourier transform to analyze [12,30,56,58], segment [3,42], filter [7], or predict [5] time series. There are also direct extensions of previously mentioned techniques that integrate interaction. Carlis et al. [11] and Tominski and Schumann [52] analyze time series for the identification of seasonal and cyclic patterns via an interactive spiral view. Another related topic is the interaction with machine learning via visualizations [29,44]. All these publications address goals that are similar to ours. However, our explorative visualization is based on a spectral decomposition, which conceptually differs a lot. In addition, our user interaction facilitates the incorporation of frequencies, which is not covered in the above works.

PROBLEM STATEMENT
In this section, our goal is to discuss which problems we want to address and how they are typically solved using the proposed human-in-the-loop process. Furthermore, we specify the requirements for DMD.

Introductory Example
In the following, we want to describe how a user approaches certain problems with our method. This will also emphasize which types of problems can be addressed. We consider the same time series as in Fig. 1, i.e., the task is to analyze daily data. This is the first information the user is confronted with, as illustrated in Fig. 2. However, the user has no influence on the application of conventional DMD to data. The computation results in DMD components that highlight a trend and two seasonal patterns with the periods 18.24 and 6.90, respectively.
In general, the user has to accept the patterns and can use postprocessing techniques to enhance the results. However, they do not influence the core mechanism of DMD. Our approach enables incorporating multiple frequency-based constraints into DMD. The corresponding algorithm is designed such that this process can be performed by a feedback-loop. This means, for instance, that the user may observe an interesting pattern, such as the period 6.90, and wants to adjust it. Furthermore, the user can connect observed patterns with external knowledge (here, the connection to weekly patterns). This is the first user-specified update in Fig. 2: the adjustment of the period 6.90.
The updated DMD components still produce a pattern (period 17.05) that does not match with daily data. Therefore, the user may try to integrate another frequency that has a relation to the data. Here, the user tries to integrate a two-weekly pattern, i.e., the period 14 into the system by trial and error. It can be observed that the updated DMD components replace the period 17.05 by the period 14.00. However, this increases the error significantly. Therefore, the user decides to decline this change and, instead, integrates the next reasonable choice of frequency: a monthly pattern. The final result seems to be reasonable because the decomposition exhibits relevant time scale features in the data.

Requirements for DMD
In the previous subsection, we described a typical situation for our approach. It shows that our technique has to be computationally efficient to facilitate human-in-the-loop interaction for the user. This aspect must be taken into account for the mathematical modeling. Moreover, due to limited space and a typically large number of DMD components, a filtering is necessary to highlight the relevant components. In addition to that, the user should keep track of changes in case of multiple updates (in particular, if updates are performed by trial and error).

MATHEMATICAL BACKGROUND
In this section, we briefly explain the theory of delay embedding and how we use it in the context of DMD. In addition, the theoretical and algorithmic concept of DMD is presented.

Time Series Data and Delay Embedding
Although our approach is applicable to any kind of data, we describe it for real-valued time series. Given a time series z_0, ..., z_M ∈ R^N, the number of attributes is N and the number of time steps is M + 1. In general, for high-dimensional data (as from many examples of simulation-based or experimental data), the relation N ≫ M holds (leading to linearly independent vectors), which is the typical condition for DMD. In this case, DMD can directly be applied to the time series z_0, ..., z_M without any further processing. In contrast, for univariate time series or multivariate time series with only a few attributes, the resulting relation N ≪ M usually introduces large errors in the reconstruction by DMD and an inaccurate identification of frequencies. A commonly used technique to avoid this DMD-specific dimensionality problem is time delay embedding, which is also called Hankelization [18, 19, 21, 32, 48, 50]. Developed in the field of dynamical systems, time delay embedding addresses the problem of reconstructing (chaotic) dynamics in state space with a sequence of observations [49]. In our discrete setting, the procedure of time delay embedding corresponds to lifting the time series z_0, ..., z_M ∈ R^N into a higher-dimensional space R^n by stacking d time-shifted copies. We describe this via

H_d(z_0, ..., z_M) = (x_0, ..., x_m)  with  x_k = (z_k^T, z_{k+1}^T, ..., z_{k+d}^T)^T ∈ R^n,   (1)

where we have the relations n = N(d + 1) and m = M − d with the delay parameter d ≥ 0. The delayed time series is a Hankel-type matrix (in general, not a square matrix). Each row corresponds to a windowed representation of the original time series. The delayed time series solves the DMD-specific problem of dimensionality because, for real-world data, the delay parameter d can be chosen such that the columns of the Hankel-type matrix are linearly independent. In this context, the relation n ≥ m + 1 must be satisfied. Inserting the variables n and m, this leads to the condition d ≥ (M − N + 1)/(N + 1) for the delay parameter d.
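To make the embedding concrete, the following NumPy sketch builds the Hankel-type matrix H_d for a series stored column-wise (the function name delay_embed is our own choice; the construction follows Equation 1):

```python
import numpy as np

def delay_embed(Z, d):
    """Time delay embedding H_d: stack d time-shifted copies of the series.

    Z: (N, M+1) array whose columns are the time steps z_0, ..., z_M.
    Returns the Hankel-type matrix of shape (N*(d+1), M-d+1), whose
    k-th column is x_k = (z_k; z_{k+1}; ...; z_{k+d}).
    """
    N, M1 = Z.shape
    m = M1 - 1 - d                       # m = M - d
    cols = [Z[:, k:k + d + 1].reshape(-1, order="F") for k in range(m + 1)]
    return np.column_stack(cols)
```

For a univariate series (N = 1), each column of the result is simply a sliding window of length d + 1 over the original series.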
For univariate time series (N = 1), the condition boils down to d ≥ M/2. However, we recommend selecting a delay parameter that is slightly larger than M/2. In our evaluations, d was chosen as approximately d = 0.6 · M. For these values, our method performs in a reasonable and stable way. Choosing values near M/2 or M can lead to insufficiencies. This aspect is shown in a parameter study in the supplemental material [34].
Given the delayed vectors x_0, ..., x_m ∈ R^n with the relations n = N(d + 1) and m = M − d, the procedure of reverting the higher-dimensional embedded objects to the original space is crucial (especially for our approach). The common technique is diagonal averaging (actually, the anti-diagonals are averaged). We describe this operation with

H_d^+(x_0, ..., x_m) = (z_0, ..., z_M),   (2)

where each z_k is obtained by averaging all entries of the Hankel-type matrix that correspond to time step k. It can be observed that diagonal averaging is a left inverse operation for time delay embedding, i.e., the identity H_d^+ ∘ H_d = id holds. Hence, the interaction with DMD is as follows: DMD is applied to the delayed time series (to overcome the dimensionality problem mentioned before), and diagonal averaging is used to relate our DMD results to the original space.
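A minimal sketch of the diagonal-averaging operator H_d^+ (the function name diagonal_average is our own; it assumes the block layout of the delay embedding, i.e., each column stacks d + 1 consecutive time steps of N attributes):

```python
import numpy as np

def diagonal_average(H, N, d):
    """Left inverse H_d^+ of the delay embedding: average anti-diagonals.

    H: (N*(d+1), m+1) delayed matrix. Returns the (N, M+1) series with
    M = m + d, where z_k averages all blocks of H that correspond to
    time step k in the embedding.
    """
    m1 = H.shape[1]
    M = m1 - 1 + d
    Z = np.zeros((N, M + 1))
    counts = np.zeros(M + 1)
    for col in range(m1):
        for shift in range(d + 1):
            # block `shift` of column `col` holds time step col + shift
            Z[:, col + shift] += H[shift * N:(shift + 1) * N, col]
            counts[col + shift] += 1
    return Z / counts
```

Applied to an exact delay embedding, this recovers the original series, which is the left-inverse property stated above.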

Dynamic Mode Decomposition
DMD is a spectral decomposition technique that extracts spatio-temporal patterns from data. As mentioned before, we apply DMD to time-delayed data to overcome the DMD-specific issue of dimensionality. Hence, we can assume the relation n ≥ m + 1 and the linear independence of the vectors. Before we explain the algorithm of DMD shown in the first part of Algorithm 1 (lines 1-7), we describe the principles of DMD [36, 37], which help to understand our approach in the next section.
In general, DMD efficiently computes eigenvalues and eigenvectors of a high-dimensional matrix A ∈ R^{n×n}, which determine the frequency-based decomposition of the data, via a low-dimensional representation S. With the data matrices X = (x_0, ..., x_{m−1}) and Y = (x_1, ..., x_m), the matrix A is characterized by the minimization problem

min_A ∥AX − Y∥_F.   (3)

Since this minimization problem usually does not have a unique solution (due to the fact n > m), DMD uses the minimum-norm solution that is given by

A = Y X^+,   (4)

where X^+ is the Moore-Penrose pseudoinverse of X. To avoid computationally costly operations for the spectral decomposition of A, the low-dimensional representation S ∈ R^{r×r} of A is computed via the thin singular value decomposition (SVD) of X, where irrelevant rows and columns in the square matrix representation are discarded, i.e., X = UΣV* with U ∈ R^{n×r}, Σ ∈ R^{r×r}, V ∈ R^{m×r}, and r = rank(X) = m:

S = U* Y V Σ^{−1}.   (5)

Fig. 10. Application of DMD and constrained DMD to the hourly multivariate dataset that consists of energy consumption, temperature, and the installed capacity of solar panels. Constrained DMD incorporates user-specified daily, half-daily, weekly, and half-weekly frequencies. The resulting components are filtered with regard to our proposed filtering. The scale of the energy consumption and installed capacity refers to the same values. For all three attributes, the scale of the trend matches the scale of the superposition (i.e., blue and purple start at the value 0 (bold dotted line) and increase in steps of 1000; orange starts at the value 60 (bold dotted line) and increases in steps of 10). The seasonal patterns have the following scale: the bold dotted line indicates the value 0, and the other two dotted lines represent 100 and −100 (for blue and purple) and 1 and −1 (for orange), respectively.

Multivariate Time Series
In this subsection, a multivariate time series is analyzed from a user perspective. The time series has three attributes, where the target variable is San Diego's energy consumption (in MWh). Additionally, the highly correlated dry bulb temperature (in Fahrenheit) and installed capacity of solar panels at customer sites (in kW) are considered. The dataset is hourly and has 43824 hours (5 years). However, we applied DMD to only a section of the time series that corresponds to two weeks (i.e., 337 measurements). For this step, we used the delay parameter d = 203. Analogously to the previous subsection, the large number of DMD components (in this case, 133) makes it hard to identify relevant patterns if only the sorting via their influence is used. Applying our filtering (e_T = 10^{−3}, e_S = 0.01), we obtain a representation of DMD components that is illustrated in Fig. 10 (left). The different columns show the three attributes (the scale is explained in the caption). We see that DMD successfully identifies a daily pattern with period 24.07 and a half-daily pattern with the exact period 12.00. According to the superposition, we observe that these two patterns (along with the trend) characterize the energy consumption and the temperature quite well. This aspect is also reflected in the component with period 50.10. For the energy consumption and temperature, the influence is quite small (see dotted lines), whereas for the installed capacity, it is the most influential component. However, the superposition shows that the shape is not accurately captured. We further observe in this context that the shape exhibits a weekly pattern.
Thus, we can use this observation by performing a feedback loop via constrained DMD to incorporate the period 168. Since DMD probably detects harmonics as well, we decide to also integrate the period 84. The result of this user-specified update is presented in Fig. 10 (right). First of all, we observe that all of our user-defined frequencies have a large positive influence on the system. This shows that our choice was appropriate. The resulting components of constrained DMD provide (after filtering) useful insights into the data. Namely, the update confirms our observation that the installed capacity strongly exhibits a weekly pattern (and half-weekly pattern). This results in an improved approximation in the superposition and a smaller mean error. Additionally, we observe that weekly patterns also enhance the superposition and mean errors of the other attributes. This concludes the use of our approach for an explorative analysis of the multivariate time series.


Fig. 4. A flowchart-like overview of our approach. After the application of DMD, the DMD output is processed in the direct visualization, which may give initial insights into DMD. On this basis, the user can decide to consult the additional visualization techniques, including clustering and change tracking of eigenvalues, or to incorporate desired frequencies via constrained DMD.
This low-dimensional matrix is the key to efficiently compute the spectral components of A that characterize the data spatio-temporally. An algorithmic description is provided in the first part of Algorithm 1 (lines 1-7). Starting with the definition of X and Y (lines 1-2), the thin SVD of X (line 3) is used to compute the low-dimensional matrix S ∈ R^{m×m} (line 4) via Equation 5. The next step is to compute the nonzero eigenvalues λ_j ∈ C and eigenvectors v_j ∈ C^m of S (line 5). Then, the eigenvectors are transformed into DMD modes ϑ_j = (1/λ_j) Y V Σ^{−1} v_j [36] and are arranged column-wise in the matrix Θ (line 6). Finally, the DMD amplitudes a_j ∈ C are computed by solving min ∥ΘΛa − x_1∥_2 via a least-squares solver (line 7). In general, this algorithm leads to an approximate decomposition of the data. Krake et al. [36] proved that this representation is exact if the matrices X and Y are full-rank and the eigenvalues λ_1, ..., λ_m are distinct (in this case, it holds r = m):

x_k = Σ_{j=1}^{m} λ_j^k a_j ϑ_j + δ_{k,0} ϑ_0  for k = 0, ..., m,   (6)

where Λ = diag(λ_1, ..., λ_m), δ_{k,0} is the Kronecker delta, and ϑ_0 = x_m − X X^+ x_m (these are only used for the reconstruction of x_0 and do not contribute to the dynamics). The equation provides an interpretation of the components λ_j, ϑ_j, and a_j, which is important to understand our approach.
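As a sanity check of lines 1-7, a minimal NumPy sketch of the core algorithm might look as follows (the function name dmd and the rank tolerance are our own choices; S, the modes, and the amplitudes follow the definitions above):

```python
import numpy as np

def dmd(X, Y):
    """Core DMD sketch: thin SVD of X, low-dimensional S = U* Y V Sigma^{-1},
    its eigendecomposition, exact modes, and amplitudes."""
    U, s, Vh = np.linalg.svd(X, full_matrices=False)
    r = int(np.sum(s > 1e-10 * s[0]))            # numerical rank of X
    U, s, Vh = U[:, :r], s[:r], Vh[:r, :]
    S = U.conj().T @ Y @ Vh.conj().T / s         # S = U* Y V Sigma^{-1}
    lam, W = np.linalg.eig(S)                    # eigenvalues and eigenvectors
    Theta = (Y @ Vh.conj().T / s) @ W / lam      # modes (1/lambda_j) Y V Sigma^{-1} w_j
    # amplitudes: min || Theta Lambda a - x_1 ||_2
    a = np.linalg.lstsq(Theta * lam, X[:, 1].astype(complex), rcond=None)[0]
    return lam, Theta, a
```

For data generated exactly by a linear map, the decomposition reproduces the snapshots via x_k = Θ Λ^k a.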

METHOD
This section presents our method: constrained DMD. We start with a motivation that illustrates the potential of DMD. Afterward, the formulation for incorporating multiple frequency-based constraints is derived and the corresponding human-in-the-loop algorithm is provided, which is publicly available¹. We conclude the section with additional visualization techniques that complement the visual exploration.

Motivation
DMD decomposes data into three types: eigenvalues, modes, and amplitudes. According to Equation 6, the time-delayed temporal evolution of the j-th DMD component is given by x̂_k(j) = λ_j^k a_j ϑ_j ∈ C^n for k = 0, ..., m. To relate it to the original space (such that specific patterns in the data are characterized), diagonal averaging needs to be applied, i.e., the j-th component in the original space is given by H_d^+(x̂_0(j), ..., x̂_m(j)). In the case of univariate data, the application of DMD and constrained DMD is demonstrated in Fig. 1. The time series used in this figure (middle box) is a typical example of daily data. Besides a linear trend and noise, two seasonal patterns (periods 7 and 28) are part of the superposition, which correspond to weekly and monthly patterns, respectively.
The application of both DMDs leads to many components, where the three most dominant ones are depicted in the figure (sorted top to bottom). The remaining components characterize noise patterns. While conventional DMD (left box) nearly identifies the weekly pattern with the period 6.90, it does not correctly represent the linear trend or the other seasonal pattern. In fact, DMD assigns the period 18.24 to the monthly pattern. Due to the natural spatio-temporal representation of DMD (assignment of spatial components to a frequency and growth rate), an incorporation of frequency-based constraints is feasible, which is realized by our constrained DMD (right box). This interaction can be based on a human-in-the-loop process (e.g., inspired by previous results of DMD) or be motivated by external aspects. Since the time series consists of daily data, the time series will exhibit weekly and monthly patterns. Choosing these patterns as constraints, our constrained DMD produces analytically correct periods. The corresponding temporal evolutions of the components identify the patterns of the time series correctly. This shows the potential of our approach.

¹ https://github.com/kraketm/ConstrainedDMD

Mathematical Derivation
As motivated in the previous subsection, our goal is to incorporate frequency-based constraints into DMD to enable interaction with the core algorithm. To realize this, we first recap that the linearly independent time-delayed data with the respective data matrices X and Y as well as the reduced singular value decomposition X = UΣV* result in the low-dimensional matrix S = U* Y V Σ^{−1} ∈ R^{m×m} (see Equation 5 or lines 1-4 in Algorithm 1). This is the starting point of our derivation and is highlighted in the top left of Fig. 3. The figure shows an overview of the entire derivation in the form of a commutative diagram. The dashed arrow indicates the goal, that is, to transform the matrix S such that the modified matrix S̃ has the desired eigenvalues (which we demand), a reasonable connection to DMD, and numerical advantages for an efficient computation. In the following, we will derive a formulation for the matrix S̃ following the commutative structure in Fig. 3 in three steps.
Derivation step 1. Another way of formulating DMD is to use a companion matrix C. To facilitate the next derivation steps, we recap that a companion matrix (according to our definition) has the structure of ones on the subdiagonal and the vector c = (c_0, ..., c_{m−1})^T in the last column:

C = (0 ⋯ 0 c_0; 1 ⋯ 0 c_1; ⋮ ⋱ ⋮ ⋮; 0 ⋯ 1 c_{m−1}) ∈ R^{m×m}.   (7)

In this case, instead of solving the minimization problem (Equation 3), the related minimization problem min_C ∥XC − Y∥_F is considered, which boils down to min_c ∥Xc − x_m∥_2. A solution is given by

c = X^+ x_m.   (8)

The relation between the two concepts is given by [36, Lemma 5.2]

S = (ΣV*) C (ΣV*)^{−1},   (9)

which demonstrates that the two matrices are similar, i.e., they are equal up to a change of basis. Hence, they have the same eigenvalues.
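The companion formulation can be sketched and checked numerically as follows (the helper name companion_dmd is our own; it solves min ∥Xc − x_m∥_2 and assembles C):

```python
import numpy as np

def companion_dmd(X, x_m):
    """Companion-matrix DMD: solve min ||X c - x_m||_2 via c = X^+ x_m and
    assemble the companion matrix C (ones on the subdiagonal, c in the
    last column)."""
    c = np.linalg.lstsq(X, x_m, rcond=None)[0]
    m = X.shape[1]
    C = np.zeros((m, m))
    C[1:, :-1] = np.eye(m - 1)   # subdiagonal of ones shifts the snapshots
    C[:, -1] = c                 # last column reconstructs x_m from X
    return C, c
```

When the data are exactly linear (so that Y = XC holds), C and the low-dimensional matrix S share the same eigenvalues, which reflects the similarity relation above.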

CONCLUSION
We have presented a mathematical formulation that enables incorporation of multiple frequencies into DMD. Based on this, an efficient algorithm and additional visualization techniques were provided that support a human-in-the-loop process. This is a significant improvement toward the interactive application of DMD. Thus, an explorative visualization and analysis (of time series) is achieved, where the user is a part of the core mechanism of DMD. In this way, it is possible to bring in external knowledge or to specify identified patterns that may be wrong or only approximately correct.
In the future, we want to provide the user with additional DMD-consistent tools that suggest possible frequency-based constraints. These could also be partially automated. Another interesting research direction is the interaction of constrained DMD with the forecasting of time series. Due to our filtering and clustering, a subset of relevant DMD components is computed that may predict more accurately and transparently. In this context, it would also be interesting to apply our constrained DMD to other fields of application such as fluid dynamics or computer vision.

APPENDIX: CONSTRAINED MINIMIZATION PROBLEM
Given the matrices A ∈ R^{m×m} and Ã ∈ R^{q×m}, where A is invertible, as well as the vectors b ∈ R^m and b̃ ∈ R^q, the solution to the minimization problem

min_{x ∈ R^m} ∥Ax − b∥_2^2  subject to  Ãx = b̃

is given by

x = A^{−1}b + (A*A)^{−1} Ã* (Ã (A*A)^{−1} Ã*)^{−1} (b̃ − Ã A^{−1} b).

Proof. We can solve this minimization problem via the method of Lagrange multipliers. The Lagrangian function is given by L(x, μ) = ∥Ax − b∥_2^2 + μ*(Ãx − b̃). Differentiating with respect to x and μ leads to the two equations

2A*(Ax − b) + Ã*μ = 0  and  Ãx = b̃.

Solving the first equation for x yields x = A^{−1}b − (A*A)^{−1}Ã*μ/2. Inserting this into the second equation and solving for μ/2 gives μ/2 = (Ã(A*A)^{−1}Ã*)^{−1}(ÃA^{−1}b − b̃), which results in the stated solution.
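Assuming Ã has full row rank, the stated closed form can be implemented and cross-checked against the Lagrange conditions directly (the helper name constrained_lsq is our own):

```python
import numpy as np

def constrained_lsq(A, b, At, bt):
    """Closed-form solution of  min ||A x - b||_2^2  s.t.  At x = bt,
    with A invertible and At of full row rank."""
    x0 = np.linalg.solve(A, b)                 # unconstrained minimizer A^{-1} b
    M = np.linalg.inv(A.conj().T @ A)          # (A* A)^{-1}
    W = At @ M @ At.conj().T                   # At (A* A)^{-1} At*
    return x0 + M @ At.conj().T @ np.linalg.solve(W, bt - At @ x0)
```

The result can be verified by solving the KKT system of the same Lagrangian numerically; both approaches must agree.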
Derivation step 2. Whereas the matrix S has a more compact numerical representation (due to the SVD), the companion matrix C has a direct relation to the eigenvalues. In fact, the characteristic polynomial of C is determined by

p_C(λ) = λ^m − c_{m−1}λ^{m−1} − ⋯ − c_1 λ − c_0.   (10)

This connection enables a modification of the eigenvalues. (Although Krake et al. [33] also use this connection to enforce the eigenvalue λ = 1, they do not exploit the similarity of S and C, leading to a different formulation.) In fact, λ_j is an eigenvalue of C if and only if p_C(λ_j) = 0. Since the matrix C (or the similar matrix S) has the m eigenvalues λ_1, ..., λ_m, the vector c and therefore the companion matrix C is fully characterized by the following system of linear equations:

G c = b,   (11)

where the Vandermonde matrix G and the vector b are given by

G = (1 λ_1 ⋯ λ_1^{m−1}; ⋮ ⋮ ⋱ ⋮; 1 λ_m ⋯ λ_m^{m−1}),  b = (λ_1^m, ..., λ_m^m)^T.   (12)

We use this representation to formulate an analogous (underdetermined) system encoding our desired eigenvalues λ̃_1, ..., λ̃_q with q ≤ m:

G̃ c̃ = b̃.   (13)

The key is now to determine how these two systems can be merged in a meaningful and numerically efficient way. The canonical method is to formulate a minimization problem that integrates the desired frequencies as a constraint:

min_{c̃} ∥G c̃ − b∥_2^2  subject to  G̃ c̃ = b̃.   (14)

Due to Equation 11, we know that c is a solution to the minimization problem without constraints and, according to the appendix, a solution is therefore given by the formula

c̃ = c + G^{−1} G̃* (G̃ G^{−1} G̃*)^{−1} (b̃ − G̃ c),   (15)

where we define G^{−1} = (G*G)^{−1}. Although this solution encodes the desired eigenvalues (implicitly in the vector c̃), it can radically alter the identification of other eigenvalues (which we do not address by our desired eigenvalues). This can also be observed in the weighting matrix G^{−1} in Equation 15, where only G is encoded without any information about the importance of certain eigenvalues. For this reason, we suggest integrating a scaling matrix into the minimization problem.
Let K ∈ R^{m×m} be a diagonal matrix that describes the importance of eigenvalues (a reasonable choice will be discussed in the next subsection). Then, the improved minimization problem is given by

min_{c̃} ∥K G c̃ − K b∥_2^2  subject to  G̃ c̃ = b̃.   (16)

Again, according to the appendix, a solution to this problem is given by

c̃ = c + K^{−1} G̃* (G̃ K^{−1} G̃*)^{−1} (b̃ − G̃ c),   (17)

where we define K^{−1} = (G* K* K G)^{−1}. In fact, if the scaling matrix K is the identity matrix, we get the same solution as for Equation 15. However, for a meaningful choice of K, we obtain a solution that both encodes the desired eigenvalues and makes minimal changes to DMD. This effect can also be observed in the weighting matrix K^{−1}.
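A compact sketch of the constrained update (the helper name constrained_c is our own; K is passed as a full diagonal matrix, and the unconstrained companion vector c together with its eigenvalues is assumed to be given):

```python
import numpy as np

def constrained_c(lam, c, lam_des, K):
    """Update the companion vector c such that the desired eigenvalues
    lam_des become roots of the characteristic polynomial, while staying
    close to the original DMD solution.

    lam:     eigenvalues of the unconstrained companion matrix C
    c:       companion vector (last column of C)
    lam_des: desired eigenvalues (as complex conjugate pairs)
    K:       diagonal scaling matrix encoding eigenvalue importance
    """
    m = len(c)
    G = np.vander(lam, m, increasing=True)       # rows (1, lam_j, ..., lam_j^{m-1})
    Gt = np.vander(lam_des, m, increasing=True)  # constraint system
    bt = lam_des ** m
    Kinv = np.linalg.inv(G.conj().T @ K.conj().T @ K @ G)  # (G* K* K G)^{-1}
    W = Gt @ Kinv @ Gt.conj().T
    return c + Kinv @ Gt.conj().T @ np.linalg.solve(W, bt - Gt @ c)
```

By construction, the updated vector satisfies the constraint exactly, so the desired eigenvalues appear in the spectrum of the updated companion matrix.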

Derivation step 3
The result of the previous step is a new companion matrix C̃ that has the desired eigenvalues implicitly encoded via the vector c̃. Finally, the companion matrix C̃ needs to be retransformed to be aligned with the original DMD algorithm. According to Equation 9, the retransformation is given by

S̃ = (ΣV*) C̃ (ΣV*)^{−1}.   (18)

This concludes our derivation of the modified matrix S̃.

Algorithm and Human-in-the-loop
Now, we describe the implementation and discuss the human-in-the-loop process. Figure 4 provides an overview of the process. The inner box shows the conventional DMD approach, which is described in lines 1-7 of Algorithm 1. The resulting DMD output is then used for further processing and visualizations. A natural step is to first visualize the DMD eigenvalues, modes, and amplitudes directly. There are different techniques described in the literature; however, in this paper, we use more standard visualizations to clearly demonstrate our approach (our goal is not to improve the direct visualizations of DMD components):
• DMD eigenvalues are visualized in the complex plane (later on, this representation is used for the tracking of eigenvalues).
• DMD modes and amplitudes are represented by their temporal evolutions x̂_k(j) = λ_j^k a_j ϑ_j ∈ C^n for k = 0, ..., m.
Besides the visualization, a sorting of DMD components is important as well due to the large number of components (see Fig. 5). We recommend sorting the components by their influence on the time series. The influence of components is not only important for the representation, it is also crucial for our constrained DMD algorithm. A possible definition of the influence will be derived in the following.
First of all, we need to emphasize that the eigenvalues λ_j, the scaled modes a_j ϑ_j, and therefore the temporal evolutions λ_j^k a_j ϑ_j occur in complex conjugate pairs in the time-delayed space [35]. Thus, complex conjugate pairs can be combined as follows: assuming that j₁ and j₂ form a complex conjugate pair, the joint temporal evolution is given by

λ_{j₁}^k a_{j₁} ϑ_{j₁} + λ_{j₂}^k a_{j₂} ϑ_{j₂} = λ_{j₁}^k a_{j₁} ϑ_{j₁} + conj(λ_{j₁}^k a_{j₁} ϑ_{j₁}) = 2ℜ(λ_{j₁}^k a_{j₁} ϑ_{j₁}).

This shows that the temporal evolution of a DMD component is actually given by its real part and, therefore, we recommend not showing temporal evolutions twice. The influence of a DMD component can be computed from the magnitude of this (real) temporal evolution (multivariate data with various scales must additionally be aligned). This scalar is not only an appropriate value for the sorting of DMD components (compare Fig. 5), but also a suitable choice for the scaling matrix K = diag(K₁, ..., K_m) of constrained DMD. Computing these scalars is also the first step of our constrained DMD if the feedback-loop in Fig. 4 is considered. As we will see later, the user feedback is useful if conventional DMD detects frequencies only approximately (see Fig. 1) or if the visualization suggests certain frequencies. It is important that frequencies are incorporated via complex conjugate pairs, i.e., a period p ∈ ℝ is incorporated via the eigenvalue pair exp(2πi/p) and exp(−2πi/p) (assuming unit time steps). The human-in-the-loop process is described algorithmically in more detail in Algorithm 1 (lines 8-18). It is formalized via a while loop that stops as soon as the user considers the DMD result to be appropriate. The definition of the user-specified frequencies λ̃₁, ..., λ̃_q is the first step (line 9). Then, the scaling matrix K and the weighting matrix 𝒦 are computed (lines 10-11). Afterward, the term 𝒦⁻¹G̃* is computed by solving q systems of linear equations (line 12).
This result as well as the computed vector c (line 13) is used to obtain the modified vector c̃ via Equation 17 (line 14). With this, we can compute the modified matrix S̃ (line 15). As indicated in the overview in Fig. 4, the modified matrix S̃ can be seen as an update of the matrix S. This procedure is described in the remaining steps of Algorithm 1, which are simply a repetition of lines 5-7. Algorithm 1 shows that our human-in-the-loop process exploits previous DMD computations (e.g., the SVD of X is performed only once). Furthermore, we emphasize that only the objects with a tilde symbol need to be updated, i.e., lines 10, 11, and 13 need to be computed only once. In addition, the computation of the modified matrix S̃ does not produce much overhead, as the inverses can be computed efficiently: line 12 uses a solver in the low-dimensional space of size m with time complexity O(m³), and line 14 computes the inverse of a q × q matrix with a time complexity of O(q³). Since q is the number of frequency-based constraints, it satisfies q ≤ m; in general, it is much smaller. Thus, the update of S computed in lines 9-15 does not produce much overhead. This also applies to the eigenvalue computation in line 16 because its time complexity is O(m³). In the last two lines 17-18, partly high-dimensional matrices are involved in the computation, such that the time complexity is O(nm²). In sum, constrained DMD has a time complexity of O(nm²); however, most of the computations are performed in a much lower-dimensional space. In addition, all solvers (EIG, LS, or thin SVD) can be parallelized [6, 8, 47].
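The two relations above — the combination of a complex conjugate pair into the real evolution 2ℜ(λ^k aϑ) and the incorporation of a period p via a unit-circle eigenvalue pair — can be illustrated with a minimal numerical sketch (assuming unit time steps; the function names are ours, not from Algorithm 1):

```python
import numpy as np

def period_to_eigenvalues(p):
    """Map a period p (in time steps) to a complex conjugate
    eigenvalue pair on the unit circle (assumes unit time steps)."""
    lam = np.exp(2j * np.pi / p)
    return lam, np.conj(lam)

def combined_evolution(lam, a, theta, m):
    """Real temporal evolution 2*Re(lam^k * a * theta) of a
    conjugate pair for k = 0, ..., m."""
    ks = np.arange(m + 1)
    return 2.0 * np.real(np.outer(lam ** ks, a * theta))

# a period of 9.63 yields a conjugate pair on the unit circle
lam1, lam2 = period_to_eigenvalues(9.63)
assert np.isclose(abs(lam1), 1.0) and np.isclose(lam2, np.conj(lam1))

# the pairwise sum of conjugate components is real and equals 2*Re(...)
a, theta, m = 0.7 + 0.2j, np.array([1.0 - 0.5j, 0.3 + 0.1j]), 5
ks = np.arange(m + 1)
pair_sum = (np.outer(lam1 ** ks, a * theta)
            + np.outer(lam2 ** ks, np.conj(a * theta)))
assert np.allclose(pair_sum.imag, 0.0)
assert np.allclose(pair_sum.real, combined_evolution(lam1, a, theta, m))
```

This also makes explicit why only one component of each pair needs to be shown: the second summand is the complex conjugate of the first.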
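The computational core of the feedback update (lines 12-14) amounts to an equality-constrained least-squares step. The following sketch assumes the appendix solution applied with A = KG; all names (constrained_update, Kcal, ...) are illustrative and not taken from the paper's implementation:

```python
import numpy as np

def constrained_update(G, K, G_tilde, b_tilde, c):
    """Sketch of the constrained-DMD coefficient update.
    Assumes the appendix solution with A = K @ G, i.e., the weighting
    matrix Kcal = G* K* K G and
    c_tilde = c + Kcal^{-1} G_tilde* (G_tilde Kcal^{-1} G_tilde*)^{-1}
                  (b_tilde - G_tilde c)."""
    Kcal = G.conj().T @ K.conj().T @ K @ G          # m x m weighting matrix
    M = np.linalg.solve(Kcal, G_tilde.conj().T)     # q linear solves, O(m^3)
    S_small = G_tilde @ M                           # q x q matrix
    mu = np.linalg.solve(S_small, b_tilde - G_tilde @ c)  # O(q^3)
    return c + M @ mu

# toy check: the constraint G_tilde @ c_tilde = b_tilde holds exactly
rng = np.random.default_rng(0)
m, q = 6, 2
G = rng.standard_normal((m, m)) + 1j * rng.standard_normal((m, m))
K = np.diag(rng.uniform(0.5, 2.0, m)).astype(complex)
G_t = rng.standard_normal((q, m)) + 1j * rng.standard_normal((q, m))
b_t = rng.standard_normal(q) + 1j * rng.standard_normal(q)
c = rng.standard_normal(m) + 1j * rng.standard_normal(m)
c_t = constrained_update(G, K, G_t, b_t, c)
assert np.allclose(G_t @ c_t, b_t)
```

Note that the q × q system is cheap (q is small), so repeated feedback iterations mainly cost the m-dimensional solves, matching the complexity discussion above.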

Additional Visualization Techniques
In the previous section, an overview of the human-in-the-loop process was presented. The feedback-loop can be performed after the first impression of the direct visualization or as a consequence of our additional visualization techniques. These are presented in the following.
Filtering and clustering In the previous subsection, we discussed the direct visualization of DMD components (compare Fig. 4). The eigenvalues are represented in the complex plane and the modes are visualized by the corresponding temporal evolution (of course, more advanced techniques can be used in the case of multivariate data). As proposed previously, we recommend a sorting via their influence to suppress misleading DMD components (see Fig. 5, step ). While this sorting is useful for a first impression, it is not optimal for the task of clustering components. Thus, we propose a filtering that takes the location of the eigenvalues in the complex plane into account.
In general, the location of an eigenvalue λ_j in the complex plane determines the dynamics of the temporal evolution λ_j^k a_j ϑ_j. Eigenvalues on the positive real axis, i.e., with a small imaginary part ℑ(λ_j) and a positive real part ℜ(λ_j), characterize a (long-term) increase or decrease in the data and therefore correspond to the trend. In contrast, eigenvalues close to the unit circle, i.e., |λ_j| ≈ 1 (which are not classified as trend components), describe oscillatory structures and therefore seasonal patterns. The remaining components characterize noise. We can use this information to group the DMD components into the following subsets:

C_T = { j : |ℑ(λ_j)| ≤ ε_T and ℜ(λ_j) > 0 }  (trend),
C_S = { j ∉ C_T : | |λ_j| − 1 | ≤ ε_S }  (seasonal),
C_R = { j ∉ C_T ∪ C_S }  (residual),

where ε_T and ε_S are user-defined parameters. Using the disjoint union of these index sets, we can create an overview representation (note that temporal evolutions occur in complex conjugate pairs). The next step is to use the filtering for clustering. We propose to cluster only components whose influence satisfies a threshold δ. The clustering of trend components can be done by the superposition of these components. In contrast, seasonal components should be aggregated into different clusters C^δ_{S,1}, C^δ_{S,2}, ... that consist of harmonics. This step must be done in a reasonable way to highlight certain patterns in the data. In this regard, constrained DMD has enormous advantages due to the incorporated frequencies (later on, we purposely choose harmonic frequencies). An illustration of the clustering process is given in Fig. 5. The residual components C_R complement the representation (and help in the case of wrong assignments).
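The eigenvalue-based grouping into trend, seasonal, and residual components can be sketched as follows (a simplified illustration; the thresholds ε_T and ε_S follow the description above, and the function name is ours):

```python
import numpy as np

def filter_components(eigvals, eps_T=1e-3, eps_S=0.05):
    """Group DMD eigenvalues into trend, seasonal, and residual index
    sets based on their location in the complex plane."""
    trend, seasonal, residual = [], [], []
    for j, lam in enumerate(eigvals):
        if abs(lam.imag) <= eps_T and lam.real > 0:
            trend.append(j)        # (long-term) increase or decrease
        elif abs(abs(lam) - 1.0) <= eps_S:
            seasonal.append(j)     # oscillation close to the unit circle
        else:
            residual.append(j)     # remaining components: noise
    return trend, seasonal, residual

# toy check: one trend, one conjugate seasonal pair, one noise component
eigs = np.array([1.001 + 0.0j, 0.3 + 0.95j, 0.3 - 0.95j, 0.2 + 0.2j])
T, S, R = filter_components(eigs)
assert T == [0] and S == [1, 2] and R == [3]
```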
Change tracking Due to our human-in-the-loop process shown in Fig. 4, constrained DMD incorporates new frequencies. This process affects all components of the DMD decomposition. Although our mathematical derivation guarantees small changes, this guarantee concerns the decomposition as a whole rather than individual components. Therefore, we propose a change tracking of eigenvalues, as the eigenvalues largely determine the decomposition.
Since the number of eigenvalues is not huge, our identification of eigenvalue pairs (old and new) is based on a similarity matrix (see Fig. 6). For the visualization, we use the representation in the complex plane and vector-like connections with dots at the end of the line that indicate the position of the eigenvalues. Due to ring-like shapes of the old eigenvalues, it is possible to identify the direction, even in the case of very small changes. To obtain a mapping between the old and new eigenvalues, pairs are identified successively on the basis of the highest value in the similarity matrix and the respective row and column are excluded after the identification. In this way, a unique mapping is guaranteed that highlights closest connections first.
For the computation of the similarity matrix Q, we use a DMD-specific formulation that takes into account the location of the eigenvalues in the complex plane as well as the modes, since the latter are objects in the high-dimensional space with a greater variance. We assume that λ₁, ..., λ_m and ϑ₁, ..., ϑ_m are the old eigenvalues and modes, respectively, and λ̃₁, ..., λ̃_m and ϑ̃₁, ..., ϑ̃_m are the new eigenvalues and modes, respectively. While the eigenvalues can be compared directly via the absolute value, i.e., δ^λ_{ij} = |λ_i − λ̃_j|, the modes stem from eigenvectors and need an additional scaling for comparison: δ^ϑ_{ij} = ∥ϑ_i − (ϑ̃_j* ϑ_i)(ϑ̃_j* ϑ̃_j)⁻¹ ϑ̃_j∥₂. Normalizing the eigenvalue and mode errors (δ^λ_{ij}, δ^ϑ_{ij}) by the respective maximum error over all pairs and taking a convex combination leads to the dissimilarity matrix D. Thus, the similarity matrix Q is given by Q_{ij} = 1 − D_{ij}. The convex combination parameter can be used to emphasize either the eigenvalue or the mode weighting.
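The greedy identification of eigenvalue pairs via the similarity matrix Q can be sketched as follows (an illustrative implementation under the stated dissimilarity definitions; the convex combination parameter is called alpha here, and it assumes that not all eigenvalues coincide):

```python
import numpy as np

def match_eigenvalues(old_l, new_l, old_m, new_m, alpha=0.5):
    """Greedy unique matching of old/new DMD components via the
    similarity matrix Q = 1 - D. Pairs are picked successively at the
    current maximum of Q; the corresponding row and column are then
    excluded, which yields a unique mapping (closest pairs first)."""
    m = len(old_l)
    d_lam = np.abs(old_l[:, None] - new_l[None, :])
    d_mod = np.empty((m, m))
    for i in range(m):
        for j in range(m):
            # distance of old mode i to its projection onto new mode j
            nj = new_m[:, j]
            proj = (nj.conj() @ old_m[:, i]) / (nj.conj() @ nj) * nj
            d_mod[i, j] = np.linalg.norm(old_m[:, i] - proj)
    D = alpha * d_lam / d_lam.max() + (1 - alpha) * d_mod / d_mod.max()
    Q = 1.0 - D
    pairs = {}
    for _ in range(m):
        i, j = np.unravel_index(np.argmax(Q), Q.shape)
        pairs[i] = j
        Q[i, :] = -np.inf      # exclude matched row ...
        Q[:, j] = -np.inf      # ... and column
    return pairs

# toy check: unchanged components map onto themselves
lam = np.array([1.0 + 0j, 0.5 + 0.5j])
modes = np.eye(3, dtype=complex)[:, :2]
assert match_eigenvalues(lam, lam.copy(), modes, modes.copy()) == {0: 0, 1: 1}
```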

RESULTS
This section presents the application of constrained DMD. First, we compare our method to other spatio-temporal decompositions. Then, we apply constrained DMD to both a univariate (where external knowledge is used) and a multivariate time series (where time-scale information is used) and explain the handling from a user perspective. Another evaluation is provided in the supplemental material [34], where we investigate a time series without any external knowledge.

Comparison of Spatio-Temporal Decompositions
In this subsection, we compare constrained DMD with DMD, SSA, and STL. For this, we use the lynx dataset, a time series of annual lynx trappings in Canada from 1821-1934. The dataset consists of 114 measurements; however, to complicate the investigation, we only use the last 70 measurements, i.e., the dataset is given by z₀, ..., z₆₉ ∈ ℝ. The time series consists of a mixture of cyclic and seasonal patterns as well as a slightly increasing trend. For the application of SSA and DMD, we used the delay parameter d = 41, which satisfies the condition d ≥ 69/2. This leads to the delayed data x₀, ..., x₂₈ ∈ ℝ⁴² as input. STL was applied with seasons 9 and 10.

Krake et al.: Constrained Dynamic Mode Decomposition

Fig. 7. The dashed lines correspond to the values 1000 and -1000 (trend and seasonal), whereas the dotted lines characterize an increase of 1500 starting with the bold dotted line at 0 (superposition and absolute error).
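For readers unfamiliar with the delay embedding, the construction of the delayed data x₀, ..., x₂₈ ∈ ℝ⁴² from z₀, ..., z₆₉ with d = 41 can be sketched as follows (an illustrative Hankel embedding; the function name is ours):

```python
import numpy as np

def delay_embed(z, d):
    """Build time-delay vectors x_k = (z_k, ..., z_{k+d}) of dimension
    d + 1 from a univariate series z_0, ..., z_{M-1} (Hankel embedding,
    used as input for SSA and DMD)."""
    M = len(z)
    return np.column_stack([z[k:k + d + 1] for k in range(M - d)])

z = np.arange(70, dtype=float)      # stand-in for the 70 lynx values
X = delay_embed(z, d=41)
assert X.shape == (42, 29)          # x_0, ..., x_28 in R^42
```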
The first observation is that STL produces fewer components, which may not identify seasonal and cyclic patterns appropriately. One reason for this is that STL cannot use non-integer numbers as input (we used 9 and 10). Another important aspect is that STL does not treat cyclic patterns separately. In fact, STL usually encodes these in the trend. In contrast, SSA, DMD, and constrained DMD produce various components. SSA sorts these by the corresponding singular values, whereas DMD and constrained DMD use our proposed filtering (starting with the trend components, followed by the seasonal components, see Fig. 5). All three methods result in a single trend (that slightly increases) and multiple oscillatory patterns. A major advantage of (constrained) DMD is the additional information of the period, which is given by the corresponding eigenvalue. This aspect enables interpretations and provides access to techniques like clustering. For constrained DMD, the structure of harmonics is clear, and the clustering provides useful insights such as the periodicities in the lynx dataset (see Fig. 5). In contrast, conventional DMD mixes up relevant frequencies, which prevents a clear clustering. In the case of SSA, clustering is performed via a comparison of the corresponding singular values. However, this can be misleading: e.g., clustering the second and third components results in a good approximation of the dominant 9.63 period, whereas clustering the fifth and sixth components results in a mixture of frequencies.
Another aspect of Fig. 7 is the reconstruction error. While a low error shows that a decomposition generally reproduces the data, it is not the primary factor for a spatio-temporal decomposition, as the results may still not provide insights into the data. In fact, STL has the lowest error; however, its components have less variety and mix up different periods. This demonstrates the limited applicability of STL. In contrast, SSA produces a larger error, yet with a better variety of patterns. Nonetheless, the interpretation of the components is difficult, as SSA does not naturally encode frequencies into patterns (a similar incorporation approach based on SSA instead of DMD would thus not be feasible). Since SSA is based on an SVD, the focus is on a best-fit solution. DMD, in contrast, performs a best fit with respect to a frequency-based representation. Indeed, DMD and constrained DMD produce errors that are similar to SSA, but the identified patterns can be interpreted easily. While DMD fails to detect the correct frequencies, our constrained DMD solves this issue and produces a decomposition with a good identification and interpretation of frequencies while preserving a small error.

Univariate Time Series
In the following, we present a step-by-step application of our method by using the lynx dataset again. Now, we want to focus on the user perspective. Therefore, we assume that DMD has been applied once, which results in 28 DMD components. According to Fig. 4, we first consult the direct visualization. Since this first impression provides only little insight (compare the similar problem in Fig. 5), we use our filtering (ε_T = 10⁻³, ε_S = 0.05, and δ = 600). This leads to a clearer representation, which is illustrated in Fig. 7 (column with DMD). DMD captures the most dominant seasonal pattern via a period of 9.988. Furthermore, we see the typical behavior of harmonic patterns: besides the period 9.988, DMD identifies the period 5.003, which oscillates twice as fast. These two components could be clustered and analyzed; however, since the other components do not seem to feature distinct patterns, we decide to give feedback to the system. Studying the literature on animal populations in northern Canada suggests a period of 9.63 [10]. Hence, we use the feedback-loop via constrained DMD to incorporate the periods 9.63 and 4.815 (9.63/2). Thus, we correct the most dominant seasonal pattern while preserving the harmonic structure. The first user-specified update results in the same number of DMD components. Again, the direct visualization leads to a sorting of temporal evolutions, which highlights the relevant components. Consequently, we use our filtering and change tracking, which are shown in Fig. 8 and Fig. 9 (left), respectively. First of all, we successfully eliminated the period 9.988 in exchange for the period 9.63. Secondly, our approach to use the harmonic period 4.815 was useful as well, since it is now the third most dominant seasonal pattern. The change tracking demonstrates that the corresponding eigenvalues are highly correlated to previous components.
Indeed, the updated component with period 9.63 arises from the previous one with period 9.988, and the component with period 4.815 arises from the previous one with period 5.003. Since the change tracking takes the shape of the temporal evolution into account (via the similarity matrix), we observe similar oscillatory patterns.
Another observation is that our choice of user-defined periods eliminates certain mixed periods (7.358 and 4.157), except for the dominant seasonal pattern with period 26.787. The superposition (or the mean error plot) shows the effect of our update: while the error increases, our newly identified patterns lay a better foundation for the cyclic patterns. With this observation, we want to perform a second user-specified update that incorporates frequencies belonging to the cyclic behavior. As the cyclic behavior occurs at regular intervals of four periods, we want to integrate the additional periods 38.52 (38.52/4 = 9.63) and 19.26 (19.26/2 = 9.63).
The user-specified update is shown in Fig. 5. As before, the direct visualization, i.e., the sorting via influence (Fig. 5), barely provides insights. Our filtering (Fig. 5, step ⋆) highlights more relevant components. We observe that all user-specified frequencies are included, which shows that previously dominant periods were replaced. In fact, the change tracking (see Fig. 9) reveals that the component with period 26.787 is a trade-off between the periods 19.26 and 38.52 (it is nearly their mean). According to our change tracking, the component with period 26.787 transforms to the user-defined period 38.52. A look at the superposition and error confirms that the four seasonal patterns along with the trend describe the time series adequately. Thus, we can cluster components (see Fig. 5). We observe that the trend slightly increases (except for small areas at the edges). Furthermore, by superimposing certain harmonic frequencies, we can reproduce the seasonal and cyclic patterns. This observation concludes the explorative analysis.

Fig. 10. Application of DMD and constrained DMD to the hourly multivariate dataset that consists of energy consumption, temperature, and the installed capacity of solar panels. Constrained DMD incorporates daily, half-daily, weekly, and half-weekly frequencies. The resulting components are filtered with regard to our proposed filtering. The scale of the energy consumption and installed capacity refers to the same values. For all three attributes, the scale of the trend matches the scale of the superposition (i.e., blue and purple start at the value 0 (bold dotted line) and increase in steps of 1000; orange starts at the value 60 (bold dotted line) and increases in steps of 10). The seasonal patterns have the following scale: the bold dotted line indicates the value 0, and the other two dotted lines represent 100 and -100 (for blue and purple) and 1 and -1 (for orange), respectively.

Multivariate Time Series
In this subsection, a multivariate time series is analyzed from a user perspective. The time series has three attributes, where the target variable is San Diego's energy consumption (in MWh). Additionally, the highly correlated dry bulb temperature (in Fahrenheit) and the installed capacity of solar panels at customer sites (in kW) are considered. The dataset is hourly and spans 43,824 hours (5 years). However, we selected a section of the time series that corresponds to two weeks and, thus, our dataset is given by z₀, ..., z₃₃₆ ∈ ℝ³. For the analysis, we used the delay parameter d = 203 (it satisfies the condition d ≥ (M − N + 1)/(N + 1) = 83.5), resulting in the delayed data x₀, ..., x₁₃₃ ∈ ℝ⁶¹², which is the input for (constrained) DMD. Analogously to the previous subsection, the large number of DMD components makes it hard to identify relevant patterns if only the sorting via influence is used. Applying our filtering (ε_T = 10⁻³, ε_S = 0.01), we obtain a representation of the DMD components that is illustrated in Fig. 10 (left). The different columns show the three attributes (the scale is explained in the caption). We see that DMD successfully identifies daily patterns with the period 24.07 and half-daily patterns with the period 12.00. According to the superposition, these two patterns (along with the trend) characterize the energy consumption and temperature quite well. For the installed capacity, the pattern with the period 50.098 is the most influential (for the energy consumption and temperature, its impact is quite small). However, the superposition shows that the shape is not accurately captured. We further observe that the shape exhibits a weekly pattern.
Thus, we can exploit this observation by performing a feedback-loop via constrained DMD to incorporate the period 168. Since DMD probably detects harmonics as well, we decide to also integrate the period 84. The result of this user-specified update is presented in Fig. 10 (right). First of all, we observe that all of our user-defined frequencies result in dominant frequencies of the system. This shows that our choice was appropriate. The resulting components of constrained DMD provide (after filtering) useful insights into the data. Namely, the update confirms our observation that the installed capacity strongly exhibits a weekly pattern (and a half-weekly pattern). This results in an improved superposition with a smaller mean error. Additionally, we observe that the weekly patterns also enhance the superposition and mean errors of the other attributes. This concludes the use of our approach for an explorative analysis of the multivariate time series.
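The dimensions of the multivariate delay embedding stated above (x₀, ..., x₁₃₃ ∈ ℝ⁶¹² for d = 203 and three attributes) can be verified with a short sketch (illustrative stacking of d + 1 consecutive samples; the function name is ours):

```python
import numpy as np

def delay_embed_multivariate(Z, d):
    """Stack d + 1 consecutive multivariate samples into one delay
    vector: for Z with shape (M, N), each x_k lies in R^{(d+1)*N}."""
    M, N = Z.shape
    cols = [Z[k:k + d + 1].reshape(-1) for k in range(M - d)]
    return np.column_stack(cols)

M, N, d = 337, 3, 203               # two weeks of hourly data, 3 attributes
Z = np.zeros((M, N))                # stand-in for the real measurements
X = delay_embed_multivariate(Z, d)
assert X.shape == ((d + 1) * N, M - d)   # (612, 134): x_0, ..., x_133
```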

CONCLUSION
We have presented a mathematical formulation that enables the incorporation of multiple frequencies into DMD. Based on this, an efficient algorithm and additional visualization techniques have been provided that support a human-in-the-loop process. This is a major improvement toward the interactive application of DMD. Thus, an explorative visualization and analysis is achieved in which the user is part of the core mechanism of DMD. It is possible to bring in external knowledge or to adjust identified patterns that may be wrong or only approximately correct.
In the future, we want to provide the user with additional DMD-consistent tools that suggest possible frequency-based constraints. These could also be partially automated. Another interesting research direction is the interaction of constrained DMD with the forecasting of time series. Due to our filtering and clustering, a subset of relevant DMD components is computed that may predict more accurately and transparently. In this context, it would also be interesting to apply our constrained DMD to other fields of application such as fluid dynamics or computer vision.

APPENDIX: CONSTRAINED MINIMIZATION PROBLEM
Given the matrices A ∈ ℂ^{m×m} and Ã ∈ ℂ^{q×m}, where A is invertible and Ã has full row rank, as well as the vectors b ∈ ℂ^m and b̃ ∈ ℂ^q, the solution to the constrained minimization problem

$$\min_{x \in \mathbb{C}^m} \; \lVert A x - b \rVert_2^2 \quad \text{subject to} \quad \tilde{A} x = \tilde{b}$$

is given by the vector

$$x^\star = A^{-1} b + (A^* A)^{-1} \tilde{A}^* \left( \tilde{A} (A^* A)^{-1} \tilde{A}^* \right)^{-1} \left( \tilde{b} - \tilde{A} A^{-1} b \right).$$
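As a sanity check, the closed-form solution can be verified numerically (an illustrative sketch; we additionally test the first-order optimality condition along the null space of the constraint):

```python
import numpy as np

def constrained_lsq(A, b, A_t, b_t):
    """Solve min ||A x - b||_2^2 subject to A_t x = b_t
    (A invertible, A_t of full row rank) via the closed-form solution
    x* = A^{-1} b + (A* A)^{-1} A_t* (A_t (A* A)^{-1} A_t*)^{-1}
         (b_t - A_t A^{-1} b)."""
    x0 = np.linalg.solve(A, b)                  # unconstrained optimum
    AtA = A.conj().T @ A
    W = np.linalg.solve(AtA, A_t.conj().T)      # (A* A)^{-1} A_t*
    mu = np.linalg.solve(A_t @ W, b_t - A_t @ x0)
    return x0 + W @ mu

rng = np.random.default_rng(1)
m, q = 5, 2
A = rng.standard_normal((m, m))
b = rng.standard_normal(m)
A_t = rng.standard_normal((q, m))
b_t = rng.standard_normal(q)
x = constrained_lsq(A, b, A_t, b_t)
assert np.allclose(A_t @ x, b_t)                # constraint satisfied
```

For any direction n with A_t n = 0, the residual gradient A*(Ax − b) must be orthogonal to n at the optimum; this is easy to check numerically and confirms that the formula indeed minimizes the objective on the constraint set.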