Low-Rank Alternating Direction Implicit Iteration in pyMOR

The low-rank alternating direction implicit (LR-ADI) iteration is an effective method for solving large-scale Lyapunov equations. In the software library pyMOR, solutions to Lyapunov equations play an important role when reducing a model using the balanced truncation method. In this article, we introduce the LR-ADI iteration as well as pyMOR, focusing on those features of the library that are relevant for integrating the iteration. We present the results of numerical experiments, which indicate that the pure pyMOR implementation of the iteration outperforms external libraries when dealing with large problem dimensions.


Introduction
Modeling large-scale technical and dynamical systems forms the basis for numerous practical applications and research areas, such as systems and control theory. Due to the growing complexity of modern applications, systems of differential equations frequently become too large for numerical computations or simulations. To overcome this issue, a variety of model order reduction methods have been developed. The main goal of some of these methods is to approximate a high-dimensional system with a much smaller one, which can be used efficiently in computations while still approximating the input-output behavior of the original system well. In the process of modeling, linear time-invariant (LTI) systems of the form
ẋ(t) = Ax(t) + Bu(t),   y(t) = Cx(t) + Du(t),
with A ∈ R^{n×n}, B ∈ R^{n×m}, C ∈ R^{p×n}, and D ∈ R^{p×m} play an important role. Here, u(t) ∈ R^m describes the input, y(t) ∈ R^p the output and x(t) ∈ R^n the state of the underlying system. An established method for reducing this type of system is balanced truncation, which requires solving Lyapunov equations of the type
AX + XA^T + BB^T = 0
as a central component. This type of equation appears in various applications [2,25,26], where the system matrix A is either small and dense or large and sparse.
When dealing with small Lyapunov equations, it is appropriate to use direct solvers like the Bartels-Stewart method [6] and Hammarling's algorithm [15], which are based on the Schur decomposition of the system matrix A. For large Lyapunov equations, computing the Schur decomposition is usually too costly. Additionally, it would not be possible to store the resulting dense solution matrix X ∈ R^{n×n}. In order to avoid these issues, it is wise to use iterative solvers, such as the LR-ADI iteration, which compute low-rank factors Z ∈ R^{n×ℓ} with ℓ ≪ n such that the solution is approximated via X ≈ ZZ^T.
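The low-rank phenomenon that these iterative solvers exploit can be illustrated with a small self-contained sketch (not part of pyMOR; the stable symmetric matrix below is an arbitrary stand-in): a dense Lyapunov solve followed by a truncated eigendecomposition reveals how few columns the factor Z actually needs.

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

# Stable symmetric test matrix (1D Laplacian stencil) and a rank-1 right-hand side.
n = 200
A = (np.diag(-2.0 * np.ones(n))
     + np.diag(np.ones(n - 1), 1)
     + np.diag(np.ones(n - 1), -1))
B = np.ones((n, 1))

# Dense solve of A X + X A^T + B B^T = 0 (only feasible for small n).
X = solve_continuous_lyapunov(A, -B @ B.T)

# Truncated eigendecomposition: keep only eigenvalues above a relative tolerance.
w, V = np.linalg.eigh((X + X.T) / 2)
keep = w > 1e-8 * w.max()
Z = V[:, keep] * np.sqrt(w[keep])        # low-rank factor, X ~ Z Z^T
rank = Z.shape[1]
rel_err = np.linalg.norm(X - Z @ Z.T, 2) / np.linalg.norm(X, 2)
```

Although X is a full 200 × 200 matrix, only a small number of its eigenvalues exceed the tolerance, so Z has far fewer columns than n while reproducing X almost exactly.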
There exist a variety of software packages, such as SLICOT [20] and M.E.S.S. [8], which provide solvers for problems that occur in model order reduction. SLICOT focuses on Fortran 77 implementations which are relevant for applications in systems and control theory, whereas M.E.S.S. was originally developed to solve sparse matrix equations in MATLAB. pyMOR [5,18,23] is a model order reduction framework written in the increasingly popular programming language Python.
This paper revisits the LR-ADI iteration for solving Lyapunov equations and discusses aspects of implementing the corresponding algorithm in pyMOR respecting the library's design pattern, which enforces the abstraction of high-dimensional operations. The final implementation allows the library to work without depending on external solvers for large and sparse Lyapunov equations. Additionally, we discuss pyMOR's design paradigm, which allows developers to easily integrate external solvers for partial differential equations and extend operations, such that these can be used in parallel distributed systems. Finally, we compare the run time of external libraries and the implementation that is solely based on abstract operations available in the pyMOR environment with numerical experiments. For the comparison, we use C-M.E.S.S. and Py-M.E.S.S., which represent implementations of the M.E.S.S. library in C and a wrapper in Python, respectively.
In Section 2, we introduce balanced truncation, which serves as a motivation for solving Lyapunov equations. The LR-ADI iteration is discussed in Section 3, where we derive real low-rank formulations for the solution and the residual of Lyapunov equations. Additionally, we present appropriate choices for shift parameters and consider generalized Lyapunov equations. In Section 4, we introduce pyMOR and discuss practical aspects of implementing the LR-ADI iteration. Numerical experiments, which compare the run time of different implementations available in pyMOR, are presented in Section 5. Finally, we provide a summary and outlook in Section 6.
Throughout this paper we use the following notation: the symbol R^− denotes the strictly negative real numbers, whereas C^− represents the open left half-plane. The symbols Λ(A) and Λ(A, E) denote the spectra of the matrix A and the matrix pair (A, E), respectively. Furthermore, diag(x_1, …, x_n) is the n × n diagonal matrix which has the values x_1, …, x_n on its diagonal. For a complex matrix A ∈ C^{n×m} we define A^H := Ā^T ∈ C^{m×n} as the conjugate transpose of A.

Balanced Truncation
The primary motivation for solving Lyapunov equations in the context of model order reduction is their relevance for the balanced truncation (BT) method. In this section, we shortly discuss BT based on [2]. Note that we focus on continuous-time LTI systems which are asymptotically stable, hence satisfy Λ(A) ⊂ C^−. This ensures that the resulting Lyapunov equations have a unique positive semidefinite solution. Additionally, we solely consider homogeneous initial conditions for systems of differential equations that occur within this article. Generally speaking, BT determines weakly observable and controllable states of a system and then eliminates those appropriately. In order to distinguish between weakly and strongly observable and controllable states, some measure for these properties needs to be provided. The reachability Gramian P and the observability Gramian Q serve exactly this purpose:
P = ∫_0^∞ e^{At} BB^T e^{A^T t} dt,   (2)
Q = ∫_0^∞ e^{A^T t} C^T C e^{At} dt.   (3)
Small eigenvalues of P and Q can be associated with weakly controllable and weakly observable states, respectively. In general, there is no guarantee that weakly controllable states coincide with those that are weakly observable. For this reason, the first step in BT consists of determining a matrix T ∈ R^{n×n} such that
T P T^T = T^{−T} Q T^{−1} = diag(Σ_1, Σ_2),
where Σ_1 ∈ R^{r×r} and Σ_2 ∈ R^{(n−r)×(n−r)}. Using this transformation, the original system's realization, defined by the tuple (A, B, C, D), can be brought into a balanced realization (T A T^{−1}, T B, C T^{−1}, D), which describes the original dynamical system but has the property that weakly controllable states are simultaneously weakly observable. In order to determine the transformation matrix T, it is crucial to compute the Gramians from equations (2) and (3). It can be shown [17] that the Gramians are solutions to the dual Lyapunov equations
AP + PA^T + BB^T = 0,   (4)
A^T Q + QA + C^T C = 0.   (5)
After computing the Gramians, Cholesky-like factorizations of the form P = Z_P Z_P^T and Q = Z_Q Z_Q^T allow the formulation of the singular value decomposition
Z_Q^T Z_P = [L_1, L_2] diag(Σ_1, Σ_2) [R_1, R_2]^T,
where Σ_1 = diag(σ_1, …, σ_r) ∈ R^{r×r} and the matrices L_1 and R_1 have r columns each.
Finally, we can define the transformation matrices
T_L := Σ_1^{−1/2} L_1^T Z_Q^T   and   T_R := Z_P R_1 Σ_1^{−1/2}.
As a next step, we want to truncate the system such that states associated with the singular values σ_{r+1}, …, σ_n are eliminated. The matrices above define Â := T_L A T_R, B̂ := T_L B, Ĉ := C T_R and thus the reduced-order model
d x̂(t)/dt = Â x̂(t) + B̂ u(t),   ŷ(t) = Ĉ x̂(t) + D u(t).
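The square-root BT procedure described above can be condensed into a few lines of NumPy/SciPy. The following is a hedged sketch on an arbitrary small stable example; the dense solve_continuous_lyapunov call stands in for the large-scale LR-ADI solver discussed in the remainder of the paper.

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov, svd

rng = np.random.default_rng(1)
n, m, p = 100, 1, 1
A = -1.5 * np.eye(n) + 0.05 * rng.standard_normal((n, n))   # stable by construction
B = rng.standard_normal((n, m))
C = rng.standard_normal((p, n))

# Gramians: A P + P A^T + B B^T = 0 and A^T Q + Q A + C^T C = 0.
P = solve_continuous_lyapunov(A, -B @ B.T)
Q = solve_continuous_lyapunov(A.T, -C.T @ C)

def cholesky_like(M):
    # Symmetric factorization M = Z Z^T via eigendecomposition (M is PSD).
    w, V = np.linalg.eigh((M + M.T) / 2)
    return V * np.sqrt(np.clip(w, 0.0, None))

Zp, Zq = cholesky_like(P), cholesky_like(Q)
U, s, Vt = svd(Zq.T @ Zp)              # s contains the Hankel singular values
r = int(np.sum(s > 1e-8 * s[0]))       # truncate the tiny singular values

Tl = (s[:r] ** -0.5)[:, None] * (U[:, :r].T @ Zq.T)   # T_L = S1^(-1/2) L1^T Zq^T
Tr = (Zp @ Vt[:r].T) * (s[:r] ** -0.5)[None, :]       # T_R = Zp R1 S1^(-1/2)
Ar, Br, Cr = Tl @ A @ Tr, Tl @ B, C @ Tr              # reduced-order model
```

By construction T_L T_R = I_r, i.e., the two transformation matrices form an oblique projector onto the dominant balanced directions.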

Low-Rank ADI Iteration for Lyapunov Equations
In practical applications, the coefficient matrix A of a Lyapunov equation is often sparse and the matrix BB^T, which we refer to as the right-hand side, has a low rank (i.e., B ∈ R^{n×m} with m ≪ n). In this case, it can be shown [3] that the solution of the underlying Lyapunov equation often has a low numerical rank as well. This serves as a motivation to approximate it with low-rank factors Z ∈ R^{n×ℓ} such that ℓ ≪ n and X ≈ ZZ^T. An iterative approach for solving Lyapunov equations, which takes advantage of these properties, is the LR-ADI iteration. In this section, we introduce the iteration for standard Lyapunov equations and later extend the resulting algorithm such that it can be applied to generalized Lyapunov equations
AXE^T + EXA^T + BB^T = 0,
where E ∈ R^{n×n} is an invertible matrix.

ADI Iteration
The basis for deriving the LR-ADI iteration usually consists of formulating the two-step iteration [17,22]
(A + α_i I) X_{i−1/2} = −BB^T − X_{i−1}(A^T − α_i I),
X_i (A^T + ᾱ_i I) = −BB^T − (A − ᾱ_i I) X_{i−1/2},
with complex shift parameters α_i ∈ C^− and an initial guess X_0 = 0 ∈ R^{n×n}. We discuss an appropriate choice of shift parameters in Section 3.6. The above expression is equivalent to the single iteration step
X_i = T_i X_{i−1} T_i^H − 2 Re(α_i)(A + α_i I)^{−1} BB^T (A + α_i I)^{−H},  with T_i := (A − ᾱ_i I)(A + α_i I)^{−1}.   (6)
This formulation neither takes advantage of the right-hand side's low rank nor does it compute the desired low-rank solution factors. In the following paragraphs, we derive a formulation of the ADI iteration which tackles these issues, and summarize the results in a first algorithm.
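For a small dense problem the two-step iteration can be carried out literally. The following sketch is meant purely to illustrate the formulas, not as a practical solver; it uses real shifts only, taking the eigenvalues of a symmetric stable test matrix as shifts, for which the iteration converges essentially exactly after one sweep.

```python
import numpy as np

n = 30
A = (np.diag(-2.0 * np.ones(n))
     + np.diag(np.ones(n - 1), 1)
     + np.diag(np.ones(n - 1), -1))
B = np.ones((n, 1))
I = np.eye(n)
shifts = np.linalg.eigvalsh(A)          # all real and negative, so conj(a) = a

X = np.zeros((n, n))
for a in shifts:
    # First half-step: (A + a I) X_half = -B B^T - X_{i-1} (A^T - a I).
    X_half = np.linalg.solve(A + a * I, -B @ B.T - X @ (A.T - a * I))
    # Second half-step: X_i (A^T + a I) = -B B^T - (A - a I) X_half,
    # solved by transposing both sides.
    rhs = -B @ B.T - (A - a * I) @ X_half
    X = np.linalg.solve(A + a * I, rhs.T).T

residual = np.linalg.norm(A @ X + X @ A.T + B @ B.T) / np.linalg.norm(B @ B.T)
```

Each sweep step requires two dense solves with n × n right-hand sides, which is exactly the cost the low-rank reformulation below avoids.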

Low-Rank Solution Factors
By rearranging (6) and considering X_0 = Z_0 Z_0^H with initial value Z_0 = [ ] ∈ C^{n×0}, we obtain the expression
Z_i Z_i^H = T_i Z_{i−1} Z_{i−1}^H T_i^H − 2 Re(α_i)(A + α_i I)^{−1} BB^T (A + α_i I)^{−H},
with T_i as in (6). A single low-rank factor is then given by
Z_i = [ T_i Z_{i−1},  √(−2 Re(α_i)) (A + α_i I)^{−1} B ].
Repeatedly replacing Z_j for j = i−1, …, 0 on the right-hand side of the expression above yields
Z_i = [ √(−2 Re(α_1)) T_i ⋯ T_2 (A + α_1 I)^{−1} B,  …,  √(−2 Re(α_{i−1})) T_i (A + α_{i−1} I)^{−1} B,  √(−2 Re(α_i)) (A + α_i I)^{−1} B ].   (7)
Considering the fact that the matrices (A ± αI)^{±1} and (A ± βI)^{±1} commute for arbitrary α, β ∈ C as long as the inverses exist [17] allows us to rearrange the factors occurring in the products T_i ⋯ T_2. Reversing the order of the shifts yields a more beneficial expression for (7),
Z_i = [ √(−2 Re(α_1)) V_1, …, √(−2 Re(α_i)) V_i ],   (8)
with
V_1 = (A + α_1 I)^{−1} B,   (9a)
where
V_i = V_{i−1} − (α_i + ᾱ_{i−1})(A + α_i I)^{−1} V_{i−1},  i ≥ 2.   (9b)
This leads to our first formulation of the LR-ADI iteration, which requires the solution of a shifted linear system with multiple right-hand sides in each step of the iteration and is presented in Algorithm 1. Note that m columns are added to the solution factor in each iteration step since V_i ∈ C^{n×m}. A fundamental issue with this algorithm is the lack of information about the quality of the approximation ZZ^H, and, in connection to that, it is unclear how many iteration steps s are necessary in order to obtain a sufficiently accurate approximation. A commonly used indicator for a high-quality approximation is a small Lyapunov residual
R_i := A Z_i Z_i^H + Z_i Z_i^H A^T + BB^T.
Note that the Lyapunov equation is equivalent to a system of linear equations [2]. Due to the residual's relation to the error of the currently computed approximation [1], a stopping criterion for the LR-ADI iteration based on the residual is an obvious choice. Evaluating R_i as stated above requires several large matrices to be multiplied. Since the computational effort for these multiplications is too large, we present a more favorable approach to evaluate the residual in the following subsection.
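The first formulation translates directly into code. The sketch below follows (8), (9a) and (9b) on a symmetric stable test matrix whose (real) eigenvalues serve as shifts, so a fixed number of steps suffices in place of the residual-based stopping criterion developed in the following subsection; everything outside the two recurrences is an assumption of the example.

```python
import numpy as np

n, m = 30, 1
A = (np.diag(-2.0 * np.ones(n))
     + np.diag(np.ones(n - 1), 1)
     + np.diag(np.ones(n - 1), -1))
B = np.ones((n, m))
I = np.eye(n)
shifts = np.linalg.eigvalsh(A).astype(complex)   # real negative shifts

# V_1 = (A + a_1 I)^{-1} B; V_i = V_{i-1} - (a_i + conj(a_{i-1})) (A + a_i I)^{-1} V_{i-1}.
V = np.linalg.solve(A + shifts[0] * I, B)
Z = np.sqrt(-2 * shifts[0].real) * V
for prev, a in zip(shifts[:-1], shifts[1:]):
    V = V - (a + np.conj(prev)) * np.linalg.solve(A + a * I, V)
    Z = np.hstack([Z, np.sqrt(-2 * a.real) * V])   # m new columns per step

X = (Z @ Z.conj().T).real
residual = np.linalg.norm(A @ X + X @ A.T + B @ B.T) / np.linalg.norm(B @ B.T)
```

Each step costs one shifted solve with m right-hand sides, in contrast to the n right-hand sides of the dense two-step iteration.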

Low-Rank Residual Factors
Results presented in [10] have shown that not only the solution of a Lyapunov equation but also its residual has a low rank. In particular, it can be shown that the rank of the Lyapunov residual R_i coincides with the rank of the right-hand side BB^T. This serves as a motivation to analyze low-rank residual factors W_i ∈ C^{n×m} such that R_i = W_i W_i^H. In order to derive an expression for W_i, we consider the following formula, which is a result of subtracting the solution X from equation (6):
X − X_i = T_i (X − X_{i−1}) T_i^H.
Repeatedly plugging this into the expression for R_i, using X_0 = 0 and the fact that the factors T_j commute with A, yields
R_i = (T_i ⋯ T_1) BB^T (T_i ⋯ T_1)^H.
Clearly, a single low-rank residual factor can then be written as
W_i = T_i ⋯ T_1 B.
Reconsidering the V_i defined in (9a) allows us to derive the following expressions by repeatedly replacing the V_{i−1} and afterwards rearranging the remaining factors appropriately:
W_i = W_{i−1} − 2 Re(α_i) V_i,   (10a)
V_i = (A + α_i I)^{−1} W_{i−1},   (10b)
with W_0 = B. Finally, we obtain the formulation R_i = W_i W_i^H for the residual factors. The major benefit of expressing the residual as a product of two low-rank factors is that for several matrix norms it holds [10]
‖R_i‖ = ‖W_i W_i^H‖ = ‖W_i^H W_i‖.
Since W_i^H W_i ∈ C^{m×m} is a small matrix, its eigenvalues can be computed cheaply, which allows fast evaluation of the matrix norm, for instance when using ‖·‖_2.
[Algorithm 2: LR-ADI Iteration, Version 2]
Thus, a stopping criterion for the LR-ADI iteration is provided by the relative residual
‖W_i^H W_i‖ / ‖B^T B‖ ≤ ε,   (12)
where 0 < ε ≪ 1. Integrating this expression into the iteration leads to an improved version of Algorithm 1, summarized in Algorithm 2. Note that for a clearer presentation we consistently assume that the presented algorithms require exactly s iteration steps until the convergence criterion is fulfilled.
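The cheap evaluation of this criterion rests on the identity ‖W W^H‖_2 = ‖W^H W‖_2. The following lines, with an arbitrary random W as a stand-in for a residual factor, compare the small Gram-matrix evaluation with the expensive explicit one:

```python
import numpy as np

rng = np.random.default_rng(2)
n, m = 800, 4
W = rng.standard_normal((n, m))          # stand-in for a residual factor

gram = W.T @ W                           # small m x m matrix
cheap = np.linalg.eigvalsh(gram).max()   # ||W W^T||_2 via the Gram matrix
expensive = np.linalg.norm(W @ W.T, 2)   # forms the n x n residual explicitly
```

Both quantities agree to machine precision, but the Gram-matrix route only ever touches m × m objects.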

Real Low-Rank Factors
So far, we have considered the matrices W_i and Z_i to be complex, since the shift parameters {α_1, …, α_s} are complex as well. In [9], real formulations for these low-rank factors have been derived, based on the assumption that complex shifts only appear in pairs of complex conjugates. In the following, we assume that the LR-ADI iteration is in the k-th step and that the previously computed residual factor W_{k−1} and solution factor Z_{k−1} are real-valued. Letting complex shift parameters appear as pairs of complex conjugates exclusively, we now derive real formulations for W_k and Z_k or W_{k+1} and Z_{k+1}, respectively, by analyzing every relevant case that can occur within the iteration:
1. Case α_k ∈ R^−: Due to (10a) and (10b) it holds
W_k = W_{k−1} − 2α_k (A + α_k I)^{−1} W_{k−1}.
Since W_{k−1} is a real matrix and α_k ∈ R^− we obtain W_k ∈ R^{n×m}. Hence, equation (10b) also provides a real-valued representation for V_k. Finally, from (8) we immediately obtain that the solution factor Z_k is real-valued as well.
2. Case α_k ∈ C^− \ R^−: Splitting the iterate V_k = Re(V_k) + i Im(V_k) into its real and imaginary part yields real building blocks for the low-rank factors. According to our assumption, the complex shifts appear in pairs of conjugates, so we deduce α_{k+1} = ᾱ_k, which is equivalent to Re(α_{k+1}) = Re(α_k) and Im(α_{k+1}) = −Im(α_k) in the currently discussed case. Using α_{k+1} = ᾱ_k and (9b), and introducing the notation δ_k := Re(α_k)/Im(α_k), we obtain
V_{k+1} = V̄_k + 2δ_k Im(V_k),
so that Re(V_{k+1}) = Re(V_k) + 2δ_k Im(V_k) and Im(V_{k+1}) = −Im(V_k). Finally, the residual update over the double step yields
W_{k+1} = W_{k−1} − 4 Re(α_k)(Re(V_k) + δ_k Im(V_k)),
where clearly W_{k+1} ∈ R^{n×m}. Based on the derived real-valued expression for the residual factor, we can show that Z_{k+1} has a real representation as well. Let us therefore consider equation (8), which clearly yields
Z_{k+1} = [ Z_{k−1},  √(−2 Re(α_k)) V_k,  √(−2 Re(α_{k+1})) V_{k+1} ].
By observing that the two trailing complex blocks can be replaced by real blocks without changing the product Z_{k+1} Z_{k+1}^H, we can extract a single real solution factor for a double iteration step, which is given by
Z_{k+1} = [ Z_{k−1},  2√(−Re(α_k)) (Re(V_k) + δ_k Im(V_k)),  2√(−Re(α_k)) √(δ_k² + 1) Im(V_k) ].
Using the fact that W_0 = B ∈ R^{n×m} and considering Z_0 = [ ] ∈ R^{n×0}, we can use induction to show that we always obtain real-valued iterates W_i and Z_i when shift parameters appear in pairs of complex conjugates. In conclusion, the presented cases yield that real shift parameters always result in real residual and solution factors. Additionally, when a pair of complex shift parameters {α_k, α_{k+1} = ᾱ_k} occurs, a double step can be performed in the iteration in order to obtain real-valued low-rank factors.
These results allow us to formulate the final version of the LR-ADI iteration for standard Lyapunov equations, which we summarize in Algorithm 3.
[Algorithm 3: LR-ADI Iteration, Version 3]
Using real low-rank factors has two significant benefits: First, real values require about half as much storage as complex values; second, whenever a pair of complex shift parameters occurs, only one instead of two shifted linear systems has to be solved within the iteration in order to perform two steps.

Low-Rank ADI Iteration for Generalized Lyapunov Equations
As already pointed out, we want to extend Algorithm 3 such that it can be applied to generalized Lyapunov equations
AXE^T + EXA^T + BB^T = 0,   (15)
where E ∈ R^{n×n} is an invertible matrix. This type of equation plays an important role when applying the BT method to generalized LTI systems of the type
E ẋ(t) = Ax(t) + Bu(t),   y(t) = Cx(t) + Du(t).   (16)
We can formulate equation (15) as a standard Lyapunov equation
ÃX + XÃ^T + B̃B̃^T = 0   with   Ã := E^{−1}A,  B̃ := E^{−1}B.
We observe that applying Algorithm 3 to the transformed generalized Lyapunov equation yields the following linear system in Line 3 of the iteration:
V_i = (Ã + α_i I)^{−1} W̃_{i−1} = (A + α_i E)^{−1} E W̃_{i−1}.

[Algorithm 4: LR-ADI Iteration for generalized Lyapunov equations]
Introducing the transformed residual factors W_i := E W̃_i, Line 3 becomes V_i = (A + α_i E)^{−1} W_{i−1}, the residual update turns into W_i = W_{i−1} − 2 Re(α_i) E V_i, and the initial factor satisfies W_0 = E W̃_0 = E B̃ = B. Using the derived expressions, we can formulate Algorithm 4, which is a slightly altered version of Algorithm 3 and can be applied to generalized Lyapunov equations.
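A complete sketch of the generalized LR-ADI iteration with real double steps and the relative residual criterion (12) might look as follows. Dense NumPy solves replace pyMOR's abstract operator applications, the shift list is assumed to contain complex shifts as adjacent conjugate pairs, and the test pencil is an arbitrary construction with exactly known spectrum.

```python
import numpy as np

def lr_adi(A, E, B, shifts, tol=1e-10, maxiter=100):
    """Low-rank factor Z with Z Z^T ~ X solving A X E^T + E X A^T + B B^T = 0."""
    n, m = B.shape
    W = B.astype(float).copy()              # low-rank residual factor, W_0 = B
    Z = np.zeros((n, 0))
    nrm_B = np.linalg.norm(B.T @ B, 2)
    i = 0
    while np.linalg.norm(W.T @ W, 2) > tol * nrm_B and i < maxiter:
        a = complex(shifts[i % len(shifts)])
        V = np.linalg.solve(A + a * E, W)
        if a.imag == 0:
            Z = np.hstack([Z, np.sqrt(-2 * a.real) * V.real])
            W = W - 2 * a.real * (E @ V.real)
            i += 1
        else:
            # Real double step for the pair {a, conj(a)}.
            d = a.real / a.imag
            Vr = V.real + d * V.imag
            Z = np.hstack([Z,
                           2 * np.sqrt(-a.real) * Vr,
                           2 * np.sqrt(-a.real) * np.sqrt(d * d + 1) * V.imag])
            W = W - 4 * a.real * (E @ Vr)
            i += 2
    return Z

# Small test pencil with exactly known complex spectrum (2x2 blocks s +/- i*w).
blocks = [(-1.0, 1.0), (-2.0, 3.0), (-0.5, 0.7), (-3.0, 2.0)]
S = np.zeros((8, 8))
for k, (sig, om) in enumerate(blocks):
    S[2*k:2*k+2, 2*k:2*k+2] = [[sig, om], [-om, sig]]
E = np.diag(np.linspace(1.0, 2.0, 8))
A = E @ S                                   # pencil (A, E) has the spectrum of S
B = np.ones((8, 2))
shifts = []
for sig, om in blocks:
    shifts += [sig + 1j * om, sig - 1j * om]  # adjacent conjugate pairs

Z = lr_adi(A, E, B, shifts)
X = Z @ Z.T
res = np.linalg.norm(A @ X @ E.T + E @ X @ A.T + B @ B.T) / np.linalg.norm(B @ B.T)
```

With the exact eigenvalues as shifts the iteration terminates after a single sweep; in practice the projection-based shifts of Section 3.6 are used instead, and the dense solves are replaced by abstract apply_inverse-style operations.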

Shift Parameters
A fundamental component of the iteration which we have not discussed yet is the choice of the shift parameters {α_1, …, α_s}. We shortly discuss approaches for computing such parameters, focusing on generalized Lyapunov equations. It can be shown [17,24] that superlinear convergence of the LR-ADI iteration can be achieved by using shifts with a negative real part. Furthermore, shift parameters that lead to fast convergence can be obtained by solving the ADI shift parameter problem [27]
{α_1, …, α_s} = arg min_{α_1,…,α_s ∈ C^−}  max_{λ ∈ Λ(A,E)}  ∏_{i=1}^{s} |λ − ᾱ_i| / |λ + α_i|.   (17)
Since this min-max problem requires knowledge about the spectrum Λ(A, E), the computational effort for computing exact solutions of equation (17) is too large in most cases. Nonetheless, several approaches that provide approximate solutions to the ADI shift parameter problem have been established. These are usually based on computing a certain number of Ritz values of E^{−1}A and A^{−1}E, which can then be used to estimate a domain Ω ⊂ C^− such that Λ(A, E) ⊂ Ω. This allows for an approximation of equation (17) [17,24]. Alternatively, the Ritz values are used in heuristic approaches (e.g., in [22]) where (17) is evaluated for several combinations of possible shift parameters.
A major drawback of these approaches is that several relevant parameters have to be specified. For instance, the number of Ritz values that need to be computed in order to obtain high-quality shifts strongly depends on the underlying problem. A more user-friendly approach was introduced in [11], and further variants of the method have been discussed in [17]. The main idea of the approach is to use eigenvalues generated by low-dimensional subspaces as shift parameters, instead of computing Ritz values of large matrices such as E^{−1}A. Despite the lack of a deep theoretical understanding, these types of shift parameters have performed well in a variety of numerical experiments. Accordingly, we chose to implement variants of this approach for the pyMOR project and take a closer look at it in the following paragraphs. Note that we focus solely on variants that use the iterate V_i, which allows for a clearer structure in the subsequent lines as well as in the implementation.
An initial set of shift parameters is provided by
{α_1, …, α_s} = Λ(B̃^T A B̃, B̃^T E B̃) ∩ C^−,
where the columns of B̃ form an orthonormal basis for span{B}. Since we consider B ∈ R^{n×m} with m ≪ n, the orthonormal basis can be computed in an appropriate time frame. Additionally, B̃^T A B̃ and B̃^T E B̃ are small square matrices, which allows for solving the underlying generalized eigenvalue problem in a negligibly small period of time as well. If none of the computed eigenvalues lie in C^−, we use a randomly generated matrix Q and initialize the shifts with
{α_1, …, α_s} = Λ(Q̃^T A Q̃, Q̃^T E Q̃) ∩ C^−,
where the columns of Q̃ form an orthonormal basis for span{Q}. Note that the authors in [11] and [17] suggest to reflect unstable shift parameters across the imaginary axis instead of computing new shifts using the matrix Q. However, the implementation of the LR-ADI iteration in Py-M.E.S.S. is based on the variant which we stated first. In order to present a meaningful comparison of the pyMOR implementation and the Py-M.E.S.S. variant in Section 5, we decided to use the approach based on the projection with the randomly generated matrix Q in pyMOR. After k_1 steps of the iteration, new shifts can be computed by one of the following options using the iterates V_i:
1. Use {α_{k_1+1}, …} = Λ(Ṽ_{k_1}^T A Ṽ_{k_1}, Ṽ_{k_1}^T E Ṽ_{k_1}) ∩ C^−, with the columns of Ṽ_{k_1} as an orthonormal basis for span{Re(V_{k_1}), Im(V_{k_1})}.
2. Use {α_{k_1+1}, …} = Λ(Ṽ_{k_1}(u)^T A Ṽ_{k_1}(u), Ṽ_{k_1}(u)^T E Ṽ_{k_1}(u)) ∩ C^−, with the columns of Ṽ_{k_1}(u) as an orthonormal basis for span{V_{k_1}(u)} and V_{k_1}(u) := [V_{k_1−u+1}, …, V_{k_1}].
If none of these variants yield new shifts, the previous set of shift parameters can be reused. In the case u ≥ k_1, we use V_{k_1}(u) := [V_1, …, V_{k_1}]. Note that in the second approach it is possible to use the last um columns added to the solution factor instead of using stored values of the iterates V_i. In the case that α_{k_1−u−1} = ᾱ_{k_1−u}, it is wise to consider the last u(m + 1) columns instead.
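The initial projection-based shifts translate almost verbatim into NumPy/SciPy. In the following sketch the orthonormal basis is obtained by a QR decomposition (in pyMOR, the Gram-Schmidt routine plays this role), and the stable matrices are arbitrary stand-ins:

```python
import numpy as np
from scipy.linalg import eig, qr

def initial_projection_shifts(A, E, B):
    # Orthonormal basis Bt of span{B}, then the stable eigenvalues of the
    # small projected pencil (Bt^T A Bt, Bt^T E Bt) serve as shifts.
    Bt, _ = qr(B, mode='economic')
    ev = eig(Bt.T @ A @ Bt, Bt.T @ E @ Bt, right=False)
    return ev[ev.real < 0]

rng = np.random.default_rng(3)
n, m = 50, 4
A = -2.0 * np.eye(n) + 0.1 * rng.standard_normal((n, n))   # stable stand-in
E = np.eye(n) + 0.05 * rng.standard_normal((n, n))         # invertible stand-in
B = rng.standard_normal((n, m))

shifts = initial_projection_shifts(A, E, B)
```

During the iteration the same projection is applied with V_{k_1} (or the last few iterates) in place of B to refresh the shift set, which is what makes the approach essentially parameter-free.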

Software and Implementation
Using the rather theoretical results summarized in the previous sections, we discuss the implementation of the LR-ADI iteration in the following paragraphs. In the course of this, we introduce the pyMOR framework and briefly consider practical aspects of the implementation.

pyMOR
The software library pyMOR provides a framework for model order reduction applications in the programming language Python. Amongst others, reduced basis methods and system-theoretic model order reduction approaches such as balanced truncation have been implemented. One of the central design goals of pyMOR is the realization of model order reduction algorithms such that external libraries for solving partial differential equations (PDE) can be easily integrated into the library. In connection to that, the implementations should be formulated in a generic way such that the algorithms do not have to be reimplemented for each individual PDE solver.
In order to achieve these goals, all implementations in pyMOR follow a strict design paradigm, which enforces the usage of abstract interfaces for any high-dimensional objects (e.g., operators, vectors, discretizations, …) that occur in the process of model order reduction. In the currently discussed context, interfaces are realized via abstract base classes (ABCs), and pyMOR provides bindings to several external PDE solvers. Accordingly, there exist implementations, so-called specializations of the underlying ABCs, that specify the behavior of Operator and VectorArray objects for each of these solvers. Most importantly, algorithms that are implemented using the previously mentioned interface classes can afterward be executed with any suitable specialization of the underlying ABC. Thus, model order reduction algorithms do not have to be implemented for every single external PDE solver, but only once using the interface classes for operators and vectors. In the following, we refer to these implementations as abstract implementations. Furthermore, support for new PDE solvers can be easily added by specifying a few specializations for operators and vectors. Another benefit of the presented approach revolves around the parallelism of pyMOR's implementations. Since the specializations for the high-dimensional objects from the PDE libraries are based on interfaces provided by these libraries themselves, the responsibility for a parallel implementation is transferred to the external library. Additionally, there exist operators in pyMOR which easily enable parallel distributed computations with pyMOR.

Low-Rank ADI Iteration in pyMOR
In order to respect the pyMOR design paradigm introduced in the previous subsection, implementations for the library have to preserve the high level of abstraction. For the LR-ADI iteration, two abstract pyMOR components play an important role. One of them is the OperatorInterface with which the matrices A and E from Algorithm 4 can be implemented. In order to deal with the right-hand side B , the iterates V i and W i , as well as the solution factor Z i , the VectorArrayInterface can be used.
An algorithm that has already been integrated into the pyMOR library is the Gram-Schmidt orthonormalization method. The algorithm can be applied to specializations of the VectorArrayInterface and plays an important role when generating shift parameters, as suggested in Section 3.6. Another question which arises in the process of implementing the LR-ADI iteration is how to deal with standard and generalized Lyapunov equations, respectively. Using the IdentityOperator provided by pyMOR, standard Lyapunov equations can be solved with Algorithm 4 by simply replacing the matrix E with the identity. When the IdentityOperator is applied to an object, the object is returned without further computations. Besides all of the high-dimensional operations, there exist a few quantities within the algorithm which cannot be computed using abstract implementations available in pyMOR. Examples are the eigenvalues, which are crucial in the process of generating shift parameters. Additionally, the spectral norm needs to be computed in order to evaluate the stopping criterion of the iteration. Since these computations involve only small matrices (e.g., B̃^T A B̃, B̃^T E B̃, W_i^T W_i ∈ R^{m×m} with m ≪ n), non-abstract implementations can be used without violating the design standards. For the mentioned problems, the popular libraries NumPy and SciPy with the suitable functions numpy.linalg.norm and scipy.linalg.eigvals can be used.

C-M.E.S.S. and Py-M.E.S.S.
M.E.S.S. [8] is the successor of the Lyapack toolbox for MATLAB, which was developed to solve large, sparse matrix equations. C-M.E.S.S. is a version of the M.E.S.S. library written in the programming language C, which in particular provides an implementation of the LR-ADI iteration. A beneficial property of Python is that C code can easily be executed from Python scripts. Py-M.E.S.S. takes advantage of this property and represents a link between Python and the C-M.E.S.S. implementation. Accordingly, the LR-ADI iteration can either be realized using a pure pyMOR implementation or by accessing the C-M.E.S.S. library via Py-M.E.S.S. The latter approach has already been integrated into the pyMOR library. In order to be consistent with the pyMOR design, C-M.E.S.S. is forced to use the functions for multiplying matrices and solving linear systems which are provided by the operators defined in the pyMOR library. This approach requires transferring matrices (i.e., VectorArrays) between Py-M.E.S.S. and pyMOR in each step of the iteration, which potentially becomes expensive for larger dimensions. Accordingly, we discuss a few numerical experiments which compare the run time of a pure pyMOR approach with the abstract Py-M.E.S.S. implementation.

Numerical Experiments
Since the performance of pyMOR and C-M.E.S.S. greatly depends on implementations of external software components like BLAS and LAPACK, we used the same versions of the relevant software for all numerical tests; these versions are listed in Table 1. Additionally, hardware components are just as relevant for the outcome of the experiments as the software versions. Accordingly, the hardware used is summarized in Table 2.
The experiments introduced in the following subsections were executed five times, where we consistently considered the shortest run time for the comparisons. The iteration terminated once the relative Lyapunov residual specified in equation (12) was lower than 10 −10 .
Note that both approaches discussed in the subsequent paragraphs require the same number of iteration steps and use shift parameters generated by projections with the iterates V_i, as described in Section 3.6. Additionally, we focus on the equations introduced in (4) and (5).

Standard Lyapunov Equations
For the first experiment, we consider the heat equation on a one-dimensional segment [5],
∂x/∂t (ζ, t) = ∂²x/∂ζ² (ζ, t),
with the output ŷ(t) and the input u(t) entering through the boundary conditions. Applying the finite differences method using central differences yields the LTI system
ẋ(t) = Ax(t) + Bu(t),   ŷ(t) = Cx(t),
defined by matrices A ∈ R^{n×n}, B ∈ R^{n×1} and C ∈ R^{1×n}, which we use to test the implementations of the LR-ADI iteration. In the following, the values which define the size of the system's matrices lie between n = 2 000 and n = 300 000.
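A minimal finite-difference discretization of such a 1D heat equation can be generated as follows; the placement of the input on the left boundary and the averaged output are plausible assumptions of this sketch, and the precise boundary conditions of the benchmark in [5] may differ.

```python
import numpy as np

def heat_fd(n):
    # Central differences on (0, 1) with n interior grid points.
    h = 1.0 / (n + 1)
    A = (np.diag(-2.0 * np.ones(n))
         + np.diag(np.ones(n - 1), 1)
         + np.diag(np.ones(n - 1), -1)) / h**2
    B = np.zeros((n, 1))
    B[0, 0] = 1.0 / h**2          # input enters through the left boundary value
    C = np.full((1, n), h)        # output: approximate integral of the state
    return A, B, C

A, B, C = heat_fd(50)
```

The resulting A is the familiar tridiagonal (1, −2, 1) stencil; it is symmetric negative definite, so the stability assumptions of Section 3 are satisfied, and for the large n used in the experiments it would be stored as a sparse matrix.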
In Figure 1(a), we see that the run times for both approaches behave very similarly. Overall, the interface to the Py-M.E.S.S. implementation is superior in this experiment. The largest time difference occurs at n = 50 000 where Py-M.E.S.S. is 1.24 seconds faster than the pure pyMOR approach. Only for n = 140 000, we can observe a better performance of the new implementation with an insignificant benefit of 0.08 seconds. Looking at the relative run time presented in Figure 1(b), we observe that Py-M.E.S.S. is at most 61% faster than the pyMOR implementation. After this peak at n = 30 000, the relative run time continuously decreases.
The implementation based on Py-M.E.S.S. requires copying the iterate V_i and the residual factor W_i in each step of the iteration. Additionally, the solution factor Z_i is copied once after the algorithm terminates. Since the matrix B, and thus V_i and W_i, consist of a single column in this experiment, the amount of data that needs to be copied is relatively low. This serves as an explanation for the good performance of the Py-M.E.S.S. approach. Furthermore, an observation worth noting is that the relative run time clearly approaches 1 for large problem dimensions n. We can explain this behavior by considering the cache size mentioned in Table 2: for large n, either implementation's performance is limited by the amount of available memory, and thus, similar run times are achieved by both approaches for large problem dimensions. Another fact that contributes to this observation is that the pure pyMOR approach does not require copies, in contrast to the Py-M.E.S.S. variant; this drawback of the latter has a growing impact on the performance as the problem dimension increases. Nonetheless, the question remains as to why the implementation based on Py-M.E.S.S. performs better in the presented experiment. The most relevant aspect here revolves around parallelism: the Py-M.E.S.S. variant uses a single thread throughout the entire computation, whereas the pyMOR implementation operates with multiple threads. Especially for small problem dimensions, the initialization of multiple threads creates a relatively large overhead, which outweighs the performance gained through parallelization in this experiment. For larger problems, the iteration does not significantly benefit from parallelization either, due to the previously mentioned fact that the performance is limited by the available memory. Together, these effects explain the resulting run times.

Symmetric Generalized Lyapunov Equations
The numerical experiments in this subsection are based on the Steel Profile benchmark from [7,21]. In this benchmark, a generalized LTI system of the type (16) occurs in the process of modeling the optimal cooling of steel profiles. For the system matrices it holds E = E^T, A = A^T ∈ R^{n×n} and B ∈ R^{n×7}. Note that this benchmark is only available for a few different values of n. Accordingly, we present all of the resulting run times in Table 3.
Looking at the results in Table 3, the influence of the data that has to be copied by the Py-M.E.S.S. variant becomes visible again: in each step of the iteration, a fixed number of columns is added to the solution factor Z_i, which also needs to be copied once the algorithm terminates.

Unsymmetric Generalized Lyapunov Equations
In this subsection, an experiment based on the oscillator model with ñ ∈ N masses and three dampers introduced in [26] is discussed. In this model, a second-order LTI system of the type
M x̃″(t) + D x̃′(t) + K x̃(t) = F ũ(t),   ỹ(t) = C_1 x̃(t) + C_2 x̃′(t),
with M, D, K ∈ R^{ñ×ñ}, F ∈ R^{ñ×m̃} and C_1, C_2 ∈ R^{p̃×ñ} occurs, which can be written as an equivalent generalized first-order LTI system of order n = 2ñ as described in equation (16). In our case, the matrices defining the Lyapunov equation are given by the companion form
E = diag(I, M),   A = [0, I; −K, −D],   B = [0; F].
Other than stated in [26], the matrix D is defined as D = 0.02M + 0.5K ∈ R^{ñ×ñ} in our experiments, and F is built from the vector 1 ∈ R^{(ñ−1)/3}, which allows for a more convenient implementation of the model. Note that every entry of 1 has the value 1 and it holds (ñ−1)/3 ∈ N due to the special structure of the system. Furthermore, we used values between n = 1 502 and n = 300 002 for the tests.
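One common first-order realization of such a second-order system stacks positions and velocities; the sketch below uses this companion form with simplistic diagonal SPD stand-ins for the mass and stiffness matrices ([26] builds M, K and F differently), keeping only the damping definition D = 0.02M + 0.5K from the experiments.

```python
import numpy as np

rng = np.random.default_rng(4)
nt = 10                                    # number of masses (n-tilde)
M = np.diag(1.0 + rng.random(nt))          # simplistic SPD stand-ins for the
K = np.diag(2.0 + rng.random(nt))          # mass and stiffness matrices
D = 0.02 * M + 0.5 * K                     # damping as in the experiments
F = np.ones((nt, 1))

I, O = np.eye(nt), np.zeros((nt, nt))
E = np.block([[I, O], [O, M]])             # E = diag(I, M)
A = np.block([[O, I], [-K, -D]])           # A = [0 I; -K -D]
B = np.vstack([np.zeros((nt, 1)), F])      # B = [0; F]

pencil_eigs = np.linalg.eigvals(np.linalg.solve(E, A))
```

Since M, D and K are symmetric positive definite, every mode of the pencil (A, E) lies in the open left half-plane, so the generalized Lyapunov equation (15) built from E, A and B has a unique positive semidefinite solution.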
Analogous to Section 5.1, the run times for both variants behave similarly, with the significant difference that in this experiment the pure pyMOR implementation is the faster one. Overall, the resulting run times can be explained once again by considering the number of copies that the approach based on Py-M.E.S.S. needs to perform. For one, the matrices V_i and W_i have more columns in the currently discussed case compared to the experiment performed in Section 5.1. Additionally, significantly more iteration steps were necessary in this experiment in order to achieve convergence of the LR-ADI iteration. Since copies are performed in each step of the algorithm, and the solution factor Z_i, which needs to be copied as well, becomes larger in every iteration step, the Py-M.E.S.S.-based implementation clearly is at a disadvantage in this setting.

Conclusion
The LR-ADI iteration is an efficient approach to approximate the solution of Lyapunov equations, which benefits from low-rank formulations for the solution and the residual. Additionally, the method benefits from real formulations for the low-rank factors via double iteration steps. Regarding the pyMOR library, an implementation based on Py-M.E.S.S. only performs well for Lyapunov equations where the matrix B has few columns and the number of iteration steps needed for convergence is rather low. Additionally, there are a few more options available in Py-M.E.S.S. which enable some fine-tuning of the LR-ADI iteration for certain problems. Another type of matrix equation which occurs in model order reduction and plays an important role in linear-quadratic Gaussian balanced truncation [16] is the algebraic Riccati equation
A^T X + XA − XBB^T X + C^T C = 0.
Since pyMOR relies on Py-M.E.S.S. for solving these equations, an algorithm that approximates low-rank solutions for these, using the interface-based operations available in pyMOR, could be integrated in future development.