Basics of Geometric Multigrid¶
Introduction¶
In the following, we are describing the geometric multigrid method, which for certain problems yields an iterative solver with optimal cost complexity, i.e. the solver returns a solution to a PDE in \(O(n_{\text{DoFs}})\) arithmetic operations. We will show that this can also be achieved for some reaction-diffusion equations on uniformly refined triangular meshes, when discretizing with linear finite elements. The Python code for this project can be found at https://github.com/mathmerizing/MultigridPython.
Problem setup¶
Let \(\Omega := (-1,1)^2 \setminus (0,1)^2\) be an L-shaped domain. We decompose the boundary of this domain \(\partial\Omega\) into the Dirichlet boundary \(\Gamma_D := (0,1) \times \{0\} \cup \{0\} \times (0,1)\) and the Neumann boundary \(\Gamma_N := \partial\Omega \setminus \Gamma_D\).
Next, we define the ansatz and test function space \(V := \left\{ u \in H^1(\Omega)\mid u = 0 \text{ on } \Gamma_D \right\}\), where
is the Sobolev space containing the weakly differentiable functions in \(\Omega\). Note that we haven’t explicitly prescribed any boundary conditions on \(\Gamma_N\) in the function space, since the homogeneous Neumann boundary conditions come up naturally when deriving the variational form of the problem. Further, we need a right hand side function
Figure 1: Domain \(\Omega\)
Using the parameters \(a = 1\) and \(c = 0\) in \(\Omega\), we can now formulate the strong form of our reaction-diffusion equation:
Strong form
Find \(u: \Omega \rightarrow \mathbb{R}\) such that
In the above formulation, we used the notation \(\partial n := \nabla \cdot n\). To be able to solve this problem, we need to convert it into its integral form. Therefore we multiply (1) from the right with a test function \(v \in V\) and integrate over \(\Omega\).
Now integration of parts can be applied to the first integral and we use the fact that \(\partial \Omega = \Gamma_D\ \dot\cup\ \Gamma_N\).
Note that the integrals over the boundaries \(\Gamma_D\) and \(\Gamma_N\) vanish, since \(\partial_n u = 0\) on \(\Gamma_N\) and, due to \(v \in V\), \(u = 0\) on \(\Gamma_D\). Thus we have derived the following integral problem of our problem, which is often referred to as the weak or variational form in the literature.
Weak form
Find \(u \in V\) such that
where \(a: V \times V \rightarrow \mathbb{R}\) is the bilinear form defined as
and the right hand side \(l: V \rightarrow \mathbb{R}\) is a linear form defined as
Furthermore using the fundamental lemma of calculus of variations, it can be shown that the strong and the weak form are equivalent. Hence it suffices to solve the weak form of the problem.
Finite Element Method¶
The problem that we are facing is that \(V\) is an infinite dimensional function space and we need the ability to solve the weak form with a classical computer. Hence instead we work with a finite dimensional subspace \(V_h \subset V\). This will enable us to rewrite the weak form as a linear equation system.
Let a subdivision of \(\Omega\) into finite elements \((K,P_1,\Sigma)\) be given, where
\(K\) is a two dimensional right triangle,
\(P_1(K) := \operatorname{span}\{1-x_1-x_2, x_1, x_2 \}\) is the space of linear functions defined on \(K\),
\(\Sigma := \{a_0, a_1, a_2 \}\) is a set of degrees of freedom (DoF), which here are the values of the polynomial at the vertices of \(K\).
Then a \(P_1(K)\) function is defined by
To recapitulate: First, we have divided \(\Omega\) into triangles \(K_1, ..., K_m\). Examples for this can be found in the section “Grid Setup”. Secondly, we have seen that we have the parameters (DoFs) which can describe any linear function on such a triangle \(K_k\). Now simply define our function space \(V_h\) as the space of functions which are continuous on the whole domain \(\Omega\), linear on each triangle \(K_k\) and satisfy the Dirichlet boundary conditions, i.e.
We use the index \(h\) to show that we are not longer using the infinite dimensional function space \(V\), but a finite dimensional subspace which is defined on triangles \(K_k\) where the short sides have length \(h\). By working with \(V_h\), we now try to find an element-wise linear approximation to the solution of the weak form.
Thus we are now trying to solve the discrete weak form:
Discrete weak form
Find \(u_h \in V_h\) such that
Furthermore, we know that \(V_h\) is finite dimensional and we can write down its basis, since we know the bases of \(P_1(K_k)\). Hence
where \(\phi_i\) is the basis function corresponding to the i.th degree of freedom, i.e. the i.th grid point. It follows that
for some \(\begin{pmatrix}u_1, \dots u_{n_{DoFs}}\end{pmatrix}^T, \begin{pmatrix}v_1, \dots v_{n_{DoFs}}\end{pmatrix}^T \in \mathbb{R}^{n_{DoFs}}\). Therefore the discrete weak form can be written as
Since \(a\) is linear in the second argument and \(l\) is also linear, it is thus sufficient to solve
The reaction-diffusion problem is linear itself, thus \(a\) is also linear in the first argument and we get
This can also be written as a linear equation system
To remain consistent with future chapters, we follow the naming convention
To be able to solve \(A_h x_h = b_h\), we need an efficient way to compute \(a\left(\phi_i,\phi_j \right)\) and \(l\left(\phi_j \right)\). For that we use that \(\Omega = \cup_{k = 1}^{n_{DoFs}}K_k\) and we thus get
similarly we get for the right hand functional
Note that many of these integrals are zero, since the basis functions \(\phi_i\) only have support on the triangles that contain the vertex corresponding to the i.th degree of freedom. Furthermore, we use the isoparametric concept that allows us to assemble the system matrix \(A_h\) and right side \(b_h\) by once computing integrals on a reference element \(\hat{K}\) and then transforming the results to the elements \(K_k\).
Figure 2: Transformation from reference element to finite element
We now apply the transformation theorem
to all integrals that need to be evaluated in the discrete weak form.
Then the algorithm for the assembly is given by
where \(\hat{\phi}_i = \phi_i \circ T_k\) are the basis functions on the reference element.
At the end of the assembly, we need to account for the Dirichlet boundary constraints, e.g. let a constraint \(u_{\tau} = \xi\) be given. Then we would need to make sure that \((A_h)_{\tau,j} = \delta_{\tau,j}\) for all \(1 \leq j \leq m\). Here \(\delta_{\tau,j}\) denotes the Kronecker delta, which is defined as
Futhermore, we would need to set \((b_h)_{\tau} = \xi\). Obviously these steps ensure that when solving the linear system \(A_h x_h = b_h\), we get \(u_{\tau} = \xi\). In our model problem, we only have homogeneous Dirichlet constraints. Thus, we only need to find all indices \(\tau\) at the Dirichlet boundary \(\Gamma_D\) and apply the procedure from above with \(\xi = 0\).
Overall, the Finite Element Method enabled us to transform a discrete form of the reaction-diffusion equation on a given grid into a linear equation system. In the following, we will investigate how such a linear equation system can be solved iteratively.
Iterative Methods¶
We want to construct an iteration, where each iterate \(x_k^{k}\) is a better approximation to the linear equation system \(A_h x_h = b_h\). To measure the quality of our solution, we monitor the defect
and try to minimize it. One can try to formulate a fixed point scheme \(x_h^{k+1} = g\left(x_h^{k}\right)\) to solve the system of equations. The goal of the fixed point scheme is to find some input \(x\) such that \(g(x) = x\). In our case, we want the exact solution \(x_h\) to be a fixed point of our iteration. We observe that the defect of the exact solution is zero. Thus one might try to increment the old iterate \(x_h\) by some multiple of the defect. This is called the Richardson method. However, the Richardson method is rarely used in practice. Instead we will work with the more general fixed point scheme
where \(C \in \mathbb{R}^{n_{DoFs} \times n_{DoFs}}\). Here the type of method depends on the matrix \(C\), e.g. \(C = I\) is the Richardson method. In our code, we implemented
\(C = D\) which is the \(\omega\)-Jacobi method,
\(C = \left( D + L \right)\) which is the Forward Gauss-Seidel method for \(\omega = 1\),
\(C = \left( D + U \right)\) which is the Backward Gauss-Seidel method for \(\omega = 1\).
In these definitions, we used the decomposition \(A_h = L + D + U\), where \(L\) has only nonzero entries below the diagonal (strictly lower triangular matrix), \(D\) has only nonzero entries on the diagonal (diagonal matrix) and \(U\) has only nonzero entries above the diagonal (strictly upper triangular matrix).
Note that we need to compute the inverse matrix \(C^{-1}\). This can be easily done for \(\omega\)-Jacobi, since we just invert the diagonal. It wouldn’t be efficient to invert \(\left( D + L \right)\) or \(\left( D + U \right)\) directly. Hence we use the formula
for Forward Gauss-Seidel and the formula
for Backward Gauss-Seidel. In these formulas, we used the notation \(a_{ij} := (A_h)_{ij}\), \(b_{i} := (b_h)_{i}\) and \(x_i^{k} := (x_h^{k})_i\).
Which of these iterative methods should be used in numerical computations? It depends! \(\omega\)-Jacobi has the benefit of being fast, since it can be parallelized. Nevertheless, it needs more iterations to converge than Gauss-Seidel and one needs to choose a good value for \(\omega\) before computation. Although the Gauss-Seidel methods converge in less iterations, they need longer for the computation, since the for loops need to be executed sequentially.
In the next few sections, we will show how \(\omega\)-Jacobi and Gauss-Seidel can be used in the multigrid method, resulting in a fast solver for the linear equation system derived from a discrete weak form of the reaction-diffusion equation.
Grid Setup¶
To transform the weak form of our problem into a linear equation system,
we first need to discretize our domain. For that purpose, we create an initial triangulation of the domain,
i.e. we divide \(\Omega\) into a set of triangles. We call this triangulation the coarse grid and denote it as \(\mathbb{T}_0\).
The grid consists of objects of type Node
, Edge
and Triangle
.
Figure 3: Coarse grid (\(\mathbb{T}_0\))
For the multigrid method, we need a sequence of such grids. In this work, we restrict our analysis to uniformly refined meshes. How can we create these refined meshes? We have to loop over all triangles of the grid and then refine them.
Figure 4: Refining a triangle
To refine a triangle one simply needs to bisect all of its edges and draw a new triangle out of these three new nodes.
As shown in figure 4, through the refinement process a triangle is being divided
into four smaller triangles. Each Node
object needs to know its parent nodes.
The parents are the two end nodes of the edge that has been bisected, e.g. node 1 and node 2 are the parents of node 4.
In the literature [1] these relationships are being stored in a father-son list.
This is not needed in our case, due to Object Oriented Programming (OOP).
Having refined all triangles of the coarse grid, we get a new triangulation \(\mathbb{T}_1\), which is called the grid on level 1. The level of a grid indicates how often we need to (globally) refine the coarse grid to construct that grid.
Figure 5: Grid on level 1 (\(\mathbb{T}_1\))
We continue the process of refining the grid, until we end up with a grid, which has enough nodes to ensure that a sufficiently good approximation to the exact solution can be computed.
Figure 6: Grid on level 2 (\(\mathbb{T}_2\))
The grid on the highest level, in this case \(\mathbb{T}_2\) or more generally \(\mathbb{T}_L\), is called the finest grid and will be used to assemble the system matrix.
Using the Finite Element Method, we can discretize the weak form of our PDE on each level grid with linear finite elements. For each level \(0 \leq l \leq L\), we get a linear equation system
where \(n_l\) is the number of degrees of freedom (DoFs), which in our case corresponds to the number of nodes in the grid. Note that the discrete function spaces \(\left( V_l \right)_{l=0}^L\) from the FEM are conforming finite element spaces, i.e. \(V_0 \subset V_1 \subset \cdots \subset V_L\). If this wasn’t the case, the grid transfer operations, which will be introduced shortly, would need to be modified.
Two-grid algorithm¶
To understand the multigrid algorithm we start by looking at the case where we only have two grids \(\mathbb{T}_{l}\) and \(\mathbb{T}_{l+1}\). The mulitgrid algorithm is then only a recursive application of the two grid version.
Two-grid algorithm
Let \(A_h x_h = b_h\) and \(A_{2h} x_{2h} = b_{2h}\) with \(A_h \in \mathbb{R}^{n \times n}\), \(A_{2h} \in \mathbb{R}^{m \times m}\) and \(m < n\) denote the linear equation systems from the grids \(\mathbb{T}_{l+1}\) and \(\mathbb{T}_{l}\). Let the k-th iterate \(x_h^k\) on the finer grid be given.
Hint
In most cases we want the two-grid method to be a symmetric iteration. Therefore we need \(\nu := \nu_1 = \nu_2\) and \(S := S_1 = S_2^\ast\) [4], e.g. choose \(S_1\) as forward Gauss-Seidel and \(S_2\) as backward Gauss-Seidel. Alternatively we have also implemented the \(\omega\)-Jacobi method which can be used for pre- and post-smoothing. Furthermore \(A_{2h}^{-1}d_{2h}\) is not feasible to compute with a direct solver if \(A_{2h}\) is too large, which is often the case. Thus \(A_{2h}^{-1}d_{2h}\) can be understood as solving the linear equation system and can be done for example by another two-grid method. This recursion then produces the multigrid algorithm.
Multigrid algorithm¶
Multigrid algorithm
Let \(A_L x_L = b_L\) denote the problem on the finest grid and \(A_l x_l = b_l\) the problems on the coarser grids for \(0 \leq l \leq L-1\). Let \(\nu\) denote the number of pre- and post-smoothing steps. Let the k-th iterate \(x_l^k\) on the l-th level be given.
The parameter \(\mu \in \mathbb{N}^+\) determines the cycle of the multigrid iteration. For \(\mu = 1\) we get the V-cycle
Figure 7: V-cycle
and for \(\mu = 2\) we get the W-cycle.
Figure 8: W-cycle
In the figures of these schemes, white circles stand for \(\nu\) steps of an iterative solver, black circles represent a direct solver, blue arrows illustrate a restriction and green arrows illustrate a prolongation.
Grid transfer¶
As we have seen in the previous sections, the multigrid algorithm requires the ability to prolongate vectors from \(\mathbb{R}^{n_{l}}\) to \(\mathbb{R}^{n_{l-1}}\). We will only show how the grid transfer operations work for conforming finite elements. For information on how to deal with non-conforming finite element spaces, please refer to [3]. Let \(\left\{\varphi_1^l, \dots, \varphi_{n_{l}}^l\right\}\) and \(\left\{\varphi_1^{l-1}, \dots, \varphi_{n_{l-1}}^{l-1}\right\}\) be some given bases of \(V^l\) and \(V^{l-1}\). Due to the conformity of the finite element spaces, \(V^{l-1} \subset V^l\) holds and there exists a matrix \(I^{l-1}_l \in \mathbb{R}^{n_{l-1} \times n_l}\) such that
The matrix \(I^{l-1}_l\) is called the restriction matrix and its transpose \(I^{l}_{l-1} = \left(I^{l-1}_l\right)^T\) is called the prolongation matrix. These matrices are dependent on the finite elements that are being used and on the way that the grids have been refined. They have only very few non-zero entries and thus are stored as sparse matrices. Furthermore, they also have a significant impact on the rate of convergence of the multigrid algorithm [3]]. Additionally, for linear partial differential equations the identity
is fulfilled. Before taking a look at the actual implementation in our code, it is helpful to a see how the the grid transfer works for one dimensional linear finite elements.
Grid transfer in 1D
Figure 9: Basis functions on the first two levels
It holds
Consequently the restriction matrix reads
We decided to follow these rules to create the interpolation matrix \(I_{l-1}^l\):
if the i.th node already exists on level \({l-1}\), then \(\left(I_{l-1}^l\right)_{i,i} = 1\)
else get the indices of the parents of the i.th node, then \(\left(I_{l-1}^l\right)_{i,\text{parent}_1} = \frac{1}{2}\) and \(\left(I_{l-1}^l\right)_{i,\text{parent}_2} = \frac{1}{2}\)
The rest of this matrix is filled with zeros. Doing this for \({l=1}\) yields the prolongation matrix
For our kind of prolongation matrix, the restriction matrix is given by
At this point, we should also mention how the boundary conditions can be applied in the multigrid algorithm. The start vector of this method should be the zero vector on which the Dirichlet boundary conditions should be applied. Note that Dirichlet boundary conditions need to be applied to the output vectors of the prolongation and restriction, but here all Dirichlet boundary conditions are set to be homogeneous.
References¶
Sven Beuchler. Lecture notes in ‘Multigrid and domain decomposition.’ April 2020.
Thomas Wick. Numerical Methods for Partial Differential Equations. 2020. URL: https://doi.org/10.15488/9248.
Dietrich Braess. Finite Elemente. Springer Berlin Heidelberg, 2013. DOI: 10.1007/978-3-642-34797-9. URL: https://doi.org/10.1007%2F978-3-642-34797-9.
Chao Chen. “Geometric multigrid for eddy current problems”. PhD thesis. 2012.
Julian Roth. “Geometric Multigrid Methods for Maxwell’s Equations”. Bachelor thesis. 2020.
Thomas Richter and Thomas Wick. Einführung in die numerische Mathematik - Begriffe, Konzepte und zahlreiche Anwendungsbeispiele. Springer, 2017.