MUQ  0.4.3
FlowEquation.cpp
1 /***
2 
3 ## Overview
4 The algorithms in MUQ leverage models that are defined through the `ModPiece` and `WorkGraph` classes. This example demonstrates how to implement a model with gradient and Hessian information as a child of the `ModPiece` class. A finite element implementation of the groundwater flow equation is employed with adjoint techniques for computing gradients and Hessian actions.
5 
6 One dimensional steady state groundwater flow in a confined aquifer can be modeled using the following PDE
7 $$
8 \begin{aligned}
9 -\frac{\partial}{\partial x}\left[ k(x) \frac{\partial h}{\partial x}\right] &= f(x)\\
10 h(0) &= 0\\
11 \left. \frac{\partial h}{\partial x}\right|_{x=1} &= 0,
12 \end{aligned}
13 $$
14 where $h(x)$ is the pressure (more generally the hydraulic head), $k(x)$ is a spatially distributed parameter called the hydraulic conductivity, and $f(x)$ is a source term. For simplicity, we assume $x\in\Omega$, where $\Omega=[0,1]$. While we will interpret this PDE through the lens of groundwater flow, the same equation arises in many disciplines, including heat conduction and electrostatics.
15 
16 <img src="DarcyFlow.png" style="float:right"></img>
17 
18 Given a particular conductivity field $k(x)$, the PDE above can be used to compute the pressure $h(x)$. We will leverage this relationship to construct a `ModPiece` that takes a vector defining $k(x)$ as input and returns a vector defining $h(x)$.
19 
20 At the end of this example, the finite element flow equation model is combined with a simple Gaussian density to imitate a Bayesian likelihood function. The Hessian of this likelihood function is studied by computing the Hessian's spectrum using MUQ's stochastic eigenvalue solver.
21 
22 
23 ## Finite Element Formulation
24 Here we derive the finite element discretization as well as the adjoint equations for gradient and Hessian computations. *This section focuses exclusively on the details of the finite element discretization. Nothing MUQ-specific is discussed until the `Implementation` section below.* Readers familiar with finite element methods can skip ahead to that section with the knowledge that linear basis functions are used for representing $h(x)$, piecewise constant basis functions are used for $k(x)$, and all integrals are computed analytically.
25 
26 #### Forward Model Discretization
27 To discretize the flow model PDE with a finite element method, we first need to consider the PDE in weak form. Let $\mathcal{H}$ denote a Hilbert space of functions on $\Omega$ that satisfy the Dirichlet boundary condition $h(0)=0$. For any function $p\in\mathcal{H}$, we have from the PDE that
28 $$
29 \int_\Omega p(x) \left[\frac{\partial}{\partial x}\left[ k(x) \frac{\partial h}{\partial x}\right] + f(x) \right] dx = 0
30 $$
31 Using integration by parts and applying the boundary conditions, we obtain the weak form of the flow equation, which is given by
32 $$
33 \int_\Omega k \frac{\partial p}{\partial x} \frac{\partial h}{\partial x} dx - \int_\Omega f p dx= 0.
34 $$
35 Our goal is to find the function $h$ such that this expression is satisfied for all functions $p$. For this to be computationally feasible, we first need to represent the functions $h(x)$ and $p(x)$ through a finite number of basis functions. Using a standard Galerkin discretization, we represent both $h$ and $p$ with the same set of basis functions $\phi^1_i(x)$:
36 $$
37 \begin{aligned}
38 h(x) & = \sum_{i=1}^{N+1} h_i \phi^1_i(x)\\
39 p(x) & = \sum_{i=1}^{N+1} p_i \phi^1_i(x),
40 \end{aligned}
41 $$
42 where $h_i,p_i\in \mathbb{R}$ are weights on the basis function $\phi^1_i(x)$. Here, we consider linear basis functions (also known as "hat" functions) given by
43 $$
44 \phi^1_i(x) = \left\{ \begin{array}{ll} \frac{x-x_{i-1}}{x_i-x_{i-1}} & x\in[x_{i-1},x_i]\\ \frac{x_{i+1}-x}{x_{i+1}-x_i} & x\in(x_i,x_{i+1}] \\ 0 & \text{otherwise} \end{array} \right.,
45 $$
46 where $[x_1,x_2,\ldots,x_{N+1}]$ is an increasing sequence of node locations splitting the domain $\Omega$ into $N$ cells. We will further assume that the node spacing is uniform, so that $x_{i+1}-x_i = \delta x$ for all $i$.
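Because these basis functions are piecewise linear, their derivatives are piecewise constant, a fact used repeatedly in the expressions below:
$$
\frac{\partial \phi^1_i}{\partial x} = \left\{ \begin{array}{rl} \frac{1}{\delta x} & x\in(x_{i-1},x_i)\\ -\frac{1}{\delta x} & x\in(x_i,x_{i+1}) \\ 0 & \text{otherwise.} \end{array} \right.
$$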
47 
48 A similar approach is used to discretize the conductivity $k(x)$ and source term $f(x)$. In particular, we let
49 $$
50 \begin{aligned}
51 k(x) &= \sum_{i=1}^N k_i \phi^0_i(x)\\
52 f(x) &= \sum_{i=1}^N f_i \phi^0_i(x),
53 \end{aligned}
54 $$
55 but consider piecewise constant basis functions
56 $$
57 \phi^0_i(x) = \left\{\begin{array}{ll} 1 & x\in[x_{i},x_{i+1})\\ 0 & \text{otherwise} \end{array}\right.
58 $$
59 so that $k(x)$ and $f(x)$ are piecewise constant. With these discretizations in hand, the weak form becomes
60 $$
61 \begin{aligned}
62 \int_\Omega k \frac{\partial p}{\partial x} \frac{\partial h}{\partial x} dx - \int_\Omega f p dx & = \int_\Omega \left(\sum_{n=1}^N k_n \phi_n^0(x) \right) \left(\sum_{i=1}^{N+1}p_i \frac{\partial \phi^1_i}{\partial x}\right) \left(\sum_{j=1}^{N+1}h_j \frac{\partial \phi^1_j}{\partial x}\right) dx - \int_\Omega \left(\sum_{n=1}^N f_n \phi_n^0(x) \right) \left(\sum_{i=1}^{N+1}p_i \phi^1_i \right) dx.
63 \end{aligned}
64 $$
65 For this quantity to be zero for all functions $p\in\text{span}(\{\phi_1^1,\ldots,\phi_{N+1}^1\})$, we require that
66 $$
67 \begin{aligned}
68 & \int_\Omega \frac{\partial \phi^1_i}{\partial x} \left(\sum_{n=1}^N k_n \phi_n^0(x) \right) \left(\sum_{j=1}^{N+1}h_j \frac{\partial \phi^1_j}{\partial x}\right) dx - \int_\Omega \phi_i^1 \left(\sum_{n=1}^N f_n \phi_n^0(x) \right) dx &= 0\\
69 \Rightarrow & \sum_{j=1}^{N+1} h_j \sum_{n=1}^N k_n \int_\Omega \phi_n^0(x) \frac{\partial \phi^1_i}{\partial x} \frac{\partial \phi^1_j}{\partial x} dx - \sum_{n=1}^N f_n \int_\Omega \phi_i^1(x)\phi_n^0(x) dx & = 0
70 \end{aligned}
71 $$
72 for all $i$. Notice that this defines a linear system of equations $\mathbf{A} \mathbf{h} = \mathbf{f}$, where $\mathbf{h}=[h_1,\ldots,h_{N+1}]$ is a vector of coefficients, $\mathbf{A}$ is a matrix with components given by
73 $$
74 \mathbf{A}_{ij} = \sum_{n=1}^N k_n \int_{x_n}^{x_{n+1}} \frac{\partial \phi^1_i}{\partial x} \frac{\partial \phi^1_j}{\partial x} dx,
75 $$
76 and
77 $$
78 \mathbf{f}_i = \sum_{n=1}^N f_n \int_{x_n}^{x_{n+1}} \phi_i^1(x) dx.
79 $$
80 
81 Notice that the matrix $\mathbf{A}$ is symmetric. This has important consequences for the adjoint and incremental adjoint problems described below.
82 
83 To approximately solve the flow equation, we need to construct the stiffness matrix $\mathbf{A}$ and the right hand side vector $\mathbf{f}$ before solving $\mathbf{h} = \mathbf{A}^{-1}\mathbf{f}$ to obtain the coefficients $h_i$.
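For the uniform mesh used here, these integrals can be evaluated in closed form. As a short worked example, an interior node $i$ (shared by cells $i-1$ and $i$) receives the entries
$$
\mathbf{A}_{i,i} = \frac{k_{i-1}+k_i}{\delta x}, \qquad \mathbf{A}_{i,i+1} = \mathbf{A}_{i+1,i} = -\frac{k_i}{\delta x}, \qquad \mathbf{f}_i = \frac{\delta x}{2}\left(f_{i-1}+f_i\right),
$$
so that $\mathbf{A}$ is tridiagonal. Boundary nodes follow the same pattern but only receive contributions from their single adjacent cell (before the Dirichlet condition is imposed). These are exactly the values assembled by the `BuildStiffness` and `BuildRhs` functions in the implementation below.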
84 
85 
86 
87 #### Adjoint-based Gradient Computation
88 Many sampling, optimization, and dimension reduction algorithms require derivative information. Here we consider adjoint strategies for efficiently computing gradients of functionals $J(h)$ with respect to the conductivity field $k(x)$. A natural way to do this is to consider the following constrained optimization problem
89 $$
90 \begin{aligned}
91 \min_{h,k}\quad & J(h)\\
92 \text{s.t.}\quad & \frac{\partial}{\partial x}\left[ k(x) \frac{\partial h}{\partial x}\right] + f(x) = 0.
93 \end{aligned}
94 $$
95 Using a Lagrange multiplier $p(x)$, we can form the Lagrangian $\mathcal{L}(h,k,p)$ associated with this optimization problem and solve for optimality conditions. The Lagrangian is given by
96 $$
97 \mathcal{L}(h,k,p) = J(h) - \int_\Omega p \frac{\partial}{\partial x}\left[ k \frac{\partial h}{\partial x}\right] dx - \int_\Omega p f dx.
98 $$
99 Note that the explicit dependence of $h$, $k$, and $p$ on $x$ has been dropped for clarity. Using integration by parts, we can "move" the derivative on $k \frac{\partial h}{\partial x}$ to a derivative on $p$, resulting in
100 $$
101 \mathcal{L}(h,k,p) = J(h) - \left.\left(p k \frac{\partial h}{\partial x}\right)\right|_{0}^{1} + \int_\Omega k \frac{\partial p}{\partial x} \frac{\partial h}{\partial x} dx - \int_\Omega p f dx.
102 $$
103 Recall that $p(0)=0$ because of the Dirichlet condition at $x=0$. Combined with the boundary condition $\left.\partial h / \partial x\right|_{x=1} = 0$, the Lagrangian can be simplified to
104 $$
105 \mathcal{L}(h,k,p) = J(h) + \int_\Omega k \frac{\partial p}{\partial x} \frac{\partial h}{\partial x} dx - \int_\Omega p f dx.
106 $$
107 
108 We now consider the first variations of the Lagrangian with respect to each input. Recall that the first variation of a functional $G(m)$ is a function $G_m(m)(\tilde{m})$ representing the directional derivative of the functional $G$ in the direction $\tilde{m}$ and is defined as
109 $$
110 G_m(m)(\tilde{m}) = \left. \frac{\partial}{\partial \epsilon} G(m+\epsilon\tilde{m}) \right|_{\epsilon=0},
111 $$
112 where $\epsilon\in\mathbb{R}$ is a scalar step size.
113 
114 Using this definition, the first variation of $\mathcal{L}$ with respect to $h$ is given by
115 $$
116 \begin{aligned}
117 \mathcal{L}_h(h,k,p)(\tilde{h}) &= \left. \frac{\partial}{\partial \epsilon} \mathcal{L}(h+\epsilon \tilde{h},k,p) \right|_{\epsilon=0}\\
118 &= \left. \frac{\partial}{\partial \epsilon}\left[J(h+\epsilon \tilde{h}) +\int_\Omega k \frac{\partial p}{\partial x} \frac{\partial (h+\epsilon\tilde{h})}{\partial x} dx - \int_\Omega p f dx\right] \right|_{\epsilon=0}\\
119 &= J_h(h)(\tilde{h}) + \int_\Omega k \frac{\partial p}{\partial x}\frac{\partial \tilde{h}}{\partial x} dx.
120 \end{aligned}
121 $$
122 
123 Similarly, the first variation with respect to $p$ is
124 $$
125 \begin{aligned}
126 \mathcal{L}_p(h,k,p)(\tilde{p}) &= \left. \frac{\partial}{\partial \epsilon} \mathcal{L}(h,k,p+\epsilon \tilde{p}) \right|_{\epsilon=0}\\
127 &= \left. \frac{\partial}{\partial \epsilon}\left[J(h) + \int_\Omega k \frac{\partial (p+\epsilon \tilde{p})}{\partial x} \frac{\partial h}{\partial x} dx - \int_\Omega (p+\epsilon \tilde{p}) f dx\right] \right|_{\epsilon=0}\\
128 &= \int_\Omega k \frac{\partial \tilde{p}}{\partial x}\frac{\partial h}{\partial x} dx - \int_\Omega \tilde{p} f dx.
129 \end{aligned}
130 $$
131 
132 And finally, the first variation with respect to $k$ is given by
133 $$
134 \begin{aligned}
135 \mathcal{L}_k(h,k,p)(\tilde{k}) &= \left. \frac{\partial}{\partial \epsilon} \mathcal{L}(h,k+\epsilon \tilde{k},p) \right|_{\epsilon=0}\\
136 &= \left. \frac{\partial}{\partial \epsilon} \left[ J(h) + \int_\Omega (k+\epsilon\tilde{k}) \frac{\partial p}{\partial x} \frac{\partial h}{\partial x} dx - \int_\Omega p f dx\right] \right|_{\epsilon=0}\\
137 &=\int_\Omega \tilde{k}\frac{\partial h}{\partial x} \frac{\partial p}{\partial x} dx
138 \end{aligned}
139 $$
140 
141 The solution of the optimization problem is given by the functions $(h,k,p)$ that cause all of these first variations to be zero for all admissible directions $\tilde{h}$, $\tilde{p}$, and $\tilde{k}$. Solving $\mathcal{L}_p(h,k,p)(\tilde{p})=0$ for all $\tilde{p}$ yields the weak form of the original elliptic PDE -- the "forward problem." Solving $\mathcal{L}_h(h,k,p)(\tilde{h})=0$ for all $\tilde{h}$ results in the weak form of the adjoint equation. Finally, once the forward and adjoint equations are satisfied, the remaining first variation $\mathcal{L}_k(h,k,p)(\tilde{k})$ gives the directional derivative of $J$ with respect to $k$ and therefore defines the gradient: $\nabla_k J = \frac{\partial h}{\partial x} \frac{\partial p}{\partial x}$.
142 
143 To intuitively interpret $\nabla_k J$ as the gradient, it helps to compare the inner product of $\tilde{k}$ and $\nabla_k J$ in the definition of $\mathcal{L}_k(h,k,p)(\tilde{k})$ with the definition of a directional derivative in a finite dimensional space. Consider the relationship between the gradient and directional derivative of a function $g(y)$ mapping $\mathbb{R}^N\rightarrow\mathbb{R}$. The derivative of $g$ in a unit direction $\tilde{y}$ is given by the inner product of the gradient and the direction: $\tilde{y}\cdot \nabla_y g$. Comparing this to the first variation $\mathcal{L}_k(h,k,p)(\tilde{k})$, we see that the integral $\int_\Omega \tilde{k}\frac{\partial h}{\partial x} \frac{\partial p}{\partial x} dx$ is just an inner product of the direction $\tilde{k}$ with $\nabla_kJ$. $\nabla_kJ$ is therefore the functional gradient of $J$ with respect to the conductivity $k$.
144 
145 Notice that the adjoint equation has the same form as the forward model but with a different right hand side. This is because the flow equation is self-adjoint. Self-adjoint operators are numerically convenient because we can reuse the same finite element discretization of the forward problem to solve the adjoint equation. In particular, we can solve the linear system $\mathbf{A} \mathbf{p} = -\nabla_h J$ (the sign follows from setting $\mathcal{L}_h(h,k,p)(\tilde{h})=0$), where $\mathbf{A}$ is the same matrix as before, and $\nabla_h J$ is a vector containing the gradient of $J$ with respect to the coefficients $\mathbf{h}_i$. The solution of this linear system can then be used to obtain the derivative of $J$ with respect to each coefficient $k_i$ defining the conductivity field:
146 $$
147 \frac{\partial J}{\partial k_i} = \int_{x_i}^{x_{i+1}} \frac{\partial h}{\partial x} \frac{\partial p}{\partial x}\, dx = \frac{ (\mathbf{h}_{i+1}-\mathbf{h}_{i})(\mathbf{p}_{i+1} - \mathbf{p}_i)}{x_{i+1} - x_i}.
148 $$
149 
150 Note that obtaining the gradient with this approach only requires one extra linear solve to obtain the adjoint variable $\mathbf{p}$ and that a factorization of the stiffness matrix $\mathbf{A}$ can be reused. This is in stark contrast to a finite difference approximation, which would require $N$ additional solves -- one for each derivative $\frac{\partial J}{\partial k_i}$.
151 
152 
153 #### Jacobian Actions
154 
155 The gradient computation above performs one step of the chain rule, which is equivalent to applying the transpose of the model Jacobian (i.e., linearized $k\rightarrow h$ mapping) to the sensitivity vector $\nabla_h J$. It is also possible to efficiently apply the Jacobian itself to a vector $v$ using what is often called the tangent linear model. To see this, consider a small perturbation $\tilde{k}(x)$ of the conductivity field $k(x)$. This perturbation will then cause a perturbation $\tilde{h}$ of the original solution $h$. The tangent linear model relates $\tilde{k}$ and $\tilde{h}$ by ignoring higher order terms like $\tilde{k}^2$ or $\tilde{k}\tilde{h}$.
156 
157 To derive the tangent linear model for this problem, consider what happens when we substitute the perturbed fields $k+\tilde{k}$ and $h+\tilde{h}$ into the flow equation. The result is
158 $$
159 \begin{aligned}
160 -\frac{\partial}{\partial x}\left[ (k+\tilde{k}) \frac{\partial (h+\tilde{h})}{\partial x}\right] &= f(x)\\
161 \tilde{h}(0) &= 0\\
162 \left. \frac{\partial \tilde{h}}{\partial x}\right|_{x=1} &= 0,
163 \end{aligned}
164 $$
165 
166 Expanding terms, removing higher order terms, and simplifying then gives
167 
168 $$
169 \begin{aligned}
170 -\frac{\partial}{\partial x}\left[ k \frac{\partial h}{\partial x}\right] -\frac{\partial}{\partial x}\left[ \tilde{k} \frac{\partial h}{\partial x}\right] -\frac{\partial}{\partial x}\left[ k \frac{\partial \tilde{h}}{\partial x}\right] -\frac{\partial}{\partial x}\left[ \tilde{k} \frac{\partial \tilde{h}}{\partial x}\right] &= f(x)\\
171 \Rightarrow -\frac{\partial}{\partial x}\left[ k \frac{\partial \tilde{h}}{\partial x}\right] &= \frac{\partial}{\partial x}\left[ \tilde{k} \frac{\partial h}{\partial x}\right].
172 \end{aligned}
173 $$
174 The weak form of this tangent linear model is given by
175 $$
176 \begin{aligned}
177 \int_\Omega k \frac{\partial p}{\partial x} \frac{\partial \tilde{h}}{\partial x} dx & = -\int_\Omega \tilde{k} \frac{\partial p}{\partial x} \frac{\partial h}{\partial x} dx \quad \forall p\in \mathcal{H}.
178 \end{aligned}
179 $$
180 
181 Yet again, this equation has the same form as the forward and adjoint equations! The only difference is the right hand side. This shared form is not the case for most operators, but makes the implementation of this model quite easy because we only need one function to construct the stiffness matrix.
182 
183 Using the vector $\mathbf{h}$ of nodal values, the weak form of the tangent linear model results in a linear system of the form
184 $$
185 \mathbf{A} \mathbf{\tilde{h}} = \mathbf{t},
186 $$
187 where entry $i$ of the right hand side comes from evaluating $\mathbf{t}_i = -\int_\Omega \tilde{k} \frac{\partial \phi_i^1}{\partial x} \frac{\partial h}{\partial x} dx$ over the two cells adjacent to node $i$. Recalling that $\partial \phi^1_i/\partial x = \pm 1/\delta x$ on those two cells,
188 $$
189 \begin{aligned}
190 \mathbf{t}_i & = -\int_{x_{i-1}}^{x_i} \tilde{k} \frac{\partial \phi_i^1}{\partial x} \frac{\partial h}{\partial x} dx - \int_{x_{i}}^{x_{i+1}} \tilde{k} \frac{\partial \phi_i^1}{\partial x} \frac{\partial h}{\partial x} dx\\
191 &= -\tilde{k}_{i-1} \frac{\mathbf{h}_i - \mathbf{h}_{i-1}}{\delta x} + \tilde{k}_{i} \frac{\mathbf{h}_{i+1} - \mathbf{h}_{i}}{\delta x} \\
192 &= \frac{1}{\delta x}\left[ \tilde{k}_{i} (\mathbf{h}_{i+1} - \mathbf{h}_{i}) - \tilde{k}_{i-1} (\mathbf{h}_i - \mathbf{h}_{i-1})\right].
193 \end{aligned}
194 $$
195 
196 To apply the Jacobian to $\tilde{k}$, we need to solve the forward problem to get $h$, then construct the right hand side vector $\mathbf{t}$ from the direction $\tilde{k}$, and finally solve $\mathbf{\tilde{h}} = \mathbf{A}^{-1}\mathbf{t}$ to get $\tilde{h}$.
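In matrix form, the Jacobian of the discretized $k \rightarrow h$ map therefore acts on a vector $\tilde{\mathbf{k}}$ as
$$
\mathbf{J}\tilde{\mathbf{k}} = \mathbf{A}^{-1}\mathbf{t}(\tilde{\mathbf{k}}),
$$
where $\mathbf{t}(\tilde{\mathbf{k}})$ is the right hand side defined above, which depends linearly on $\tilde{\mathbf{k}}$. The Jacobian never needs to be formed explicitly; once the stiffness matrix has been factored, each application costs a single additional sparse back-substitution.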
197 
198 
199 #### Adjoint-based Hessian actions
200 Second derivative information can also be exploited by many different UQ, optimization, and dimension reduction algorithms. Newton methods in optimization are a classic example of this. Here we extend the adjoint gradient from above to efficiently evaluate the action of the Hessian operator $\nabla_{kk} J$ on a function $\tilde{k}$. After discretizing, this is equivalent to multiplying the Hessian matrix by a vector.
201 
202 Recall from the adjoint gradient discussion above that $\mathcal{L}_k(h,k,p)(\tilde{k})$ is a scalar quantity that represents the directional derivative of the Lagrangian in direction $\tilde{k}$ when $\mathcal{L}_p(h,k,p)(\tilde{p})=0$ for all $\tilde{p}$ and $\mathcal{L}_h(h,k,p)(\tilde{h})=0$ for all $\tilde{h}$. To characterize the Hessian, we now want to consider the change in this scalar quantity when $k$ is perturbed in another direction. In an abstract sense, this is identical to our adjoint approach for the gradient; however, instead of constructing a Lagrangian from the scalar functional $J(h)$, we will use the scalar functional $\mathcal{L}_k(h,k,p)(\tilde{k})$.
203 
204 To characterize a second directional derivative, we can form another (meta-)Lagrangian using $\mathcal{L}_k(h,k,p)(\tilde{k})$ and the constraints $\mathcal{L}_p(h,k,p)(\tilde{p})=0$ and $\mathcal{L}_h(h,k,p)(\tilde{h})=0$. The strong form of these constraints is given by
205 $$
206 \begin{aligned}
207 -\frac{\partial}{\partial x}\left[ k(x) \frac{\partial h}{\partial x}\right] &= f(x)\\
208 h(0) &= 0\\
209 \left. \frac{\partial h}{\partial x}\right|_{x=1} &= 0,
210 \end{aligned}
211 $$
212 and
213 $$
214 \begin{aligned}
215 -\frac{\partial}{\partial x}\left[ k(x) \frac{\partial p}{\partial x}\right] &= D_hJ(x)\\
216 p(0) &= 0\\
217 \left. \frac{\partial p}{\partial x} \right|_{x=1} &= 0.
218 \end{aligned}
219 $$
220 
221 In addition to introducing two additional Lagrange multipliers $\hat{p}$ and $\hat{h}$, we will also introduce a new function $s=\nabla_hJ(h)$ to represent the sensitivity of the objective functional $J(h)$ with respect to $h$. Notice that we will not directly consider second variations of $J$. Instead, we independently consider the action of the Hessian on functions $\bar{k}$ and $\bar{s}$, which is what MUQ will need to propagate second derivative information through a composition of model components.
222 
223 Using $\hat{p}$, $\hat{h}$, and $s$, the meta-Lagrangian $\mathcal{L}^H$ takes the form
224 $$
225 \begin{aligned}
226 \mathcal{L}^H(h,k,p,\hat{h},\hat{p}, s; \tilde{k}) &= \int_\Omega \tilde{k}\frac{\partial h}{\partial x} \frac{\partial p}{\partial x} dx - \int_\Omega \hat{p} \left( \frac{\partial}{\partial x}\left[ k \frac{\partial h}{\partial x}\right] + f\right) dx - \int_\Omega \hat{h}\left(\frac{\partial}{\partial x}\left[ k \frac{\partial p}{\partial x}\right] + s \right) dx \\
227 &= \int_\Omega \tilde{k}\frac{\partial h}{\partial x} \frac{\partial p}{\partial x} dx + \int_\Omega k \frac{\partial \hat{p}}{\partial x}\frac{\partial h}{\partial x} dx - \int_\Omega \hat{p} f dx + \int_\Omega k \frac{\partial \hat{h}}{\partial x}\frac{\partial p}{\partial x} dx - \int_\Omega \hat{h} s dx
228 \end{aligned}
229 $$
230 Placing the $\tilde{k}$ after the semicolon in the arguments to $\mathcal{L}^H$ is meant to indicate that the direction $\tilde{k}$ is a parameter defining the first directional derivative and is not an argument that needs to be considered when taking first variations.
231 
232 
233 The first variation with respect to $p$ is given by
234 $$
235 \begin{aligned}
236 \mathcal{L}_{p}^H(h,k,p,\hat{h},\hat{p}, s; \tilde{k})(\bar{p}) & = \left. \frac{\partial}{\partial \epsilon} \mathcal{L}^H(h,k,p+\epsilon\bar{p},\hat{h},\hat{p}, s; \tilde{k}) \right|_{\epsilon=0}\\
237 & = \left. \frac{\partial}{\partial \epsilon} \left( \int_\Omega \tilde{k}\frac{\partial h}{\partial x} \frac{\partial (p+\epsilon\bar{p})}{\partial x} dx + \int_\Omega k \frac{\partial \hat{p}}{\partial x}\frac{\partial h}{\partial x} dx - \int_\Omega \hat{p} f dx + \int_\Omega k \frac{\partial \hat{h}}{\partial x}\frac{\partial (p+\epsilon\bar{p})}{\partial x} dx \right)\right|_{\epsilon=0}\\
238 & = \int_\Omega \tilde{k}\frac{\partial h}{\partial x} \frac{\partial \bar{p}}{\partial x} dx + \int_\Omega k \frac{\partial \hat{h}}{\partial x}\frac{\partial \bar{p}}{\partial x} dx
239 \end{aligned}
240 $$
241 This expression defines what is commonly called the *incremental forward problem.*
242 
243 The first variation of this meta-Lagrangian with respect to $h$ is given by
244 $$
245 \begin{aligned}
246 \mathcal{L}_{h}^H(h,k,p,\hat{h},\hat{p}, s; \tilde{k})(\bar{h}) & = \left. \frac{\partial}{\partial \epsilon} \mathcal{L}^H(h+\epsilon\bar{h},k,p,\hat{h},\hat{p}, s; \tilde{k}) \right|_{\epsilon=0}\\
247 &= \left. \frac{\partial}{\partial \epsilon} \left( \int_\Omega \tilde{k}\frac{\partial (h+\epsilon\bar{h})}{\partial x} \frac{\partial p}{\partial x} dx + \int_\Omega k \frac{\partial \hat{p}}{\partial x}\frac{\partial (h+\epsilon\bar{h})}{\partial x} dx - \int_\Omega \hat{p} f dx + \int_\Omega k \frac{\partial \hat{h}}{\partial x}\frac{\partial p}{\partial x} dx - \int_\Omega \hat{h}\, s dx \right)\right|_{\epsilon=0} \\
248 &= \int_\Omega \tilde{k}\frac{\partial \bar{h}}{\partial x} \frac{\partial p}{\partial x} dx + \int_\Omega k \frac{\partial \hat{p}}{\partial x}\frac{\partial \bar{h}}{\partial x} dx.
249 \end{aligned}
250 $$
251 This expression defines the *incremental adjoint problem.* Notice that both incremental problems have the same bilinear form $\int_\Omega k \frac{\partial \hat{h}}{\partial x}\frac{\partial \bar{p}}{\partial x} dx$ that is present in the forward and adjoint problems. Just like we were able to reuse the matrix $\mathbf{A}$ from the forward problem in the adjoint problem, we can also use $\mathbf{A}$ to solve both incremental problems.
252 
253 Finally, the first variation with respect to $k$ is
254 $$
255 \begin{aligned}
256 \mathcal{L}_{k}^H(h,k,p,\hat{h}, \hat{p}, s; \tilde{k})(\bar{k}) &= \left. \frac{\partial}{\partial \epsilon} \mathcal{L}^H(h,k+\epsilon \bar{k},p,\hat{h},\hat{p},s; \tilde{k}) \right|_{\epsilon=0}\\
257 &= \left. \frac{\partial}{\partial \epsilon}\left( \int_\Omega \tilde{k}\frac{\partial h}{\partial x} \frac{\partial p}{\partial x} dx + \int_\Omega (k+\epsilon \bar{k}) \frac{\partial \hat{p}}{\partial x}\frac{\partial h}{\partial x} dx - \int_\Omega \hat{p} f dx + \int_\Omega (k+\epsilon\bar{k}) \frac{\partial \hat{h}}{\partial x}\frac{\partial p}{\partial x} dx - \int_\Omega \hat{h} D_hJ dx \right) \right|_{\epsilon=0} \\
258 &= \int_\Omega \bar{k} \frac{\partial \hat{p}}{\partial x}\frac{\partial h}{\partial x} dx +\int_\Omega\bar{k} \frac{\partial \hat{h}}{\partial x} \frac{\partial p}{\partial x} dx.
259 \end{aligned}
260 $$
261 Combined with the incremental solution fields $\hat{h}$ and $\hat{p}$, this expression allows us to apply the Hessian to an input $\tilde{k}$. The action of the Hessian, denoted here by $\nabla_{kk}J$, on the function $\tilde{k}$ is given by
262 $$
263 (\nabla_{kk}J ) \tilde{k} = \frac{\partial \hat{p}}{\partial x}\frac{\partial h}{\partial x} + \frac{\partial \hat{h}}{\partial x} \frac{\partial p}{\partial x}.
264 $$
265 Note that $\tilde{k}$ is implicitly contained in the right hand side of this expression through $\hat{p}$ and $\hat{h}$.
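In summary, one Hessian action with respect to $k$ requires four solves with the same stiffness matrix $\mathbf{A}$: the forward problem for $h$, the adjoint problem for $p$, and the two incremental problems for $\hat{h}$ and $\hat{p}$. Schematically,
$$
\tilde{k} \;\longrightarrow\; \left\{ \mathbf{A}\hat{\mathbf{h}} = \mathbf{t}(\tilde{k}; h), \;\; \mathbf{A}\hat{\mathbf{p}} = \mathbf{t}(\tilde{k}; p) \right\} \;\longrightarrow\; (\nabla_{kk}J ) \tilde{k} = \frac{\partial \hat{p}}{\partial x}\frac{\partial h}{\partial x} + \frac{\partial \hat{h}}{\partial x} \frac{\partial p}{\partial x},
$$
where $\mathbf{t}(\tilde{k};\cdot)$ denotes the incremental right hand side built from the direction $\tilde{k}$ and the indicated solution field, just like the tangent linear right hand side above. Because $\mathbf{A}$ can be factored once and reused, the marginal cost of a Hessian action is comparable to that of a gradient.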
266 
267 
268 ## Implementation
269 MUQ implements components of models through the `ModPiece` class. Readers that are not familiar with the `ModPiece` class should consult the [model components tutorial](https://mituq.bitbucket.io/source/_site/latest/group__modpieces.html) before proceeding.
270 
271 To define the adjoint-enabled flow equation solver derived above, we need to override four methods in the `ModPiece` base class: the `EvaluateImpl` function, which solves $\mathbf{A}\mathbf{h}=\mathbf{f}$; the `GradientImpl` function, which solves the adjoint system for $\mathbf{p}$ and assembles the gradient; the `ApplyJacobianImpl` function, which solves the tangent linear system; and finally, the `ApplyHessianImpl` function, which solves the incremental forward and adjoint problems for $\hat{\mathbf{h}}$ and $\hat{\mathbf{p}}$ before returning the Hessian action. These functions are all contained in the `FlowEquation` class below.
272 
273 Notice that all of these functions require constructing the stiffness matrix $\mathbf{A}$. To avoid duplicated code, the `FlowEquation` class therefore contains a `BuildStiffness` function that accepts an Eigen vector of hydraulic conductivities and returns an `Eigen::SparseMatrix<double>` holding $\mathbf{A}$.
274 
275 To implement the Dirichlet boundary condition $h(x=0)=0$, we ensure that $\mathbf{f}_0 = 0$ and add a large value $c$ (e.g., $10^{10}$) to $\mathbf{A}_{0,0}$. This is a common "trick" for enforcing Dirichlet boundary conditions while ensuring the stiffness matrix $\mathbf{A}$ remains symmetric.
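To see why this works, note that with $\mathbf{f}_0=0$ the first equation of the modified system reads
$$
\left(c + \mathbf{A}_{0,0}\right) \mathbf{h}_0 + \mathbf{A}_{0,1} \mathbf{h}_1 = 0 \quad\Longrightarrow\quad \mathbf{h}_0 = \mathcal{O}(c^{-1}) \approx 0,
$$
where $\mathbf{A}_{0,0}$ and $\mathbf{A}_{0,1}$ denote the unmodified entries. The head at $x=0$ is therefore driven to zero while the matrix stays symmetric and the remaining rows are untouched.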
276 */
277 
278 /***
279 ## Includes
280 This example will use many components of MUQ's modeling module as well as sparse linear algebra routines from Eigen. Below we include everything that's needed later on.
281 */
282 
283 #include "MUQ/Modeling/ModPiece.h"
285 #include "MUQ/Modeling/WorkGraph.h"
// Headers for the other MUQ classes used below (Gaussian, HessianOperator, StochasticEigenSolver,
// RandomGenerator); the include paths assume MUQ's standard source layout.
#include "MUQ/Modeling/ModGraphPiece.h"
#include "MUQ/Modeling/Distributions/Gaussian.h"
#include "MUQ/Modeling/Distributions/Density.h"
#include "MUQ/Modeling/LinearAlgebra/HessianOperator.h"
#include "MUQ/Modeling/LinearAlgebra/StochasticEigenSolver.h"

#include "MUQ/Utilities/RandomGenerator.h"
293 #include <boost/property_tree/ptree.hpp>
294 
295 #include <vector>
296 
297 #include <Eigen/Core>
298 #include <Eigen/Sparse>
299 #include <Eigen/SparseCholesky>
300 
301 
302 /***
303 ## Class Definition
304 
305 This class solves the 1D elliptic PDE of the form
306  $$
307  -\frac{\partial}{\partial x}\left( k(x) \frac{\partial h}{\partial x}\right) = f(x).
308  $$
309  over $x\in[0,1]$ with boundary conditions $h(0)=0$ and
310  $\partial h/\partial x =0$ at $x=1$. This equation is a basic model of steady
311  state fluid flow in a porous medium, where $h(x)$ is the hydraulic head, $k(x)$
312  is the hydraulic conductivity, and $f(x)$ is the recharge.
313 
314  This ModPiece uses linear finite elements on a uniform grid. There is a single input,
315  the conductivity $k(x)$, which is represented as piecewise constant within each
316  of the $N$ cells. There is a single output of this ModPiece: the head $h(x)$ at the
317  $N+1$ nodes in the discretization.
318 
319  INPUTS: A single vector of length $N$ holding the cell-wise conductivity values.
320 
321  OUTPUTS: A single vector of length $N+1$ holding the hydraulic head at each node.
322 */
323 class FlowEquation : public muq::Modeling::ModPiece
324 {
325 public:
326 
331  FlowEquation(Eigen::VectorXd const& sourceTerm) : muq::Modeling::ModPiece({int(sourceTerm.size())},
332  {int(sourceTerm.size()+1)})
333  {
334  numCells = sourceTerm.size();
335  numNodes = sourceTerm.size()+1;
336 
337  xs = Eigen::VectorXd::LinSpaced(numNodes,0,1);
338  dx = xs(1)-xs(0);
339 
340  rhs = BuildRhs(sourceTerm);
341  };
342 
343 protected:
344 
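 /***
 Solves the forward problem. The conductivity vector is read from `inputs[0]`, the stiffness
 matrix $\mathbf{A}$ is assembled, and the sparse system $\mathbf{A}\mathbf{h}=\mathbf{f}$ is
 solved with a sparse $LDL^T$ factorization. The nodal heads are stored in `outputs[0]`.
 */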
353  void EvaluateImpl(muq::Modeling::ref_vector<Eigen::VectorXd> const& inputs) override
354  {
355  // Extract the conductivity vector from the inputs
356  auto& cond = inputs.at(0).get();
357 
358  // Build the stiffness matrix
359  auto K = BuildStiffness(cond);
360 
361  // Solve the sparse linear system and store the solution in the outputs vector
362  Eigen::SimplicialLDLT<Eigen::SparseMatrix<double>> solver;
363  solver.compute(K);
364 
365  outputs.resize(1);
366  outputs.at(0) = solver.solve(rhs);
367  };
368 
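 /***
 Computes the gradient of a functional $J(h)$ with respect to the conductivity using the adjoint
 approach derived above. The forward and adjoint systems are solved with a single factorization
 of the stiffness matrix; `sens` holds the sensitivity $\nabla_h J$, and the cell-wise gradient
 is stored in the `gradient` member of `ModPiece`, which is returned by the `Gradient` function.
 */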
397  virtual void GradientImpl(unsigned int outWrt,
398  unsigned int inWrt,
399  muq::Modeling::ref_vector<Eigen::VectorXd> const& inputs,
400  Eigen::VectorXd const& sens) override
401  {
402  // Extract the conductivity vector from the inputs
403  auto& cond = inputs.at(0).get();
404 
405  // Build the stiffness matrix
406  auto A = BuildStiffness(cond);
407 
408  // Factor the stiffness matrix for forward and adjoint solves
409  Eigen::SimplicialLDLT<Eigen::SparseMatrix<double>> solver;
410  solver.compute(A);
411 
412  // Solve the forward problem
413  Eigen::VectorXd sol = solver.solve(rhs);
414 
415  // Solve the adjoint problem
416  Eigen::VectorXd adjRhs = BuildAdjointRHS(sens);
417  Eigen::VectorXd adjSol = solver.solve(adjRhs);
418 
419  // Compute the gradient from the adjoint solution
420  gradient = (1.0/dx)*(sol.tail(numCells)-sol.head(numCells)).array() * (adjSol.tail(numCells) - adjSol.head(numCells)).array();
421  }
422 
423 
424  /***
425  This function computes the application of the model Jacobian's matrix $J$
426  on a vector $v$. In addition to the model parameter $k$, this function also
427  requires the vector $v$.
428 
429  The Jacobian action is computed by solving the forward model to get $h(x)$
430  and then using the tangent linear approach described above to apply the
431  Jacobian to the vector $v$.
432 
433  @param[in] outWrt For a model with multiple outputs, this would be the index
434  of the output whose Jacobian should be applied.
435  Since this ModPiece only has one output, the outWrt argument
436  is not used in the ApplyJacobianImpl function.
437 
438  @param[in] inWrt Specifies the index of the input we want to differentiate with
439  respect to. For inWrt==0, the Jacobian with respect
440  to the conductivity is applied. Since this ModPiece only has one
441  input, 0 is the only valid value for inWrt.
442 
443  @param[in] inputs Just like the EvaluateImpl function, this is a list of vector-valued inputs.
444 
445  @param[in] vec A vector with the same size as inputs[0]. The Jacobian will be applied to this vector.
446 
447  @return This function returns nothing. It stores the result in the
448  ModPiece::jacobianAction member that is then returned by the `ApplyJacobian` function.
449 
450  */
451  virtual void ApplyJacobianImpl(unsigned int outWrt,
452  unsigned int inWrt,
453  muq::Modeling::ref_vector<Eigen::VectorXd> const& inputs,
454  Eigen::VectorXd const& vec) override
455  {
456  // Extract the conductivity vector from the inputs
457  auto& cond = inputs.at(0).get();
458 
459  // Build the stiffness matrix
460  auto A = BuildStiffness(cond);
461 
462  // Factor the stiffness matrix for forward and tangent linear solves
463  Eigen::SimplicialLDLT<Eigen::SparseMatrix<double>> solver;
464  solver.compute(A);
465 
466  // Solve the forward system
467  Eigen::VectorXd sol = solver.solve(rhs);
468 
469  // Build the tangent linear rhs
470  Eigen::VectorXd incrRhs = Eigen::VectorXd::Zero(numNodes);
471 
472  Eigen::VectorXd dh_dx = (sol.tail(numCells)-sol.head(numCells))/dx; // Spatial derivative of solution
473 
474  incrRhs.head(numCells) += ( vec.array()*dh_dx.array() ).matrix();
475  incrRhs.tail(numCells) -= ( vec.array()*dh_dx.array() ).matrix();
476 
477  // Solve the tangent linear model for the jacobian action
478  jacobianAction = solver.solve(incrRhs);
479  }
480 
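 /***
 Applies the requested Hessian block to the vector `vec` using the incremental forward and
 adjoint problems derived above. For `inWrt1==0` and `inWrt2==0` (the conductivity-conductivity
 block), four solves with a single factorization of the stiffness matrix are performed. For
 `inWrt2==1` (the mixed block with respect to the sensitivity input), a single additional solve
 suffices. The result is stored in the `hessAction` member of `ModPiece`.
 */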
519  virtual void ApplyHessianImpl(unsigned int outWrt,
520  unsigned int inWrt1,
521  unsigned int inWrt2,
522  muq::Modeling::ref_vector<Eigen::VectorXd> const& inputs,
523  Eigen::VectorXd const& sens,
524  Eigen::VectorXd const& vec) override
525  {
526  // Extract the conductivity vector from the inputs
527  auto& cond = inputs.at(0).get();
528 
529  // Build the stiffness matrix
530  auto K = BuildStiffness(cond);
531 
532  // Factor the stiffness matrix for forward and adjoint solves
533  Eigen::SimplicialLDLT<Eigen::SparseMatrix<double>> solver;
534  solver.compute(K);
535 
536  // Solve the forward problem
537  Eigen::VectorXd sol = solver.solve(rhs);
538 
539  // If we're using the Hessian $\nabla_{kk} J$
540  if((inWrt1==0)&&(inWrt2==0)){
541 
542  // Solve the adjoint problem
543  Eigen::VectorXd adjRhs = BuildAdjointRHS(sens);
544  Eigen::VectorXd adjSol = solver.solve(adjRhs);
545 
546  // Solve the incremental forward problem
547  Eigen::VectorXd incrForRhs = BuildIncrementalRhs(sol, vec);
548  Eigen::VectorXd incrForSol = solver.solve(incrForRhs);
549 
550  // Solve the incremental adjoint problem
551  Eigen::VectorXd incrAdjRhs = BuildIncrementalRhs(adjSol, vec);
552  Eigen::VectorXd incrAdjSol = solver.solve(incrAdjRhs);
553 
554  // Construct the Hessian action
555  auto solDeriv = (sol.tail(numCells)-sol.head(numCells))/dx;
556  auto adjDeriv = (adjSol.tail(numCells)-adjSol.head(numCells))/dx;
557  auto incrForDeriv = (incrForSol.tail(numCells) - incrForSol.head(numCells))/dx;
558  auto incrAdjDeriv = (incrAdjSol.tail(numCells) - incrAdjSol.head(numCells))/dx;
559 
560  hessAction = -(incrAdjDeriv.array() * solDeriv.array() + incrForDeriv.array() * adjDeriv.array());
561 
562  // If we're using the mixed Hessian $\nabla_{ks} J$
563  }else if((inWrt1==0)&&(inWrt2==1)){
564 
565  Eigen::VectorXd temp = solver.solve(vec);
566  auto solDeriv = (sol.tail(numCells) - sol.head(numCells))/dx;
567  auto tempDeriv = (temp.tail(numCells)-temp.head(numCells))/dx;
568 
569  hessAction = -dx * solDeriv.array() * tempDeriv.array();
570 
571  // We should never see any other options...
572  }else{
573  assert(false);
574  }
575  }
576 
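 /*** Assembles the forward-problem right hand side $\mathbf{f}$ from the piecewise constant source term, leaving a zero in the Dirichlet row. */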
578  Eigen::VectorXd BuildRhs(Eigen::VectorXd const& sourceTerm) const
579  {
580  Eigen::VectorXd rhs = Eigen::VectorXd::Zero(numNodes);
581 
582  rhs.segment(1,numNodes-2) = 0.5*dx*(sourceTerm.tail(numNodes-2) + sourceTerm.head(numNodes-2));
583  rhs(numNodes-1) = 0.5*dx*sourceTerm(numNodes-2);
584 
585  return rhs;
586  }
587 
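 /*** Builds the adjoint right hand side by negating the sensitivity vector $\nabla_h J$ and zeroing the entry at the Dirichlet node. */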
589  Eigen::VectorXd BuildAdjointRHS(Eigen::VectorXd const& sensitivity) const
590  {
591  Eigen::VectorXd rhs = -1.0*sensitivity;
592  rhs(0) = 0.0; // <- To enforce Dirichlet BC
593  return rhs;
594  }
595 
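 /*** Builds the right hand side of an incremental (forward or adjoint) problem from a perturbation direction `khat` and a previously computed solution field `sol`. */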
597  Eigen::VectorXd BuildIncrementalRhs(Eigen::VectorXd const& sol, Eigen::VectorXd const& khat)
598  {
599  // Compute the derivative of the solution in each cell
600  Eigen::VectorXd solGrad = (sol.tail(numCells)-sol.head(numCells))/dx;
601  Eigen::VectorXd rhs = Eigen::VectorXd::Zero(numNodes);
602 
603  unsigned int leftNode, rightNode;
604  for(unsigned int cellInd=0; cellInd<numCells; ++cellInd)
605  {
606  leftNode = cellInd;
607  rightNode = cellInd + 1;
608 
609  rhs(leftNode) -= dx*khat(cellInd)*solGrad(cellInd);
610  rhs(rightNode) += dx*khat(cellInd)*solGrad(cellInd);
611  }
612 
613  rhs(0) = 0.0; // <- To enforce Dirichlet BC at x=0
614  return rhs;
615  }
616 
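 /*** Assembles the sparse stiffness matrix $\mathbf{A}$ from the cell-wise conductivities, adding a large value to the (0,0) entry to enforce the Dirichlet condition at $x=0$. */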
618  Eigen::SparseMatrix<double> BuildStiffness(Eigen::VectorXd const& condVals) const{
619 
620  typedef Eigen::Triplet<double> T;
621  std::vector<T> nzVals;
622 
623  // Add a large number to K[0,0] to enforce Dirichlet BC
624  nzVals.push_back( T(0,0,1e10) );
625 
626  unsigned int leftNode, rightNode;
627  for(unsigned int cellInd=0; cellInd<numCells; ++cellInd)
628  {
629  leftNode = cellInd;
630  rightNode = cellInd+1;
631 
632  nzVals.push_back( T(leftNode, rightNode, -condVals(cellInd)/dx) );
633  nzVals.push_back( T(rightNode, leftNode, -condVals(cellInd)/dx) );
634  nzVals.push_back( T(rightNode, rightNode, condVals(cellInd)/dx) );
635  nzVals.push_back( T(leftNode, leftNode, condVals(cellInd)/dx) );
636  }
637 
638  Eigen::SparseMatrix<double> stiffMat(numNodes,numNodes);
639  stiffMat.setFromTriplets(nzVals.begin(), nzVals.end());
640 
641  return stiffMat;
642  }
643 
644 
645 private:
646 
647  // Store "mesh" information. xs contains the node locations and dx is the uniform spacing between nodes
648  Eigen::VectorXd xs;
649  double dx;
650  unsigned int numCells;
651  unsigned int numNodes;
652 
653  // Will store the precomputed RHS for the forward problem
654  Eigen::VectorXd rhs;
655 
656 }; // end of class FlowEquation
657 
658 
659 /***
660 ## Construct Model
661 Here we instantiate the `FlowEquation` class. The source term is set to be $f(x)=1$ by passing in a vector of
662 all ones to the `FlowEquation` constructor.
663 */
664 int main(){
665 
666  using namespace muq::Modeling;
667  using namespace muq::Utilities;
668 
669  unsigned int numCells = 200;
670 
671  // Vector containing the location of each node in the "mesh"
672  Eigen::VectorXd nodeLocs = Eigen::VectorXd::LinSpaced(numCells+1,0,1);
673 
674  // Vector containing the midpoint of each cell
675  Eigen::VectorXd cellLocs = 0.5*(nodeLocs.head(numCells) + nodeLocs.tail(numCells));
676 
677  // The source term f(x) in each grid cell
678  Eigen::VectorXd recharge = Eigen::VectorXd::Ones(numCells);
679 
680  // Create an instance of the FlowModel class
681  auto mod = std::make_shared<FlowEquation>(recharge);
682 
683 /***
684 ## Evaluate Model
685 Here we construct a vector of conductivity values and call the `FlowEquation.Evaluate` function to evaluate the model.
686 Recall that you should implement the `EvaluateImpl` function, but call the `Evaluate` function.
687 In addition to calling `EvaluateImpl`, the `Evaluate` function checks the size of the input vectors, tracks run
688 times, counts function evaluations, and manages a one-step cache of outputs.
689 */
690 
691  // Define a conductivity field and evaluate the model k(x) = exp(cos(20*x))
692  Eigen::VectorXd cond = (20.0*cellLocs.array()).cos().exp();
693 
694  Eigen::VectorXd h = mod->Evaluate(cond).at(0);
695 
696  std::cout << "Solution: \n" << h.transpose() << std::endl << std::endl;
697 
698 /***
699 ## Check Model Gradient
700 To check our adjoint gradient implementation, we will employ a finite difference approximation of the gradient vector.
701 Before doing that however, we need to define a scalar "objective function" $J(h)$ that can be composed with the flow
702 equation model. In practice, this objective function is often the likelihood function or posterior density in a
703 Bayesian inverse problem. For simplicity, we will just consider the log of a standard normal density:
704 $$
705 J(h) \propto -\frac{1}{2} \|h\|^2.
706 $$
707 
708 The following code uses the density view of MUQ's `Gaussian` class to define $J(h)$. The `objective`
709 defined here is just another `ModPiece` and can be used in the same way as any other `ModPiece`.
710 */
711  auto objective = std::make_shared<Gaussian>(numCells+1)->AsDensity();
712 
713 /***
714 To obtain the gradient of the objective function $J(h)$ with respect to the vector of cell-wise conductivities
715 $\mathbf{k}$, we need to apply the chain rule:
716 $$
717 \nabla_{\mathbf{k}} J = \left( \nabla_h J \right) \nabla_{\mathbf{k}} h
718 $$
719 The following code uses the `objective` object to obtain the initial sensitivity $\nabla_h J$. This is then
720 passed to the `Gradient` function of the flow model, which will use our adjoint implementation above to
721 compute $\nabla_{\mathbf{k}} J$.
722 */
723  Eigen::VectorXd objSens = Eigen::VectorXd::Ones(1);
724  Eigen::VectorXd sens = objective->Gradient(0,0,h,objSens);
725  Eigen::VectorXd grad = mod->Gradient(0,0,cond,sens);
726 
727  std::cout << "Gradient: \n" << grad.transpose() << std::endl << std::endl;
728 
729 /***
730 To verify our `FlowEquation.GradientImpl` function, we can call the built-in `ModPiece.GradientByFD` function
731 to construct a finite difference approximation of the gradient. If all is well, the finite difference and adjoint
732 gradients will be close.
733 */
734  Eigen::VectorXd gradFD = mod->GradientByFD(0,0,std::vector<Eigen::VectorXd>{cond},sens);
735  std::cout << "Finite Difference Gradient:\n" << gradFD.transpose() << std::endl << std::endl;
736 
737 /***
738 ## Test Jacobian of Model
739 Here we randomly choose a vector $v$ (`jacDir`) and compute the action of the Jacobian $Jv$ using both our adjoint method and MUQ's built-in finite difference implementation.
740 */
741  Eigen::VectorXd jacDir = RandomGenerator::GetUniform(numCells);
742 
743  Eigen::VectorXd jacAct = mod->ApplyJacobian(0,0, cond, jacDir);
744  std::cout << "Jacobian Action: \n" << jacAct.transpose() << std::endl << std::endl;
745 
746  Eigen::VectorXd jacActFD = mod->ApplyJacobianByFD(0,0, std::vector<Eigen::VectorXd>{cond}, jacDir);
747  std::cout << "Finite Difference Jacobian Action \n" << jacActFD.transpose() << std::endl << std::endl;
748 
749 /***
750 ## Test Hessian of Model
751 We now take a similar approach to verifying our Hessian action implementation. Here we randomly choose a vector $v$ (`hessDir`)
752 and compute $Hv$ using both our adjoint method and MUQ's built-in finite difference implementation.
753 */
754  Eigen::VectorXd hessDir = RandomGenerator::GetUniform(numCells);
755 
756  Eigen::VectorXd hessAct = mod->ApplyHessian(0,0,0,cond,sens,hessDir);
757  std::cout << "Hessian Action: \n" << hessAct.transpose() << std::endl << std::endl;
758 
759  Eigen::VectorXd hessActFD = mod->ApplyHessianByFD(0,0,0,std::vector<Eigen::VectorXd>{cond},sens,hessDir);
760  std::cout << "Finite Difference Hessian Action \n" << hessActFD.transpose() << std::endl << std::endl;
761 
762 /***
763 ## Test Hessian of Objective
764 In the tests above, we manually evaluate the `objective` and `mod` components separately. They can also be combined
765 in a MUQ `WorkGraph`, which is more convenient when a large number of components are used or Hessian information needs
766 to be propagated through multiple different components.
767 
768 The following code creates a `WorkGraph` that maps the output of the flow model to the input of the objective function.
769 It then creates a new `ModPiece` called `fullMod` that evaluates the composition $J(h(k))$.
770 */
771  WorkGraph graph;
772  graph.AddNode(mod, "Model");
773  graph.AddNode(objective, "Objective");
774  graph.AddEdge("Model",0,"Objective",0);
775 
776  auto fullMod = graph.CreateModPiece("Objective");
777 
778 /***
779 As before, we can apply the Hessian of the full model to the randomly generated `hessDir` vector and compare the results
780 with finite differences. Notice that the results shown here are slightly different from the Hessian actions computed above.
781 Above, we manually fixed the sensitivity $s$ independently of $h$ and did not account for the dependence of the
782 sensitivity $s$ on the conductivity $k$. The `WorkGraph`, however, captures all of those dependencies.
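This difference can be made precise with the second-order chain rule for the composition $J(h(\mathbf{k}))$:
$$
\nabla_{\mathbf{k}\mathbf{k}} J = \left( \nabla_{\mathbf{k}} h \right)^T \left( \nabla_{hh} J \right) \left( \nabla_{\mathbf{k}} h \right) + \sum_i \frac{\partial J}{\partial h_i} \nabla_{\mathbf{k}\mathbf{k}} h_i.
$$
The manual test above applies only the second term (the model Hessian weighted by a fixed sensitivity $s=\nabla_h J$), while `fullMod` accounts for both terms.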
783 */
784 
785  hessAct = fullMod->ApplyHessian(0,0,0,cond,objSens,hessDir);
786  hessActFD = fullMod->ApplyHessianByFD(0,0,0,std::vector<Eigen::VectorXd>{cond},objSens,hessDir);
787 
788  std::cout << "Hessian Action: \n" << hessAct.transpose() << std::endl << std::endl;
789  std::cout << "Finite Difference Hessian Action \n" << hessActFD.transpose() << std::endl << std::endl;
790 
791 
792 /***
793 ## Compute the Hessian Spectrum
794 In many applications, the eigen decomposition of a Hessian matrix can contain valuable information. For example,
795 let $\pi(y|k)$ denote the likelihood function in a Bayesian inverse problem. The spectrum of $-\nabla_{kk}\log\pi(y|k)$
796 (the Hessian of the negative log likelihood) describes which directions in the parameter space are informed by the data.
797 
798 Below we show how the spectrum of the Hessian can be computed with MUQ's stochastic eigenvalue solver. MUQ's
799 implementation is based on the generalized eigenvalue solver described in
800 [Saibaba et al., 2015](https://doi.org/10.1002/nla.2026) and [Villa et al., 2021](https://dl.acm.org/doi/abs/10.1145/3428447?casa_token=2mk_QQqHe0kAAAAA%3AT5lr3QwgwbKNB4WgxUf7oPgCmqzir2b5O61fZHPEzv3AcU5eKHAxT1f7Ot_wZOL_SGqxe8g5ANAEtw).
801 */
802 
803 /***
804 #### Define the linear operator
805 The first step is to define a `LinearOperator` that evaluates the Hessian actions at a point. Here, we create an
806 instance of MUQ's `HessianOperator` class, which will internally call the `ApplyHessian` function of the `fullMod` `ModPiece`. The inputs to the `HessianOperator` constructor are essentially the same as those of the `ApplyHessian` function, but with an additional scaling term. Here, `scaling=-1` is used because we want the Hessian of the *negative* log density, which is positive semi-definite.
807 */
808  unsigned int outWrt = 0; // There's only one output of "fullMod", so this is the only possibility
809  unsigned int inWrt1 = 0; // There's also only one input of "fullMod"
810  unsigned int inWrt2 = 0;
811 
812  double scaling = -1.0;
813  std::vector<Eigen::VectorXd> inputs(1);
814  inputs.at(0) = cond;
815  auto hessOp = std::make_shared<HessianOperator>(fullMod, inputs, outWrt, inWrt1, inWrt2, objSens, scaling);
816 
817 /***
818 #### Set up the eigenvalue solver
819 We can now set up the eigen solver, compute the decomposition, and extract the eigenvalues and eigenvectors. For more
820 information, see documentation for the [StochasticEigenSolver](https://mituq.bitbucket.io/source/_site/latest/classmuq_1_1Modeling_1_1StochasticEigenSolver.html) class.
821 */
822  boost::property_tree::ptree opts;
823  opts.put("NumEigs", numCells); // Maximum number of eigenvalues to compute
824  opts.put("Verbosity", 3); // Controls how much information is printed to std::cout by the solver
825 
826  StochasticEigenSolver solver(opts);
827  solver.compute(hessOp);
828 
829  Eigen::VectorXd vals = solver.eigenvalues();
830  Eigen::MatrixXd vecs = solver.eigenvectors();
831 
832 /***
833 #### Investigate the Spectrum
834 Below we print the eigenvalues, which are sorted largest to smallest. There are `numCells` parameters in this model
835 and `numCells+1` outputs. The objective function in this example is similar to what we would see if all outputs
836 were observed. The sharp decrease in the eigenvalues shown here is an indication that observing the `numCells+1`
837 outputs of this model is not sufficient to completely constrain all of the `numCells` inputs. Without additional
838 regularization, an inverse problem using this model would therefore be ill-posed. This is a common feature of
839 diffusive models.
840 */
841  std::cout << "Eigenvalues:\n" << vals.transpose() << std::endl << std::endl;
842  return 0;
843 }