Each model component in MUQ is defined as a child of the ModPiece abstract base class. The ModPiece class is the software analog of a function
\[ f : \mathbb{R}^{N_1}\times\cdots\times\mathbb{R}^{N_{D_{in}}}\rightarrow\mathbb{R}^{M_1}\times\cdots\times \mathbb{R}^{M_{D_{out}}}, \]
with \(D_{in}\) vector-valued inputs and \(D_{out}\) vector-valued outputs. The ModPiece base class provides functions for evaluating the function \(f\) and computing derivatives with respect to the input of \(f\). The code snippet below uses the ExpOperator child provided by MUQ to evaluate \(f(x) = \exp(x)\).
#include <Eigen/Core>
#include <vector>
#include "MUQ/Modeling/CwiseOperators/CwiseUnaryOperator.h"
using namespace muq::Modeling;
int main(){
// Set the dimension N_1 of the input vector
int N1 = 2;
// Create a ModPiece for evaluating f(x)=exp(x)
auto expMod = std::make_shared<ExpOperator>(N1);
// Define an input of all ones
Eigen::VectorXd x = Eigen::VectorXd::Ones(N1);
// Evaluate the model
std::vector<Eigen::VectorXd> fx = expMod->Evaluate(x);
}
There are a few important things to note in this code snippet. The result of the Evaluate function is a std::vector (C++) or list (Python) of vectors. This is because the ModPiece class defines general functions with multiple inputs and outputs, and for the same reason the Evaluate function accepts one or more vectors in a list. In Python, this is why the input x would be wrapped in square brackets [...] to form a list. In the C++ code above, the single Eigen::VectorXd can be passed directly because MUQ has shortcuts for transforming it into the expected list format. It is also possible to skip these shortcuts by explicitly creating a std::vector input and passing that to the Evaluate function:
std::vector<Eigen::VectorXd> x(1);
x.at(0) = Eigen::VectorXd::Ones(N1);
std::vector<Eigen::VectorXd> fx = expMod->Evaluate(x);
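In both cases, the result fx is a std::vector of output vectors, one entry per model output. As a minimal sketch (reusing the fx variable from the snippet above), the single output of ExpOperator can be extracted from the first entry:
// Extract the single output vector; each entry equals std::exp(x(i))
Eigen::VectorXd y = fx.at(0);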
Children of the ModPiece class (colloquially called "ModPieces") with multiple inputs are evaluated in the same way as the ExpOperator class above. Here is an example using the SumPiece class.
#include <vector>
#include <Eigen/Core>
#include "MUQ/Modeling/SumPiece.h"
int main(){
// Set the dimension of both input vectors
int N = 2;
// Create a ModPiece for evaluating f(x_1,x_2) = x_1 + x_2
auto sumMod = std::make_shared<SumPiece>(N,2);
// Define the inputs
Eigen::VectorXd x1 = Eigen::VectorXd::Ones(N);
Eigen::VectorXd x2 = Eigen::VectorXd::Ones(N);
// Evaluate the model
std::vector<Eigen::VectorXd> fx = sumMod->Evaluate(x1,x2);
return 0;
}
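The output is again returned as a std::vector of vectors. As a quick sketch (reusing the variables from the snippet above, inside main), we can check that the single output equals the componentwise sum of the inputs:
// The single output of SumPiece should equal x1 + x2 componentwise
Eigen::VectorXd y = fx.at(0);
bool sumOk = y.isApprox(x1 + x2); // expected to be true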
In addition to evaluating the model \(y_1,y_2,\ldots,y_{D_{out}} = f(x_1,x_2,\ldots,x_{D_{in}})\), the ModPieces can also compute derivative information. For example, to compute the Jacobian matrix of \(y_i\) with respect to \(x_j\), we can use the following code:
unsigned int outWrt = 0; // i in dy_i /dx_j
unsigned int inWrt = 0; // j in dy_i /dx_j
Eigen::MatrixXd jac = expMod->Jacobian(outWrt,inWrt,x);
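Since ExpOperator acts componentwise, the Jacobian of \(f(x)=\exp(x)\) is the diagonal matrix \(\operatorname{diag}(\exp(x))\). A small sanity check along these lines, sketched using the jac matrix computed above, might look like:
// For the componentwise exponential, the Jacobian diagonal equals f(x) itself
Eigen::VectorXd expX = expMod->Evaluate(x).at(0);
Eigen::MatrixXd expected = expX.asDiagonal();
bool jacOk = jac.isApprox(expected); // expected to be true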
Similarly, we can compute the application of the Jacobian to a vector, which corresponds to a directional derivative of \(y_i\). For some models, such as PDEs with simple tangent operators, computing Jacobian-vector products can be much more efficient than constructing the entire Jacobian matrix and performing a matrix-vector multiplication.
Eigen::VectorXd dir = Eigen::VectorXd::Random(N1);
Eigen::VectorXd deriv = expMod->ApplyJacobian(outWrt,inWrt,x,dir);
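By definition, applying the Jacobian to dir should agree with forming the dense Jacobian and multiplying. A quick comparison, sketched using the jac matrix computed above:
// The Jacobian-vector product should match the explicit matrix-vector multiply
Eigen::VectorXd jacTimesDir = jac * dir;
bool jvpOk = deriv.isApprox(jacTimesDir); // expected to be true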
Gradients can also be computed in MUQ. Consider a situation where we have a scalar function \(g(y_i)\) and are interested in the gradient of \(g\) with respect to the model input \(x_j\). From the chain rule, we have \(\nabla_{x_j} g = \nabla_{y_i} g D_{ij}f\), where \(\nabla_{y_i} g\) is a row-vector containing the gradient of \(g\) and \(D_{ij}\) denotes the Jacobian matrix of the model \(f\) using output \(y_i\) and input \(x_j\). In terms of column vectors, we have
\[ ( \nabla_{x_j} g)^T = D_{ij}^T s, \]
where \(s = (\nabla_{y_i} g)^T\). In our context, the vector \(s\) is called the sensitivity vector because it represents the first-order sensitivity of \(g\) with respect to the model output \(y_i\). Notice that, like the directional derivatives above, the gradient is simply a matrix-vector product involving the Jacobian matrix; here, however, the transpose of the Jacobian is used.
The Gradient function in the ModPiece class computes \(D_{ij}^T s\). This operation can be much faster than constructing the full Jacobian matrix when the dimension of the model input \(x_j\) is large, such as when \(x_j\) represents a spatially distributed field and \(f(x)\) is a PDE model. In this case, adjoint techniques can be used to efficiently compute \(D_{ij}^T s\).
Eigen::VectorXd sens = Eigen::VectorXd::Random(N1); // the sensitivity vector s = (\nabla_{y_i} g)^T
Eigen::VectorXd grad = expMod->Gradient(outWrt,inWrt,x,sens);
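For small problems, the gradient can be checked against the dense Jacobian, since \((\nabla_{x_j} g)^T = D_{ij}^T s\). A sketch reusing the jac matrix from the earlier snippet:
// The gradient should equal the transposed Jacobian applied to the sensitivity vector
Eigen::VectorXd gradCheck = jac.transpose() * sens;
bool gradOk = grad.isApprox(gradCheck); // expected to be true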
Hessian-vector products, which are second-order directional derivatives, can be computed with the ApplyHessian function in the ModPiece class. Notice that the Hessian matrix is the Jacobian matrix of the function mapping \(x_j\) to the gradient \(\nabla_{x_j} g\); the ModPiece::ApplyHessian function thus applies the Jacobian of the Gradient function to a vector. To apply the Hessian, we need the same sensitivity vector \(s\) used in the gradient calculation, as well as the vector we want to apply the Hessian to.
unsigned int inWrt1 = 0; // i in the second derivative d^2f / dx_i dx_j
unsigned int inWrt2 = 0; // j in the second derivative d^2f / dx_i dx_j
Eigen::VectorXd dir = Eigen::VectorXd::Random(N1);
Eigen::VectorXd sens = Eigen::VectorXd::Random(N1);
Eigen::VectorXd deriv2 = expMod->ApplyHessian(outWrt,inWrt1,inWrt2,x,sens,dir);
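Because the Hessian action is the derivative of the Gradient output in the direction dir, it can be checked with a first-order finite difference of two gradient evaluations. The sketch below assumes x is the Eigen::VectorXd input from the first snippet, a single-input model (so inWrt1 and inWrt2 both refer to x), and a step size of 1e-6 chosen here for illustration rather than prescribed by MUQ:
// Finite-difference check: perturb x along dir and difference the gradients
double eps = 1e-6;
Eigen::VectorXd xPlus = x + eps*dir;
Eigen::VectorXd gradBase = expMod->Gradient(outWrt, inWrt1, x, sens);
Eigen::VectorXd gradPert = expMod->Gradient(outWrt, inWrt1, xPlus, sens);
Eigen::VectorXd fdHess = (gradPert - gradBase) / eps;
double fdError = (deriv2 - fdHess).norm(); // should be small (first-order accurate)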