Example Random walk Metropolis MCMC using a custom proposal covariance
Implement a simple Metropolis-Hastings MCMC method, choosing a custom covariance for a Gaussian proposal.
Overview
Implement a simple Metropolis-Hastings MCMC method, choosing a custom covariance for a Gaussian proposal.
Overview
This example demonstrates how to manually specify the proposal covariance in a simple random walk proposal.
MUQ MCMC Interfaces
There are two ways to specify MCMC samplers in MUQ. The first, which is demonstrated in the Example1_Gaussian example, specifies the proposal variance and other algorithmic parameters through a boost property tree. A lower level interface also exists, which allows users to manually specify the proposal distribution and transition kernel. In this latter setting, the boost property tree is then used only for chain-level settings like the number of steps, burn in, and print levels.
#include "MUQ/Modeling/Distributions/Gaussian.h" #include "MUQ/Modeling/Distributions/Density.h" #include "MUQ/SamplingAlgorithms/SamplingProblem.h" #include "MUQ/SamplingAlgorithms/SingleChainMCMC.h" #include "MUQ/SamplingAlgorithms/MHProposal.h" #include "MUQ/SamplingAlgorithms/MHKernel.h" #include <boost/property_tree/ptree.hpp> namespace pt = boost::property_tree; using namespace muq::Modeling; using namespace muq::SamplingAlgorithms; using namespace muq::Utilities; int main(){
1. Define the target density and set up the sampling problem
The Gaussian
class in MUQ provides a definition of a multivariate Gaussian
distribution. The distribution is completely defined by a mean vector and a
covariance (or precision) matrix. In MUQ, modeling components like a Gaussian
probability density, are represented as children of the ModPiece
base class.
However, interpreting the Gaussian as as a ModPiece
is ambiguous. The
Gaussian class could be used to define represent a random variable "ModPiece"
that returns a random realization of the multivariate Gaussian when evaluated.
Or the Gaussian class could define a "ModPiece" that evaluates the log
probability density function of the Gaussian distribution. To select one of
these two use cases, the MUQ Gaussian
class has member functions AsDensity()
and AsVariable()
that return a shared_ptr to a class that evaluates the log PDF,
or a class that draws a random sample, respectively. These functions are
implemented in the Distribution
class, which is a parent of the Gaussian
class.
The AbstractSamplingProblem base class and its children, like the SamplingProblem class, define the interface between sampling algorithms like MCMC and the models and densities they work with.
Define the Target Density:
Eigen::VectorXd mu(2); mu << 1.0, 2.0; Eigen::MatrixXd cov(2,2); cov << 1.0, 0.8, 0.8, 1.5; auto targetDensity = std::make_shared<Gaussian>(mu, cov)->AsDensity(); // standard normal Gaussian
Create the Sampling Problem:
auto problem = std::make_shared<SamplingProblem>(targetDensity);
3. Construct the RWM proposal
Let $x_k$ denote the $k^{th}$ state in a Markov chain. At step $k$, the Random Walk Metropolis algorithm draws a random realization of the proposal random variable $x^\prime = x_k + z$, where $z\sim q(z)$ is a random variable distributed according to the proposal distribution $q(z)$. The proposed point is then accepted or rejected using the Metropolis-Hastings rule.
The MUQ RWM implementation allows users to specify the proposal distributions
$q(z)$ either through options in a boost property_tree (see the Example1_Gaussian
example) or manually (demonstrated below). Here, we employ the latter approach
and manually specify an instance of MUQ's MHProposal
class, which is then
combined with the MHKernel
Markov transition kernel to define the RWM algorithm.
Define the proposal distribution.
Eigen::VectorXd propMu = Eigen::VectorXd::Zero(2); // Set the proposal covariance to be the optimal scaling of the target covariance Eigen::MatrixXd propCov = (2.4*2.4/std::sqrt(2))*cov; auto propDist = std::make_shared<Gaussian>(propMu, propCov);
Use the Gaussian proposal distribution to define an MCMC proposal class.
pt::ptree propOpts; auto proposal = std::make_shared<MHProposal>(propOpts, problem, propDist);
Construct the Metropolis-Hastings (MH) Markov transition kernel using the proposal.
MUQ can perform blockwise updates of target densities that have multiple
vector-valued inputs (e.g., parameters and hyperparameters). The
SingleChainMCMC
class employed below therefore expects a transition kernel
for each parameter block. These kernels are passed to the SingleChainMCMC
constructor as a std::vector
of transition kernels. Here, we only have a
single kernel but we store the kernel in a vector to match the interface
expected by SingleChainMCMC
.
pt::ptree kernOpts; std::vector<std::shared_ptr<TransitionKernel>> kernels(1); kernels.at(0) = std::make_shared<MHKernel>(kernOpts, problem, proposal);
Use the kernel to define a single chain MCMC algorithm.
pt::ptree chainOpts; chainOpts.put("NumSamples", 1e4); // number of MCMC steps chainOpts.put("BurnIn", 1e3); chainOpts.put("PrintLevel",3); auto mcmc = std::make_shared<SingleChainMCMC>(chainOpts,kernels);
3. Run the MCMC algorithm
We are now ready to run the MCMC algorithm. Here we start the chain at the target densities mean. The resulting samples are returned in an instance of the SampleCollection class, which internally holds the steps in the MCMC chain as a vector of weighted SamplingState's.
Eigen::VectorXd startPt = mu; std::shared_ptr<SampleCollection> samps = mcmc->Run(startPt);
4. Analyze the results
When looking at the entries in a SampleCollection, it is important to note that the states stored by a SampleCollection are weighted even in the MCMC setting. When a proposal $x^\prime$ is rejected, instead of making a copy of $x^{(k-1)}$ for $x^{(k)}$, the weight on $x^{(k-1)}$ is simply incremented. This is useful for large chains in high dimensional parameter spaces, where storing all duplicates could quickly consume available memory.
The SampleCollection class provides several functions for computing sample moments. For example, here we compute the mean, variance, and third central moment. While the third moment is actually a tensor, here we only return the marginal values, i.e., $\mathbb{E}_x[(x_i-\mu_i)^3]$ for each $i$.
Eigen::VectorXd sampMean = samps->Mean(); std::cout << "\nSample Mean:\n" << sampMean.transpose() << std::endl; Eigen::VectorXd sampVar = samps->Variance(); std::cout << "\nSample Variance:\n" << sampVar.transpose() << std::endl; Eigen::MatrixXd sampCov = samps->Covariance(); std::cout << "\nSample Covariance:\n" << sampCov << std::endl; Eigen::VectorXd sampMom3 = samps->CentralMoment(3); std::cout << "\nSample Third Moment:\n" << sampMom3.transpose() << std::endl << std::endl;
5. Inspect the sample meta data.
In addition to storing the state of the MCMC chain, MUQ may also store extra
information stored by the transition kernel or proposal. This "metadata"
is store in the SampleCollection
and can be accessed using the GetMeta
function of the SampleCollection
class. It is also possible to list what
metadata is available using the ListMeta
function.` Below, we first list
all of the available metadata and then extract the log of the target density.
Note that the GetMeta
function returns a matrix of doubles regardless of the
size or type of the metadata. Each column of the matrix corresponds to a sample
and each row of the matrix to a component of the (possibly) vector-valued
metadata variable. For scalar values, like the log density, there will only
be a single row.
// List the name of any metadata stored by the samples std::cout << "Available MetaData: " << std::endl; for(auto& metaKey : samps->ListMeta()) std::cout << " \"" << metaKey << "\"" << std::endl; std::cout << std::endl; // Extract the log-density of the target distribution at each sample Eigen::MatrixXd logTargetDens = samps->GetMeta("LogTarget"); // Compute the maximum log target density and store the sample index where it occured double maxLogDens; unsigned int maxRow, maxCol; maxLogDens = logTargetDens.maxCoeff(&maxRow, &maxCol); std::cout << "From MetaData:" << std::endl; std::cout << " p* = max log(p(x)) = " << maxLogDens << std::endl; std::cout << " x* = argmax p(x) = " << samps->at(maxCol)->state.at(0).transpose() << std::endl;
6. Extract samples as a matrix for further processing.
In some cases you will want to extract a matrix of the MCMC states from the
SampleCollection
object. As shown below, that can be easily accomplished
using the AsMatrix
function in the SampleCollection
class.
Eigen::MatrixXd sampMat = samps->AsMatrix(); std::cout << "\nMean using eigen = " << sampMat.rowwise().mean().transpose() << std::endl; return 0; }
Complete Code
#include "MUQ/Modeling/Distributions/Gaussian.h" #include "MUQ/Modeling/Distributions/Density.h" #include "MUQ/SamplingAlgorithms/SamplingProblem.h" #include "MUQ/SamplingAlgorithms/SingleChainMCMC.h" #include "MUQ/SamplingAlgorithms/MHProposal.h" #include "MUQ/SamplingAlgorithms/MHKernel.h" #include <boost/property_tree/ptree.hpp> namespace pt = boost::property_tree; using namespace muq::Modeling; using namespace muq::SamplingAlgorithms; using namespace muq::Utilities; int main(){ Eigen::VectorXd mu(2); mu << 1.0, 2.0; Eigen::MatrixXd cov(2,2); cov << 1.0, 0.8, 0.8, 1.5; auto targetDensity = std::make_shared<Gaussian>(mu, cov)->AsDensity(); // standard normal Gaussian auto problem = std::make_shared<SamplingProblem>(targetDensity); Eigen::VectorXd propMu = Eigen::VectorXd::Zero(2); // Set the proposal covariance to be the optimal scaling of the target covariance Eigen::MatrixXd propCov = (2.4*2.4/std::sqrt(2))*cov; auto propDist = std::make_shared<Gaussian>(propMu, propCov); pt::ptree propOpts; auto proposal = std::make_shared<MHProposal>(propOpts, problem, propDist); pt::ptree kernOpts; std::vector<std::shared_ptr<TransitionKernel>> kernels(1); kernels.at(0) = std::make_shared<MHKernel>(kernOpts, problem, proposal); pt::ptree chainOpts; chainOpts.put("NumSamples", 1e4); // number of MCMC steps chainOpts.put("BurnIn", 1e3); chainOpts.put("PrintLevel",3); auto mcmc = std::make_shared<SingleChainMCMC>(chainOpts,kernels); Eigen::VectorXd startPt = mu; std::shared_ptr<SampleCollection> samps = mcmc->Run(startPt); Eigen::VectorXd sampMean = samps->Mean(); std::cout << "\nSample Mean:\n" << sampMean.transpose() << std::endl; Eigen::VectorXd sampVar = samps->Variance(); std::cout << "\nSample Variance:\n" << sampVar.transpose() << std::endl; Eigen::MatrixXd sampCov = samps->Covariance(); std::cout << "\nSample Covariance:\n" << sampCov << std::endl; Eigen::VectorXd sampMom3 = samps->CentralMoment(3); std::cout << "\nSample Third Moment:\n" << sampMom3.transpose() << std::endl << std::endl; // List the name of any metadata stored by the samples std::cout << "Available MetaData: " << std::endl; for(auto& metaKey : samps->ListMeta()) std::cout << " \"" << metaKey << "\"" << std::endl; std::cout << std::endl; // Extract the log-density of the target distribution at each sample Eigen::MatrixXd logTargetDens = samps->GetMeta("LogTarget"); // Compute the maximum log target density and store the sample index where it occured double maxLogDens; unsigned int maxRow, maxCol; maxLogDens = logTargetDens.maxCoeff(&maxRow, &maxCol); std::cout << "From MetaData:" << std::endl; std::cout << " p* = max log(p(x)) = " << maxLogDens << std::endl; std::cout << " x* = argmax p(x) = " << samps->at(maxCol)->state.at(0).transpose() << std::endl; Eigen::MatrixXd sampMat = samps->AsMatrix(); std::cout << "\nMean using eigen = " << sampMat.rowwise().mean().transpose() << std::endl; return 0; }
- Join our slack workspace to connect with other users, get help, and discuss new features.
This material is based upon work supported by the National Science Foundation under Grant No. 1550487.
Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.
This material is based upon work supported by the US Department of Energy, Office of Advanced Scientific Computing Research, SciDAC (Scientific Discovery through Advanced Computing) program under awards DE-SC0007099 and DE-SC0021226, for the QUEST and FASTMath SciDAC Institutes.