
A Julia interface to cmdstan

CmdStan.jl

Stan is a system for statistical modeling, data analysis, and prediction. It is used extensively in the social, biological, and physical sciences, engineering, and business. The Stan programming language and its interfaces are documented on the Stan website.

cmdstan is the shell (command-line) interface for running Stan language programs.

CmdStan.jl wraps cmdstan and captures the samples for further processing.

StanJulia

CmdStan.jl is one of the packages in the StanJulia GitHub organization. It captures draws from a Stan language program and returns an array of values for each accepted draw of each monitored variable in all chains.
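To make the shape of such an array concrete, here is a minimal sketch that does not call cmdstan at all. It fills a stand-in array with random numbers and assumes a draws × variables × chains layout; the variable names, sizes, and values are made up for illustration.

```julia
using Random, Statistics

# Stand-in for the draws a Stan run would produce; no cmdstan run happens here.
rng = MersenneTwister(1)
n_draws, n_chains = 1000, 4
cnames = ["lp__", "theta"]                            # assumed variable names
a3d = rand(rng, n_draws, length(cnames), n_chains)    # assumed layout: draws × variables × chains

# Select one monitored variable by name.
theta_idx = findfirst(==("theta"), cnames)

# Pool the draws of that variable over all chains into one vector.
theta = vec(a3d[:, theta_idx, :])

# Mean of the same variable within each chain.
chain_means = vec(mean(a3d[:, theta_idx, :], dims=1))
```

Pooled vectors like `theta` are convenient for quick summaries, while the per-chain view is what convergence diagnostics typically work from.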

The other packages in StanJulia are extensions, or handle postprocessing of the draws or plotting of the results. Wherever possible, they build on the MCMCChains.jl package to make comparisons with other MCMC packages easier.

At a very high level, a typical workflow that uses StanJulia and hands postprocessing to TuringLang's MCMCChains.jl looks like:

# MCMCChains is needed for hpd(); StatsPlots is needed for plot().
using CmdStan, StatsBase, MCMCChains, StatsPlots

# Define a Stan language program.
bernoulli = "..."

# Prepare for calling cmdstan.
stanmodel = Stanmodel(...)

# Compile and run the Stan program, collect draws.
# rc is the return code, chns the draws, cnames the variable names.
rc, chns, cnames = stan(...)

# Summary of the result.
describe(chns)

# Example of postprocessing, e.g. Highest Posterior Density Interval.
MCMCChains.hpd(chns)

# Plot the draws.
plot(chns)

This workflow creates an MCMCChains.Chains object for summarizing, diagnostics, plotting and further processing.

A similar workflow is available for Mamba through StanMamba.jl. Another option is to convert the array of draws to a DataFrame using StanDataFrames.jl.

The default value for the output_format argument in Stanmodel() is :mcmcchains, which causes stan() to call a conversion method, convert_a3d(), that returns the MCMCChains.Chains object.

Other values for output_format are available, namely :array, :namedarray, :dataframe and :mambachain. CmdStan.jl itself provides the options :mcmcchains, :array and :namedarray. The methods for :dataframe and :mambachain are provided by StanDataFrames.jl and StanMamba.jl, respectively.
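One way a symbol option can select a conversion method, and how another package could add a new format, is dispatch on `Val`. The sketch below illustrates the pattern only; `to_output` is a hypothetical name, and this is not CmdStan.jl's actual code.

```julia
# Hypothetical converter; the name `to_output` is made up for this sketch.
# The Symbol is lifted into the type domain with Val so dispatch picks the method.
to_output(a3d, cnames, format::Symbol) = to_output(a3d, cnames, Val(format))

# :array hands back the raw draws untouched.
to_output(a3d, cnames, ::Val{:array}) = a3d

# Any format without a dedicated method fails with a readable error.
to_output(a3d, cnames, ::Val{F}) where {F} =
    throw(ArgumentError("unsupported output_format: $F"))

a3d = zeros(10, 2, 4)            # stand-in draws, not real Stan output
cnames = ["lp__", "theta"]       # assumed variable names
raw = to_output(a3d, cnames, :array)
```

With this pattern, an add-on package supports a new format by defining one more method for its own `Val` type, without touching the core package.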

Other MCMC options in Julia

Mamba.jl, Klara.jl, DynamicHMC.jl and Turing.jl are other Julia packages to run MCMC models (all in pure Julia!). Several other packages that address aspects of MCMC sampling are available.

Of particular interest might be the ongoing work in DiffEqBayes.jl on using MCMC for ODE parameter estimation.

Jags.jl is another option, but like StanJulia/CmdStan.jl, Jags runs as an external program.

References

There is no shortage of good books on Bayesian statistics. A few of my favorites are:

  1. Bolstad: Introduction to Bayesian Statistics

  2. Bolstad: Understanding Computational Bayesian Statistics

  3. Gelman, Hill: Data Analysis Using Regression and Multilevel/Hierarchical Models

  4. McElreath: Statistical Rethinking

  5. Gelman, Carlin, and others: Bayesian Data Analysis

and a great read (and implementation in DynamicHMC.jl):

  1. Betancourt: A Conceptual Introduction to Hamiltonian Monte Carlo