About TEBio-Fit
Reasoning about sets of nonlinear dynamical systems is hard. Good visualization systems can make it easier.
TEBio-Fit provides interactive visualization to facilitate parameter estimation and model comparison
for general nonlinear models.
Currently, it is focused on displaying the results produced by
ABC-SysBio, a tool for likelihood-free
parameter inference and model selection in dynamical systems that uses the Approximate Bayesian Computation
Sequential Monte Carlo (ABC SMC) approach described below. However, it could be extended to view the results
produced by other tools (for example, the results of applying MCMC to models for which a likelihood
function is available).
Design Principles
TEBio-Fit was designed with several principles in mind:
- easy setup: the viewer runs in any modern web browser, on any platform, with nothing to install. Pre-processing requires only Python scripts with minimal dependencies.
- modularity: the code that visually compares two SBML models is a separate Python module that can be re-used in other packages, and the visualisation code is entirely independent of the code that performs inference.
- freely licensed: the BSD license imposes few restrictions on re-use.
- interactivity: the user can see all the data, yet also focus only on what is important to them at a particular moment.
- integrated explanations: figures are directly integrated with explanations of what they show. The purpose is two-fold: it helps the novice user by providing explanations at the point of need, rather than making them refer back to documentation, and it produces a self-contained report that can be presented to a collaborator or supervisor who may be unfamiliar with the tools used.
- easy export of images: figures can be exported and, optionally, edited in third-party applications.
Theory
Suppose we have some initial idea of what values the parameters in a model might have, which we can express as a prior probability distribution. We then perform an experiment, obtaining measurements $x_0$. We now want to use these measurements to update our knowledge about the likely values of the parameters.
The way to do this is to obtain the posterior distribution by applying Bayes' rule: $$ \underbrace{p(\theta | x_0)}_{\text{posterior}} = \frac{ \overbrace{p(x_0 | \theta)}^{\text{likelihood}} \; \overbrace{p(\theta)}^{\text{prior}} }{ p(x_0) } $$
For sufficiently simple models, we can find an analytical expression for the posterior probability distribution function. More commonly, this is impossible, but we can draw samples from it using a Monte Carlo method. For more complicated models, we may not even be able to obtain an expression for the likelihood function, but if we can perform simulations we can apply an Approximate Bayesian Computation (ABC) method.
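As a concrete (and purely illustrative) instance of the first case, a Beta prior on a Bernoulli success probability is conjugate to a Binomial likelihood, so Bayes' rule reduces to simple parameter arithmetic. The sketch below assumes this toy setting; it is not part of TEBio-Fit:

```python
# Hypothetical example: Beta-Binomial conjugacy, where Bayes' rule has a
# closed form. A Beta(a, b) prior on a success probability, updated with
# k successes out of n trials, yields a Beta(a + k, b + n - k) posterior.
def beta_binomial_posterior(a, b, k, n):
    """Return the parameters of the posterior Beta distribution."""
    return a + k, b + n - k

# Uniform prior Beta(1, 1); observed data x_0: 7 successes in 10 trials.
post_a, post_b = beta_binomial_posterior(1, 1, 7, 10)
posterior_mean = post_a / (post_a + post_b)  # a / (a + b) for a Beta(a, b)
```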
Three Monte-Carlo methods, and their ABC equivalents, are summarised below. For clarity, the
prior is denoted by $\pi(\theta)$.

General Approach

Monte-Carlo method:
- Given $x_0$ and $\theta$, we can evaluate the likelihood $\color{blue}{ p(x_0 | \theta) }$.
- Samples the exact posterior $p(\theta | x_0)$.

ABC method:
- Given a candidate $\theta^*$, we cannot evaluate $\color{blue}{ p(x_0 | \theta^*) }$, but we can draw a sample $x^*$ from $p(x | \theta^*)$ by performing a simulation, and accept $\theta^*$ only if $x^*$ is sufficiently close to $x_0$.
- Samples the approximate posterior $p(\theta | \color{red}{ d(x^*, x_0) } \leq \epsilon )$, where $\epsilon$ is a parameter that determines the closeness of the approximation, and $d$ is a distance function.
Rejection Method

Monte-Carlo method:
- Sample $\theta^*$ from $\pi(\theta)$
- Sample $u$ from $U(0, 1)$
- If $u < \color{blue}{ p(x_0 | \theta^*) }$, accept $\theta^*$

ABC method (reference: Approximate Bayesian Computation and MCMC):
- Sample $\theta^*$ from $\pi(\theta)$
- Sample $x^*$ from $p(x | \theta^*)$, by simulation
- If $\color{red}{ d(x_0, x^*) } \leq \epsilon$, accept $\theta^*$
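The ABC rejection scheme can be sketched in a few lines of Python. Everything here is a hypothetical toy example rather than part of TEBio-Fit or ABC-SysBio: the "dynamical system" is a one-parameter exponential decay $x(t) = e^{-\theta t}$ observed at four time-points, the prior is Uniform(0, 3), and $d$ is the Euclidean distance:

```python
import math
import random

def simulate(theta, times=(0.5, 1.0, 1.5, 2.0)):
    # Toy dynamical system: x(t) = exp(-theta * t), sampled at fixed times.
    return [math.exp(-theta * t) for t in times]

def distance(x_a, x_b):
    # Euclidean distance between two trajectories.
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x_a, x_b)))

def abc_rejection(x0, sample_prior, eps, n_accept, rng):
    """ABC rejection: accept theta* whenever d(x*, x0) <= eps."""
    accepted = []
    while len(accepted) < n_accept:
        theta_star = sample_prior(rng)      # theta* ~ pi(theta)
        x_star = simulate(theta_star)       # x* ~ p(x | theta*), by simulation
        if distance(x_star, x0) <= eps:     # keep only near-matching simulations
            accepted.append(theta_star)
    return accepted

rng = random.Random(0)
x0 = simulate(1.0)                          # synthetic data with true theta = 1.0
prior = lambda r: r.uniform(0.0, 3.0)       # Uniform(0, 3) prior
samples = abc_rejection(x0, prior, eps=0.05, n_accept=200, rng=rng)
```

Shrinking $\epsilon$ tightens the approximation to the posterior, at the price of more rejected simulations.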
MCMC

Monte-Carlo method:
Initialise $\theta^{(1)}$. For $i = 1 \ldots N$:
- Sample $\theta^*$ from the proposal distribution $Q(\theta^* | \theta^{(i)})$
- Sample $u$ from $U(0, 1)$
- If $u < \min \left( 1, \frac{ \color{blue}{ p(x_0 | \theta^*) } \, \pi(\theta^*) \, Q(\theta^{(i)} | \theta^*) }{ \color{blue}{ p(x_0 | \theta^{(i)}) } \, \pi(\theta^{(i)}) \, Q(\theta^* | \theta^{(i)}) } \right)$, then $\theta^{(i+1)} = \theta^*$; otherwise $\theta^{(i+1)} = \theta^{(i)}$

ABC method (reference: Markov chain Monte Carlo without likelihoods):
Initialise $\theta^{(1)}$. For $i = 1 \ldots N$:
- Sample $\theta^*$ from the proposal distribution $Q(\theta^* | \theta^{(i)})$
- Sample $x^*$ from $p(x | \theta^*)$, by simulation
- If $\color{red}{ d(x_0, x^*) } > \epsilon$, then set $\theta^{(i+1)} = \theta^{(i)}$ and move on to the next $i$
- Sample $u$ from $U(0, 1)$
- If $u < \min \left( 1, \frac{ \pi(\theta^*) \, Q(\theta^{(i)} | \theta^*) }{ \pi(\theta^{(i)}) \, Q(\theta^* | \theta^{(i)}) } \right)$, then $\theta^{(i+1)} = \theta^*$; otherwise $\theta^{(i+1)} = \theta^{(i)}$
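A likelihood-free MCMC chain for the same kind of toy problem (exponential decay, Uniform(0, 3) prior; again purely illustrative, not ABC-SysBio code) might look like the sketch below. A symmetric Gaussian proposal makes the $Q$ ratio cancel, and a failed simulation counts as a rejected move, so the chain stays where it is:

```python
import math
import random

def simulate(theta, times=(0.5, 1.0, 1.5, 2.0)):
    # Toy dynamical system: x(t) = exp(-theta * t), sampled at fixed times.
    return [math.exp(-theta * t) for t in times]

def distance(x_a, x_b):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x_a, x_b)))

def prior_density(theta):
    # Uniform(0, 3) prior density.
    return 1.0 / 3.0 if 0.0 <= theta <= 3.0 else 0.0

def abc_mcmc(x0, theta_init, eps, n_steps, step=0.2, seed=0):
    """Likelihood-free MCMC: a simulation stands in for the likelihood term,
    leaving only the prior (and, if asymmetric, proposal) terms in the ratio."""
    rng = random.Random(seed)
    chain = [theta_init]
    for _ in range(n_steps):
        theta = chain[-1]
        theta_star = rng.gauss(theta, step)   # symmetric proposal Q
        x_star = simulate(theta_star)         # x* ~ p(x | theta*)
        accept = (
            distance(x_star, x0) <= eps       # simulation close to data, and
            and rng.random() < min(1.0, prior_density(theta_star) / prior_density(theta))
        )
        chain.append(theta_star if accept else theta)
    return chain

chain = abc_mcmc(simulate(1.0), theta_init=1.5, eps=0.1, n_steps=2000)
```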
SIS

Monte-Carlo method:
For $t = 0 \ldots T$ and, within each population, for $i = 1 \ldots N$:
- START: if $t = 0$, sample $\theta^{**}$ independently from $\pi(\theta)$; if $t > 0$, sample $\theta^*$ from the previous population $\{ \theta_{t-1}^{(j)} \}$ with weights $w_{t-1}$, then obtain $\theta^{**}$ by perturbing $\theta^*$ with the kernel $K_t$
- If $\pi(\theta^{**}) = 0$, then GOTO START
- Set $\theta_t^{(i)} = \theta^{**}$
- If $t = 0$, set $w_t^{(i)} = 1$; otherwise $$w_t^{(i)} = \frac{ \color{blue}{ p(x_0 | \theta_t^{(i)}) } \, \pi(\theta_t^{(i)}) }{ \sum_{j=1}^{N} w_{t-1}^{(j)} K_t( \theta_t^{(i)} | \theta_{t-1}^{(j)} ) }$$
Once all $N$ particles have been accepted, normalise the weights $w_t^{(i)}$.

ABC method (reference: Approximate Bayesian computation scheme for parameter inference and model selection in dynamical systems):
For $t = 0 \ldots T$ and, within each population, for $i = 1 \ldots N$:
- START: if $t = 0$, sample $\theta^{**}$ independently from $\pi(\theta)$; if $t > 0$, sample $\theta^*$ from the previous population $\{ \theta_{t-1}^{(j)} \}$ with weights $w_{t-1}$, then obtain $\theta^{**}$ by perturbing $\theta^*$ with the kernel $K_t$
- If $\pi(\theta^{**}) = 0$, then GOTO START
- Sample $x^*$ from $p(x | \theta^{**})$, by simulation
- If $\color{red}{ d(x^*, x_0) } > \epsilon_t$, then GOTO START
- Set $\theta_t^{(i)} = \theta^{**}$
- If $t = 0$, set $w_t^{(i)} = 1$; otherwise $$w_t^{(i)} = \frac{ \pi(\theta_t^{(i)}) }{ \sum_{j=1}^{N} w_{t-1}^{(j)} K_t( \theta_t^{(i)} | \theta_{t-1}^{(j)} ) }$$
Once all $N$ particles have been accepted, normalise the weights $w_t^{(i)}$.
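The population and weight bookkeeping of the ABC SMC scheme can be sketched as follows, again for a hypothetical toy model (exponential decay, Uniform(0, 3) prior, Gaussian perturbation kernel $K_t$). This is an illustration of the algorithm above, not ABC-SysBio's implementation:

```python
import math
import random

PRIOR_LO, PRIOR_HI = 0.0, 3.0               # Uniform(0, 3) prior

def simulate(theta, times=(0.5, 1.0, 1.5, 2.0)):
    # Toy dynamical system: x(t) = exp(-theta * t), sampled at fixed times.
    return [math.exp(-theta * t) for t in times]

def distance(x_a, x_b):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x_a, x_b)))

def prior_density(theta):
    return 1.0 / (PRIOR_HI - PRIOR_LO) if PRIOR_LO <= theta <= PRIOR_HI else 0.0

def kernel_density(theta, theta_prev, scale):
    # Gaussian perturbation kernel K_t(theta | theta_prev).
    z = (theta - theta_prev) / scale
    return math.exp(-0.5 * z * z) / (scale * math.sqrt(2.0 * math.pi))

def abc_smc(x0, eps_schedule, n_particles, scale=0.1, seed=0):
    """ABC SMC: propagate a weighted particle population through a
    decreasing schedule of tolerances eps_0 > eps_1 > ... > eps_T."""
    rng = random.Random(seed)
    thetas, weights = [], []
    for t, eps in enumerate(eps_schedule):
        new_thetas, new_weights = [], []
        while len(new_thetas) < n_particles:
            if t == 0:
                theta_ss = rng.uniform(PRIOR_LO, PRIOR_HI)   # sample the prior
            else:
                # Resample from the previous population, then perturb with K_t.
                theta_s = rng.choices(thetas, weights=weights)[0]
                theta_ss = rng.gauss(theta_s, scale)
            if prior_density(theta_ss) == 0.0:
                continue                                     # outside the prior
            if distance(simulate(theta_ss), x0) > eps:
                continue                                     # simulation too far
            new_thetas.append(theta_ss)
            if t == 0:
                new_weights.append(1.0)
            else:
                denom = sum(w * kernel_density(theta_ss, tp, scale)
                            for w, tp in zip(weights, thetas))
                new_weights.append(prior_density(theta_ss) / denom)
        total = sum(new_weights)
        thetas = new_thetas
        weights = [w / total for w in new_weights]           # normalise
    return thetas, weights

thetas, weights = abc_smc(simulate(1.0), eps_schedule=[0.5, 0.2, 0.1], n_particles=100)
```

The tolerances in `eps_schedule` decrease, so each population is drawn from a closer approximation to the posterior, while resampling from the previous population keeps acceptance rates manageable.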
Note that there is a progression here: the rejection method draws each sample independently of its
predecessors; the MCMC method uses the previous sample when drawing the next; and the SIS method keeps track of a whole population of samples at each time-point.
The SIS method also makes use of a series of thresholds $\epsilon_t$. These follow a decreasing schedule, so
that samples from successive populations are drawn from ever closer approximations to the true posterior.