The Transition Path Ensembles

Next: Sampling the Transition Path Up: Rare Events: Path-Sampling Monte Previous: Fundamentals

The Transition Path Ensembles

In this section, we briefly recapitulate the presentation of the transition path sampling method given in Ref. [18]. Accordingly, we will switch notation a bit. We will call the indicator function , and call a state point at time :

$\begin{displaymath} h_{A,B}\left(x_t\right) = \left\{\begin{array}{ll} 1 & \mbo... ...n A,B$}\\ 0 & \mbox{if $x_t \notin A,B$} \end{array}\right. \end{displaymath}$

(280)

We consider the correlation function which measures the likelihood of finding the system in state

at time

provided that it was in state

at time 0. Now, for extremely long times,

approaches the probability to find the system in state B at equilibrium, regardless of the system starting point (i.e., ergodicity is realized). These must correspond to times much longer that the reaction time $\tau_R$ . For times approaching $\tau_R$ ,

approaches $\left<h_B\right>$ exponentially:

$\begin{displaymath} C\left(t\right) \approx \left<h_B\right>\left(1-\exp^{-t/\tau_R}\right) \end{displaymath}$

(281)

Recall that $\tau_R = \left(k_{AB}+k_{BA}\right)^{-1}$ . When this time is greater than the short-time-scale molecular relaxation time, $\tau_{\rm mol}$ ,

is a linear function of time:

$\begin{displaymath} C\left(t\right) \approx k_{AB}t \end{displaymath}$

(282)

The reactive flux,

displays a time-independent plateau in this regime which is equal to $k_{AB}$ .

One should realize that can be computed from a single molecular dynamics simulation, in principle. However, if the system dynamics is subject to rare event transitions, it may not be possible in practice to simulate long enough to achieve a statistically relevant value of . Transition path sampling is meant to overcome this limitation.

Let's consider writing as an explicit ensemble average over the equilibrium phase space probability distribution, $\rho\left(x_0\right)$ :

$\begin{displaymath} C\left(t\right) = \frac{\int dx_0 \rho\left(x_0\right)h_A\le... ...x_t\right)}{\int dx_0 \rho\left(x_0\right)h_A\left(x_0\right)} \end{displaymath}$

(283)

Now, an insight of Dellago and Chandler is that, because both the numerator and denominator of Eq. 284 are partition functions, the log of

can be interpreted as a free energy difference between two systems:

$\begin{displaymath} \Delta F = -\ln C\left(t\right) \end{displaymath}$

(284)

So, the log of

is the free energy price one must pay to constrain the endpoint of a dynamical path of length

which starts at time 0 in region

inside state

. This means that we can use (in principle) any free energy method to compute

. Dellago and Chandler chose umbrella sampling.

To see why it is advantageous to use umbrella sampling, we must first imagine an order parameter $\lambda\left(x\right)$ which indicates when we are in region in the following manner:

$\begin{displaymath} x \in B \mbox{if} \lambda_{\rm min} \le \lambda\left(x\right) \le \lambda_{\rm max} \end{displaymath}$

(285)

We now ask, how probable is it to find the system with a particular value of the order parameter, $\lambda$ , at time

? We can express this probability distribution as an ensemble average by visiting each phase space point at time 0,

, and asking does this point initiate a dynamical trajectory that lands at order parameter $\lambda$ at time

$\begin{displaymath} P\left(\lambda,t\right) = \frac{\int dx_0 \rho\left(x_0\righ... ...ght)\right]}{\int dx_0\rho\left(x_0\right)h_A\left(x_0\right)} \end{displaymath}$

(286)

Here, $\delta\left(x\right)$ is the Dirac delta function. Because region

corresponds to an interval of $\lambda$ , $C\left(t\right)$ is an integral of $P\left(\lambda,t\right)$ :

$\begin{displaymath} C\left(t\right) = \int_{\lambda_{\rm min}}^{\lambda_{\rm max}}d\lambda P\left(\lambda,t\right) \end{displaymath}$

(287)

Because transitions from

are rare, $P\left(\lambda,t\right)$ in region

is small for relevant values of time, and we can't compute it directly. So, we divide phase space into

neighboring overlapping regions

$\begin{displaymath} \mbox{(whole phase space)} = \bigcup_{i=0}^{N}B[i] \end{displaymath}$

(288)

Each region is defined by

$\begin{displaymath} x \in B[i] \mbox{if} \lambda_{\rm min}[i] \le \lambda\left(x\right) \le \lambda_{\rm max}[i] \end{displaymath}$

(289)

Neighboring regions must overlap ``a little''; i.e., $\lambda_{\rm min}[i] < \lambda_{\rm max}[i-1]$ . The size of the overlap will be considered in a bit. Now, the distribution of $\lambda$ in each window

$\begin{displaymath} P\left(\lambda,t;i\right) = \frac{\int dx_0 \rho\left(x_0\ri... ...ho\left(x_0\right)h_A\left(x_0\right)h_{B[i]}\left(x_t\right)} \end{displaymath}$

(290)

Notice that $h_{B[i]}\left(x_t\right)$ acts like a Boltzmann factor for an umbrella potential,

$\begin{displaymath} w_i = \left\{\begin{array}{ll} \infty & \mbox{if $\lambda <... ...\mbox{if $\lambda > \lambda_{\rm max}[i]$} \end{array}\right. \end{displaymath}$

(291)

And we are computing this probability distribution in window

using a phase space distribution whose Hamiltonian is modified by

; e.g., $\rho_i\left(x\right) \propto e^{-\beta\left(\mathscr{H}+w_i\right)}$ . This means when we conduct a particular MC run, we sample only within one window

The key aspect of umbrella sampling is that, inside window :

$\begin{displaymath} P\left(\lambda,t\right) \propto P\left(\lambda,t;i\right) \... ...x{for $\lambda_{\rm min}[i] < \lambda < \lambda_{\rm max}[i]$} \end{displaymath}$

(292)

This is because the denominator of $P\left(\lambda,t;i\right)$ counts only those paths that end at

, while the denominator of $P\left(\lambda,t\right)$ counts all paths that end in state

This proportionality is important, because it means that one can compute $P\left(\lambda,t;i\right)$ for each window separately using MC simulations, and then match the resulting distributions (tabulated as histograms over $\lambda$ ) in the overlapping regions, and then renormalize the entire distribution to produce $P\left(\lambda,t\right)$ . Each MC simulation performs the appropriate random walk focused in its window, and thus maximizes the statistical significance of the results obtained in each window.

Finally, when we have $P\left(\lambda,t\right)$ , one must only integrate over the appropriate values of $\lambda$ to obtain .

The notion of a ``path ensemble'' comes from interpreting Eq. 291 as an average of the quantity $\delta\left[\lambda-\lambda\left(x_t\right)\right]$ over a distribution $f_{AB[i]}\left (x_0,t\right ) = \rho \left (x_0\right )h_A\left (x_0\right )h_{B[i]}\left (x_t\right )$ . $f_{AB[i]}\left(x_0,t\right)$ is the distribution function of all initial states whose trajectories lead exactly to state in time . $P\left(\lambda,t;i\right)$ is a weighted average over these ``paths.'' $f_{AB[i]}\left(x_0,t\right)$ is therefore called a ``path ensemble.'' The average of any quantity $A\left(x_{t^\prime}\right)$ in this ensemble,

$\begin{displaymath} \left<A\left(x_{t^\prime}\right)\right> = \frac{\int dx_0 f_... ...eft(x_0\right)\right]}{\int dx_0 f_{AB[i]}\left(x_0,t\right)}, \end{displaymath}$

(293)

is called a ``path average.''

portrait

A schematic of the first transition path ensemble, $f_{AB[i]}\left (x_0,t\right ) = \rho \left (x_0\right )h_A\left (x_0\right )h_{B[i]}\left (x_t\right )$ .

Let's take stock: We know that . In order for to be a constant, we need to ensure that is linear in time, meaning we must evaluate for many values of . Each evaluation of is a ``free energy'' calculation, so getting at this way may be prohibitively expensive. Another insight of Dellago and Chandler [18] was to recognize that a simple factorization of leads to an algorithm in which one need only do a single free energy calculation. Consider:

$\begin{displaymath} C\left(t\right) = \frac{\left<h_Ah_B\left(t\right)\right>}{\... ..._B\left(t^\prime\right)\right>} \times C\left(t^\prime\right), \end{displaymath}$

(294)

where both

and $t^\prime$ are in an interval denoted

. For notational convenience: $h_A\equiv h_A\left(x_0\right)$ and $h_B(t)\equiv h_B\left(x_t\right)$ .

Next, we define a new indicator function as a property of the interval :

$\begin{displaymath} H_B\left(x_0,T\right) \equiv \max_{0\le t\le T} h_B\left(x_t\right) \end{displaymath}$

(295)

which tells us if the trajectory begun at

visits state

at least once during the interval

. Since

for all $t \in \left[0,T\right]$ , and

otherwise, we can insert it into our factorized expression for

$\begin{displaymath} C\left(t\right) = \frac{\left<h_Ah_B\left(t\right)H_B\left(T... ...right)H_B\left(T\right)\right>} \times C\left(t^\prime\right), \end{displaymath}$

(296)

(We've also multiplied and divided by $\left<h_AH_B\left(T\right)\right>$ .)

If we stare at Eq. 297 long enough, we see that the quantity

$\displaystyle \left<h_B\left(t\right)\right>_{AB}$	$\textstyle \equiv$	$\displaystyle \frac{\left<h_Ah_B\left(t\right)H_B\left(T\right)\right>}{\left<h_AH_B\left(T\right)\right>}$	(297)
	$\textstyle =$	$\displaystyle \frac{\int dx_0\rho\left(x_0\right)h_A\left(x_0\right)H_B\left(x_... ...\right)}{\int dx_0\rho\left(x_0\right)h_A\left(x_0\right)H_B\left(x_0,T\right)}$	(298)

is an average of

over the distribution function

$\begin{displaymath} F_{AB}\left(x_0,T\right) \equiv \rho\left(x_0\right)h_A\left(x_0\right)H_B\left(x_0,T\right) \end{displaymath}$

(299)

This is the ensemble of all paths that begin in

and visit

at least once in the interval

. (It therefore differs from $f_{AB}\left(x_0,T\right)$ .)

portrait

A schematic of the second transition path ensemble, $F_{AB}\left (x_0,T\right ) \equiv \rho \left (x_0\right )h_A\left (x_0\right )H_B\left (x_0,T\right )$ .

Using this notation to denote averaging over this ensemble, $\left<\cdots\right>_{AB}$ ,

$\begin{displaymath} C\left(t\right) = \frac{\left<h_B\left(t\right)\right>_{AB}}... ...ft<h_B\left(t^\prime\right)\right>_{AB}}C\left(t^\prime\right) \end{displaymath}$

(300)

We can efficiently calculate $\left<h_B\left(t\right)\right>_{AB}$ by sampling $F_{AB}\left(x_0,T\right)$ in a single simulation. $C(t^\prime)$ can be calculated from a single free-energy umbrella-sampling calculation. This provides a recipe for obtaining

Perform path sampling on $F_{AB}\left(x_0,T\right)$ to obtain the function $\left<h_B\left(t\right)\right>_{AB}$ on . If $\frac{d}{dt}\left<h_B\left(t\right)\right>_{AB}$ does not display a plateau, repeat with a larger value of .
Choose a time $t^\prime$ (which can be much less than ) and compute $P\left(\lambda,t^\prime\right)$ by umbrella sampling to get $C(t^\prime)$ by integration of Eq. 288. Because of step 1, $\left<h_B\left(t^\prime\right)\right>_{AB}$ is known.
Calculate .
Calculate as

$\begin{displaymath} k\left(t\right) = \frac{dC\left(t\right)}{dt} = \frac{\left<... ...left(t^\prime\right)\right>_{AB}}\times C\left(t^\prime\right) \end{displaymath}$ (301)

Next: Sampling the Transition Path Up: Rare Events: Path-Sampling Monte Previous: Fundamentals

cfa22@drexel.edu