Sufficient and necessary conditions on random walk to obtain standard diffusion equation

Question

In the simplest random walk model that is generally considered, the probability $P(x,t)$ of finding the particle at position $x$ at time $t$ is given by $$ P(x,t) = \frac{1}{2}\big[ P(x-a, t-\tau) + P(x+a, t-\tau) \big] $$ where $a$ is the fixed spacing between two adjacent points and $\tau$ is the time interval in which the particle performs the jump $(x-a) \rightarrow x$ or $(x+a) \rightarrow x$.

It is crucial to note here that $\tau$ and $a$ are fixed, non-random real numbers, and the only probabilistic aspect is the choice to jump to the right or to the left. In other words, the only randomness incorporated in this model is the direction of motion.

In the limit $a, \tau \rightarrow 0$ and $a^2/(2\tau) \rightarrow D$, the above simply reduces to the famous diffusion equation given by, $$ \frac{\partial P(x,t)}{\partial t} = D \frac{\partial^2 P(x,t)}{\partial x^2} $$
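As a sketch of this limiting step (assuming $P$ is smooth enough to be Taylor-expanded in both arguments, and writing $P$ for $P(x,t)$ on the right-hand sides):

$$ \begin{aligned} P(x \pm a, t-\tau) &= P \pm a\,\partial_x P + \frac{a^2}{2}\,\partial_x^2 P - \tau\,\partial_t P + \dots, \\ \frac{1}{2}\big[ P(x-a, t-\tau) + P(x+a, t-\tau) \big] &= P + \frac{a^2}{2}\,\partial_x^2 P - \tau\,\partial_t P + \dots \end{aligned} $$

Setting the last line equal to $P(x,t)$, the terms odd in $a$ cancel, and dividing by $\tau$ gives $\partial_t P = \frac{a^2}{2\tau}\,\partial_x^2 P$, up to corrections that vanish in the stated limit.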

This raises the question of the true nature of the underlying randomness in the diffusion equation. In the seminal paper by Einstein [1], the diffusion equation is derived under the condition that the random walk is propagated by random jumps (sampled from a distribution $\lambda(x)$) taking place at random times (a Poisson process). Unlike in Einstein's original article [1], the derivation above assumes randomness only in the direction of motion.

Therefore, does the diffusion equation not require the random jump lengths at random times but only a randomness in the direction of motion?

[1] Einstein, Albert (1905). "Über die von der molekularkinetischen Theorie der Wärme geforderte Bewegung von in ruhenden Flüssigkeiten suspendierten Teilchen" ["On the Movement of Small Particles Suspended in Stationary Liquids Required by the Molecular-Kinetic Theory of Heat"]. Annalen der Physik (in German). 322 (8): 549–560. doi:10.1002/andp.19053220806

EDIT 1:

After going through multiple answers, I realize that the randomness in direction is simply a Bernoulli process and that, in the limit of a large number of jumps $N$, it produces a Gaussian distribution according to the central limit theorem. This explains the spatial part of the equation, but I still cannot fathom where the Poisson process (in the temporal domain) is hidden in this model.

Or is it perhaps true that this equation is valid even for a temporally deterministic process? For instance, if the particle were performing these random jumps at fixed time intervals $T$, would this equation still be valid?

score 6 · Answer 1 · answered Dec 06 '22 at 10:02

Therefore, does the diffusion equation not require the random jump lengths at random times but only a randomness in the direction of motion?

No, random jump times are not necessary. They may, however, affect the value of the diffusion coefficient. Note that the diffusion equation is a particular case of the Fokker-Planck equation, which describes many stochastic processes under rather general conditions.

Another way to think of it: we could consider jumps by a random distance or we could consider jumps on a lattice, where the displacement is often by one lattice site, but the direction of the jump is random. The two approaches would produce a diffusion equation. In the same way, the derivation with non-random times can be viewed as a lattice model along the time dimension.

Ilmari Karonen · Answer 2 · 2022-12-08T11:36:03.467

Therefore, does the diffusion equation not require the random jump lengths at random times but only a randomness in the direction of motion?

Yes, a discrete time random walk on a discrete lattice, with randomness only in the direction of motion, is sufficient to obtain a discrete approximation to the continuous diffusion process, and this approximation indeed converges in distribution to the continuous diffusion process in the limit where $a, \tau \to 0$ while $a^2 = 2D\tau$.

The simplest rigorous way (that I know of) to prove this is just to solve both processes (i.e. find the distribution of the particle's position $X(t)$ for all $t > 0$, starting at $X(0) = 0$, for each process) and observe that the solution for the discrete lattice random walk converges in distribution, under appropriate scaling, to the solution for the continuous diffusion process.

Specifically, let $X_1(t)$ denote the position of the particle under the one-dimensional discrete lattice random walk, with spatial step size $a$ and time step $\tau$, at time $n\tau \le t < (n+1)\tau$, i.e. after $n$ random steps of $\pm a$ distance. Then the distribution of $X_1(t)$ is a scaled and translated binomial distribution: $$\frac{X_1(n\tau) + na}{2a} = \frac{X_1(n\tau)}{2a} + \frac n2 \sim B\left(n, \frac12\right). \tag{1}$$

Meanwhile, let $X_2(t)$ denote the position of the particle under the continuous diffusion process on one-dimensional Euclidean space with diffusion constant $D$ at time $t > 0$. Then $X_2(t)$ is normally distributed with mean $X_2(0) = 0$ and variance $\sigma^2 = 2Dt$: $$X_2(t) \sim \mathcal N(0, 2Dt). \tag{2}$$

(I'll leave verifying those solutions as an exercise. The one for the discrete random walk is easy enough to derive from first principles, while the continuous diffusion case amounts to deriving the fundamental solution of the heat equation in one-dimensional Euclidean space, which is a common enough exercise in the study of parabolic PDEs.)

Now, by rescaling space by $1/\sigma = 1/\sqrt{2Dt}$, we can transform $(2)$ into $$\frac{X_2(t)}{\sqrt{2Dt}} \sim \mathcal N(0, 1).$$

Meanwhile, for $(1)$, we can use the De Moivre–Laplace theorem, which says that if $X \sim B(n, p)$ and $n \to \infty$, then the distribution of $X_{\text{norm}} = (X-np)/\sqrt{np(1-p)}$, i.e. $X$ scaled and translated to normalize its mean to $0$ and variance to $1$, approaches a standard normal distribution: $$X_{\text{norm}} = \frac{X-np}{\sqrt{np(1-p)}} \overset{\mathcal D}{\to} \mathcal N(0,1),$$ where $\overset{\mathcal D}{\to}$ denotes convergence in distribution. Applying this theorem to $(1)$ — specifically, plugging in $p = \frac12$ and $X = \frac{X_1(n\tau)}{2a} + \frac n2$ — we can see that, as $n \to \infty$, then $$\frac{X-\frac12n}{\frac12\sqrt{n}} = \frac{X_1(n\tau)}{a\sqrt{n}} \overset{\mathcal D}{\to} \mathcal N(0,1).$$

If we now substitute in $n = t/\tau$ and $a = \sqrt{2D\tau}$, we can see that, as $\tau \to 0$ (and thus $n \to \infty$ and $a \to 0$), $$\frac{X_1(t)}{a\sqrt{t/\tau}} = \frac{X_1(t)}{\sqrt{2Dt}} \overset{\mathcal D}{\to} \mathcal N(0,1).$$ Thus, in particular, we see that as $\tau \to 0$, $X_1(t) \overset{\mathcal D}{\to} X_2(t)$.
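As a numerical illustration of this convergence (a minimal sketch, not part of the proof; the function name `walk_cdf_gap` is made up for this example), one can compute the exact binomial distribution of $X_1(n\tau)/(a\sqrt n) = (2K - n)/\sqrt n$ with $K \sim B(n, \frac12)$ and measure its largest CDF distance from the standard normal:

```python
import math

def walk_cdf_gap(n: int) -> float:
    """Largest gap between the CDF of X_1/(a*sqrt(n)) and the standard normal CDF.

    X_1 = a*(2K - n) with K ~ Binomial(n, 1/2), so the rescaled position
    takes the values z_k = (2k - n)/sqrt(n) for k = 0..n.
    """
    total = 2 ** n          # exact normalisation of the binomial weights
    cumulative = 0          # exact integer count of paths with K < k
    gap = 0.0
    for k in range(n + 1):
        z = (2 * k - n) / math.sqrt(n)
        phi = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))  # normal CDF at z
        # The walk's CDF jumps at z, so compare just below and at the jump.
        gap = max(gap, abs(cumulative / total - phi))
        cumulative += math.comb(n, k)
        gap = max(gap, abs(cumulative / total - phi))
    return gap

for n in (10, 100, 1000):
    print(n, walk_cdf_gap(n))
```

The printed gap should shrink as $n$ grows (roughly like $1/\sqrt n$, since the lattice distribution has atoms of that size), consistent with convergence in distribution.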

Ps. Let me briefly mention multidimensional diffusion and random walks, since other answers have touched upon them.

In general, the same discrete approximation works just fine in more than one dimension, and as long as the transition kernel (i.e. the distribution of the particle's position after one time step) is sufficiently symmetric, it will converge to an isotropic diffusion process with no drift. (A sufficient, but not necessary, symmetry condition is that the kernel is invariant under inversion along any lattice axis and under the exchange of any two axes.)

In particular, both of the "four nearest neighbors" and "eight nearest neighbors" kernels mentioned in James's answer are sufficiently symmetric, and will converge to isotropic diffusion in two-dimensional Euclidean space. However, the effective diffusion coefficients will be different for the same lattice step size $a$ and time step $\tau$ depending on the kernel used.

In particular, the "four nearest neighbors" kernel (or, more generally, the "$2k$ nearest neighbors" kernel on a $k$-dimensional lattice) gives an effective diffusion coefficient of $D = \frac{a^2}{2k\tau}$. An intuitive way to see this is to note that each step of the "$2k$ nearest neighbors" random walk is equivalent to the following two steps:

1. Choose one of the $k$ lattice axes at random.
2. Move the particle by $\pm a$ along the chosen axis.

Thus, in effect, the particle undergoes a one-dimensional random walk along each of the $k$ axes, but only moves along a given axis with probability $1/k$ on each time step. Thus, it's not surprising that the resulting diffusion coefficient ends up being only $1/k$ times what one would get from a random walk in one dimension.

(FWIW, these random walks along each axis, while not independent, are uncorrelated, which is sufficient to yield isotropic diffusion in the limit.)

Meanwhile, the "eight nearest neighbors" random walk can be seen as a variation of the "four nearest neighbors" random walk, except that on each time step we first flip a coin to choose whether to move the particle by distance $a$ orthogonally or by $\sqrt2 a$ diagonally. (Alas, this does not generalize neatly to higher dimensions.) Thus, the mean squared step distance (which is what determines the diffusion coefficient) for this random walk is not $a^2$ but rather $\frac{a^2+(\sqrt2a)^2}{2} = \frac32a^2$, and the resulting effective diffusion coefficient is thus $\frac32$ times what it would be for four nearest neighbors (or $\frac34$ times what it would be for one-dimensional diffusion with the same step size), i.e. $D = \frac{3a^2}{8\tau}$.

(Alternatively, you can view the "eight nearest neighbors" random walk as flipping a coin on each time step to decide whether to move the particle by $\pm a$ along just one randomly chosen axis or along both of them independently. Since in this view the particle moves by $\pm a$ along each axis with probability $\frac34$ per time step, we obtain the same result as above.)
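The two coefficients quoted above can be checked with exact arithmetic. The following is a small sketch (the helper name `axis_diffusion` and the choice of units $a = \tau = 1$ are assumptions made for illustration) that computes the per-axis coefficient $\langle \Delta x_i^2 \rangle / (2\tau)$ for a kernel of equally likely lattice steps:

```python
from fractions import Fraction
from itertools import product

def axis_diffusion(kernel, a=1, tau=1):
    """Per-axis diffusion coefficient <dx_i^2> / (2*tau) for equally likely steps.

    `kernel` lists the allowed steps in units of the lattice spacing `a`.
    """
    n_steps = len(kernel)
    dims = len(kernel[0])
    coeffs = []
    for i in range(dims):
        # No drift: the mean displacement along each axis must vanish.
        assert sum(step[i] for step in kernel) == 0
        mean_sq = Fraction(sum(step[i] ** 2 for step in kernel), n_steps) * a ** 2
        coeffs.append(mean_sq / (2 * tau))
    return coeffs

four_neighbors = [(1, 0), (-1, 0), (0, 1), (0, -1)]
eight_neighbors = [s for s in product((-1, 0, 1), repeat=2) if s != (0, 0)]

print(axis_diffusion(four_neighbors))   # a^2/(4 tau) along each axis
print(axis_diffusion(eight_neighbors))  # 3 a^2/(8 tau) along each axis
```

Both kernels give the same value along the two axes, matching $D = \frac{a^2}{4\tau}$ and $D = \frac{3a^2}{8\tau}$ respectively.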

user35952 · Answer 3 · 2023-09-11T08:11:41.383

After a number of answers, I also want to add an additional perspective by generalizing this random walk problem using the theory of Continuous Time Random Walks (CTRW) first introduced by Montroll & Weiss (1).

In CTRW, the displacement of the particle is driven by two underlying random processes. The first is the random waiting time process, which is characterized by a waiting time probability density $w(t)$. This describes the times between jumps from one place to another along the x-axis. The jumps themselves are described by a jump length probability density $\lambda(x)$.

To briefly describe the process, the particle waits for a random time $t$ (drawn from $w(t)$) at a position $x_0$ before making a jump of random length $x$ (drawn from $\lambda(x)$). The equation governing the probability density $P(x,t)$ can be given in Fourier-Laplace space as,

$$ P(k,s) = \frac{1-\tilde w(s)}{s}\frac{1}{1-\tilde w(s) \hat \lambda(k)} $$

where $\tilde w(s)$ is the Laplace transform of $w(t)$ and $\hat \lambda(k)$ is the Fourier transform of $\lambda (x)$.

Case 1:

Waiting times: Poisson process, $w(t) \sim \exp(-t/\tau) \implies \tilde w(s) \sim (1+\tau s)^{-1}$

Jump distribution: Gaussian distribution, $\lambda(x) \sim \exp(-x^2/2\sigma^2) \implies \hat \lambda(k) \sim \exp(-\sigma^2 k^2/2)$

We can recover the diffusion equation in the limit of large length and time scales, given by $s, k \rightarrow 0$, where $$ \tilde w(s) \sim 1 - \tau s, \qquad \hat \lambda (k) \sim 1 - \sigma^2 k^2/2 $$

Substituting these, the Fourier-Laplace probability density becomes, $$ P(k,s) = \frac{1}{s + \frac{\sigma^2 k^2}{2 \tau}} $$
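Spelling out the substitution as a sketch (keeping only the leading terms in $s$ and $k^2$, and assuming the walker starts at the origin so that $P(k, t=0) = 1$):

$$ \begin{aligned} \frac{1-\tilde w(s)}{s} &\approx \tau, \\ 1-\tilde w(s) \hat \lambda(k) &\approx \tau s + \frac{\sigma^2 k^2}{2}, \\ P(k,s) &\approx \frac{\tau}{\tau s + \sigma^2 k^2/2} \quad\Longleftrightarrow\quad s P(k,s) - 1 = -\frac{\sigma^2}{2\tau} k^2 P(k,s). \end{aligned} $$

Here the dropped cross term $\tau s\,\sigma^2 k^2/2$ is of higher order, $sP(k,s) - 1$ is the Laplace transform of $\partial_t P(k,t)$, and $-k^2$ is the Fourier-space counterpart of $\partial_x^2$.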

And from here it is straightforward to show, upon Fourier-Laplace inversion, $$ \frac{\partial P(x,t)}{\partial t} = \frac{\sigma^2}{2\tau} \frac{\partial^2 P(x,t)}{\partial x^2} $$

where the diffusion constant is $D = \sigma^2/(2\tau)$, with $\sigma^2$ being the variance of the jump length distribution and $\tau$ being the mean waiting time.

Case 2:

Now if we consider the case of periodic jumps with the same Gaussian distribution of jump lengths, where the waiting time is fixed at a value of $T$, then we have $w(t) = \delta(t-T) \implies \tilde w(s) = \exp(-sT)$.

In the long-time limit ($s \rightarrow 0$), this also gives $\tilde w(s) \sim 1 - sT$, the same form as observed for the Poisson process. Therefore, once again we recover the same diffusion equation even if we choose a non-random periodic waiting time between jumps. In this case the diffusion constant is $D = \sigma^2/(2T)$.
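A quick Monte Carlo check of this claim, as a rough sketch only (the function `ctrw_variance`, its parameters, and the chosen numbers are illustrative assumptions): simulate walkers with Gaussian jumps under exponential and under fixed waiting times with the same mean, and compare the variance of the final position with $2Dt = \sigma^2 t/\tau$.

```python
import random
import statistics

def ctrw_variance(t_end, sigma, mean_wait, random_times, n_walkers=5000, seed=0):
    """Sample variance of the position at time t_end for a 1D walk with Gaussian jumps.

    random_times=True  -> exponential waiting times with mean `mean_wait` (Poisson process)
    random_times=False -> one jump exactly every `mean_wait` time units
    """
    rng = random.Random(seed)
    finals = []
    for _ in range(n_walkers):
        t, x = 0.0, 0.0
        while True:
            # Wait first, then jump, as in the CTRW description above.
            t += rng.expovariate(1.0 / mean_wait) if random_times else mean_wait
            if t > t_end:
                break
            x += rng.gauss(0.0, sigma)
        finals.append(x)
    return statistics.pvariance(finals)

t_end, sigma, mean_wait = 50.5, 1.0, 1.0
print("expected ~ sigma^2 * t / tau =", sigma ** 2 * t_end / mean_wait)
print("Poisson waiting times:", ctrw_variance(t_end, sigma, mean_wait, True))
print("fixed waiting times:  ", ctrw_variance(t_end, sigma, mean_wait, False))
```

Up to sampling noise (and the rounding of $t/T$ to a whole number of jumps in the fixed case), both variances should come out close to $\sigma^2 t/\tau$.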

So, essentially, the only conditions needed to recover the diffusion equation are the following asymptotic forms of the jump length and waiting time densities in their respective Fourier and Laplace spaces,

$$ \begin{aligned} \tilde w(s) &\sim 1 - s \tau, \\ \hat \lambda(k) &\sim 1 - \sigma^2 k^2/2. \end{aligned} $$

For the jump length distribution, this requires a vanishing mean, which symmetry (equal probabilities for jumps to the left and to the right) guarantees, together with a finite variance $\sigma^2$. For the waiting time distribution, it imposes no condition beyond requiring that its Laplace transform have a Taylor expansion about $s = 0$ at least to first order, which amounts to a finite mean waiting time $\tau$.

(1) Montroll, Weiss, J. Math. Phys. 6, 167 (1965) (https://doi.org/10.1063/1.1704269)
(2) Wikipedia Reference for CTRW

K. A. Buhr · Answer 4 · 2022-12-12T00:36:06.257

Random direction alone is sufficient to obtain a diffusion limit, as evidenced by the classical simple random walk you describe.

However, random direction is not necessary if another source of randomness exists. For example, consider the one-dimensional random walk that jumps deterministically in left-right-left-right order at fixed time intervals $\tau$, but jumps an absolute distance of either $a$ or $2a$, chosen independently at random with equal probability.

Note that each pair of left-right jumps has the net effect of the random walk kernel:

$$ P(x,t) = \frac{1}{4}\big[ P(x-a, t-2\tau) + 2P(x, t-2\tau) + P(x+a, t-2\tau) \big] $$

Compare this to the kernel generated by pairs of consecutive jumps of your simple "random direction" walk:

$$ P(x,t) = \frac{1}{4}\big[ P(x-2a, t-2\tau) + 2P(x, t-2\tau) + P(x+2a, t-2\tau) \big] $$

They are the same up to the scaling of space, so under the same time and space scaling limits, the former converges to a diffusion with one-quarter the $D$ of the latter.
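The two pair kernels can be verified by direct enumeration. This is a small sketch in units of $a$ (the helper names `pair_kernel` and `variance` are made up for this example, and the two jump lengths are assumed equally likely):

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def pair_kernel(first_steps, second_steps):
    """Distribution of the net displacement of two independent, equally likely jumps."""
    weight = Fraction(1, len(first_steps) * len(second_steps))
    kernel = Counter()
    for s1, s2 in product(first_steps, second_steps):
        kernel[s1 + s2] += weight
    return dict(kernel)

def variance(kernel):
    """Variance of a displacement distribution given as {displacement: probability}."""
    mean = sum(x * p for x, p in kernel.items())
    return sum((x - mean) ** 2 * p for x, p in kernel.items())

# Left jump of length 1 or 2, then right jump of length 1 or 2.
alternating = pair_kernel([-1, -2], [+1, +2])
# Two consecutive steps of the simple random-direction walk.
simple = pair_kernel([-1, +1], [-1, +1])

print(alternating, simple)
print(variance(alternating) / variance(simple))  # ratio of the diffusion coefficients
```

The variance ratio is $\frac14$, in line with the factor quoted above, since both kernels act over the same time $2\tau$.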

score -1 · Answer 5 · answered Dec 06 '22 at 11:08

This random walk can actually be quite complicated on a discrete grid in 2 dimensions and above.

There are only 2 ways to perform the diffusion in 2D: A allows only up, down, left and right moves, while B allows moves in all eight compass directions (west, northwest, north, northeast, east, southeast, south, southwest).

Neither way is continuously random. If some being can measure extremely precisely on the scale close to this grid, then A will reveal itself quickly, because there will be no diffusion in certain directions. If this being can also measure velocity very accurately, then B will be revealed because the diffusion velocity will be different diagonally than in the up-down direction.

To obfuscate this underlying grid, might we introduce a random error dA and dV to every random walk, so that any measurement made within cannot possibly be accurate enough to detect these anomalies... Would the mathematics lead to a type of uncertainty principle and higher phenomena from this introduction of randomness?
