© 2018 Suzy Beeler and Vahe Galstyan. This work is licensed under a Creative Commons Attribution License CC-BY 4.0. All code contained herein is licensed under an MIT license

This exercise was generated from a Jupyter notebook. You can download the notebook here.


Objective

In this tutorial, we will computationally simulate the process of diffusion with "coin flips," where at each time step the particle can move either to the left or to the right, each with probability $0.5$. From there, we can see how the distance a diffusing particle travels scales with time.

Modeling 1-D diffusion with coin flips

Diffusion can be understood as random motion in space caused by thermal fluctuations in the environment. In the cytoplasm of the cell, different molecules undergo 3-dimensional diffusive motion, while diffusion on the cell membrane is chiefly 2-dimensional. Here we will consider 1-dimensional diffusive motion to keep the treatment simple, but the ideas can be extended to higher dimensions.

In [1]:
# Import modules
import numpy as np
import matplotlib.pyplot as plt

# Show figures in the notebook
%matplotlib inline

# For pretty plots
import seaborn as sns
rc={'lines.linewidth': 2, 'axes.labelsize': 14, 'axes.titlesize': 14, \
    'xtick.labelsize' : 14, 'ytick.labelsize' : 14}
sns.set(rc=rc)

To simulate the flipping of a coin, we will make use of numpy's random.uniform() function that produces a random number between $0$ and $1$. Let's see it in action by printing a few random numbers:

In [2]:
for i in range(10):
    print(np.random.uniform())
0.8342160184177014
0.9430314233421397
0.007367989009862352
0.09402902047181094
0.4641922790100985
0.9895482225120592
0.1652578614128407
0.6038851928675221
0.675556259583081
0.2558372886656155

We can now use these randomly generated numbers to simulate the process of a diffusing particle moving in one dimension, where any value below $0.5$ corresponds to a step to the left and any value above $0.5$ corresponds to a step to the right. Below, we keep track of the position of a particle for $1000$ steps, where each position is $+1$ or $-1$ from the previous position, as determined by the result of a coin flip.

In [3]:
# Number of steps
n_steps = 1000

# Array to store walker positions
positions = np.zeros(n_steps)

# simulate the particle moving and store the new position
for i in range(1, n_steps):
    
    # generate random number 
    rand = np.random.uniform()
    
    # step in the positive direction 
    if rand > 0.5:
        positions[i] = positions[i-1] + 1
        
    # step in the negative direction 
    else:
        positions[i] = positions[i-1] - 1
        
# Show the trajectory
plt.plot(positions)
plt.xlabel('steps')
plt.ylabel('position');

As we can see, the position of the particle wanders about the origin in an undirected fashion as a result of the randomness of the steps taken. However, it's hard to conclude anything from this single trace. Only by simulating many of these trajectories can we begin to see the scaling properties of diffusing particles.
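As an aside, the step-by-step loop above can be written more compactly with numpy's vectorized operations. A sketch of an equivalent approach, drawing all the $\pm 1$ steps at once with np.random.choice and accumulating them with np.cumsum:

```python
import numpy as np

# Number of steps
n_steps = 1000

# draw all n_steps - 1 coin flips at once: each step is -1 or +1
steps = np.random.choice([-1, 1], size=n_steps - 1)

# the position at each step is the running sum of the steps,
# with the walker starting at the origin
positions = np.concatenate(([0], np.cumsum(steps)))
```

This produces the same kind of trajectory as the loop, but lets numpy handle the iteration internally, which is considerably faster for large numbers of steps.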

Average behavior of diffusing particles

Now let's generate multiple random trajectories and see their collective behavior. To do that, we will create a 2-dimensional numpy array where each row is a different trajectory. 2D arrays can be sliced such that [i,:] refers to all the values in the $i$th row, and [:,j] refers to all the values in the $j$th column.
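To make the slicing notation concrete, here is a small example on a $2 \times 3$ array (the values themselves are arbitrary):

```python
import numpy as np

a = np.arange(6).reshape(2, 3)   # [[0, 1, 2], [3, 4, 5]]

print(a[0, :])   # all values in the first row
print(a[:, 1])   # all values in the second column
```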

In [4]:
# Number of trajectories
n_traj = 1000

# 2d array for storing the trajectories
positions_2D = np.zeros([n_traj, n_steps])

# first iterate through the trajectories
for i in range(n_traj):
    
    # then iterate through the steps
    for j in range(1, n_steps):
        
        # generate random number 
        rand = np.random.uniform()

        # step in the positive direction 
        if rand > 0.5:
            positions_2D[i, j] = positions_2D[i, j-1] + 1

        # step in the negative direction 
        else:
            positions_2D[i, j] = positions_2D[i, j-1] - 1

Now let's plot the results, once again by looping.

In [5]:
# iterate through each trajectory and plot
for i in range(n_traj):
    plt.plot(positions_2D[i,:])
    
# label
plt.xlabel('steps')
plt.ylabel('position');

The overall tendency is that the average displacement from the origin increases with the number of time steps. Because each trajectory is assigned a solid color and all trajectories are overlaid on top of each other, it's hard to see the distribution of the walker position at a given number of time steps. To get a better intuition about the distribution of the walker's position at different steps, we will assign the same color to all trajectories and add transparency to each of them, so that the more densely populated regions have a darker color.

In [6]:
# iterate through each trajectory and plot
for i in range(n_traj):
    # lower alpha corresponds to lighter lines 
    plt.plot(positions_2D[i,:], alpha=0.01, color='k')

# label
plt.xlabel('steps')
plt.ylabel('position');

As we can see, over the course of diffusion the distribution of the walker's position becomes wider but remains centered around the origin, indicative of the unbiased nature of the random walk. To see how the walkers are distributed at the final time point, let's make a histogram of the walkers' final positions.

In [7]:
# Make a histogram of final positions
_ = plt.hist(positions_2D[:,-1], bins=20)
plt.xlabel('final position')
plt.ylabel('frequency');
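Each coin flip has mean $0$ and variance $1$, so after $N$ independent steps the final position should have mean $0$ and variance close to $N$. As a quick sanity check (re-simulating the walks here in vectorized form rather than reusing the arrays above):

```python
import numpy as np

# number of trajectories and steps, matching the simulation above
n_traj, n_steps = 1000, 1000

# each trajectory is the sum of n_steps - 1 independent +/-1 steps
steps = np.random.choice([-1, 1], size=(n_traj, n_steps - 1))
final_positions = steps.sum(axis=1)

print(final_positions.mean())   # should be close to 0
print(final_positions.var())    # should be close to n_steps
```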

As expected, the distribution is centered around the origin and has a Gaussian-like shape. The more trajectories we sample, the "more Gaussian" the distribution will become. However, we may notice that the distribution appears to change depending on the number of bins we choose. This is known as bin bias and doesn't reflect anything about our data itself, just how we choose to represent it. An alternative (and arguably better) way to present the data is as an empirical cumulative distribution function (or ECDF), where we don't specify a number of bins, but instead plot each data point. For our cumulative frequency distribution, the $x$-axis corresponds to the final position of a particle and the $y$-axis corresponds to the proportion of particles that ended at this position or a more negative position.

In [8]:
# sort the final positions 
sorted_positions = np.sort(positions_2D[:,-1])
# make the corresponding y-values (i.e. cumulative frequencies)
y_values = np.linspace(start=0, stop=1, num=len(sorted_positions))

# plot the ECDF
plt.plot(sorted_positions, y_values, '.')
plt.xlabel("final position")
plt.ylabel("cumulative frequency");

This way of visualizing the data makes it easier to tell that the distribution of walkers is in fact symmetric around $0$. That is, about 50% of the walkers ended at a negative position, while about 50% ended at a positive position.
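Finally, we can make the scaling claim from the objective quantitative: for an unbiased 1-D random walk, the mean squared displacement after $k$ steps is $\langle x^2 \rangle = k$, so the typical distance traveled grows like $\sqrt{k}$. A sketch that estimates the mean squared displacement by averaging over trajectories (re-simulated here in vectorized form) and compares it against this prediction:

```python
import numpy as np
import matplotlib.pyplot as plt

# number of trajectories and steps, matching the simulation above
n_traj, n_steps = 1000, 1000

# simulate all trajectories at once: each row is one walker
steps = np.random.choice([-1, 1], size=(n_traj, n_steps))
positions_2D = np.cumsum(steps, axis=1)

# mean squared displacement at each step, averaged over trajectories
msd = np.mean(positions_2D**2, axis=0)

# compare against the prediction <x^2> = k
plt.plot(np.arange(1, n_steps + 1), msd, label='simulated MSD')
plt.plot(np.arange(1, n_steps + 1), np.arange(1, n_steps + 1),
         label='prediction: $k$')
plt.xlabel('steps')
plt.ylabel('mean squared displacement')
plt.legend();
```

The two curves should track each other closely, confirming that the average *squared* distance grows linearly with the number of steps, and hence the typical distance grows only as the square root of time.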