mRNA distributions in yeast for constitutive and bursty genes

© 2021 Rebecca Rousseau. This work is licensed under a Creative Commons Attribution License CC-BY 4.0. All code contained herein is licensed under an MIT license.

Objective: In this notebook, we will analyze mRNA count distributions calculated from yeast gene data (adapted from this paper) and fit appropriate probability distributions to the resulting plots. The chosen genes highlight constitutive and "bursty" gene transcription.

MDN1 mRNA count distribution (constitutive)

We model the MDN1 counts with a Poisson distribution, for which the probability of observing $k$ mRNA counts is $$ P(X=k) = \frac{\lambda^k e^{-\lambda}}{k!}, $$ where $\lambda$ is the mean count.

We now find the value of $\lambda$ that minimizes the error between the Poisson distribution and the measured MDN1 count distribution.
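A minimal sketch of such a fit, assuming the error is the sum of squared differences between the Poisson probabilities and the empirical count frequencies (the original notebook may use a different error measure). The `counts` array is a hypothetical stand-in generated from a Poisson distribution, not the real MDN1 data.

```python
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.stats import poisson

# Hypothetical stand-in for the MDN1 per-cell mRNA counts (not the real data).
rng = np.random.default_rng(0)
counts = rng.poisson(lam=6.0, size=2000)

# Empirical probability of each count k = 0, 1, ..., max(counts).
k = np.arange(counts.max() + 1)
empirical = np.bincount(counts) / counts.size

def squared_error(lam):
    """Sum of squared differences between the Poisson PMF and the data."""
    return np.sum((poisson.pmf(k, lam) - empirical) ** 2)

# Search for lambda between (almost) 0 and the largest observed count.
result = minimize_scalar(
    squared_error, bounds=(1e-6, counts.max()), method="bounded"
)
lam_fit = result.x

print(f"least-squares lambda: {lam_fit:.3f}")
print(f"sample mean (maximum-likelihood lambda): {counts.mean():.3f}")
```

For Poisson data the least-squares estimate should land close to the sample mean, which is the maximum-likelihood estimate of $\lambda$ and serves as a quick sanity check on the fit.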

PDR5 mRNA count distribution (bursty transcription)

Let's try to fit a Poisson to this skewed distribution.
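The sketch below illustrates what such an attempt looks like on skewed data. The `counts` array is a hypothetical, overdispersed stand-in drawn from a negative binomial distribution with arbitrarily chosen parameters, not the real PDR5 data. We set $\lambda$ to the sample mean and compare the variance to the mean, since the two are equal for a Poisson distribution.

```python
import numpy as np
from scipy.stats import poisson

# Hypothetical stand-in for the PDR5 counts (not the real data): an
# overdispersed sample drawn from a negative binomial distribution.
rng = np.random.default_rng(1)
counts = rng.negative_binomial(n=2, p=0.15, size=2000)

mean = counts.mean()
variance = counts.var()
fano = variance / mean  # equals 1 for a Poisson distribution

# Poisson curve with lambda set to the sample mean
# (its maximum-likelihood value).
k = np.arange(counts.max() + 1)
empirical = np.bincount(counts) / counts.size
poisson_fit = poisson.pmf(k, mean)

print(f"mean = {mean:.2f}, variance = {variance:.2f}, Fano factor = {fano:.2f}")
print(f"empirical P(0) = {empirical[0]:.3f}, Poisson P(0) = {poisson_fit[0]:.5f}")
```

A ratio of variance to mean (the Fano factor) well above 1 means no single value of $\lambda$ can match both the center and the width of the data.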

The Poisson distribution is too tightly concentrated around the mean, whereas the measured PDR5 distribution is spread over a much wider range of mRNA counts. This excess variance is a strong indicator that, in the steady state, this gene exhibits "bursty" transcription. Such a phenomenon is better modeled by a negative binomial distribution (the discrete analog of the continuous Gamma distribution), defined as

$$ P(k) = {k+n-1 \choose n-1} p^n(1-p)^k, $$

where $k$ is the number of mRNA transcripts made in the characteristic lifetime of an mRNA, $n$ is a measure of the frequency of bursts, and $p$ is the probability that a burst in transcription stops.
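To see why this distribution can capture the extra spread, it helps to write out its first two moments, which follow from the definition above:

$$ \langle k \rangle = \frac{n(1-p)}{p}, \qquad \sigma^2 = \frac{n(1-p)}{p^2}, \qquad \frac{\sigma^2}{\langle k \rangle} = \frac{1}{p}. $$

Since $p < 1$, the variance always exceeds the mean, in contrast to the Poisson distribution, where the two are equal. If we assume the usual interpretation in which burst sizes are geometrically distributed, the mean burst size is $(1-p)/p$, so a small $p$ corresponds to large bursts and a broad distribution.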

We now determine the combination of parameters $(n,p)$ for which the negative binomial distribution best fits PDR5 gene expression.
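A minimal sketch of one way to do this, assuming a maximum-likelihood fit (the original notebook may instead scan a grid of parameters or minimize a squared error). The `counts` array is a hypothetical stand-in, not the real PDR5 data, and `scipy.stats.nbinom` uses the same parametrization as the formula above.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import nbinom

# Hypothetical stand-in for the PDR5 counts (not the real data).
rng = np.random.default_rng(2)
counts = rng.negative_binomial(n=2, p=0.15, size=2000)

def negative_log_likelihood(params):
    """Negative log-likelihood of the counts under nbinom(n, p)."""
    n, p = params
    return -np.sum(nbinom.logpmf(counts, n, p))

# Method-of-moments starting guess, using variance/mean = 1/p and
# mean = n(1-p)/p. This assumes the sample variance exceeds the mean.
mean, variance = counts.mean(), counts.var()
p0 = mean / variance
n0 = mean * p0 / (1 - p0)

# Keep n positive and p strictly between 0 and 1 during the search.
result = minimize(
    negative_log_likelihood,
    x0=[n0, p0],
    bounds=[(1e-6, None), (1e-6, 1 - 1e-6)],
    method="L-BFGS-B",
)
n_fit, p_fit = result.x

print(f"n = {n_fit:.3f}, p = {p_fit:.3f}")
```

The fitted curve can then be evaluated with `nbinom.pmf(k, n_fit, p_fit)` and overlaid on the histogram of counts. Note that $n$ is allowed to take non-integer values here.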