Code and Data

Executing the code

The code provided here was written to be run in a particular file structure. The structure looks like

The entire repository can be cloned via GitHub. The repository does not contain the data, but all data is provided in the links below. Please follow the installation instructions on the GitHub repository.

Computational environment

All analysis and data processing were performed with the following software configuration.

# Python Version
CPython 3.7.4
IPython 7.11.1

# Package versions
scipy==1.3.1
scikit_image==0.15.0
matplotlib==3.1.1
maxentropy==0.3.0
seaborn==0.9.0
pandas==0.25.3
numpy==1.18.1
GitPython==3.1.0
mpmath==1.1.0
skimage==0.0
emcee==2.2.1
sympy==1.5.1
cloudpickle==1.2.2
joblib==0.14.1
statsmodels==0.10.1
dill==0.3.1.1
ccutils==0.1.5


# System information
compiler   : Clang 4.0.1 (tags/RELEASE_401/final)
system     : Darwin
release    : 18.7.0
machine    : x86_64
processor  : i386
CPU cores  : 8
interpreter: 64bit
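
As a convenience, a local installation can be compared against the pins listed above. The following is a minimal sketch, not part of the repository: the helper names `installed_version` and `compare_versions` are hypothetical, only a few of the pinned packages are included, and it assumes that `importlib.metadata` (Python 3.8+) or its `importlib_metadata` backport is available.

```python
try:
    from importlib import metadata  # standard library on Python 3.8+
except ImportError:  # Python 3.7 (as listed above) needs the backport package
    import importlib_metadata as metadata

# A subset of the pinned versions listed above; extend as needed.
PINNED = {
    "numpy": "1.18.1",
    "scipy": "1.3.1",
    "pandas": "0.25.3",
    "emcee": "2.2.1",
}


def installed_version(name):
    """Return the installed version of `name`, or None if it is not installed."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def compare_versions(pinned, lookup=installed_version):
    """Map each package whose installed version differs from its pin
    to a (pinned, installed) tuple. Matching packages are omitted."""
    report = {}
    for name, wanted in pinned.items():
        found = lookup(name)
        if found != wanted:
            report[name] = (wanted, found)
    return report


if __name__ == "__main__":
    for name, (wanted, found) in compare_versions(PINNED).items():
        print(f"{name}: pinned {wanted}, installed {found or 'not installed'}")
```

A mismatch does not necessarily mean the scripts will fail; it only flags where the local environment departs from the one used for the original analysis.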

The ccutils Module

This work required several custom Python functions. To ensure reproducibility, we have packaged them as a Python module that can be installed from the master branch of the GitHub repository. Please see the installation instructions for details. This module is required to execute all of the following scripts.

Jupyter Notebooks

This section contains detailed code in the form of Jupyter notebooks. These notebooks explain the logic behind the computations in each section using heavily annotated Markdown text. The notebooks can be viewed as HTML files or downloaded as .ipynb files to be executed. When necessary, there is a link to download the data used for the computations in the notebook.

Python scripts

This section lists Python scripts used to perform the repetitive tasks explained in the Jupyter notebooks. When necessary, there is a link to download the data used for the computations in the script.

  • mdcd_iptg_range.py
    • This script computes in parallel the average moments of the mRNA and protein distribution for a fine grid of IPTG values with the experimentally explored repressor copy numbers only.
  • mdcd_repressor_range.py
    • This script computes in parallel the average moments of the mRNA and protein distribution for a fine grid of repressor copy number values with the 12 experimental IPTG concentrations.
  • mdcd_repressor_extended_range.py
    • This script computes in parallel the average moments of the mRNA and protein distribution for a grid of repressor copy number values up to 10^6 with the 12 experimental IPTG concentrations.
  • mdcd_ogorman_param.py
    • This script computes in parallel the average moments of the mRNA and protein distribution for the experimentally measured combinations of operators and repressors, but this time using the global parameter inferences reported in Chure et al., 2019, which phenomenologically better capture the induction profile for the O3 operator and the general steepness of the other strains.
  • maxent_protein_dist.py | [data]
    • Script that takes the protein distribution moments as inferred from the numerical integration of the dynamical equations and computes the corresponding Lagrange multipliers for a maximum entropy approximation of the distribution.
  • maxent_mRNA_dist.py | [data]
    • Script that takes the mRNA distribution moments as inferred from the numerical integration of the dynamical equations and computes the corresponding Lagrange multipliers for a maximum entropy approximation of the distribution.
  • maxent_protein_dist_rep_range.py | [data]
    • Script that takes the protein distribution moments as inferred from the numerical integration of the dynamical equations and computes the corresponding Lagrange multipliers for a maximum entropy approximation of the distribution for a larger span of repressor copy numbers.
  • maxent_protein_dist_iptg_range.py | [data]
    • Script that takes the protein distribution moments as inferred from the numerical integration of the dynamical equations and computes the corresponding Lagrange multipliers for a maximum entropy approximation of the distribution for a finer grid of inducer concentrations.
  • maxent_protein_dist_correction.py | [data]
    • Script that updates the second and third moments of the protein distribution to match the factor of two in the deviation between the original theoretical prediction and the experimental data. It then uses these updated moments along with the first protein moment to infer the maximum entropy distribution.
  • channcap_protein_multi_prom.py | [data]
    • Script that computes the channel capacity for the protein distributions generated with the output of the maxent_protein_dist.py script.
  • channcap_mRNA_multi_prom.py | [data]
    • Script that computes the channel capacity for the mRNA distributions generated with the output of the maxent_mRNA_dist.py script.
  • channcap_protein_multi_prom_rep_range.py | [data]
    • Script that computes the channel capacity for the protein distributions generated with the output of the maxent_protein_dist_rep_range.py script.
  • channcap_protein_multi_prom_iptg_range.py | [data]
    • Script that computes the channel capacity for the protein distributions generated with the output of the maxent_protein_dist_iptg_range.py script.
  • channcap_protein_single_prom.py | [data]
    • Script that computes the channel capacity for the protein distributions generated assuming a single promoter that reaches steady state. These calculations do not include the gene dosage variability during the cell cycle, and assume a Poissonian degradation of the proteins.
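
To illustrate what a channel capacity computation involves, here is a minimal sketch using the Blahut-Arimoto algorithm, a standard iterative method for discrete channels. This page does not state which algorithm the channcap_* scripts use, so the choice of Blahut-Arimoto is an assumption; the function name `channel_capacity` and the row-per-input layout of the transition matrix are likewise hypothetical and are not taken from the ccutils module. In the setting described above, each row would correspond to one inducer concentration and hold the discretized output distribution for that input.

```python
import math


def channel_capacity(transition, tol=1e-9, max_iter=10_000):
    """Blahut-Arimoto estimate of the channel capacity, in bits.

    transition[i][j] = P(output j | input i); every row must sum to 1.
    Returns (capacity_in_bits, capacity-achieving input distribution).
    """
    n_in = len(transition)
    n_out = len(transition[0])
    p_in = [1.0 / n_in] * n_in  # start from a uniform input distribution
    lower = 0.0

    for _ in range(max_iter):
        # Output marginal implied by the current input distribution.
        q_out = [
            sum(p_in[i] * transition[i][j] for i in range(n_in))
            for j in range(n_out)
        ]
        # c[i] = exp(KL divergence, in nats, between row i and the marginal).
        c = []
        for row in transition:
            kl = sum(t * math.log(t / q_out[j]) for j, t in enumerate(row) if t > 0)
            c.append(math.exp(kl))

        norm = sum(p * ci for p, ci in zip(p_in, c))
        lower = math.log(norm)    # lower bound on the capacity (nats)
        upper = math.log(max(c))  # upper bound on the capacity (nats)

        # Reweight inputs toward those that are most distinguishable.
        p_in = [p * ci / norm for p, ci in zip(p_in, c)]
        if upper - lower < tol:
            break

    return lower / math.log(2), p_in


if __name__ == "__main__":
    # Binary symmetric channel with a 10% flip probability.
    bsc = [[0.9, 0.1], [0.1, 0.9]]
    capacity, p_opt = channel_capacity(bsc)
    print(f"capacity = {capacity:.4f} bits, optimal input = {p_opt}")
```

For the binary symmetric channel, the result can be checked against the closed form 1 - H(p), where H is the binary entropy function. A pure-Python implementation like this one is only practical for small matrices; a vectorized version would be preferable for distributions with many output bins.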

Data Sets

This section lists all datasets used for this work, from the raw microscopy images to the processed single-cell fluorescence values. It also lists all values generated from theoretical calculations that are too computationally expensive to reproduce on every run.