Code & Data

This work uses several categories of code. In the sections below, we describe each category and, where necessary, link to its contents.

Data Accessibility

In the sections below, we link every script to its required data files. This excludes the data needed to run the tpm module as well as the preprocessed .mat files used by compile_data.py. The raw image files are very large (~10 TB), so they are kept on cold storage and are available upon request. The results extracted from the image files (~10 GB), which the compile_data.py script reads, are stored on the CaltechDATA research data repository and are accessible under the DOI: 10.22002/D1.1288.

The tpm Matlab Module

This module, available from the associated GitHub repository, is used for the acquisition of, and direct measurement from, raw image data of tethered beads. It is implemented as previously described in Lovely et al., PNAS 112(14), 2015 and Johnson et al., Nucleic Acids Research 40(16), 2012.

The vdj Python Module

This module, written explicitly for this work, is composed of a variety of Python functions useful for the processing, analysis, and presentation of data. This module is also available from the associated GitHub repository.

Python Processing Script

We used a single Python script, compile_data.py, which extracts measurements from a series of .mat files produced by the tpm module.
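
As a rough illustration of this kind of step, the sketch below gathers selected variables from a set of .mat files into a flat list of records. It is not the actual compile_data.py: the function name compile_measurements, the variable names, and the file names are hypothetical, and it assumes the files can be read with scipy.io.loadmat (that is, they are not in the MATLAB v7.3 HDF5 format).

```python
import os


def compile_measurements(paths, keys, loader=None):
    """Collect the variables named in `keys` from each .mat file in `paths`.

    Hypothetical helper, not the actual compile_data.py. `loader` maps a
    file path to a dict of variables; it defaults to scipy.io.loadmat.
    """
    if loader is None:
        # Imported lazily so the function can be exercised without SciPy
        # when a custom loader is supplied.
        from scipy.io import loadmat

        def loader(path):
            return loadmat(path, squeeze_me=True)

    records = []
    for path in sorted(paths):
        contents = loader(path)
        record = {"file": os.path.basename(path)}
        for key in keys:
            # A variable missing from a file is stored as None
            # instead of raising an error.
            record[key] = contents.get(key)
        records.append(record)
    return records
```

A call might look like compile_measurements(glob.glob("data/*.mat"), keys=["dwell_time"]), where both the directory and the variable name are placeholders. The resulting list of dictionaries can then be written to a CSV file or handed to a data-frame library for analysis.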

Python Analysis Scripts

These are Python files that can be run independently from the command line and perform data analysis tasks, such as parameter inference and bootstrapping.
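
For readers unfamiliar with bootstrapping, the sketch below shows one common variant, the percentile bootstrap, using only the Python standard library. It is an assumption-laden illustration and not the implementation used in the analysis scripts: the function name bootstrap_ci and its defaults are hypothetical.

```python
import random
import statistics


def bootstrap_ci(data, stat=statistics.mean, n_boot=2000, alpha=0.05, seed=None):
    """Percentile bootstrap confidence interval for `stat` of `data`.

    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)
    n = len(data)
    # Resample the data with replacement n_boot times and evaluate the
    # statistic on each resample.
    replicates = sorted(stat(rng.choices(data, k=n)) for _ in range(n_boot))
    # Take the alpha/2 and 1 - alpha/2 percentiles of the replicates,
    # clamping the indices so they always stay within the list.
    lo_idx = max(0, int((alpha / 2) * n_boot))
    hi_idx = min(n_boot - 1, max(lo_idx, int((1 - alpha / 2) * n_boot) - 1))
    return replicates[lo_idx], replicates[hi_idx]
```

Passing a fixed seed makes the interval reproducible between runs, which is convenient when a script is re-executed from the command line.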

Python Figure Scripts

These are Python files that can be run independently from the command line and produce all data-based figures in the work, including the interactive figures.