Gene expression part 03

© 2017 Griffin Chure & Manuel Razo. This work is licensed under a Creative Commons Attribution License CC-BY 4.0. All code contained herein is licensed under an MIT license

In part III of our matlab project, we will learn how to process a series of images, extract area values, and write our very first function!

At the end of the last exercise, we saw that we had a distribution of areas, but it was pretty hard for us to identify good thresholds. Today, we'll get a better idea of what a typical cell looks like by segmenting even more cells.

We will begin by listing all of the delta strain images. But first a quick useful aside:

Concatenating strings in MATLAB

If I want let's say to combine two strings 'hello,' and ' world!' in MATLAB all we have to do is the following

string1 = 'hello,';

string2 = ' world!';

% Concatenating two strings

[string1, string2]

ans = hello, world!

Now coming back to our problem. Let's define a string dataDir that points at the directory where the data lives

% Define variable that points to the directory where the data is

dataDir = 'data/lacI_titration/delta/';

Now we will use what is called a 'wildcard' character * that reprsents any list of characters in between a pattern.

% List all the images that have the pattern 'nd2' on the file name

deltaIms = dir([dataDir, '*nd2']);

The command dir lists all of the files that satisfy certain condition. For example here we are telling matlab to go into the data/lacI_titration/delta/ directory and find all the files that end with .nd2.

Let's look for the first 3 file name for example.

deltaIms(1:3).name

ans = 01.nd2
ans = 02.nd2
ans = 03.nd2

Having this deltaIms structure becomes very handy. We can now use a simple for loop to obtain all of the areas from each image.

You might be thinking: But wait a second! We have taken 2 sessions so far to get to the point where we obtain an array with the areas of labeled objects in an image. I don't want to copy and paste n times that code for each image.

Well, you only don't have to but as good as a programmer that you are becoming you shouldn't even think about that. For this we will define our first beautiful function!

By now you must be familiar with functions. We have used them from the very beginning. Functions such as mat2gray, imhist, regionprops receive an input, perform some computations and return some output afterwards. But what if there is not a function that does all the steps you want to follow at once? The you should define your own functions!

One of matlab's biggest weakness in my opinion is that it doesn't allow users to define functions in the same script where they are performing other operations. Functions need to live in single independent m-files where the name of the file needed to be the exact same as the name of the function. This version matlab 2016b tried modernizing the language to allow for local functions to be defined in a script.

Defining a function to extract the areas of labeled objects.

The way that we define a function in matlab is as follows:

function output(s) = name_of_the_function(parameter(s))

% Comments explaining how the function works Actual code performind the function

end

So before getting into our real function let's first try a simple one. Let's define a function that takes as input a matrix and returns the minimum and the maximum value on it.

Now in the same directory I am working I have an m-file named minMaxFn.m as you can see if I list all the files in the directory.

ls

bfmatlab inClass_frap_the_butter.m
data inClass_growth_movie_analysis.m
diffusion_spread_the_butter.html inClass_spread_the_butter.m
diffusion_spread_the_butter.mlx lacI_titration_part01.mlx
euler_method_ode.mlx lacI_titration_part02.mlx
gene_expression_part01.mlx lacI_titration_part03.mlx
gene_expression_part02.mlx masterEquation.m
gene_expression_part03.mlx minMaxFn.m
inClass_entropic_spring.m segmenter.m

With this in hand let's see if I can actually generate an array with random numbers and obtain the maximum and minimum values.

randArray = rand([1, 10]);

[minVal, maxVal] = minMaxFn(randArray)

minVal = 0.1419
maxVal = 0.9706

It worked! So now we know how to define a function. Let's now define a relevant function for what we want to do.

Let's look at we want our function to do. At the end of the day the function will actually need to live in a separate m-file.

  1. First of all we want the function to take a file name and read the image.
  2. Then we would like to choose some kind of threshold to pick up bacteria.
  3. Finally we would like to extract the area of all the objects found in the labeled image.

We know how to do everything automatically, except for step 3 where we gave some manual input as to which threshold value to use. There are actually automatic thresholding methods such as Otsu's method that are implemented in matlab. So we'll make use of this method for that step.

Again in the same directory I have this same function defined on a file named segmenter.m. So let's test the function with one of the files we listed in phaseIms.

[imLabel, areas] = segmenter([dataDir, deltaIms(1).name], 70);

Reading series #1
    ...

Let's look at the resulting labeled image!

imshow(label2rgb(imLabel))

That looks pretty good! Seems that our function can automatically segment out bacteria-like objects given a phase contrast image.

Applying an area filter.

If we zoom on the image we can see that there are a lot of small speckles that get a free ride when we apply the threshold-based segmentation. We should be able to get rid of them just by knowing how big is an E. coli. In order to apply a proper area filter to our segmentation mask we need to know what is the inter-pixel distance. The metadata of the images indicates that when using the 100x objective the interpixel distance is 0.0733 µM/pixel. Using this information let's apply a filter.

We know that E. coli is approximately 2 µm x 1 µm. So if we set the filter of the areas to be between 2 and 3 pixels should be good. To apply the filter we will use boolean operations where we will ask MATLAB to find the numbers larger than our lower bound and smaller than our upper bound. Recall that our segementer function returns also an array of areas; this means that we need to find a way to connect the value of the area with the index of the segmentation mask. That is easily done by using the find function. Let's look at an example

% generate array with 10 random numbers between 0 and 1.

a = rand([1, 10]);

% find which numbers are < than 0.5

[row, col] = find(a < 0.5)

Now let's convert our area array from pixel to actual µm and apply the filter

% Define interpixel distance

interPix = 0.0733; %µm / pixel

% Define area thresholds

areaMin = 2; % µm^2

areaMax = 3; % µm^2

% covnvert area array to µm

areasMicron = areas * interPix^2;

% Apply filter to area array

[row, index] = find(areasMicron > areaMin & areasMicron < areaMax);

Once we find the indices of the labeled objects that fall within our area bounds. This operation will return a binary image again so we will have to apply the bwlabel command again

% Find which labeled objects fell within the area bounds

imAreaFilt = ismember(imLabel, index);

% re-label the binary image

imLabelFilt = bwlabel(imAreaFilt);

% Display filtered image

imshow(label2rgb(imLabelFilt))

Most of the speckles disappeared! There are still some weird objects. For these we could use another type of filter having to do with the aspect ratio of the objects for example, or we could perform our segmentation using the volume marker channel where these objects would be invisible. But given the massive amount of data that we are getting per image (thank you Nikon!) these objects shouldn't affect that much our analysis.