Growth curves

> All scripts mentioned can be found on https://github.com/matteoferla/growth_curves
> Another post discusses what methodological errors can ruins growth rates

Bacterial growth curves never look like they are in the textbooks as sick bacteria are the worst patients.
Fitness is possibly one of the top three buzzwords in evolutionary biology. It encompasses the more subtle effects that lie between death or growth. However, measuring fitness via growth rates is easier say than done. Especially since auxotrophs and bradytrophs have quirkier behaviours than prototrophs.

Approaches

One of the most popular multiplate reader brands, Biotek, has a software called Gen5, which gives the steepest slope as the growth rate, which is a terrible idea.
R has a library (grofit) than I haven't found satisfactory as it's an inflexible, cumbersome and a one trick pony —one of those R libraries someone wrote for their specific purposes and as a result the inputted data needs to be in a specific way.
As the maths is simple, but the problems are unique, I prefer writing my own so I can tinker with it.
For curve analysis I use Matlab, because at its core is way better written than R even if there is less biocommunity-written code and it has a rather unique syntax. This post deals with the snippets of code I have written for it in the hope it will help other.

Column-wise

The first problem encountered is specific to the reader I use (BioTek). By default —but changeable— the report gives a table where the rows are timepoints, while the columns are wells listed going across each plate row. In Matlab when one linearises a matrix (a(:)), the values are listed down each column, therefore the plate needs to be rotated. This is not that simple as the matrix needs to be converted into a 3rd order tensor and then the 3rd and 2nd orders switched and then converted back:

newplate=reshape(permute(reshape(plate,size(plate,1),12,8),[1 3 2]),size(plate,1),96,1);

Outlier elimination

Bubbles are anecdotally blamed for strange outliers, which makes sense. These can be easily removed by smoothing.

smoothplate=zeros(size(newplate,1),96);
for i=1:96
smoothplate(:,i)=smooth(newplate(:,i),'lowess');
end

Baseline

Baseline normalisation is a straightforward process. The first point, which should be the lowest value is the baseline and subtracted from the curve. In many cases, e.g. fancy synthetic biology stuff with GFP, the GFP levels are not noise, but may dip due to the change going from a stationary overnight culture to a fresh media. So the zero is the minimum value seen.

plate = (smoothplate - repmat(min(smoothplate),size(smoothplate,1),1));

Fitting

So there are papers upon papers that discuss what equation best fits a sigmoidal curve —the key paper to read is this one. The classic one is a logistic curve. A fancier one is the Gompertz equation, which better fits the extreme cases, even if it disobeys the principle of dimensional homogeneity. In my experience with actual data, the differences are so minor and the phenomena seen are so major that such overcomplication is unwarranted. The main thing to do is give good boundary conditions for the fitting to avoid flatlining in border cases.

%As a separate m-file (as an anonymous function is less clear, but would be within the same file)
function y = growthcurve(x, odmax, odlamb, odmu)
y = (1+exp(4*odmu/odmax*(odlamb-x)+2)).\odmax;
end

function logidata = growthfit(data, interval)
%given a 96 vertically linearised 96-well plate returns a (96,4) sized
%matrix, where the first column is max od, second is slope, third is lag
%and fourth is R^2
%this function assumed the data is zeroed!
logidata=zeros(96,4);
spindex = reshape(1:96, 12, 8).';
t=(1:size(data,1))';

figure
hold on
for i=1:96
    subplot(8,12,spindex(i))
    try
    slp=data(:,i);
    options = fitoptions('growthcurve(x, odmax, tlambda, odmu)');
    options.Robust='Bisquare';
    options.MaxIter=800;
    % max set at within 20% error of endpoint. change slp(end) to max(slp) for max 
    options.StartPoint=[slp(end), 0, slp(end)/t(end)];
    options.Lower =[slp(end)*0.90, slp(end)/t(end), 0];
    options.Upper =[slp(end)*1.10, Inf, t(end)]; %say it cannot drop more than to 1/2.
    [fo,gof]=fit( t, slp, fittype('growthcurve(x, odmax,tlambda,odmu)'), options);
    plot(fo,t,slp);
    legend(gca,'off');
    axis off;
    logidata(i,1:3)=coeffvalues(fo);
    logidata(i,2)=logidata(2,i)/interval;
    logidata(i,3)=logidata(3,i)*interval;
    logidata(i,4)=gof.rsquare;
    xlabel('')
    ylabel('')
    catch
        display('FAILED!')
    end
end
end

Issues

There are several problems to this approach (which I will write in detail soon…).

Missing lag phase
Missing stationary phase
Death phase
Clumping causing a greater OD than expected with occasional weird peaks
Residual supplements causing a step like lag phase
Cheats and contaminants causing an increase in the growth rate

Technically, these issues do tell a lot about the strains, but there is so little litterature on the matter that one would have to do a lot of work giving definite proofs that they are best disregarded for papers.

Other functions

In addition to the functions mentioned here, I also wrote two handy functions.
As I want to see my data as a plate I wrote the function platelayout which is just a pretty imagesc. the arguments are a 96 vector and a string for a title.

platelayout(max(rawod),'highest recorded OD in each well')

If a well is dodgy and I want to plot it, I wrote platename. Which if given a string like 'A2', will give a number, e.g. 9. If given a number gives a str. It does not accept vectors yet. so to get a cell array with wells one would need to do something like

wells=arrayfun(@platename,1:96,'UniformOutput',0);

The art of blowing up protein

Pages

Wednesday, 23 March 2016