Peak height variation

Sunday 26 February 2017

Sequencing a plasmid pool containing a sequence with a randomised codon can reveal what the frequencies of the bases are (Acevedo-Rocha et al., 2015).
The problem is that sequence traces are not consistent: some peaks are bigger than others, and beyond a certain point the traces get messy. So how does that affect the prediction of the base frequencies?
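For context, the simplest way to turn a trace into base frequencies is to normalise the four channel heights at each randomised position. The snippet below is only a minimal sketch of that idea: the function name and the peak heights are made up for illustration and are not values from my traces.

```python
def base_frequencies(peak_heights):
    """Normalise the four channel heights at one position so they sum to 1."""
    total = sum(peak_heights.values())
    if total <= 0:
        raise ValueError("no signal at this position")
    return {base: height / total for base, height in peak_heights.items()}


# Hypothetical peak heights (arbitrary units) for the three positions of a codon
codon_peaks = [
    {"A": 310, "C": 250, "G": 220, "T": 220},
    {"A": 200, "C": 300, "G": 260, "T": 240},
    {"A": 10, "C": 15, "G": 520, "T": 455},
]

for position, heights in enumerate(codon_peaks, start=1):
    freqs = base_frequencies(heights)
    print(position, {base: round(value, 2) for base, value in freqs.items()})
```

If peak heights vary for reasons unrelated to the pool composition, that variation goes straight into these estimates, which is why the consistency of the traces matters.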
To investigate this I did a staggered QuikChange and sequenced the pool from different sites. Specifically, I randomised Y66 (Y66X) in the sfGFP gene, a cliché test subject, using 65 °C annealing with the following primers:

Name          Sequence
Y66X_Forward  cgacgctgactNNKggtgttcagtgctttgctcgttatccgg
Y66X_Reverse  cactgaacaccMNNagtcagcgtcgttaccagagtcg
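As a sanity check on the primers, the sketch below expands the degenerate codon with the standard IUPAC codes and reverse-complements the reverse primer. The helper names are my own, and the "expected" frequencies assume every base allowed by a degenerate code is incorporated equally, which a real primer synthesis will not achieve exactly.

```python
# IUPAC nucleotide codes mapped to the bases they stand for
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}
# Complement of every IUPAC code (e.g. K = G/T pairs with M = A/C)
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def reverse_complement(seq):
    """Reverse complement of a sequence that may contain IUPAC codes."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def expected_frequencies(codon):
    """Per-position base frequencies, assuming equal incorporation (an idealisation)."""
    result = []
    for code in codon.upper():
        bases = IUPAC[code]
        result.append({b: (1 / len(bases) if b in bases else 0.0) for b in "ACGT"})
    return result


forward = "cgacgctgactNNKggtgttcagtgctttgctcgttatccgg"
reverse = "cactgaacaccMNNagtcagcgtcgttaccagagtcg"

# The reverse primer read on the forward strand should carry NNK, not MNN
print(reverse_complement(reverse))
for position, freqs in enumerate(expected_frequencies("NNK"), start=1):
    print(position, freqs)
```

Under that idealisation the first two positions are 25% of each base and the third is half G, half T, with no A or C. The reverse complement of the reverse primer ends with the first 25 bases of the forward primer, which is the overlap expected of a staggered pair.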

I had the pool sequenced four times by Eurofins. Two reads were from a lacOT5 primer, so through the preceding gene (the construct was purpose built for this test, sorry): one with a normal amount of template (50 ng/µl) and one with way too much (180 ng/µl). The third read was with a reverse primer from the terminator, and the fourth with an internal primer just before the start of the gene. The lacOT5 primer is 626 bases away from the randomised codon, the internal primer 192 bases and the reverse primer 632 bases.
About 600 bp is too far, as the peaks are already merging by then. In my dataset, the peak heights decrease and the interpeak height (the signal in the valleys between peaks) increases from about 250 bases in (25% of 1000 bp in the graph below). In a different dataset (the example data from an upcoming paper spearheaded by Carlos Acevedo-Rocha) the plots were different: the largest peaks were 200 bases in, so results from different suppliers or machines might differ slightly.
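To make "peak height" and "interpeak height" concrete, here is a rough sketch of how the two could be measured on a single intensity series. It assumes the trace has already been reduced to one list of numbers (for example the strongest channel at each sample point); the function names and the toy signals are hypothetical and are not my actual analysis.

```python
def peaks_and_valleys(signal):
    """Return (peak_indices, valley_indices) of local maxima and minima."""
    peaks, valleys = [], []
    for i in range(1, len(signal) - 1):
        if signal[i - 1] < signal[i] >= signal[i + 1]:
            peaks.append(i)
        elif signal[i - 1] > signal[i] <= signal[i + 1]:
            valleys.append(i)
    return peaks, valleys


def merging_ratio(signal):
    """Mean valley height over mean peak height: near 0 is resolved, near 1 is merged."""
    peaks, valleys = peaks_and_valleys(signal)
    if not peaks or not valleys:
        raise ValueError("need at least one peak and one valley")
    mean_peak = sum(signal[i] for i in peaks) / len(peaks)
    mean_valley = sum(signal[i] for i in valleys) / len(valleys)
    return mean_valley / mean_peak


# Toy signals: well separated peaks early in a read, merging peaks late in a read
early = [0, 10, 0, 10, 0, 10, 0]
late = [0, 5, 4, 5, 4, 5, 0]
print(merging_ratio(early), merging_ratio(late))
```

Computing this ratio in windows along the read would give one number per window that rises as the peaks shrink and the valleys fill in.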

In the traces above, the first two are nearly the same: there are few differences by eye and the taller peaks mostly coincide, so increasing the template concentration does nothing. The taller peaks also partly reappear in the other forward read, but not in the reverse read.


