Interpreting Chromatograms
Our lab routinely produces high-quality data with read lengths up to 700 bases; however, this depends on many factors,
including how clean the template DNA is and how efficiently the primer anneals. To prepare DNA that is clean enough for automated
sequencing, see Template Preparation. Other problems in the sequencing reaction
can affect success as well. Some examples of these, with their solutions, follow.
A good sequence
Low Signal-to-Noise Ratio
Salt Contamination
Alcohol Contamination
GC Rich or Palindromic Regions
Double Priming
Miscalls due to weak peaks after stronger peaks
Heterozygote Double Peaks
Poly A Region
Other Observations
- First an example of a good sequence
Fig. 1 shows an excellent sequence chromatogram of our standard reaction with clean, distinct peaks and
very low to no background noise. The sequence was completely accurate to 586 bases with 3 miscalls to 700 bases.

Figure 1.
You can expect fully accurate, reliable sequence to be found from 30 to 500-600 bases from the priming site, with
98.5% accuracy extending to 650-700 bases in some reactions. Fig. 2 shows the general appearance of the chromatogram peaks around 650
bases, which are much broader and less defined than around the 400 base region.
Figure 2.
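The accuracy figures above are simple arithmetic, and it can help to see them worked out. The sketch below is illustrative only: the function names are hypothetical, and it assumes the 98.5% figure is read as applying across the whole 700-base read.

```python
def accuracy_percent(miscalls: int, read_length: int) -> float:
    """Percentage of correctly called bases in a read of the given length."""
    if read_length <= 0:
        raise ValueError("read_length must be positive")
    return 100.0 * (read_length - miscalls) / read_length


def expected_miscalls(accuracy_pct: float, read_length: int) -> float:
    """Number of miscalls implied by a given percent accuracy."""
    return read_length * (1.0 - accuracy_pct / 100.0)


# The Fig. 1 read: 3 miscalls within the first 700 bases.
print(round(accuracy_percent(3, 700), 2))    # 99.57

# 98.5% accuracy over 700 bases corresponds to about 10 or 11 miscalls.
print(round(expected_miscalls(98.5, 700), 1))  # 10.5
```

By this measure the read in Fig. 1 is somewhat better than the 98.5% quoted for the longest reads.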
- Low Signal-to-Noise Ratio
Fig. 3 shows a chromatogram with noisy signal peaks; however, the gel image had a completely blank lane. The
chromatogram is actually depicting background signal.
Figure 3.
Fig. 4 shows a chromatogram with actual sample signal that is very low. The total amount of signal for this sample
is about 10% of the amount of signal obtained from the standard reaction run on the same gel. As a result, the background noise level
is comparable to the signal level, and can introduce false sequence peaks and deletions. The bottom line is that the sequence is not
reliable. Low signals are most commonly caused by "dirty DNA", containing small molecule or RNA contaminants.
Solution: Prepare your DNA template again, taking care not to introduce small
molecule contaminants such as salts, ethanol, and EDTA (see DNA Template Preparation).
Figure 4.
- Salt Contamination
Fig. 5 shows a chromatogram with 75 mM NaCl added to our standard template reaction. The sequence starts off nicely,
but the signal begins to decrease around 300 bases, gradually descending to background level by the time the first N is
called at position 434. A comparison of this sequence to the pGEM3Zf+ sequence on file at NCBI shows the first miscall at base 434,
with only 3 more miscalls to 500 and 9 miscalls to 550, then a drastic deterioration, with 44 miscalls to 600. Salt contamination alone is not
a big problem, but in combination with other trace contaminants it can erode accuracy and shorten read lengths.
Solution: Wash final DNA pellet with 70% isopropanol (30% water) and
dry in a spin vac before resuspending in pure autoclaved water.
Figures 5A-5D.
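The miscall counts quoted in this section come from comparing a read against a known reference sequence. A minimal sketch of that kind of tally follows. It assumes the read and reference are already aligned base for base with no insertions or deletions (a real comparison would use an alignment tool), and the function names and the toy sequences are hypothetical, not the pGEM3Zf+ data.

```python
from typing import Dict, List, Optional


def first_miscall(read: str, reference: str) -> Optional[int]:
    """Return the 1-based position of the first mismatch, or None if there is none."""
    pairs = zip(read.upper(), reference.upper())
    for position, (called, true_base) in enumerate(pairs, start=1):
        if called != true_base:  # an N call also counts as a miscall
            return position
    return None


def cumulative_miscalls(read: str, reference: str, cutoffs: List[int]) -> Dict[int, int]:
    """Count mismatches from base 1 up to and including each cutoff position."""
    counts = {}
    for cutoff in cutoffs:
        end = min(cutoff, len(read), len(reference))
        counts[cutoff] = sum(
            1
            for called, true_base in zip(read[:end].upper(), reference[:end].upper())
            if called != true_base
        )
    return counts


# Toy example: mismatches at positions 7 (N) and 10 (T instead of C).
reference = "ACGTACGTAC"
read = "ACGTACNTAT"
print(first_miscall(read, reference))                     # 7
print(cumulative_miscalls(read, reference, [5, 8, 10]))   # {5: 0, 8: 1, 10: 2}
```

Reporting the counts at several cutoffs, as in the text, makes it easy to see where in the read accuracy starts to fall off.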
- Alcohol Contamination
Fig. 6 shows a chromatogram with 1% EtOH added to our standard template reaction. The peaks are sharp and distinct to
about 270 bases, but gradually drop in size to background level by the time the first N is called at position 419. A comparison of this
sequence to the pGEM3Zf+ sequence on file at NCBI shows the first miscall at 359 bases, with 34 miscalls to 500 and 97 miscalls to
600. In the chromatogram, you can see that the sequence rapidly deteriorates after 400, with erratic peaks. In combination with other
contaminants, ethanol can contribute to poor sequence data.
Solution: Make sure all ethanol has evaporated from the DNA pellet
after precipitation; dry in a spin vac, if possible, before resuspending in pure autoclaved water.
Figures 6A-6D.
- GC Rich or Palindromic Regions
Regions with a GC content greater than 62% are difficult to sequence under our standard reaction conditions. This
is thought to be due to the stronger bonding in GC base pairs (three hydrogen bonds, versus two in AT pairs), which requires a higher temperature to denature.
Good denaturation is necessary to allow efficient annealing of the primer and subsequent extension. Fig. 7 shows an example of an
abrupt signal drop-off after a good run of sequence, at a GC rich region.
A similar result is sometimes seen if you are attempting to sequence through a region that has long palindromic
sequences that form secondary structure even during the denaturation cycle. These hairpin structures form physical barriers and the DNA
polymerase has difficulty reading through these regions.
Solution: When time permits, the facility will attempt to use an alternate PCR
cycle with higher than standard temperatures, or add 5% DMSO or glycerol to the reaction. If this does not resolve the problem, then
cutting the insert in the problem regions, subcloning, and resequencing may be necessary.
Figure 7.
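If you already know part of the sequence you are reading into, you can screen it for the two problems described above before submitting a sample. The sketch below is one possible approach, not the facility's procedure. The 62% threshold comes from the text; the 50-base window, the palindrome length, and all function names are assumptions chosen for illustration. Note also that it only finds perfect palindromes (stretches equal to their own reverse complement), which is a simplification: real hairpins often have an unpaired loop between the two arms.

```python
GC_THRESHOLD = 0.62  # from the text: regions above 62% GC are difficult

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def gc_fraction(seq: str) -> float:
    """Fraction of bases in seq that are G or C (0.0 for an empty string)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def gc_rich_windows(seq: str, window: int = 50, threshold: float = GC_THRESHOLD):
    """Return 0-based start positions of windows whose GC fraction exceeds threshold."""
    return [
        start
        for start in range(len(seq) - window + 1)
        if gc_fraction(seq[start:start + window]) > threshold
    ]


def reverse_complement(seq: str) -> str:
    """Reverse complement of a sequence containing only A, C, G and T."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def perfect_palindromes(seq: str, length: int = 8):
    """Return 0-based starts of stretches that equal their own reverse complement."""
    seq = seq.upper()
    return [
        start
        for start in range(len(seq) - length + 1)
        if seq[start:start + length] == reverse_complement(seq[start:start + length])
    ]


print(gc_rich_windows("GGGGCCCCAAAA", window=4))   # [0, 1, 2, 3, 4, 5]
print(perfect_palindromes("AAGAATTCAA", length=6))  # [2]  (GAATTC)
```

A long run of adjacent GC-rich windows, or several palindromes clustered together, marks a region where the signal drop-off shown in Fig. 7 would not be surprising.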
- Double Priming
Fig. 8 shows a sequence with clean, distinct peaks, but the software frequently called N's.
The chromatogram looks like two separate sequences overlapping: multiple peaks occupy the same position, so clean sequence
cannot be determined. This is due to the presence of two priming sites yielding DNA products from two different sequences in the same
sample, a common occurrence in cloning.
Solution: In lieu of recloning your insert into a new vector, you may be able to
sequence the same sample using a different primer. Commercial plasmids with multiple cloning sites usually contain alternative
common priming sites (e.g. T7, SP6, etc.). If not, you may have to resort to designing a specific primer for your template.
Figure 8.
- Miscalls due to weak peaks after stronger peaks
These are usually manually edited by our lab before delivery. Edited bases appear in the sequence as underlined
lowercase letters. However, there are places where the weak peak effect may be occurring, yet not be obvious enough for
us to call. Weak peaks result from suppression of signal following a strong signal, occurring most commonly for G's after A's, and often
for G's after C's, as seen in Fig. 9. We provide the chromatograms as computer files, along with the plain sequence; the chromatogram
files can be viewed and edited with Editview software. This free software is available from ABI Perkin-Elmer. To download a copy for either PC or
Macintosh, see Links & Software on our homepage.

Figure 9: The G's directly following A or C peaks (circled) are suppressed in size compared to other G's in the
sequence.
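Because the weak peak effect is tied to sequence context (a G directly after an A or a C), the positions worth checking by eye in the chromatogram can be listed from the called sequence itself. The following is a small sketch of that idea; the function name is hypothetical and it only points to candidate positions, it does not judge peak heights.

```python
import re


def suppressed_g_candidates(seq: str):
    """Return 1-based positions of every G that directly follows an A or a C."""
    # The lookbehind keeps each match on the G itself, so back-to-back
    # cases such as "AGCG" are all reported.
    return [match.start() + 1 for match in re.finditer(r"(?<=[AC])G", seq.upper())]


# T A G C G G A G  ->  G's at 3 (after A), 5 (after C) and 8 (after A)
print(suppressed_g_candidates("TAGCGGAG"))  # [3, 5, 8]
```

Any N or unexpected base at one of these positions is a good candidate for manual editing against the chromatogram.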
- Heterozygote Double Peaks
If you are sequencing DNA directly from a diploid organism (e.g. PCR products from chromosomal DNA) you may see
double peaks at one nucleotide, flanked by clean single-peak sequence, as shown below in Fig. 10. This can indicate that the two alleles
of the amplified gene are different (the organism is heterozygous): one base is present on one allele, while the second is present on the
other allele.
Figure 10.
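When recording a heterozygous position in an edited sequence, one common convention is the IUPAC ambiguity code for the two bases observed. This document does not say which notation the facility uses, so the sketch below is only an illustration of that convention; the function name is hypothetical.

```python
# IUPAC ambiguity codes for the six possible two-base combinations.
IUPAC_TWO_BASE = {
    frozenset("AG"): "R",
    frozenset("CT"): "Y",
    frozenset("CG"): "S",
    frozenset("AT"): "W",
    frozenset("GT"): "K",
    frozenset("AC"): "M",
}


def heterozygous_code(base1: str, base2: str) -> str:
    """Return the IUPAC code for a position where two bases are observed."""
    pair = frozenset((base1.upper(), base2.upper()))
    if not pair <= set("ACGT"):
        raise ValueError("bases must be A, C, G or T")
    if len(pair) == 1:
        # Both peaks are the same base, so the position is not heterozygous.
        return next(iter(pair))
    return IUPAC_TWO_BASE[pair]


print(heterozygous_code("A", "G"))  # R
print(heterozygous_code("t", "c"))  # Y
```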
- Poly A Region
Fig. 11A shows drop-off in sequence due to a poly A region. Sequence ahead of this region is clean and accurate; however,
sequence following the region is degraded, as shown in Fig. 11B. This is caused by the polymerase slipping as it extends the poly A chain,
essentially causing a frame shift that produces poly A stretches of inconsistent length. Fragments of the same total length then end in
different fluorescent terminators, degrading sequence accuracy.
Solution: To read past the poly A region, use a poly T primer with a degenerate
base in the 3' position.
Figures 11A and 11B.
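To find where a read is likely to degrade for this reason, you can scan the clean part of the sequence (or a known reference) for long runs of A. The sketch below does this with a regular expression. The minimum run length of 10 is an arbitrary assumption for illustration, since the text does not state how long a poly A stretch must be before slippage becomes a problem, and the function name is hypothetical.

```python
import re


def poly_a_runs(seq: str, min_length: int = 10):
    """Return (start, end, length) for each run of A's, 1-based and inclusive."""
    pattern = re.compile(r"A{%d,}" % min_length)
    return [
        (match.start() + 1, match.end(), match.end() - match.start())
        for match in pattern.finditer(seq.upper())
    ]


# Toy sequence: a 12-base poly A run at positions 4-15, then a short AAA.
seq = "GCT" + "A" * 12 + "GGAAAC"
print(poly_a_runs(seq))                # [(4, 15, 12)]
print(poly_a_runs(seq, min_length=3))  # [(4, 15, 12), (18, 20, 3)]
```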
- Other Observations
In other experiments in our lab, we found that trace amounts of other potential contaminants, namely 0.2%
phenol-chloroform and silica beads, did not affect sequence data. In investigating the possible effect of TE
contamination, we found that 5 mM Tris did not affect the reaction, but as little as 2 mM EDTA completely suppressed the signal, resulting in a blank lane.
DMSO is a common additive to sequencing reactions, which can sometimes resolve problems thought to be due to secondary
structure. The addition of 0.5% DMSO reduces signal by half, and 1% DMSO halves it again, to about a quarter of the original. This effect has been
observed repeatedly in our lab, so we routinely increase template and primer to maximize signal whenever adding DMSO.