Characterizing low copy DNA signal using simulated and experimental data
Sir Alec Jeffreys was the first to describe human identification with deoxyribonucleic acid (DNA) in his seminal work in 1985 (1); the result was the birth of forensic DNA analysis. Since then, DNA has become the primary substance used to conduct human identification testing. Forensic DNA analysis has evolved since the work of Jeffreys and now incorporates the analysis of 15 to 24 short tandem repeat (STR) locations, or loci (2-4). The simultaneous amplification and subsequent electrophoresis of tens of STR polymorphisms results in analyses that are highly discriminating. DNA target masses of 0.5 to 2 nanograms (ng) are sufficient to obtain a full STR profile (4); however, pertinent information can still be obtained if low copy numbers of DNA are collected from the crime scene or evidentiary material (4-9). Despite the sensitivity of polymerase chain reaction (PCR) and capillary electrophoresis (CE) based technology, low copy DNA signal can be difficult to interpret because low signal-to-noise ratios predominate. Because low-template signal is complicated, the DNA laboratory process must be optimized so that high-fidelity signal is regularly produced; studies designed to home in on optimized laboratory conditions are presented herein. The STR regions of a set of samples containing 0.0078 ng of DNA were amplified for 29 cycles; the amplified fragments were separated using two types of CE platforms: an ABI 3130 Genetic Analyzer and an ABI 3500 Genetic Analyzer. The result is a genetic trace, or electropherogram (EPG), composed of three signal components: noise, artifact, and allele. The EPGs were analyzed using two peak detection software programs. In addition, a tool termed Simulating Evidentiary Electropherograms (SEEIt) (10, 11) was used to simulate the EPG signal obtained when one copy of DNA is processed through the forensic pipeline.
SEEIt was parameterized to simulate data corresponding to two laboratory scenarios: the amplification of a single copy of DNA injected on an ABI 3130 Genetic Analyzer and on an ABI 3500 Genetic Analyzer. In total, 20,000 allele peaks and 20,000 noise peaks were generated for each CE platform. Comparison of simulated and experimental data was used to elucidate features that are difficult to ascertain by experimental work alone. The data demonstrate that experimental signal obtained with the ABI 3500 platform is, on average, a factor of four larger than signal obtained from the ABI 3130 platform. When a histogram of the signal is plotted, a multimodal distribution is observed. The first mode is hypothesized to be the result of noise, while the second, third, and subsequent modes are the signal obtained when one, two, or more target DNA molecules are amplified. By evaluating the data in this way, full signal resolution between noise and allelic signal is visualized. Therefore, this methodology may be used to: 1) optimize post-PCR laboratory conditions to obtain excellent resolution between noise and allelic signal; and 2) determine an analytical threshold (AT) that results in few false detections and few cases of allelic dropout. A χ² test for independence of the experimental signal in noise positions and the experimental signal within allele positions < 12 relative fluorescence units (RFU), i.e. signal in the noise regime, indicates that the populations are not independent when sufficient signal-to-noise resolution is obtained. Once sufficient resolution is achieved, optimized ATs may be acquired by evaluating and minimizing the false negative and false positive detection rates. Here, a false negative is defined as the non-detection of an allele and a false positive is defined as the detection of noise.
An AT of 15 RFU was found to be the optimal AT for samples injected on the ABI 3130 for at least 10 seconds (sec), as 99.42% of noise peaks did not exceed this critical value while allelic dropout was kept to a minimum, 36.97%, at this AT. Similarly, in examining signal obtained from the ABI 3500, 99.41% and 99.0% of noise fell under an AT of 50 RFU for data analyzed with GeneMapper ID-X (GM) and OSIRIS (OS), respectively. Allelic dropout was 36.34% and 36.55% for GM and OS, respectively, at this AT.
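
The evaluation of a candidate AT described above can be illustrated with a minimal sketch. It is not the implementation used in this work: the function names are hypothetical, a peak is assumed to be detected when its height is at or above the AT, and the two error rates are combined by simple summation, which is only one possible way to minimize them jointly.

```python
from typing import Sequence, Tuple


def detection_rates(noise_rfu: Sequence[float],
                    allele_rfu: Sequence[float],
                    at: float) -> Tuple[float, float]:
    """Return (false_positive_rate, false_negative_rate) at threshold `at`.

    Assumption: a peak is called when its height (RFU) is >= `at`.
    False positive: a noise peak that is detected.
    False negative: an allele peak that is not detected (dropout).
    """
    false_pos = sum(1 for h in noise_rfu if h >= at) / len(noise_rfu)
    false_neg = sum(1 for h in allele_rfu if h < at) / len(allele_rfu)
    return false_pos, false_neg


def best_threshold(noise_rfu: Sequence[float],
                   allele_rfu: Sequence[float],
                   candidates: Sequence[float]) -> float:
    """Pick the candidate AT with the smallest summed error rate.

    Assumption: equal weight on false positives and false negatives;
    a laboratory may weight the two errors differently.
    """
    return min(candidates,
               key=lambda at: sum(detection_rates(noise_rfu, allele_rfu, at)))
```

Given lists of noise and allele peak heights in RFU, whether simulated or experimental, `detection_rates(noise, alleles, 15)` reports the fraction of noise peaks that would be called and the fraction of alleles that would drop out at an AT of 15 RFU.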