By Beatrice Marg-Haufe
Written in collaboration with Zymo Research, Irvine, CA, USA.
Next-generation sequencing (NGS*) has revolutionized genomic research, allowing entire genomes to be sequenced in a single day. This has led to massive advances in the diagnosis, prognosis and treatment of disease, answering genetic questions from a wide spectrum of applications and biological systems. Today, NGS is an essential tool for any biologist. Ultra-high throughput NGS solutions have a wide range of applications and are fully scalable—from rapid SNP genotyping of a single individual, to whole genome sequencing (WGS) of entire populations. The explosive demand for NGS often creates pressures upstream to process many more samples and prepare high quality DNA to feed into library prep and analysis. In this article we explore the ideal DNA requirements for NGS and look at some of the most critical parameters for developing an automated nucleic acid extraction workflow.
High quality DNA is needed for successful downstream NGS and genotyping: automation may be the answer.
How to perform quality control of DNA for NGS
Sample purification is critical for reliable NGS data, and the primary requirement for successful NGS is a nucleic acid template that is of high quality and purity. Sequencing low-quality nucleic acid can result in impaired performance and even failed runs. Because of this, it is crucial to optimize the DNA extraction process to ensure it delivers reliable and reproducible DNA quality every time. The process should also be fully scalable to the number of samples required per batch.
In order to create a consistent and robust sequencing pipeline, and to confirm that the automation process is adequate, the extracted DNA must go through the following quality control steps:
1) Profile characterization
A nucleic acid profile needs to be generated to evaluate what is present within the sample. This can be performed, for example, with an agarose gel or a bioanalyzer. The sample should exhibit the following properties:
High molecular weight genomic DNA
The genomic DNA used in downstream next generation or third generation sequencing should be intact and un-sheared, >50 kb in length, without any large smears across the lanes that would indicate the presence of smaller fragments in the sample. This is particularly important with higher eukaryotes or plants, whose many repetitive sequences could, if wrongly assembled downstream, bias the analysis and lead to inaccurate results. The initial fragmentation step of many NGS workflows aims to create a uniform fragment size, but if the starting material is already fragmented, this step may produce fragments that are too short to deliver usable sequence data.
Absence of RNA contamination
Genomic DNA used for sequencing must contain as little RNA as possible, and preferably none at all. While RNA may not impede the workflow or the sequencing process itself, it absorbs UV light at the same wavelength as DNA, causing artificially high estimates of DNA quantity when assessed by spectrophotometry. In such cases, the researcher risks loading less DNA than the downstream NGS preparation steps actually require.
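To get a feel for how much RNA carryover can distort a UV reading, the arithmetic can be sketched in a few lines of Python. This is only an illustration: it assumes the commonly used conversion factors for a 1 cm path length (an A260 of 1.0 corresponds to roughly 50 ng/µL of double-stranded DNA or 40 ng/µL of RNA), and the function name and sample concentrations are hypothetical.

```python
# Commonly used approximations for a 1 cm path length (assumed here):
# an A260 of 1.0 corresponds to ~50 ng/uL dsDNA or ~40 ng/uL RNA.
DSDNA_NG_PER_UL_PER_A260 = 50.0
RNA_NG_PER_UL_PER_A260 = 40.0


def apparent_dna_conc(true_dna_ng_ul, rna_ng_ul):
    """Return the DNA concentration a spectrophotometer would report
    if all absorbance at 260 nm were attributed to dsDNA."""
    a260 = (true_dna_ng_ul / DSDNA_NG_PER_UL_PER_A260
            + rna_ng_ul / RNA_NG_PER_UL_PER_A260)
    return a260 * DSDNA_NG_PER_UL_PER_A260


# Hypothetical sample: 30 ng/uL of DNA carrying 20 ng/uL of RNA.
true_dna = 30.0
reported = apparent_dna_conc(true_dna, rna_ng_ul=20.0)
overestimate_pct = 100.0 * (reported - true_dna) / true_dna
print(f"reported {reported:.1f} ng/uL, true {true_dna:.1f} ng/uL "
      f"({overestimate_pct:.0f}% overestimate)")
```

In this made-up case the instrument would report about 55 ng/µL for a sample that really contains 30 ng/µL of DNA, so a library prep input calculated from the UV reading would fall well short of the intended amount.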
2) Purity assessment
The purity of the sample needs to be evaluated via spectrophotometry, for example with an Infinite® 200 PRO plate reader. Measuring the absorbance at several wavelengths allows sample contaminants to be detected from the resulting absorbance ratios. Ideally, a DNA sample for NGS should show the following characteristics:
Absorbance 260/280 ratio value: ~ 1.8
Since proteins generally absorb light at around 280 nm, while nucleic acids absorb at around 260 nm, the ratio of absorbance at these two wavelengths provides an indication of DNA purity. An A260/280 ratio of around 1.8 is generally considered to indicate high DNA purity (1); the ratio for pure RNA is ~2.0. These ratios are commonly used to assess how much protein contamination remains from the nucleic acid isolation process.
Lower-than-ideal ratios can indicate the presence of residual phenol or other reagents associated with the extraction protocol, or an unsuitably low concentration (< 10 ng/µL) of nucleic acid.
Absorbance 260/230 ratio value: > 2.0
Salts, EDTA, phenol, carbohydrates, and other contaminants all absorb around 230 nm. A high 260/230 value (above 2.0) indicates that very few of these contaminants are present in the DNA sample, and a value < 2.0 means that the sample should not be used for NGS. A 260/230 value of < 1.5 indicates a high concentration of contaminants, which can negatively affect many kinds of enzymatic and chemical reactions in the NGS workflow.
These low readings can come from carbohydrate carryover, residual phenol from nucleic acid extraction, residual chaotropic reagents such as guanidine, or glycogen used for precipitation, to name but a few.
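The two ratio checks described above lend themselves to a simple automated gate. The following Python sketch is one way this might look, not a validated QC procedure: the function name, the 1.7 to 1.9 tolerance window around the 1.8 target, and the example absorbance readings are all assumptions made for illustration, and the readings are assumed to be blank-corrected already.

```python
def assess_purity(a260, a280, a230):
    """Flag a DNA sample against the absorbance ratios discussed above.

    The 1.7-1.9 window around the ~1.8 target is an assumed tolerance,
    not a value taken from the article.
    """
    ratio_260_280 = a260 / a280
    ratio_260_230 = a260 / a230
    flags = []

    if ratio_260_280 < 1.7:
        flags.append("low A260/280: possible protein or phenol carryover")
    elif ratio_260_280 > 1.9:
        # Pure RNA gives ~2.0, so a high ratio may hint at RNA in the sample.
        flags.append("high A260/280: possible RNA contamination")

    if ratio_260_230 < 1.5:
        flags.append("A260/230 < 1.5: high contaminant load")
    elif ratio_260_230 < 2.0:
        flags.append("A260/230 < 2.0: below the recommended value for NGS")

    return {
        "a260_280": round(ratio_260_280, 2),
        "a260_230": round(ratio_260_230, 2),
        "flags": flags,
        "passes": not flags,
    }


# Hypothetical blank-corrected readings for two samples.
print(assess_purity(a260=1.0, a280=0.55, a230=0.45))  # clean sample
print(assess_purity(a260=1.0, a280=0.62, a230=0.80))  # contaminated sample
```

Returning the individual flags rather than a bare pass or fail keeps the likely cause visible, which helps when deciding whether to repeat the clean-up or the whole extraction.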
3) Quantification
Fluorometric methods of nucleic acid quantification are the gold standard for NGS pipelines. Fluorometers such as Qubit Flex (Thermo Fisher Scientific®) can be used in conjunction with reagents such as PicoGreen (e.g. Quant-iT™ PicoGreen™ dsDNA Assay Kit from Thermo Fisher Scientific), because these dyes allow the selective quantification of DNA.
Dye-based methods do not reliably detect degraded or very short DNA fragments, so fragments of ≥50 bp are recommended when using PicoGreen to assess yield, although fragments as small as 20 bp can be detected with this approach. Alternatively, yield can be determined by UV spectrophotometry, provided that no RNA contamination is present in the sample.
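Dye-based quantification usually works by reading a set of DNA standards alongside the samples and converting each sample's fluorescence to a concentration via the resulting standard curve. The Python sketch below illustrates that calculation under simplifying assumptions: the response is treated as linear, the standard concentrations and fluorescence values are invented, and the function names and the 200-fold dilution are hypothetical. A real assay should follow the standards and ranges specified by the kit in use.

```python
def fit_standard_curve(concs, signals):
    """Ordinary least-squares fit of signal = slope * conc + intercept."""
    n = len(concs)
    mean_x = sum(concs) / n
    mean_y = sum(signals) / n
    sxx = sum((x - mean_x) ** 2 for x in concs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(concs, signals))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def signal_to_conc(signal, slope, intercept, dilution_factor=1.0):
    """Convert a fluorescence reading to the concentration of the
    undiluted sample, in the same units as the standards."""
    return (signal - intercept) / slope * dilution_factor


# Invented standards (ng/mL in the assay well) and fluorescence units.
standards_ng_ml = [0.0, 10.0, 100.0, 500.0, 1000.0]
standards_rfu = [50.0, 250.0, 2050.0, 10050.0, 20050.0]

slope, intercept = fit_standard_curve(standards_ng_ml, standards_rfu)

# Hypothetical sample read at 6050 RFU after a 200-fold dilution.
sample_ng_ml = signal_to_conc(6050.0, slope, intercept, dilution_factor=200.0)
print(f"sample concentration: {sample_ng_ml / 1000.0:.1f} ng/uL")
```

With these invented numbers the undiluted sample works out to about 60 ng/µL. Readings that fall outside the range covered by the standards should be re-measured at a different dilution rather than extrapolated.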
Regardless of the method, extraction workflows must deliver sufficiently high DNA yields and concentrations to ensure successful results in subsequent stages of the NGS workflow or in other downstream applications such as third generation sequencing and genotyping. The size of the DNA fragments that come out of the extraction workflow is especially important for NGS library construction and third generation sequencing.
Zymo Research and Tecan have collaborated to develop automated extraction workflows**, incorporating microbiome and genomic DNA extraction reagents from Zymo Research, to provide a complete, walk-away solution for nucleic acid extraction.
With the introduction of simplified, robot-friendly DNA extraction workflows such as this, automation is no longer the exception in NGS labs. In fact, full integration of automated DNA extraction and quality control into routine sample prep workflows is now a realistic and affordable solution for any lab that is performing NGS or other platform-based applications and assays. Discover how automation workflows can take the effort out of your DNA prep in the next article of our blog series.
*What is NGS?
Next-generation sequencing (also called massively parallel, deep, or second-generation sequencing) describes several modern sequencing technologies that allow millions of DNA fragments to be sequenced in parallel. Once all these short reads have been generated, bioinformatic analysis pieces the fragments together. The process can sequence an entire genome multiple times over, delivering high read depth and accurate data.
** For research use only – not for use in diagnostic procedures.
1. Glasel J. (1995) Validity of nucleic acid purities monitored by 260/280 absorbance ratios. BioTechniques. 18 (1): 62–63. PMID 7702855.
About the author
Dr Beatrice Marg-Haufe
Dr. Beatrice Marg-Haufe is a product manager at Tecan Switzerland with over 10 years of experience in assay development and product management. She studied biochemistry at the University of Bielefeld, Germany, and at Harvard Medical School, USA. She focused on cancer research during her PhD in Biochemistry at the MPI, Munich, Germany. She joined Tecan in 2009 focusing on applications for the agriculture and genomics market.