Most high-throughput sequencing is performed on DNA derived from a large population of cells. A consensus sequence is obtained by aligning and assembling many short reads, and the most frequent nucleotide determines the identity of the base at each position. While this is fine for most applications, it makes measuring genomic heterogeneity between single cells impossible. Important differences, including single nucleotide variations (SNVs), copy number variations (CNVs) (sources of genetic variation that can lead to gene malfunction or disease) and transcriptome variation based on alternative splicing, can all be lost with population averaging.
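To make the averaging effect concrete, here is a minimal Python sketch of a majority-vote consensus call. It assumes the reads are already aligned and padded to equal length; the function name `consensus` and the toy reads are hypothetical, and real consensus callers also weigh base qualities and mapping scores.

```python
from collections import Counter

def consensus(aligned_reads):
    """Return the most frequent base in each column of equal-length aligned reads.

    '-' marks a position that a read does not cover and is ignored.
    Columns with no coverage at all are reported as 'N'.
    """
    result = []
    for column in zip(*aligned_reads):
        counts = Counter(base for base in column if base != "-")
        result.append(counts.most_common(1)[0][0] if counts else "N")
    return "".join(result)

# Hypothetical toy alignment: four reads carry G at the third position, one carries T.
reads = [
    "ACGTA",
    "ACGTA",
    "ACTTA",
    "ACGTA",
    "-CGTA",
]
print(consensus(reads))  # prints ACGTA
```

The read carrying T at the third position, which could represent a variant present in a minority of cells, leaves no trace in the consensus.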
While sequencing the genomic content of a single cell is important, achieving this has its challenges. First, you need to isolate a single cell. Typical methods include cell sorting, laser capture microdissection and old-fashioned dilution. Once you’ve isolated a single cell you need to deal with its contents. You can’t just use a typical phenol based extraction procedure. The membranes of the cell need to be carefully removed with a mild lysis buffer to release the compartmentalized DNA. Incomplete recovery of DNA, which is very common with single cells and can stem from chromosomal breaks or DNA damage, can result in the loss of genomic regions and in uneven amplification across the genome, leaving some regions with little or no representation. Finally, you have to deal with the minute amount of DNA present in a cell. A diploid human cell contains only about 6 to 7 picograms of DNA. To make libraries from single cells, you can rely on traditional amplification or several newer primer based techniques:
– Whole genome amplification (WGA)
– SMARTer (SMART-Seq)
– Multiple Annealing and Looping-Based Amplification Cycles (MALBAC)
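The picogram-scale DNA content mentioned above, and the reason amplification is unavoidable, can be checked with a back-of-the-envelope calculation. The sketch below is illustrative only and rests on stated assumptions: a haploid human genome of about 3.1 billion base pairs, an average mass of about 650 g/mol per base pair, and a hypothetical target of 1 microgram of DNA for library preparation.

```python
import math

AVOGADRO = 6.022e23          # molecules per mole
HAPLOID_GENOME_BP = 3.1e9    # assumption: approximate haploid human genome size
GRAMS_PER_MOL_PER_BP = 650   # assumption: commonly used average for one base pair

def diploid_mass_pg():
    """Approximate DNA mass of one diploid human cell, in picograms."""
    grams = 2 * HAPLOID_GENOME_BP * GRAMS_PER_MOL_PER_BP / AVOGADRO
    return grams * 1e12

def fold_amplification(target_pg, start_pg):
    """Fold increase needed to go from start_pg to target_pg."""
    return target_pg / start_pg

start = diploid_mass_pg()
fold = fold_amplification(1e6, start)   # hypothetical target: 1 microgram = 1e6 pg
print(f"DNA per diploid cell: {start:.1f} pg")
print(f"Fold amplification to reach 1 microgram: {fold:,.0f}")
print(f"Equivalent perfect doublings: {math.log2(fold):.1f}")
```

Under these assumptions a single cell holds a few picograms of DNA, so reaching a microgram takes an increase of roughly five orders of magnitude, which is why any bias in the amplification step matters so much.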
There are many forms of WGA, including multiple displacement amplification (MDA), primer extension preamplification (PEP), and degenerate oligonucleotide primed PCR (DOP-PCR). The PCR based methods, DOP-PCR and PEP, use Taq based PCR primed with degenerate or random oligonucleotides, respectively. MDA is an isothermal method: random hexamers bind to denatured DNA and are extended with strand displacement at a constant temperature by Phi 29 polymerase. Repeated priming events on each strand lead to a branched network of DNA structures. While WGA methods have been available since the early 1990s and have been thoroughly tested by many researchers, they are prone to amplification bias and can result in low genome coverage. PCR based WGA can introduce sequence dependent bias and amplification errors because it relies on a low fidelity Taq polymerase, and certain regions become overrepresented because primers bind preferentially to specific genomic regions. MDA, which uses the strand displacing Phi 29 polymerase, provides certain improvements but still exhibits considerable bias due to non-linear amplification, random priming that amplifies both target and contaminating DNA, and genomic rearrangements or chimeras that complicate genome assembly by linking non-contiguous chromosomal regions.
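A simplified model shows why non-linear amplification turns small differences into large ones. The sketch below assumes idealized exponential, PCR-like amplification in which each region is copied with a fixed per-cycle efficiency; the function names and the efficiency values are hypothetical, and it is not a model of MDA chemistry.

```python
def copies_after(cycles, efficiency, start=1.0):
    """Expected copies after `cycles` rounds when a fraction `efficiency`
    of the existing templates is duplicated in each round."""
    return start * (1.0 + efficiency) ** cycles

def representation_ratio(eff_a, eff_b, cycles):
    """How many times more abundant region A is than region B after amplification."""
    return copies_after(cycles, eff_a) / copies_after(cycles, eff_b)

# Hypothetical efficiencies: region A is copied 95% of the time, region B 80%.
ratio = representation_ratio(0.95, 0.80, cycles=25)
print(f"Region A is about {ratio:.1f}x more abundant than region B")
```

With these made-up numbers, a modest per-cycle difference compounds into a roughly sevenfold difference in representation after 25 cycles, even though both regions started from a single copy.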
SMARTer is an approach for full length cDNA construction from picograms of total RNA that uses the template switching activity of Moloney murine leukemia virus (MMLV) reverse transcriptase (Chenchik et al.). Briefly, upon reaching the end of an RNA template, the terminal transferase activity of the reverse transcriptase adds 3-5 nucleotides to the 3’ end of the first strand cDNA. An oligonucleotide primer base-pairs with this overhang and serves as an extended template for the reverse transcriptase. Template switching from the RNA molecule to this primer generates a complete cDNA copy. The SMARTer technique offered by Clontech is an option available through Genohub.
Another new low input method, developed by Professor Xiaoliang Sunney Xie’s group at Harvard University and reported in the December 21, 2012 issue of Science, uses multiple annealing and looping-based amplification cycles (MALBAC). Amplification begins with a pool of random primers, each consisting of a common 27-mer sequence followed by 8 random nucleotides, that anneal evenly across the template. Raising the temperature to 65°C extends the primers into variable length amplicons, which are then copied into full length amplicons with complementary ends. The reaction temperature is then lowered to 58°C so that the full length amplicons loop on themselves, which prevents further amplification or cross hybridization. PCR is then performed using the common 27-mer as the primer, generating micrograms of DNA from as little as picograms of starting material.
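The looping step depends only on the structure of the amplicon ends, which can be sketched in a few lines of Python. This is an illustration under assumptions, not the published protocol: `COMMON_27` is a hypothetical placeholder rather than the actual MALBAC oligo, the inserts are made up, and the model ignores the random 8-mers at the amplicon ends.

```python
import random

COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Hypothetical 27-nt common sequence; a placeholder, NOT the published MALBAC oligo.
COMMON_27 = "GATTACAGATTACAGATTACAGATTAC"

def reverse_complement(seq):
    """Reverse complement of a DNA string made of A, C, G and T."""
    return seq.translate(COMPLEMENT)[::-1]

def make_primer(rng):
    """One primer from the pool: common 27-mer followed by 8 random nucleotides."""
    return COMMON_27 + "".join(rng.choice("ACGT") for _ in range(8))

def full_amplicon(insert):
    """Simplified full amplicon: common sequence at the 5' end and its
    reverse complement at the 3' end, flanking a genomic insert."""
    return COMMON_27 + insert + reverse_complement(COMMON_27)

def can_loop(amplicon, tag_len=27):
    """True if the 5' end can base-pair with the 3' end of the same strand."""
    return amplicon[:tag_len] == reverse_complement(amplicon[-tag_len:])

rng = random.Random(0)
print(make_primer(rng))

insert = "TTGACCGTAAGCTAGGCTTAACCGGATCGATTGCAT"   # made-up genomic fragment
print(can_loop(full_amplicon(insert)))   # full amplicon: ends are complementary
print(can_loop(COMMON_27 + insert))      # partial amplicon: tag on one end only
```

In this toy model only the full amplicons can close into loops, while amplicons that carry the common sequence on one end stay linear and remain available as templates.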
Techniques to elucidate the genomic contents of a single cell are just beginning to be developed. Their uses range from tracking single circulating tumor cells to mapping chromosomal segregation to understanding the human microbiota. While the importance of these techniques is clear, new protocols will have to be developed and tested to ensure accurate and complete representation of the single cell genome.
We’re interested to hear about your single cell protocols. Send them to us at email@example.com.
– Chenchik, A., Zhu, Y., Diatchenko, L., Li, R., Hill, J. & Siebert, P. (1998) Generation and use of high-quality cDNA from small amounts of total RNA by SMART PCR. In RT-PCR Methods for Gene Cloning and Analysis. Eds. Siebert, P. & Larrick, J. (BioTechniques Books, MA), pp. 305–319.
– Zong, C., Lu, S., Chapman, A. R. & Xie, X. S. (2012) Genome-wide detection of single-nucleotide and copy-number variations of a single human cell. Science 338(6114), pp. 1622–1626.