Hybrid Read Sequencing: Applications and Tools

Next-generation sequencing (Illumina) and long-read sequencing (PacBio/Oxford Nanopore) platforms each have their own strengths and weaknesses. Recent advances in single-molecule real-time (SMRT) and nanopore sequencing technologies have enabled high-quality assemblies from long but error-prone reads. However, these approaches require high long-read coverage and remain expensive. Inexpensive short-read technologies, on the other hand, produce accurate but fragmented assemblies. Combining the two has led to an improved approach known as hybrid sequencing.

Hybrid sequencing methods use high-throughput, high-accuracy short-read data to correct errors in the long reads. This approach reduces the amount of costlier long-read data required and yields more complete assemblies, including the repetitive regions. Moreover, PacBio long reads can provide reliable alignments, scaffolds, and rough detections of genomic variants, while short reads refine the alignments, assemblies, and detections to single-nucleotide resolution. The high coverage of short-read sequencing data can also be used in downstream quantitative analysis1.

Applications

De novo sequencing

As alternatives to using PacBio sequencing alone for eukaryotic de novo assemblies, error correction strategies using hybrid sequencing have also been developed.

  • Koren et al. developed the PacBio corrected Reads (PBcR) approach, which uses short reads to correct the errors in long reads2. PBcR has been applied to reads generated by a PacBio RS instrument from phage, prokaryotic and eukaryotic whole genomes, including the previously unsequenced parrot (Melopsittacus undulatus). The long-read correction approach achieved >99.9% base-call accuracy, leading to substantially better assemblies than non-hybrid sequencing strategies.
  • Also, Bashir et al. used hybrid sequencing data to assemble the two-chromosome genome of a Haitian cholera outbreak strain at >99.9% accuracy in two nearly finished contigs, completely resolving complex regions with clinically relevant structures3.
  • More recently, Goodwin et al. developed an open-source error correction algorithm, Nanocorr, specifically for hybrid error correction of Oxford Nanopore reads. They used this error correction method with complementary MiSeq data to produce a highly contiguous and accurate de novo assembly of the Saccharomyces cerevisiae genome. The contig N50 length was more than ten times greater than that of an Illumina-only assembly, with >99.88% consensus identity when compared to the reference. Additionally, this assembly offered a complete representation of the features of the genome, with correctly assembled gene cassettes, rRNAs, transposable elements, and other genomic features that were almost entirely absent from the Illumina-only assembly4.
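
As a rough illustration of the contig N50 statistic quoted for the Nanocorr assembly, the sketch below computes N50 from a list of contig lengths. The function name and the example lengths are hypothetical and chosen only for illustration; they are not taken from Goodwin et al.

```python
def n50(contig_lengths):
    """Return the N50: the length L such that contigs of length >= L
    together cover at least half of the total assembly size."""
    if not contig_lengths:
        raise ValueError("need at least one contig length")
    total = sum(contig_lengths)
    running = 0
    # Walk from the longest contig down until half the assembly is covered.
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Hypothetical contig lengths (bp), invented for illustration only.
fragmented = [100] * 8           # same total size, many short contigs
contiguous = [500, 200, 100]     # same total size, fewer long contigs

print(n50(fragmented))   # 100
print(n50(contiguous))   # 500
```

Both toy assemblies have the same total length, but the one that concentrates it in fewer, longer contigs has the higher N50.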

Transcript structure and Gene isoform identification

Besides genome assembly, hybrid sequencing can also be applied to the error correction of PacBio long reads of transcripts. Moreover, it could improve gene isoform identification and abundance estimation.

  • Along with genome assembly, Koren et al. used the PBcR method to identify and confirm full-length transcripts and gene isoforms. As the length of the single-molecule PacBio reads from RNA-Seq experiments is within the size distribution of most transcripts, many PacBio reads represent near full-length transcripts. These long reads can therefore greatly reduce the need for transcript assembly, which requires complex algorithms for short reads, and can confidently detect alternatively spliced isoforms. However, the predominance of indel errors makes analysis of the raw reads challenging. Both sets of PacBio reads (before and after error correction) were aligned to the reference genome to determine which ones matched the exon structure over the entire length of the annotated transcripts. Before correction, only 41 (0.1%) of the PacBio reads exactly matched the annotated exon structure; after correction, this rose to 12,065 (24.1%).
  • Au et al. developed a computational tool called LSC for the correction of raw PacBio reads by short reads5. Applying this tool to 100,000 human brain cerebellum PacBio subreads and 64 million 75-bp Illumina short reads, they reduced the error rate of the long reads by more than 3-fold. In order to identify and quantify full-length gene isoforms, they also developed an Isoform Detection and Prediction tool (IDP), which makes use of third-generation sequencing (TGS) long reads and second-generation sequencing (SGS) short reads6. Applying LSC and IDP to PacBio long reads and Illumina short reads of the human embryonic stem cell transcriptome, they detected several thousand RefSeq-annotated gene isoforms at full length. IDP-fusion has also been released for the identification of fusion genes, fusion sites, and fusion gene isoforms from cancer transcriptomes7.
  • Ning et al. developed an analysis method, HySeMaFi, to decipher gene splicing and estimate gene isoform abundance8. First, the method establishes the mapping relationship between the error-corrected long reads and the longest assembled contig of each corresponding gene. From the mapping data, the true splicing pattern of each gene is detected, followed by quantification of the isoforms.

Personal transcriptomes

Personal transcriptomes are expected to have applications in understanding individual biology and disease, but short read sequencing has been shown to be insufficiently accurate for the identification and quantification of an individual’s genetic variants and gene isoforms9.

  • Using a hybrid sequencing strategy combining PacBio long reads and Illumina short reads, Tilgner et al. sequenced the lymphoblastoid transcriptomes of three family members in order to produce and quantify an enhanced personalized genome annotation. Around 711,000 CCS reads were used to identify novel isoforms, and ∼100 million Illumina paired-end reads were used to quantify the personalized annotation, which cannot be accomplished by the relatively small number of long reads alone. This method produced reads representing all splice sites of a transcript for most sufficiently expressed genes shorter than 3 kb. It provides a de novo approach for determining single-nucleotide variations, which could be used to improve RNA haplotype inference10.

Epigenetics research

  • Beckmann et al. demonstrated the ability of PacBio sequencing to recover previously discovered epigenetic motifs with m6A and m4C modifications in both low-coverage and high-contamination scenarios11. They were also able to recover many motifs from three mixed strains (E. coli, G. metallireducens, and C. salexigens), even when the motif sequences of the genomes of interest overlapped substantially, suggesting that PacBio sequencing is applicable to metagenomics. Their studies suggest that hybrid sequencing would be more cost-effective than using PacBio sequencing alone to detect and accurately define k-mers for low-proportion genomes.

Hybrid assembly tools

Several algorithms have been developed that can help in the single molecule de novo assembly of genomes along with hybrid error correction using the short, high-fidelity sequences.

  • Jabba is a hybrid method that corrects long third-generation reads by mapping them onto a corrected de Bruijn graph constructed from second-generation data. It uses a pseudo-alignment approach with a seed-and-extend methodology, using maximal exact matches (MEMs) as seeds12. The tool is available here: https://github.com/biointec/jabba.
  • HALC is a high-throughput algorithm for long-read error correction. HALC aligns the long reads to short-read contigs from the same species with a relatively low identity requirement and constructs a contig graph. This tool was applied to E. coli, A. thaliana and Maylandia zebra data sets and has been shown to achieve up to 41% higher throughput than other existing algorithms while maintaining comparable accuracy13. HALC can be downloaded here: https://github.com/lanl001/halc.
  • The HYBRIDSPADES algorithm was developed for assembling short and long reads and was benchmarked on several bacterial assembly projects. HYBRIDSPADES generated accurate assemblies (even in projects with relatively low coverage by long reads), thus reducing the overall cost of genome sequencing. This method was used to demonstrate the first complete circular chromosome assembly of a genome from single cells of Candidate Phylum TM6 using SMRT reads14. The tool is publicly available on this page: http://bioinf.spbau.ru/en/spades.
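
The idea of mapping a long read onto a de Bruijn graph built from short reads can be sketched in a few lines. This is a simplified toy and not Jabba's implementation: it assumes error-free short reads, uses a very small k, and simply reports the stretches of the long read whose consecutive k-mers all occur as edges in the graph, as a stand-in for the exact-match seeds described above. All function names and sequences are hypothetical.

```python
from collections import defaultdict


def build_de_bruijn(reads, k):
    """Map each (k-1)-mer to the set of (k-1)-mers that follow it in some read."""
    graph = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].add(kmer[1:])
    return graph


def exact_match_seeds(long_read, graph, k):
    """Return half-open (start, end) intervals of the long read in which
    every k-mer is an edge of the short-read de Bruijn graph."""
    seeds = []
    start = None
    n_kmers = len(long_read) - k + 1
    for i in range(n_kmers):
        kmer = long_read[i:i + k]
        present = kmer[1:] in graph.get(kmer[:-1], ())
        if present and start is None:
            start = i                          # a new exact stretch begins
        elif not present and start is not None:
            seeds.append((start, i - 1 + k))   # the stretch ended at the previous k-mer
            start = None
    if start is not None:
        seeds.append((start, n_kmers - 1 + k))
    return seeds


# Hypothetical data: short reads tiling the sequence ACGTACGGTT,
# and a long read with one substitution (C -> G at position 5).
short_reads = ["ACGTAC", "GTACGG", "ACGGTT"]
long_read = "ACGTAGGGTT"

graph = build_de_bruijn(short_reads, k=4)
print(exact_match_seeds(long_read, graph, k=4))   # [(0, 5), (6, 10)]
```

The gap between the two seeds brackets the erroneous base; a real corrector would then search the graph for a path that joins the seeds and use it to replace the uncovered stretch.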

Due to the constant development of new long read error correction tools, La et al. have recently published an open-source pipeline that evaluates the accuracy of these different algorithms15. LRCstats analyzed the accuracy of four hybrid correction methods for PacBio long reads over three data sets and can be downloaded here: https://github.com/cchauve/lrcstats.

Sović et al. evaluated the different non-hybrid and hybrid assembly methods for de novo assembly using nanopore reads16. They benchmarked five non-hybrid assembly pipelines and two hybrid assemblers that use nanopore sequencing data to scaffold Illumina assemblies. Their results showed that hybrid methods are highly dependent on the quality of NGS data, but much less on the quality and coverage of nanopore data and performed relatively well on lower nanopore coverages. The implementation of this DNA Assembly benchmark is available here: https://github.com/kkrizanovic/NanoMark.

References:

  1. Rhoads, A. & Au, K. F. PacBio Sequencing and Its Applications. Genomics, Proteomics Bioinforma. 13, 278–289 (2015).
  2. Koren, S. et al. Hybrid error correction and de novo assembly of single-molecule sequencing reads. Nat Biotech 30, 693–700 (2012).
  3. Bashir, A. et al. A hybrid approach for the automated finishing of bacterial genomes. Nat Biotechnol 30, (2012).
  4. Goodwin, S. et al. Oxford Nanopore sequencing, hybrid error correction, and de novo assembly of a eukaryotic genome. Genome Res 25, (2015).
  5. Au, K. F., Underwood, J. G., Lee, L. & Wong, W. H. Improving PacBio Long Read Accuracy by Short Read Alignment. PLoS One 7, e46679 (2012).
  6. Au, K. F. et al. Characterization of the human ESC transcriptome by hybrid sequencing. Proc. Natl. Acad. Sci. 110, E4821–E4830 (2013).
  7. Weirather, J. L. et al. Characterization of fusion genes and the significantly expressed fusion isoforms in breast cancer by hybrid sequencing. Nucleic Acids Res. 43, e116 (2015).
  8. Ning, G. et al. Hybrid sequencing and map finding (HySeMaFi): optional strategies for extensively deciphering gene splicing and expression in organisms without reference genome. Sci. Rep. 7, 43793 (2017).
  9. Steijger, T. et al. Assessment of transcript reconstruction methods for RNA-seq. Nat. Methods 10, 1177 (2013).
  10. Tilgner, H., Grubert, F., Sharon, D. & Snyder, M. P. Defining a personal, allele-specific, and single-molecule long-read transcriptome. Proc. Natl. Acad. Sci. 111, 9869–9874 (2014).
  11. Beckmann, N. D., Karri, S., Fang, G. & Bashir, A. Detecting epigenetic motifs in low coverage and metagenomics settings. BMC Bioinformatics 15, S16 (2014).
  12. Miclotte, G. et al. Jabba: hybrid error correction for long sequencing reads. Algorithms Mol. Biol. 11, 10 (2016).
  13. Bao, E. & Lan, L. HALC: High throughput algorithm for long read error correction. BMC Bioinformatics 18, 204 (2017).
  14. Antipov, D., Korobeynikov, A., McLean, J. S. & Pevzner, P. A. hybridSPAdes: an algorithm for hybrid assembly of short and long reads. Bioinformatics 32, 1009–1015 (2016).
  15. La, S., Haghshenas, E. & Chauve, C. LRCstats, a tool for evaluating long reads correction methods. Bioinformatics (2017). doi:10.1093/bioinformatics/btx489
  16. Sović, I., Križanović, K., Skala, K. & Šikić, M. Evaluation of hybrid and non-hybrid methods for de novo assembly of nanopore reads. Bioinformatics 32, 2582–2589 (2016).

 

PacBio vs. Oxford Nanopore sequencing

Long-read sequencing developed by Pacific Biosciences and Oxford Nanopore overcomes many of the limitations researchers face with short reads. Long reads improve de novo assembly and transcriptome analysis (gene isoform identification) and play an important role in the field of metagenomics. Longer reads are also useful when assembling genomes that include large stretches of repetitive regions.

Currently, there are two long-read sequencing platforms. To help researchers choose the platform with greater utility for their application, we compare the overall instrument specifications offered by PacBio and Oxford Nanopore, as well as published applications in the next-generation sequencing space.

[Comparison table of PacBio and Oxford Nanopore instrument specifications not reproduced]

a Oxford Nanopore charges an access fee that gives users one MinION/PromethION instrument, a starter pack of consumables, certain data services, and community-based support

* Insufficient data

Although both PacBio and Oxford Nanopore generate longer reads than short-read Illumina or Ion sequencing, the higher error rate of both platforms remains an issue that needs addressing. Whereas PacBio reads a molecule multiple times to generate high-quality consensus data, Oxford Nanopore can only sequence a molecule twice. As a result, PacBio generates data with lower error rates than Oxford Nanopore. PacBio has slightly better overall performance for applications such as the discovery of transcriptome complexity and sensitive identification of isoforms. On the other hand, MinION provides higher throughput, as nanopores can sequence multiple molecules simultaneously. Hence, it is best suited for applications that require a larger amount of data9.
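
The benefit of reading the same molecule several times can be illustrated with a toy consensus. The sketch below assumes that the passes are already aligned, have equal length, and contain only substitution errors; real single-molecule reads are dominated by indels and need a proper alignment step first. The function name and the sequences are hypothetical.

```python
from collections import Counter


def majority_consensus(passes):
    """Column-wise majority vote over equal-length, pre-aligned reads."""
    if not passes:
        raise ValueError("need at least one pass")
    length = len(passes[0])
    if any(len(p) != length for p in passes):
        raise ValueError("all passes must have the same length")
    # zip(*passes) yields one tuple of bases per position.
    return "".join(Counter(column).most_common(1)[0][0] for column in zip(*passes))


# Hypothetical example: three noisy passes over the molecule ACGTACGT,
# each carrying a single substitution error at a different position.
passes = ["ACCTACGT", "ACGTACGA", "TCGTACGT"]
print(majority_consensus(passes))   # ACGTACGT
```

Because independent errors rarely hit the same position in most passes, each additional pass makes it less likely that an error survives the vote, which is why more passes per molecule give a lower consensus error rate.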

As long reads can provide large scaffolds, de novo assembly is one of the main applications of PacBio sequencing5. Though the error rate of PacBio data is higher than that of short-read Illumina or Ion sequencing, increased coverage or hybrid sequencing can greatly improve the accuracy of genome assembly. PacBio sequencing has been successfully used to finish the 100-contig draft genome of Clostridium autoethanogenum DSM 10061, a Class III genome, the most complex classification in terms of repeat content and repeat type. It has a 31.1% GC content and contains repeats, prophage, and nine copies of rRNA gene operons. Using a single PacBio library sequenced on two SMRT cells, the entire genome could be assembled de novo into a single contig. When short-read Illumina or Ion sequencing was used alone on the same genome, the assemblies contained >22 contigs, and each of them had at least four collapsed repeat regions; the PacBio assemblies had none10.

PacBio sequencing has also been used to assemble the chloroplast genome of Potentilla micrantha11 and the genomes of Saccharomyces cerevisiae, Arabidopsis thaliana and Drosophila melanogaster, using fewer contigs and less CPU time than assemblies based on Illumina sequencers12.

PacBio sequencing of PCR products can be used to improve the quality of current draft genomes by closing gaps and sequencing through hairpin structures and areas of high GC content more efficiently than Sanger sequencing13.

Pacific Biosciences has developed a protocol, Iso-Seq, for transcript sequencing. This includes library construction, size selection, sequencing data collection, and data processing. Iso-Seq allows direct sequencing of transcripts up to 10 kb without the use of a reference genome. Iso-Seq has been used to characterize alternative splicing events involved in the formation of blood cellular components14. This is essential for interpreting the effects of mutations leading to inherited disorders and blood cancers, and can be applied to design strategies to advance transplantation and regenerative medicine.

Another major application of PacBio sequencing is in epigenetics research. Recent studies demonstrate that investigation of intercellular heterogeneity in previously undetectable genome DNA modifications (such as m6A and m4C) is facilitated by the direct detection of modifications in single molecules by PacBio sequencing15.

Compared to PacBio instruments, the Oxford Nanopore MinION is small (the size of a USB thumb drive), affordable, field portable, and uses a simple library prep16. This is useful in situations such as a virus outbreak, where a mobile diagnostic laboratory can be set up using MinIONs. In remote regions such as parts of Brazil and Africa, where there are logistical issues associated with shipping samples for sequencing, MinION can provide immediate, real-time data to scientific investigators. The most notable clinical use of MinION has been the on-site analysis of Ebola samples during the viral outbreak in West Africa17,18.

The low cost of sequencing and portability of the MinION sequencer also make it a useful tool for teaching. It has been used to provide hands-on experience to students, most recently at Columbia University and the University of California Santa Cruz, where every student performed their own MinION sequencing19.

Perhaps the most ambitious MinION application is its potential to detect and identify bacteria and viruses on manned space flights. In a proof-of-concept experiment, Castro-Wallace et al. demonstrated successful sequencing and de novo assembly of a lambda phage genome, an E. coli genome, and a mouse mitochondrial genome. They observed that there was no significant difference in the quality of sequence data generated on the International Space Station and in control experiments that were performed in parallel on Earth22.

Recently, Oxford Nanopore developed a bench-top instrument, PromethION, that provides high-throughput sequencing and is modular in design. It contains 48 flow cells that can be run individually or in parallel. The PromethION flow cells contain 3000 channels each, and produce up to 40 Gb of data.

 

References:

  1. Pacific Biosciences – AllSeq. Available at: http://allseq.com/knowledge-bank/sequencing-platforms/pacific-biosciences/.
  2. Jain, M., Olsen, H. E., Paten, B. & Akeson, M. The Oxford Nanopore MinION: delivery of nanopore sequencing to the genomics community. Genome Biol. 17, 239 (2016).
  3. Lu, H., Giordano, F. & Ning, Z. Oxford Nanopore MinION Sequencing and Genome Assembly. Genomics. Proteomics Bioinformatics 14, 265–279 (2016).
  4. Jain, M. et al. Nanopore sequencing and assembly of a human genome with ultra-long reads. bioRxiv (2017).
  5. Jain, M. et al. MinION Analysis and Reference Consortium: Phase 2 data release and analysis of R9.0 chemistry [version 1; referees: awaiting peer review]. F1000Research 6, (2017).
  6. Rhoads, A. & Au, K. F. PacBio Sequencing and Its Applications. Genomics, Proteomics Bioinforma. 13, 278–289 (2015).
  7. MinION. Available at: https://nanoporetech.com/products/minion.
  8. PromethION Early Access Programme. Available at: https://nanoporetech.com/community/promethion-early-access-programme.
  9. Oxford Nanopore in 2016. Available at: http://blog.booleanbiotech.com/nanopore_2016.html.
  10. Weirather, J. L. et al. Comprehensive comparison of Pacific Biosciences and Oxford Nanopore Technologies and their applications to transcriptome analysis. F1000Research 6, 100 (2017).
  11. Brown, S. D. et al. Comparison of single-molecule sequencing and hybrid approaches for finishing the genome of Clostridium autoethanogenum and analysis of CRISPR systems in industrial relevant Clostridia. Biotechnol. Biofuels 7, 40 (2014).
  12. Ferrarini, M. et al. An evaluation of the PacBio RS platform for sequencing and de novo assembly of a chloroplast genome. BMC Genomics 14, 670 (2013).
  13. Berlin, K. et al. Assembling large genomes with single-molecule sequencing and locality-sensitive hashing. Nat Biotech 33, 623–630 (2015).
  14. Zhang, X. et al. Improving genome assemblies by sequencing PCR products with PacBio. Biotechniques 53, 61–62 (2012).
  15. Chen, L. et al. Transcriptional diversity during lineage commitment of human blood progenitors. Science 345, (2014).
  16. Feng, Z., Li, J., Zhang, J.-R. & Zhang, X. qDNAmod: a statistical model-based tool to reveal intercellular heterogeneity of DNA modification from SMRT sequencing data. Nucleic Acids Res. 42, 13488–13499 (2014).
  17. Jain, M., Olsen, H. E., Paten, B. & Akeson, M. Erratum to: The Oxford Nanopore MinION: delivery of nanopore sequencing to the genomics community. Genome Biol. 17, 256 (2016).
  18. Quick, J. et al. Real-time, portable genome sequencing for Ebola surveillance. Nature 530, 228–232 (2016).
  19. Hoenen, T. et al. Nanopore sequencing as a rapidly deployable Ebola outbreak tool. Emerg. Infect. Dis. 22, 331–334 (2016).
  20. Citizen Sequencers: Taking Oxford Nanopore’s MinION to the Classroom and Beyond – Bio-IT World. Available at: http://www.bio-itworld.com/2015/12/9/citizen-sequencers-taking-oxford-nanopores-minion-classroom-beyond.html.
  21. Castro-Wallace, S. L. et al. Nanopore DNA Sequencing and Genome Assembly on the International Space Station. bioRxiv (2016).

AGBT 2014 – Summary of Day 1

The first day of the Advances in Genome Biology & Technology (AGBT) meeting kicked off with an introduction by Eric Green, Director of the National Human Genome Research Institute. He announced that this 15th annual meeting was the largest ever with 850 expected to attend. The opening plenary session certainly did not look like 850 people in attendance. Winter Storm Pax wreaked havoc on flights coming in from Atlanta and other cities, resulting in several speaker and general attendee cancellations.

The plenary session was scheduled to begin with talks by Aviv Regev, Jeanne Lawrence, Wendy Winckler and Valerie Schneider. Jeanne Lawrence couldn’t make it, which was a shame, particularly since she gave a brilliant talk at ASHG on using a single gene, XIST, to shut down the extra copy of chromosome 21 in Down syndrome. This work was nicely summarized in a publication that came out this summer titled: Translating dosage compensation to trisomy 21.

Aviv Regev and Wendy Winckler’s talks were subject to a blog/tweet embargo (it was unclear whether Regev’s talk was completely under embargo or only the last half, so we’re playing it safe and not discussing it here), leaving Valerie Schneider’s presentation the only one that was tweeted or written about. This instantly created great angst among those attending the lectures, those stuck in airports en route to AGBT and those at home waiting for in-depth coverage.

Single-cell sequencing, considered the “method of the year” by Nature Methods, was the basis of the opening lecture. Aviv Regev offered an excellent view of the dendritic cell network based on cyclical perturbations and variations between single cells. The first half of Regev’s presentation, titled “Harnessing Variation Between Single Cells to Decipher Intra and Intercellular Circuits in Immune Cells”, was largely covered by her publication in April, “Single-cell transcriptomics reveals bimodality in expression and splicing in immune cells”.

The second talk, by Wendy Winckler, could not be discussed or tweeted, according to Winckler, courtesy of Novartis’s communications department. The title of her presentation, “Next Generation Diagnostics for Precision Cancer Medicine”, wasn’t revealing either. To get an idea of what she’s up to and the direction of her lecture, you can read these recent publications.

The final talk, by Valerie Schneider, titled “Taking advantage of GRCh38”, began with an analogy to an unwanted pair of socks one receives for Christmas that ends up being used and finally really liked. “It was time for an update….whether or not it was on your wish list”. We were reminded that centromeres are specialized chromatin structures important for cell division, but because of their repetitive regions, they are not represented in reference assemblies. Previous versions of the human reference assembly had centromeres represented by a 3 Mb gap. The latest assembly, GRCh38, incorporates centromere models generated from whole genome shotgun reads produced as part of the Venter sequencing project. Since there are two copies of each centromere for each autosome, these centromere models represent an average of two copies. She concluded her presentation by urging users to switch now: http://www.ncbi.nlm.nih.gov/genome/tools/remap.

After a short break from the talks, the closing reception sponsored by Roche began outside. Halfway through, there was a brief yet sudden Florida thundershower that sent the entire AGBT community scurrying indoors for shelter. That was okay, though, because the conversations just continued indoors. Looking forward to tomorrow morning’s lectures. Several of the ones we’ve highlighted will be up.

 

3 Top Factors Researchers Consider When Selecting an NGS Provider

At Genohub, not only do we seek feedback from researchers, our development methodology is almost entirely based on this feedback. We receive it via website forms as well as routine one-on-one conversations with some of the top researchers using next generation sequencing for their projects. Through this data and interaction, certain trends have begun to emerge which may be useful to an NGS provider seeking additional projects. This list is not based on a controlled experiment; however, countless conversations indicate that these factors are extremely important:

  1. Turnaround time – this one is a toss up when compared with price, but we typically find turnaround time to be among the leading factors in a researcher’s decision to select an NGS provider. We have heard quite a few stories of researchers seeing turnaround times over several months for library prep and sequencing.
  2. Price – while this is one of the biggest factors for researchers, it must be qualified with established trust which is the next major factor.
  3. Trust – this one is a biggie for many researchers and often a non-starter if not established. The main reason is that researchers are hesitant to ship their precious samples (e.g., human brain tissue) to an NGS provider for what is often costly sequencing if they are not confident in the provider’s abilities. Researchers have told us some of the things they look for that help build their confidence:
    • Referrals & Reviews – researchers seek out colleagues who have done similar projects and look for recommendations. Word of mouth is one of the biggest methods researchers rely on to select an NGS provider.
    • Publications – providers who are listed in publications involving similar projects.
    • What kind of QC will be run on the sample.
    • Overall experience indicators such as time in business and volume of samples regularly handled.
    • Data and sample security.
    • Location – this factor is considerably important if previous trust is not established. Some researchers have absolutely no problem shipping samples across the globe, while others might physically drive their samples to a local provider to ensure sample integrity.

We would love to hear your feedback on this topic, whether you are an NGS provider or a researcher actively using next generation sequencing. What other decision-driving criteria have you found as a provider, or what are some other factors important to you as a researcher?

In a Nutshell: Life Tech Exome Certified Service Provider Program

Life Technologies announced yesterday that they launched the Ion AmpliSeq Exome Certified Service Provider Program.

What the program is in a nutshell:

  • Goals: Offer a network of next gen sequencing providers able to help researchers get a high quality exome sequence at a reduced cost with fast turnaround times and low amounts of input material
  • Exome sequencing inputs: as little as 50ng of customer DNA
  • Library kit used: Ion AmpliSeq Exome kit
  • NGS Instrument used: Ion Proton
  • Exome sequencing outputs: high quality data, which of course can be used with Ion Reporter Software for mutation validation, annotation, and reporting

The Service Provider Program is intended to fill exome sequencing market demand, which Life Tech argues has been under-serviced, with exome sequencing currently going for $1,000+, turnaround times of up to 8 weeks, and input requirements of up to 3 µg of DNA. Dr. Candace Johnson, Deputy Director and the Wallace Chair of Translational Research at Roswell Park Cancer Institute, states “Exome sequencing will be central to discoveries made in clinical research”. If the Exome CSP delivers as promised, it could have a major impact in accelerating discoveries made in clinical research.

For more information on the Life Tech Provider Program please see the entire press release.

Targeted Resequencing (TPS/WES) Tops Next Gen Sequencing Survey

Oxford Gene Technology (an NGS provider currently listed on Genohub) recently presented the results of their next gen sequencing survey, which showed targeted resequencing to be the top use for next generation sequencing. The results are based on a survey of 596 researchers who responded regarding their current and expected use of NGS services. Compared with whole genome sequencing, the popularity of targeted resequencing is probably attributable mostly to its lower cost. This infographic depicts the results:

OGT NGS Survey Results

Other interesting results point to a general data problem with 38% of respondents saying they lack trust in bioinformatics data. Bioinformatics also leads the field when researchers were asked about the biggest barrier to NGS usage (see below).

Barriers to NGS Usage

Undoubtedly this presents an immense opportunity for the bioinformatics sector to increase confidence in data accuracy and interpretation which could have a positive impact on the use of next gen sequencing as a whole.

You can find many more interesting survey results on the excellent infographic titled Oxford Gene Technology – NGS Survey 2013.

First XPRIZE Cancelled Due To Unexpected Innovation in Next Gen Sequencing

For the first time ever, an XPRIZE has been cancelled. The reason: unexpected innovation in next generation sequencing. The Archon Genomics XPRIZE, announced in 2006, had promised to award $10 million to the first team that was able to accurately sequence 100 whole human genomes at a cost of $10,000 or less per genome in a short period of time. The competition was cancelled because XPRIZE CEO Peter Diamandis and his team felt it was not serving its intended purpose of incentivizing technological innovation in gene sequencing.

As stated by Peter Diamandis, “Every XPRIZE is carefully designed to address a market failure and hopefully create a new industry to achieve breakthroughs and solutions once thought to be impossible.” Although the Archon Genomics XPRIZE was conceived according to these criteria, the XPRIZE team felt that innovation in gene sequencing has been progressing independently of the XPRIZE incentive, therefore voiding the need for the competition.

The rapid innovation in next generation sequencing has caused sequencing times to decrease and prices to plummet to around $5,000 per genome. The XPRIZE team feels as if the targets laid out by the competition will be met in the very near future with or without their incentive, and have opted to cancel the XPRIZE and return the money to sponsors. The announcement by Peter Diamandis can be read in its entirety on the Huffington Post.

The logic behind the XPRIZE cancellation seems clear; however, it remains to be seen what backlash, if any, arises from scientists who may have devoted considerable time and effort to meeting this challenge. Although next gen sequencing instruments are developed by large companies such as Illumina ($1.15B revenue), which may not be driven by a competition like the XPRIZE, innovation in this field must also be attributed to the wider research community, a team from which might conceivably have won the competition independent of any large commercial enterprise. In fact, in his cancellation announcement, Peter Diamandis thanks George Church and the Wyss Institute at Harvard for registering for the competition. Does the XPRIZE lose some of its ability to incentivize future competitions because of this cancellation? We welcome your comments on the matter.