Scientists Develop Method to Rapidly Analyze High Throughput Sequencing Data

Scientists in Singapore have developed a method to rapidly analyze high throughput sequencing data. The method incorporates a mathematical technique that has long been used in fields outside biotechnology, such as radar, electrical engineering, and mobile telephony.

While high throughput sequencing has revolutionized molecular biology, the analysis of sequencing results remains a bottleneck. The sequencing process generates a massive amount of data that must be analyzed to separate signal from noise. Traditionally, researchers have treated each type of sequencing data as unique, requiring its own set of analytical methods to decipher. Because of this, it can take a long time to turn sequencing data into meaningful results.

As published in Nature Biotechnology, the team was able to produce meaningful results from a variety of high throughput sequencing data sets using a single analytical technique, the pre-whitening matched filter. Across several sequencing-based functional profiles, the filter produced results that were more accurate than those of assay-specific analysis methods.
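For readers unfamiliar with the technique, here is a minimal sketch of the general idea behind a pre-whitening matched filter. It is not the authors' implementation. It assumes the background noise is correlated between neighboring positions in a simple first-order way (the `rho` parameter), and the function names, the template and the coverage values are all hypothetical. The signal and the expected peak shape are first "whitened" to remove that correlation, and the whitened template is then slid along the whitened signal to score every position.

```python
def prewhiten(values, rho):
    """Whiten first-order correlated noise with y[t] = x[t] - rho * x[t-1]."""
    return [values[0]] + [values[t] - rho * values[t - 1]
                          for t in range(1, len(values))]


def matched_filter(signal, template):
    """Correlate the template with the signal at every offset.

    Scores are divided by the template's norm so that they are
    comparable across templates of different size.
    """
    norm = sum(v * v for v in template) ** 0.5
    width = len(template)
    return [
        sum(signal[start + j] * template[j] for j in range(width)) / norm
        for start in range(len(signal) - width + 1)
    ]


def prewhitened_matched_filter(signal, template, rho):
    """Whiten both the signal and the template, then apply the matched filter."""
    return matched_filter(prewhiten(signal, rho), prewhiten(template, rho))


# Hypothetical read-depth profile: a flat background with one peak whose
# shape equals the template, starting at position 4.
template = [1.0, 2.0, 1.0]
coverage = [0.0, 0.0, 0.0, 0.0, 1.0, 2.0, 1.0, 0.0, 0.0, 0.0]
scores = prewhitened_matched_filter(coverage, template, rho=0.5)
best_offset = max(range(len(scores)), key=scores.__getitem__)
print(best_offset)  # prints 4, the position where the peak starts
```

The same two functions can be reused for any assay by swapping in a different template, which is the appeal of a single, uniform analysis step.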

This latest technique holds promise for reducing the time and cost of sequencing analysis, to the benefit of patients whose care could be improved by sequencing.

The article can be found at: Kumar et al. (2013) Uniform, Optimal Signal Processing Of Mapped Deep-Sequencing Data.


University of Rochester Genomics Research Center Adds High Throughput Sequencing Services to Genohub

We want to welcome the University of Rochester Genomics Research Center to the Genohub family. The high throughput sequencing services listed by the University of Rochester will add more available options to the Genohub next generation sequencing market.

The University of Rochester added the following instruments and library prep options:

  • Instruments: Illumina HiSeq
  • Library prep: a host of Illumina options
    • Directional RNA (rRNA-depleted)
    • Directional RNA (polyA-selected)
    • Small RNA (microRNA)
    • ChIP
    • Mate Pair (1-6 kb)
    • DNA
    • Bisulfite
    • RNA (polyA-selected)

We look forward to matching the University of Rochester with the most compatible high throughput sequencing orders for their lab.

High Throughput Sequencing Locations

If you’ve ever wanted to know where high throughput sequencing providers and the sequencers themselves are located worldwide, there’s an excellent resource to help you do just that. If you haven’t come across it yet, Omicsmaps is a crowd-sourced map that not only lists service providers globally but also lets users view the locations of individual sequencers. The site is aptly titled “Next Generation Genomics: World Map of High-Throughput Sequencers”. The home page consists of a map, built on Google Maps, that overlays the locations of sequencing providers and sequencers. Below is a screenshot of this view:

Omicsmaps – High Throughput Sequencing Locations

Users can also filter by country. Also of interest is the “Machine Statistics” link in the top right-hand corner of the main screen, which displays the total number of facilities per country, sequencers per provider, and many other interesting factoids. Here are a few highlights:

  • Largest genome center in the US: the Broad Institute of MIT and Harvard takes the cake on this one with a whopping 101 sequencing machines
  • The #1 high throughput sequencing platform in use today: this badge belongs to the Illumina HiSeq 2000 with 739 machines globally
  • There are more sequencing machines in the US than in the next 6 countries on this list combined!

Each provider’s entry also includes a link to its website, which some might find useful. Another great way to get in touch with sequencing providers is through the Genohub shopping interface. Some of the best sequencing providers have already signed up with Genohub to list their equipment along with the associated services, pricing, and additional value they offer. We encourage you to give it a try!

Longhorn Vaccines and Diagnostics Adds High Throughput Sequencing Services to Genohub

We want to welcome Longhorn Vaccines and Diagnostics to the Genohub family. The high throughput sequencing services listed by Longhorn will add more available options to the Genohub next generation sequencing market.

Longhorn Vaccines added the following instruments and library prep options:

  • Instruments: Ion PGM 314 Chip and Ion PGM 318 Chip
  • Library prep: Ion Amplicon and Ion total RNA

We look forward to matching Longhorn Vaccines with the most compatible high throughput sequencing orders for their lab.

Dr Luke Daum and his team at Longhorn Diagnostics also provided us with some excellent feedback on the Genohub platform. At Genohub we strive to listen closely to our clients and to keep improving the experience and service we provide based directly on their feedback. We look forward to incorporating the Longhorn team’s suggestions into future Genohub iterations.

Sequencing Design Part I: Replication, Randomization and Multiplexing


Replication

Replicates are essential in any biological experiment, and high throughput sequencing is no exception. Samples are subject to variation, which makes biological replicates important for establishing statistical significance and identifying sources of variation. Despite the temptation to cut back on replicates to reduce cost, it’s important to remember that many factors can cause a sequencing run or sample to fail. If you don’t have sufficient replicates, you may have to repeat your sequencing run. In general, we recommend at least 4 biological replicates for every experiment.


Randomization

Randomization is the process of assigning biological samples at random to the different groups within an experiment. This reduces bias by equalizing independent variables that have not been accounted for in the experimental design. Randomization reduces instrument effects, systematic bias, and the occurrence and impact of confounding factors (operational, procedural and personnel-related). The two main sources of variation that contribute to confounding are 1) library effects, which arise during reverse transcription and amplification, and 2) subunit effects, such as poor base calling or bad sequencing cycles, which are specific to a sequencing subunit (lanes [Illumina and SOLiD], chips [Ion], plates [Roche 454]). We recommend randomizing your samples by making sure each sequencing subunit contains samples from both control and experimental groups. This can be done by barcoding or indexing your samples to allow for multiplexing.
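The recommendation above can be sketched in a few lines of Python. This is only an illustration under simple assumptions: the function name `assign_to_subunits`, the sample names and the fixed `seed` are hypothetical, and each group is assumed to have at least as many samples as there are subunits. Samples are shuffled within each group and then dealt out in turn, so every lane, chip or plate receives samples from both groups.

```python
import random
from collections import defaultdict


def assign_to_subunits(sample_groups, n_subunits, seed=None):
    """Randomly assign samples to sequencing subunits, balanced by group.

    sample_groups maps a sample name to its group (e.g. "control").
    Returns a dict mapping subunit number (1-based) to a list of samples.
    """
    rng = random.Random(seed)
    by_group = defaultdict(list)
    for sample, group in sample_groups.items():
        by_group[group].append(sample)

    subunits = {number: [] for number in range(1, n_subunits + 1)}
    for group in sorted(by_group):
        members = by_group[group]
        if len(members) < n_subunits:
            raise ValueError(f"group {group!r} has fewer samples than subunits")
        rng.shuffle(members)
        # Deal the shuffled samples round-robin so that every subunit
        # receives samples from this group.
        for position, sample in enumerate(members):
            subunits[position % n_subunits + 1].append(sample)
    return subunits


# Hypothetical experiment: 4 control and 4 treated replicates on 2 lanes.
samples = {f"control_{i}": "control" for i in range(1, 5)}
samples.update({f"treated_{i}": "treated" for i in range(1, 5)})
layout = assign_to_subunits(samples, n_subunits=2, seed=1)
for lane, members in layout.items():
    print(lane, sorted(members))
```

Fixing the seed makes the layout reproducible, which is handy when the assignment has to be recorded alongside the experiment.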


Multiplexing

DNA (or cDNA fragments made from RNA) can be labelled with sample-specific sequences, or barcodes, that allow multiple samples to be included in the same sequencing reaction. The barcodes allow each read to be assigned to the correct sample after the sequencing run is complete. Multiplexing can be used to create balanced, pooled experimental designs. If you have 8 samples that require the sequencing output of 3 Illumina lanes, subunit effects can be eliminated by multiplexing all 8 samples into a single pool and loading that pool into each of those lanes. All subunit (lane) effects will then be the same for each sample. Multiplexing more samples also avoids the index-read problems associated with low-plex pools. A low-plex pool can produce no signal in one of the color channels during a cycle of the index read. Image registration might then fail, and no base will be called for that cycle. If a base isn’t called, the samples cannot be demultiplexed.
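The color channel issue can be checked before pooling. The sketch below assumes a four-channel Illumina instrument on which A and C are read in the red channel and G and T in the green channel. That mapping is an assumption that differs between chemistries, so check your instrument’s documentation. The function name and the index sequences are hypothetical. The check flags every index-read cycle in which the pooled indexes would leave one of the two channels without signal.

```python
# Assumed channel mapping for a four-channel instrument; adjust as needed.
RED_BASES = set("AC")
GREEN_BASES = set("GT")


def unbalanced_cycles(indexes):
    """Return the 1-based index-read cycles lacking signal in a channel.

    indexes is a list of equal-length index (barcode) sequences that
    would be pooled together in one lane.
    """
    if len({len(index) for index in indexes}) != 1:
        raise ValueError("all indexes must have the same length")
    flagged = []
    # zip(*indexes) yields, for each cycle, the bases read across the pool.
    for cycle, bases in enumerate(zip(*indexes), start=1):
        present = set(bases)
        if not (present & RED_BASES) or not (present & GREEN_BASES):
            flagged.append(cycle)
    return flagged


# Hypothetical two-sample pool: most cycles light up only one channel.
print(unbalanced_cycles(["ATCACG", "CGATGT"]))  # prints [1, 2, 3, 6]

# Hypothetical pool that is balanced at every cycle.
print(unbalanced_cycles(["ACGT", "GTAC"]))  # prints []
```

An empty result means every cycle of the index read has at least one base in each channel, which is the situation a higher-plex pool tends to produce naturally.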

To conclude, the best way to ensure reproducibility is to include independent biological replicates that are randomly assigned to sequencing subunits (flow cell lanes, chips or plates). This can be done by multiplexing your samples using sample indices or barcodes. Multiplexing by adding a barcode during the ligation step of library prep will eliminate two confounding factors in sequencing: 1) library (amplification) effects and 2) subunit effects.

If you’re new to high throughput sequencing and have questions about how you should design your sequencing run, email us to take advantage of our free consultation. At Genohub we’re always happy to discuss your sequencing project, regardless of whether you use our service.

Genohub Now Open to NGS Service Providers

Demand for quick and accurate high throughput sequencing has never been greater. According to a recent Frost and Sullivan market analysis, Next Generation Sequencing (NGS) services will grow at a compound annual rate of 28% from 2011 to 2016. Based on this estimate, we should expect a double-digit increase in the number of service providers entering the marketplace each year. Such growth will exacerbate current challenges in efficiently accessing and utilizing NGS resources, including the lack of an effective interface for customers and service providers to work with each other. After several years of working in this industry, we’re pleased to launch Genohub, an online sequencing marketplace, to address this and several other industry issues:

Discoverability and Accessibility

With over 5 commercial sequencing platforms, 3 times that number of sequencing instruments, numerous read type combinations and more than 20 different library preparation methods, conveying information to sequencing clients can be a formidable task, one that will only become harder as new sequencing technology and library prep methods are introduced. Additionally, non-standard service descriptions, prices and deliverables make it difficult for customers to decide whether a particular sequencing facility can handle their samples. While this may not be a problem for facilities that deal with just a handful of repeat clients, it prevents services from being easily discovered and accessed by a much larger number of prospective customers. Genohub solves this problem by allowing service providers to efficiently and precisely advertise NGS services in a format that’s easy to sort, filter and search. Our software automatically offers services to customers based on their experimental needs. In doing so, we’re providing the infrastructure to make sequencing services more accessible to the larger sequencing market.


We know that spending time writing quotes for customers who are comparing prices is tedious and frustrating. Genohub allows customers to compare and order services immediately, review provider-specific instructions, enter details of their experiment and place the order right there on the site. This not only saves time, but also allows providers instant access to a large NGS customer base. By the way, service providers always have the option not to accept a customer’s order. Genohub provides both the marketplace and the freedom to make final risk-free decisions on all orders.

Project Communication

The provider and customer typically need to communicate with one another at multiple points over the course of a project. This includes the provider requesting more information from the customer about the order, communicating unforeseen handling or quality issues with the customer’s samples, communicating the order status, etc. Genohub streamlines these interactions with a provider/customer interface that allows open and documented communication, along with the ability to update order status. We’ve developed “single-click” order updates for service providers to easily convey information to NGS customers who would otherwise inquire several times by phone or email.

Join us! During public beta, we are inviting NGS service providers to join our site, list their services and take orders. See our overview page for a quick description of how the site works or go straight to sign up. As always, please don’t hesitate to contact us with questions at