NextSeq, HiSeq or MiSeq for Low Diversity Sequencing?

Low diversity libraries, such as those from amplicons and those generated by restriction digest, can suffer from Illumina focusing issues, a problem not found with random fragment libraries (genomic DNA). Illumina’s real-time analysis software uses images from the first 4 cycles to determine cluster positions (X,Y coordinates for each cluster on a tile). With low diversity samples, most clusters carry the same base in a given cycle, so color intensity is not evenly distributed across the four channels, which causes a phasing problem. The result tends to be high phasing values and data quality that deteriorates quickly.
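To make "low diversity" concrete, here is a minimal Python sketch that tabulates base composition over the first 4 cycles of a set of reads. The function name and the toy reads are illustrative assumptions, not part of any Illumina software; a real check would read sequences from a FASTQ file.

```python
from collections import Counter

def per_cycle_base_fractions(reads, n_cycles=4):
    """Return, for each of the first n_cycles, the fraction of A/C/G/T across reads."""
    fractions = []
    for cycle in range(n_cycles):
        # Count the base called at this cycle in every read long enough to have one.
        counts = Counter(read[cycle] for read in reads if len(read) > cycle)
        total = sum(counts.values())
        fractions.append(
            {base: (counts.get(base, 0) / total if total else 0.0) for base in "ACGT"}
        )
    return fractions

# Hypothetical amplicon reads: every read starts with the same primer bases.
amplicon_reads = ["GTGC" + tail for tail in ("AAGT", "CCGA", "TTAC", "GGCA")]
# Hypothetical random-fragment reads: starting bases vary from read to read.
random_reads = ["ACGTAAGT", "CGTACCGA", "GTACTTAC", "TACGGGCA"]

amplicon_profile = per_cycle_base_fractions(amplicon_reads)
random_profile = per_cycle_base_fractions(random_reads)

for cycle, (amp, rnd) in enumerate(zip(amplicon_profile, random_profile), start=1):
    print(f"cycle {cycle}: amplicon={amp} random={rnd}")
```

In the amplicon set a single base accounts for all calls in each of the first 4 cycles, while the random-fragment set is split evenly across the four bases. That skew is what the cluster detection step struggles with.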

Traditionally this problem is solved in two ways:

1)      ‘Spiking in’ a higher diversity sample such as PhiX (a small viral genome used to enable quick alignment and estimation of error rates) into your library. This increases the diversity at the beginning of your read and evens out the intensity distribution across all four channels. Many groups spike in as much as 50% PhiX in order to achieve a more diverse sample. The disadvantage is that you lose 50% of your reads to a sample you were never interested in sequencing.

2)      Other groups have designed amplicon primers with a series of random ‘N’ bases (25% A, 25% T, 25% G, 25% C) upstream of their gene target. In combination with a PhiX spike-in, this also helps to increase color diversity. The disadvantage is that these extra bases cut into your desired read length, which can be problematic when you are trying to conserve cycles to sequence a 16S variable domain.
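The cost of each workaround is simple arithmetic, sketched below in Python. The run size, read length and spacer length are hypothetical numbers chosen for illustration; only the 50% and 5% PhiX fractions come from this post.

```python
def usable_reads(total_reads, phix_fraction):
    """Reads left for the library of interest after a PhiX spike-in."""
    if not 0.0 <= phix_fraction < 1.0:
        raise ValueError("phix_fraction must be in [0, 1)")
    return int(round(total_reads * (1.0 - phix_fraction)))

def usable_cycles(read_length, n_spacer_bases):
    """Cycles left for the gene target after random 'N' bases at the start of the read."""
    if not 0 <= n_spacer_bases <= read_length:
        raise ValueError("n_spacer_bases must be between 0 and read_length")
    return read_length - n_spacer_bases

total = 20_000_000  # hypothetical number of reads passing filter in one run

reads_at_50_percent = usable_reads(total, 0.50)
reads_at_5_percent = usable_reads(total, 0.05)
target_cycles = usable_cycles(250, 6)  # hypothetical 250 bp read with 6 'N' bases

print(reads_at_50_percent, reads_at_5_percent, target_cycles)
```

With these assumed numbers, a 50% spike-in leaves half of the run for your library, a 5% spike-in leaves 95%, and a 6-base spacer costs 6 cycles of every read.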

Last year, Illumina released a new version of their control program that included updated MiSeq Real Time Analysis (RTA) software that significantly improves the data quality of low diversity samples. This included 1) improved template generation and higher sensitivity template detection of optically dense and dark images, 2) a new color matrix calculation that is performed at the beginning of read 1, 3) using 11 cycles to increase diversity, and 4) new optimizations to phasing and pre-phasing corrections for each cycle and tile to maximize intensity data. With the software update and as little as 5% PhiX spike-in, you can now sequence low diversity libraries and expect significantly better MiSeq data quality.

Other instruments, including the HiSeq and GAIIx, still require at least 20-50% PhiX and are less suited for low diversity samples. If you must use a HiSeq for your amplicon libraries, take the following steps:

1)      Reduce your cluster density by 50-80% to reduce overlapping clusters

2)      Use a high amount of PhiX spike-in (up to 50% of the total library)

3)      Use custom primers with a random sequence to increase diversity. Alternatively, intentionally concatemerize your amplicons and fragment them to increase base diversity at the start of your reads.
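The first two steps compound, which is worth seeing in numbers before committing to a HiSeq run. The Python sketch below assumes that output scales linearly with cluster density, which is a simplification of my own rather than an Illumina specification.

```python
def fraction_of_normal_output(density_reduction, phix_fraction):
    """Fraction of a normal run's reads left for the amplicon library.

    Assumes output scales linearly with cluster density (a simplification).
    """
    for name, value in (("density_reduction", density_reduction),
                        ("phix_fraction", phix_fraction)):
        if not 0.0 <= value < 1.0:
            raise ValueError(f"{name} must be in [0, 1)")
    return (1.0 - density_reduction) * (1.0 - phix_fraction)

# The ranges quoted above: 50-80% lower cluster density, up to 50% PhiX.
best_case = fraction_of_normal_output(0.50, 0.50)
worst_case = fraction_of_normal_output(0.80, 0.50)

print(f"best case:  {best_case:.0%} of normal output")
print(f"worst case: {worst_case:.0%} of normal output")
```

Under that assumption, the amplicon library ends up with roughly 10-25% of what the same lane would normally deliver.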

The NextSeq 500, released in March of 2014, uses a two-channel SBS sequencing process, likely making it even less suited for low diversity amplicons. As of 4/2014, Illumina has not performed significant validation or testing using low diversity samples on the NextSeq 500. The NextSeq 500 is not expected to perform better than the HiSeq for these sample types.

So, in conclusion, the MiSeq is currently still the best Illumina instrument for sequencing samples of low diversity.

100 Gb of Data per Day – NextSeq 500 Sequencing Services Now Available on Genohub

Access to the NextSeq 500, Illumina’s first high throughput desktop sequencing instrument, is now available on Genohub. While not the highest throughput instrument on the market, it is one of the fastest, with up to a 6x increase in bases read per hour (compared to HiSeq). The instrument is ideally suited for those who need a moderate amount of sequencing data (more than a MiSeq run, less than a HiSeq run) in a short amount of time. We expect the highest interest to be centered around targeted sequencing (exome or custom regions) and fast RNA profiling. For exome studies, you can run between 1-12 samples in a single run and get back 4 Gb at a 2×75 or 5 Gb at a 2×100 read length. If you’re interested in RNA profiling at 10M reads per sample, you can multiplex between 12-36 samples together in a single run. A 1×75 cycle run takes as few as 11 hours to complete and 2×150 runs take ~29 hours.
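If you want to translate these yield figures into read counts when planning a run, the Python sketch below does the conversion. It assumes 1 Gb means 10^9 bases and that the 4 Gb and 5 Gb figures are per sample; the function names are illustrative only.

```python
def fragments_for_yield(gigabases, read_length, paired=True):
    """Number of fragments (read pairs if paired) needed to reach a yield in Gb."""
    bases_per_fragment = read_length * (2 if paired else 1)
    return gigabases * 1e9 / bases_per_fragment

def total_reads_needed(reads_per_sample, n_samples):
    """Total reads a run must deliver to cover every multiplexed sample."""
    return reads_per_sample * n_samples

# Exome figures from the text: 4 Gb at 2x75 or 5 Gb at 2x100.
pairs_2x75 = fragments_for_yield(4, 75)
pairs_2x100 = fragments_for_yield(5, 100)

# RNA profiling figures from the text: 10M reads per sample, 12-36 samples.
rna_low = total_reads_needed(10_000_000, 12)
rna_high = total_reads_needed(10_000_000, 36)

print(f"2x75:  {pairs_2x75 / 1e6:.1f}M read pairs")
print(f"2x100: {pairs_2x100 / 1e6:.1f}M read pairs")
print(f"RNA profiling: {rna_low / 1e6:.0f}M to {rna_high / 1e6:.0f}M reads per run")
```

Under these assumptions both exome options work out to about 25-27M read pairs, and the RNA profiling range corresponds to 120M-360M reads per run.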

You can order NextSeq 500 sequencing services today and expect to receive data back in 3-4 days! Prices for 1 lane start at $2,250. Start your search here and use our helpful filters to narrow down your choices.

After you’ve identified the service you need, communicate your questions directly to the service provider. We’ll make sure you get a fast response. Genohub also takes care of billing and invoicing, making domestic and international ordering a breeze. We also have an easy-to-use project management interface to keep communication and sample specification data in one place.

If you’re not familiar with NextSeq technology or how best this instrument can be applied to your samples, take advantage of our complimentary consultation service. We can help with your sequencing project design and make recommendations as to which sequencing service would be best suited for your experiment.

Last month we announced the availability of HiSeq X Ten services on Genohub.

As an efficient online market for NGS services, Genohub increases your access to the latest instrumentation and technology. You don’t have to shell out $250K or $10M for a NextSeq or HiSeq X Ten when access to professional services is right at your fingertips!