With new instrumentation, cluster chemistries, software updates, and continuously updated library preparation reagents, accurately monitoring sequencing run quality has become increasingly difficult. In a recent paper, Manley et al. (2016) describe an open-source tool that generates the Percent Perfect Reads (PPR) plot to monitor base quality.
PPR relies on alignment to PhiX and calculates the percentage of reads with 0–4 mismatches. A PPR plot shows, cycle by cycle, the percentage of reads with each number of mismatches. The plot itself is not new: it was introduced with the original Genome Analyzer and retired in 2014.
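The calculation described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation (the PPR program itself is written in Perl and R). It assumes that each read has already been aligned to PhiX without gaps and paired with the reference bases it covers, and that mismatches are counted cumulatively through each cycle. The function name `ppr_by_cycle` and its parameters are hypothetical.

```python
from typing import Dict, List


def ppr_by_cycle(
    reads: List[str],
    references: List[str],
    max_mismatches: int = 4,
) -> Dict[int, List[float]]:
    """Return, for each mismatch count k in 0..max_mismatches, the
    percentage of reads with exactly k mismatches through each cycle.

    Assumption: reads[i] and references[i] are equal-length, gap-free
    alignments of a read to the PhiX reference.
    """
    if not reads:
        raise ValueError("no reads given")
    if len(reads) != len(references):
        raise ValueError("each read needs a matching reference string")

    n_cycles = len(reads[0])
    # counts[k][c] = reads with exactly k mismatches in cycles 1..c+1
    counts = {k: [0] * n_cycles for k in range(max_mismatches + 1)}

    for read, ref in zip(reads, references):
        if len(read) != n_cycles or len(ref) != n_cycles:
            raise ValueError("all reads and references must have the same length")
        mismatches = 0
        for cycle, (base, ref_base) in enumerate(zip(read, ref)):
            if base != ref_base:
                mismatches += 1
            # Reads with more than max_mismatches errors are not tallied.
            if mismatches <= max_mismatches:
                counts[mismatches][cycle] += 1

    total = len(reads)
    return {k: [100.0 * n / total for n in row] for k, row in counts.items()}


if __name__ == "__main__":
    # Toy data for illustration only, not real PhiX reads.
    toy_reads = ["ACGT", "ACGA", "TCGT", "ACGT"]
    toy_refs = ["ACGT"] * 4
    for k, percentages in ppr_by_cycle(toy_reads, toy_refs).items():
        print(k, percentages)
```

Plotting the row for 0 mismatches against cycle number gives the "perfect reads" curve; the remaining rows give the curves for 1 to 4 mismatches.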
PPR was developed as an alternative to the Phred-like Q score for determining run quality and has the following advantages:
- PPR is calculated independently, unlike Illumina’s Q score, which is calculated from instrument-dependent variables (these vary by instrument, chemistry, and software)
- PPR is a direct measure of error, unlike Q scores, which rely on a table of data generated under ideal sequencing conditions
- Q scores tend to overestimate quality
- Unlike Q scores, PPR allows the user to identify the source of sequencing error
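To make the contrast between a predicted and a measured error rate concrete, the sketch below uses the standard Phred definition, Q = -10 log10(p), to convert a Q score into its implied error probability and to express an observed mismatch rate on the same scale. The function names are hypothetical and the numbers in the usage example are illustrative, not results from the paper.

```python
import math


def phred_to_error_probability(q: float) -> float:
    """Error probability implied by a Phred-scaled quality score."""
    return 10 ** (-q / 10)


def empirical_q(mismatched_bases: int, aligned_bases: int) -> float:
    """Phred-scaled quality computed from an observed mismatch rate."""
    if aligned_bases <= 0:
        raise ValueError("aligned_bases must be positive")
    if mismatched_bases == 0:
        return math.inf  # no observed errors
    return -10 * math.log10(mismatched_bases / aligned_bases)


if __name__ == "__main__":
    # Illustrative numbers only: bases reported as Q30 imply an error
    # probability of about 1 in 1,000, while 5 mismatches in 1,000
    # aligned bases correspond to an empirical quality near Q23.
    print(phred_to_error_probability(30))
    print(empirical_q(5, 1000))
```

If the reported Q scores imply a lower error rate than the one measured against PhiX, the Q scores are overestimating quality for that run.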
By examining a PPR profile, the following issues are distinguishable:
- Adapter read-through (the run has more sequencing cycles than the library insert is long, so reads continue into the adapter sequence)
- Repetitive or low diversity sequences
- Imaging problems
- Over- or under-clustering
- Chemistry problems (cluster reagents are not working properly)
The PPR plot program is compatible with HiSeq 2000/2500, NextSeq 500, and MiSeq instruments. It is written in Perl and R and accepts FASTQ files as input. The PPR software package is available at http://openwetware.org/wiki/BioMicroCenter:PPR_Program (BioMicro Center, Massachusetts Institute of Technology, Cambridge, MA, USA).
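For readers unfamiliar with the input format, here is a minimal Python sketch of reading four-line FASTQ records. It is not part of the PPR package and does not reflect how the Perl code parses its input. The function name `read_fastq` is hypothetical, and the sketch assumes unwrapped records (one sequence line and one quality line per read).

```python
from typing import Iterable, Iterator, Tuple


def read_fastq(lines: Iterable[str]) -> Iterator[Tuple[str, str, str]]:
    """Yield (read_id, sequence, quality_string) from four-line FASTQ records."""
    it = iter(lines)
    for header in it:
        header = header.rstrip("\r\n")
        if not header:
            continue  # tolerate blank lines between records
        seq = next(it, None)
        plus = next(it, None)
        qual = next(it, None)
        if seq is None or plus is None or qual is None:
            raise ValueError("truncated FASTQ record")
        seq = seq.rstrip("\r\n")
        plus = plus.rstrip("\r\n")
        qual = qual.rstrip("\r\n")
        if not header.startswith("@") or not plus.startswith("+"):
            raise ValueError("malformed FASTQ record")
        if len(seq) != len(qual):
            raise ValueError("sequence and quality lengths differ")
        yield header[1:], seq, qual


if __name__ == "__main__":
    # Toy record for illustration only.
    example = ["@read1\n", "ACGT\n", "+\n", "IIII\n"]
    for read_id, seq, qual in read_fastq(example):
        print(read_id, seq, qual)
```

Because the function accepts any iterable of lines, an open file handle can be passed directly, for example `read_fastq(open(path))` where `path` points to a FASTQ file.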