Day 2 Summary from the Future of Genomic Medicine Conference 2018

Day 2 of the FOGM conference showcased some truly fascinating breakthrough talks, particularly related to how we diagnose and treat cancer.

Dr. Todd Golub of the Broad Institute gave a phenomenal talk on the need for better pre-clinical models in order to get drugs to market faster. He and his team utilized a high-throughput screening method called PRISM (Profiling Relative Inhibition Simultaneously in Mixtures), which relies on 24-nucleotide barcodes stably integrated into genetically distinct tumor cell lines, allowing drug candidates to be screened against a large number of cancer cell lines in pooled mixtures.
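
To make the barcode-counting idea concrete, here is a minimal sketch of how pooled reads might be tallied per cell line. The barcode sequences, the barcode-to-cell-line map, and the function names are my own illustrative assumptions, not part of the actual PRISM pipeline.

```python
from collections import Counter

# Minimal sketch of the PRISM idea (not the Broad team's actual pipeline):
# each cell line carries a unique 24-nt barcode, so counting barcodes in a
# pooled sequencing run estimates each line's relative abundance before and
# after drug treatment. The barcodes below are hypothetical.
BARCODES = {
    "ACGTACGTACGTACGTACGTACGT": "cell_line_A",   # hypothetical 24-nt barcode
    "TTGCAATGCCGTAGGCTTAACGTA": "cell_line_B",   # hypothetical 24-nt barcode
}

def count_cell_lines(reads, barcodes=BARCODES, bc_len=24):
    """Count reads whose first 24 nt exactly match a known barcode."""
    counts = Counter()
    for read in reads:
        bc = read[:bc_len]
        if bc in barcodes:
            counts[barcodes[bc]] += 1
    return counts

def relative_viability(treated, control):
    """Fraction of each cell line surviving drug treatment vs. the untreated pool."""
    return {line: treated.get(line, 0) / n for line, n in control.items() if n > 0}
```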

Hillary Theakston of the Clearity Foundation spoke about the importance of helping patients and families to ‘live in the future of genomic medicine’. Her foundation focuses on helping women suffering from ovarian cancer, which lags far behind other cancers in terms of survival. It’s mostly diagnosed at stage 3 or 4, and only 30% of women make it past 10 years. Clearity helps these women by spending hours of individualized counseling on each unique case, including performing genetic testing of the tumor and trying to get patients into drug trials that may be beneficial for their specific type of cancer–in fact, 27% of Clearity patients end up in a clinical trial. I found this talk particularly moving–while the science at the conference was incredible, it’s so important that we not forget about the patients in the process.

Dr. Mickey Kertesz, the CEO of Karius, spoke about the importance of effective clinical diagnoses. While most of the FOGM talks were cancer-related, Dr. Kertesz spoke about how we can use genomic testing to inform infectious disease diagnostics and treatments. Infectious diseases cause two-thirds of all deaths in children under 5 years old, so getting the proper treatment in a timely manner is absolutely crucial. Currently, even after 8 days in a clinical setting, only 38% of patients receive a diagnosis. Karius aims to improve that with end-to-end sample processing that detects circulating microbial DNA in a patient’s blood and checks it against more than 1,000 pathogens (possibly using 16S or metagenomic sequencing), leading to 50% of patients being diagnosed in just ONE day–an enormous improvement.

Dr. C. Jimmy Lin of Natera spoke about his company’s new personalized liquid biopsy, Signatera. Signatera aims to increase the speed at which we detect cancer relapses by determining the unique clonal mutations in each patient’s tumor. They can then look for circulating cell-free DNA in the blood and sequence it deeply at only those specific loci (i.e., custom amplicon sequencing), looking for those same clonal mutations in the blood. Using this pipeline, they are able to detect relapses up to 100 days before they can be clinically diagnosed. The next step will be to show that this can improve clinical outcomes. I wish them the best of luck with this–it could be a game-changer for diagnostics.
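
As a rough illustration of tumor-informed cfDNA monitoring (not Natera’s actual Signatera algorithm), the sketch below flags a plasma sample as ctDNA-positive when several patient-specific clonal mutations reappear above an assumed background error rate. The variant names, read counts, and thresholds are all hypothetical.

```python
# Conceptual sketch of tumor-informed cfDNA monitoring: track patient-specific
# clonal mutations in deep amplicon sequencing of plasma cfDNA and flag samples
# where several of them reappear above a background error rate.
from dataclasses import dataclass

@dataclass
class SiteCounts:
    variant: str      # e.g. "chr7:55249071 C>T" (hypothetical variant label)
    alt_reads: int    # reads supporting the tumor allele
    depth: int        # total reads at the site

def detect_ctdna(sites, error_rate=1e-4, min_positive_sites=2):
    """Call a sample ctDNA-positive if enough tracked clonal mutations have
    allele fractions well above the assumed assay error rate."""
    positives = []
    for s in sites:
        if s.depth == 0:
            continue
        vaf = s.alt_reads / s.depth
        if vaf > 10 * error_rate and s.alt_reads >= 3:
            positives.append((s.variant, vaf))
    return len(positives) >= min_positive_sites, positives

sample = [
    SiteCounts("chr7:55249071 C>T", alt_reads=70, depth=60000),
    SiteCounts("chr17:7578406 G>A", alt_reads=0, depth=45000),
    SiteCounts("chr12:25398284 C>A", alt_reads=65, depth=52000),
]
is_positive, hits = detect_ctdna(sample)  # True, two sites detected
```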

Dr. Kate Rubins is the first person to sequence DNA in space! It’s hard to describe how amazing this is. There are a lot of technical challenges to overcome when working in microgravity, but Kate was able to pull it off successfully. Being able to sequence samples during spaceflight will certainly prove useful when we eventually take our first long-term mission and want to sequence the samples we find ASAP!

That’s it for the Future of Genomic Medicine conference! Of course, all of the talks were fantastic and I didn’t have the space to summarize them all here. Check out the conference overview here for more details.


Day 1 Summary from the Future of Genomic Medicine Conference 2018



Right next to the conference location!

It’s hard to imagine a better conference setting than the Scripps Research Institute, where you can listen to scientific talks while literally sitting right next to the beach! It’s even better when those talks are as interesting as they were. The FOGM conference covered a lot of ground:

Dr. Juan Carlos Izpisua Belmonte discussed his latest findings using homology-independent targeted integration (HITI) to target non-dividing cells–a key feature that means it could be used to treat adults and not just embryos. He had previously shown that this system could be used to treat rats with retinitis pigmentosa, and the treated animals showed improvement in their ability to respond to light and healing in their eyes. In his talk at the conference, he described using the same technique to alter the epigenetic landscape of mice suffering from progeria, a genetic condition that induces rapid aging, showing improved organ function and lifespan. He hopes to use this discovery to move us towards eventual treatments for the symptoms of aging–the one disease that we all suffer from. See here for the full picture.

Dr. Paul Knoepfler presented an elegant model of epigenetic effects in pediatric glioma. Pediatric gliomas are nearly 100% fatal even with the best treatments, and the treatments are incredibly severe. Children with this disease are given the same treatments as adults, but what if the tumors are different?

Well, it turns out that they are different. Dr. Knoepfler showed that pediatric gliomas frequently possess two unique point mutations in histone H3.3, and that these mutations aren’t seen in adult gliomas. It seems astonishing that two point mutations can confer such incredible lethality, but in fact, even small histone mutations can be incredibly lethal because of their effect on the epigenetic landscape (as seen via ChIP-seq by Bjerke et al.).

So, Dr. Knoepfler wanted to see if reversing these two mutations in the cancer cells could reverse the phenotype, and additionally, whether doing the opposite (introducing those two mutations into normal brain cells) would induce the cancer phenotype. In fact, in both cases, reversing or introducing those mutations caused an immense transcriptional shift in the opposing direction, indicating that these two point mutations are enormously important in this cancer type. Dr. Knoepfler wants to use this information to create mouse models and test new drug treatments to see which of them can be most effective against this particularly aggressive cancer.


ROC curves for Dr. Mesirov’s predictive models, which outperform current clinical predictions.

Dr. Jill Mesirov also gave a very informative talk regarding pediatric brain tumors. Her lab applies machine learning and statistical techniques to find molecular markers that aid in the identification and stratification of cancer subtypes. She examined the RNA profiles of several pediatric medulloblastoma tumors using RNA-seq and found 6 different subtypes with vastly different survival rates. Using this model, she was able to re-categorize 15 patients who had been classified as ‘low-risk’ by traditional diagnostic methods as actually being ‘high-risk’–and 6 of these patients went on to relapse within 3 years. Dr. Mesirov wants to use this model to help identify novel therapeutics for the particularly deadly MYC-driven cancer subtype, and hopefully improve clinical outcomes.
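
For a sense of how expression-based stratification can work in principle, here is a hedged sketch that clusters tumor RNA profiles into subtypes using generic k-means on the most variable genes. This is not Dr. Mesirov’s actual method, and the input file name is hypothetical.

```python
# Hedged sketch of expression-based subtyping: cluster tumors by their RNA
# profiles so the resulting groups can be compared for survival differences.
# Assumes `expr` is a samples x genes matrix of log-transformed counts.
import numpy as np
from sklearn.cluster import KMeans

def subtype_tumors(expr, n_subtypes=6, n_top_genes=1000, seed=0):
    """Cluster samples on their most variable genes."""
    variances = expr.var(axis=0)
    top = np.argsort(variances)[-n_top_genes:]          # most variable genes
    X = expr[:, top]
    X = (X - X.mean(axis=0)) / (X.std(axis=0) + 1e-8)   # z-score per gene
    km = KMeans(n_clusters=n_subtypes, random_state=seed, n_init=10)
    return km.fit_predict(X)

# expr = np.loadtxt("medulloblastoma_expression.tsv")   # hypothetical input
# labels = subtype_tumors(expr)
```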

Bonnie Rochman’s talk focused more on the sociological effects of modern genomic medicine, particularly with respect to having children. She asked some tough questions of the audience regarding prenatal testing for Down’s syndrome, and early childhood screening for genes like BRCA1/2, which predispose a person to breast cancer. Additionally, our ability to gather genomic data far outpaces our ability to accurately interpret it, which leads to a lot of anxiety around what all this genetic testing actually means. She concluded that ultimately, there are no right or wrong answers here–everyone has their own thoughts and feelings, shaped by their experiences with their own genetics. You can read more in her book here.

Those were my favorite talks from the first day…check out the Day 2 summary tomorrow!

Day 2 Highlights from the Future of Genomic Medicine 2018 #FOGM18

The first day of the FOGM conference was absolutely incredible! I learned so much, heard some fascinating science, and met some truly amazing scientists and entrepreneurs. Here are the talks I’m most looking forward to today!

Todd Golub, MD, Speaking on Cancer Genomics

Dr. Golub has been a key member of important institutions such as the Broad Institute and Harvard Medical School, and has made important discoveries about the genetic basis of childhood leukemia.

Mickey Kertesz, PhD, Speaking on Circulating DNA and RNA

Dr. Kertesz is the CEO of Karius, a company dedicated to making pathogen detection easier to implement for patients. He has a PhD in computational biology and did postdoctoral work at Stanford investigating the genetic diversity of viruses.

Ash Alizadeh, MD, PhD, Speaking on Circulating DNA and RNA

Dr. Alizadeh studies the genomic biomarkers of tumors, particularly looking at non-invasive methods of detecting cancer such as looking for circulating tumor DNA (ctDNA) in the blood. Shouldn’t be missed!

Jimmy Lin, MD, PhD, MHS, CSO, Speaking on Circulating DNA and RNA

Dr. Lin led the first ever exome sequencing studies in cancer and is the CSO of Natera.

Alicia Zhou, PhD, Speaking on Predictive and Preventative Genetics

Dr. Zhou studied the effects of the c-Myc oncogene in triple-negative breast cancer, and currently works as the Head of Research at Color Genomics to bring genomic testing to entire populations.

Leslie Biesecker, MD, Speaking on Predictive and Preventative Genetics

Dr. Biesecker developed the ClinSeq® program in 2006, before the wide availability of NGS. I’m looking forward to hearing his perspective on preventative genetics.

Robert Gould, PhD, Speaking on Epigenetics

Dr. Gould has had an incredibly distinguished career. He’s currently President and CEO of Fulcrum Therapeutics–prior to that, he served as the director of novel therapeutics at the Broad Institute and spent 23 years at Merck.

Kathleen Rubins, PhD, Speaking on Our Genomics Past and Future

Dr. Rubins is the first person to sequence DNA in space! Need I say anything more?

Day 1 Highlights of the Future of Genomic Medicine Conference #FOGM18

Many of the most difficult-to-treat diseases that exist today have genetic origins, and one of the biggest obstacles to devising new treatments is the lack of connection between the research and clinical sides of biology. Because of that, the Future of Genomic Medicine conference is one of the most interesting ones to attend: a truly fantastic mix of PhDs, MDs, and others (which this year includes journalists, CEOs, and an astronaut!) has the opportunity to present and create new connections in this community.

There are so many fascinating speakers that it’s difficult to narrow it down, but here are some to watch out for on Day 1:

Eric Topol, MD, Speaking on the Future of Individualized Medicine

Dr. Topol is the founder and director of the Scripps Translational Science Institute and in 2016 was awarded a $207M grant to lead a part of the Precision Medicine Initiative. He is one of the organizers of the Future of Genomic Medicine conference and has been voted the #1 most influential physician leader in the US by Modern Healthcare.

Andre Choulika, PhD, Speaking on Genome Editing

In his post-doctoral work, Dr. Choulika was one of the inventors of nuclease-based genome editing and currently serves as CEO of Cellectis. We’re very interested in what he has to say on the current state of genome editing!

Paul Knoepfler, PhD, Speaking on Genome Editing

Dr. Knoepfler is not just a cancer researcher, but also a cancer survivor. He is currently studying the epigenetics of cancer and stem cells, using many techniques including CRISPR. It will be interesting to see how he uses genome editing and CRISPR in his research! He is also an active blogger and author.

Mark DePristo, PhD, Speaking on Data Science in Genomics

Dr. DePristo was part of the team that developed the GATK, one of the most prominent software packages for processing next-generation sequencing data. He is currently the head of the Genomics team at Google.

Jill P. Mesirov, PhD, Speaking on Data Science in Genomics

Dr. Mesirov does fascinating work applying machine learning to cancer genomics, stratifying cancer patients according to their risk of relapse and identifying potential compounds for treatment.

Viviane Slon, Graduate Student, Speaking on Genetics of Human Origins

It’s so great to see a graduate student speaking at a conference! Viviane studies the DNA of our closest extinct relatives. It should be interesting to see her new data!

Eske Willerslev, DSc, Speaking on Genetics of Human Origins

Dr. Willerslev is an evolutionary geneticist most known for sequencing the first ancient human genome–it should be interesting to hear his perspective on human origins!

Keep an eye out for my highlights for Day 2 coming tomorrow!


Nanopore Sequencing: The Future of NGS?

As I mentioned in my previous post, nanopore sequencing using the MinION instrument is one of the hottest new sequencing techniques currently available. It has several benefits over the current generation of short-read sequencing instruments, including the ability to measure epigenetic DNA modifications directly and to produce ultra-long reads, which allow for improved coverage of difficult-to-sequence regions.

It does have a few drawbacks, though, including fairly low output, which has mostly relegated it to sequencing microbial genomes. However, a recent paper by Jain et al. from UCSC [1] used the minuscule MinION instrument to sequence the human genome and compare it to the current reference genome.

There were several items of note in this paper, not the least of which is that this is the most contiguous human genome assembly to date, getting us closer and closer to a telomere-to-telomere sequence. Additionally, the authors were able to close 12 gaps, each more than 50 kb in length, significantly improving the completeness of the genome.
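
Contiguity claims like this are usually summarized with the assembly’s N50: the contig length at which half of the total assembled bases sit in contigs at least that long. A minimal way to compute it from a list of contig lengths:

```python
def n50(contig_lengths):
    """Return the N50 of an assembly given its contig lengths (in bp)."""
    lengths = sorted(contig_lengths, reverse=True)
    half = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half:
            return length
    return 0

print(n50([10, 20, 30, 40]))  # 30: contigs of 40 and 30 bp cover half the total
```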

Amazingly, since nanopore sequencing does not utilize PCR amplification, epigenetic modifications are maintained and are actually measurable by the MinION. The instrument is capable of detecting 5-methylcytosine modifications, and this data showed good concordance with whole genome bisulfite sequencing performed in the past.

Furthermore, they were able to map several of their ultra-long reads containing telomeric repeats to specific chromosomal regions. They were then able to identify the start of the telomeric repeats and calculate the length of the repeat sequence. Overall, they found evidence for repeat regions spanning 2–11 kb.
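
As a simple illustration of the repeat-length idea (not the authors’ actual method), one could scan a long read for tandem copies of the human telomere repeat TTAGGG (or its reverse complement) and measure the longest tract:

```python
import re

# Matches runs of at least three tandem telomere repeats on either strand.
TELOMERE_RE = re.compile(r"(?:TTAGGG){3,}|(?:CCCTAA){3,}")

def longest_telomeric_tract(read_seq):
    """Return the length (bp) of the longest run of telomeric repeats in a read."""
    hits = TELOMERE_RE.findall(read_seq.upper())
    return max((len(h) for h in hits), default=0)

read = "ACGT" + "TTAGGG" * 500 + "GATTACA"   # toy read with a 3 kb tract
print(longest_telomeric_tract(read))          # 3000
```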

Long and ultra-long reads are absolutely critical when it comes to annotating these highly repetitive regions. There are other sequencers, including PacBio’s Sequel SMRT sequencing system, that allow for much longer reads than the Illumina instruments. But Jain et al. were able to obtain reads up to a staggering 882 kb in length.

Jain et al. were able to effectively show that the MinION system is capable of being used to sequence something as complex as a human genome. Interestingly, they theorized that the MinION system may have no intrinsic limit to read length–meaning that this protocol can be improved even further by finding methods of purifying high molecular weight DNA without fragmentation. Additionally, MinION reads are still considerably less accurate than Illumina sequencing, so this aspect could be improved as well. Nonetheless, this is a truly astonishing accomplishment that indicates what the future of DNA sequencing holds in store.

If you’re interested in finding a provider of nanopore sequencing, please send us an email and we’d love to help you with your project!


Sanger Sequencing Turns 40: Retrospectives and Perspectives on DNA Sequencing Technologies

Retrospective: What have we accomplished with DNA sequencing so far?

Sanger wasn’t the first person to attempt sequencing, but before his classic method was invented, the process was painfully slow and cumbersome. Before Sanger, Gilbert and Maxam sequenced 24 bases of the lactose-repressor binding site by copying it into RNA and sequencing the RNA–which took a total of 2 years [1]!

Sanger’s method made the process much more efficient. Original Sanger sequencing took a ‘sequencing by synthesis’ approach, running 4 extension reactions, each with a different radioactively labelled chain-terminating nucleotide, to identify which base lay at each position along a DNA fragment. When each of those reactions was run out on a gel, it became relatively simple to read off the sequence of the DNA fragment (see Figure 1) [2].
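
To make the chain-termination logic concrete, here is a purely pedagogical toy: each of the four reactions yields fragments ending at every position of its base, and sorting all fragments by length ‘reads the gel’ from bottom to top. Real chemistry, strand complementarity, and band resolution are ignored.

```python
def sanger_fragments(sequence):
    """Return {base: [fragment lengths]} for the four termination reactions:
    each reaction produces a fragment ending at every position of its base."""
    reactions = {"A": [], "C": [], "G": [], "T": []}
    for i, base in enumerate(sequence, start=1):
        reactions[base].append(i)   # fragment terminated at position i
    return reactions

def read_gel(reactions):
    """Order all fragments from shortest to longest (bottom of the gel to top)
    and take each fragment's terminating base to recover the sequence."""
    bands = sorted((length, base)
                   for base, lengths in reactions.items()
                   for length in lengths)
    return "".join(base for _, base in bands)

seq = "GATTACA"
assert read_gel(sanger_fragments(seq)) == seq
```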


Figure 1: Gel from the paper that originally described Sanger sequencing.

Of course, refinements have been made to the process since then. We now label each of the nucleotides with a different fluorescent dye, which allows the same process to occur using only one extension reaction instead of 4, greatly simplifying the protocol. Sanger received his second Nobel Prize for this discovery in 1980 (well-deserved, considering it is still used today).

An early version of the Human Genome Project (HGP) began not long after this discovery, in 1987. The project was created by the United States Department of Energy, which was interested in obtaining a better understanding of the human genome and how to protect it from the effects of radiation. A more formalized version of the project was approved by Congress in 1988 and a five-year plan was submitted in 1990 [3]. The basic protocol for the HGP emerged as follows: large DNA fragments were cloned into bacterial artificial chromosomes (BACs), which were then fragmented, size-selected, and sub-cloned. The purified DNA was then used for Sanger sequencing, and individual reads were assembled based on overlaps between them.

Given how large the human genome is, and the limitations of Sanger sequencing, it quickly became apparent that more efficient and better technologies were necessary, and indeed, a significant part of the HGP was dedicated to creating these technologies. Several advancements in both wet-lab protocol and data analysis pipelines were made during this time, including the advent of paired-end sequencing and the automation of quality metrics for base calls.

Due to the relatively short length of the reads produced, the highly repetitive parts of the human genome (such as centromeres, telomeres and other areas of heterochromatin) remained intractable to this sequencing method. Despite this, a draft sequence of the human genome was published in 2001, with a finished sequence following in 2004–all for the low, low cost of $2.7 billion.

Since then, there have been many advancements in the process of DNA sequencing, but the most important of these is multiplexing. Multiplexing involves tagging different samples with specific DNA barcodes, which allows us to sequence multiple samples in one reaction, vastly increasing the amount of data we can obtain per sequencing run. Interestingly, the most frequently used next-generation sequencing method today (the Illumina platforms–check them out here) still uses the basics of Sanger sequencing (i.e., detection of fluorescently labelled nucleotides), combined with multiplexing and a process called bridge amplification, to sequence hundreds of millions of reads per run.
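
Conceptually, demultiplexing is just sorting reads into bins by their index sequence. Here is a minimal sketch that ignores dual indexes and mismatch tolerance; the index sequences and sample names are only examples.

```python
# Minimal demultiplexing sketch: assign each read to a sample based on an
# exact match of its 6-nt index sequence. Real pipelines handle dual indexes,
# sequencing errors in the index, and quality filtering.
SAMPLE_BARCODES = {
    "ATCACG": "sample_1",   # example 6-nt index
    "CGATGT": "sample_2",
    "TTAGGC": "sample_3",
}

def demultiplex(reads_with_index, barcodes=SAMPLE_BARCODES):
    """Split (index_seq, read_seq) pairs into per-sample lists."""
    per_sample = {name: [] for name in barcodes.values()}
    undetermined = []
    for index_seq, read_seq in reads_with_index:
        sample = barcodes.get(index_seq)
        if sample is None:
            undetermined.append(read_seq)
        else:
            per_sample[sample].append(read_seq)
    return per_sample, undetermined
```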


Figure 2: Cost of WGS has decreased faster than we could have imagined.

Rapid advancements in genome sequencing since 2001 have greatly decreased the cost of sequencing, as you can see in Figure 2 [4]. We are quickly approaching sequencing of the human genome for less than $1,000–which you can see here on our website.

What are we doing with sequencing today?

Since the creation of next-generation DNA sequencing, scientists have continued to utilize this technology in increasingly complex and exciting new ways. RNA-sequencing, which involves isolating the RNA from an organism, converting it into cDNA, and then sequencing the resulting cDNA, was invented shortly after the advent of next-generation sequencing and has since become a staple of the molecular biology and genetics fields. ChIP-seq, Ribo-Seq, RIP-seq, and methyl-seq followed and have all become standard experimental protocols as well. In fact, as expertly put by Shendure et al. (2017), ‘DNA sequencers are increasingly to the molecular biologist what a microscope is to the cellular biologist–a basic and essential tool for making measurements. In the long run, this may prove to be the greatest impact of DNA sequencing.’ [5] In my own experience, utilizing these methods in ways that complement each other (like cross-referencing ChIP-seq or Ribo-Seq data with RNA-seq data) can produce some of the most exciting scientific discoveries.


Figure 3: Model of the MinION system.

Although Illumina sequencing still reigns supreme on the market, there are some up-and-coming competitor products as well. Of great interest is the MinION from Oxford Nanopore Technologies (ONT) (see more about them here). The MinION offers something the Illumina platforms lack–the ability to sequence long, continuous stretches of DNA, which is of enormous value when sequencing through highly repetitive regions. The MinION works via nanopore sequencing, in which a voltage is applied across hundreds of small protein pores. At the top of each pore sits an enzyme that processively unwinds DNA and feeds it down through the pore, causing a disruption in the current that can be measured at the nucleotide level (see Figure 3) [6]. These reads can span thousands of base pairs, orders of magnitude longer than those from the Illumina platforms, which greatly simplifies genome assembly. Other new options for long-read sequencing include the PacBio system from Pacific Biosciences (look for pricing options for this service here).

Like any new technology, there have been setbacks. The early accuracy of MinION flow cells was quite low compared with Illumina, and the output was quite low as well. And although these issues have largely been addressed, the MinION still trails the Illumina platforms in the market, as they are seen as more reliable and better characterized. However, the MinION has several advantages that could eventually lead to it being more commonly used in the future: for one, it literally fits in the palm of your hand, making it much more feasible for people like infectious disease researchers, who are in desperate need of sequencing capabilities in remote locales. It’s fast as well; in one example, a researcher in Australia was able to identify antibiotic resistance genes in cultured bacteria in 10 hours [7]–an absolutely incredible feat that couldn’t have been imagined until very recently. This kind of technology could easily be used in hospitals to assist in identifying appropriate patient treatments, hopefully within a few years.

Although we are not yet able to routinely utilize sequencing technology for medical treatments, there are a few areas where this is already happening. Detecting Down’s syndrome in a fetus during pregnancy used to be a much more invasive process, but with improvements in sequencing technology, new screens have been developed that detect fetal chromosomal abnormalities from cell-free DNA circulating in the maternal blood [8]. Millions of women have already benefitted from this improved screen.

Perspective: What does the future of DNA sequencing hold?

As the Chinese philosopher Lao Tzu said, ‘Those who have knowledge, don’t predict’, and that’s as true as ever when it comes to DNA sequencing technology. We’re capable today of things we couldn’t even have dreamed of 40 years ago, so who knows where we’ll be in the next 40 years?

But as a scientist, I’ve always enjoyed making educated guesses, so here are some tentative predictions about what the future might hold.

Clinical applications: I’ve never been a fan of the term personalized medicine, since it implies that one day doctors will be able to design individual treatments for each patient’s specific illness. I find this scenario unlikely (at least in the near future), because even though the cost and time of DNA sequencing have decreased by astonishing amounts, it is still expensive and time-consuming enough that it doesn’t seem likely to be of great use for routine clinical applications (to say nothing of the cost and time of developing new drug regimens). However, I have high hopes for the future of precision medicine, particularly in cancer treatment. Although we may never be able to design the perfect drug targeting one individual’s cancer, we can certainly create drugs that are designed to interact with the mutations we frequently observe in cancers. This could allow for a more individualized drug regimen for patients. Given that cancer is a disease with such extremely wide variation, we will almost certainly need to start taking a more targeted approach to its treatment, and genome sequencing will be of great benefit to us in this regard.

A fully complete human genome: As I mentioned previously, one drawback of Illumina sequencing is that it is not capable of sequencing across highly repetitive regions, and unfortunately, large swaths of the human genome are highly repetitive. As such, while we have what is very close to a complete human genome, we do not yet have a full telomere-to-telomere sequence. However, with the new long-read technologies that are currently being implemented, the day when this is completed is likely not far off.

A complete tapestry of human genetic variation: Millions of people have already had their genomes sequenced to some degree (I’m one of them! Any others?), and millions more are sure to come. Widespread genome re-sequencing could one day allow us to build a full catalog of virtually every genetic variant in the human population, which could allow for an even greater understanding of the connection between our genetics and specific traits.

Faster and better data analysis: Data analysis is probably the biggest bottleneck we currently face when it comes to DNA sequencing. There is what seems like an infinite amount of data out there and, unfortunately, a finite number of people who are capable of and interested in analyzing it. As these technologies become more mature and established, new and better data analysis pipelines will be created, speeding up analysis time and increasing our understanding of the data. Hopefully, one day even scientists with only moderate technical savvy will be capable of performing their own data analysis.

I’m certain the future of DNA sequencing will also hold things that I can’t even imagine. It’s an amazing time to be a scientist right now, as researchers are continuously discovering new technologies, and finding ways to put our current technologies to even more interesting uses.

What do you think the next big thing in DNA sequencing will be? Tell us in the comments!

RIN Numbers: How they’re calculated, what they mean and why they’re important

High-quality sequencing data is an important part of ensuring that your results are reliable and replicable, and obtaining high-quality sequencing data means using high-quality starting material. For RNA-seq, this means using RNA with a high RIN (RNA Integrity Number), a score on a scale from 1 to 10 that gives researchers a standardized indication of the quality of their RNA, removing individual bias and interpretation from the process.

The RIN is a significant improvement over the way RNA integrity was previously assessed: the 28S:18S ratio. Because the 28S rRNA is approximately 5 kb and the 18S rRNA is approximately 2 kb, the ideal 28S:18S ratio is 2.7:1–but the benchmark is generally considered to be about 2:1. However, this measurement relies on the assumption that the quality of rRNA (a very stable molecule) linearly reflects the quality of mRNA, which is actually much less stable and experiences higher turnover [1].
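
The old-style ratio itself is easy to approximate from an electropherogram by integrating the fluorescence around the two rRNA peaks (the RIN algorithm is proprietary to Agilent and far more involved). In the sketch below, the window positions are illustrative placeholders, not calibrated values.

```python
import numpy as np

def rrna_ratio(time, signal, window_18s=(41.0, 44.0), window_28s=(46.0, 50.0)):
    """Return the 28S:18S area ratio from a fluorescence trace.
    `time` and `signal` are arrays from the electropherogram; the peak
    windows are illustrative and would need calibration per instrument."""
    time, signal = np.asarray(time), np.asarray(signal)

    def area(window):
        mask = (time >= window[0]) & (time <= window[1])
        return np.trapz(signal[mask], time[mask])

    return area(window_28s) / area(window_18s)
```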


Figure 1: Traces of RNA samples with different RIN values. Note the difference between the high- and low-quality samples.

Fortunately, Agilent Technologies has developed a better method: the RIN. Agilent’s algorithm calculates the RIN from the entirety of the RNA trace, not just the rRNA peaks, making it a considerable improvement over the 28S:18S ratio, as you can see in Figure 1 [2].

The importance of RNA integrity in determining the quality of gene expression data was examined by Chen et al. [3] in 2014 by comparing RNA samples at 4 different RIN values (from 4.5 to 9.4) across 3 different library preparation methods (poly-A selected, rRNA-depleted, and total RNA), for a total of 12 samples. They then calculated the correlation coefficient of gene expression between the highest quality RNA and the more degraded samples for each library preparation method.
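
The comparison itself is straightforward to reproduce in principle: correlate log-scale expression from a degraded sample against the highest-RIN sample. The variable names below are hypothetical.

```python
import numpy as np

def expression_correlation(tpm_high, tpm_low, pseudocount=1.0):
    """Pearson correlation of log-transformed per-gene expression values
    between a high-RIN reference sample and a degraded sample."""
    x = np.log2(np.asarray(tpm_high) + pseudocount)
    y = np.log2(np.asarray(tpm_low) + pseudocount)
    return np.corrcoef(x, y)[0, 1]

# e.g. expression_correlation(tpm_rin_9_4, tpm_rin_4_5) > 0.95 would match
# what Chen et al. report for the rRNA-depleted and total RNA libraries.
```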


Figure 2: Only poly-A selected RNA library preparations experience a decrease in data quality with a decrease in RIN value.

Fascinatingly, the only library preparation method that showed a significant decrease in the correlation between high-quality and low-quality RNA was poly-A selection. The other two library preparation methods still had correlation coefficients greater than 0.95 even at low RINs (see Figure 2 [3])!

Chen et al. theorize that the reason behind this is that degraded samples that are poly-A selected result in an increasingly 3′-biased library, and that you will therefore lose valuable reads from your data. Because the other methods involve either no treatment or rRNA removal (as opposed to selection), there will be considerably less bias in the overall sample.

Even though it seems that only the poly-A selected library preparation method suffers from a low RIN, providers still prefer to work with relatively high-quality RNA samples for all library preparation methods. However, if you do have important samples with lower RINs, it may still be worth discussing your options with a provider directly–and we at Genohub are more than happy to help facilitate those discussions! Please contact us here if you have any further questions about sequencing samples with poor RINs.