Deep Genomics Applies Deep Learning to Gene Editing

September 15, 2016. 4 mins read

In yesterday’s article, we talked about the third gene-editing company getting ready to IPO, CRISPR Therapeutics. While gene editing is an incredibly complex technology, the idea behind it is actually quite simple. You can use gene-editing technology (read about the three main types here) to edit a gene and then start changing the way life forms actually work. Modifying life forms in this way is an area of research we refer to as “synthetic biology”. The whole gene editing/synthetic biology space is incredibly exciting because the possibilities are really infinite. We could literally solve all of mankind’s problems if everything goes right.

So you have your cool gene-editing technology all ready to go and you want to start creating things, but this can take a lot of time. We’re not genetic scientists but we’re pretty sure that the technique is both time-consuming and costly. What would be nice is some software that could emulate a DNA strand, let us change it, and show what happens to the host cell. It was Steve Jobs who said that “the biggest innovations of the 21st century will be at the intersection of biology and technology” and this is exactly the sort of application he was referring to. As it turns out, there’s a company that is doing exactly what we’ve proposed here. That company is called Deep Genomics.

About Deep Genomics


Founded in 2014, Canadian startup Deep Genomics has raised $3.7 million in seed funding to build a computer system that mimics how cells read DNA and generate life, using an area of artificial intelligence we’ve talked about before called deep learning, a branch of machine learning. The company’s founder, Brendan Frey, is a professor at the University of Toronto who has some serious credentials in the area of genomics and machine learning.

Update 01/08/2020: Deep Genomics has raised $30.7 million in Series B funding to push two early-stage programs to IND this year. This brings the company’s total funding to $56.7 million to date. 


Dr. Frey was one of the first researchers to successfully train a deep neural network and is now using all his intellectual firepower to create a new generation of computational technologies that can tell us what will happen within a cell when DNA is altered by genetic variation, whether natural or therapeutic. As you would suspect, the ability to predict what will happen to incredibly complex cells when their DNA is changed can only be performed by machine learning algorithms. Humans alone would never stand a chance.

While Dr. Frey has had the vision in his head of using machine learning for genomics since 2002, Deep Genomics only recently released their first product, called SPIDEX, which is a comprehensive set of mutations and their predicted effects on RNA splicing across the entire human genome. The SPIDEX™ database is free to use for non-commercial purposes. An article by New Atlas provides us with some key facts about SPIDEX.

The name is a portmanteau of “splicing index,” which basically means that SPIDEX is a database containing information about how lots and lots of different genetic variants affect (or are likely to affect) RNA splicing – a crucial step in gene expression that edits genes in different ways so that they can produce different kinds of proteins. If RNA splicing goes off kilter, the consequences could range from nothing in particular to disease and cancer. SPIDEX is meant to help us separate the harmless variants from the harmful ones, and to understand how they relate to other genetic processes. SPIDEX currently includes predictions to the tune of around 328 million such variants and the knock-on effects they pose for RNA splicing.
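For our techie readers, here is a minimal sketch of how a "splicing index" like the one described above could be queried. To be clear, this is our own toy illustration in Python and not the actual SPIDEX format or API: the `Variant` record, the `SPLICING_INDEX` table, the `flag_variants` helper, the score values, and the threshold of 5.0 are all hypothetical and exist only to show the idea of looking up a variant and separating the likely harmless ones from the ones worth a closer look.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    """A single-nucleotide variant (hypothetical record layout)."""
    chrom: str
    pos: int
    ref: str
    alt: str


# Toy index: variant -> predicted change in splicing.
# Every score here is made up for illustration; none comes from SPIDEX.
SPLICING_INDEX = {
    Variant("chr1", 1000, "A", "G"): -0.4,
    Variant("chr2", 2500, "C", "T"): -12.5,
    Variant("chr7", 4200, "G", "A"): 6.1,
}


def flag_variants(variants, index, threshold=5.0):
    """Return (variant, score) pairs whose absolute predicted effect
    meets the threshold. Variants missing from the index are skipped.
    The threshold is an arbitrary assumption, not a published cutoff."""
    flagged = []
    for variant in variants:
        score = index.get(variant)
        if score is not None and abs(score) >= threshold:
            flagged.append((variant, score))
    return flagged


if __name__ == "__main__":
    # Query every indexed variant plus one the index has never seen.
    queries = list(SPLICING_INDEX) + [Variant("chrX", 1, "T", "C")]
    for variant, score in flag_variants(queries, SPLICING_INDEX):
        print(f"{variant.chrom}:{variant.pos} {variant.ref}>{variant.alt} score={score}")
```

The point of the sketch is that the heavy lifting (predicting the scores) happens once, up front, and researchers then only need a fast lookup for the variants they care about.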

As mentioned earlier, Deep Genomics is looking at naturally occurring genetic variants along with non-natural variants that are created by using gene-editing techniques. What these guys are selling seems to be nothing more than useful interpretations of “big data” sets. For all our techie readers, you absolutely must read the research article they published titled “The human splicing code reveals new insights into the genetic determinants of disease” in which they talk about analyzing over 650,000 DNA variants. We wonder where exactly they were able to get that much data from. With 23andMe and Ancestry.com having genotyped over 1 million customers so far, at what point would it be beneficial for Deep Genomics to partner with one of these firms to expand their data population of “naturally occurring variants”?


We’ve seen two gene-editing companies have successful IPOs so far, Editas Medicine (NASDAQ:EDIT) and Intellia Therapeutics (NASDAQ:NTLA), and now a third one is teed up with the recent announcement of a planned CRISPR Therapeutics IPO. The sort of value add that Deep Genomics brings to the table could be used by all three of these companies, along with any other company doing research into gene editing, and there are quite a few. Deep Genomics is your classic “picks and shovels” play on the whole gene editing theme and it would be great to see them IPO and become the first gene-editing software stock. As retail investors, we can only wait and hope while watching what cool things these guys and gal get up to.

