In Brief
More than a decade after the Human Genome Project (HGP), scientists have no trouble sequencing DNA. But a new AI tool released by Google could make analyzing genomic data more accurate than ever before.
In the 15 years since the human genome was first sequenced in a historic scientific achievement, genomic sequencing has become relatively routine, with huge genomes being sequenced at incredible speeds. However, sorting through nucleotides and making educated guesses about their function can only get us so far. On December 4, Google released a tool that may help: DeepVariant, which uses artificial intelligence (AI) techniques and machine learning to build a more accurate picture of a person’s genome from sequencing data.
Machine learning is an application of AI that allows systems to improve from data without being explicitly programmed for each task. By automatically identifying small insertion and deletion mutations and single-base-pair mutations in data produced by high-throughput sequencing, a rapid method of genetic analysis, Google’s new AI can reportedly create an accurate picture of a full genome with little effort.
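To make the mutation types mentioned above concrete, here is a minimal Python sketch that labels a variant by comparing a reference allele with an observed one. The function name `classify_variant` and its labels are hypothetical and chosen only for illustration; this is not DeepVariant's code or API.

```python
def classify_variant(ref: str, alt: str) -> str:
    """Label a variant by comparing reference and alternate alleles.

    Illustrative only: real variant callers work from aligned reads,
    not from a ready-made pair of allele strings.
    """
    if ref == alt:
        return "no change"
    if len(ref) == 1 and len(alt) == 1:
        # One base swapped for another: a single-base-pair mutation.
        return "single-base substitution"
    if len(alt) > len(ref):
        # The observed sequence has extra bases.
        return "insertion"
    if len(alt) < len(ref):
        # The observed sequence is missing bases.
        return "deletion"
    # Same length, more than one base, and not identical.
    return "multi-base substitution"


if __name__ == "__main__":
    for ref, alt in [("A", "G"), ("A", "ATG"), ("ATG", "A")]:
        print(ref, "->", alt, ":", classify_variant(ref, alt))
```

In practice the hard part is deciding whether an apparent difference is real or just a sequencing error, which is the problem tools like DeepVariant are meant to address.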
Brad Chapman, a research scientist at Harvard’s School of Public Health who tested an early version of DeepVariant, told MIT Technology Review that one of the difficulties in other sequencing programs lies “in difficult parts of the genome, where each of the [tools] has strengths and weaknesses. These difficult regions are increasingly important for clinical sequencing, and it’s important to have multiple methods.”
In the early 2000s, when genome sequencing became widely available for the first time, scientists lacked the ability to interpret the data being collected. DNA could be sequenced, but analysis of these large datasets led to inaccurate and incomplete genome pictures.
Since then, technologies and techniques have continued to improve. Google’s advanced analysis capability reportedly goes beyond what was previously possible. Existing sequence-interpreting tools typically identify mutations by ruling out read errors, but DeepVariant’s method is said to paint a more accurate picture.
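As a rough illustration of what "ruling out read errors" can look like, the toy Python sketch below reports a mutation at one genome position only when enough reads disagree with the reference. The function `call_site` and its thresholds (`min_depth`, `min_alt_fraction`) are assumptions made up for this example; they do not describe any specific existing tool, and DeepVariant's learned approach is different.

```python
from collections import Counter
from typing import Optional


def call_site(ref_base: str, read_bases: str,
              min_depth: int = 8,
              min_alt_fraction: float = 0.2) -> Optional[str]:
    """Return the best-supported non-reference base, or None.

    read_bases holds the base each read reports at this position.
    Hypothetical thresholds: a lone mismatch is treated as a read error.
    """
    depth = len(read_bases)
    if depth < min_depth:
        # Too few reads to separate real variation from noise.
        return None
    alt_counts = Counter(b for b in read_bases if b != ref_base)
    if not alt_counts:
        return None
    alt, count = alt_counts.most_common(1)[0]
    # Call the variant only if enough reads support it.
    return alt if count / depth >= min_alt_fraction else None


if __name__ == "__main__":
    print(call_site("A", "AAAAAAAAAG"))  # one mismatch in ten reads
    print(call_site("A", "AAAAAGGGGG"))  # half the reads disagree
```

Fixed cutoffs like these are easy to reason about but struggle in the difficult regions of the genome, which is where a system trained on real data is said to help.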
To avoid the errors produced by other methods of interpreting high-throughput sequencing data, the Google Brain team that developed DeepVariant fed their deep-learning system data from millions of high-throughput sequences as well as fully sequenced genomes. They then continued to adjust their model until the system could interpret sequenced data with high accuracy.
Brendan Frey, CEO of AI health software company Deep Genomics, told MIT Technology Review, “The success of DeepVariant is important because it demonstrates that in genomics, deep learning can be used to automatically train systems that perform better than complicated hand-engineered systems.”
The greater importance of such a tool may lie in its applications. A variety of diseases, ranging from cancers to diabetes to heart disease, are known to be genetically linked.
Medical professionals already take family history into account when diagnosing a condition; if they one day had access to your sequenced genome, analyzed by an AI capable of running through it quickly and accurately, they might be able to give you more precise information about yourself and the conditions you are at risk of developing.
A doctor could also more accurately prescribe treatment for the diseases that you already have, which is especially relevant for diseases like cancer.
This development is yet another step towards a future in which medicine is truly personal, and each patient is treated with their individual genetic variations in mind.