Blood counts, biopsies, experimental treatments and drug cocktails are all part of the experiences a cancer patient faces. By gathering this information from hundreds of thousands of people and analyzing it, scientists could learn more about the disease itself.

A team at Memorial Sloan Kettering Cancer Center in New York is training an artificial intelligence to find similarities between these cases that doctors might miss. Using the clinical notes for these patients, the software combs through more than 100 million sentences.

“We’re looking into the exhaust of all that data to try to find something interesting,” says Gunnar Rätsch, a data scientist at Memorial Sloan Kettering who is leading the team.

The concept developed by Rätsch and his team is simple, but it has far-reaching implications. They built computational models that assess a person’s condition, how it compares with that of other patients, and how the disease is likely to progress.

“Once we have that, we can think about how to treat the patient best.”

The researchers have built a machine learning algorithm that sifts through the clinical notes of 200,000 cancer patients. The program then groups millions of sentences into 10,000 clusters of related data, covering everything from patients’ symptoms and medical histories to doctors’ observations.

These clusters represent common observations found across multiple medical records, such as recommended treatment regimens or especially noteworthy symptoms. Once the connections between the clusters were mapped, the web of relationships between different comments and courses of treatment became visible.

The potential benefits of such an algorithm for cancer research are considerable.


In another study based on Rätsch’s research, the clusters were compared against the records of about 2,000 people with different types of cancer, in order to uncover unknown associations between doctors’ notes and patients’ gene and blood sequencing.

For example, patients with similar genetic profiles might have the same sort of note in their medical files. Connections like these, which had not previously been recognized, may in future help doctors gain new insight into the nature of cancer.

That is what the algorithm’s developers aim to achieve.

The hope is that these associations will provide fresh ideas for research. “You can take the genetic information and make this connection in order to find new hypotheses, which can then be tested,” says Rätsch. Machine learning has other medical applications too: computers are being trained to diagnose conditions using X-ray and MRI imagery, while another system at Chicago hospitals has learned to predict heart attacks.

Another AI system has been built to detect cancer cells in the body. All this research contributes to the long-term goal of eradicating cancer. “The human mind is limited, hence you need to use statistics and computer science,” notes Rätsch.

Let’s hope our computer programs can someday defeat cancer—a goal that our own ingenuity, so far, has failed to accomplish.
