Generative artificial intelligence, like the kind that powers OpenAI’s DALL-E, ChatGPT, and other popular programs, is going to be an important tool for breakthroughs in oncology, the study of cancer, according to Daphne Koller. Koller is an AI pioneer and co-founder and CEO of life sciences AI firm Insitro.
“What we’ve taken on as an effort is to really learn the language of histopathology [the study of tissues]… and then use that to […] give us potential [drug] targets,” said Koller, speaking at a daylong workshop hosted by Stanford University’s Human-Centered AI institute on Tuesday, titled, “New Horizons in Generative AI: Science, Creativity, and Society.”
Koller, an adjunct professor of computer science at Stanford, explained a two-step process that can lead to novel drug targets for cancer.
In the first step, Insitro’s machine learning technology analyzes histology images of cancerous tissue taken from biopsies. A human pathologist will “typically boil down these images of billions of pixels into, like, three numbers,” she explained, adding that “it’s clear that there is a ton more information that is available within them” that is not being used.
By using machine learning, the computer will “really learn the language of histopathology,” she said, which in turn lets the machine predict genetic changes in patients with cancer with 90% to 95% accuracy.
“So, basically, by looking at a slide, you can say this patient has this genetic mutation versus this other patient, something that no clinician can really do,” she explained.
That’s the first step. The second is finding drug targets, which requires far more tissue samples than are actually collected: thousands rather than dozens. To make up for that shortage of images, the Insitro team used generative AI to create “deep fakes” of tissue images, said Koller. “Rather than generating images of movie stars, we generate images of pathology slides.”
By multiplying tissue samples from hundreds to thousands, Koller explained, the team can analyze a much larger sample using “ATAC-seq,” an assay developed at Stanford that measures which regions of the genome are open and accessible. The team was able to go from 400 cancer tissue image samples to almost 100,000. That scale makes it possible to ask questions that could not be answered with fewer samples.
“And now you can basically start to ask questions like the open-ness or closed-ness of which gene — that is, the activity of which gene — is most strongly associated with survival, something that would not be possible to do if you had 30 patients.”
For example, analyzing thousands of deep fake images of triple-negative breast cancer together with ATAC-seq reveals previously unknown genetic changes that can be drug target candidates. “Some of these targets are novel in triple-negative breast cancer [but] they’ve been implicated in other cancers,” said Koller. “That gives you confidence around the causal role that they play, and [they] are potentially really interesting new drug targets.”
Koller described the overall program of generative AI in biology as dealing with a level of complexity that is “not something that the human brain will really ever be able to understand.”
“In order to tackle this domain, we really just need to first collect a very large amount of data, at unprecedented fidelity and scale, at different levels of biological granularity, and then let machines do what they now do much better than people, which is understand the subtle patterns in these data, help us redefine the heterogeneity and complexity of human disease, and identify intervention hubs that might give rise to therapeutics that work in the clinic.”
The workshop’s co-organizer, Percy Liang, an associate professor of computer science at Stanford, lauded Koller as a Stanford professor who “inspired a whole generation of researchers” in AI. Koller also co-founded the online learning company Coursera.
Liang noted that the workshop’s various speakers, including Koller, offered lots of examples of “multi-modality,” where the same kinds of generative AI programs operate on very different kinds of data, from biological data to sound data to even whale songs. As ZDNET has pointed out, multi-modality, which brings together different data types, is one of the most important future directions of the field.
In closing out her talk, Koller remarked that science goes through periods of tremendous progress. “Think about the late 1800s and chemistry, with the uncovering of the periodic table [of elements],” she offered. “Or the early 1900s, of course, physics, with the connection between energy and matter, space and time.”
The 1990s saw a similar explosion of discovery in two disciplines, said Koller: “Data/machine learning/AI, which is really something that began back then, and quantitative biology, which is the ability to measure biology at unprecedented fidelity.”
Those two disciplines are now merging, she said, to create a new field called digital biology, which is “the ability to read the biology digitally at this incredible fidelity at an unprecedented scale, interpret what we see using tools such as machine learning and AI, and then write biology using techniques like CRISPR and combinatorial chemistry, and all sorts of other things to make biology do things that it wouldn’t otherwise do.”
That new field, said Koller, will have “tremendous repercussions in human health, but also in the environment, in energy, in bio-materials, and sustainable agriculture, and many other disciplines that will help make our world a better place, which is why I think it’s a really exciting place to be.”
The full workshop agenda is posted online, and you can watch the replay of the entire event on YouTube.