Mind Machines & Medical Miracles – Insights from an interview with Mike Gartrell, Lead AI Research Scientist at Sigma Nova

In recent years, artificial intelligence (AI) has made an impressive leap with the emergence of Foundation Models (FMs). These are massive neural networks trained on enormous amounts of data, capable of performing many tasks, from creating images that never existed to answering complex questions as if they were experts. Examples? OpenAI’s GPT-4, which writes text that can easily pass for human writing, and DeepMind’s AlphaFold, which predicted the structure of almost all known proteins, potentially accelerating medical research by decades.

But while these foundation models promise scientific and industrial revolutions, they also bring profound dilemmas. When GPT-4 “hallucinates”, it invents false information that sounds real. When an AI system diagnoses a disease, it may outperform a human doctor in accuracy, but it can also create a false sense of security. And when brain-computer interfaces use AI to decipher neural signals and recreate what someone is seeing or thinking, we open the door to a future where even our thoughts can be analyzed.

To better understand these challenges and explore how far AI can go, we invited Mike Gartrell, Lead AI Research Scientist at Sigma Nova, a cutting-edge company in artificial intelligence research. Mike leads a team that works on the development of fundamental AI technology for foundation models, the systems that underlie technologies like GPT-4 and AlphaFold. He will help us understand how these models work, what they can actually do, and, most importantly, where we should draw ethical boundaries for their use.

In this interview, we will talk about how AI is transforming science, medicine, and even our understanding of the human mind. But we will also challenge Mike with tough questions: Are we creating a generation of specialists who rely too much on AI? And if AI models can read thoughts, are we ready for that? Let’s find out.

Questions

  1. [Ana] Mike, foundation models like GPT-4 and DALL-E from OpenAI have demonstrated an impressive ability to create text, images, and even music. But in many situations, they also produce false information that seems real. At Sigma Nova, how does your team deal with the problem of AI “hallucination” — that is, when the model generates information that appears real but is not? Are there areas of science where these errors are particularly dangerous?

[Mike] Hallucination in AI is a complex topic that is still not fully understood. In the context of large language models (LLMs) such as GPT-4, hallucination can be seen as the model generating a wrong or misleading response to a text prompt with very high confidence, particularly when the prompt is significantly different from the data the model was trained on (called out-of-distribution, or OOD, in the AI research community). In these situations the model’s confidence does not reflect how likely it is to be correct; in other words, its uncertainty estimates are unreliable. Addressing the challenge of producing reliable uncertainty estimates is an active area of research.
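To make the idea of confident but meaningless OOD predictions concrete, here is a small, self-contained sketch (toy data and a toy model invented for this article, not anything from the interview): a linear classifier trained on two well-separated clusters assigns near-certain probabilities to an input that looks nothing like its training data.

```python
# Toy illustration: a point-estimate model can be "confidently wrong" far from
# its training data. All data here is synthetic and purely for illustration.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Training data: two well-separated Gaussian blobs.
class_a = rng.normal(loc=[2.0, 2.0], scale=1.0, size=(200, 2))
class_b = rng.normal(loc=[-2.0, -2.0], scale=1.0, size=(200, 2))
X = np.vstack([class_a, class_b])
y = np.array([0] * 200 + [1] * 200)

clf = LogisticRegression().fit(X, y)

in_dist = np.array([[2.0, 2.0]])        # resembles the training data
out_of_dist = np.array([[50.0, 40.0]])  # nothing like the training data
print("in-distribution probabilities:    ", clf.predict_proba(in_dist).round(3))
print("out-of-distribution probabilities:", clf.predict_proba(out_of_dist).round(3))
# Both inputs receive near-certain probabilities, even though the model has
# never seen anything like the second one; its confidence there is not a
# reliable uncertainty estimate.
```

The same failure mode, confident extrapolation far from the training distribution, is one simple way to picture the unreliable uncertainty estimates Mike describes for much larger models.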

Hallucination and unreliable uncertainty estimates can be especially problematic in medical and clinical settings, where doctors need a reliable estimate of the model’s uncertainty in its predictions in order to use the model. For example, if a model is used to select possible participants for a study on a neurological or mental disorder based on their EEG or fMRI brain recordings, the doctor must be able to trust the model’s predictive uncertainty estimates and take action based on these predictions.

One way to provide more reliable uncertainty estimates is through Bayesian approaches to machine learning. LLMs are typically trained using optimization-based algorithms, which produce a single point estimate (a single numerical value) for each learned parameter in the model. While this approach is relatively fast and efficient, it makes reliably estimating uncertainty difficult. In contrast, Bayesian approaches estimate a probability distribution over possible values of the model parameters, which allows for more reliable estimates of the model’s predictive uncertainty. Of course, this benefit does not come for free, and it can be difficult to scale Bayesian approaches to very large scientific foundation models. Sigma Nova and others in the AI research community are actively investigating Bayesian approaches in this setting, as well as other methods for reliable uncertainty estimation and for reducing hallucinations.
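As a rough illustration of the contrast between point estimates and distributions over parameters, the sketch below uses Monte Carlo dropout, a well-known and inexpensive approximation to Bayesian inference in neural networks. The model, data, and hyperparameters are toy choices made for this article, not anything Sigma Nova uses: the idea is simply that averaging many stochastic forward passes gives a spread of predictions, and that spread can serve as an uncertainty signal.

```python
# Monte Carlo dropout sketch: sample many stochastic forward passes and use
# the spread of the predictions as a rough predictive-uncertainty estimate.
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy binary-classification data: two Gaussian blobs.
x = torch.cat([torch.randn(200, 2) + 2.0, torch.randn(200, 2) - 2.0])
y = torch.cat([torch.zeros(200, dtype=torch.long), torch.ones(200, dtype=torch.long)])

model = nn.Sequential(
    nn.Linear(2, 64), nn.ReLU(), nn.Dropout(p=0.2),
    nn.Linear(64, 64), nn.ReLU(), nn.Dropout(p=0.2),
    nn.Linear(64, 2),
)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)
for _ in range(300):
    optimizer.zero_grad()
    nn.functional.cross_entropy(model(x), y).backward()
    optimizer.step()

def mc_dropout_predict(net, inputs, n_samples=50):
    """Keep dropout active at test time and average over stochastic passes."""
    net.train()  # leaves the dropout layers switched on
    with torch.no_grad():
        probs = torch.stack(
            [torch.softmax(net(inputs), dim=-1) for _ in range(n_samples)]
        )
    mean = probs.mean(dim=0)
    # Predictive entropy of the averaged probabilities: higher values suggest
    # the prediction is less reliable.
    entropy = -(mean * mean.clamp_min(1e-9).log()).sum(dim=-1)
    return mean, entropy

test_points = torch.tensor([[2.0, 2.0],      # near the training data
                            [0.0, 0.0],      # on the decision boundary
                            [10.0, -10.0]])  # far from anything seen in training
mean, entropy = mc_dropout_predict(model, test_points)
print("mean probabilities:\n", mean)
print("predictive entropy: ", entropy)
```

Deep ensembles, where several independently trained networks are averaged, are another widely used approximation in the same spirit; both are far cheaper than exact Bayesian inference, though neither fully solves uncertainty estimation for inputs far outside the training distribution.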

  2. [Ana] You lead a team that works with scientific foundation models. In some fields, such as AI-assisted medical diagnosis, these models can help identify diseases more quickly and accurately. But there is also the risk that medical professionals may become overly dependent on these technologies. Do you believe that foundation models are making experts more efficient or more complacent? How can we ensure that they complement rather than replace human judgment?

[Mike] We’re still in the early days of real-world use of scientific foundation models. I view foundation models, and AI in general, as tools that enhance human capability and judgment rather than replacing it. In particular, for high-consequence medical decisions, it is likely that AI will never fully replace human decision making. Scientific AI tools can help with tasks such as interpreting X-rays and other medical images, predicting protein structure, and identifying biomarkers that may indicate disease risk. In scientific and medical settings, the ideal arrangement seems to be one where human experts work collaboratively with AI tools. For example, imagine a radiologist interacting with a vision-language foundation model for medical images, querying the model for feedback about particular regions of an image through a graphical and text-based chat interface. Such interactive settings, where the human expert and the AI tool work together, help ensure that experts work more efficiently and that humans and AI complement each other effectively.
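As a purely hypothetical illustration of the workflow Mike sketches, here is a skeleton of a region-of-interest query loop. `MedicalVLM`, its `answer` method, the report strings, and the file name are placeholders invented for this article, not a real API or product.

```python
# Hypothetical skeleton of a radiologist-in-the-loop workflow: the human
# selects a region of an image and asks a vision-language model about it.
# Every class and method here is a placeholder, not a real library.
from dataclasses import dataclass

@dataclass
class RegionOfInterest:
    x: int       # top-left pixel column of the selected region
    y: int       # top-left pixel row of the selected region
    width: int
    height: int

class MedicalVLM:
    """Placeholder wrapper around a vision-language foundation model."""

    def answer(self, image_path: str, roi: RegionOfInterest, question: str) -> str:
        # A real system would crop the region, encode image and text, and
        # return the model's answer together with a confidence estimate.
        raise NotImplementedError("placeholder for a real model call")

def review_session(model: MedicalVLM, image_path: str) -> None:
    """One turn of the interaction; the radiologist keeps the final say."""
    roi = RegionOfInterest(x=120, y=80, width=64, height=64)  # would come from a GUI
    question = "Is there an opacity in this region, and how confident are you?"
    try:
        suggestion = model.answer(image_path, roi, question)
    except NotImplementedError:
        suggestion = "(no model wired up in this sketch)"
    print(f"Model suggestion: {suggestion}")
    print("Final interpretation is recorded by the radiologist.")

review_session(MedicalVLM(), "chest_xray_0001.png")
```

The point of the sketch is only the shape of the loop: the model suggests, the human expert decides.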

  3. [Ana] Recently, DeepMind’s AlphaFold revolutionized biology by predicting the structure of almost all known proteins. However, it also raised concerns about the impact of these technologies on traditional scientific research. At Sigma Nova, are you also developing models aimed at transforming scientific fields? How do you balance the transformative power of these models with the risk of replacing established research methods?

[Mike] The history of science involves a continual search for new ideas and models that deepen our understanding of nature and the universe. Sometimes these new ideas incrementally refine our previous understanding, and sometimes they completely replace what came before. As an example of the latter, consider aether theory in physics from the 17th to the late 19th century, which proposed that light traveled through a transmission medium called the aether. With the development of special relativity, the theory fell out of use in modern physics. We now know that light can travel through a vacuum, without any particular transmission medium.

Scientific foundation models could lead to incremental scientific advances, and eventually to fundamentally new scientific concepts. So we don’t see a tension here with the arc of how science evolves over time. While we at Sigma Nova are initially focused on developing scientific foundation models with a more applied impact, we fully understand that our work may fundamentally deepen scientific understanding over the long term.

  4. [Ana] Brain-inspired models, such as Generative Neuroimaging Models, are being used to try to “read thoughts” or recreate images that someone has seen by analyzing brain signals. In your opinion, what is the ethical limit for using these models? Do you think society is ready for technologies that could, in theory, access someone’s thoughts?

[Mike] This is an important question. Rather than building AI models that reconstruct brain stimuli (such as images or language) as an end in itself, which certainly raises ethical and privacy concerns, I’m more interested in using such models to learn about underlying brain dynamics for other downstream applications. For example, a model that can reconstruct brain stimuli with high accuracy may also generalize better to tasks such as detecting indicators of neurological or mental health disorders in medical settings. We at Sigma Nova are interested in investigating the promise of such approaches.
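One hedged sketch of what that could look like in practice: reuse the encoder from a model pretrained on a stimulus-reconstruction objective as a fixed feature extractor, and train only a small head for a clinical label. The architecture, input shapes, and synthetic data below are assumptions made for illustration, not a description of any Sigma Nova system.

```python
# Illustrative transfer-learning sketch: a (stand-in) pretrained encoder for
# brain recordings is frozen, and only a small classification head is trained
# on a downstream disorder-detection task. All shapes and data are made up.
import torch
import torch.nn as nn

N_CHANNELS, N_TIMESTEPS, LATENT_DIM = 64, 256, 128  # assumed EEG-like input

# Stand-in for an encoder pretrained on stimulus reconstruction (here it is
# randomly initialised, since this is only a sketch).
pretrained_encoder = nn.Sequential(
    nn.Flatten(),
    nn.Linear(N_CHANNELS * N_TIMESTEPS, 512), nn.ReLU(),
    nn.Linear(512, LATENT_DIM),
)

class DisorderClassifier(nn.Module):
    """Frozen pretrained encoder plus a small trainable classification head."""

    def __init__(self, encoder: nn.Module, n_classes: int = 2):
        super().__init__()
        self.encoder = encoder
        for p in self.encoder.parameters():
            p.requires_grad = False  # keep the pretrained representation fixed
        self.head = nn.Linear(LATENT_DIM, n_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.head(self.encoder(x))

model = DisorderClassifier(pretrained_encoder)
optimizer = torch.optim.Adam(model.head.parameters(), lr=1e-3)

# Tiny synthetic batch standing in for labelled clinical recordings.
recordings = torch.randn(8, N_CHANNELS, N_TIMESTEPS)
labels = torch.randint(0, 2, (8,))

loss = nn.functional.cross_entropy(model(recordings), labels)
loss.backward()
optimizer.step()
print(f"one fine-tuning step done, loss = {loss.item():.3f}")
```

Any clinical use of such a model would, of course, also need the kind of reliable uncertainty estimates discussed earlier in the interview.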

  5. [Ana] Although your focus is not directly on neuroscience, you work with AI that could potentially be applied to this field. Considering the rise of foundation models in mental health, where AI therapy apps are becoming common, do you think we are moving towards a society where people prefer to open up to machines rather than to other people? How can AI researchers ensure that these technologies promote well-being instead of increasing social isolation?

[Mike] This is a very challenging question, with big implications. One could argue that the proliferation of AI therapy apps provides mental health services to more people, helping to address the shortage of mental healthcare professionals and thereby improving mental health and well-being across the population overall. In that sense, AI therapy services may contribute to improvements in mental health, which could actually help reduce social isolation. Also, as I mentioned previously, brain foundation models trained on brain recording data could facilitate the study, and potentially the diagnosis, of mental disorders, which offers further potential for improving mental health. Finally, brain foundation models may enable AI-powered brain imaging devices that are relatively lightweight and inexpensive, giving more people access to early diagnosis of mental and neurological disorders. So, in this regard, AI provides potential solutions for improving mental health and making treatment more accessible.

There is also an interesting longer-term aspect to your question, beyond the specific cases of AI therapy apps and brain foundation models. As next-generation LLMs and video-language models become more powerful, and embodied AI continues to emerge, one could imagine an accelerating trend toward people preferring interactions with machines in the coming years. This could have profound long-term societal impacts, and longitudinal studies will be needed to understand the psychological and societal effects.