Technology developers, electronic health records vendors and hospitals are grappling with how to govern artificial intelligence as futuristic algorithms become increasingly pervasive in the operations and delivery of patient care.
Growing AI adoption, coupled with a lack of comprehensive supervision from the federal government, is leading some to worry the industry may have too much latitude in building and implementing the technologies, experts said last week at the HIMSS conference in Orlando, Florida.
Hospitals and developers say they’re being careful, creating a forest of internal controls and standards groups to ensure AI tools — including those based on generative AI — are accurate and unbiased. But those safeguards may not be enough as algorithms grow more complex and harder to oversee.
“With modern AI systems, the most transparent you can be is here are the mechanics how we trained it, here’s what we measure about its behavior. What happens in the middle is a black box. Nobody knows. It’s a mystery,” said Heather Lane, senior architect in health software company Athenahealth’s data science team, in an interview.
Washington is currently working to create a strategy to oversee AI, with a new HHS task force kicking off work in January.
With the threat of regulations looming, technology companies building AI and the hospitals benefiting from it are warning Washington not to enact policies that could smother the future of AI, given its potential to lower hospital costs, ameliorate doctor burnout and improve patient care.
“There’s an alignment between health systems and the government — they’re looking at us to put safety in place,” said Robert Garrett, CEO of New Jersey hospital operator Hackensack Meridian Health, during a panel on AI. “What we don’t want to do is stifle the innovation.”
‘New horizons’
Over the past year, technology giants like Google and Microsoft launched a flurry of AI tools for healthcare organizations and partnered with major hospital chains to explore new uses. They’re also linking with large electronic health records vendors like Epic and Meditech to integrate their algorithms directly into clinicians’ workflows.
As a result, hospitals interested in exploring AI have never had more options. That’s true both of tried-and-tested predictive AI, which analyzes data to make predictions — what’s increasingly being thought of as “traditional” AI — and the more futuristic generative AI, which can produce original content.
Generative AI roared into the public sphere in late 2022, when OpenAI launched ChatGPT, a human-like chatbot built on a large language model called GPT.
Much of the subsequent debate around generative AI in healthcare has centered on whether and when the technology could replace physicians. If that happens at all, it’s far off, experts say.
Healthcare organizations using generative AI today are mostly focusing on low-risk, high-reward use cases in administration, like summarizing records, improving nurse handoffs between shifts or transcribing doctor-patient conversations.
California system Stanford Health Care is experimenting with new software from EHR vendors that automatically drafts clinician replies to patient notes, said Gary Fritz, Stanford’s applications chief, during a panel.
Vanderbilt University Medical Center in Nashville, Tennessee, even created its own internal GPT-backed chatbot after discovering some of its clinicians were accessing ChatGPT onsite, according to Yaa Kumah-Crystal, VUMC’s clinical director of health IT.
Mayo Clinic recently put out an internal call for proposals for projects using generative AI and received over 300 responses, Cris Ross, CIO of the Mayo Clinic, said in an interview. The academic medical system has since narrowed it down to about 10.
Generative AI “is so exciting to us because so much of our data is language-based and/or multimodal,” Ross said. “There are new horizons that have opened because of the power of the language tools.”
Internal control efforts ramp up
As AI adoption accelerates, so do worries about responsible AI. The technology is far from perfect: Generative AI is known to hallucinate, or provide answers that are factually incorrect or irrelevant. Model drift, when an AI’s performance changes or degrades over time, is also an issue.
The technology also raises larger questions of privacy, ethics and accountability. Bias is an especially acute worry in healthcare, an industry struggling to quash historical inequities in care delivery and outcomes. If an algorithm is trained on biased data or applied in biased ways, it will perpetuate those biases, according to experts.
EHR vendors and hospitals at the forefront of AI adoption say they’ve created robust internal controls to keep their models in check, including validation and frequent auditing.
Meditech, a major vendor for hospitals that’s partnered with Google on AI, extensively tests its models on its own datasets to ensure information is protected and outputs are reliable, said COO Helen Waters in an interview.
“We do a lot on the validation process before we ever release anything to a customer,” Waters said.
Epic, the largest EHR vendor in the U.S., has taken similar steps. The performance of all its AI models is constantly tracked by monitoring software once they’re in a customer’s EHR, said Seth Hain, Epic’s senior vice president of R&D, in an interview.
Health systems are also wary. Hospital executives said they’re standing up governance committees to oversee AI pilots, training employees on the technology and continuously monitoring any AI tools.
Highmark Health, an integrated health system based in Pittsburgh, created a committee to oversee generative AI use cases as they emerge, said Brian Lucotch, president of Highmark’s tech subsidiary enGen, in an interview. The system has also formed groups working on reengineering Highmark’s business processes and training its workforce around AI, Lucotch said.
Along with oversight and intake processes for traditional AI, nonprofit giant Providence has stood up specific governance structures around generative AI, Sara Vaezy, Providence’s chief strategy and digital officer, told Healthcare Dive via email.
The seven-state system is also building infrastructure to test, validate and monitor AI at every phase of its deployment, from development to maintenance, Vaezy said.
Similarly, academic medical system Duke Health has stood up an “algorithmic-based clinical decision support oversight process,” said Michael Gao, a data scientist at the Duke Institute for Health Innovation, during a panel on AI adoption strategies.
The process begins after an AI project is initiated, and monitors models for fairness and reliability, including any signs of drift or bias, Gao said.
VUMC also has a process for the intake and vetting of models, along with ensuring their outputs remain applicable to its patient population, said Peter Embí, VUMC’s SVP for research and innovation, during a panel on operationalizing AI.
But “one of the things we know, when we deploy these tools in practice more so than just about any other tool we’ve dealt with in the past, things are going to change,” Embí said. “And when those things change they can have very significant impacts that can be very detrimental.”
Ever more challenging to police
Some AI ethicists and computer scientists are warning existing governance may not be up to the task of overseeing increasingly complex models.
It was clear how traditional predictive AI worked and obvious when it diverged from its purpose, said Athenahealth’s Lane.
But “for generative AI, it’s much harder to get a hold of what we call ground truth — a clear right answer,” Lane said.
For example, it’s clear if an AI meant to standardize insurance information gets an output wrong because you can refer back to the original dataset, Lane said. But for generative AI tasked with summarizing a patient’s entire medical history based on unstandardized clinical notes, “there’s a lot more variability,” Lane said. “No two clinicians would write the same note in exactly the same way. And generative AI isn’t going to write it the same way twice in a row either.”
As a result, evaluating these tools is incredibly difficult, according to Michael Pencina, chief data scientist at Duke Health.
The notion of explainability, or knowing how an AI system reached a particular decision, “goes out the window,” Pencina said during a panel on operationalizing AI. “It works, but we don’t really know why.”
And when it comes to quantitatively analyzing the performance of large language models, “there are really no standards,” Pencina said. “How do you measure — a written note — how good it is?”
These concerns are exacerbated by research finding governance systems for existing AI models may not be rigorous enough. Even well-resourced facilities with explicit procedures for the use and assessment of predictive tools are struggling to identify and mitigate problems, according to a study published in January in NEJM.
When the functioning of an AI affects patient care, gaps in governance can become a huge issue — even with more knowable traditional models.
California nonprofit giant Sutter Health at one point used a predictive AI tool that flagged potential cases of pneumothorax, when air collects outside the lung but inside the chest, for radiologists, said Jason Wiesner, Sutter’s chief radiologist, during a panel on AI in healthcare delivery.
The tool was integrated into radiologists’ imaging system before false positives and negatives were sorted out, causing clinicians to see pneumothorax warnings during patient visits despite no other evidence of the serious condition.
The algorithm was pulled after three-and-a-half weeks. But “there was some time there when we could have done some patient harm,” Wiesner said.
Washington moving slowly
AI’s rapid advancement and implementation are occurring in a regulatory environment multiple experts described as “patchwork.”
A handful of federal agencies, including the CMS, the Office of the National Coordinator and the Food and Drug Administration, have issued targeted regulations around AI use and quality. Some states have also passed health AI legislation. However, Washington has yet to create a concrete strategy for overseeing health AI.
That’s expected to change this year, as a new HHS task force works on building a regulatory structure to assess AI before it goes to market and monitor its performance and quality after it does.
The task force has “made significant progress” on an overall strategic outline, along with deliverables due by the end of April, including a plan for AI quality and assurance, Syed Mohiuddin, counselor to the deputy secretary of HHS, told Healthcare Dive via email. Mohiuddin is co-leading the task force along with Micky Tripathi, the head of the ONC.
In the meantime, AI developers and healthcare providers say the solution is for the private sector to come together to share best practices and create guidelines for themselves.
The resulting standards groups have published blueprints for trustworthy AI adoption and plan to operationalize those standards in practice, like inventorying which algorithms members are using and tracking outcomes.
In an interview, David Rhew, Microsoft’s chief medical officer and VP of healthcare, stressed that such networks are public-private partnerships and that the federal government has ongoing input into the practices of private companies today.
But implementation challenges are “not going to be figured out by any one company or even the government. It’s going to be figured out by organizations that are doing it,” Rhew said.
Some executives suggest there may not be a bigger role for the government to play in overseeing AI, given the lightning speed of technological advancement.
“AI technology is advancing so rapidly — can the regulations even keep up? I don’t think so. Governance should emanate from the healthcare sector itself, and we can work hand in glove with regulators,” said Hackensack’s Garrett.
Some experts have concerns about developers being left wholly to their own devices, given the massive profit motivation in racing AI applications to market. Demand for the services is causing revenue and profits for technology giants to snowball and their valuations to soar.
“It would be really best if everyone acted ethically and responsibly without any regulation or coalition, but that’s unlikely,” said Mayo’s Ross.
Private sector efforts will “play an essential role” in setting standards for AI in healthcare, Jeff Smith, ONC’s deputy director of certification and testing, told Healthcare Dive over email. But “participation in these governance groups will not replace the need for regulation.”
Balancing regulation with innovation
AI developers said they’re open to regulation, but Washington needs to be careful it doesn’t enact so much oversight that it stifles new advances.
Past regulations around EHR technology, for example, rankled vendors by slowing down product launches. Tech developers don’t want more of the same for AI, especially given how quickly the technology evolves.
“I can tell you that I don’t want it to be too heavy. But we don’t want it to be too late. There has to be some form of a litmus test that says we can produce enough information to validate our assumptions, to validate our models, to tell you why you should trust them,” said Meditech’s Waters.
The government could be helpful when it comes to continuous monitoring of AI tools, said Nasim Afsar, Oracle’s chief health officer, in an interview.
For example, regulators could check the output of algorithms, after the private sector does its due diligence during development and implementation.
“How do we make sure that at multiple gates the algorithms are doing what they need to do? So first the developing gate, then at the implementation gate in the healthcare delivery system, and then there needs to be a continuous monitoring piece,” said Afsar. “From a policy standpoint, what could that look like?”
Regulators should become more involved as AI moves into higher-risk use cases, such as diagnosis, according to Aashima Gupta, Google Cloud’s global director of healthcare strategy and solutions.
“This has moved so fast, just in a year, and we’re talking use cases. There’s a lot of discussion, and now we’re talking actual use cases of actual impact. That absolutely needs to be regulated,” Gupta said in an interview.
Other experts said requiring more transparency from AI developers would go a long way toward making their products more trustworthy.
That’s especially true because only a small number of gargantuan companies actually build large language models, and “the rest of us are a little bit at their mercy,” said Athenahealth’s Lane.
“What I would love to see from them is much more detailed quantitative information on the behavior of the system with respect to correctness, with respect to omissions, not just on control test sets the way they tend to publish in white papers but in relevant real-world conditions,” Lane said. “If anybody is in a position to coerce the large tech companies about this, it would be the government.”
The difficult road ahead
Though AI stakeholders are split on what they’d like to see from Washington, they agree on one thing: Regulators seeking to erect guardrails around AI are facing a Sisyphean endeavor, tasked with overseeing a technology that’s constantly changing and already in the hands of doctors.
Washington will have to grapple with a number of open questions: how to regulate different types of AI technologies, account for different workflows across organizations, make sure AI is available to all patients, and ensure accuracy without exacerbating disparities, said Epic’s Hain.
“It’s a challenging path to weave,” Hain said.
Despite the growth of private sector-led standards groups, the HHS will have to weigh in on those questions soon. Regulators are aware of the tightrope they tread.
“It is critical that we ensure the fair, appropriate, valid, effective, and safe use” of AI in healthcare, ONC’s Smith said. “Balancing this critical need while enabling innovation to flourish is a primary objective of ONC’s policies.”
Still, companies with a stake in the AI game are waiting to see what emerges from Washington — whether any regulations will fill gaps not covered by standards organizations while leaving the private sector free to innovate, or impose onerous testing and reporting requirements that could slow the breakneck pace of AI advancement.
“I hope they will be more permissive than restrictive until proven that we need more regulation, because these technologies are going to save lives. And we really don’t want to slow them down,” said Mayo’s Ross. “We want to go as fast as possible but no faster than is safe.”