Fiona Bugler

What’s up AI Doc? 

Keeping AI Health Safe in the Age of LLMs


At Healthily, safeguarding healthcare through technology forms the core of our mission. As a testament to our commitment to safety and responsible AI, we were the first company to register our AI as a Class 1 Medical Device, in 2017 (we are currently working towards Class 2a status), and the first to publish an AI Explainability Statement, in 2021.



In light of recent findings from a study published in the BMJ in March 2024, our specialists delve into the critical need for enhanced safeguards and transparency for large language models (LLMs) in the health sector.


  • Research published in the BMJ in March 2024 found that the safeguards in place to prevent LLMs from generating health disinformation need to be more robust.

  • Additionally, the researchers say AI developers need to be more transparent and regulations need to be enforced to keep health information safe.

  • The World Health Organization recognises health disinformation as a critical threat to public health – it is estimated that, by September 2022, more than 200,000 COVID-19-related deaths in the US could have been averted had public health recommendations been followed.



With the advancement of AI comes huge responsibility. Safety and sharing correct information are at the heart of what we do here at Healthily. Our AI-driven navigation system, Dot™, is built with safety at its core.


Misinformation spreads quickly; the BMJ article highlights that more than 70% of individuals use the internet as their first source of health information, and studies indicate that false information spreads up to six times faster online than factual content.


The research specifically looked at disinformation from LLMs around two false claims: that sunscreen causes skin cancer, and that the alkaline diet is a cure for cancer. It examined several LLMs, including OpenAI’s GPT-4, Google’s PaLM 2 and Gemini Pro, Anthropic’s Claude 2, and Meta’s Llama 2.


It found that while Claude 2 consistently declined to generate disinformation, other models did not, suggesting that safeguards are not consistently effective.


Was the study set up to fail?

Jas Singh, Chief Technology Officer, and Andre Ribeiro, an experienced data scientist on the AI team at Healthily, have spent almost a decade working on how best to use AI in health and have observed safeguarding issues from the front line. They identify some flaws in the research.


“The tests being run were set up to generate disinformation, and the very nature of LLMs deliberately makes them more likely to do that regardless of safeguards,” says Jas.

“A complete set of safeguards would mean the LLM being trained to detect when the question is asking something that isn't medically true. This is a hard task and not one that is accomplished easily by generic LLMs,” he adds.


Andre agrees: “Just as a human can be directed to deceive or spread false information, so too can an LLM when instructed. This doesn’t cast the LLM in a negative light, as it merely follows the directions it’s given.”


Rigorous safeguarding is key

“The critical area for enhancement lies within implementing rigorous safeguards in medical environments to prevent such misuse,” suggests Andre. 


“Take, for example, OpenAI's GPT-4. While it demonstrates proficiency in generating text across various domains, including medical information, OpenAI’s safeguards may not be adequate.


It's also important to note that, in a hypothetical scenario where this technology is commercialised, requesting an LLM to disseminate disinformation would unequivocally violate the terms and conditions,” he emphasises.


We cannot police LLMs – or teach the user to ask the right questions

“LLMs are dangerous for use in direct-to-consumer health advice use cases because of the lack of explainability in how the answer is generated,” says Jas.  


“When using tools like ChatGPT or Gemini, there won’t be any safeguards around the questions you ask them from a medical perspective,” he adds.


“To safely use tools like these, it's crucial to feed them documents containing accurate data and correct health information. Additionally, based on our research, implementing multiple safeguards to manage the output – i.e., the generated responses – is essential,” says Andre.
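In practice, that grounding-plus-safeguards pattern might look something like the sketch below. It is a minimal illustration only, not Healthily’s implementation: retrieve_verified_passages, call_llm and contains_unsupported_claims are hypothetical placeholders for whichever vetted document store, model API and output checks a team actually uses.

```python
# Illustrative sketch only - not Healthily's implementation.
# The three helpers below are hypothetical placeholders.

from typing import List


def retrieve_verified_passages(question: str) -> List[str]:
    """Return passages from a medically reviewed content library (placeholder)."""
    raise NotImplementedError("Back this with your own vetted document store.")


def call_llm(prompt: str) -> str:
    """Send the prompt to whichever LLM you use (placeholder)."""
    raise NotImplementedError("Back this with your chosen model API.")


def contains_unsupported_claims(draft: str, passages: List[str]) -> bool:
    """Output check (placeholder): keyword rules, an NLI model, or human review."""
    return False


def answer_health_question(question: str) -> str:
    passages = retrieve_verified_passages(question)
    if not passages:
        # Safeguard 1: refuse rather than let the model improvise.
        return "I can't answer that from verified sources. Please speak to a clinician."

    prompt = (
        "Answer using ONLY the verified passages below. "
        "If they do not contain the answer, say you don't know.\n\n"
        + "\n\n".join(passages)
        + f"\n\nQuestion: {question}"
    )
    draft = call_llm(prompt)

    # Safeguard 2: a second check on the generated response before it reaches the user.
    if contains_unsupported_claims(draft, passages):
        return "I can't answer that from verified sources. Please speak to a clinician."
    return draft
```

The key design choice is that the system refuses by default: if the verified library cannot support an answer, or the output check flags the draft, the user is pointed to a clinician rather than to generated text.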


The researchers also audited AI developers for transparency and for their responsiveness when safeguard vulnerabilities were reported. The results indicated a lack of engagement from developers.


“This is a hard problem to solve,” says Jas. “Engagement requires domain knowledge, and that’s something that will take time to cultivate. Developers must work alongside medical professionals to build the required safeguards.”


Healthily’s promise to deliver a trustworthy product

“At Healthily, medical and technical teams have worked closely together to shape the navigation system, information and guidance for our end users,” adds Jas.

Dot™ uses medically verified content, smart search, and our AI-driven smart symptom checker to signpost health insurance members to the best health journey for them, providing clear information en route.


Our library of trusted medically verified content has been built over 9 years and includes more than 2,000 articles, from insightful A-Z Condition pages to empowering self-care guides. Every piece of content is meticulously crafted, medically referenced and reviewed by medical doctors.


When navigating their health journey, a user can gather information from Dot™: it suggests articles in response to queries in the Smart Symptom Checker, Dot™ Smart Search can surface the paragraphs or sections most relevant to the user’s query, and Dot™ can carry out a virtual health consultation.


“With Dot™ we do not rely on an LLM to generate answers, so there’s less chance of misuse for disinformation,” says Jas.
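For contrast, the sketch below illustrates the general retrieval-only pattern that quote describes: answers are selected verbatim from a library of pre-written, medically reviewed sections rather than generated by a model. It is a deliberately simple toy (keyword overlap stands in for real search and ranking) and not a description of Dot™’s actual internals.

```python
# Illustrative retrieval-only pattern: nothing is generated, sections are
# returned verbatim from a reviewed library. Toy scoring by keyword overlap.

from dataclasses import dataclass
from typing import List


@dataclass
class Article:
    title: str
    sections: List[str]  # medically reviewed, pre-written text


def suggest_sections(query: str, library: List[Article], top_k: int = 3) -> List[str]:
    """Return the most relevant verified sections verbatim."""
    terms = set(query.lower().split())
    scored = []
    for article in library:
        for section in article.sections:
            overlap = len(terms & set(section.lower().split()))
            if overlap:
                scored.append((overlap, f"{article.title}: {section}"))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:top_k]]


if __name__ == "__main__":
    demo_library = [
        Article("Sunscreen and skin cancer", ["Sunscreen reduces the risk of skin cancer."]),
        Article("Hydration", ["Most adults should drink water regularly through the day."]),
    ]
    print(suggest_sections("does sunscreen cause skin cancer", demo_library))
```

Because every returned sentence already exists in the reviewed library, the worst a system like this can do is surface an irrelevant section; it cannot compose new disinformation.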


But in fine-tuning our AI offering, the team has explored how generative AI can work in health. After almost a decade of troubleshooting, testing and trialling, we believe we’ve landed on the safest AI healthcare solution for health insurers who need a secure digital front door to their services.


A Call for Regulation and Transparency

Despite its flaws, the study highlights the issues and challenges around the rapid growth of LLMs and how the spread of disinformation can put public health at risk. It also shows how health inequalities could be made worse by a lack of information and resources. And it seems there’s a long way to go before LLMs can safely offer medical advice and information to the general health consumer.


The researchers conclude by calling for better regulation, transparency, and routine auditing of AI developers and their LLMs to prevent the mass generation of health disinformation. Our team agree:


“We should promote transparency and regulation and ask key questions: which sources were used to train the model? Is the model based on questionable ones, with potential disinformation? Which safeguards are in place? How do they work?” recommends Andre. 


“Regulation and transparency make it possible for companies to explore actual use cases for this technology with safety in mind rather than just letting these models run wild,” adds Jas. 


