Eating rocks: The case for early integration of medical ethics into AI product development

Updated 5 months ago on June 18, 2024

The recent integration of Gemini, a modern big language model, into Google's search engine marks a significant step in creating artificial intelligence-based answers to digital health queries. With users worldwide turning to Google with health queries 70,000 times per minute, the company has years of experience in communicating health information and a solid foundation for delivering high-quality, authoritative answers to users.

Except when it isn't.

As a physician (R.S.H.) and a public health communicator (A.B-G.) working together on technology solutions to improve digital health communications, we understand how important it is for digital products like Google to provide reliable health information. As our new article in the Journal of Health Communication states, we believe that generative AI tools have the potential to become trusted and reliable sources of health information. But they have not yet reached that pinnacle.

Gemini, for example, can produce inaccurate and potentially harmful health-related information. Users were quick to discuss some of the more bizarre findings on social media, such as "add Elmer's glue to pizza sauce to keep cheese from sliding off" and "geologists recommend eating at least one small rock every day."

Not all misinformation is obvious and humorous. Another widespread result from Gemini: "Doctors recommend smoking 2-3 cigarettes a day during pregnancy."

Innovations in artificial intelligence are advancing at a rapid pace. The underlying large language models and other generative artificial intelligence technologies have the potential to scale solutions to address healthcare access, medical misinformation, physician and other health and public health professional burnout, and others.

As artificial intelligence products are deployed by both large commercial and smaller developers, they also have the potential to exacerbate the problem of medical misinformation and cause real harm to people. While the examples we have cited seem isolated among the millions of qualitative responses, they highlight the importance of considering the health impacts of artificial intelligence products not specifically designed for medical information, as the consequences for a user who trusts such advice can be very serious.

When providing health information or medical advice, physicians, researchers, and public relations professionals are guided by four fundamental principles of medical ethics-intangibility, beneficence, autonomy, and justice-to make morally informed judgments and protect people from harm. AI products that are intended to or may provide health information or medical advice should not be exempt from these ethical principles, which should be applied in their design and implementation.

The principle of "do no harm", otherwise known as "Do no harm", is the basis for decision-making in clinical and research practice. In medicine, it is a complex assessment of the risks and benefits of an intervention, a reminder that one should never underestimate one's capacity to cause harm, even when trying to help. If taken too literally, this principle is fraught with stagnation and potential harm from inaction. A parallel can be drawn with the current spectrum of rapid AI development philosophies, where effective altruism and effective accelerism are at both ends.

While the benefits of AI products may outweigh the risks, the development and deployment of AI products should be based on the intentional avoidance of harm, especially when creating medical content. From a practical perspective, technology developers can follow this ethical principle by prioritizing safety through red teaming, ensuring that high-quality and authoritative sources are used and highly ranked in training data, and conducting research on user interaction with their product before release.

The principle of beneficence, which balances the principle of non-beneficence, encourages innovation, proactivity and preventive decision-making. The development of AI for products that will convey health information should embody this principle and put the user's interests at the center of every step of the development process. A preventive approach to AI product development can use responsive design to detect health-related queries and prioritize the use of search-expanded generation in such cases to reduce the likelihood of inaccuracies and hallucinations. Search-extended generation (or RAG) accesses a knowledge base outside the training data of a large language model before generating a response to optimize the accuracy of the output.

Autonomy in medical ethics means that patients have the right to make their own decisions about their health care. Artificial intelligence products that enable the dissemination of accurate medical information have great potential to increase human autonomy in health and medical decision-making. However, technology developers must realize that increased autonomy means they must train their AI products to provide balanced, accurate, and unbiased medical information.

Fairness in medical ethics means treating everyone equally and fairly. When it comes to the ethical development of medical AI, there is perhaps no more important area than ensuring fairness for all users. Historically marginalized populations are disproportionately affected by false medical information and are more likely to be prejudiced. At every stage of the development process, there is the potential to introduce biases that can exacerbate these inequalities and reinforce systemic inequities. AI developers can prevent and reduce bias through technological solutions such as collecting unbiased training data, operational design strategies such as train-of-thought processing, and post-training strategies such as ranking changes and loss function modifications. But there is also a need to incorporate the voices of diverse communities early in technology development through participatory research to better understand what fair representation looks like for these communities.

The early adoption of user-centered ethical principles in medicine, healthcare, and research adapted for AI product development has the potential to positively impact health outcomes. Society has already seen the consequences of previous technologies that did not always prioritize accuracy of information over engagement, and there is now an opportunity to prevent these mistakes from being repeated across the industry.

Research on user interactions with genAI tools and their impact on health-related attitudes, beliefs, and behaviors is essential to the development of ethical frameworks. At NORC at the University of Chicago, our team is embarking on research programs to study these interactions and seeks to provide valuable insights that can inform the advancement of principles of care, beneficence, autonomy, and justice in AI-generated health communications for all people.

Let's get in touch!

Please feel free to send us a message through the contact form.

Drop us a line at mailrequest@nosota.com / Give us a call over skypenosota.skype