STAT+: OpenAI leaps into health care with AI benchmark to evaluate models

OpenAI on Monday released a large dataset for evaluating how well large language models answer questions related to health care. Experts lauded the open-source data and detailed evaluation rubrics, calling them “unprecedented” in scale and breadth.

The project, HealthBench, marks OpenAI’s first foray into health care applications of AI, outside of external partnerships.

“Our mission as OpenAI is to ensure AGI is beneficial to humanity,” said Karan Singhal, who leads OpenAI’s health AI team, referring to OpenAI’s goal of developing artificial general intelligence. “One part of that is building and deploying technology. Another part of it is ensuring that positive applications like health care have a place to flourish and that we do the right work to ensure that the models are safe and reliable in these settings,” he said.

Brittany Trang
