Issue 11

AI Hasn’t Written This Column…Yet

By Scott Ramsey, MD, PhD

Senior Partner and Chief Medical Officer, Curta

Adjunct Professor at the University of Washington, School of Pharmacy, CHOICE Institute

Professor at the University of Washington, School of Medicine

As I strolled through the vendor booths at the 2025 ISPOR International Meeting in Montreal, I counted how many showcased their credentials in artificial intelligence (AI). After somewhere around 40 (practically every vendor booth I passed), I decided to stop. Clearly, the AI revolution is now firmly lodged in the world of health economics and outcomes research (HEOR). With this in mind, I thought it might be a good time to think through a question about AI in HEOR that I’ve been turning over:

“Is this tool changing the game, or is the game changing to fit the tool?”

There’s no shortage of headlines touting the almost magical qualities of AI. But in all the excitement, it’s worth stepping back and grounding ourselves in how this technology actually works.

At its core, AI is not magic, and it’s certainly not sentient. I often cringe at the personification of AI. Treating it like a conscious being, even playfully, accelerates a narrative that it is somehow an “artificial” human. This gives AI much more credit than it deserves. I don’t say “please” or “thank you” to AI for the same reason I don’t politely converse with my toaster. Along the same lines, I take issue with the term “hallucination” in AI-speak. It suggests that the system is experiencing some sort of cognitive episode. In reality, an AI hallucination is a mathematical algorithm producing a nonsensical result. Think of AI like a highly complex regression model. If you’ve ever fit a line to a scatterplot to predict a value, you know that the output may or may not correspond to an actual known data point. That’s not a hallucination; it’s just prediction error.
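To make the regression analogy concrete, here is a minimal sketch in Python, with made-up data purely for illustration. The fitted line will happily "answer" for any input, whether or not the output corresponds to anything ever observed:

    import numpy as np

    # Made-up scatterplot data: a roughly linear trend plus noise.
    rng = np.random.default_rng(0)
    x = np.arange(10, dtype=float)
    y = 2.0 * x + 1.0 + rng.normal(scale=2.0, size=x.size)

    # Fit a line (ordinary least squares).
    slope, intercept = np.polyfit(x, y, deg=1)

    # The model produces an output for any input, including one far
    # outside the range it was fit on. That output is a prediction,
    # not a known fact, and it may not match any real observation.
    for query in (4.0, 25.0):
        print(f"prediction at x={query}: {slope * query + intercept:.2f}")

An LLM "hallucination" is the same phenomenon at vastly greater scale: the model returns its best extrapolation whether or not a true answer exists in the data it was trained on.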

Large language models (LLMs) operate on the same basic principles. When asked a question or given a task, they generate words or phrases instead of a number based on patterns in the information they were trained on. These systems don’t “know” the answer. Instead, they extrapolate based on the data they’re trained on and databases they search. And that extrapolation has limits. In the end, AI is constrained by the data it has access to. While it can generalize, it can’t invent truly new knowledge in the way human ingenuity can. Whether humans are just extrapolative machines is a deeper philosophical question, but it’s one best left for another discussion.

AI as a Tool, not a Person

So, where does a less mystical conceptualization of AI leave us?

AI is a tool. A powerful one, yes, but still a tool. Like all tools, its value is in how we use it. There’s understandable concern that AI may displace jobs. And in some cases, especially task-based roles, that concern has merit. But most of us in HEOR aren’t hired just to complete tasks. We’re hired for the responsibility that comes with those tasks. That’s the frame through which we should evaluate AI’s role.

“We’re hired for the responsibility that comes with those tasks. That’s the frame through which we should evaluate AI’s role.”

The real question isn’t “Can AI do the task?” It’s “Can I trust the result?” If I’m told an AI tool is 80% accurate, that puts a new burden on me: not only to complete the work, but to find the 20% that’s wrong. In many ways, this has shifted the burden of quality control back onto the user, making responsibility, not efficiency, the central challenge.
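A back-of-the-envelope calculation makes that burden concrete (every number below is invented for illustration): an imperfect tool pays off only when checking its output is much cheaper than producing the work yourself.

    # Back-of-the-envelope: when does an imperfect AI tool save time?
    # All numbers are illustrative assumptions, not measurements.
    n_items = 100      # items of work
    t_manual = 30.0    # minutes to produce one item by hand
    t_review = 10.0    # minutes to check one AI-produced item
    t_fix = 30.0       # minutes to redo one item the AI got wrong
    accuracy = 0.80    # fraction of AI output that is correct

    manual_total = n_items * t_manual
    ai_total = n_items * t_review + (1 - accuracy) * n_items * t_fix

    print(f"all-manual: {manual_total:.0f} minutes")       # 3000
    print(f"AI plus full review: {ai_total:.0f} minutes")  # 1600

    # Shrink the review discount, or the accuracy, and the advantage
    # evaporates; and this assumes review actually catches the 20%.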

Speed is easy. Accuracy is hard. If we focus on building systems, processes, and best practices that are not just fast, but truly reliable, then we’re getting somewhere.

Taking AI to the masses…of HEOR professionals

Writing an article on AI without using it seems somehow wrong. In that spirit, I asked ChatGPT the following question: “Summarize the Promise and Limitations of Using Artificial Intelligence for Applications in Health Economics and Outcomes Research.” Here’s a synopsis of what I got back[a]:


[a] The full text of what ChatGPT provided, including its summary of the limitations of AI, is in the appendix.

  • Traditional statistical methods struggle to handle the complexity of extracting and analyzing information from electronic health records, whereas AI, particularly machine learning (ML) and natural language processing (NLP), excels.
  • AI enables the generation of real-world evidence (RWE) in near real-time, offering dynamic insights into the effectiveness and economic value of interventions.
  • By integrating patient-specific data, AI can help estimate the economic and clinical value of treatments for individual patients rather than generalized groups.
  • AI can streamline patient-reported outcome (PRO) survey generation, support smarter survey designs, automate text analysis, and power adaptive learning systems that refine questionnaires based on patient responses.
  • AI can automate many of the repetitive and time-consuming tasks involved in HEOR, such as data cleaning, coding, and preliminary analyses.

When you have an AI hammer, is all the world a nail?

How would I grade AI’s response to my question? I’d give it an “A”…for Aspirational. In my view, the algorithm produced a wonderful synopsis of HEOR vendors’ marketing materials.

Mapping these applications to studies I’ve seen in the literature that test or evaluate them gives another, much more nuanced picture. If I were a pharma professional, I would be quite nervous about taking any of these AI “solutions” from vendors to my firm’s leadership, let alone to external stakeholders I hope to convince of my product’s effectiveness and value.

Let’s return to the question I posed at the start of this article: how do we determine when AI can actually improve our work, and how do we avoid situations where using it could degrade that work?

Here, the Curta team and I share our approach to deciding when we can use AI in the work we do, and when we need to rely on the old-fashioned, yet excellent, real intelligence of our scientists. The table below lists a few common activities in HEOR where AI applications are widely available. Each of these, in our opinion, is a legitimate use where AI can make tasks easier for users.

The problems human users face with each are the same:

  1. Are the results representative of what humans would have done in a best-case scenario?
  2. Are they reproducible, i.e., if we put the same information into the AI model, would we get the same result again and again?

I don’t believe we can confidently say “yes” for any of them. Of course, doing the work both the old-fashioned way and through AI defeats the purpose. I admit that this dilemma is difficult to resolve today. Over time, I expect AI to get better, and at some point we will be able to use these tools with confidence.
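The reproducibility question, at least, can be probed empirically. Here is a minimal sketch using the OpenAI Python client as one example (the model name and prompt are placeholders; any LLM API would serve): send the same input several times and compare the outputs.

    # Minimal reproducibility probe: repeat the same prompt and see how
    # much the answers vary. Model name and prompt are placeholders.
    from openai import OpenAI

    client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment
    prompt = "List the ICD-10 codes most commonly used to identify heart failure."

    answers = []
    for _ in range(3):
        response = client.chat.completions.create(
            model="gpt-4o-mini",
            messages=[{"role": "user", "content": prompt}],
            temperature=0,  # request the most deterministic behavior available
        )
        answers.append(response.choices[0].message.content)

    # Even at temperature 0, identical runs are not guaranteed to match.
    print("all runs identical:", len(set(answers)) == 1)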

Curta’s Approach to AI

For now, our approach is to selectively audit AI output; that is, selectively interrogate key results. If we find too many problems, we either conduct a comprehensive audit and correct the model or abandon it completely and go back to human-powered approaches. Our rationale for this conservative stance is that both our reputation and the need to serve our clients to the best of our ability supersedes the convenience and possible efficiency gains we might get with AI.

Application Area | GenAI Use Cases
Systematic Literature Reviews
  • Automating search term generation
  • Abstract screening
  • Data extraction
  • Code generation for meta-analyses
Health Economic Modeling
  • Model conceptualization and structuring
  • Code generation and debugging
  • Assisting in validation and parameter sourcing
  • Adapting models for different countries
Real World Evidence (RWE)
  • Analyzing large-scale unstructured data
  • Extracting insights from clinical notes and imaging
Dossier Development
  • Structuring outlines
  • Draft writing support

Boldly going with AI

Okay, we are careful, but we are not Luddites. Below are some specific areas where we feel comfortable using AI right now.

Executive AI (In-document Generative AI Assistant)

HEOR deliverables, such as research protocols, statistical analysis plans, and systematic literature review (SLR) reports, often include extensive details that users find difficult to absorb and utilize quickly. To simplify this, we developed Executive AI, an AI-driven assistant designed to transform lengthy and complicated reports into user-friendly documents. Executive AI uses a retrieval-augmented generation (RAG) framework that enables it to provide context-specific answers directly sourced from the existing document.2 This approach reduces common AI pitfalls (e.g., providing inaccurate or unsupported responses) by strictly referencing verified content. Admittedly, this is our own product, but building it ourselves, and testing it extensively, is what earns our trust.
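For readers unfamiliar with the pattern, here is a minimal sketch of the RAG idea. It is not our implementation: it uses simple TF-IDF retrieval from scikit-learn, and the document chunks and prompt wording are invented for illustration.

    # Minimal retrieval-augmented generation (RAG) sketch: retrieve the
    # most relevant chunks of a report, then instruct the model to answer
    # only from that text. Illustrative only; the chunks are invented.
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.metrics.pairwise import cosine_similarity

    chunks = [
        "Section 2.1: The SLR searched MEDLINE and Embase through June 2024.",
        "Section 3.4: Twelve RCTs met the inclusion criteria.",
        "Section 5.2: The base-case ICER was $48,000 per QALY gained.",
    ]

    vectorizer = TfidfVectorizer()
    chunk_matrix = vectorizer.fit_transform(chunks)

    def retrieve(question, k=2):
        """Return the k chunks most similar to the question."""
        scores = cosine_similarity(vectorizer.transform([question]), chunk_matrix)[0]
        return [chunks[i] for i in scores.argsort()[::-1][:k]]

    question = "What was the base-case ICER?"
    context = "\n".join(retrieve(question))

    # The retrieved context is passed to the language model with an
    # instruction to stay inside it, e.g.:
    prompt = (
        "Answer using ONLY the context below; otherwise say 'not found'.\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )
    print(prompt)

Because the model is told to answer only from retrieved, verified text, an unsupported answer becomes much easier to detect and reject.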

AI-Assisted Medical Code Curation

Medical codes are crucial for RWE studies, but compiling and formatting them is tedious. Researchers frequently encounter inconsistencies in terminology across different sources, and they may also lack deep clinical knowledge, potentially leading to mistakes. To address these challenges, our approach again leverages a RAG-based AI solution to expedite the initial code compilation and formatting.

Securely feeding the AI authoritative datasets from sources like the CDC or AMA allows us to gather codes accurately, consistently, and according to researchers’ specific requirements. In addition, researchers can use the AI to extract and format codes validated in other published literature. Clinical experts should still validate the output to minimize misclassifications; however, this AI-driven solution streamlines the medical code curation process and reduces upfront workload.
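One guardrail is easy to illustrate (a toy sketch with made-up codes, not our production pipeline): check every AI-proposed code against the authoritative list before it reaches a study.

    # Guardrail sketch: verify AI-proposed medical codes against an
    # authoritative reference. Codes here are made up for illustration;
    # a real list would be loaded from CDC or AMA source files.
    authoritative_codes = {
        "I50.9": "Heart failure, unspecified",
        "I50.22": "Chronic systolic (congestive) heart failure",
        "E11.9": "Type 2 diabetes mellitus without complications",
    }

    ai_proposed = ["I50.9", "I50.22", "I50.99"]  # the last is not valid

    valid = [c for c in ai_proposed if c in authoritative_codes]
    flagged = [c for c in ai_proposed if c not in authoritative_codes]

    print("accepted:", valid)
    print("needs clinician review:", flagged)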

AI-Powered Code Support

Performing real-world data analysis involves complex workflows, including querying large healthcare datasets, running statistical analyses, and building regression models to understand outcomes. Data analysts are expected to have strong programming skills and a sharp attention to detail because even small syntax errors or incorrect logic can lead to misleading results.

AI-powered code completion tools like Cursor AI and GitHub Copilot[b] can help ease these challenges by assisting in writing, debugging, and optimizing code based on context and comments. The tools are particularly helpful in the development and refinement of complex and bespoke code for health economic models in Excel and R. These tools perform well because they were trained on a large amount of high-quality code.


[b] We are not endorsing these tools, but we have had good results when using them in our work.
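For a sense of what these assistants draft well, here is a generic sketch of a routine RWE analysis; the cohort is simulated and every column name is hypothetical.

    # The kind of routine analysis code an AI assistant drafts quickly:
    # a logistic regression of a binary outcome on treatment and
    # covariates. The cohort is simulated; column names are hypothetical.
    import numpy as np
    import pandas as pd
    import statsmodels.formula.api as smf

    rng = np.random.default_rng(42)
    n = 500
    df = pd.DataFrame({
        "treatment": rng.integers(0, 2, n),
        "age": rng.normal(65, 10, n),
        "comorbidity_index": rng.poisson(2, n),
    })
    logit_p = -2.0 - 0.5 * df["treatment"] + 0.03 * (df["age"] - 65)
    df["readmitted"] = (rng.random(n) < 1 / (1 + np.exp(-logit_p))).astype(int)

    result = smf.logit("readmitted ~ treatment + age + comorbidity_index",
                       data=df).fit()
    print(result.summary())
    print(np.exp(result.params))  # odds ratios

The assistant saves keystrokes; the analyst still owns the model specification and the interpretation.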

AI-Driven Data Extraction

SLRs are labor-intensive because researchers must read through hundreds of papers, tag each study, and extract precise data points. Manual digitization of survival curves and other graphs adds even more clicks — and more room for error.

Today, a growing set of AI-driven tools helps automate evidence synthesis by identifying relevant information, improving study categorization, and converting scattered data into structured tables, even from multi-language sources. Meanwhile, AI vision models can transform published figures into clean, numeric datasets, allowing researchers to spend more time on interpretation than on manual data capture.

To ensure results are trustworthy, AI-powered extraction can be paired with strong validation processes. Our practices include comparing outputs across multiple models to check for agreement, using dedicated QC agents to reason over results and flag inconsistencies, and conducting targeted human review on a sample of outputs.
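Here is a toy sketch of the cross-model agreement check (field names, values, and the 10% sampling rate are all illustrative):

    # Validation sketch: compare extractions from two models field by
    # field, send disagreements to a human, and sample agreements for QC.
    # Field names, values, and the 10% sample rate are illustrative.
    import random

    extraction_a = {"n_patients": "412", "median_os_months": "18.2", "hr": "0.71"}
    extraction_b = {"n_patients": "412", "median_os_months": "18.2", "hr": "0.77"}

    disagreements = [f for f in extraction_a
                     if extraction_a[f] != extraction_b.get(f)]

    # Disagreements always go to a human; a random sample of the
    # agreements is double-checked as well.
    random.seed(0)
    sampled = [f for f in extraction_a
               if f not in disagreements and random.random() < 0.10]

    print("human review required:", disagreements + sampled)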

That Ain’t Working, That’s the Way You Do It

In summary, we are bullish on AI as a tool that can take a lot of drudgery out of many tasks in HEOR. We are comfortable with several specific AI-driven applications and have found that they can even enhance transparency and reproducibility. We would be comfortable if our clients wanted to “check” our work in these areas with AI. But as with all new technologies, there is no such thing as money for nothing. We believe that incremental adoption, with lots of internal trial-and-error work before bringing anything to our clients, is the best approach in this rapidly evolving field.

And we’ll let you know when AI is writing this column.

– Scott Ramsey, MD, PhD

Senior Partner and Chief Medical Officer, Curta

  1. Fleurence RL, Bian J, Wang X, Xu H, Dawoud D, Higashi M, Chhatwal J; ISPOR Working Group on Generative AI. Generative Artificial Intelligence for Health Technology Assessment: Opportunities, Challenges, and Policy Considerations: An ISPOR Working Group Report. Value Health. 2025;28(2):175-183. doi: 10.1016/j.jval.2024.10.3846.
  2. Executive AI: Your Documents, Now Intelligent and Interactive.