Issue 2

What is the Shelf Life of a Model?


By Scott Ramsey, MD, PhD

Senior Partner and Chief Medical Officer, Curta

Adjunct Professor at the University of Washington, School of Pharmacy, CHOICE Institute

Professor at the University of Washington, School of Medicine

“I will remember that I didn’t make the world, and it doesn’t satisfy my equations”¹

I came across this quote from Emanuel Derman and Paul Wilmott while reading Erica Thompson’s excellent book “Escape from Model Land.”² It is part of their “Modelers’ Hippocratic Oath,” a self-described manifesto for those who create simulation models (in their case, for the finance industry). As one who remembers reciting the Hippocratic Oath at my medical school graduation ceremony and who has spent much of my academic career in both medicine and model land, I found this quote a revelation: “first, do no harm” surely applies as much to modeling as it does to patient care.

Most of us are highly motivated to “get it right”: create models that accurately reflect disease or care processes, inform them with valid, representative data, and report the results clearly, taking care to characterize uncertainty and articulate limitations. That said, sloppy and inaccurate modeling, and false or misleading claims of precision, are problems. Review papers have shown that an alarming number of poorly constructed, nontransparent, and biased models find their way into the literature. To our field’s credit, much effort has been made to mitigate this problem through the publication of modeling standards and guidance for journals that publish models. As one who has participated in some of those efforts, I believe they are critical for our discipline.

The question I pose here is somewhat different: Is it possible that the models we create in medicine, epidemiology, and health technology assessment could “expire” over time like drugs, such that using them could harm the health care system, or worse, patients? It’s not a question that people in our field contemplate very often, if at all. I had never thought about it until an experience made it feel very real, and rather personal. One of my earliest models estimated the cost-effectiveness of lung transplantation.³ It was well received (well, perhaps not by the surgical community), won an award, and was published in a prestigious pulmonary journal. Nearly 20 years later, I received a call from a reporter asking what I thought about the Arizona state legislature citing my paper in debate over a bill that aimed to restrict Medicaid enrollees’ access to lung transplantation. I was horrified. Almost every aspect of lung transplantation had changed since the time that paper was published. Using a 20-year-old model (and inputs) to inform policy could cause real harm if it directed funds away from patients who might benefit from this procedure.

“Almost every aspect of lung transplantation had changed since the time that paper was published.”

Curta is currently working with a client that wants to create a model to simulate a very complex cancer: one where many lines of therapy are common and the treatment landscape is changing rapidly. Wisely, they are supporting a landscape analysis of past modeling efforts in this space. We will soon know the general quality of those efforts and which approaches might be useful for our new endeavor. All of this has me thinking, though, about our yet-to-be-built model: can we build in diagnostics that help our clients know when the state of the world is such that a model or modeling approach has lost relevance? More generally, are there ways to determine when any model has exceeded its shelf life?

I will describe three approaches worth considering when evaluating a model for obsolescence. The first and easiest is to apply the PICOTS (Populations, Interventions, Comparators, Outcomes, Timing, Settings) framework to the model in the context of the new treatment.⁵ One would want either the intervention or comparator arm of an existing model to be a valid comparator for the new treatment. As a rule of thumb, if the model does not fit well on at least four of the criteria, it may not be suitable for the task at hand. Second, consider the newer treatment’s mechanism of action, known adverse effects, and outcomes, and decide whether these are reasonably captured in the model. When colleagues and I modeled the effects of a new (at the time) immunotherapy for melanoma, we observed that the survival curves were quite unlike those of previous treatments: hockey-stick shaped, reflecting a subset of patients who experienced prolonged survival. We eventually used mixture cure models to characterize this effect, which yielded superior predictions.⁴ Third, identify the model’s major assumptions and assess their face validity in light of newer evidence. Often, relationships that were unknown at the time, such as those between surrogate endpoints and final outcomes, become clearer as follow-up accumulates.
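For readers less familiar with the mixture cure approach, here is a minimal sketch of the idea, with made-up parameter values rather than anything from our melanoma analysis. Overall survival is modeled as a mix of a cured fraction, who never experience the event, and an uncured fraction whose survival follows a conventional parametric distribution (a Weibull is assumed here). The plateau in the second curve is the “hockey stick” described above.

```python
# Hypothetical illustration of a mixture cure survival function.
# Parameter values are invented for plotting purposes only.
import numpy as np
import matplotlib.pyplot as plt

def mixture_cure_survival(t, cure_fraction, shape, scale):
    """S(t) = pi + (1 - pi) * S_u(t): the cured fraction (pi) never has
    the event; the uncured fraction follows Weibull survival S_u(t)."""
    uncured_survival = np.exp(-(t / scale) ** shape)
    return cure_fraction + (1 - cure_fraction) * uncured_survival

t = np.linspace(0, 10, 200)               # years of follow-up
standard_fit = np.exp(-(t / 1.5) ** 1.2)  # conventional Weibull: survival drifts toward zero
cure_fit = mixture_cure_survival(t, cure_fraction=0.25, shape=1.2, scale=1.5)

plt.plot(t, standard_fit, label="Standard Weibull fit")
plt.plot(t, cure_fit, label="Mixture cure fit (25% cured)")
plt.xlabel("Years from treatment")
plt.ylabel("Overall survival")
plt.legend()
plt.show()
```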

Modelers might argue that newer, more sophisticated forms of modeling, particularly dynamic microsimulation models, extend the shelf life of models, perhaps indefinitely, because they take the problem down to the level of the patient, or even the disease process. I am not so sure. It’s important to acknowledge that what we know about disease processes and patient behavior changes over time. For example, our understanding of the role the immune system plays in cancer has changed fundamentally in the past 10–15 years. Immunotherapy and CAR T-cell therapies are an outgrowth of that knowledge, and their mechanisms of action and toxicities are completely different from those of cytotoxic chemotherapy. Modeling them as if they were chemotherapy could be highly problematic (as noted previously).⁴ Similarly, the COVID-19 pandemic has changed people’s attitudes and behaviors toward vaccines and prevention: not always in good ways, and certainly in ways that we must admit we don’t fully understand. This has tremendous implications for models of future epidemics and, as another example, cancer screening.

I’m not aware of any estimates of harms caused by models in healthcare. In contrast, several studies have evaluated the role that financial models played in the 2007 collapse of the subprime mortgage market and the global recession that followed. As Thompson notes in her book,² those models performed well until they failed spectacularly, because they didn’t account for new stressors to the financial system. Critically, the models themselves were influencing financial decisions, yet when the world changed, nobody questioned whether they were still estimating the system’s risks accurately. The majority of models built in healthcare today are budget impact and cost-utility analyses used to support insurer reimbursement policies. Here, the harms are second-order effects, transmitted to patients in the form of reduced or enhanced access to technologies that may or may not be beneficial. Surely, as we use and reuse those models, new discoveries and external factors that fundamentally influence care (hello, COVID!) should prompt the modeling community to revisit the assumptions and data built into them.

While most of the modeling world has focused on improving the quality of new models, there are real but as-yet unexplored risks when older models are reapplied in an ever-changing world of science and care delivery. As Thompson eloquently notes, “Despite the allure of quantification, unquantified and unquantifiable uncertainties abound, and the objects of study themselves are directly influenced by the models we put to work on them.”² Models can have tremendous value to decision makers when the clinical situation is complex and there are multiple layers of uncertainty.

“We put ‘use by’ dates on drugs because, over time, they will expire. Should we do the same for models?”

Scott Ramsey, MD, PhD

Senior Partner and Chief Medical Officer, Curta

  1. Derman E, Wilmott P. The Financial Modelers’ Manifesto. 2009. https://wilmott.com/financial-modelers-manifesto/
  2. Thompson E. Escape from Model Land. Basic Books; 2022. ISBN 978-1541600980.
  3. Ramsey SD, Patrick DL, Albert RK, Larson EB, Wood DE, Raghu G. The Cost-Effectiveness of Lung Transplantation. A Pilot Study. University of Washington Medical Center Lung Transplant Study Group. Chest. 1995;108(6):1594-601.
  4. Othus M, Bansal A, Koepl L, Wagner S, Ramsey S. Accounting for Cured Patients in Cost-Effectiveness Analysis. Value Health. 2017;20(4):705-709.
  5. Patient-Centered Outcomes Research Institute (PCORI) Methodology Committee. The PCORI Methodology Report. PCORI Methodology Committee; 2019. https://www.pcori.org/research/about-our-research/research-methodology/pcori-methodology-standards.