Reinforcement Learning in Healthcare Systems


  • Sanjay Basu, MD, PhD

    Chief Medical Officer | Internal Medicine, Data Science


    Our new peer-reviewed research on preventing ED visits and hospitalizations among patients receiving Medicaid: https://lnkd.in/g7wrhEhr

    When patients have multiple conditions, the optimal "next best step" is rarely clear. With 15 minutes left on a Friday night to spend with a patient with uncontrolled schizophrenia and hypertension, do I first spend my time on a prior authorization for a long-acting antipsychotic or on medication adherence counseling for a new blood pressure regimen? Standard guidelines leave these crucial, sequential decisions to varied individual judgment and experience.

    Our new peer-reviewed research, published in JMIR AI, shows that Reinforcement Learning (RL) offers powerful guidance for such decisions. Instead of relying on LLMs (which can hallucinate, risking safety), we carefully studied years of intervention data from multidisciplinary population health teams, comparing the outcomes of similar patients who received different interventions or intervention sequences. We used these historical intervention sequences and their outcomes to build a State-Action-Reward-State-Action (SARSA) RL model to recommend the optimal interventions to population health teams.

    The results:
    - In this counterfactual causal inference study, the RL-guided approach reduced acute care events (ED visits and hospitalizations) by 12 percentage points compared to the status quo, a 20.7% relative reduction (P=0.02).
    - It yielded a Number Needed to Treat (NNT) of 8.3: about 8 patients need to receive the service to prevent one acute care event (NNT = 1/ARR = 1/0.12 ≈ 8.3).
    - Crucially, there was no evidence of harm (number needed to harm = infinity).
    - The RL-guided approach also improved fairness across demographic groups, with a 28.3% reduction in gender-based disparity and a 37.1% reduction in race/ethnicity-based disparity.

    This work demonstrates that population health teams enabled by RL technologies can outperform those relying on experience- or playbook-based practices alone, particularly when navigating the complex intersections of medical and social needs.

    This builds on ongoing work at Waymark in RL for Medicaid:
    Test-Time Learning and Inference-Time Deliberation for Efficiency-First Offline Reinforcement Learning in Care Coordination and Population Health Management: https://lnkd.in/gm8hU7Pd
    Hybrid Adaptive Conformal Offline Reinforcement Learning for Fair Population Health Management: https://lnkd.in/gVuWmj5p
    Accepted to Stanford's #Agents4Science2025: Feasibility-Guided Fair Adaptive Offline Reinforcement Learning for Medicaid Care Management: https://lnkd.in/dC48TCdp
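    As a concrete illustration of the SARSA approach named in the post above, here is a minimal tabular sketch in Python fitted to logged episodes. The states, actions, rewards, and hyperparameters are hypothetical placeholders chosen for illustration; the published model's actual state/action spaces and reward design are richer and are not reproduced here.

    ```python
    # Minimal tabular SARSA over logged care-management episodes.
    # All names (states, actions, rewards, ALPHA, GAMMA) are assumed
    # for illustration, not taken from the published study.
    from collections import defaultdict

    ALPHA, GAMMA = 0.1, 0.95  # learning rate and discount (assumed)

    # Each logged transition: (state, action, reward, next_state, next_action).
    episodes = [
        [("uncontrolled_htn", "adherence_counseling", -1.0,
          "controlled_htn", "monitoring"),
         ("controlled_htn", "monitoring", 0.0, "terminal", None)],
    ]

    Q = defaultdict(float)  # Q[(state, action)] -> estimated return

    def sarsa_update(s, a, r, s_next, a_next):
        """One on-policy backup: Q(s,a) += ALPHA * (TD target - Q(s,a))."""
        bootstrap = GAMMA * Q[(s_next, a_next)] if a_next is not None else 0.0
        Q[(s, a)] += ALPHA * (r + bootstrap - Q[(s, a)])

    for _ in range(100):  # sweep the fixed logged data (offline setting)
        for episode in episodes:
            for transition in episode:
                sarsa_update(*transition)

    def recommend(state, candidate_actions):
        """Surface the highest-value intervention for a patient state."""
        return max(candidate_actions, key=lambda a: Q[(state, a)])

    print(recommend("uncontrolled_htn", ["adherence_counseling", "monitoring"]))
    ```

    The on-policy update is what distinguishes SARSA from Q-learning: the bootstrap term uses the action actually taken next in the logged sequence rather than a max over all actions, which suits learning from historical intervention sequences.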

  • Amy Bucher, Ph.D.

    Chief Behavioral Officer at Lirio & Author of Engaged: Designing for Behavior Change


    When I joined Lirio almost three years ago, I faced a steep learning curve to understand artificial intelligence and the specific tools we use in Precision Nudging. Fortunately, Christopher Symons and his team are extremely patient and collaborative teachers; with their expert support, I've learned to use our reinforcement learning approach as the backbone of our behavioral design.

    In my layperson terms, reinforcement learning uses reward functions to train algorithms to maximize wanted outcomes. For Precision Nudging, that means identifying meaningful target behaviors (like having a cancer screening or a vaccination) that drive results, and making those the focus of our AI. The Behavior Science team works closely with our Behavioral Reinforcement Learning colleagues to select and monitor the right behaviors. Then, our algorithms figure out the combination of behavioral messages for each person that is most likely to lead to completion of the target behaviors. Over time and across behaviors and populations, this approach informs our Large Behavior Model (LBM).

    A few things I like about this approach as a behavioral scientist:
    - It allows us to leverage tools like segmentation but quickly go far beyond them, since we learn from individual behavior and can recognize when people need something different than others "like them" might (especially key because segments don't necessarily apply across behaviors or domains).
    - It allows us to optimize for meaningful behaviors. We could easily reward algorithms for engagement with Precision Nudging, but we know it's not the engagement per se that drives results, so we ensure rewards are tied to key behaviors.
    - I love the idea we explored in our Frontiers paper that if the reinforcement learning does its job, we can deliver equitable outcomes by having the right behavioral "ingredients" on the table.

    Finally, I know the bell hath tolled for the Dunning-Kruger effect, but working alongside AI experts has me really living that DK life. As I continue to learn from my colleagues and this hands-on work, I hope to elevate a better understanding of AI for my social science community. This is a really cool toolkit if we understand how to use it. I've put a few resources into the comments if you want to read more!
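    To make the "reward the behavior, not the engagement" point above concrete, here is a minimal sketch framed as an epsilon-greedy bandit over message variants, where the reward is completion of the target behavior (e.g., a screening) rather than clicks or opens. The message names, completion probabilities, and EPSILON value are illustrative assumptions, not Lirio's actual Precision Nudging models.

    ```python
    # Epsilon-greedy selection over message variants, rewarded only by
    # completed target behaviors. All names and numbers are assumed.
    import random

    MESSAGES = ["reminder", "social_proof", "loss_framed"]  # hypothetical
    counts = {m: 0 for m in MESSAGES}
    values = {m: 0.0 for m in MESSAGES}  # running mean reward per variant
    EPSILON = 0.1  # exploration rate (assumed)

    def choose_message():
        """Explore occasionally; otherwise send the best-performing variant."""
        if random.random() < EPSILON:
            return random.choice(MESSAGES)
        return max(MESSAGES, key=values.get)

    def record_outcome(message, behavior_completed):
        """Reward is the target behavior itself (screening completed),
        not engagement: message opens and clicks earn nothing here."""
        reward = 1.0 if behavior_completed else 0.0
        counts[message] += 1
        values[message] += (reward - values[message]) / counts[message]

    # Simulated outcomes: assumed per-message completion probabilities.
    TRUE_RATES = {"reminder": 0.05, "social_proof": 0.08, "loss_framed": 0.06}
    for _ in range(1000):
        m = choose_message()
        record_outcome(m, random.random() < TRUE_RATES[m])

    print(max(values, key=values.get), values)
    ```

    A production system would condition the choice on individual context (a contextual bandit or full RL), which is what lets it learn from individual behavior and move beyond static segments.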

  • Rajeev Ronanki

    CEO at Lyric | Amazon Best Selling Author | You and AI


    The time to design AI-native architectures isn't after operational gaps appear. It's now. Healthcare doesn't need more AI pilots. It needs systems that can reason, coordinate, and decide together, in real time.

    On that line of thought, sharing this recent peer-reviewed commentary by Dr. Andrew Borkowski that outlines how multiagent AI systems are reshaping the frontier of clinical intelligence. These systems go far beyond today's static tools and LLM wrappers: they orchestrate collaboration across agents, workflows, and decision points.

    The commentary shares an example of sepsis management, where seven AI agents work in parallel to:
    • Clean and integrate unstructured data
    • Interpret imaging and vitals via deep learning
    • Stratify risk with Sequential Organ Failure Assessment (SOFA) and qSOFA scores
    • Generate treatment plans using reinforcement learning
    • Optimize hospital logistics with queueing theory and genetic algorithms
    • Detect anomalies in real time via streaming forecasts
    • Auto-document every step into structured EHR records

    Every decision is governed by explainable AI, a quality-control agent, and confidence-calibrated outputs. Federated learning enables continuous evolution, while blockchain and OAuth 2.0 protect system integrity.

    This isn't a distant vision. It's a working blueprint for health systems under pressure to scale intelligence, not just automation.

    📌 Read the commentary here → https://lnkd.in/g5X5PADk #AIsystems
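    As an editorial illustration of the coordination pattern the commentary above describes, here is a minimal sketch in which specialized agents run in parallel over a shared case record and a coordinator merges their outputs. The agent names and logic are placeholder stubs, not the commentary's implementation; the risk agent uses the standard qSOFA criteria (respiratory rate >= 22/min, systolic BP <= 100 mmHg, altered mentation).

    ```python
    # Minimal parallel multiagent coordinator. Agents and case fields
    # are hypothetical stand-ins for the commentary's seven agents.
    from concurrent.futures import ThreadPoolExecutor

    def data_agent(case):
        """Stub for the data cleaning/integration agent."""
        return {"vitals_clean": True}

    def risk_agent(case):
        """Stratify risk with a qSOFA-style count of warning signs."""
        qsofa = sum([case["resp_rate"] >= 22,
                     case["systolic_bp"] <= 100,
                     case["altered_mentation"]])
        return {"qsofa": qsofa, "high_risk": qsofa >= 2}

    def treatment_agent(case):
        """Stub standing in for the RL treatment-planning agent."""
        return {"plan": "fluids_and_cultures"}

    AGENTS = [data_agent, risk_agent, treatment_agent]

    def coordinate(case):
        """Run agents concurrently and merge their outputs; a real
        system would add QC, explainability, and documentation agents."""
        merged = {}
        with ThreadPoolExecutor() as pool:
            for result in pool.map(lambda agent: agent(case), AGENTS):
                merged.update(result)
        return merged

    case = {"resp_rate": 24, "systolic_bp": 95, "altered_mentation": True}
    print(coordinate(case))  # e.g. {'vitals_clean': True, 'qsofa': 3, ...}
    ```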
