Week 241
EHR Foundation Model, EHR, Social Determinants of Health, Ethics
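In Week #241 of the Doctor Penguin newsletter, we cover a foundation model for zero-shot prediction from EHR data, an open-source framework for EHR analysis, LLM-generated synthetic data for detecting social determinants of health, and gaps in the ethical discussion of generative AI in healthcare.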
1. EHR Foundation Model. A foundation model trained on patients' medical history for zero-shot health trajectory prediction.
Renc et al. developed ETHOS, a decoder-only transformer model designed to predict a patient's possible future clinical events from their electronic health record data. The model demonstrates zero-shot capabilities across several tasks, including predicting inpatient and ICU mortality, estimating ICU length of stay, determining readmission probabilities, and estimating the first-day Sequential Organ Failure Assessment (SOFA) score. ETHOS operates much like a language model: it takes as input a chronologically ordered sequence of tokens representing clinical events (e.g., admissions, lab tests, diagnoses, procedures) and is trained to predict the next token in the sequence. Notably, time-interval tokens are inserted between events, and numerical values are binned into quantile tokens. Furthermore, ETHOS was trained on the MIMIC-IV dataset in its original, noisy form without data cleaning or imputation, even retaining inconsistencies such as discharge dates that precede admission dates. The assumption was that, given a sufficiently large dataset and appropriate tokenization and training methods, ETHOS would learn to handle noisy or anomalous input on its own. This resilience to inaccurate and missing data matters for the efficiency of downstream model development, because healthcare data inevitably contains errors and cleaning large datasets by hand is impractical.
Read Paper | npj Digital Medicine
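To make the tokenization idea concrete, here is a minimal sketch of how events might be turned into a token sequence with time-interval and quantile tokens. It is an illustration under our own assumptions, not the ETHOS code: the gap buckets, token names, and function names below are all hypothetical.

```python
from bisect import bisect_right
from statistics import quantiles

# Hypothetical time-gap buckets in hours (upper bound, token).
# These are illustrative and not the vocabulary used in the paper.
TIME_BUCKETS = [
    (1, "<gap:under_1h>"),
    (24, "<gap:1h_to_1d>"),
    (168, "<gap:1d_to_1w>"),
]
LONGEST_GAP = "<gap:over_1w>"


def fit_quantile_edges(values, n_bins=10):
    """Return the n_bins - 1 cut points that split values into quantile bins."""
    return quantiles(values, n=n_bins)


def quantile_token(name, value, edges):
    """Map a numeric value to a token such as 'lactate_q3' (bins are 1-indexed)."""
    return f"{name}_q{bisect_right(edges, value) + 1}"


def gap_token(hours):
    """Map an elapsed time in hours to a coarse time-interval token."""
    for upper, token in TIME_BUCKETS:
        if hours < upper:
            return token
    return LONGEST_GAP


def tokenize(events, edges_by_name):
    """Turn (time_hours, name, value_or_None) events into one token sequence."""
    tokens, previous_time = [], None
    for time_hours, name, value in sorted(events, key=lambda e: e[0]):
        # Insert a time-interval token whenever time has passed since the last event.
        if previous_time is not None and time_hours > previous_time:
            tokens.append(gap_token(time_hours - previous_time))
        if value is None:
            tokens.append(name)  # categorical event, e.g. an admission
        else:
            tokens.append(quantile_token(name, value, edges_by_name[name]))
        previous_time = time_hours
    return tokens
```

A sequence produced this way could be fed to any next-token model; the quantile edges would be fitted once on the training data and reused at inference time.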
2. EHR. An open-source Python framework for EHR data analysis.
Heumos et al. developed ehrapy, a Python framework for exploratory analysis of heterogeneous epidemiology and EHR data (including free-text notes). The tool is compatible with any EHR dataset that can be vectorized and covers the analytical steps from data extraction and quality control to the generation of low-dimensional representations. The pipeline begins with data quality inspection: analyzing feature distributions that may skew results and detecting visits and features with high missing rates. ehrapy's normalization and encoding functions are then applied to achieve a uniform numerical representation, which facilitates data integration and corrects for dataset shift effects. Next, it calculates lower-dimensional representations of the data, which can be visualized, clustered, and annotated to obtain a patient landscape. For deeper analysis, it provides statistical methods for group comparison, survival analysis, and causal inference, as well as trajectory inference methods for analyzing disease progression across patient visits. This structured approach helps users extract insights from their EHR data, potentially improving patient stratification, biomarker identification, and the understanding of disease trajectories.
Read Paper | Nature Medicine
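The first two pipeline stages described above (flagging features with high missing rates, then normalizing) can be sketched in plain Python. This is only an illustration of the idea and deliberately does not use ehrapy's API; the function names, the row format (one dict per visit), and the 50% threshold are our assumptions.

```python
from statistics import mean, pstdev


def missing_rate(rows, feature):
    """Fraction of visits in which `feature` is absent or None."""
    return sum(row.get(feature) is None for row in rows) / len(rows)


def drop_sparse_features(rows, features, max_missing=0.5):
    """Keep only features whose missing rate is at most max_missing (assumed threshold)."""
    return [f for f in features if missing_rate(rows, f) <= max_missing]


def zscore_columns(rows, features):
    """Z-score each feature over its observed values; missing entries stay None.

    Assumes every feature has at least one observed value, which holds
    after drop_sparse_features with max_missing below 1.
    """
    stats = {}
    for f in features:
        observed = [row[f] for row in rows if row.get(f) is not None]
        # Fall back to a scale of 1.0 when a feature is constant.
        stats[f] = (mean(observed), pstdev(observed) or 1.0)
    return [
        {
            f: None if row.get(f) is None else (row[f] - stats[f][0]) / stats[f][1]
            for f in features
        }
        for row in rows
    ]
```

The normalized rows would then be the input to a dimensionality-reduction or clustering step, which is where a dedicated framework earns its keep.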
3. Social Determinants of Health. Social determinants of health (SDoH) are nonclinical factors—such as housing, transportation, employment, violence, food insecurity, and environmental conditions—that significantly impact health outcomes. These factors are often embedded in unstructured clinical notes, making them time-consuming and cumbersome to extract from electronic health records.
Gabriel et al. explored the use of large language models (LLMs) to identify SDoH from clinical notes. They used GPT-3.5 Turbo to generate a synthetic dataset for training BERT- and RoBERTa-based classifiers to detect homelessness, food insecurity, and domestic violence. They created this dataset by prompting GPT to compose a variety of sentences that were either positive or negative descriptions of an SDoH concept of interest. These sentences were then injected into clinical notes. For example, a positive sentence for homelessness was: "The patient presented with multiple health issues, including respiratory infections and malnutrition, which are commonly observed among individuals lacking stable housing." An example negative sentence was: "The absence of homelessness issues allows the patient to access consistent healthcare services." They compared three training approaches: synthetic data generated by GPT, authentic notes from the MIMIC-III dataset, and a combination of both. Combining synthetic and authentic notes often yielded the best performance. When validated on their institutional dataset, a regularly maintained registry of SDoH collected from preoperative clinic interviews, the combined model achieved an AUROC of 0.78 for homelessness, 0.72 for food insecurity, and 0.83 for domestic violence. This study demonstrates the potential of LLMs to automate SDoH detection from unstructured medical data, which could help healthcare providers better understand patients' social barriers and tailor interventions.
Read Paper | Proceedings of the National Academy of Sciences
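The sentence-injection step can be sketched as follows. This is a rough illustration rather than the authors' pipeline: the function names, the representation of a note as a list of sentences, the random insertion position, and the labeling scheme (1 for a positive mention, 0 for a negated one) are all our assumptions.

```python
import random


def inject_sentence(note_sentences, sentence, rng):
    """Insert `sentence` at a random position among the note's sentences."""
    position = rng.randint(0, len(note_sentences))  # both ends inclusive
    return note_sentences[:position] + [sentence] + note_sentences[position:]


def build_examples(notes, positive, negative, seed=0):
    """Return (text, label) pairs, one positive and one negative per note.

    Label 1 marks a note with an injected positive SDoH sentence and
    label 0 a note with an injected negated sentence (assumed scheme).
    """
    rng = random.Random(seed)
    examples = []
    for note_sentences in notes:
        for pool, label in ((positive, 1), (negative, 0)):
            sentence = rng.choice(pool)
            text = " ".join(inject_sentence(note_sentences, sentence, rng))
            examples.append((text, label))
    return examples
```

The resulting pairs would serve as training data for a text classifier; in the study, the sentence pools came from GPT-3.5 Turbo rather than being written by hand.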
4. Ethics. Current gaps in the ethical considerations of generative AI (GenAI) in healthcare.
Ning et al. conducted a scoping review of ethical discussions on GenAI in healthcare and identified four gaps: (1) few solutions proposed for the ethical issues raised; (2) insufficient discussion of ethical concerns beyond large language models, with other GenAI methods such as generative adversarial networks (GANs) often overlooked despite their use in medical research; (3) a lack of standardized frameworks for addressing ethical considerations; and (4) limited discussion of multimodal GenAI. To address these gaps, they developed the Transparent Reporting of Ethics for Generative AI (TREGAI) checklist, based on nine established ethical principles. Journals, institutional review boards, funders, and regulators can use the checklist to ensure that researchers transparently document ethical issues related to GenAI, discuss potential solutions, and clearly indicate where these discussions appear in their manuscripts.
Read Paper | The Lancet Digital Health
-- Emma Chen, Pranav Rajpurkar & Eric Topol


