A new hip fracture risk prediction model based on large language models

Dnr:

NAISS 2025/22-1549

Type:

NAISS Small Compute

Principal Investigator:

Zhengshan Wang

Affiliation:

Karolinska Institutet

Start Date:

2025-11-07

End Date:

2026-11-01

Primary Classification:

20603: Medical Imaging

Allocation

Alvis at C3SE: 500 GPU-h/month
Mimer at C3SE: 500 GiB

Abstract

Hip fracture is a high-burden geriatric orthopedic condition with high morbidity, disability, and mortality rates. Global annual deaths from hip fractures exceed 300,000, and incidence continues to rise as the population ages. Current prediction methods, such as the FRAX score and traditional machine learning models, have limitations: FRAX ignores latent risk factors (e.g., lifestyle, medication history), while traditional models struggle with unstructured clinical text, which leads to suboptimal accuracy. This project aims to develop a new hip fracture risk prediction model based on large language models (LLMs) to address these gaps. First, we will collect and preprocess high-quality multi-modal data (structured data: age, bone mineral density; unstructured text: clinical notes, imaging reports) from collaborating medical institutions, in strict compliance with privacy regulations (e.g., anonymization, informed consent). Second, we will adapt medical-domain LLMs (e.g., ClinicalBERT, BioBERT) by integrating orthopedic knowledge, namely medical vocabulary embeddings and a knowledge-aware attention mechanism, so that the models capture critical clinical information more reliably. Third, we will fuse the structured data with the processed unstructured data for joint training, and optimize model performance through cross-validation and real-world clinical validation. Finally, we will develop a user-friendly web/mobile application that deploys the model to support clinical decision-making. Expected outcomes include a high-performance model (AUC > 0.85 and accuracy > 0.80 on test sets), a curated dataset of ≥5,000 patient records, and a scalable application. The project innovates in multi-modal data fusion, knowledge-integrated LLM adaptation, and clinical translation, and it is intended to contribute to improved hip fracture prevention, reduced healthcare costs, and the advancement of medical AI for risk prediction.
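To make the stated performance targets concrete, the following is a minimal sketch of how AUC and accuracy could be checked against the thresholds named in the abstract (AUC > 0.85, accuracy > 0.80). It is not the project's evaluation code: the function names `roc_auc`, `accuracy`, and `meets_targets`, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive case is scored
    above a random negative case (ties count as half a win)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def accuracy(labels: Sequence[int], scores: Sequence[float],
             threshold: float = 0.5) -> float:
    """Fraction of cases where (score >= threshold) matches the label.
    The 0.5 default threshold is an assumption, not taken from the abstract."""
    correct = sum(1 for y, s in zip(labels, scores)
                  if (s >= threshold) == (y == 1))
    return correct / len(labels)


def meets_targets(labels: Sequence[int], scores: Sequence[float],
                  auc_target: float = 0.85, acc_target: float = 0.80,
                  threshold: float = 0.5) -> bool:
    """True only if both targets from the abstract are strictly exceeded."""
    return (roc_auc(labels, scores) > auc_target
            and accuracy(labels, scores, threshold) > acc_target)


if __name__ == "__main__":
    # Toy data for illustration only (1 = fracture, 0 = no fracture).
    y_true = [0, 0, 1, 1]
    y_score = [0.1, 0.4, 0.35, 0.8]
    print(roc_auc(y_true, y_score))        # 0.75
    print(accuracy(y_true, y_score))       # 0.75
    print(meets_targets(y_true, y_score))  # False
```

The pairwise AUC above is quadratic in the number of cases, which is fine for a sketch; for a dataset of several thousand records a rank-based implementation from an established library would be the more practical choice.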