Mechanism: Longitudinal, encrypted disease-activity data from multiple sites feed an AI model for improved rheumatology flare prediction. Readout: The longitudinal model gains at least 0.05 AUROC over a single-visit model, and encrypted inference stays within a 0.02 margin of plaintext inference on discrimination and calibration.
Hypothesis
Across systemic lupus erythematosus and rheumatoid arthritis, models trained on longitudinal encrypted disease-activity trajectories will predict clinically meaningful flare within 30 to 90 days more accurately than models using only a single baseline visit, while maintaining calibration comparable to non-encrypted implementations.
Rationale
Most rheumatology flare models are built around sparse clinic snapshots, yet disease evolution is path-dependent. Repeated measures such as SLEDAI components, DAS28 inputs, steroid exposure, CRP/ESR trends, patient-reported pain/fatigue, and medication changes may contain more predictive structure than any isolated visit. In practice, these longitudinal signals are difficult to pool across sites because they are privacy-sensitive and often blocked by governance constraints. Fully homomorphic encryption or similarly strong privacy-preserving computation may permit multicenter model training and scoring on encrypted values, allowing broader datasets without exposing raw patient-level trajectories.
Testable predictions
- In multicenter cohorts, a longitudinal model using 3 to 6 prior visits will improve AUROC for flare prediction by at least 0.05 versus a single-visit model built from the same variables.
- The longitudinal model will show the largest gain in patients with serologically active but clinically ambiguous lupus and in RA patients near treatment-escalation thresholds.
- When the same model is implemented with privacy-preserving computation, discrimination and calibration loss versus plaintext inference will remain within a pre-specified non-inferiority margin of 0.02.
- Sites that previously could not share row-level data will contribute enough additional diversity to reduce between-site performance variance and improve external validation stability.
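A minimal sketch of how the first and third predictions could be checked on point estimates, assuming binary flare labels and model risk scores are already in hand. The function names (auroc, meets_gain, within_margin) and the toy values are illustrative assumptions, not part of the proposal. The proposal does not name a calibration metric, so the margin check takes any metric and a flag for its direction. A real analysis would pre-specify confidence intervals (for example by bootstrap) rather than compare point estimates.

```python
from typing import Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the probability that a randomly chosen flare case scores
    higher than a randomly chosen non-flare case (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one flare and one non-flare case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def meets_gain(auroc_longitudinal: float, auroc_single: float,
               min_gain: float = 0.05) -> bool:
    """Prediction 1: longitudinal AUROC exceeds single-visit AUROC by min_gain."""
    return auroc_longitudinal - auroc_single >= min_gain


def within_margin(plaintext: float, encrypted: float, margin: float = 0.02,
                  higher_is_better: bool = True) -> bool:
    """Prediction 3: the encrypted pipeline loses no more than `margin`
    relative to plaintext. Set higher_is_better=False for error-type
    metrics such as a calibration error."""
    loss = plaintext - encrypted if higher_is_better else encrypted - plaintext
    return loss <= margin


# Toy values, made up purely for illustration (1 = flare within the window).
labels = [0, 0, 1, 1, 0, 1]
single_visit_scores = [0.2, 0.6, 0.5, 0.7, 0.4, 0.3]
longitudinal_scores = [0.1, 0.4, 0.6, 0.8, 0.3, 0.5]

gain_ok = meets_gain(auroc(labels, longitudinal_scores),
                     auroc(labels, single_visit_scores))
```

The pairwise comparison is quadratic in cohort size, which is acceptable for a sketch; a rank-based implementation would be preferable on large multicenter cohorts.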
How to test it
A defensible first study would be a retrospective multicenter cohort with temporal holdout validation, site-level external validation, and a prospective silent-run phase. Endpoints should be protocolized flare definitions rather than ad hoc clinician impressions. Analysis should compare single-visit versus longitudinal models, plaintext versus encrypted inference, and transportability across health systems.
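The two validation splits named above can be sketched as follows. This assumes each visit is a record with hypothetical "site", "patient", and "visit_date" fields; the field names, the helper names, and the toy records are assumptions for illustration only. Splitting on visit date alone can place the same patient on both sides of the cutoff, so a real protocol would likely split on a per-patient index date.

```python
from datetime import date
from typing import Dict, Iterator, List, Tuple

Visit = Dict[str, object]  # hypothetical record layout, see lead-in


def temporal_holdout(visits: List[Visit],
                     cutoff: date) -> Tuple[List[Visit], List[Visit]]:
    """Train on visits before the cutoff, evaluate on visits at or after it."""
    train = [v for v in visits if v["visit_date"] < cutoff]
    test = [v for v in visits if v["visit_date"] >= cutoff]
    return train, test


def leave_one_site_out(
        visits: List[Visit]) -> Iterator[Tuple[str, List[Visit], List[Visit]]]:
    """Yield (held_out_site, train, test), holding out each site in turn
    as a site-level external validation set."""
    sites = sorted({str(v["site"]) for v in visits})
    for held_out in sites:
        train = [v for v in visits if v["site"] != held_out]
        test = [v for v in visits if v["site"] == held_out]
        yield held_out, train, test


# Toy records, made up purely for illustration.
visits = [
    {"site": "A", "patient": "p1", "visit_date": date(2021, 3, 1)},
    {"site": "A", "patient": "p1", "visit_date": date(2022, 6, 1)},
    {"site": "B", "patient": "p2", "visit_date": date(2021, 9, 1)},
    {"site": "C", "patient": "p3", "visit_date": date(2022, 2, 1)},
]

train, test = temporal_holdout(visits, cutoff=date(2022, 1, 1))
folds = list(leave_one_site_out(visits))
```

Reporting performance per held-out site, rather than only a pooled figure, is what makes the between-site variance claim in the testable predictions checkable.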
Clinical significance
If true, this would support a practical path toward earlier flare detection, safer treatment escalation, and privacy-preserving collaboration in autoimmune disease. It would also argue that encrypted clinical scoring is not only a compliance tool but also a route to more generalizable AI diagnostics in rheumatology.
Limitations
This hypothesis may fail if flare labels are too noisy, visit intervals are too irregular, or medication changes introduce strong time-dependent confounding. Computational cost may also restrict deployment in lower-resource settings. Better prediction would not by itself prove better outcomes unless earlier detection changes management in a measurable way.
LES AI • DeSci Rheumatology