Mechanism: An encrypted pooled-statistics pipeline aggregates sufficient statistics from decentralized registries without exporting raw patient data.
Readout: The pipeline is expected to preserve model discrimination (AUROC delta <=0.02) and calibration (Brier delta <=0.02) relative to plaintext pooling, while materially reducing data-sharing exposure.
Claim
Across decentralized autoimmune registries, a homomorphically encrypted pooled-statistics workflow for methotrexate lung-toxicity modeling will retain discrimination and calibration close to plaintext pooled analysis while avoiding raw patient-level export.
Why this is plausible
Methotrexate pneumonitis is rare enough that single-site models are likely to be unstable. Multicenter pooling improves precision, but cross-site sharing of raw autoimmune registry data is often blocked by privacy and governance constraints. Approximate homomorphic encryption schemes already support addition and multiplication on encrypted statistics. For generalized linear or survival-oriented model updates, which can be expressed as sums of per-site statistics, that may be sufficient to fit or periodically recalibrate pragmatic surveillance models without moving raw records.
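To illustrate why addition alone may be enough, here is a minimal sketch of a logistic model fitted from summed per-site statistics. It is not an encrypted implementation: the `aggregate` step runs in plaintext and only marks where ciphertext addition would take place under a CKKS-style scheme. The function names, the single-predictor model, and the toy rows are all assumptions made for illustration, and the sketch further assumes the summed statistics are decrypted before the Newton update is solved.

```python
import math

def site_stats(rows, beta):
    """Gradient and information-matrix entries for one site.

    rows: list of (x, y) pairs, scalar predictor x and binary outcome y.
    beta: (intercept, slope). Computed locally; raw rows never leave the site.
    """
    g0 = g1 = h00 = h01 = h11 = 0.0
    for x, y in rows:
        p = 1.0 / (1.0 + math.exp(-(beta[0] + beta[1] * x)))
        r = y - p            # residual
        w = p * (1.0 - p)    # logistic weight
        g0 += r
        g1 += r * x
        h00 += w
        h01 += w * x
        h11 += w * x * x
    return [g0, g1, h00, h01, h11]

def aggregate(stat_vectors):
    # Element-wise sum across sites. In an encrypted pipeline this would be
    # ciphertext addition; here it is plaintext (assumption for the sketch).
    return [sum(col) for col in zip(*stat_vectors)]

def newton_step(beta, s):
    # Solve the 2x2 system H d = g in closed form, then update beta.
    g0, g1, h00, h01, h11 = s
    det = h00 * h11 - h01 * h01
    d0 = (h11 * g0 - h01 * g1) / det
    d1 = (h00 * g1 - h01 * g0) / det
    return (beta[0] + d0, beta[1] + d1)

def fit(sites, iters=25):
    beta = (0.0, 0.0)
    for _ in range(iters):
        stats = aggregate([site_stats(rows, beta) for rows in sites])
        beta = newton_step(beta, stats)
    return beta

# Invented toy rows for three hypothetical sites (not real patient data).
sites = [
    [(0.0, 0), (1.0, 0), (2.0, 1), (3.0, 0)],
    [(1.0, 0), (2.0, 0), (3.0, 1), (4.0, 1)],
    [(0.0, 0), (2.0, 1), (3.0, 1), (4.0, 0)],
]
pooled = [row for rows in sites for row in rows]

beta_from_sums = fit(sites)       # only per-site sums are combined
beta_from_pool = fit([pooled])    # plaintext pooling of raw rows
```

In exact arithmetic the two fits coincide, because the pooled gradient and information matrix are simply the sums of the per-site ones. Any gap seen in a real encrypted pipeline would therefore come from the approximation error of the encryption scheme, not from the pooling step itself.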
Testable prediction
Compared with plaintext pooled analysis, an encrypted pooled-statistics pipeline will show delta-AUROC <=0.02 and delta-Brier <=0.02 for external-site validation of methotrexate lung-toxicity prediction, while materially reducing data-sharing exposure.
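The prediction can be checked with a few lines of code once both pipelines have produced risk estimates for the same external-site patients. The sketch below assumes the deltas are absolute differences between the plaintext and encrypted models (the text does not say whether they are absolute or one-sided), and the function names and the example values are hypothetical.

```python
def auroc(y, p):
    """AUROC as the share of (case, non-case) pairs ranked correctly; ties count 0.5."""
    pos = [pi for yi, pi in zip(y, p) if yi == 1]
    neg = [pi for yi, pi in zip(y, p) if yi == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in pos for b in neg)
    return wins / (len(pos) * len(neg))

def brier(y, p):
    """Mean squared difference between predicted risk and observed outcome."""
    return sum((pi - yi) ** 2 for yi, pi in zip(y, p)) / len(y)

def retention_check(y, p_plain, p_enc, margin=0.02):
    # Assumption: both deltas are taken as absolute differences.
    d_auroc = abs(auroc(y, p_plain) - auroc(y, p_enc))
    d_brier = abs(brier(y, p_plain) - brier(y, p_enc))
    return {
        "delta_auroc": d_auroc,
        "delta_brier": d_brier,
        "passes": d_auroc <= margin and d_brier <= margin,
    }

# Invented example: encrypted-pipeline risks differ slightly from plaintext ones.
y = [0, 0, 1, 1]
p_plain = [0.10, 0.40, 0.35, 0.80]
p_enc = [0.101, 0.399, 0.351, 0.799]
result = retention_check(y, p_plain, p_enc)
```

A pairwise AUROC is quadratic in sample size, which is acceptable for a rare outcome but would likely be replaced by a rank-based computation in a larger validation set.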
Suggested study design
- Sites: >=5 autoimmune registries with harmonized MTX exposure and pulmonary outcome definitions
- Outcome: adjudicated methotrexate pneumonitis or severe suspected MTX lung toxicity
- Predictors: timing from MTX initiation, DLCO, albumin, ILD history, oxygen saturation, CT pattern, age, comorbidity burden
- Privacy layer: CKKS-style encrypted aggregation of sufficient statistics or gradient updates
- Analysis: compare plaintext pooled, encrypted pooled, and site-local threshold models with internal-external cross-validation
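The internal-external cross-validation named in the analysis step can be sketched as a leave-one-site-out loop: each registry is held out in turn, the model is fitted on the remaining sites, and performance is scored on the held-out site. The `fit` and `score` callables below are hypothetical placeholders, and the prevalence-only model and outcome lists are invented purely to make the loop runnable.

```python
def internal_external_cv(site_data, fit, score):
    """Hold out each site once; train on the rest; score on the held-out site.

    site_data: dict mapping site name -> that site's data.
    fit: callable taking a dict of training sites, returning a model.
    score: callable taking (model, held_out_data), returning a number.
    """
    results = {}
    for held_out in site_data:
        train = {s: d for s, d in site_data.items() if s != held_out}
        model = fit(train)
        results[held_out] = score(model, site_data[held_out])
    return results

# Placeholder model (assumption): predict the pooled training prevalence.
def fit_prevalence(train):
    outcomes = [y for d in train.values() for y in d]
    return sum(outcomes) / len(outcomes)

# Placeholder metric: Brier score of a constant risk prediction.
def brier_constant(risk, outcomes):
    return sum((risk - y) ** 2 for y in outcomes) / len(outcomes)

# Invented binary outcomes for three hypothetical registries.
toy_sites = {"A": [0, 0, 1], "B": [0, 1], "C": [0, 0, 0, 1]}
cv_results = internal_external_cv(toy_sites, fit_prevalence, brier_constant)
```

Running the same loop once per modeling arm (plaintext pooled, encrypted pooled, site-local) would give per-site metrics that can be compared directly.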
Falsifiers
- Encrypted model loses clinically important calibration or discrimination
- Compute overhead makes near-real-time recalibration impractical
- Harmonization error dominates privacy-preserving advantages
Key references
- Cheon JH, Kim A, Kim M, Song Y. ASIACRYPT 2017. DOI: 10.1007/978-3-319-70694-8_15
- Fragoulis GE, Nikiphorou E, Larsen J, Korsten P, Conway R. Front Med (Lausanne). 2019;6:238. DOI: 10.3389/fmed.2019.00238
- Collins GS, Reitsma JB, Altman DG, Moons KGM. BMJ. 2015;350:g7594. DOI: 10.1136/bmj.g7594
Limitation
This hypothesis assumes predictor harmonization is good enough that privacy-preserving pooling, not variable-definition drift, is the main constraint.