Mechanism: Gaussian Process Latent Variable Models (GP-LVM) integrate multi-omic data to map RA patient trajectories onto a low-dimensional manifold, guiding personalized biologic therapy decisions. Readout: Trajectory velocity toward a responder centroid at week 8 is predicted to forecast 6-month ACR50 response with AUC ≥ 0.82, outperforming DAS28 change alone.
Hypothesis
Longitudinal multi-omic data (transcriptomics, proteomics, metabolomics, clinical scores) from rheumatoid arthritis (RA) patients on biologic DMARDs can be embedded into a low-dimensional latent manifold via Gaussian Process Latent Variable Models (GP-LVM), and the patient's trajectory on this manifold during the first 8 weeks of therapy predicts 6-month treatment response (ACR50) with greater accuracy than any single biomarker panel or composite clinical score.
Background
Current selection of biologic and targeted synthetic DMARDs in RA relies on trial-and-error cycling through TNF inhibitors, IL-6 blockade, JAK inhibitors, and co-stimulation modulators. Composite scores (DAS28, CDAI, SDAI) capture disease activity at discrete timepoints but miss the dynamic, high-dimensional interactions between molecular pathways that determine whether a patient will respond. GP-LVM provides a principled Bayesian nonparametric framework for discovering smooth, continuous latent spaces from heterogeneous time-series data without imposing linear assumptions.
Proposed Framework
- Data integration: Harmonize serial multi-omic panels (weeks 0, 2, 4, 8, 12, 24) from registry cohorts (e.g., CORRONA, BIOBADAMEX) using probabilistic canonical correlation analysis as preprocessing
- GP-LVM embedding: Fit a Bayesian GP-LVM with ARD (automatic relevance determination) kernel to learn a 3–5 dimensional latent space; the ARD weights identify which omic features drive manifold structure
- Trajectory modeling: Represent each patient as a curve on the manifold via GP regression over their serial latent coordinates; compute trajectory curvature, velocity, and geodesic distance from a "responder centroid"
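- Prediction: Train a Cox proportional hazards model on trajectory features at week 8 to predict time-to-ACR50 by week 24; summarize discrimination for ACR50 status at week 24 as an AUC so it can be compared directly with DAS28 change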
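The trajectory features named in the framework can be illustrated with a minimal sketch. It assumes the latent coordinates have already been produced by a fitted GP-LVM, and it swaps in two simplifications that are not part of the proposal: finite differences between observed visits instead of a GP-regressed curve, and Euclidean distance as a stand-in for geodesic distance. The function names, the 2-D coordinates, and the centroid location are all hypothetical.

```python
import math

def distance(a, b):
    """Euclidean distance between two latent points (stand-in for geodesic distance)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def velocity_toward_centroid(weeks, coords, centroid):
    """Rate of approach to the centroid over the last observed interval.

    Positive values mean the patient moved closer to the centroid
    (latent units per week); negative values mean they moved away.
    """
    if len(coords) < 2 or len(weeks) != len(coords):
        raise ValueError("need at least two visits with matching week labels")
    d_prev = distance(coords[-2], centroid)
    d_last = distance(coords[-1], centroid)
    return (d_prev - d_last) / (weeks[-1] - weeks[-2])

def path_length(coords):
    """Total distance travelled along the piecewise-linear trajectory."""
    return sum(distance(p, q) for p, q in zip(coords, coords[1:]))

# Hypothetical latent coordinates for one patient at weeks 0, 2, 4, 8.
weeks = [0, 2, 4, 8]
coords = [(3.0, 4.0), (3.0, 2.0), (3.0, 0.0), (1.0, 0.0)]
responder_centroid = (0.0, 0.0)  # assumed: mean latent position of responders

v8 = velocity_toward_centroid(weeks, coords, responder_centroid)
d8 = distance(coords[-1], responder_centroid)
total = path_length(coords)
print(v8, d8, total)  # 0.5 1.0 6.0
```

In a real analysis the week-8 velocity, remaining distance, and a curvature measure would become covariates in the survival model; the finite-difference version here only shows what each quantity measures.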
Testable Predictions
- P1: The GP-LVM latent space will reveal ≥3 distinct attractor basins corresponding to TNF-dominant, IL-6-dominant, and JAK/STAT-dominant inflammatory endotypes
- P2: Trajectory velocity toward the responder centroid at week 8 will predict ACR50 at 24 weeks with AUC ≥ 0.82, outperforming DAS28 change alone (expected AUC ~0.68)
- P3: ARD kernel weights will consistently rank type I interferon signature genes and citrullinated peptide metabolites among the top 10% most informative features across independent cohorts
- P4: Patients whose trajectories cross a saddle point between attractor basins within the first 4 weeks have >70% probability of secondary loss of efficacy by month 12
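To make the comparison in P2 concrete, the sketch below computes AUC as the probability that a randomly chosen responder has a higher score than a randomly chosen non-responder (the Mann-Whitney formulation, with ties counted as one half). The six-patient dataset is synthetic and purely illustrative; it is not drawn from any cohort and says nothing about the expected result. The variable names are hypothetical.

```python
def roc_auc(scores, labels):
    """AUC via pairwise comparison of responders (1) and non-responders (0)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one responder and one non-responder")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))

# Synthetic toy values, for illustration only.
acr50_week24 = [1, 1, 1, 0, 0, 0]                 # 1 = ACR50 responder
velocity_week8 = [0.9, 0.7, 0.4, 0.5, 0.2, 0.1]   # latent units per week
das28_change = [1.8, 0.6, 1.1, 1.2, 0.9, 0.3]     # baseline minus week 8

auc_velocity = roc_auc(velocity_week8, acr50_week24)
auc_das28 = roc_auc(das28_change, acr50_week24)
print(round(auc_velocity, 3), round(auc_das28, 3))  # 0.889 0.667
```

With real data, both AUCs would be estimated on held-out patients and reported with confidence intervals before comparing them against the 0.82 and 0.68 figures in P2.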
Falsifiability
The hypothesis is falsified if: (a) GP-LVM latent spaces show no reproducible cluster structure across independent cohorts (permutation test p > 0.05), (b) trajectory-based prediction fails to exceed DAS28 change AUC by ≥0.05 in held-out validation, or (c) ARD weights show no convergence across bootstrap resamples (coefficient of variation > 1.0 for top features).
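The three criteria can be written as explicit checks, which makes the thresholds unambiguous. The sketch below is one possible reading, not a specification: it assumes the permutation p-value and both AUCs are computed elsewhere, and it treats criterion (c) as triggered when any top feature has a coefficient of variation above 1.0 across bootstrap resamples (the text does not say whether "any" or "all" is intended). Function and key names are hypothetical.

```python
import math
import statistics

def coefficient_of_variation(values):
    """Sample standard deviation divided by the absolute mean."""
    mean = statistics.fmean(values)
    if mean == 0:
        return math.inf
    return statistics.stdev(values) / abs(mean)

def falsification_flags(cluster_p, auc_trajectory, auc_das28, bootstrap_weights):
    """Return which falsification criteria are met.

    bootstrap_weights maps each top feature to its ARD weight
    in each bootstrap resample.
    """
    return {
        "a_no_cluster_structure": cluster_p > 0.05,
        "b_insufficient_auc_gain": (auc_trajectory - auc_das28) < 0.05,
        # Assumption: one unstable top feature is enough to trigger (c).
        "c_unstable_ard_weights": any(
            coefficient_of_variation(w) > 1.0 for w in bootstrap_weights.values()
        ),
    }

# Synthetic example inputs, for illustration only.
stable = {"feature_a": [1.0, 1.2, 0.8, 1.0]}
unstable = {"feature_a": [1.0, 1.2, 0.8, 1.0], "feature_b": [0.0, 0.0, 0.0, 2.0]}

flags_ok = falsification_flags(0.01, 0.82, 0.68, stable)
flags_bad = falsification_flags(0.01, 0.82, 0.68, unstable)
print(flags_ok)
print(flags_bad)
```

Under this reading the hypothesis counts as falsified if any flag is true.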
Limitations
- Requires longitudinal multi-omic data with ≥4 timepoints per patient, limiting applicability to well-resourced registries
- GP-LVM computational cost scales as O(n³) in the number of observations n; sparse GP approximations (inducing points) reduce this cost but may sacrifice manifold fidelity
- Latent dimensions lack direct biological interpretability without post-hoc pathway enrichment analysis
- Confounding by indication remains if registry patients are non-randomly assigned to biologics; propensity score weighting or instrumental variable approaches are needed as sensitivity analyses
- Batch effects across sites and platforms may dominate latent structure if not properly corrected
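The computational trade-off in the list above can be made tangible with a rough operation count. The sketch compares the cubic cost of an exact GP with the O(n·m²) cost usually quoted for inducing-point approximations, where m is the number of inducing points. Constants and memory are ignored, and the cohort size and the value of m are hypothetical.

```python
def exact_gp_cost(n_obs):
    """Leading-order operation count for an exact GP: n^3."""
    return n_obs ** 3

def sparse_gp_cost(n_obs, n_inducing):
    """Leading-order operation count with inducing points: n * m^2."""
    return n_obs * n_inducing ** 2

# Hypothetical registry: 500 patients with 6 visits each.
n_obs = 500 * 6
n_inducing = 100  # assumed number of inducing points

speedup = exact_gp_cost(n_obs) / sparse_gp_cost(n_obs, n_inducing)
print(speedup)  # 900.0, i.e. (n_obs / n_inducing) ** 2
```

The ratio grows as (n/m)², which is why sparse approximations become attractive for larger registries; whether a given m preserves the manifold structure would need to be checked empirically.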
Clinical Significance
If validated, this framework provides a dynamic, patient-specific decision tool: clinicians could assess manifold trajectory at week 8 to decide whether to continue, switch, or intensify biologic therapy — replacing the current 12–24 week wait-and-see approach. The attractor basin classification would also inform rational first-line biologic selection based on a patient's baseline molecular endotype, moving rheumatology closer to precision medicine.
RheumaAI Research • rheumai.xyz • DeSci Rheumatology