Mechanism: The Transformer-SMAHP model processes time-varying transcriptome data to identify age-specific proteomic mediators influencing survival. Readout: The model shows reduced prediction error, identifies known aging pathways via attention weights, and simulates a 6-month median lifespan increase.
Hypothesis
Integrating a temporal transformer encoder with the SMAHP accelerated failure time framework yields a joint model that (1) captures time‑varying omics exposures, (2) decomposes their mediated effect on survival through proteomic mediators, and (3) provides interpretable attention weights that pinpoint age‑specific causal pathways.
Mechanistic Rationale
SMAHP extends the Cox approach to an accelerated failure time (AFT) model that jointly models high‑dimensional exposures and mediators for causal mediation analysis[1]. However, SMAHP assumes static baseline measurements and cannot handle covariates that change over follow‑up. Longitudinal aging studies repeatedly measure the transcriptome, proteome, and clinical variables, creating a dynamic hazard landscape[4]. Recent neural survival methods that use TGDR, along with deep conditional transformation models, relax the proportional hazards assumption but still treat each time point independently[2]. A transformer encoder, by contrast, is designed to process sequences and learn contextual representations across time steps. This lets the model borrow information from prior omics states while retaining sparse feature selection via TGDR‑style regularization on the transformer’s feed‑forward layers.
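To make "borrowing information from prior omics states" concrete, here is a minimal sketch of single‑head scaled dot‑product self‑attention over a sequence of visits. It is an illustration under simplifying assumptions, not the proposed model: the query, key, and value projections are taken to be the identity (a real encoder learns them), the causal mask is one possible design choice, and the toy embeddings in `waves` are hypothetical.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats (may contain -inf)."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(visits, causal=True):
    """Single-head scaled dot-product self-attention with identity projections.

    visits: list of T embedding vectors (one per omics wave), each of length d.
    causal: if True, visit t may only attend to visits s <= t (prior states).
    Returns (context, weights), where weights[t][s] is how much visit t
    attends to visit s and context[t] is the weighted mix of visit vectors.
    """
    d = len(visits[0])
    scale = math.sqrt(d)
    weights = []
    for t, q in enumerate(visits):
        scores = []
        for s, k in enumerate(visits):
            if causal and s > t:
                scores.append(float("-inf"))  # mask out future visits
            else:
                scores.append(sum(qi * ki for qi, ki in zip(q, k)) / scale)
        weights.append(softmax(scores))
    context = [
        [sum(w * v[j] for w, v in zip(row, visits)) for j in range(d)]
        for row in weights
    ]
    return context, weights


# Hypothetical 2-dimensional embeddings for three waves (ages 50, 60, 70).
waves = [[1.0, 0.0], [0.8, 0.2], [0.0, 1.0]]
context, weights = self_attention(waves)
for age, row in zip((50, 60, 70), weights):
    print(age, [round(w, 3) for w in row])
```

Each row of `weights` sums to 1, so a row can be read as the share of a visit's representation that is drawn from each earlier wave.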
We propose to replace the static exposure matrix in SMAHP with the transformer’s latent sequence output. The transformer learns a low‑dimensional, time‑varying exposure embedding that feeds into the AFT mediator model. Because the transformer’s self‑attention weights are directly inspectable, we can map which prior visits (e.g., mid‑life inflammation) most strongly influence later proteomic mediators and mortality risk. This yields a mechanistically interpretable mediation cascade: genomic drift → time‑varying transcriptomic embedding → age‑specific proteomic mediator → accelerated failure time.
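One way to write this cascade down is the standard product‑of‑coefficients decomposition for a linear mediator model combined with a log‑linear AFT outcome model. This is a sketch under stated assumptions, not necessarily the specification used in SMAHP[1]: the symbols $z_i$ (transformer embedding for subject $i$), $M_{ik}$ (proteomic mediator $k$), $\alpha_k$, $\beta_k$, $\gamma$, $\sigma$, and $W_i$ (error term) are notation introduced here, and the decomposition assumes no exposure‑mediator interaction and no unmeasured confounding.

```latex
\begin{aligned}
M_{ik} &= \alpha_{0k} + \alpha_k^{\top} z_i + \varepsilon_{ik}, \qquad k = 1, \dots, p, \\
\log T_i &= \beta_0 + \gamma^{\top} z_i + \sum_{k=1}^{p} \beta_k M_{ik} + \sigma W_i, \\
\mathrm{IE}_k(\Delta z) &= \beta_k \, \alpha_k^{\top} \Delta z, \qquad
\mathrm{DE}(\Delta z) = \gamma^{\top} \Delta z .
\end{aligned}
```

Under these assumptions, a shift $\Delta z$ in the embedding changes expected log survival time by $\mathrm{DE}(\Delta z) + \sum_k \mathrm{IE}_k(\Delta z)$, and exponentiating that sum gives the corresponding time ratio.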
Testable Predictions
- In a longitudinal aging cohort with at least three omics waves (e.g., ages 50, 60, 70), the transformer‑SMAHP model will show significantly lower cross‑validated prediction error for survival than static SMAHP or TGDR‑Cox baselines (p < 0.01, paired t‑test on log‑loss).
- Genes and proteins that receive high attention weight on the links from baseline transcriptomic perturbations to mid‑life proteomic mediators will be enriched for known aging pathways (e.g., mTOR, senescence) identified in the NIA Interventions Testing Program[5].
- Intervention simulation (setting the transformer’s attention to zero for a selected mid‑life proteomic node) will increase predicted median lifespan by ≥ 6 months in silico, an effect that can be tested in a subset of participants receiving rapamycin or SGLT2 inhibitors.
- Sparsity constraints on the transformer’s feed‑forward layers will retain ≤ 15% of input features, ensuring biological interpretability without sacrificing performance.
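The 6‑month threshold in the intervention prediction can be translated onto the AFT scale. In any AFT model of the form log T = μ + σW, adding δ to the linear predictor μ multiplies every survival quantile, including the median, by exp(δ). The sketch below uses that identity; the function names and the 20‑year baseline median are hypothetical values chosen only for illustration.

```python
import math


def median_gain_months(baseline_median_years, delta):
    """Gain in median survival (months) when the AFT linear predictor shifts by delta.

    Under log T = mu + sigma * W, the median scales by exp(delta).
    """
    return 12.0 * baseline_median_years * (math.exp(delta) - 1.0)


def required_shift(baseline_median_years, gain_months=6.0):
    """Smallest shift in the linear predictor that yields the requested gain."""
    return math.log(1.0 + gain_months / (12.0 * baseline_median_years))


# Hypothetical baseline: median remaining lifespan of 20 years.
delta = required_shift(20.0)
print(round(delta, 4), round(median_gain_months(20.0, delta), 6))
```

For this hypothetical baseline, a 6‑month gain corresponds to a time ratio of 1.025, or a shift of roughly 0.025 on the log‑time scale. Shorter baseline medians require proportionally larger shifts.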
Experimental Design
- Data: Use the Framingham Heart Study Offspring cohort with RNA‑Seq, proteomics, and clinical visits every 4 years (n ≈ 2,000, events ≈ 300).
- Preprocess: Log‑transform omics, batch‑correct, impute missing values with K‑NN.
- Model architecture:
  - Transformer encoder (2 layers, 4 heads, embedding dim 64) with TGDR‑style L1 penalty on feed‑forward weights.
  - SMAHP AFT mediator model linking transformer output to proteomic mediators and survival.
- Baseline models: Static SMAHP, TGDR‑Cox, deep conditional transformation model.
- Evaluation: 5‑fold cross‑validated concordance index, integrated Brier score, and calibration plots.
- Interpretability: Extract attention matrices, perform enrichment analysis on genes/proteins with top‑5% weights.
- Falsification: If transformer‑SMAHP does not outperform baselines or attention weights fail to enrich aging‑related pathways, the hypothesis is rejected.
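For the evaluation step, the concordance index can be computed directly from predicted survival times. The following is a minimal sketch of Harrell's C for right‑censored data, written for clarity rather than speed (it is quadratic in the number of subjects). It assumes the model outputs a predicted survival time per subject, as an AFT model would, and it skips pairs with tied observed times for simplicity. The toy data are hypothetical.

```python
def concordance_index(times, events, predicted_times):
    """Harrell's C for right-censored data.

    times: observed follow-up times.
    events: 1 if death was observed, 0 if censored.
    predicted_times: model-predicted survival times (longer = lower risk).
    A pair (i, j) is comparable when subject i had an observed event
    strictly before subject j's observed time.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored subject cannot be the earlier member of a pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if predicted_times[i] < predicted_times[j]:
                    concordant += 1.0  # correct ordering
                elif predicted_times[i] == predicted_times[j]:
                    concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data: four subjects, the third is censored.
times = [2.0, 4.0, 6.0, 8.0]
events = [1, 1, 0, 1]
print(concordance_index(times, events, [1.0, 3.0, 5.0, 7.0]))  # perfectly ordered
print(concordance_index(times, events, [7.0, 5.0, 3.0, 1.0]))  # perfectly reversed
```

A value of 0.5 corresponds to random ordering, so the falsification criterion amounts to the transformer‑SMAHP model failing to exceed the baselines' cross‑validated C.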