Mechanism: Manifold alignment techniques merge heterogeneous multi-omics aging data into a conserved trajectory, revealing a shared nucleocytoplasmic-ER-mitochondrial stress axis. Readout: The aligned model significantly improves predictive accuracy for mortality, with R² increasing by 0.03 and RMSE decreasing, and key proteins such as NUP62 and MFNs converge as top features.
Recent multi-omics aging clocks have shifted toward gradient-boosted models and effect-size reporting, yet they still treat each cohort’s latent space as isolated, missing an opportunity to uncover shared biological drivers of aging. Here we hypothesize that the apparent heterogeneity among aging signatures stems not from distinct pathways but from cohort-specific distortions of a common low-dimensional manifold that captures the progressive decoupling of nucleocytoplasmic transport and ER-mitochondrial calcium signaling. If true, applying a manifold-alignment technique (e.g., optimal transport-based Procrustes analysis) to multi-omics data from independent cohorts will collapse these distortions into a conserved trajectory, improving predictive accuracy for mortality and organ-specific dysfunction and revealing a convergent set of SHAP-derived features linked to the nucleocytoplasmic-ER-mitochondrial axis.
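To make the alignment idea concrete, here is a minimal sketch of plain orthogonal Procrustes on synthetic data. It assumes the simplest case, in which the two cohorts share matched anchor points; real independent cohorts do not, and that is the gap the optimal-transport step would have to fill by estimating correspondences. The function name, the synthetic `latent` coordinates, and the 60-degree distortion are illustrative assumptions rather than part of the proposed pipeline.

```python
import numpy as np


def orthogonal_procrustes_align(source, target):
    """Rotate/reflect `source` onto `target`; rows are assumed to be matched points."""
    # Center both point clouds so that only a rotation/reflection is left to solve.
    src_centered = source - source.mean(axis=0)
    tgt_mean = target.mean(axis=0)
    tgt_centered = target - tgt_mean
    # The orthogonal matrix minimizing ||src_centered @ R - tgt_centered||_F
    # is U @ Vt, where U, S, Vt is the SVD of src_centered.T @ tgt_centered.
    u, _, vt = np.linalg.svd(src_centered.T @ tgt_centered)
    rotation = u @ vt
    aligned = src_centered @ rotation + tgt_mean
    return aligned, rotation


# Synthetic stand-in for a shared low-dimensional aging manifold (assumption).
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 3))

# Hypothetical cohort-specific distortion: a 60-degree rotation plus a shift.
theta = np.pi / 3
distortion = np.array([
    [np.cos(theta), -np.sin(theta), 0.0],
    [np.sin(theta), np.cos(theta), 0.0],
    [0.0, 0.0, 1.0],
])
cohort_a = latent
cohort_b = latent @ distortion + np.array([2.0, -1.0, 0.5])

aligned_b, rotation = orthogonal_procrustes_align(cohort_b, cohort_a)
residual = np.linalg.norm(aligned_b - cohort_a)
print(f"residual after alignment: {residual:.2e}")
```

Because the synthetic distortion is exactly rigid, the residual here is at numerical precision. With real omics embeddings the distortion would be neither rigid nor noise-free, so the residual after alignment is itself a diagnostic of how well the common-manifold assumption holds.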
To test this, we will obtain plasma proteomics (2,448 proteins), metabolomics, lipidomics, epigenomics, and microbiome profiles from two large, publicly available cohorts: the UK Biobank subset used for ProtBAGs [1] and the All of Us Research Program, or a similar cohort with matched multi-omics layers. Each dataset will be processed through the same LightGBM pipeline reported in the arXiv preprint [2] to generate individual omics-specific age predictions and a combined multi-omics clock. We will then compute a joint latent representation across cohorts using manifold alignment (e.g., UMAP embeddings with Procrustes correction, or optimal transport), forcing the cohort-specific spaces to share a common geometric structure. The aligned clock's performance will be evaluated via 10-fold cross-validation using Pearson's r, R², RMSE, and MAE, with effect sizes reported rather than p-values alone. We will compare these metrics to those of the baseline, non-aligned multi-omics clock using a paired bootstrap test; a statistically significant increase in R² of at least 0.02, together with a decrease in RMSE, will support the hypothesis.
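The paired bootstrap comparison could be sketched as follows. This is one possible reading of the test, not the pipeline from [2]: the same resampled individuals are scored under both clocks, and the difference in each metric is summarized with a percentile interval. The simulated ages, both sets of predictions, and all function names are hypothetical.

```python
import numpy as np


def r2_score(y_true, y_pred):
    """Coefficient of determination."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


def paired_bootstrap_delta(y_true, pred_base, pred_aligned, metric,
                           n_boot=2000, seed=0):
    """Point estimate and 95% percentile interval for
    metric(aligned) - metric(baseline), resampling the same rows for both."""
    rng = np.random.default_rng(seed)
    n = len(y_true)
    deltas = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)  # one resample shared by both models
        deltas[b] = (metric(y_true[idx], pred_aligned[idx])
                     - metric(y_true[idx], pred_base[idx]))
    point = metric(y_true, pred_aligned) - metric(y_true, pred_base)
    lower, upper = np.percentile(deltas, [2.5, 97.5])
    return point, lower, upper


# Simulated data only: hypothetical ages and out-of-fold predictions.
rng = np.random.default_rng(1)
age = rng.uniform(40, 70, size=1000)
pred_baseline = age + rng.normal(0, 5.0, size=age.size)  # noisier clock
pred_aligned = age + rng.normal(0, 4.0, size=age.size)   # less noisy clock

d_r2, r2_lo, r2_hi = paired_bootstrap_delta(age, pred_baseline, pred_aligned, r2_score)
d_rmse, rmse_lo, rmse_hi = paired_bootstrap_delta(age, pred_baseline, pred_aligned, rmse)
print(f"delta R2   = {d_r2:.3f}  (95% CI {r2_lo:.3f} to {r2_hi:.3f})")
print(f"delta RMSE = {d_rmse:.3f}  (95% CI {rmse_lo:.3f} to {rmse_hi:.3f})")
```

Reporting the interval alongside the point estimate keeps the emphasis on effect size: the 0.02 threshold for R² can be checked against the lower bound of the interval rather than against a p-value.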
SHAP values will be extracted from the aligned model and subjected to consensus analysis across cohorts. We predict that the top-ranked features will converge on proteins involved in nuclear pore complex regulation (e.g., NUP62, NUP93), ER-resident calcium handlers (IP3R1, SERCA2), and mitochondrial regulators (MFNs, OPA1), forming a mechanistic module that explains cross-cohort variance. To falsify the hypothesis, we will also test a control alignment that randomizes cohort labels before manifold alignment; this should produce no improvement in predictive performance and no convergence of SHAP features. Failure of the aligned model to outperform the baseline, or lack of reproducible feature convergence, would refute the idea that a conserved nucleocytoplasmic-ER-mitochondrial manifold underlies aging heterogeneity, suggesting instead that observed differences reflect genuine cohort-specific pathobiology.
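One simple way to quantify "convergence" of top-ranked features is the overlap between each cohort's top-k list, compared against a shuffled null. The sketch below assumes per-feature mean absolute SHAP values have already been computed for each cohort; it does not call the SHAP library, and the importance values are simulated. Note that the null used here shuffles feature identities, which is a simpler stand-in for, not an implementation of, the cohort-label randomization control described above. `MFN2` is used as one example of the MFNs, and the `protein_*` names are hypothetical fillers.

```python
import numpy as np


def top_k_features(importance, names, k):
    """Names of the k features with the largest importance."""
    order = np.argsort(importance)[::-1][:k]
    return {names[i] for i in order}


def jaccard(a, b):
    """Jaccard similarity between two sets."""
    return len(a & b) / len(a | b)


def overlap_with_null(imp_a, imp_b, names, k=6, n_perm=1000, seed=0):
    """Observed top-k overlap between two cohorts and the mean overlap
    obtained when cohort B's importances are shuffled across features."""
    rng = np.random.default_rng(seed)
    top_a = top_k_features(imp_a, names, k)
    observed = jaccard(top_a, top_k_features(imp_b, names, k))
    null = np.empty(n_perm)
    for p in range(n_perm):
        shuffled = rng.permutation(imp_b)
        null[p] = jaccard(top_a, top_k_features(shuffled, names, k))
    return observed, float(null.mean())


# Simulated mean |SHAP| values (assumption): six axis proteins plus fillers.
axis = ["NUP62", "NUP93", "IP3R1", "SERCA2", "MFN2", "OPA1"]
names = axis + [f"protein_{i}" for i in range(44)]

rng = np.random.default_rng(2)


def simulate_importance():
    strong = 1.0 + rng.normal(0, 0.05, size=len(axis))  # axis proteins dominate
    weak = rng.uniform(0, 0.3, size=len(names) - len(axis))
    return np.concatenate([strong, weak])


imp_cohort_a = simulate_importance()
imp_cohort_b = simulate_importance()

observed, null_mean = overlap_with_null(imp_cohort_a, imp_cohort_b, names)
print(f"observed top-6 Jaccard: {observed:.2f}, shuffled-null mean: {null_mean:.2f}")
```

In this simulation the axis proteins are built to dominate, so the observed overlap is high by construction. On real data the informative quantity is the gap between the observed overlap and the null, and a gap near zero would count against the hypothesis.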