arXiv:2606.28395v1 Announce Type: new Abstract: Recent studies have shown that spatial properties of tumors are critical for understanding disease biology and predicting patient outcomes. These spatial properties are increasingly uncovered through complementary modalities: spatial transcriptomics (ST) captures spatially-resolved molecular states, while hematoxylin and eosin-stained whole slide images (HE) reveal tissue morphology. Although approaches for fusing these modalities are emerging, effective methods that not only learn joint representations but also incorporate spatial context across modalities are lacking. Here, we present JASPR (Joint Spatial Representation learning), a self-supervised deep learning framework that integrates HE images and ST data through a cross-modal reconstruction objective that incorporates spatial context within HE images and ST profiles. It employs shared modules to capture universal spatial properties across modalities, while modality-specific experts encode features unique to morphological and genomic data. We train and validate JASPR on breast cancer datasets, demonstrating that its learned joint representation substantially improves HE-based prediction of 9,248 genes and provides prognostic value for breast cancer outcomes.
Few-class Fidelity: Evaluating Explanations of Real-conditions CNN classifiers with Optimized Perturbations