Background: Despite the rapidly increasing number of targeted and immunotherapeutic options over the past two decades, the prognosis of patients with non-small cell lung cancer (NSCLC) remains poor, even for early-stage tumors, and novel biomarkers are needed to better stratify patients by survival and treatment response. One promising approach is to gain a holistic understanding of the cellular composition and formation of the tumor microenvironment (TME). We therefore developed a multiplex immunofluorescence (mIF)-based, AI-driven approach for spatially resolved TME characterization at the cellular level and used it to predict clinical outcome.
Methods: We assembled a large bicentric real-world cohort of 1168 patients with resected NSCLC from the Charité and the University Hospital Cologne. For tissue microarray construction, four 1.5 mm tissue cores were punched from each formalin-fixed, paraffin-embedded tumor block. Sections were stained with a 12-plex IF panel, followed by H&E staining. All stains were scanned and co-registered with single-cell accuracy. Next, we trained an H&E-based tissue segmentation model to detect the different tumor regions: carcinoma, stroma, and necrosis. In addition, we developed a nucleus-based cell detection model and 12 cell classification models that categorize each detected cell on the basis of a single mIF channel. Cell phenotypes were then derived from these marker-specific classifications. Finally, we trained a survival prediction model on the Charité cohort using the spatially resolved cell readouts, spot-wise phenotype log-density, co-clustering of marker expression, and frequency of co-occurrence of marker expression derived from Delaunay triangulation, and evaluated it on the Cologne cohort.
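The Delaunay-based co-occurrence readout named in the Methods can be illustrated with a minimal sketch. It is not the pipeline used in this study: the function name `neighbor_cooccurrence`, the treatment of every Delaunay edge as a neighbor relation, and the normalization to relative frequencies are all assumptions made for illustration. The sketch takes cell centroid coordinates and one label per cell, triangulates the centroids with `scipy.spatial.Delaunay`, and counts how often each unordered pair of labels shares an edge.

```python
from collections import Counter
from itertools import combinations

import numpy as np
from scipy.spatial import Delaunay


def neighbor_cooccurrence(coords, labels):
    """Relative frequency of label pairs across Delaunay edges.

    coords: sequence of (x, y) cell centroids (hypothetical input format).
    labels: one phenotype or marker label per cell, in the same order.
    """
    tri = Delaunay(np.asarray(coords, dtype=float))

    # Collect each undirected edge once; triangles share edges.
    edges = set()
    for simplex in tri.simplices:
        for i, j in combinations(simplex, 2):
            i, j = int(i), int(j)
            edges.add((min(i, j), max(i, j)))

    # Count unordered label pairs over all edges.
    counts = Counter()
    for i, j in edges:
        pair = tuple(sorted((labels[i], labels[j])))
        counts[pair] += 1

    total = sum(counts.values())
    return {pair: n / total for pair, n in counts.items()}


if __name__ == "__main__":
    # Toy spot: four cells on the corners of a square, one cell in the center.
    coords = [(0, 0), (1, 0), (1, 1), (0, 1), (0.5, 0.5)]
    labels = ["CD3", "CD3", "CD3", "CD3", "tumor"]
    print(neighbor_cooccurrence(coords, labels))
```

In practice one would probably also cap the edge length so that cells separated by empty tissue are not treated as neighbors; the abstract does not say whether such a filter was applied.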
Results: The tissue segmentation model achieved a macro-averaged F1 score of 92%. The cell detection model identified a total of 53 million cells, which were classified marker-wise with an F1 score of at least 95% on hold-out data. Our final prediction model identified a stable, spatially resolved cell signature consisting of 10 characteristic cell neighborhood niches that could be used to predict overall patient survival. The model trained on the Charité cohort was validated on the Cologne cohort and achieved a C-score of 71. In comparison, the UICC8 stage and the immunoscore (CD20+CD3+/carcinoma cell ratio), which served as baselines, achieved C-scores of 63 and 54, respectively.
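To make the reported C-scores easier to interpret, the sketch below computes a concordance index in the sense of Harrell's C. It assumes that the C-score reported here is this statistic expressed on a 0 to 100 scale, which the abstract does not state explicitly; the function name and the input format are likewise illustrative. A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking of patients by risk.

```python
def concordance_index(times, events, risk_scores):
    """Harrell-style concordance index for right-censored survival data.

    times: observed follow-up time per patient.
    events: 1 (or True) if death was observed, 0 if censored.
    risk_scores: model output, higher meaning higher predicted risk.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot anchor a comparable pair
        for j in range(n):
            # Pair is comparable if patient i died before patient j's time.
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0  # higher risk died earlier
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # tied risk counts as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


if __name__ == "__main__":
    times = [1, 2, 3, 4]
    events = [1, 1, 0, 1]
    print(concordance_index(times, events, [4, 3, 2, 1]))
```

This pairwise version runs in quadratic time and ignores ties in survival time, which is adequate for illustration but not for large cohorts.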
Conclusions: The combination of our large real-world clinical cohort, multiplex panel, and automated AI approach enabled a broad, spatially resolved exploration of the TME in NSCLC at single-cell resolution. Our model identified a specific cell neighborhood signature that is predictive of patient survival and outperformed the commonly used prognostic scores, UICC8 stage and immunoscore. This allows for improved patient stratification, with potential implications for therapy selection.