Innate lymphoid cells (ILCs) are the innate counterparts of T lymphocytes and are classified into type 1, 2, and 3 subsets based on shared transcriptional programs and effector functions. Despite increasing insight into ILC biology, the tissue-specific heterogeneity of human ILCs remains poorly explored. To address this, we profiled ILCs across matched tissues from human organ donors to define tissue-associated patterns of immune organization.
In the first phase of the project, we generated a high-dimensional spectral flow cytometry dataset comprising approximately 90 FCS files with 36 parameters each, from matched mucosal, lymphoid, and non-lymphoid tissues (n = 10 donors). Following initial manual gating, the dataset includes approximately 30 million gated events that require unsupervised computational analysis. Planned analyses include data quality control, cross-sample normalization, batch correction, dimensionality reduction (UMAP), unsupervised clustering (e.g., FlowSOM), differential abundance testing, and cross-donor integration. Constructing nearest-neighbor graphs and running iterative clustering across tens of millions of events is memory-intensive and computationally demanding, and exceeds the capacity of standard laboratory workstations.
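As a rough illustration of the memory footprint described above, the following back-of-envelope sketch estimates the size of the dense expression matrix and of a k-nearest-neighbor graph for the dataset. The event and parameter counts come from the text; double-precision storage, 4-byte neighbor indices, and a neighborhood size of k = 15 are assumptions made only for illustration and are not part of the stated analysis plan.

```python
# Back-of-envelope memory estimate (illustrative assumptions, see comments).

def dense_matrix_gib(n_events: int, n_params: int, bytes_per_value: int = 8) -> float:
    """Size in GiB of a dense events x parameters matrix (8 bytes = double precision, assumed)."""
    return n_events * n_params * bytes_per_value / 2**30


def knn_graph_gib(n_events: int, k: int, index_bytes: int = 4, dist_bytes: int = 8) -> float:
    """Size in GiB of a kNN graph storing one index and one distance per neighbor (assumed layout)."""
    return n_events * k * (index_bytes + dist_bytes) / 2**30


N_EVENTS = 30_000_000  # approximate gated events, from the text
N_PARAMS = 36          # parameters per event, from the text
K = 15                 # hypothetical neighborhood size, not specified in the plan

matrix_gib = dense_matrix_gib(N_EVENTS, N_PARAMS)
graph_gib = knn_graph_gib(N_EVENTS, K)

print(f"Expression matrix: {matrix_gib:.1f} GiB")
print(f"kNN graph (k={K}): {graph_gib:.1f} GiB")
```

These figures cover a single copy of each object. Transformation, normalization, and clustering steps typically hold several working copies in memory at once, so peak usage can be a multiple of these values.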
We therefore request NAISS resources to implement a scalable R-based analysis pipeline enabling parallelized computation, reproducible workflow management, and secure storage of raw and processed high-dimensional cytometry data. This infrastructure will support development of a comprehensive atlas of tissue-resident human ILCs and NK cells and establish computational workflows applicable to emerging large-scale spectral cytometry datasets.
Following spectral cytometry analysis, complementary high-dimensional datasets, including single-cell RNA sequencing, will be generated and integrated to link the phenotypic and transcriptional states of tissue-resident immune cells. Such integration requires large-scale matrix processing, cross-modal alignment, and scalable storage solutions. Continued access to NAISS computational resources will therefore be essential to support high-throughput data analysis, secure storage, and the development of reproducible multi-dataset workflows.