There is growing consensus that alterations in the pancreatic tissue microenvironment, such as immune cell infiltration, β-cell death, and fibrosis, are critical drivers of pancreatic disease. In my current project, I aim to establish a computational infrastructure for analyzing spatial transcriptomics data from pancreatic biopsies obtained from healthy donors and from donors with conditions such as Type 1 Diabetes (T1D), Type 2 Diabetes (T2D), and obesity. This infrastructure will give us deeper insight into the spatial dynamics of pancreatic tissue in both healthy and diseased states.
To achieve this, I will optimize computational pipelines that process and analyze spatial transcriptomics data, starting from raw FASTQ files. The pipelines will run under scalable workflow managers such as Snakemake or Nextflow, with containerization through Singularity, so that large datasets can be processed efficiently and reproducibly. In addition, I will use models such as Cell2Location to deconvolve complex tissue samples and quantify cell type-specific gene expression.
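To make the idea of deconvolution concrete, the sketch below treats the expression vector of a single spot as a non-negative mixture of reference cell-type signatures and recovers the mixture weights by projected gradient descent. This is a deliberately simplified least-squares stand-in, not the Bayesian model that Cell2Location implements; the signature values, the cell-type names, and the function `deconvolve_spot` are all hypothetical and chosen only for illustration.

```python
def deconvolve_spot(signatures, spot, n_iter=2000):
    """Estimate non-negative cell-type abundances for one spot.

    signatures: dict mapping cell type -> per-gene reference expression
    spot: observed per-gene expression for one spot
    Minimizes the squared reconstruction error subject to weights >= 0.
    """
    types = list(signatures)
    n_genes = len(spot)
    # The squared Frobenius norm bounds the gradient's Lipschitz constant,
    # so its reciprocal is a safe (if conservative) step size.
    frob_sq = sum(v * v for t in types for v in signatures[t])
    step = 1.0 / frob_sq
    weights = {t: 0.0 for t in types}
    for _ in range(n_iter):
        # Residual between reconstructed and observed expression.
        residual = [
            sum(weights[t] * signatures[t][g] for t in types) - spot[g]
            for g in range(n_genes)
        ]
        for t in types:
            grad = sum(signatures[t][g] * residual[g] for g in range(n_genes))
            # Gradient step followed by projection onto weights >= 0.
            weights[t] = max(0.0, weights[t] - step * grad)
    return weights


# Hypothetical reference signatures over four genes (illustrative values only).
signatures = {
    "beta": [10.0, 0.0, 1.0, 2.0],
    "alpha": [0.0, 8.0, 1.0, 2.0],
    "immune": [1.0, 1.0, 9.0, 2.0],
}
# Synthetic spot built as 2.0 * beta + 1.0 * alpha + 0.5 * immune.
spot = [20.5, 8.5, 7.5, 7.0]

abundances = deconvolve_spot(signatures, spot)
print(abundances)
```

Because the synthetic spot is an exact mixture of the three signatures, the recovered weights should closely match the mixing proportions used to build it. Real data would additionally require a count-based noise model and technical-effect terms, which is where a dedicated model becomes necessary.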
Given the complex and data-intensive nature of spatial transcriptomics, I intend to expand my use of artificial intelligence (AI) and machine learning (ML) models. Specifically, I plan to use transformer models such as scBERT for cell type annotation, to improve the accuracy with which distinct cell types are identified in spatially resolved transcriptomics data. I will also explore neural network architectures, including convolutional neural networks (CNNs) and U-Net-based models, to segment tissue regions and identify patterns associated with disease progression. These AI/ML approaches are particularly valuable for the automated interpretation of complex spatial datasets where standard analytical methods fall short.
To advance this AI-driven approach, I am requesting 250 GPU hours per month, which is approximately what I have typically used when training AI models on spatial transcriptomics data. With these resources, I plan to explore models such as graph neural networks (GNNs) to integrate multi-omics data with spatial context, and to apply transfer learning to refine cell type predictions across datasets. These methods will increase the resolution at which we can study pancreatic tissue organization and cellular behavior, particularly in T1D and T2D.
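As a rough illustration of what spatial context means for a GNN, the sketch below builds a k-nearest-neighbor graph over spot coordinates and applies a single untrained mean-aggregation step, the basic neighborhood operation that graph neural network layers build on. The coordinates, feature values, choice of k, and the function names `knn_graph` and `mean_aggregate` are hypothetical; an actual model would add learned weights and nonlinearities and would typically rely on a GPU-backed library.

```python
import math


def knn_graph(coords, k):
    """Return, for each spot index, the indices of its k nearest spots."""
    neighbors = []
    for i, (xi, yi) in enumerate(coords):
        # Euclidean distance to every other spot, sorted ascending.
        dists = sorted(
            (math.hypot(xi - xj, yi - yj), j)
            for j, (xj, yj) in enumerate(coords)
            if j != i
        )
        neighbors.append([j for _, j in dists[:k]])
    return neighbors


def mean_aggregate(features, neighbors):
    """Average each spot's feature vector with those of its neighbors."""
    out = []
    for i, nbrs in enumerate(neighbors):
        group = [features[i]] + [features[j] for j in nbrs]
        n_dims = len(features[i])
        out.append(
            [sum(vec[d] for vec in group) / len(group) for d in range(n_dims)]
        )
    return out


# Hypothetical spot coordinates: three clustered spots and one distant spot.
coords = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (10.0, 10.0)]
# Hypothetical two-dimensional feature vectors, one per spot.
features = [[1.0, 0.0], [3.0, 0.0], [5.0, 0.0], [0.0, 8.0]]

neighbors = knn_graph(coords, k=2)
smoothed = mean_aggregate(features, neighbors)
print(neighbors)
print(smoothed)
```

Each spot's output blends its own features with those of its spatial neighbors, so neighborhood composition influences every prediction. Training stacks of such layers over whole tissue sections is where most of the requested GPU time would be spent.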
By extending the use of AI in my research, I aim to provide new insights into how the spatial organization of cells within the pancreatic microenvironment contributes to disease progression. This work will benefit my ongoing research and lay the groundwork for generalizable AI-driven tools for spatial transcriptomics in biomedical research.
Classification: Bioinformatics, Computational Biology, AI/ML, Transcriptomics, Medicine, Biomedical Science.