The central dogma describes the flow of information from DNA to protein: “DNA is transcribed into RNA, and then RNA is translated into proteins.” In practice, however, this is an intricate process with numerous potential variations, so each gene can yield a diverse set of protein products. Here, “proteoform” denotes each possible molecular form of the protein generated from a single gene, accounting for genetic variation, alternatively spliced RNA transcripts, and post-translational modifications.
We have identified two interesting resources: first, a deep proteomics dataset and, second, an extensive long-read mRNA-seq dataset describing the abundance and variation of mRNAs for such cells.
We aim to examine whether the long-read transcripts that differ from the canonical UniProt definitions can also be detected in the deep proteomics data.
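One way to make this aim concrete is sketched below in Python. It is only an illustration under stated assumptions, not the workflow used in the project: it assumes the transcripts have already been translated into protein sequences, that digestion follows the simple trypsin rule (cleave after K or R unless followed by P, no missed cleavages), and that the proteomics data are available as a plain set of identified peptide strings. The function names and the toy sequences are hypothetical.

```python
import re


def tryptic_peptides(protein, min_len=6):
    """In silico trypsin digest: cleave after K or R, but not before P.

    Assumes no missed cleavages; peptides shorter than min_len are dropped.
    """
    pieces = re.split(r"(?<=[KR])(?!P)", protein)
    return {p for p in pieces if len(p) >= min_len}


def variant_specific_peptides(canonical, variant, min_len=6):
    """Peptides of the variant that do not occur anywhere in the canonical sequence."""
    return {p for p in tryptic_peptides(variant, min_len) if p not in canonical}


def supported_by_proteomics(canonical, variant, observed_peptides):
    """Variant-specific peptides that were also identified in the proteomics data."""
    return variant_specific_peptides(canonical, variant) & set(observed_peptides)


# Toy sequences, made up for illustration only.
canonical = "MKTAYIAKQRQISFVKSHFSR"   # stands in for a canonical UniProt sequence
variant = "MKTAYIAKGGLNDWEEKSHFSR"    # stands in for a translated long-read isoform
observed = {"TAYIAK", "GGLNDWEEK", "QISFVK"}  # stands in for identified peptides

evidence = supported_by_proteomics(canonical, variant, observed)
print(evidence)
```

Only peptides that are unique to the non-canonical isoform count as evidence here, because peptides shared with the canonical sequence cannot distinguish the two. In practice, the same idea is usually implemented by searching the spectra against a sequence database extended with the translated long-read transcripts.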