An extended catalogue of tandem alternative splice sites in human tissue transcriptomes
Tandem alternative splice sites (TASS) is a special class of alternative splicing events that are characterized by a close tandem arrangement of splice sites. Most TASS lack functional characterization and are believed to arise from splicing noise. Based on the RNA-seq data from the Genotype Tissue Expression project, we present an extended catalogue of TASS in healthy human tissues and analyze their tissue-specific expression. The expression of TASS is usually dominated by one major splice site (maSS), while the expression of minor splice sites (miSS) is at least an order of magnitude lower. Among 46k miSS with sufficient read support, 9k (20%) are significantly expressed above the expected noise level, and among them 2.5k are expressed tissue-specifically. We found significant correlations between tissue-specific expression of RNA-binding proteins (RBP), tissue-specific expres- sion of miSS, and miSS response to RBP inactivation by shRNA. In combination with RBP profiling by eCLIP, this allowed prediction of novel cases of tissue-specific splicing regula- tion including a miSS in QKI mRNA that is likely regulated by PTBP1. The analysis of human primary cell transcriptomes suggested that both tissue-specific and cell-type-specific factors contribute to the regulation of miSS expression. More than 20% of tissue-specific miSS affect structured protein regions and may adjust protein-protein interactions or modify the stability of the protein core. The significantly expressed miSS evolve under the same selection pressure as maSS, while other miSS lack signatures of evolutionary selection and conservation. Using mixture models, we estimated that not more than 15% of maSS and not more than 54% of tissue-specific miSS are noisy, while the proportion of noisy splice sites among non-significantly expressed miSS is above 63%.