Hairpin sequence and structure is associated with features of isomiR biogenesis
MiRNA isoforms (isomiRs) are single-stranded small RNAs that originate from the same pri-miRNA hairpin through cleavage by the Drosha and Dicer enzymes. Variations at the 5ʹ-end of a miRNA alter the seed region of the molecule and thus change the targetome of the miRNA. In this manuscript, we analysed the distribution of miRNA cleavage positions across 31 different cancers using miRNA sequencing data from the TCGA project. We found that the processing positions are not tissue specific and that every miRNA could be classified as exhibiting either homogeneous or heterogeneous cleavage at each of the four cleavage sites. In 42% of cases (42 out of 100 miRNAs), we observed imprecise 5ʹ-end Dicer cleavage, while this fraction was only 14% for Drosha (14 out of 99). In contrast, almost all cleavage sites at 3ʹ-ends (either Drosha or Dicer) were heterogeneous. Using only the four nucleotides surrounding a 5ʹ-end Dicer cleavage position, we built a model that distinguishes between homogeneous and heterogeneous cleavage with moderate discriminative power (ROC AUC = 0.68). Finally, we illustrated a possible application of the study by analysing two 5ʹ-end isoforms originating from the same exogenous shRNA hairpin. The less expressed shRNA variant turned out to be functionally active, which led to increased off-target effects. Thus, the obtained results could be applied to the design of shRNAs whose processing yields a single 5ʹ-variant.
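To make the ideas of homogeneous versus heterogeneous cleavage and of a four-nucleotide sequence model more concrete, the following is a minimal Python sketch, not the procedure used in the study. The 0.9 dominant-fraction threshold, the one-hot encoding, and all function names are illustrative assumptions; the abstract does not specify the classification criterion or the model type. The ROC AUC is computed with the standard pairwise (Mann-Whitney) definition.

```python
from typing import Dict, List, Sequence

NUCLEOTIDES = "ACGU"


def is_heterogeneous(counts: Dict[int, int], min_dominant_fraction: float = 0.9) -> bool:
    """Label a cleavage site from read counts per cleavage position.

    ASSUMPTION: a site is called heterogeneous when the most frequent
    position accounts for less than `min_dominant_fraction` of the reads.
    The 0.9 default is hypothetical and not taken from the study.
    """
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no reads at this cleavage site")
    return max(counts.values()) / total < min_dominant_fraction


def one_hot_context(context: str) -> List[int]:
    """Encode a nucleotide context (e.g. the 4 nt around a cleavage
    position) as a flat one-hot vector in A, C, G, U order."""
    vector: List[int] = []
    for nt in context.upper().replace("T", "U"):
        if nt not in NUCLEOTIDES:
            raise ValueError(f"unexpected nucleotide: {nt!r}")
        vector.extend(1 if nt == ref else 0 for ref in NUCLEOTIDES)
    return vector


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """ROC AUC as the probability that a random positive example scores
    higher than a random negative one (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

In such a setup, each miRNA would contribute one label from `is_heterogeneous` and one feature vector from `one_hot_context`, and any binary classifier's scores could then be evaluated with `roc_auc`.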