Bacterial genome annotations contain a number of coding sequences (CDSs) that, in spite of reading frame disruptions, encode a single continuous polypeptide. Such disruptions have different origins: sequencing errors, frameshift or stop codon mutations, and instances of nontriplet decoding. We have extracted over 1,000 CDSs with annotated disruptions and found that about 75% of them can be clustered into 64 groups based on sequence similarity. Analysis of the clusters revealed deep phylogenetic conservation of open reading frame organization as well as the presence of conserved sequence patterns that indicate likely use of nonstandard decoding mechanisms: programmed ribosomal frameshifting (PRF) and programmed transcriptional realignment (PTR). Further enrichment of these clusters with additional homologous nucleotide sequences revealed over 6,000 candidate genes utilizing PRF or PTR. Analysis of the conservation patterns apparently associated with nontriplet decoding revealed both previously characterized frameshift-prone sequences and a few novel ones. Since the starting point of our analysis was a set of genes with already annotated disruptions, it is highly plausible that this study has identified only a fraction of all bacterial genes that utilize PRF or PTR. In addition to the identification of a large number of recoded genes, a surprising observation is that nearly half of them are expressed via PTR, a mechanism that, in contrast to PRF, has not yet received substantial attention.
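How a CDS with a reading frame disruption can still encode one continuous polypeptide may be easier to see in a small Python sketch. This is an illustration only and not the pipeline used in the study: the toy sequence, the function name `translate`, and the parameters `shift_pos` and `shift` are hypothetical. A shift of +1 skips one nucleotide and a shift of -1 re-reads one, which mimics the net effect of a frameshift on the protein product without modelling whether it arises by PRF or PTR.

```python
from itertools import product

# Standard amino acid assignments (shared by bacterial translation table 11),
# listed in TCAG order for the first, second, and third codon positions.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(seq, shift_pos=None, shift=0):
    """Translate seq from index 0 until the first stop codon.

    If shift_pos is given (hypothetical parameter, must be > 0), the reading
    position moves by `shift` nucleotides once decoding reaches that index.
    """
    protein = []
    i = 0
    while i + 3 <= len(seq):
        if i == shift_pos:
            i += shift          # change the reading frame exactly once
            shift_pos = None
            continue
        amino_acid = CODON_TABLE[seq[i:i + 3]]
        if amino_acid == "*":   # stop codon ends translation
            break
        protein.append(amino_acid)
        i += 3
    return "".join(protein)


# Toy sequence (hypothetical): frame 0 runs into TGA after three codons.
TOY_CDS = "ATGAAACTTTGACTATGGTTAA"

print(translate(TOY_CDS))                       # MKL
print(translate(TOY_CDS, shift_pos=9, shift=1)) # MKLDYG
```

Without the shift, decoding stops at the in-frame TGA; with a +1 shift at index 9 the same nucleotides yield a longer, uninterrupted product.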
Adaptive introgression—the flow of adaptive genetic variation between species or populations—has attracted significant interest in recent years and it has been implicated in a number of cases of adaptation, from pesticide resistance and immunity, to local adaptation. Despite this, methods for identification of adaptive introgression from population genomic data are lacking. Here, we present Ancestry_HMM-S, a hidden Markov model-based method for identifying genes undergoing adaptive introgression and quantifying the strength of selection acting on them. Through extensive validation, we show that this method performs well on moderately sized data sets for realistic population and selection parameters. We apply Ancestry_HMM-S to a data set of an admixed Drosophila melanogaster population from South Africa and we identify 17 loci which show signatures of adaptive introgression, four of which have previously been shown to confer resistance to insecticides. Ancestry_HMM-S provides a powerful method for inferring adaptive introgression in data sets that are typically collected when studying admixed populations. This method will enable powerful insights into the genetic consequences of admixture across diverse populations. Ancestry_HMM-S can be downloaded from https://github.com/jesvedberg/Ancestry_HMM-S/.
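As a rough illustration of the hidden Markov model machinery that ancestry inference builds on, the following Python sketch computes the log-likelihood of a sequence of observed alleles with the forward algorithm. It is a generic two-state HMM and not the Ancestry_HMM-S model, whose transition and emission structure is not described here; the state meanings, all probabilities, and the function name `forward_loglik` are assumptions made for the example, and selection is not modelled at all.

```python
import math


def forward_loglik(obs, init, trans, emit):
    """Log-likelihood of a non-empty observation sequence under a discrete HMM.

    init[s]     : probability of starting in hidden state s
    trans[r][s] : probability of moving from state r to state s
    emit[s][o]  : probability of observing symbol o in state s
    """
    states = range(len(init))
    alpha = [init[s] * emit[s][obs[0]] for s in states]
    loglik = 0.0
    for symbol in obs[1:]:
        # Rescale at every step so long sequences do not underflow.
        scale = sum(alpha)
        loglik += math.log(scale)
        alpha = [a / scale for a in alpha]
        alpha = [
            sum(alpha[r] * trans[r][s] for r in states) * emit[s][symbol]
            for s in states
        ]
    return loglik + math.log(sum(alpha))


# Hypothetical setup: state 0 = ancestry from population A, state 1 = population B.
INIT = [0.5, 0.5]
TRANS = [[0.9, 0.1], [0.1, 0.9]]    # ancestry tends to persist between adjacent sites
EMIT = [[0.9, 0.1], [0.2, 0.8]]     # assumed allele frequencies in each population
ALLELES = [0, 0, 1, 1, 1, 0]        # observed alleles along a chromosome

print(forward_loglik(ALLELES, INIT, TRANS, EMIT))
```

In practice, parameters such as a selection coefficient would be chosen by comparing likelihoods of this kind across candidate values, but how Ancestry_HMM-S does so is beyond what this sketch shows.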
Magnesium chelatase chlIDH and cobalt chelatase cobNST enzymes are required for biosynthesis of (bacterio)chlorophyll and cobalamin (vitamin B12), respectively. Each enzyme consists of large, medium, and small subunits. Structural and primary sequence similarities indicate a common evolutionary origin of the corresponding subunits. It has been reported earlier that some vitamin B12-synthesizing organisms utilize an unusual cobalt chelatase enzyme consisting of a large cobalt chelatase subunit (cobN) along with the medium (chlD) and small (chlI) subunits of magnesium chelatase. In an attempt to understand the nature of this phenomenon, we analyzed >1,200 diverse genomes of cobalamin- and/or chlorophyll-producing prokaryotes. We found that, surprisingly, genomes of many cobalamin producers contained cobN and chlD genes only; a small subunit gene was absent. Furthermore, we discovered a diverse group of chlD genes with functional programmed ribosomal frameshifting signals. Given the high similarity between the small subunit and the N-terminal part of the medium subunit, we proposed that programmed translational frameshifting may allow chlD mRNA to produce both subunits. Indeed, in genomes where genes for small subunits were absent, we observed statistically significant enrichment of programmed frameshifting signals in chlD genes. Interestingly, the details of the frameshifting mechanisms producing small and medium subunits from a single chlD gene could be specific to particular prokaryotic taxa. Overall, this programmed frameshifting phenomenon was observed to be highly conserved and present in both bacteria and archaea.
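One common way to assess an enrichment of this kind is a one-sided Fisher's exact test on a 2x2 table of genomes, sketched below in Python. Which test was actually used is not stated above, so the choice of test is an assumption, and the counts in the usage example are invented purely for illustration and are not results from the study. The function name `fisher_exact_greater` is likewise hypothetical.

```python
from math import comb


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    Rows: genomes without / with a small subunit gene.
    Columns: chlD with / without a frameshifting signal.
    Returns P(X >= a) under the hypergeometric null of no association.
    """
    row1 = a + b            # genomes without a small subunit gene
    col1 = a + c            # genomes whose chlD carries a signal
    n = a + b + c + d
    k_max = min(row1, col1)
    favourable = sum(
        comb(row1, k) * comb(n - row1, col1 - k) for k in range(a, k_max + 1)
    )
    return favourable / comb(n, col1)


# Invented counts, for illustration only.
p_value = fisher_exact_greater(a=30, b=10, c=5, d=55)
print(f"one-sided p = {p_value:.3g}")
```

A small p-value here would indicate that frameshifting signals occur in chlD more often than expected by chance among genomes lacking a small subunit gene.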