Gene fusion occurs when two or more individual genes with independent open reading frames become juxtaposed under the same open reading frame, creating a new fused gene. A small number of gene fusions described in detail have been associated with novel functions, e.g. the hominid-specific PIPSL gene, TNFSF12 and the TWE-PRIL gene family. We use Sequence Similarity Networks (SSNs) and species-level comparisons of great ape genomes to identify 45 new genes that have emerged by transcriptional readthrough, i.e. transcription-derived gene fusion (TDGF). For 35 of these putative gene fusions we were able to assess available RNAseq data to determine whether there are reads that map to each breakpoint. A total of 29 of the putative gene fusions had annotated transcripts (9/29 of which are human-specific). We carried out RT-qPCR in a range of human tissues (placenta, lung, liver, brain and testes) and found that 23 of the putative gene fusion events were expressed in at least one tissue. Examining the available ribosome footprinting data, we find evidence for translation of three of the fused genes in human. Finally, we find enrichment for TDGFs in regions of known segmental duplication (SD) in human. Together, our results implicate chromosomal structural variation brought about by SD in the emergence of novel transcripts and translated protein products.
Bibliographical note: © The Author(s) 2019. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.