Specific interactions among biomolecules drive virtually all cellular functions and underlie phenotypic complexity and diversity. Biomolecules are not isolated particles, but are elements of integrated interaction networks, and play their roles through specific interactions. Simultaneous emergence or loss of multiple interacting partners is unlikely. If one of the interacting partners is lost, then what are the evolutionary consequences for the retained partner? Taking advantage of the large number of available mammalian genome sequences and knowledge of the phylogenetic relationships of the species, we examined the evolutionary fate of the motilin (MLN) hormone gene after the pseudogenization of its specific receptor, the MLN receptor (MLNR), on the rodent lineage. We speculate that the MLNR gene became a pseudogene before the divergence of the squirrel and other rodents about 75 mya. The evolutionary consequences for the MLN gene were diverse. While an intact open reading frame for the MLN gene, which appears functional, was preserved in the kangaroo rat, the MLN gene became inactivated independently on the lineages leading to the guinea pig and the common ancestor of the mouse and rat. Gain and loss of specific interactions among biomolecules through the birth and death of genes for biomolecules point to a general evolutionary dynamic: gene birth and death are widespread phenomena in genome evolution, and, at the genetic level, once mutations arise, a stepwise process of elaboration and optimization ensues, which gradually integrates and orders mutations into a coherent pattern.
Specific interactions among biomolecules – such as between receptors and their ligands, enzymes and their substrates, transcription factors and their DNA-binding sites – drive virtually all cellular functions and underlie phenotypic complexity and diversity (Carroll et al. 2008). Studies have revealed that intimate interacting partners do not always emerge simultaneously; specific interactions among biomolecules often form in a stepwise Darwinian fashion, as parts of systems do not remain static and can be co-opted for novel functions (Thornton 2001, Irwin 2005, Bridgham et al. 2006, J He, D M Irwin, Y P Zhang, unpublished observations). Gene duplication, and gain and loss of interactions through mutations in existing proteins are the two recognized major evolutionary processes shaping the specific interactions among biomolecules (Wagner 2001, 2003, Berg et al. 2004). In contrast, there is very little knowledge concerning interaction turnover due to gene losses. Gene loss and pseudogenization are widespread phenomena in genome evolution (Wang et al. 2006), and, as a dramatic genetic change, lead to the immediate loss of specific interactions. Gene loss probably affects interaction turnover as greatly as gene duplication. Gene loss may yield a selective advantage (Varki 2001, Perry et al. 2005, Wang et al. 2006). The hormone–receptor pair is often seen as a biological lock and key. When such a lock-and-key pair comes into existence, it is natural to ask how it evolved. Similar to emergence, the simultaneous loss of multiple interacting partners is also unlikely, thus, if one of the partners is lost, what are the consequences for the retained partner? 
Motilin (MLN), a 22-amino-acid peptide synthesized by endocrine cells of the duodeno-jejunal mucosa and capable of stimulating the motility of digestive organs (Poitras & Peeters 2008), has never been isolated from the mouse or rat, the most frequently used laboratory animals, despite several attempts (Vogel & Brown 1990, Huang et al. 1999). Here, using bioinformatic methods, we provide genetic evidence that partners with intimate interactions might not always co-evolve. The MLN receptor (MLNR) gene was pseudogenized before the divergence of squirrel and other rodents, an event that occurred about 75 mya (Adkins et al. 2001). After the pseudogenization of its specific receptor, the MLN gene had divergent evolutionary fates. While an intact MLN gene was preserved in the kangaroo rat, MLN was inactivated independently on the lineages leading to the guinea pig and the common ancestor of the mouse and rat. Together, gain and loss of specific interactions among biomolecules through the birth and death of genes for biomolecules point to a general evolutionary dynamic: gene birth and death are widespread phenomena in genome evolution, and once mutations occur, a stepwise process of elaboration and optimization ensues, which integrates and orders mutations into a coherent pattern.
Materials and methods
Genomic databases of Homo sapiens (human NCBI Build 36.1), Monodelphis domestica (opossum MonDom5), Bos taurus (cow Btau_4.0), Canis familiaris (dog CanFam2.0), Macaca mulatta (macaque Mmul_1), Tupaia belangeri (tree shrew tupBel1), Oryctolagus cuniculus (rabbit oryCun1), Ochotona princeps (pika ochPri1), Spermophilus tridecemlineatus (squirrel speTri1), Cavia porcellus (guinea pig cavPor1), Dipodomys ordii (kangaroo rat dipOrd1), Mus musculus (mouse mm37), and Rattus norvegicus (rat rn4), maintained by Ensembl (http://www.ensembl.org, Ensembl release 52 – December 2008), were searched for sequences similar to the known preproMLN and MLNR sequences. Accession numbers of the identified sequences are listed in the Supplementary information (Supplementary tables 1 and 2, see section on supplementary data given at the end of this article). The genomic neighborhoods of MLN and MLNR were characterized by searching the proteome (i.e. predicted and known protein) database with DNA sequences 5′ and 3′ to the MLN and MLNR genes. MultiPipMaker (Schwartz et al. 2000), which aligns multiple, long genomic DNA sequences with good sensitivity, was used to compute alignments of the genomic sequences of the species listed above. Guinea pig genomic DNA was prepared by standard methods, and exons 1, 2, and 3 of the guinea pig MLN gene were amplified by PCR with Taq DNA polymerase and sequenced in both directions on an ABI 3730 automated sequencer according to the manufacturer's protocol, with primer sequences: 5′-GAGCCCCTGTCATTGTCC-3′, 5′-TTGCTTTTCCTACCTCTTGG-3′ (exon 1), 5′-GGGAAGGGGAGCACTTCT-3′, 5′-CGCCCTCCCATCCAACCC-3′ (exon 2), and 5′-AGTGCCCTGGTGAAAACC-3′, 5′-CCTCAGACTCGCTGGAATAG-3′ (exon 3). Amplification was initiated with a long denaturing step of 94 °C for 5 min, followed by 57–60 °C for 3 min and 72 °C for 5 min, and then 30 cycles of 94 °C for 1 min, 57–60 °C for 1 min, and 72 °C for 5 min, with a final extension step of 72 °C for 10 min.
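As a rough cross-check of the 57–60 °C annealing range against the primer sequences given above, the following sketch applies the Wallace rule, a common rule of thumb for short oligonucleotides (Tm ≈ 2 °C per A/T plus 4 °C per G/C). This is an illustrative approximation only, not a calculation reported in this study; the function name and the primer labels are hypothetical, and the order of the two primers within each pair is taken from the text without assuming which is forward or reverse.

```python
def wallace_tm(primer):
    """Approximate melting temperature (deg C) by the Wallace rule:
    2 * (A + T) + 4 * (G + C). Intended only for short oligonucleotides."""
    seq = primer.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    if at + gc != len(seq):
        raise ValueError("primer contains non-ACGT characters")
    return 2 * at + 4 * gc


# Primer sequences as listed in the text (labels are hypothetical).
PRIMERS = {
    "exon1_primer_1": "GAGCCCCTGTCATTGTCC",
    "exon1_primer_2": "TTGCTTTTCCTACCTCTTGG",
    "exon2_primer_1": "GGGAAGGGGAGCACTTCT",
    "exon2_primer_2": "CGCCCTCCCATCCAACCC",
    "exon3_primer_1": "AGTGCCCTGGTGAAAACC",
    "exon3_primer_2": "CCTCAGACTCGCTGGAATAG",
}

if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        # Print primer name, length in nucleotides, and approximate Tm.
        print(name, len(seq), wallace_tm(seq))
```

The Wallace rule ignores salt concentration and nearest-neighbor effects, so the values should be read as coarse estimates rather than as the basis for the annealing temperatures actually used.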
PCR fragments were extracted and sequenced directly on an ABI 3730 automated sequencer. The potential open reading frame (ORF) for MLN was predicted using Wise2 (version 2.1.20 stable) in accordance with the genomic sequence alignments (Birney et al. 2004). Coding sequence alignments were generated by ClustalW (Thompson et al. 1994), guided by alignments of the amino acid sequences and followed by visual adjustment. Generally accepted phylogenetic relationships of the species (Murphy et al. 2001a,b, Huchon et al. 2002) were employed to conduct comparative genomics and evolutionary inferences. Synonymous and nonsynonymous substitution rates (dS and dN) were calculated by the Pamilo–Bianchi–Li method, in which the transitional/transversional substitution bias is taken into account (Li 1993, Pamilo & Bianchi 1993).
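To make the final step of the dN and dS calculation concrete, the sketch below combines per-class transition and transversion distances in the form commonly cited for the Pamilo–Bianchi–Li estimator. It assumes that the numbers of nondegenerate, twofold, and fourfold degenerate sites (L0, L2, L4) and the observed proportions of transitional (P) and transversional (Q) differences in each class have already been counted from a codon alignment. The function names and input layout are hypothetical, and the formulas should be checked against Li (1993) and Pamilo & Bianchi (1993) before any real use.

```python
import math


def kimura_components(p, q):
    """Kimura two-parameter transitional (A) and transversional (B)
    distances from observed proportions p (transitions) and q (transversions)."""
    a = 1.0 / (1.0 - 2.0 * p - q)
    b = 1.0 / (1.0 - 2.0 * q)
    A = 0.5 * math.log(a) - 0.25 * math.log(b)
    B = 0.5 * math.log(b)
    return A, B


def pbl_dn_ds(L, P, Q):
    """Return (dN, dS). L, P and Q are dicts keyed by degeneracy class
    0 (nondegenerate), 2 (twofold) and 4 (fourfold)."""
    A, B = {}, {}
    for k in (0, 2, 4):
        A[k], B[k] = kimura_components(P[k], Q[k])
    # Synonymous: transitions at twofold and fourfold sites, weighted by
    # site counts, plus transversions at fourfold sites.
    dS = (L[2] * A[2] + L[4] * A[4]) / (L[2] + L[4]) + B[4]
    # Nonsynonymous: transitions at nondegenerate sites plus weighted
    # transversions at nondegenerate and twofold sites.
    dN = A[0] + (L[0] * B[0] + L[2] * B[2]) / (L[0] + L[2])
    return dN, dS
```

The weighting by site counts is what distinguishes this estimator from simply averaging the class-specific distances; it is this step that accounts for the transition/transversion bias mentioned above.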
Results and discussion
Conservation of genomic context and disruption of the MLN and MLNR genes in mouse and rat
Despite several attempts, previous investigators have failed to amplify an MLN gene transcript by reverse transcriptase (RT)-PCR in either mouse or rat, and it has been shown that exogenous MLN has no physiological or pharmacological effects in either of these species (Aerssens et al. 2004, Peeters et al. 2004). Our genomic searches identified single copies of MLN and MLNR gene-like sequences in the mouse and rat genomes (Fig. 1). We used comparative genomic analysis to confirm that these MLN- and MLNR-like sequences were orthologous (i.e. same gene) to the characterized human MLN and MLNR genes. When the genomic context, i.e. gene content and gene order, was examined, orthologs of the genes 5′ and 3′ to the human MLN and MLNR genes could be identified among the rodent genes flanking the MLN- and MLNR-like sequences, suggesting that these genomic regions are largely conserved (Fig. 1). Thus, this analysis allowed us to identify vestiges of the MLN and MLNR genes, which are in an appropriate genomic context. In both mouse and rat, sequences with high similarity to exons 1, 2, and 3 of the human MLN gene were identified, but the genomic sequence appears to encode a pseudogene, as several frameshift indels prevent the prediction of an intact mature preproMLN (Fig. 2A). Using a similar approach, we identified degenerated copies of the mouse and rat MLNR genes, from which intact ORFs cannot be predicted (Fig. 1B). Our genomic analysis provides evidence that the genomes of both the mouse and the rat contain only nonfunctional remnants of the MLN and MLNR genes; the mouse and rat are thus natural MLN and MLNR gene knockouts (Peeters 2005). It also demonstrates that the mouse and rat MLN and MLNR genes were lost through independent mutations in existing genes and not through a disruptive chromosomal rearrangement deleting the genes, which potentially could have removed both genes in a single event.
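The kind of evidence used here to call a sequence a pseudogene (frameshift indels and premature stop codons that prevent prediction of an intact ORF) can be illustrated with a minimal check on a candidate coding sequence. This is a toy sketch under simple assumptions (a spliced coding sequence using the standard stop codons) and is not a substitute for the Wise2-based prediction used in this study; the function name and the example sequences are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def orf_problems(cds):
    """Return a list of reasons why a coding sequence does not form an
    intact ORF; an empty list means no problem was detected."""
    cds = cds.upper().replace("-", "")  # drop alignment gaps, if any
    problems = []
    if len(cds) % 3 != 0:
        problems.append("length not a multiple of 3 (possible frameshift)")
    if not cds.startswith("ATG"):
        problems.append("no ATG start codon")
    # Split into complete codons, ignoring any trailing partial codon.
    usable = len(cds) - len(cds) % 3
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    # Any stop codon before the last codon truncates the product.
    for index, codon in enumerate(codons[:-1]):
        if codon in STOP_CODONS:
            problems.append(f"premature stop codon at codon {index + 1}")
            break
    if codons and codons[-1] not in STOP_CODONS:
        problems.append("no terminal stop codon")
    return problems


if __name__ == "__main__":
    print(orf_problems("ATGGCCTAA"))     # toy intact ORF
    print(orf_problems("ATGGCTAA"))      # same ORF with a 1 bp deletion
    print(orf_problems("ATGTAAGCCTAA"))  # premature stop codon
```

A single-base insertion or deletion, such as those described for the mouse, rat, and guinea pig MLN sequences, changes the length modulo 3 and usually removes the in-frame terminal stop, which is why such a check flags both problems at once.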
Evolutionary histories of the MLN and its specific receptor gene in rodents
To better understand how the mouse and rat lost their MLN and MLNR genes, we examined the genomes from five rodent species (mouse, rat, kangaroo rat, guinea pig, and squirrel), as well as six other mammals (human, dog, cow, pika, rabbit, and tree shrew). MLN and MLNR genomic regions were identified in four and five rodents respectively and in all of the other mammalian species (Fig. 3). A MLN genomic region was not identified in the squirrel genome, most likely due to the incomplete draft nature of this genome sequence. In the squirrel, the gene that should be 5′ to MLN is located at the 3′ end of scaffold_4745, while the 3′ flanking gene was at the 5′ end of scaffold_810, thus it is possible that MLN (or remnants of it) could exist in the gap between these two scaffolds. A genomic sequence alignment of the MLN and MLNR genes generated with MultiPipMaker that used pairwise alignments between the cow sequence and the orthologous sequences from all the other species is shown in Fig. 3. Similar alignments were generated when genomic sequences of other mammals were used as the common sequence for the pairwise comparisons. The graphical percent identity plot shows that the rodent MLN and MLNR genomic sequences have diverged considerably from those of other mammals (Fig. 3). The rodent MLNR gene regions have been disrupted and are largely degenerated, thus making it difficult to produce credible alignments of the genomic sequences. Large sequence gaps and mutations, which generate frameshifts and/or premature stop codons, disrupt the MLNR ORFs in all of the rodent species examined (squirrel, guinea pig, kangaroo rat, mouse, and rat; data not shown) and explain why previous attempts to isolate the MLNR gene from rodents have failed (Peeters 2005, Xu et al. 2005). The poor conservation of the MLNR sequences thus did not allow the identification of a single mutation that is shared by all rodents, the mutation that would be the candidate for the first inactivation mutation. 
All rodent MLNR gene sequences fail to predict functional products, and the large number of changes that have occurred in the genomic sequences on all of the rodent lineages suggests that this gene was inactivated prior to the initial divergence of rodents more than 75 million years ago (Adkins et al. 2001).
The MLN genomic sequences show better conservation among mammals (Fig. 3A). As shown in Fig. 2A, not all of the Glires MLN genes are disrupted. While the genomic sequence does not encode a complete rabbit MLN gene, a cDNA sequence for rabbit MLN had been previously isolated and encodes an intact ORF, and the physiological and pharmacological effects of MLN have been demonstrated (Banfield et al. 1992, Van Assche et al. 1997). The pika genomic sequence predicts an intact ORF. In the guinea pig, a functional MLN precursor cDNA was previously identified, and its expression pattern, including expression in the duodenal mucosa, characterized (Xu et al. 2001), yet the Ensembl guinea pig genome sequence predicted a nonfunctional gene sequence. Exons 1, 2, and 3 of the guinea pig MLN gene can be unambiguously predicted from the genomic sequence, which contains a single base deletion in exon 1 (G at position 70) and a single base insertion at the 3′ end of exon 3 (Fig. 2B). The G at position 70 is conserved in all of the studied species that have intact MLN genes, including the previously identified guinea pig MLN cDNA (Fig. 2A). To determine whether these two frameshifts are assembly errors generated during genome sequencing, we resequenced the guinea pig MLN gene. Our genomic sequence was in general agreement with the Ensembl genome sequence and includes both of the frameshift mutations, although it did have a few other base differences from both the cDNA and genomic sequences (Fig. 2B). Further analysis of the guinea pig MLN genomic and cDNA sequences shows that, in addition to the two frameshift mutations, there are also many other differences among the sequences, many of which cause nonsynonymous changes. Multiple studies have suggested that the guinea pig originated in the Andes and is the domesticated descendant of several closely related species of cavy such as Cavia aperea, Cavia fulgida, or Cavia tschudii (Weir 1974, Nowak 1999).
Thus, it is possible that the cDNA and genomic sequences of MLN represent highly divergent alleles within the guinea pig population.
The kangaroo rat has an intact MLN gene (Fig. 2A), with a predicted protein-coding transcript (Ensembl ID ENSDORT00000001465), which has all the hallmarks of being a functional gene. The kangaroo rat is phylogenetically more closely related to mouse and rat than it is to guinea pig (Huchon et al. 2002), indicating that a functional MLN gene should have existed in the common ancestor of guinea pigs, kangaroo rats, mice, and rats. Thus, irrespective of whether a functional MLN exists in some guinea pigs, the inactivation of the guinea pig MLN gene (or allele) must be independent of the mutation that inactivated the rat and mouse MLN gene. The genomic sequences have allowed us to conclude that, within rodents, the MLN gene was inactivated at least twice. When the mouse and rat MLN genomic sequences were compared, it was found that they share a 1 bp insertion at the 3′ end of exon 3, while none of the other mutations that disrupt the reading frame are shared by the two species (Fig. 2A). Since the single base insertion would generate a frameshift altering the coding potential, and is shared by mouse and rat, it is likely to be the ancestral mutation that first inactivated the MLN gene on the mouse and rat common ancestral lineage.
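The inference that MLN was inactivated at least twice can be reproduced with a small counting exercise. The sketch below assumes that gene loss is irreversible (a pseudogene is not restored to function) and counts the minimum number of independent losses needed to explain the observed states on a tree. The topology is a simplified reading of the relationships described above, with the rabbit added as an outgroup carrying an intact MLN and the squirrel omitted because its MLN status is unknown; the function name and data layout are hypothetical.

```python
def count_losses(tree, intact):
    """Return (clade_has_intact_gene, minimum_number_of_losses).

    tree   : nested tuples of species names, e.g. ("a", ("b", "c"))
    intact : dict mapping species name to True (intact) or False (pseudogene)
    Assumes losses are irreversible, so an ancestor is intact whenever
    any of its descendants is intact.
    """
    def visit(node):
        if isinstance(node, str):  # leaf
            return intact[node], 0
        results = [visit(child) for child in node]
        if not any(has for has, _ in results):
            # Whole clade lacks the gene: one loss, counted by the parent.
            return False, 0
        losses = sum(n for _, n in results)
        # Each child clade with no intact copy implies one loss event.
        losses += sum(1 for has, _ in results if not has)
        return True, losses

    return visit(tree)


# Simplified topology and MLN states as described in the text.
TREE = ("rabbit", ("guinea_pig", ("kangaroo_rat", ("mouse", "rat"))))
MLN_INTACT = {
    "rabbit": True,
    "guinea_pig": False,
    "kangaroo_rat": True,
    "mouse": False,
    "rat": False,
}

if __name__ == "__main__":
    has_gene, losses = count_losses(TREE, MLN_INTACT)
    print(has_gene, losses)
```

Because the kangaroo rat retains an intact gene and sits between the guinea pig and the mouse and rat lineage, no single loss event can account for both pseudogenes under this assumption.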
Based on the parsimony principle, our genomic analysis, including the extreme lack of conservation of MLNR coding sequences in all rodents, suggests that the MLNR gene became a pseudogene first, more than 75 million years ago (before the divergence of the squirrel and other rodents (Adkins et al. 2001)), and has become badly degenerated in all of the rodent species studied (Fig. 3B). In contrast, the evolutionary history of the MLN gene is not as simple: it was not lost immediately after the loss of its specific receptor, but instead shows divergent evolutionary fates. Intact ORFs for the MLN gene and its specific receptor gene (MLNR) have been preserved in Lagomorpha, and both show substantial evolutionary constraints upon their sequences (in all pairwise comparisons dN<dS, and mean dN is significantly <dS). While the MLNR is a pseudogene in the squirrel, the fate of the MLN gene is unknown, as we were unable to identify this gene sequence in the low-coverage (1.90X) squirrel genome assembly; this area needs further investigation. Genomic data, both from Ensembl and our own data, suggest that the MLN gene in the guinea pig is a pseudogene. The kangaroo rat MLN gene, though, potentially could be expressed and displays an evolutionary rate similar to that of other mammalian species (Tables 1 and 2), which suggests that it is under constraints similar to those seen for other mammalian MLN genes, yet this species does not have a functional MLNR gene. The MLN gene was inactivated on the common ancestral lineage leading to mouse and rat, independent of the pseudogenization in the guinea pig.
As we could not identify a common mutation that could explain the pseudogenization of all of the MLNR genes, we cannot exclude the possibility that the MLNR gene was inactivated independently on several rodent lineages. It is therefore possible that the kangaroo rat lost its functional MLNR gene very recently, which would explain why this species has retained an intact MLN ORF. However, this possibility seems unlikely, as the kangaroo rat MLNR coding sequences contain a large number of mutations that disrupt potential function (e.g. lack of similarity in exon 1 sequence, Fig. 3B), suggesting that it was not recently inactivated. Thus, the reason for the conservation of kangaroo rat MLN is unclear. Intriguingly, studies have suggested that, after the breakdown of the MLN signaling pathway, the ghrelin signaling pathway was recruited to compensate for this loss (Dass et al. 2003, Depoortere et al. 2003). It would be of great interest to dissect how this recruitment happened.
Table 1 Mean of nonsynonymous (dN) and synonymous (dS) substitutions per site of the motilin (MLN) gene in different evolutionary lineages
|Comparison||dN||dS||dN/dS|
|Guinea pig mRNA–kangaroo rat||0.303±0.045||0.474±0.103||0.64|
Table 2 Mean of pairwise dN and dS distances of rabbit, pika, guinea pig, kangaroo rat, and pig motilin (MLN) with other species MLN (Supplementary Table 1, see section on supplementary data given at the end of this article)
|Comparison||dN||dS|
|Guinea pig–other species||0.264±0.021||0.532±0.073|
|Kangaroo rat–other species||0.196±0.021||0.455±0.074|
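The constraint argument rests on dN being smaller than dS, i.e. a dN/dS ratio below 1. As a simple arithmetic illustration, the sketch below computes the ratio from the mean values in the surviving table rows above. It uses the ratio of the reported means, which need not equal the mean of the pairwise ratios, and it ignores the standard errors; the variable and function names are hypothetical.

```python
# Mean dN and dS values as reported in the table rows above.
MEANS = {
    "Guinea pig mRNA-kangaroo rat": (0.303, 0.474),
    "Guinea pig-other species": (0.264, 0.532),
    "Kangaroo rat-other species": (0.196, 0.455),
}


def dn_ds_ratio(dn, ds):
    """Ratio of nonsynonymous to synonymous substitutions per site."""
    if ds <= 0:
        raise ValueError("dS must be positive")
    return dn / ds


if __name__ == "__main__":
    for comparison, (dn, ds) in MEANS.items():
        # A ratio below 1 is consistent with purifying selection.
        print(f"{comparison}: dN/dS = {dn_ds_ratio(dn, ds):.2f}")
```

For the guinea pig mRNA and kangaroo rat comparison, this reproduces the ratio of 0.64 given in the table.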
Knowledge of how elements that constitute the MLN-signaling pathway were lost in rodents not only helps in choosing species in which to examine their function, but also enhances our understanding of how evolutionary processes assemble and disassemble complex systems that depend on specific interactions among their parts. At birth, gene duplication produced MLN and its specific receptor in a stepwise fashion, giving rise to the MLN–MLNR-signaling pathway (J He, D M Irwin, Y P Zhang, unpublished observations). Similarly, the loss of the MLN–MLNR-signaling pathway also followed a stepwise process. Biomolecules are not isolated particles, but are elements of integrated interaction networks, and play their roles through specific interactions. The simultaneous emergence or loss of multiple interacting partners is unlikely. If one of the interacting partners is lost, then what are the evolutionary consequences for the retained partner? One possibility is that the retained partner may subsequently be lost, as demonstrated independently for MLN in the mouse and rat and in the guinea pig. Alternatively, the retained partner may serve as raw material in evolution and become recruited into a new interaction, as suggested for MLN in the kangaroo rat. The MLN coding region has been well conserved in the guinea pig (even though there is evidence that it has now become a pseudogene) and kangaroo rat. Clearly, in the kangaroo rat, the MLN sequence is intact, potentially could be expressed (i.e. has an intact promoter region, Fig. 3A, and data not shown), and is under sustained evolutionary constraints (Tables 1 and 2), suggesting that it is being preserved by functional constraint for a function other than traditional MLN signaling. The expression and physiological role of MLN in the kangaroo rat need further investigation. Here, we demonstrate that some deleterious mutations (e.g. gene loss) can be tolerated and become fixed in a population during evolution.
At the genetic level (dynamic interaction network), life evolves by a stepwise process of elaboration and optimization under natural selection. Natural selection does not act merely as a sieve eliminating detrimental mutations and favoring the reproduction of beneficial ones. In the long run, evolution integrates and orders mutations into adaptively coherent patterns adjusted over millions of years (Jacob 1977).
This is linked to the online version of the paper at http://dx.doi.org/10.1677/JME-09-0095.
Declaration of interest
The authors declare that there is no conflict of interest that could be perceived as prejudicing the impartiality of the research reported.
This work was supported by grants from the National Basic Research Program of China (973 Program, 2007CB411600), the National Natural Science Foundation of China (30621092, 30623007), and Bureau of Science and Technology of Yunnan Province.
We thank anonymous reviewers for helpful comments.
Adkins RM, Gelke EL, Rowe D & Honeycutt RL 2001 Molecular phylogeny and divergence time estimates for major rodent groups: evidence from multiple genes. Molecular Biology and Evolution 18 777–791.
Aerssens J, Depoortere I, Thielemans L, Mitselos A, Coulie B & Peeters TL 2004 The rat lacks functional genes for motilin and for the motilin receptor. Neurogastroenterology and Motility 16 841.
Banfield DK, MacGillivray RT, Brown JC & McIntosh CH 1992 The isolation and characterization of rabbit motilin precursor cDNA. Biochimica et Biophysica Acta 1131 341–344.
Berg J, Lässig M & Wagner A 2004 Structure and evolution of protein interaction networks: a statistical model for link dynamics and gene duplications. BMC Evolutionary Biology 4 51.
Carroll SM, Bridgham JT & Thornton JW 2008 Evolution of hormone signaling in elasmobranchs by exploitation of promiscuous receptors. Molecular Biology and Evolution 25 2643–2652.
Dass NB, Munonyara M, Bassil AK, Hervieu GJ, Osbourne S, Corcoran S, Morgan M & Sanger GJ 2003 Growth hormone secretagogue receptors in rat and human gastrointestinal tract and the effects of ghrelin. Neuroscience 120 443–453.
Depoortere I, De Winter B, Thijs T, De Man J, Pelckman P & Peeters T 2003 Comparison of the prokinetic effects of ghrelin, GHRP-6 and motilin in rats in vivo and in vitro. Gastroenterology 124 580.
Huang Z, Depoortere I, De Clercq P & Peeters T 1999 Sequence and characterization of cDNA encoding the motilin precursor from chicken, dog, cow and horse. Evidence of mosaic evolution in prepromotilin. Gene 240 217–226.
Huchon D, Madsen O, Sibbald MJ, Ament K, Stanhope MJ, Catzeflis F, de Jong WW & Douzery EJ 2002 Rodent phylogeny and a timescale for the evolution of glires: evidence from an extensive taxon sampling using three nuclear genes. Molecular Biology and Evolution 19 1053–1065.
Murphy WJ, Eizirik E, Johnson WE, Zhang YP, Ryder OA & O'Brien SJ 2001a Molecular phylogenetics and the origins of placental mammals. Nature 409 614–618.
Murphy WJ, Eizirik E, O'Brien SJ, Madsen O, Scally M, Douady CJ, Teeling E, Ryder OA, Stanhope MJ & de Jong WW 2001b Resolution of the early placental mammal radiation using Bayesian phylogenetics. Science 294 2348–2351.
Nowak RM 1999 Walker's Mammals of the World edn 6. Baltimore MD: Johns Hopkins University Press.
Peeters TL, Aerssens J, De Smet B, Mitselos A, Thielemans L, Coulie B & Depoortere I 2004 The mouse is a natural knock-out for motilin and for the motilin receptor. Functionally they have been replaced by ghrelin. Neurogastroenterology and Motility 16 687.
Perry GH, Verrelli BC & Stone AC 2005 Comparative analyses reveal a complex history of molecular evolution for human MYH16. Molecular Biology and Evolution 22 379–382.
Schwartz S, Zhang Z, Frazer KA, Smit A, Riemer C, Bouck J, Gibbs R, Hardison R & Miller W 2000 PipMaker – a web server for aligning two genomic DNA sequences. Genome Research 10 577–586.
Thompson JD, Higgins DG & Gibson TJ 1994 ClustalW: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position specific gap penalties and weight matrix choice. Nucleic Acids Research 22 4673–4680.
Thornton JW 2001 Evolution of vertebrate steroid receptors from an ancestral estrogen receptor by ligand exploitation and serial genome expansions. PNAS 98 5671–5676.
Van Assche G, Depoortere I, Thijs T, Janssens JJ & Peeters TL 1997 Concentration-dependent stimulation of cholinergic motor nerves or smooth muscle by [Nle13]motilin in the isolated rabbit gastric antrum. European Journal of Pharmacology 337 267–274.
Varki A 2001 Loss of N-glycolylneuraminic acid in humans: mechanisms, consequences, and implications for hominid evolution. American Journal of Physical Anthropology 33 54–69.
Wagner A 2001 The yeast protein interaction network evolves rapidly and contains few redundant duplicate genes. Molecular Biology and Evolution 18 1283–1292.
Xu L, Depoortere I, Tomasetto C, Zandecki M, Tang M, Timmermans JP & Peeters TL 2005 Evidence for the presence of motilin, ghrelin, and the motilin and ghrelin receptor in neurons of the myenteric plexus. Regulatory Peptides 124 119–125.