Monday, August 24, 2015

Mammals genome - platypus


1) This monotreme exhibits a fascinating combination of reptilian and mammalian characters. For example, platypuses have a coat of fur adapted to an aquatic
lifestyle; platypus females lactate, yet lay eggs; and males are equipped with venom similar to that of reptiles. 

2) Analysis of the
first monotreme genome aligned these features with genetic innovations. We find that reptile and platypus venom proteins
have been co-opted independently from the same gene families; milk protein genes are conserved despite platypuses laying
eggs; and immune gene family expansions are directly related to platypus biology. 
3. Expansions of protein, non-protein-coding
RNA and microRNA families, as well as repeat elements, are identified. 



4) The platypus genome, as well as the animal, is an amalgam of
ancestral reptilian and derived mammalian characteristics. The
platypus karyotype comprises 52 chromosomes in both sexes14,15,
with a few large and many small chromosomes, reminiscent of reptilian
macro- and microchromosomes. Platypuses have multiple sex
chromosomes with some homology to the bird Z chromosome16.
Males have five X and five Y chromosomes, which form a chain at
meiosis and segregate into 5X and 5Y sperm17,18. 


5) Non-protein-coding genes
In general, the platypus genome contains fewer computationally predicted
non-protein-coding (nc)RNAs (1,220 cases, excluding highly
repetitive small nucleolar RNA (snoRNA) copies; see below) than
do other mammalian species (for example, human with 4,421 Rfam
hits), similar to observations in chicken19 (655 Rfam-based ncRNAs).
This is probably because of the extensive retrotransposition of
ncRNAs in therian mammals and the apparent lack of L1-mediated
retrotransposition in chicken and platypus. The exception to this is
the platypus family of snoRNAs, which is markedly expanded
(~2,000 matches to the Rfam covariance models) compared to that
for therian mammals (~200). snoRNAs are involved in RNA modifications,
in particular of ribosomal RNA, and are often located in
introns of protein-coding genes22. 

6. Our investigations revealed a
novel short-interspersed-element (SINE)-like, snoRNA-related
retrotransposon, which we have labelled snoRTE, that has duplicated
in platypus to ~40,000 full-length or truncated copies. It is
retrotransposed by means of retrotransposon-like non-LTR (long
terminal repeat) transposable elements (RTE) as opposed to the
L1-mediated transposition mechanism in therians23. 

7) We constructed
a complementary DNA library of small ncRNAs and identified 371
consensus sequences of small RNAs that included 166 snoRNAs23
(Supplementary Table 3). Ninety-nine of these cloned snoRNAs
are found in paralogous families, and 21 of them belong to the
snoRTE class. The presence of both the structural requirements
known to be important in snoRNA function24 and evidence of their
expression are consistent with these snoRTE elements being functional
in the platypus. Similar to other unrelated ncRNAs that have
proliferated in therian mammals (for example, 7SL RNA-derived
primate Alu elements, tRNA-derived rodent identifier (ID) elements),
this recent SINE-like expansion is probably due to chance
events. However, given the RNA modification activity of snoRNAs,
and our increasing awareness of the cellular importance of RNA
molecules, it might be that some of the retrotranspositionally duplicated
RNAs were exapted into new functions in this species.
Other small RNAs. Overall, we found commonalities with small
RNA (sRNA) pathways of other mammals, but also features that
are unique to monotremes. Components of the RNA interference
machinery are conserved in platypus, including elements of biogenesis
pathways (Dicer and Drosha) and RNA-interference effector
complexes (argonaute proteins; Supplementary Table 4). Of
20,924,799 platypus and echidna sRNA reads derived from liver,
kidney, brain, lung, heart and testis, 67% could be assigned to
known microRNA (miRNA) families. Established patterns of
miRNA expression were generally recapitulated in monotremes.
To determine the conservation patterns of miRNAs in platypus, we
identified platypus miRNAs sharing at least 16-nucleotide identity with
miRNAs in eutherian mammals (mouse/human) and chicken.
Although most conserved miRNAs were identified across these vertebrate
lineages (137 miRNAs), 10 miRNAs were shared only with eutherians
(mouse/human) and 4 only with chicken (Fig. 2a). miRNAs can
be classified into families based on identity of the functional ‘seed’
region at position 2–8 of the mature miRNA strand. We identified
miRNA families that were shared between platypus and eutherians
but not chicken (40 families), or between platypus and chicken but
not eutherians (8 families), suggesting that for some miRNAs only
the seed region may have been selectively conserved (Fig. 2a).
Conserved miRNAs tended to be more robustly expressed in the
platypus tissues analysed than lineage-restricted miRNAs (Fig. 2b).
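
The seed-based grouping described above can be sketched in a few lines of Python. This is only an illustration of the idea (take positions 2-8 of the mature strand and group miRNAs that share them), not the method used in the paper. The function names and the three toy sequences are made up for the example and are not real platypus miRNAs.

```python
from collections import defaultdict


def seed(mature_mirna):
    """Return nucleotides 2-8 (1-based) of a mature miRNA sequence."""
    rna = mature_mirna.upper().replace("T", "U")
    return rna[1:8]  # 0-based slice 1..7 covers 1-based positions 2..8


def group_by_seed(mirnas):
    """Group miRNA names into families that share an identical seed."""
    families = defaultdict(list)
    for name, sequence in mirnas.items():
        families[seed(sequence)].append(name)
    return {s: sorted(names) for s, names in families.items()}


# Hypothetical toy sequences, invented for illustration only.
toy_mirnas = {
    "toy-1": "UACGGAUCCAGUUGCAUAGCUA",
    "toy-2": "UACGGAUCGGAUUCCAUAGGCU",
    "toy-3": "AUUGCACGGUAUCCAUCUGUAA",
}

families = group_by_seed(toy_mirnas)
for seed_sequence, members in families.items():
    print(seed_sequence, members)
```

Here toy-1 and toy-2 differ outside the seed but land in the same family, which is the situation the text describes when only the seed region appears to have been conserved.
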
To identify miRNAs unique to monotremes we used a heuristic
search that identifies miRNA candidates in deep-sequencing data
sets25. This method predicted 183 novel miRNAs in platypus and
echidna (Fig. 2a). Notably, 92 of these lay in 9 large clusters, on
platypus chromosome X1 and contigs 1754, 7160, 7359, 8388,
11344, 22847, 198872 and 191065. Physical mapping confirmed that
at least five of these contigs are linked to the long arm of chromosome
X1 (ref. 25). These abundantly expressed clusters were sequenced
almost exclusively from platypus and echidna testis (Fig. 2b). The
expansion of this unique miRNA class and its expression domain
suggest possible roles in monotreme reproductive biology25.
Piwi-interacting RNAs (piRNAs) associate with a germline-expressed
clade of argonaute proteins, known as Piwis26, and have
a role in transposon silencing and genome methylation26. Monotreme
piRNAs bear strong structural similarity to those in eutherians.
They are ~29 nucleotides in length and arise from large testis-specific
genomic clusters with distinct genomic strand asymmetry, often with
a typical ‘bidirectional’ organization. We identified 50 major platypus
piRNA clusters as well as numerous smaller clusters25. In contrast
to piRNAs in mouse, platypus piRNAs are repeat-rich and bear
strong signatures of active transposon defence.

8) Gene evolution
Overall this resulted in 18,527
protein-coding genes being predicted from the current platypus
assembly. 
As expected, the majority of platypus genes (82%; 15,312 out of
18,596) have orthologues in these five other amniotes (Supplementary
Table 5). The remaining ‘orphan’ genes are expected to primarily
reflect rapidly evolving genes, for which no other homologues are
discernible, erroneous predictions, and true lineage-specific genes
that have been lost in each of the other five species under consideration.
Simple 1:1 orthologues, which have been conserved without
duplication, deletion or non-functionalization across the five mammalian
species, were greatly enriched in housekeeping functions,
such as metabolism, DNA replication and mRNA splicing.


9. Chemoreception. The semi-aquatic platypus was expected to sense
its terrestrial, but not aquatic, environment by detecting airborne
odorants using olfactory receptors and vomeronasal receptors (types
1 and 2: V1Rs, V2Rs). Nevertheless large numbers of odorant receptor,
V1R and V2R homologues (approximately 700, 950 and 80,
respectively) are apparent in the platypus genome assembly, although
for each family only a minority lack frame disruptions (approximately
333, 270 and 15, respectively)34. The large expansion
of the platypus V1R gene family might reflect sensory adaptations
for pheromonal communication or, more generally, for the detection
of water-soluble, non-volatile odorants, during underwater
foraging.
The platypus odorant receptor gene repertoire is roughly one-half
as large as those in other mammals37. Nevertheless, platypus odorant
receptors fall into class, family and subfamily structures that are well
represented from across the mammals, with a few notable exceptions
such as family 14 (Fig. 3a). Together with the finding that lizard
contains only ~200 odorant receptor genes and pseudogenes, this
indicates that the platypus olfactory repertoire is, as expected, more
akin to other mammals than it is to sauropsids.

10. Eggs. Fertilization in the platypus exhibits both sauropsid and therian
characteristics. Platypus ova are small (4 mm diameter) relative
to comparably sized reptiles and birds, and eggs hatch at an early
stage of development so that most growth of the embryo and infant is
dependent on lactation, as in marsupials. Like all mammals and
many other amniotes, when fertilization occurs the ovum is invested
with a zona pellucida. The platypus genome encodes each of the four
proteins of the human zona pellucida38, as well as two ZPAX genes
(Table 1) that previously were observed only in birds, amphibians
and fish. The aspartyl-protease nothepsin is present in platypus, but
has been lost from marsupial and eutherian genomes (Table 1). In
zebrafish, this gene is specifically expressed in the liver of females
under the action of oestrogens, and accumulates in the ovary39.
These are the same characteristics as of the vitellogenins, indicating
that nothepsin may be involved in processing vitellogenin or other
egg-yolk proteins. We find that platypus has retained a single vitellogenin
gene and pseudogene, whereas sauropsids such as chicken
have three and the viviparous marsupials and eutherians have none.
Spermatozoa. Orthologues of many of the eutherian sperm membrane
proteins related to fertilization40 are present in platypus (and
marsupial) genomes. These include the genes for a number of putative
zona pellucida receptors and proteins implicated in sperm–
oolemma fusion. Testis-specific proteases, which in eutherians participate
in degradation of the zona pellucida during fertilization, are
all absent from the platypus genome assembly.
Monotreme spermatozoa undergo some post-testicular maturational
changes, including the acquisition of progressive motility, loss
of cytoplasmic droplets and aggregation of single spermatozoa into
bundles during passage through the epididymis11. Nevertheless,
maturational changes in the sperm surface that are both unique
and essential in other mammals for fertilization of the ovum have
yet to be identified. Also, the epididymis of monotremes is not highly
adapted for sperm storage as in most marsupial and eutherian mammals.
Consistent with these findings is the absence of platypus genes
for the epididymal-specific proteins that have been implicated in
sperm maturation and storage in other mammals. The most abundant
secreted protein in the platypus epididymis is a lipocalin, the
homologues of which are the most secreted proteins in the reptilian
epididymis41. Notably, ADAM7, a protease that is secreted in the
epididymis of eutherians, has an orthologue in the platypus. This is
a bona fide protease with a characteristic Zn2+-coordinating
sequence HExxH in the platypus, in the opossum and the tree shrew
(Tupaia belangeri). However, loss of its proteolytic activity is predicted
in eutherians42 owing to a single point mutation within its
active site (E to Q).
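
As a rough illustration of the HExxH point above, here is a small Python sketch that looks for the motif in a protein sequence and shows how a single E to Q substitution removes it. The two peptide fragments are hypothetical and are not taken from any real ADAM7 sequence. Finding the motif is not in itself evidence of proteolytic activity.

```python
import re

# H, E, any two residues, H: the zinc-coordinating motif named in the text.
HEXXH = re.compile(r"HE..H")


def zinc_motif_positions(protein):
    """Return 1-based start positions of non-overlapping HExxH matches."""
    return [match.start() + 1 for match in HEXXH.finditer(protein.upper())]


# Hypothetical fragments, invented for illustration only.
intact_fragment = "AAGTHELGHNLGM"   # contains HELGH
mutated_fragment = "AAGTHQLGHNLGM"  # same fragment with E replaced by Q

print(zinc_motif_positions(intact_fragment))
print(zinc_motif_positions(mutated_fragment))
```

The intact fragment reports one motif and the mutated one reports none, mirroring the predicted loss of the active site in eutherians.
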
11. Lactation and dentition. Lactation is an ancient reproductive trait
whose origin predates the origin of mammals. It has been proposed
that early lactation evolved as a water source to protect porous
parchment-shelled eggs from desiccation during incubation43 or as
a protection against microbial infection. Parchment-shelled egg-laying
monotremes also exhibit a more ancestral glandular mammary
patch or areola without a nipple that may still possess roles in egg
protection. However, in common with all mammals, the milk of
monotremes has evolved beyond primitive egg protection into a true
milk that is a rich secretion containing sugars, lipids and milk proteins
with nutritional, anti-microbial and bioactive functions. In a
reflection of this eutherian similarity, platypus casein genes are tightly
clustered together in the genome, as they are in other mammals,
although platypus contains a recently duplicated β-casein gene
(Supplementary Fig. 2).
12. Mammalian casein genes are thought to have originally arisen by
duplication of either enamelin or ameloblastin44, both of which are
tooth enamel matrix protein genes that are located adjacent to the
casein gene cluster in eutherians and, we find, also in platypus. Adult
platypuses, as well as echidnas, lack teeth but the conservation of
these enamel protein genes is consistent with the presence of teeth
and enamel in the juvenile, as well as the fossil platypuses45.
Venom. Only a handful of mammals are venomous, but the male
platypus is unique among them in delivering its poison not via a bite
but from hind-leg spurs. Despite the obvious difficulties in obtaining
samples, it is now known that platypus venom is a cocktail of at least
19 different substances46 including defensin-like peptides (vDLPs),
C-type natriuretic peptide (vCNP) and nerve growth factor (vNGF).
When analysed phylogenetically and mapped to the platypus genome
assembly, these sequences are revealed to have arisen from local
duplications of genes possessing very different functions (Fig. 4).
Notably, duplications in each of the β-defensin, C-type natriuretic
peptide and nerve growth factor gene families have also occurred
independently in reptiles during the evolution of their venom47.
Convergent evolution has thus clearly occurred during the independent
evolution of reptilian and monotreme venom48.
13. Immunity. Although the major organs of the monotreme immune
system are similar to those of other mammals49, the repertoire of
immunity molecules shows some important differences from those
of other mammals. In particular, the platypus genome contains at
least 214 natural killer receptor genes (Supplementary Notes 18)
within the natural killer complex, a far larger number than for human
(15 genes50), rat (45 genes50) or opossum (9 genes51).
Both platypus and opossum genomes contain gene expansions in
the cathelicidin antimicrobial peptide gene family (Supplementary
Fig. 3). Among eutherians, primates and rodents have a single cathelicidin
gene52,53, whereas sheep and cows have numerous genes that
have been duplicated only recently54. The expanded repertoire of
cathelicidin genes in both marsupials and monotremes may arm their
immunologically naive young with a diverse arsenal of innate
immune responses. In eutherians, with their increases in length of
gestation and advances in development in utero of their immune
systems, the diversity of antimicrobial peptide genes may have
become less critical. The platypus genome also contains an expansion
in the macrophage differentiation antigen CD163 gene family
(Supplementary Notes 18).
14. Genome landscape
First, we analyse the phylogenetic position of platypus and confirm
that marsupials and eutherians are more closely related than either is
to monotremes (Supplementary Notes 19). We then describe platypus
chromosomes and observe some properties of platypus interspersed
and tandem repeats. We also discuss a potential relationship between
interspersed repeats and genomic imprinting and investigate how the
extremely high G+C fraction in platypus affects the strong association
seen in eutherians between CpG islands and gene promoters.
Platypus chromosomes. Platypus chromosomes provide clues to the
relationship between mammal and reptile chromosomes, and to the
origins of mammal sex chromosomes and dosage compensation. Our
analysis provides further insight with the following findings: the 52
platypus chromosomes show no correlation between the position of
orthologous genes on the small platypus chromosomes and chicken
microchromosomes; for the unique 5X chromosomes of platypus we
reveal considerable sequence alignment similarity to chicken Z and no
orthologous gene alignments to human X, implying that the platypus X
chromosome evolved directly from a bird-like ancestral reptilian system55;
and the genes on the five platypus X chromosomes appear to be
partially dosage compensated (Supplementary Fig. 5), perhaps parallel
to the incomplete dosage compensation recently described in birds56.
Repeat elements. About one-half of the platypus genome consists of
interspersed repeats derived from transposable elements. The most
abundant and still active repeats are (severely truncated) copies of the
5-kb long-interspersed-element (LINE2) and its non-autonomous
SINE-companion mammalian-wide interspersed repeat (MIR,
Mon-1 in monotremes) that became extinct in marsupials and in
eutherians 60–100 Myr ago. We estimate that there are 1.9 and 2.75
million copies of LINE2 and MIR/Mon-1, respectively, in the 2.3-Gb
platypus genome. DNA transposons and LTR retroelements are quite
rare in platypus, but there are thousands of copies of an ancient
gypsy-class LTR element (all LTR elements previously identified in
mammals, birds, or reptiles belong to the retrovirus clade). Overall,
the frequency of interspersed repeats (over 2 repeats per kb) is
higher than in any previously characterized metazoan genome.
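
A quick back-of-the-envelope check of the repeat density, using only the copy numbers and genome size quoted above. This is my own arithmetic, not a calculation from the paper, and it counts only LINE2 and MIR/Mon-1, so it is a lower bound on the overall figure.

```python
# Figures quoted in the text above.
LINE2_COPIES = 1.9e6       # estimated LINE2 copies
MIR_MON1_COPIES = 2.75e6   # estimated MIR/Mon-1 copies
GENOME_SIZE_BP = 2.3e9     # 2.3-Gb platypus genome


def repeats_per_kb(copies, genome_size_bp):
    """Average number of repeat copies per kilobase of genome."""
    return copies / (genome_size_bp / 1000)


density = repeats_per_kb(LINE2_COPIES + MIR_MON1_COPIES, GENOME_SIZE_BP)
print(round(density, 2))
```

The two families on their own come to roughly two copies per kb, which is in line with the "over 2 repeats per kb" stated for all interspersed repeats.
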
Population analysis using LINE2/Mon-1 elements distinguished
the Tasmanian population from three other mainland clusters
(Supplementary Fig. 4a, b), in good agreement with tree-based
analysis, physical proximity and previous knowledge of platypus
population relationships57.
Cluster analysis of all LINE2 copies revealed a phylogenetic relationship
lacking branches, as if a single-locus, fast-evolving gene has
steadily spread an exceptional number of pseudogenes over time
(Supplementary Fig. 6). This ‘master gene’ appearance is, to a lesser
degree, also observed for LINE1 in eutherians58, but not to the same
extent for MIR/Mon-1 or other retrotransposons in mammals. The
phylogeny of LINE2 and Mon-1 was also supported by a genome-wide
transposition-in-transposition (TinT) analysis59 (Supplementary
Tables 7 and 8). LINE2 density is similar on all chromosomes
(Supplementary Fig. 7); it does not correlate with chromosome
length (and recombination rate) as the CR1 LINE density does in
the chicken genome19, nor is it higher on sex chromosomes than on
autosomes, as LINE1 density is in eutherians (which has led to postulations
on a function in dosage compensation)60.
We compared microsatellites in the platypus genome with those of
representative vertebrates (Supplementary Notes 22). The mean
microsatellite coverage of platypus genomic sequences assembled
into chromosomes is 2.67 ± 0.34%; significantly lower than all other
mammalian genomes sequenced so far and most similar to that
observed in chicken (Supplementary Fig. 8). Microsatellites are on
average shorter in platypus than in other genomes (Supplementary
Table 9), but microsatellite coverage surpasses chicken owing to very
long tri- and tetranucleotide repeats (Supplementary Fig. 9). The
platypus has a higher proportion of microsatellites with high A+T
content, in comparison to the other vertebrates examined, an abundance
distribution that has more in common with reptiles than with
mammals (Supplementary Fig. 10).
15. Genomic imprinting. Genomic imprinting is an epigenetic phenomenon
that results in monoallelic gene expression. In the vertebrates,
imprinting seems to have evolved recently and has only been
confirmed in marsupials and eutherian mammals61,62. The autosomal
localization of some imprinted orthologues in platypus is known63.
However, we examined the conservation of synteny and the distribution
of retrotransposed elements in all orthologous eutherian-imprinted
clustered and non-clustered genes in the platypus genome.
A representative cluster is shown in Fig. 5 (see also Supplementary
Fig. 12).
Clusters that became imprinted in therians (with the exception
of the Prader–Willi–Angelman locus64) have not been assembled
recently and reside in ancient syntenic mammalian groups, although
some regions have expanded by mechanisms such as gene duplication
or transposition. There were significantly fewer LTR and DNA
elements across all platypus orthologous regions relative to eutherian
imprinted genes (P < 0.04 and 0.04, respectively), whereas there was
a significant increase in the sequences masked by SINEs (P < 0.03).
The chicken had fewer total repeats and no SINEs or sRNAs.
Comparison of all regions in the platypus with the orthologous
regions in opossum, mouse, dog and human demonstrates that accumulation
of LTR, DNA elements, and simple and low complexity
repeats coincides with, and may be a driving force in, the acquisition
of imprinting in these regions in therian mammals.
The CpG fraction. The eutherian and chicken genomes generally
average around 41% G+C content, although many intervals differ
substantially from the average, particularly in humans (Supplementary
Notes 23). In contrast, the platypus genome averages
45.5% G+C content and rarely deviates far from the average. The
opossum genome averages only 38% G+C content and also has a
narrow distribution (Supplementary Fig. 13). The source of the elevated
G+C fraction in platypus remains unclear. It is explained only
in part by monotreme interspersed repeat elements, as platypus DNA
outside of known interspersed repeats is 44.7% G+C. Furthermore,
tandem repeats of short DNA motifs (microsatellites) in platypus
show an A+T bias, as with other mammals. Recombination-driven
biased gene conversion may be a factor, in agreement with what has
been shown for eutherians65 and marsupials66. This is suggested by
the observation that the six platypus chromosomes where the currently
mapped DNA sequence averages over 45% G+C content (that
is, 17, 20, 15, 14, 10 and 11 in order of decreasing G+C fraction) are
among the 10 shortest (Supplementary Fig. 14), because short chromosomes
have a higher recombination rate67. However, a direct test
is currently lacking because platypus recombination rates have not
been measured. A further examination of the CpG fraction, that
associated with promoter elements, is found in Supplementary
Notes 24 and Supplementary Fig. 15.
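
For readers who want to see what "average G+C content" and "narrow distribution" mean in practice, here is a minimal Python sketch. It computes the G+C fraction of a sequence and of fixed-size, non-overlapping windows along it. The function names, the window size and the toy sequence are my own assumptions. The paper's actual interval sizes and masking rules are in its supplementary notes and are not reproduced here.

```python
def gc_fraction(sequence):
    """G+C fraction over unambiguous bases (A, C, G, T); N and others ignored."""
    sequence = sequence.upper()
    counted = sum(sequence.count(base) for base in "ACGT")
    if counted == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return (sequence.count("G") + sequence.count("C")) / counted


def windowed_gc(sequence, window):
    """G+C fraction of consecutive non-overlapping windows; a trailing partial window is dropped."""
    return [
        gc_fraction(sequence[start:start + window])
        for start in range(0, len(sequence) - window + 1, window)
    ]


# Hypothetical toy sequence: a G+C-rich half followed by an A+T-rich half.
toy_sequence = "GGCCGCATATATTAGC"

print(gc_fraction(toy_sequence))
print(windowed_gc(toy_sequence, 8))
```

A genome whose windows all sit close to the overall mean would have the narrow distribution described for platypus and opossum. The toy sequence is the opposite case, with windows far from the mean on either side.
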
Conclusions
The egg-laying platypus is a remarkable species with many biological
features unique among mammals. Our sequencing of the
platypus genome now enables us to compare its sequence characteristics
and organization with those of birds and therian mammals
in order to address the questions of platypus biology and to
date the emergence of mammalian traits. We report here that
sequence characteristics of the platypus genome show features of
reptiles as well as mammals.
Platypus contains a largely standard repertoire of non-protein-coding
RNAs (ncRNAs), except for the snoRNAs, which exhibit a marked
expansion associated with at least one retrotransposed subfamily.
Some of these retrotransposed snoRNAs are expressed and thus
may have functional roles. The platypus has fully elaborated
piRNA and miRNA pathways, the latter including many monotreme-
specific miRNAs and miRNAs that are shared with either
mammals or chickens. Many functional assessments of these novel
miRNAs remain to be carried out and will surely add to our knowledge
of mammalian miRNA evolution.
The 18,527 protein-coding genes predicted from the platypus
assembly fall within the range for therian genomes. Of particular
interest are families of genes involved in biology that links
monotremes to reptiles, such as egg-laying, vision and envenomation,
as well as mammal-specific characters such as lactation,
characters shared with marsupials such as antibacterial proteins,
and platypus-specific characters such as venom delivery and underwater
foraging. For instance, anatomical adaptations for chemoreception
during underwater foraging are reflected in an unusually
large repertoire of vomeronasal type 1 receptor genes. However,
the repertoire of milk protein genes is typically mammalian, and
the arrangement of milk protein genes seems to have been preserved
since the last common ancestor of monotremes and therian
mammals.
Since its initial description, the platypus has stood out as a species
with a blend of reptilian and mammalian features, which is a characteristic
that penetrates to the level of the genome sequence. The
density and distribution of repetitive sequence, for example, reflects
this fact. The high frequency of interspersed repeats in the platypus
genome, although typical for mammalian genomes, is in contrast
with the observed mean microsatellite coverage, which appears more
reptilian. Additionally, the correlation of parent-of-origin-specific
expression patterns in regions of reduced interspersed repeats in
the platypus suggests that the evolution of imprinting in therians is
linked to the accumulation of repetitive elements.
We find that the mixture of reptilian, mammalian and unique
characteristics of the platypus genome provides many clues to the
function and evolution of all mammalian genomes. The wealth of
new findings and confirmation of existing knowledge immediately
evident from the release of these data promise that the availability of
the platypus genome sequence will provide the critically needed
background to inspire rapid advances in other investigations of
mammalian biology and evolution.
