Thursday, February 20, 2014

Mineralization and the elephant shark

It's getting easier to document in detail the evolution of many traits, thanks to the quickly growing database of whole genome sequences from an increasing diversity of organisms.  A friend and long-time colleague of ours at Penn State, Kazu Kawasaki, has long been interested in the genetics of animal biomineralization, and is in fact a world leader in the field.  He has done much painstaking, seminal work on understanding the evolution of the genetic architecture of this trait.  The nature of the proteins involved requires particular ways of finding and characterizing the genes that are responsible.

Mineralization was important in the evolution of vertebrates, for numerous reasons.  It is about how free calcium is captured and crystal structures formed.  As Kazz wrote in a 2004 paper:
Mineralized tissue is a critical innovation in vertebrate evolution, offering the basis for various adaptive phenotypes: body armor for protection, teeth for predation, and endoskeleton for locomotion. Two distinct types of mineralized tissues emerged in Paleozoic agnathans [jawless fishes]: tooth-like oral skeleton and dermal skeleton. The dermal skeleton, which first appeared in the heterostracomorphs, consists of surface dentin and basal bone, which are occasionally overlaid by enameloid. Eventually, dermal skeleton developed into simple scales. Based on the histological similarity, these scales have been considered homologous to teeth. Teeth composed of all three tissues first appeared in [an immediate ancestor of] chondrichthyans [cartilaginous fishes]. Recently, the oral skeleton of conodonts [extinct jawless chordates resembling eels] was recognized as the earliest mineralized tissue in vertebrates and proposed to be the likely precursor of all teeth. However, there is no phylogenetic support for homology between the oral skeleton and teeth. The evolution of mineralized tissues has been enigmatic for more than a century.
More than ten years ago, as a leading member of Ken's lab, Kazz discovered a gene family that he called the Secretory Calcium-binding Phosphoprotein (SCPP) genes.  These genes encode proteins that regulate the calcium (Ca) phosphate concentration in the extracellular matrix.  Calcium is involved in tooth enamel, bone mineralization, and the calcium content of milk, and indeed may have been centrally important in the evolution of vertebrates.  The supersaturation of milk with Ca-phosphate was critical to mammalian divergence, and the enamel made possible by SCPP genes led to mineralized teeth, which made active predation possible.  The mineralized skeleton, lactation and even saliva emerged early enough in the evolution of mammals to be the foundation of much of the subsequent evolution of the lineage, because so many fundamentally adaptive structures are calcium-based.

Gene families generally arise from gene duplication and the SCPP genes are no exception; a single gene or part of a gene duplicates by some DNA copying error during the production of egg or sperm cells, and the genome inherited by the individual's offspring contains both the original and the duplicate copy.  In the case of SCPP genes, the copy or copies are usually in tandem array with recent and more ancient duplicate genes or segments of the genes in one or two clusters on the chromosome.  One copy is generally constrained by natural selection to maintain its original function, but the subsequent copy or copies are free to vary and contribute new function to the lineage.  The new function may simply supplement the parental function, or some modified use may evolve.  Indeed, adaptive evolution largely proceeds via gene duplication, and genomes are largely composed of families of genes related via ancestral duplication events.

Kazz has been able to trace the evolution of SCPP genes in many vertebrate lineages, and has found that all SCPP genes originally arose from a gene called SPARC-Like 1 (SPARCL1), which originated earlier from SPARC. Both SPARC and SPARCL1 code for a calcium-binding protein that occupies a space between cells, called extracellular matrix.

SCPP genes aren't always easy to find.  Most related genes are identified because their DNA or protein sequences are similar.  But with SCPP genes it's not the DNA sequence, or the amino acid sequence that the DNA codes for, that is important to mineralization, but instead the protein's ability to bind calcium.  Therefore, natural selection has conserved not the sequence, but the calcium-binding properties of each duplicate gene.  This involves negatively charged amino acids in the SCPP, which associate with positively charged calcium ions, but the function of the genes is less dependent on the particular order of those amino acids. Because of their highly biased amino acid composition, most parts of SCPPs do not adopt a rigid 3D structure comprising alpha-helices and beta-sheets. For this reason, these are called 'disordered' proteins.
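To make the point about composition versus order concrete, here is a minimal Python sketch. The two peptide strings are made up for illustration (they are not real SCPP sequences), and counting only aspartate (D) and glutamate (E) as acidic is a simplifying assumption. The peptides differ in residue order, so a position-by-position comparison sees them as quite different, yet their content of negatively charged residues is identical.

```python
from collections import Counter

# Simplifying assumption: only Asp (D) and Glu (E) are counted as acidic.
ACIDIC = set("DE")


def composition(seq):
    """Return the fraction of each amino acid in a one-letter protein string."""
    seq = seq.upper()
    counts = Counter(seq)
    return {aa: n / len(seq) for aa, n in counts.items()}


def acidic_fraction(seq):
    """Fraction of residues that are acidic (D or E); 0.0 for an empty string."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return sum(1 for aa in seq if aa in ACIDIC) / len(seq)


def identity(a, b):
    """Fraction of positions at which two equal-length strings match."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


# Hypothetical peptides: same residues, different order.
peptide_a = "DEDESSEEDDGS"
peptide_b = "SDGEEDSDEDES"

print(acidic_fraction(peptide_a), acidic_fraction(peptide_b))
print(composition(peptide_a) == composition(peptide_b))
print(identity(peptide_a, peptide_b))
```

A search that scores position-by-position identity would rate these two peptides as poorly matched, while a composition-based measure treats them as equivalent, which is the sense in which the calcium-binding property can be conserved without the sequence being conserved.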

Kazz has discovered that the gene structure itself is a clue to whether or not a novel gene belongs in the SCPP gene family.  An important shared characteristic is the presence in the gene of a 'signal peptide' (that is, a short region at one end of the encoded protein that enables the cell to transport the protein to a membranous structure called the endoplasmic reticulum and then secrete it from the cell). Another important give-away aspect of SCPP genes is that their exons begin and end with whole codons; the resulting 'phase-0' introns (the non-coding regions of the gene) do not interrupt a nucleotide triplet coding for an amino acid, but instead lie between whole triplets. This is a technical property, for readers not familiar with the field, but it is unusual for all of a gene's introns to be phase-0, and this is a major way in which SCPP genes can be found in a newly sequenced species' DNA. Kazz has patiently identified and then used these facts to find and understand the evolution of SCPP genes in many different vertebrate species.
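For readers who like to see the arithmetic, intron phase can be worked out from the coding lengths of a gene's exons: the phase of an intron is the running total of coding nucleotides upstream of it, modulo 3. The following is a minimal sketch, not the search procedure Kazz actually uses; the exon lengths are hypothetical and are not taken from any real gene.

```python
from itertools import accumulate


def intron_phases(coding_exon_lengths):
    """Phase (0, 1 or 2) of each intron between consecutive coding exons.

    coding_exon_lengths: coding nucleotides in each exon, in order,
    counted from the start codon (untranslated parts excluded).
    """
    running_totals = list(accumulate(coding_exon_lengths))
    # One intron follows every exon except the last.
    return [total % 3 for total in running_totals[:-1]]


def all_introns_phase_zero(coding_exon_lengths):
    """True if the gene has at least one intron and all are phase 0."""
    phases = intron_phases(coding_exon_lengths)
    return bool(phases) and all(phase == 0 for phase in phases)


# Hypothetical exon lengths, for illustration only.
scpp_like_gene = [57, 48, 36, 27, 63]   # every exon is a whole number of codons
other_gene = [58, 101, 36, 46]          # first intron splits a codon

print(intron_phases(scpp_like_gene))           # [0, 0, 0, 0]
print(intron_phases(other_gene))               # [1, 0, 0]
print(all_introns_phase_zero(scpp_like_gene))  # True
print(all_introns_phase_zero(other_gene))      # False
```

In a real genome annotation the exon boundaries would come from a gene model, and this check would be only one clue among several, alongside the signal peptide and the biased amino acid composition.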

Structures of SPARCL1 (A), SPARC (B), and SPP1 (C). Boxes represent the untranslated region (white), signal peptide (localizes the protein in ECM; gray), and the mature protein (black). The length (nucleotide) of each exon is shown in the boxes. Intron phases are described below. Dashed lines show equivalent introns shifted by intron gain, loss, or sliding. (A) Exons 2–5 code domain I, which is separated by phase 0 introns. Exons 6 and 7 code domain II, and exons 8–11 code domain III. (B) Intron 4 in Ciona and intron 5 in nematode slide 1 base upward or downward, respectively. (C) The penultimate exon codes an Arg-Gly-Asp motif.  Source: Kawasaki et al., PNAS, 2004
Another clue is the proximity of potential SCPP genes to one another.  Except for unusual instances such as amelogenin, an enamel-protein gene located alone on the X chromosome (and the Y), the SCPP genes in most species are in two clusters near each other on a single chromosome, reflecting their origin via fairly recent gene duplication events that happened differently in different vertebrate lineages.
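The proximity clue can be sketched in a few lines of Python: sort candidate genes by position and start a new cluster whenever the gap to the previous gene exceeds some threshold. Everything specific here is an assumption made for illustration: the gene names, chromosome labels and coordinates are invented, and the 500 kb gap threshold is arbitrary rather than a value from Kazz's work.

```python
from collections import defaultdict


def cluster_by_proximity(genes, max_gap=500_000):
    """Group genes on the same chromosome whose start positions lie close together.

    genes: iterable of (name, chromosome, start) tuples.
    max_gap: largest distance (bp) between neighbouring starts within one
             cluster (an arbitrary, illustrative threshold).
    Returns a list of (chromosome, [gene names]) tuples.
    """
    by_chromosome = defaultdict(list)
    for name, chromosome, start in genes:
        by_chromosome[chromosome].append((start, name))

    clusters = []
    for chromosome in sorted(by_chromosome):
        members = sorted(by_chromosome[chromosome])
        current = [members[0][1]]
        for (prev_start, _), (start, name) in zip(members, members[1:]):
            if start - prev_start <= max_gap:
                current.append(name)
            else:
                clusters.append((chromosome, current))
                current = [name]
        clusters.append((chromosome, current))
    return clusters


# Invented candidates: two clusters on one chromosome, one lone gene elsewhere.
candidates = [
    ("cand1", "chr4", 1_000_000),
    ("cand2", "chr4", 1_150_000),
    ("cand3", "chr4", 1_300_000),
    ("cand4", "chr4", 3_000_000),
    ("cand5", "chr4", 3_100_000),
    ("cand6", "chrX", 500_000),
]

for chromosome, names in cluster_by_proximity(candidates):
    print(chromosome, names)
```

With these made-up coordinates the sketch reports two clusters on one chromosome and a lone gene on another, loosely mirroring the two-cluster arrangement and the amelogenin exception described above.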

One cluster of SCPP genes codes for acidic proteins and the other for proline- and glutamine-rich (P/Q-rich) proteins.  Acidic SCPP genes are crucial to ossification of the collagenous scaffolding that becomes bone, and dentine in teeth.   P/Q-rich SCPP genes are central in the production of enamel, saliva, tears and milk.

This is a rather dense description for anyone, especially non-specialists, but it's important because it is an understanding of these genes, their sequence characteristics and their genome organization that allows them to be related to major steps in vertebrate evolution (including primates and ourselves).  When did various important structures, like external protective scales, teeth, internal supporting bones, and even the properties of milk come about?  And what about birds, which have no teeth but do need to make eggshells?  And other descendants of earlier vertebrates?

SCPP genes can help tell these stories.

New insights
A recent paper published in Nature ("Elephant shark genome provides unique insights into gnathostome evolution", Venkatesh et al.) reported on what the odd-looking elephant shark genome can tell us about the evolution of gnathostomes (jawed vertebrates).  The emergence of gnathostomes was a major innovation in the evolution of vertebrates, and brought important traits including hinged jaws, paired fins and the adaptive immune system.

Elephant shark.  Source:

Gnathostomes diverged into two major groups during their evolution, bony vertebrates and cartilaginous fishes.  The latter are fish with skeletons made of cartilage rather than hard bone: sharks, rays and chimaeras.  Kazz has traced SCPP genes through bony vertebrates, but whether sharks have SCPP genes is a question that he has long wanted to answer, so the publication of the elephant shark genome was of great interest.

The elephant shark (a chimaera) is a small cartilaginous fish that lives in Australian waters.  Its genome is about 1/3 the size of the human genome, and it was proposed as a good representative model of cartilaginous fishes, the oldest living group of jawed vertebrates.

Genetic events underlying the emergence of bone formation in vertebrates.

    Duplication of Sparc by whole-genome duplication initially gave rise to Sparcl1, and the subsequent tandem duplication of Sparcl1 gave rise to the SCPP gene family responsible for endochondral ossification. Because the sea lamprey genome contains only Sparc but no Sparcl1 (ref. 22), we have placed the genome duplication event that gave rise to Sparcl1 after the divergence of jawless vertebrates from the jawed vertebrate ancestor. The sister relationship of chondrichthyans and acanthodians is based on ref. 37. SCPP, secretory calcium-binding phosphoprotein gene family member; †, extinct. Source: Venkatesh et al., Nature, 2014
    The authors of the paper were interested in the genetic basis of bone formation, knowing that SCPP genes are the foundation of mineralization in subsequent vertebrate lineages.  They report finding SPARC and SPARCL1 in the elephant shark, but they were unable to identify any SCPP gene clusters where expected.  They conclude that SCPP genes did not arise until after the emergence of bony fish, and propose that the absence of SCPP genes explains the absence of endochondral bones (bones that develop through a cartilage precursor, as our long bones do) from cartilaginous fishes.

    With his inordinate patience and meticulous attention to detail, Kazz has searched for SCPP genes in the elephant shark genome as well, but has yet to find any, even taking into account that, as we described above, it is the genes' structure, not specific sequence details, that is crucial to finding them.

    After this so-far fruitless search, Kazz still thinks that more research is necessary to confirm the conclusions of Venkatesh et al. As he says, bone originated in ancient jawless fish, which developed an extensive exoskeleton (dermal bone), and extensive dermal bone was also found in the common ancestor of bony vertebrates and cartilaginous fish. Therefore, the lack of bone in the modern cartilaginous fishes is a secondary condition; they lost bone during evolution. In fact, the bodies of sharks and rays have innumerable small scales, called dermal denticles because of their morphological and histological similarities to teeth, whereas in the elephant shark and other chimaeras, scales grow in very limited locations. The elephant shark is therefore highly derived from its ancestral state in terms of mineralized skeletal elements.

    In the paper, the authors tested their hypothesis that SCPP genes are important for endochondral bone formation by reducing the activity of one SCPP gene, called osteopontin, in zebrafish. The result indicated an essential role of osteopontin in normal bone growth. However, this experiment showed that osteopontin is important for the growth of both endochondral bone and dermal bone, which is consistent with the fact that the human osteopontin gene is expressed in both endochondral bone and dermal bone. Thus, this experiment does not appear to support their hypothesis that the absence of osteopontin explains the absence of endochondral bone. Instead, it is possible that osteopontin originated before the origin of endochondral bone and that this gene was subsequently lost in the elephant shark, and possibly other cartilaginous fish, as they lost bone entirely.

    Not finding a particular gene in the elephant shark does not necessarily mean that this gene is absent from the genome. Furthermore, among cartilaginous fishes we only know the genome sequence of the elephant shark, a species of chimaera. Kazz believes that we will learn more about the evolution of tissue mineralization from the genomes of other cartilaginous fishes, and he is continuing to do this work.

    Reconstructing important aspects of our evolution can be very challenging, and requires more than superficial or quick scans of genomes.  But the reward for doing that is to find how evolution has taken advantage of available genes as new structures arise or, when a given structure or mechanism is no longer needed for whatever reason, how a lineage has lost the related genes that it no longer uses.

    Update, 3/1/14 
    In reply to Alex Stoddard's question about intron/exon structure in a comment (below), Kazz said the following:

    We have reported that all SCPP genes are derived from the last common ancestor, the SPARC-like 1 gene (SPARCL1). SPARCL1 consists of 10 protein coding exons (figure). Among these exons, the most upstream exon mostly encodes the signal peptide (yellow in figure), and the following three exons (blue in figure) code for a large and highly acidic (negatively charged) amino acid sequence, comprising many glutamic acids, aspartic acids, and phosphorylated serine residues. Similar protein coding exons are found in SCPP genes that are strongly expressed in bone and/or dentin, although the number of exons coding for the acidic sequence may be different. It is thought that such highly acidic sequences are essential for these proteins to associate with a large number of calcium ions (positively charged).

    In fact, SPARCL1 also has an evolutionary precursor, called SPARC. These two genes are similar to each other but differ in the size of the acidic region (figure). SPARCL1 has a huge exon that codes for most of the acidic region, whereas SPARC has only two small exons for the acidic region. It follows that SPARCL1 presumably obtained the huge additional exon, which resulted in a larger calcium-binding capacity of the encoded protein. Based on these observations, we proposed that the enlarged acidic region that arose in SPARCL1 was inherited by acidic SCPP genes.

    In both SPARC and SPARCL1, introns within acidic-region coding exons (blue exons in figure) are all phase 0 (so exons 3 and 4 in SPARCL1 and exon 3 in SPARC are phase 0-0 exons). It is likely that the phase 0-0 exon was one important factor that facilitated enlargement of the acidic region. Obviously, duplication of these exons can enlarge the acidic region. However, duplication of exons may result in a frameshift. After duplication, the original highly acidic sequence (and downstream sequence) can be maintained only when this sequence is coded by phase 0-0, phase 1-1, or phase 2-2 exons.
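As a small illustration of the frameshift argument (our own sketch, not part of Kazz's reply): an exon flanked by introns of the same phase has a length that is a multiple of three, so duplicating it adds whole codons and leaves the downstream reading frame intact. The toy exon sequences below are invented for the example and do not come from SPARC, SPARCL1 or any SCPP gene.

```python
def exon_end_phase(start_phase, exon_length):
    """Phase of the intron after an exon, given the phase of the intron before it."""
    return (start_phase + exon_length) % 3


def duplication_preserves_frame(start_phase, exon_length):
    """True for symmetric exons (phase 0-0, 1-1 or 2-2), i.e. length divisible by 3."""
    return exon_end_phase(start_phase, exon_length) == start_phase


def duplicate_exon(exons, index):
    """Return a new exon list with exons[index] tandemly duplicated."""
    return exons[: index + 1] + [exons[index]] + exons[index + 1:]


def codons(cds):
    """Split a coding sequence into whole codons, dropping any trailing partial codon."""
    usable = len(cds) - len(cds) % 3
    return [cds[i:i + 3] for i in range(0, usable, 3)]


# Same invented coding sequence, split into exons in two different ways.
symmetric = ["ATGGAA", "GATGAGGAC", "TCTTAA"]   # middle exon is phase 0-0 (9 nt)
asymmetric = ["ATGGAA", "GATGAGGA", "CTCTTAA"]  # middle exon is phase 0-2 (8 nt)

print(codons("".join(symmetric))[-2:])                      # ['TCT', 'TAA']
print(codons("".join(duplicate_exon(symmetric, 1)))[-2:])   # ['TCT', 'TAA']
print(codons("".join(duplicate_exon(asymmetric, 1)))[-2:])  # ['ACT', 'CTT']
```

Duplicating the phase 0-0 exon repeats its three acidic codons (GAT GAG GAC, coding Asp-Glu-Asp) and leaves the downstream codons untouched, whereas duplicating the 8-nucleotide exon shifts the frame and loses the original stop codon.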

    It is astonishing to find that some SCPP genes with a large number of exons have all phase-0 introns. For example, the mouse casein 1s1 gene (casein genes belong to the SCPP gene family) consists of 31 protein coding exons, and all these exons are flanked by phase 0 introns. However, most of these exons are probably derived from only a single phase 0-0 exon of ancient SPARC (arrow head in figure).

    Reference: Genetic basis for the evolution of vertebrate mineralized tissue. Kawasaki, K., Suzuki, T., and Weiss, K. M. 2004 Proc. Natl. Acad. Sci. USA. 101: 11356-11361.


    Mineralized tissue and vertebrate evolution: The secretory calcium-binding phosphoprotein gene cluster
    Kazuhiko Kawasaki and Kenneth M. Weiss, PNAS, 2003

    Genetic basis for the evolution of vertebrate mineralized tissue
    Kazuhiko Kawasaki, T Suzuki, KM Weiss, PNAS, 2004

    Gene duplication and the evolution of vertebrate skeletal mineralization 
    Kawasaki, K., Buchanan, A. V. & Weiss, K. M., Cells Tissues Organs, 2007

    The SCPP gene repertoire in bony vertebrates and graded differences in mineralized tissues
    Kazuhiko Kawasaki, Dev Genes Evol, 2009

    The SCPP gene family and the complexity of hard tissues in vertebrates
    Kazuhiko Kawasaki, Cells Tissues Organs, 2011

    Elephant shark genome provides unique insights into gnathostome evolution, Venkatesh et al., Nature, 2014


    Alex Stoddard said...

    [Ah blog commentary - an opportunity to show my ignorance to the world. But also an opportunity for me to learn something (-:]

    Intron-exon structure seems relatively little studied. Presumably this is because until recently the great bulk of comparative molecular data has been in the form of cDNAs which have already been spliced.

    Exclusively 0-phase introns are a most peculiar feature to me.

    Is this believed to be a happenstance from the origin of the SCPP gene family?

    My presumption is that once present intron-phase is very hard to change. It would seem to require two consecutive frame-shifting mutations the second compensating for the first. This would need to happen before the opportunity to regain function is lost to unconstrained genetic drift of the intermediate pseudo-gene.

    I find it very hard to envisage a molecular or functional mechanism that would require exclusively 0-phase introns in and of itself.

    Ken Weiss said...

    Kazz is preoccupied right now, but says he'll respond in a few days. I think your questions can be answered, and he (and I as a collaborator) looked at the exon/intron structure and so on in several different papers. Phase-0 introns is not an unknown phenomenon, and has been studied by lots of people in general (and by Kazz in detail in this SCPP case).

    If you email me, I'll send you a couple of reprints on the SCPPs and their evolution.