Wednesday, February 3, 2016

Thoughts on the latest schizophrenia genetics report

The news and social media were headlining a report last week that presented some genetic findings, and even aspects of a possible causal mechanism, related to schizophrenia.  As habitually skeptical readers of these daily stories, we wondered how substantial this claim is.

The report in question was a Nature paper by Sekar et al. that identifies variation in the very complex MHC genome region that, based on the authors' analysis, is statistically associated with schizophrenia when patients are compared with unaffected controls. These are variants in the number of copies of particular genes in the C4 'Complement' system.  The authors show that gene copy number is correlated with gene expression level and, in turn, with some changes in brain tissue that may be related to functional effects in schizophrenia patients.

Comparing genotypes and disease status in ~30,000 cases and controls of European ancestry, drawn from 40 cohorts in 22 countries, the authors find that genotypes with higher C4 gene copy numbers are more frequent in schizophrenics, and that there is a quantitative relationship between copy number and expression level in postmortem-tested neural tissue.  The potential mechanism may have to do with the pruning of synapses among neurons in the brain.

The authors estimate that the risk associated with the highest-copy-number genotype is 1.27 times that of the lowest. The lowest-risk genotype is rare, comprising only about 7% of the sample population, meaning that almost everyone has a middling relative-risk genotype.  That is comparable, say, to most of us having middling height or blood pressure. But the net population absolute risk of schizophrenia is about 1%, so the absolute risks associated with these various genotypes are small and not even very different from each other.  The careful work done by the authors has many different components that together consistently seem to show that these copy number differences do have real effects, even if the absolute risks are small.
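
To make the relative-versus-absolute point concrete, here is a minimal sketch of the conversion. The function name, the three-category split, and the middle relative risk of 1.15 are illustrative assumptions, not figures from the paper; only the 1% prevalence, the roughly 7% frequency of the lowest-risk genotype, and the 1.27 relative risk come from the discussion above.

```python
def absolute_risks(prevalence, freqs, rel_risks):
    """Convert relative risks (vs. a reference genotype) into absolute risks.

    Population prevalence is the frequency-weighted mean of the genotype
    risks, and each genotype risk is rel_risk * baseline, so
    baseline = prevalence / sum(freq_i * rel_risk_i).
    """
    if abs(sum(freqs) - 1.0) > 1e-9:
        raise ValueError("genotype frequencies must sum to 1")
    mean_rel_risk = sum(f * r for f, r in zip(freqs, rel_risks))
    baseline = prevalence / mean_rel_risk
    return [r * baseline for r in rel_risks]


# HYPOTHETICAL genotype categories: only the 7% reference frequency and the
# top relative risk of 1.27 are taken from the text; the rest is made up.
freqs = [0.07, 0.63, 0.30]
rel_risks = [1.00, 1.15, 1.27]

risks = absolute_risks(0.01, freqs, rel_risks)
for f, r, a in zip(freqs, rel_risks, risks):
    print(f"frequency {f:.2f}  relative risk {r:.2f}  absolute risk {a:.4f}")
```

Because the reference genotype is the rare one, its absolute risk falls below the 1% population figure, and every category ends up close to 1%. Under these assumed numbers, even the highest-risk group stays well short of the 1.27% one would get by multiplying the population prevalence by the relative risk.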

How that effect or association arises is not clear, and the findings are certainly not the same as explaining schizophrenia as a C4 disease per se.  As the authors note, 100 or so other chromosome locations have been associated with the disease in genome-wide mapping studies.  That means that if their results stand up to scrutiny, C4 variation is one component of what is basically a polygenic disorder.  The association for each C4 genotype category is the effect averaged over all other contributing causes in those people. The absolute risk in individuals with a given copy number is still very small, and may depend on other genetic or environmental factors.

Schizophrenia is not a single disorder and has a spectrum of onset age, sex, symptoms, and severity of occurrence.  Many authors have been warning against using a single term for this variety of traits. Whether that is relevant here or not remains to be seen, but at least as presented in their paper, some of the current authors' results seem not to vary with age.  This study doesn't address whether there is a smallish subset of individuals in each C4 category who are at much higher risk than the average for the category.  However, the familial clustering of schizophrenia suggests this may be so, because family members share environments and also genomic backgrounds.  One might expect that C4 genotypes are interacting with, or if not, being supplemented by, many other risk factors.

Even if average risk is not very high in absolute terms, this paper received the attention it did because it may be the first to provide a seemingly strong case for a potentially relevant cellular mechanism to study, even if the specific effect on risk turns out to be quite small.  It could provide a break in understanding the basic biology of schizophrenia, given the dearth of plausible mechanisms known so far.

Because the statistically riskier genotypes are found in a high percentage of Europeans, one would expect them to be found, if at varying frequencies, in populations other than Europeans. Whether their associated risks will be similar probably depends on how similar the other risk factors are in other populations.  C4 copy number variation must be evolutionarily old because there is so much of it, clearly not purged by natural selection--another indicator of a weak effect, especially because onset is often in the reproductive years and would seem to be potentially 'visible' to natural selection. So why is the C4 variation so frequent?  Perhaps C4 provides some important neural function, and most variation causes little net harm, since schizophrenia is relatively rare at roughly 1% average risk.  Or perhaps copy number changes happen regularly in this general MHC genome region and can't effectively be purged, but are generally harmless.  But there is another interesting aspect to this story.

The Complement system is within a large cluster of genes generally involved in helping destroy invading pathogens that have been recognized.  It is part of what is called the 'innate' immune system. Innate here means it does not vary adaptively in response to foreign bodies, like bacteria or viruses, that get into the blood stream.  The adaptive immune system does that, and is highly variable for that reason; but once a foreigner is identified, the complement system takes part in destroying it.  So it is curious that it would be involved in neural degeneration, unless it is responding to some foreign substance in the brain, or is an autoimmune reaction. But if the latter, how did it become so common?  Or is the use of C4 genes in this neural context a pleiotropy--a 'borrowed' use of existing genes that arose for immunity-related functions but then came also to be used for a different function?  Or is neural synapse regulation a kind of 'immune' function that hasn't been thought of in that way?  Whatever it's doing, in modern society it contributes to problems about 1% of the time, for reasons for which this paper clearly will stimulate investigation.

Why does this system 'misfire' only about 1% of the time?  One possible answer is that the C4 activity prunes synapse connections away normally in a random kind of way, but occasionally, by chance, prunes too much, leading to schizophrenia.  The disease would in that sense be purely due to random bad luck, rather than interacting with other mechanisms or factors. The higher the copy number, the more likely the bad luck, but too weakly for selection to 'care'.  However, that explanation for the disease seems unlikely, for several reasons.  First, mapping has identified about 100 or so genome regions statistically associated with schizophrenia risk, suggesting that the disease is not just bad luck. Secondly, schizophrenia is familial: close relatives seem to be at elevated risk, 10-fold in very close relatives and almost 50-fold in identical twins.  This should not happen if the pathogenetic process is purely random, although, because haplotypes are shared among close family members, there could be a slight correlation in risk.  Also, the authors cite several incidental facts that suggest that C4 plays some sort of systematic relevant functional role.  But thirdly, since the absolute risk is so small, about 1%, one has to assume that C4 is not acting alone, but is directly interacting with, or is complemented by (so to speak) many other factors to which the unlucky victims have been exposed.

Something to test?
This might be a good situation in which to test a variant of an approach that British epidemiologist George Davey Smith has proposed under the name 'Mendelian randomization'.  His idea is basically that, when there is a known candidate environmental risk factor and a known gene through which that environmental factor operates, one can compare people who carry the genetic variant and are exposed to the environmental factor with people who carry the same variant but are not exposed, to test whether the environmental factor really does affect risk.

Here, we could have a variant of that situation.  We have the candidate gene system first, and could sort individuals having, say, the highest 'risk' genotypes, compare them with those having the lowest, and see if any environmental or other systematic genomic differences are found that differentiate the two groups.

Interesting lead but not 'the' cause
Investigating even weakly causal factors could lead the way to discovering major pathogenic mechanisms or genetic or environmental contributors not yet known that interact with the identified gene region. There will be a flood of follow-up studies, one can be sure, but hopefully they will largely be focused investigations rather than repeat performances of association studies.

Given the absolute risks, which are small for given individuals, there may or may not be any reason to think that intervening on the C4 system itself would be a viable strategy even if it could be done. This still seems to be a polygenic--many-factorial--set of diseases, for which some other preventive strategy would be needed.  Time will tell.

In any case, circumspection is in order.  Remember traits like Alzheimer's disease, for which apoE, presenilins, beta-amyloid, and tau-protein associations were found years--or is it decades?--ago and still mystify to a great extent.  Or the critical region of chromosome 21 in Down syndrome that has, as far as we know, eluded intensive study for similarly long times. And there are many other similar stories related to what are essentially polygenic disorders with major environmental components.  This one is, at least, an interesting one.


Ed Hollox said...

I'd argue that C4 copy number variation is not necessarily evolutionarily old but generated by recurrent mutation, which would explain different copy number alleles on different SNP haplotypes, and also why mild negative selection has not removed the variation.

Ken Weiss said...

If C4 CNV is as common as the report says, I would be surprised if the same is not found in other species, even the mouse (surely that's known, though not by me). It seems to be too prevalent not to be a part of long-term copying processes. Recurrent 'mutation' would mean, I think, slippage and so on, not nucleotide changes. That sort of thing happens once multiple adjacent copies of a gene arise, and there are plenty of other genomic precedents.

Anyway, it's an empirical question how widespread it is among mammals. Mild negative selection might remove copies, but new slippage events would recreate new multiple copies. Mild selection might just keep the system from having too many copies, or something like that. Or population dynamics could over-ride the effects of mild selection. And there may not be any systematically negative selection for the SZ trait now, or in the past. Some have argued that such behavioral traits could even have an advantage in securing mates, etc. That's been a long-standing point of view, whether right or wrong.

Joseph D. Terwilliger said...

I find it incredibly scary that the lead author was quoted in the New York Times as saying that the risk variant has a relative risk of 1.25, which he said means that the risk of carriers is 0.0125 instead of 0.01... While this itself is a trivial effect, it is complete bollocks... Based on the allele frequencies of the different structural forms of C4 from the paper, and given they assigned the relative risk of 1.0 to the BS form, meaning that all risks are relative to that one, not to the population prevalence, the actual risk of carriers of the AL-AL form would be 0.0107, not 0.0125, while the risk of carriers of the RARE BS form would be reduced from 0.01 to 0.0084... Effectively the point is that 0.0107 is a MUCH tinier increase over the 0.01 population risk than 0.0125, and it's kind of scary that the lead author doesn't understand his own statistics, or at least that he conveyed it so inaccurately to the media.....
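
[Editorial aside] The arithmetic in this comment can be checked with a few lines, using only the figures quoted in it. The variable names are illustrative, and the allele frequencies behind 0.0107 and 0.0084 are not reproduced here; this is a consistency check on the quoted numbers rather than a re-derivation from the paper.

```python
prevalence = 0.01        # population risk quoted in the comment
quoted_rr = 1.25         # relative risk as quoted in the newspaper

# Naive reading: multiply the population prevalence by the relative risk.
naive_risk = prevalence * quoted_rr

# Figures given in the comment, where the relative risk is measured
# against the rare BS form rather than against the population average.
bs_risk = 0.0084         # reference (BS) form
al_al_risk = 0.0107      # highest-risk (AL-AL) form

implied_rr = al_al_risk / bs_risk                  # AL-AL relative to BS
excess_over_population = al_al_risk / prevalence - 1
naive_excess = naive_risk / prevalence - 1

print(f"naive risk for carriers:        {naive_risk:.4f}")
print(f"AL-AL risk relative to BS form: {implied_rr:.2f}x")
print(f"AL-AL excess over population:   {excess_over_population:.0%}")
print(f"excess implied by naive reading: {naive_excess:.0%}")
```

The ratio of the two quoted risks comes out near the 1.27 given in the post, so the figures agree with each other. The disagreement is only about the baseline: measured against the 1% population risk, the AL-AL excess is roughly 7%, not 25%.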

Ken Weiss said...

Thanks for making this comment. The problem is the hunger for attention by investigators and for Big Stories by the news media. The real story here, if there is one, is the potential functional lead to something involved in relevant brain function. Why this should involve Complement, which has been thought (I thought, at least) to be about immunity, is interesting. But I can't guess what it might mean with any degree of knowledgeability. The fact that these causal effects are so small should also have been said, however, not just the impression that this accounts for a high fraction of schizophrenia or high risk to the individual.