We were told in no uncertain terms the other day that no one believes in single gene causation anymore. Genetic determinism is passé, and everyone knows that most traits are complex, caused by multiple genes, gene x environment interaction, or if they're really sophisticated, epigenetics or the microbiome. But is this the view that investigators actually follow? That's not so clear.
We all throw around the word 'complex' as if we actually believe it and perhaps even understand it. Of course, (nearly) everyone recognizes that some traits are 'complex', meaning that one can't find a single clear-cut deterministic cause, the way being hit on the side of the head with a baseball bat can by itself send one out for the count. But in fact the hunt for the gene (or, all right, if you insist, genes) 'for' a trait is still on. You name your trait: cancer, diabetes, IQ, ability to dunk a basketball or to get into Harvard without a Kaplan course, and someone's still looking for the gene that causes it.
It is this push to find genes for traits, despite all sorts of denials, that fuels the GWAS (genomewide association study) and similar fires. Caveats notwithstanding (and usually offered just to provide a technical escape lest one be wrong), that is what the promised 'personalized genomic medicine' in its various forms and guises is all about.
So let's take a careful, and hopefully even thoughtful look at the idea of genetic causation. We are personally (it must be obvious!) quite skeptical of what we think are excessive claims of genetic determinism (or, now, microbiomial determinism), but they are still being made, so let's tease a few of them apart.
Single gene causation does exist, at least sometimes (doesn't it?)!
First, though, what do we mean by the word 'causation'? Generally, we think people mean that gene X or risk factor Y is sufficient to cause trait Z. But it might also mean that gene X or risk factor Y is a necessary but not sufficient cause of trait Z. The baseball bat might have caused your concussion, but in fact someone had to swing it.
There are well-documented single risk factors, genetic and otherwise, that everyone accepts 'cause' some disease in a very meaningful sense. Examples are some alleles (variant states) of the CFTR gene and Cystic Fibrosis (CF), BRCA1 and 2 variants and breast cancer, or smoking and lung cancer. Having the alleles associated with CF or breast cancer, or being a long-time smoker, does put people at high risk of disease.
But for these and other examples there are usually healthy people walking around with serious mutations in the gene, or heavy smokers who enjoyed their cigarettes well into old age. Gene X or factor Y isn't sufficient to cause trait Z. There are also hundreds of genetic variants found in patients that are assumed to be causal, but for elusive reasons (for example, mutations in non-coding regions near the CFTR coding regions that have no known function). What is it about gene X that causes trait Z? We don't know, but gene X looks damaged in this person, so it must be causal.
The CF case is interesting. This is an ion channel disease. Ion channels are gated openings on the cell surface that pass sodium, potassium, calcium and other ions into and out of the cell in response to local circumstances. CF is characterized by abnormal passage of chloride and sodium through ion channels, causing thick viscous mucus and secretions, primarily in the lungs but with involvement of the pancreas as well. If the ion channel is badly built, or doesn't get to the surface of cells lining various organs like the pancreas and lungs, then the cell cannot control its water content, secretion, or absorption, and the person with the malfunctioning channels has CF. Again, gene X causes trait Z.
But there are gradations in channel malfunction, and gradations in severity of the disease, and we have no way to know how many people are walking around with variants but no actual disease. Here, we can say that when it happens, CFTR mutations do cause the trait in the usual way. But what about when there are mutations but no disease? Gene X doesn't cause the disease after all? Or disease and none of the known causal mutations? Wait, we thought gene X caused the disease? Could we be assuming single gene causation, and looking only at the CFTR gene, rather than at many other aspects of the genome that may affect ion channels in the same cells or, indeed, may cause the trait in a way we could understand if we but identified them? This is an open question--but it applies to many other purportedly single-gene diseases. Gene X and some other gene/s, or some environmental factor cause the disease in at least some instances. Is it simple or isn't it?
The BRCA story is also interesting. A BRCA1 variant associated with disease does not lead directly to cancer. Instead, BRCA1 codes for a protein that helps detect and repair DNA damage in breast (and other) cells. If you have a dysfunctional BRCA1 genotype, you are at risk of some one breast cell acquiring a set of mutations that don't get detected and repaired. What causes those mutations? Some happen when cells divide, so the activity of breast cells affects the rate of mutational accumulation. Other lifestyle factors do as well (parity, age of childbearing, lactation and apparently things like diet and exercise). And a person with a causal BRCA mutation can live perfectly healthily for decades, which, if you think in classical Mendelian terms, would not happen if s/he had a 'bad' gene. BRCA doesn't exactly cause cancer, but it allows it to be caused. Gene X plus time plus environmental risk factors cause the disease. Yet we all believe it's a single gene, BRCA1 or 2, that causes cancer.
The obvious non-genetic instance, smoking and lung cancer, is similar but not exactly the same. Smoking is, among other things, a mutagen: it damages genes. So one, if not the major, reason for the association is that the mutations caused by smoke can damage genes in lung cells in ways that lead those cells to proliferate out of control. The reason the risk is probabilistic -- that is, a smoker doesn't have a 100% chance of getting lung cancer -- is that it's impossible to know how many or which mutations a given person's smoking has led to. In fact, smoking is only an indirect cause, since it is mutant genes in lung cells that, after accumulating in an unlucky way, start the tumor. Still, in this case, knowing how much a person has smoked can allow one to estimate in some probabilistic way the relative risk of lung cancer due to enough mutations having arisen in at least one lung cell. Even so, many who smoke don't get cancer, and many get cancer who don't smoke. Since smoking and a few other such risk factors (e.g., exposure to asbestos and some toxic chemicals) have strong effects, even if probabilistic, everyone is generally comfortable with thinking of them as causal. Risk factor Y causes trait Z. But, in fact, risk factor Y plus time plus unlucky mutations cause trait Z.
Deterministic genotype 'pseudo-single-gene' causation?
More problematic are 'complex' traits that clearly are not due to simple single gene variants--they don't follow patterns of trait appearance in families that would be consistent with simple Mendelian inheritance. The number of such traits is legion, and is driving the GWAS industry, as we have commented many times. They are typically common in the population. They are complex because we feel--know, really--that many different factors combine somehow to generate the risk. Most instances are not due to one factor alone, though some may be in the sense of BRCA and CFTR. So, we do something like genomewide association studies to try to identify all the potentially causal parts of the genome (and, similarly, but less definitively as a rule, lifestyle factors, too). Here, we'll assume that all of the genomic variants that might contribute are known and can be typed in every person (this is, as everyone knows, far, far from being true at present, if it's even possible).
Advocates can deny any element of single-gene thinking in GWAS reports, where hundreds of loci are claimed to have been found, but these are treated as causal, and major journals are filled to the gills with papers with titles to the effect of "five novel genes for xyz-itis". This is the slippery slope of simple causation thinking.
If multiple factors contribute, what we know is that most do so only probabilistically in the above senses. That there are other things at work is rather obvious for the many traits that, even in those at high risk, don't arise until later or even late in life. And, in reality, it is nearly always true that cases of the disease are associated with individually unique combinations of the risk factors. So, your personalized risk is computed as some kind of combination, like the sum, of your estimated risk at each of the putative sites, R = R1 + R2 + R3 + ..., where at each site one allele is given the minimal risk (or, perhaps, zero) and the other allele the risk estimated from the difference in prevalence of the trait in cases vs in controls. This might be considered the very opposite of single-gene causation, but conceptually it's pretty much the same, because it treats your aggregate as a single kind of risk score, as if it were acting as a unit. The idea would be completely analogous to the risk associated with a specific variant at the CFTR or BRCA gene. Your genotype as a whole would be viewed essentially as a single cause.
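To make the additive score concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the site names, the per-site risk values, and the helper `aggregate_risk` are invented, and real risk-score methods weight and combine sites in more elaborate ways. It shows only the simple sum described above, and how two people with entirely different combinations of risk alleles can collapse to the same single number.

```python
import math

# Hypothetical per-site risk estimates (R1, R2, R3); the values are invented
# for illustration and do not come from any study.
SITE_RISKS = {"site1": 0.02, "site2": 0.01, "site3": 0.01}


def aggregate_risk(carried_sites, site_risks=SITE_RISKS):
    """Return R = R1 + R2 + ... over the sites where the risk allele is carried.

    Sites where the person carries the other allele contribute zero.
    """
    return sum(site_risks[site] for site in carried_sites)


# Two hypothetical people with non-overlapping sets of risk alleles.
person_a = {"site1"}
person_b = {"site2", "site3"}

risk_a = aggregate_risk(person_a)
risk_b = aggregate_risk(person_b)

# Different genotypes, one indistinguishable score.
same_score = math.isclose(risk_a, risk_b)
print(risk_a, risk_b, same_score)
```

Once the sum is taken, nothing in the score records which sites contributed. That is the sense in which the aggregate is treated as a unit.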
These are examples of causal rhetoric. But these causes are probabilistic. What does that mean? It means that you are not 100% protected from getting the trait, nor 100% doomed. Your fortune is estimated by some number in between. We call it a probability, estimated either from the presence of one risk-factor or from the fixed set that you inherited. But what does that probability mean, and how do we arrive at the value and how reliable is it? Indeed, how often can we not even know how reliable it is?
These are topics for tomorrow.