Monday, June 2, 2014

The visible colors: and the falseness of human races as natural categories

There is a constant tension between the tendencies to view the world in continuous vs discrete terms.  Even in science, this can be a problem, because a continuous view can lead to different interpretations than a discrete view.  Disputes about reality can arise, perhaps, over the distinction.  Is something a particle, or is it a wave?  Are the categories of a discrete view natural realities, or are they being imposed on Nature for some human reason?

The argument currently afoot has to do with how culpable it is to use genomic variation data to claim that there are a small number (usually stated as 5) major or primary human races, that blur at the intersections between them.  And, as commonly used software has it, those 'blurred' individuals are considered to be admixed between parents from the 'pure' races.

This is very misleading scientifically and, worse, unnecessarily so.  No analogy is perfect, but we can see the major issues using the example of color, which is often cited as comparable and showing the validity of the 'race' assertion (here, e.g.).  Color is the word we use for our sensory perceptions, the qualia, or psychological experience, by which we perceive light.  In physical terms, a given color is produced by photons with a given energy, and hence a particular wavelength and frequency (since light has a fixed speed, a higher frequency means more waves pass by per second, so each wave must be shorter for them to add up to the distance traveled in a second). From that point of view, here is the range of colors to which the 'standard' human eye (that is, genotype) can respond, a graphic portrayal of the wavelengths we detect:

The spectrum of visible light (wavelength in nanometers).  Wikimedia commons
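
For readers who like to see the arithmetic, the relation between wavelength, frequency, and photon energy described above can be sketched in a few lines. This is only an illustration: the function name and the sample wavelengths are our own choices, and the two constants are the standard SI values for the speed of light and Planck's constant.

```python
# Physical constants in SI units (both are exact by definition).
SPEED_OF_LIGHT = 299792458.0   # meters per second
PLANCK = 6.62607015e-34        # joule-seconds


def describe_wavelength(wavelength_nm):
    """Return (frequency in Hz, photon energy in joules) for a wavelength in nm."""
    wavelength_m = wavelength_nm * 1e-9
    frequency_hz = SPEED_OF_LIGHT / wavelength_m   # speed = wavelength * frequency
    energy_j = PLANCK * frequency_hz               # energy = Planck constant * frequency
    return frequency_hz, energy_j


if __name__ == "__main__":
    # Sample points across the visible range; nothing special about these values.
    for nm in (400, 450, 500, 550, 600, 650, 700):
        freq, energy = describe_wavelength(nm)
        print(f"{nm} nm  {freq / 1e12:7.1f} THz  {energy:.3e} J")
```

Any wavelength in between can be fed in just as well, which is the point: the physics is a continuum, and nothing in these formulas singles out a 'primary' value.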

The word 'color' refers to the qualia of perception, but we assign names to particular wavelengths, a cultural phenomenon based on our particular detection system. In those terms, visible light is a continuum of detectable wavelengths.  But traditionally, given that we are trichromat beings (with three distinct opsin genes, whose three coded proteins each respond most efficiently to a different wavelength--see diagram below), we name three of them 'primary' colors.  Each retinal 'cone' cell normally produces one of these opsin pigment proteins.  Each color of light that enters the eye triggers an appropriately weighted mix of red, green, and blue signals.  So, for example, pure 'blue' frequency light basically only triggers a response from retinal cone cells that express the blue opsin gene product.

Basically, our ability to perceive any wavelength across the visible range is due to our brain's ability to combine the signal strengths received from the retinal cells, each reporting the activation of its own opsin. We often think of colors as being a mix of these primary colors, but there is nothing physically primary about them. They are artificial mark-points chosen by us because of our particular opsin repertoire.  One could choose other mark-points, and there need not be three (some species have fewer or more), and still perceive light in the entire visible (or even broader) wavelength range.  Various activities such as printing and the like have used different 'primary' colors (e.g., Google primary colors).   When we receive a mix of frequencies, our brain can sort out that mix and identify it.

What 'typical' human cone cells respond to.  Source:
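
The weighted-mix idea can be made concrete with a toy model. What follows is a rough sketch, not a model of real retinal physiology: the peak wavelengths are only approximate values commonly quoted for human cone pigments, while the bell-shaped curves, their shared width, the four-pigment eye, and all the names in the code are assumptions made purely for illustration.

```python
import math

# Roughly where the three human cone pigments respond best (nanometers).
# These peaks are approximate; the bell shape and shared width are assumptions.
HUMAN_PEAKS_NM = {"S ('blue')": 420.0, "M ('green')": 534.0, "L ('red')": 564.0}
# A hypothetical four-pigment eye, purely for comparison.
FOUR_PIGMENT_PEAKS_NM = {"P1": 400.0, "P2": 480.0, "P3": 560.0, "P4": 640.0}
CURVE_WIDTH_NM = 40.0  # assumed standard deviation of every response curve


def cone_response(wavelength_nm, peak_nm, width_nm=CURVE_WIDTH_NM):
    """Relative response (0 to 1) of one pigment to a single wavelength."""
    return math.exp(-((wavelength_nm - peak_nm) ** 2) / (2.0 * width_nm ** 2))


def signal_shares(wavelength_nm, peaks_nm):
    """Share of the total signal contributed by each pigment; shares sum to 1."""
    raw = {name: cone_response(wavelength_nm, peak) for name, peak in peaks_nm.items()}
    total = sum(raw.values())
    return {name: value / total for name, value in raw.items()}


if __name__ == "__main__":
    for nm in range(400, 701, 50):
        shares = signal_shares(nm, HUMAN_PEAKS_NM)
        summary = ", ".join(f"{name} {share:.2f}" for name, share in shares.items())
        print(f"{nm} nm -> {summary}")
```

Passing `FOUR_PIGMENT_PEAKS_NM` instead of `HUMAN_PEAKS_NM` gives every wavelength a different, four-part signature. In this toy setting the same continuum is described equally well from either set of mark-points, so neither set is 'primary' in any physical sense.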

In a sense, so long as you realize what is being done, there is no problem.  But if you think of the light-world as being inherently made of truly primary color categories, and of other colors as blurs at the edges of these categorical realities, then you are seriously misunderstanding the physical reality. First, the color spectrum reflects the color, as we perceive it, of single-wavelength radiation.  No individual wavelength is 'primary'.  Second, other colors are a mix of wavelengths that trigger a response by red, green, and blue opsins, and are synthesized (such as to be interpreted as 'pink') by the brain.

This is also a stereotype for two other reasons.  First, there is considerable variation among humans in the response characteristics of our opsins--the figure shows a typical response pattern for reference blue, green, and red opsin proteins.  And of course a substantial fraction of people can't see some colors because they are missing one or more normally functioning opsin genes.  Secondly, the qualia, or what makes a given wavelength be experienced as 'blue,' is beyond current understanding, nor do we know that what you see as blue is the same as what I see as blue--even if we both have learned to call it 'blue'.  At present this is in the realm of philosophers, and causes a discussion--but no harm is done.

But that is not always the case.  Sometimes, when a phenomenon is falsely divided into categories assumed to be true units rather than arbitrary reference points, with some rather unimportant blurs at the boundaries between the categories, the results of the error can be, literally, lethal.  This has been one consequence of misrepresenting quantities as categories in human affairs.

Races are not like primary colors
We are writing this because there has been a recent resurrection of science that knowingly misrepresents the global distribution of human biological variation.  People are not photons, and we do not exist in 'primary' groups with blurred boundaries between them--any more than blue, red, and green are sacred and special points in the color spectrum.

We hear a lot of innocent-sounding talk about how one can argue for the existence of human 'races' as genetic, not just sociocultural, entities--but not be a 'racist'.  Yes, the argument goes, there is blurring at the edges, but the categories are real and they exist.

Human populations have long lived on different continents and some of our recent evolution as a species has taken place across great spans of distance, with geographic effects on the rates of gene flow over distance. Time and local geography, climate, culture, food sources, prey and predators and the like vary over space as well, and have in various ways led to adaptive differences among people, differently in different places.  Both cultural and genomic variation has accumulated around the globe.  But with few exceptions, such as truly isolated islands, genomic differences are correlated with geographic distance. 

Europe and Africa are not wholly discrete parts of the world.  The Americas may have been close to that, but only for about 10,000 or so years.  To assert that Europeans are genomically different from Africans, you must define what you mean by these categories.  Do you mean Italians are different from Egyptians?  Or do you mean Bantu speakers from South Africa are not the same as Norwegians?  This is important because with the same statistical methods of analysis, the same sorts of variation, if proportionately less in quantity, occur within these areas.  And had the analysis been done 1000 years ago, the major population of the world might be considered to be the Middle East, not Europe, because the decision of what are the major races, and what the admixed blurs, would have been made by Islamic scholars, perhaps with some complaints by the high culture in India.   Choosing other populations as reference points ('parental' populations, or actual 'races')--Tahitians, Mongolians and South Indians, say, rather than the usual Africans, Europeans and Native Americans--would yield very different admixture statistics, because admixture programs are based on assumptions about history, not some inherent 'truth'.

So even those who want to stress differences, for whatever reasons, and who want to make assertions based on the several 'continents', themselves somewhat arbitrarily defined, have to be clear about what they are asserting--what they define as 'race', in particular.  This, of course, is made far more complicated by the 'admixture' that has occurred throughout the known history of mass migration. Indeed, even the concept of 'admixture' itself requires specifying who is mixing with whom--which in turn determines the outcome of admixture studies.

This sort of analysis has another aspect that is not properly understood.  The user chooses which and/or how many populations are considered parentals, of which other sampled individuals are admixed product.  These are statistical rather than history-based assumptions, using various sorts of significance criteria (which are subjective choices).  And, importantly, this type of analysis is based on alleles that were chosen for study because they are global--that is, the same variants are shared by the different  'races', just in different frequencies. Truly local variation is just that, local, so groups can't be compared in the same way.  Any sample you might choose to take will have lots of rare variants, found nowhere else.  So races in much if not most of the modern discussion, are groups defined in part because their frequencies of the same variants differ.  The genotypes in one 'race' can appear in others as well, but with lower probability.  If you want group-specific variants, you will usually find that they depend essentially on how you define the groups, and very rarely will everyone in a group that is more than very local have the purportedly characteristic variant.  A given genotype may be more likely in one pre-defined sample or group, but these are quantitative rather than qualitative differences largely based on local proximity.  Locally restricted variants can be important in adaptive traits, depending on the dynamics of history, and they can be exceedingly important, but they are generally far from characterizing everyone in a group or in defining groups.  People come into this world as discrete entities, but this is not how populations are generally constructed or evolve.
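
A toy calculation shows how much the answer depends on the choice of 'parentals'. This is a deliberately simplified sketch and not what any real admixture program does: we assume a made-up set of loci whose variant frequencies change smoothly along a line from position 0 to position 1, and we use a plain least-squares fit to ask what share of reference A best explains a target population. All names and numbers are hypothetical.

```python
def cline_frequencies(position, intercepts, slopes):
    """Variant frequencies at a geographic position on a hypothetical linear cline."""
    return [a + b * position for a, b in zip(intercepts, slopes)]


def admixture_fraction(target, ref_a, ref_b):
    """Least-squares share of ref_a when target is modeled as a mix of ref_a and ref_b."""
    numerator = sum((t - b) * (a - b) for t, a, b in zip(target, ref_a, ref_b))
    denominator = sum((a - b) ** 2 for a, b in zip(ref_a, ref_b))
    return numerator / denominator


# Made-up loci: frequency at position 0, and change per unit of distance.
INTERCEPTS = [0.10, 0.80, 0.30, 0.55]
SLOPES = [0.60, -0.50, 0.40, -0.25]


def share_of_first_reference(target_pos, ref_a_pos, ref_b_pos):
    """Estimated share of the population at ref_a_pos in the population at target_pos."""
    target = cline_frequencies(target_pos, INTERCEPTS, SLOPES)
    ref_a = cline_frequencies(ref_a_pos, INTERCEPTS, SLOPES)
    ref_b = cline_frequencies(ref_b_pos, INTERCEPTS, SLOPES)
    return admixture_fraction(target, ref_a, ref_b)


if __name__ == "__main__":
    # The same population (position 0.5) measured against three choices of 'parentals'.
    for ref_a_pos, ref_b_pos in ((0.0, 1.0), (0.4, 1.0), (0.0, 0.6)):
        share = share_of_first_reference(0.5, ref_a_pos, ref_b_pos)
        print(f"references at {ref_a_pos} and {ref_b_pos}: share of first = {share:.2f}")
```

Nothing about the population at position 0.5 changes between the three lines of output; only the reference points do. On this smooth gradient the 'admixture' share is simply a restatement of where the population sits relative to whichever two points we decided to call parental.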

If we were talking about turtles or ostriches or oaks, nobody would care about these distinctions, even if there is absolutely no need to use such categories.  There are ways to represent human biological variation over space in more continuous terms, avoiding the obviously manifest problems with false, vague, or leaky categories of people, or making excuses for the 'blurring' at the edges, as if those blurred individuals were just no-accounts staggering around polluting the purity of our species!  Asserting the supposed reality of 'race', that is, of true categories on the ground rather than just in the mind, leads to all sorts of scientific problems and, of course, historically to the worst of human problems.

Does it make sense to ask whether members of 'the' European 'race' are taller than those in the African 'race'?  What part of Europe, and what part of Africa do you mean?  Ethiopia?  Nigeria?  Botswana?  Norway?  Greece? And does the person have to be living there now, or just have had all his/her ancestry from there?  And what about that 'his/her'?  Do we have to consider only living 'Africans' and 'Europeans', or can we use, say, skeletons from these 'races' from any time in the past (should be OK, if the trait is really 'genetic', since gene pools change slowly)?  Or can we use Kazakhs or Saamis or Mbutis in our 'race' comparison?  Clearly we have to start refining our statements, and when that is the case even for societally rather neutral traits like stature, how much more careful need we be when we raise topics--as those who like to assert the reality of 'race' can't resist focusing on--with sociocultural or policy relevance (criminality, intelligence, addictability, reckless behavior, genes for ping-pong skill or running speed or being a violinist)?  Why do we need the categories, unless it really is a subterranean desire to focus on such traits to make a political point....or to affect policy?

At the same time, when scientists who think carefully and avoid this sort of categorical thinking go on to deny the reality of the categories, or dismiss the categories as 'just' social constructs, they (the scientists) are denying what is an even greater reality.  That is that, for many people, 'race' is an entirely real category, one they experience on a daily basis.  If in the US you are 'black' or 'white' or 'Hispanic' or 'Asian' you are treated in a group-based way culturally.  If you have any phenotypically discernible African ancestry, for example, you may very well be treated as, and feel as if you were, 'black', regardless of your ancestry fraction.  You may have some legal rights if you have at least 1/8 Native American ancestry, and for that and other reasons, you may know very well that 'race' does exist as a reality in your life.  This is inherently a sociocultural construct, and hence a reality.  In that very correct sense, the existence of 'race' is a scientific fact.

Scientists who acknowledge this but then continue to assert the genomic reality of race, essentially  because it is a convenient shorthand and because the bulk of data come from widely dispersed people, play into the hands of the ugliest aspects of human history, and given that history, which they know very well, they do so willingly.  Some even do it with great glee, knowing how it angers 'liberals'. One can speak of genetic (and cultural) variation having a geographic-historic origin that is (except for recent long-distance admixture) proportional to distance, and can think about local adaptations,  without using categorical race concepts.  Some may argue with what is genomic, what is the result of natural selection, and what is basically cultural. But there is no need to wallow in categories, and then  no need to try to define the 'fuzzy boundaries' between them.

Evolutionary genetic models as they are conventionally constructed contribute to the problem, because they are based on the frequency of genetic variants, and frequency is inherently a sample statistic. That is, frequencies are based on a population of inference, specified by the user.  A population is defined as if it had specific boundaries.  Natural selection is also modeled as if 'environments' were packaged in population-delimited ways.  For many reasons, it would be better if we developed less boxed-in evolutionary concepts and analysis, but that's not convenient if it takes time or means your book or grant can't just be dashed off without considering serious underlying issues like these, or if the hurried press likes to take whatever you say and make hay with it.

The use of 'primary' color category concepts is arbitrary relative to the actual color spectrum, but at least is based on our retinal genes, which in a natural way provide a convenient set of what are otherwise arbitrary physical reference points.  Nobody is disadvantaged by the use of those categories in human affairs.  But human populations are not in natural categories, categories are not needed, and they are not neutral relative to human affairs.

As with the light spectrum, there are not, and never have been, 'primary' colors of humans.  What is true, however, is that when it comes to that topic, a lot of people cannot see the light.


Holly Dunsworth said...

I'm so impressed by this post that I'm moved to comment. All I can seem to come up with is thank you.

Holly Dunsworth said...

Oop. Nope. Got a bone to pick. "People come into this world as discrete entities..."-- this sort of thinking, too, could be unraveled, dismantled, even flat-out rejected! But that's for another day. Thanks again.

JKW said...

Great post, Ken. It makes me want to revisit Marshall Sahlins (1976) "Colors and Cultures."

Ken Weiss said...

I think this is a semantic issue on my part. I meant that I, you, your baby, and my wife and kids are individual human units. We don't blur into each other. I didn't mean we come as a member of a categorical group like a 'race'. Of course I also didn't mean conjoined twins.

Ken Weiss said...

I've read a lot of Sahlins, and he's a very smart guy, though of course he has his own polarization (he was one of the faculty members when I was a graduate student at Michigan, and at that time a great manifestation of the then very high level of leading ethnologists). They were seeking a scientific basis for culture and its evolution, and needed to do that in the post-WWII reaction to what the Nazis had done to the world. He became more political and I guess 'post-Modern' and polarizing, but someone like him was so intellectually respectable that, agreeing or not, you learned from reading what he had to say. I'll go back and read Colors and Cultures, which I didn't know of (or don't remember). Thanks for pointing it out.

Bill said...

Bravo, Ken. I betcha Frank is smiling.

Brad Weiss said...

I thought the same thing! Marshall recently said this was perhaps his favorite among his essays. Here's the ref:

Ken Weiss said...

Well, Frank would be furious that things long known (actually, long before his writing on the subject) have still not been absorbed. Of course, those who haven't 'absorbed' it usually know that--it's intentional.

Ken Weiss said...

deGruyter, obviously a very greedy outfit, still makes one pay rather a lot for a reprint of this 40-year old paper. So I have had to order it and wait for my library here to make me a copy. I now do remember the paper, however, and it was widely read at the time.

Holly Dunsworth said...

Please send my way when it arrives, if they send as pdf.

Daniel M Parker said...

Please pass it my way too.

And yes, this is a great post Ken.

Nathaniel said...

As usual, beautifully reasoned and written. Should be required reading for anyone following the debate over Nick Wade's book, and I will circulate it. I'm a historian and so have my own arguments against the book, but this is a potent and clear argument from the genetic side. Thank you. Two questions:

First, how do you respond to the argument that the visual spectrum has no natural boundaries, like oceans, deserts, and mountain ranges? There's no electromagnetic Sahara causing certain frequencies to bunch up on one side or the other. I think I know the answer, but would like very much to hear how you'd answer.

Second, what about the medical side of race? Physicians insist that, until whole-genome sequencing is so cheap and fast it is part of any medical exam, racial profiling will be a useful proxy for a haplotype that is clinically beneficial. How do population geneticists respond to doctors who say that theoretically, human genomic variation may be a spectrum, but practically, races "work" for a lot of diagnosis?

Again, marvelous post. Thanks again.

Ken Weiss said...

I'd answer the first by repeating my caveat that no analogy is perfect. Human variation follows a rough isolation-by-distance pattern (Jeff Long has shown that this is an approximation, and partly because of geographic and other factors). Also we appear to have spread from Africa by a kind of serial-expansion process. There would have been, among these hunter-gatherers (as a rough stereotype) back and forth each generation because of rules about clan or village exogamy and just plain trade, friendship and so on. This is gene flow, not hermetically sealed boundaries.

Of course topography leads to unevenness in population size and distribution and hence in genomic variation. But that is not the same as categories, and if one realizes that the admixture models rely on categories identified statistically, and that also depends on sampling and how one specifies the problem, the model is rather arbitrary. Genomic variation isn't uniformly spread like a coat of paint, but it's not parceled into categories.

If sequencing will be so cheap, and if (a big, big 'if') genomic data are important predictors, then DNA sequencing can be done without 'race'. Of course, if environments are correlated with self-reported 'race' and if there is some geographic-origin correlation, then there can be some utility to asking about 'race'. But then this is a self-reporting sociocultural phenomenon.

My nephew and my niece have one 'white' and one 'black' parent. He lives a 'black' life and she a 'white' one based on jobs, friends, marriage, neighborhood, etc. Should a doctor treat him as 'black' and her as 'white'?

This shows, to me, the problem of conveniently falling back on that kind of utility. Genomic background is important, and has geographic-historic relevance. Some variants have different disease risk depending on the sociocultural/geographic background of people when they are sampled by such categories (whether self-reported or investigator-assigned). But if you think in more fluid rather than categorical terms, many subtleties that are just as important, and especially sociocultural (even psychological) factors become important--and there, I think, the categories are most truly real and important--but not because of genes.

And what, then, about someone who is Eastern European, or Kazakh, or Korean vs Indonesian? Or Afghans? Or many groups within India? How far from their assigned 'race' peak do they have to be before the utility is lost?

Above all, however, it is that the categorical treatment of people for technical, or even the kind of statistical pragmatism you refer to, is too often, one might say too typically, extended to group-based policy. Such as 'don't put money into black school districts' or 'don't give foreign aid to Africans'. Or (a century ago) 'don't try to educate the Chinese coolies, as they're mainly good for laying railway tracks'. If you look at some of the gutter-traffic around this issue recently, you'll see that that is just there (sometimes) under the surface.

In a metaphoric way, 'almost blue' is not the same as 'blue'. A continuum needs to be evaluated as a continuum.

Jonathan said...

Funny story -- I was reading this post, and had forgotten who the author of this blog was. I found myself thinking, "wow, some of this sounds a lot like that great paper by Weiss and Fullerton, 'Racing Around, Getting Nowhere.' I wonder if the author is familiar with it?" Then I saw the comments & felt very silly. But seriously, great work. :)

John said...

For a diagnostic genetic sequencing test (which I develop for a living btw) an individual’s ancestry can often be relevant, but usually the critical information for interpretation provided by ancestry is related to the identity of the specific population the individual is descended from within the last few hundred years and whether or not that population was ever reduced to the point of being impacted by disease-associated founder mutations or genetic drift. For example, there are certain genetic markers that are associated with pathogenic mutations in Finnish people that are interpreted as polymorphisms in other Europeans, because in Finnish people they always co-occur in cis (= on same chromosomes) with other pathogenic mutations (and constitute a haplotype of sorts). Knowing this can change the interpretation of a tested individual who has Finnish ancestry with two mutations from that of carrier to affected. Other examples of such relationships are usually population-specific (for example there are similar situations with French Canadians vs French, Japanese vs Korean, Afrikaners vs Dutch) and not something that can be assumed to be always informative at the level of a “race”. This doesn’t mean btw, that there are no genotypic linkages that are important at the broadest levels of human ancestry (races if you like), just that a medical geneticist would much rather know exactly where a person’s family comes from and if anyone else in their family had something similar to what’s being tested for. Great posts on this Wade business btw.

Ken Weiss said...

Hell, I often can't remember my own kids' names! (and it's not just senility). The points apparently need continual repetition, given their widespread impermeability.

Ken Weiss said...

The problem is that we don't yet have adequate ways to speak of or think about these things, largely because they've been twisted around for political ends so often.

I work in Finland all the time, and know what you mean. But even then, in 'isolates' like Finland or French Canadians, or the Amish etc., things often turn out to be more complex. The rarer the combination of variants, the more local perhaps but also the harder to study what their effects are or to show convincingly that they have them. Put a variant into one strain of mice and it looks as it does in humans, while in another strain it has no effects. And we have personal knowledge of the elusiveness of causation even when one thinks it might be simple.

The drive to categorize is often antithetic to the drive to view things as continuous. I'm glad I'm not a medical doctor, so that nothing I say makes any difference to people's lives.

John said...

Agree with you 100%. Even the simplest genotype:phenotype relationship can be much more complicated than first appearances. My intent was really just to counter the notion posed by Nathaniel that "race" is a good interpretive proxy for diagnostics....In my experience that is often not the case and at best is an oversimplification to the point of being misleading.

Ken Weiss said...

It's hard to have a measured discussion of all of these topics, given their potential inflammatory nature, and their complexity, which can threaten many different vested interests. People, not just in the biomedical community but in general, too often want simple answers, and are less concerned with how correct the answers are, I think.