Tuesday, January 26, 2016

"The Blizzard of 2016" and predictability: Part II: When is a prediction a good one? When is it good enough?

Weather forecasts require the prediction of many different parameter values.  These include temperature, wind at the ground and aloft (winds that steer storm systems, and where planes fly), humidity on the ground and in the air (that determines rain and snowfall), friction (related to tornadoes and thunderstorms), change over time and the track of these things across the surface with its own weather-affecting characteristics (like water, mountains, cities).  Forecasters have to model and predict all of these things.  In my day, we had to do it mainly with hand-drawn maps and ground observations (no satellites, basically no useful radar, only scattered ship reports over oceans, etc.), but of course now it's all computerized.

Other sciences are in the prediction business in various ways.  Genetic and other aspects of epidemiology are among them.  The widely made, now trendy promise of 'precision' medicine, or the predictions of what's good or bad for you, are clear daily examples.  But as with the weather, we need some criteria, or even some subjective sense of how good a prediction is.  Is it reliable enough to convince you to change how you live?

Yesterday, I discussed aspects of weather prediction and what people do in response, if anything.  Last weekend's big storm was predicted many days in advance, and it largely did what was being predicted.  But let's take a closer look and ask: How good is good enough for a prediction?  Did this one meet the standard?

Here are predicted patterns of snowfall depth, from the January 24th New York Times, the day after the storm, with data provided by the National Weather Service:



And now here are the measured results, as reported by various observers:




Are these well-forecast depths, or not?  How would you decide?  Clearly, the maximum snowfall reported (42") in the Washington area was a lot more than the '20+"' forecast, but is that nit-picking?  "20+" does leave a lot of leeway for additional snowfall, after all.  But the prediction contour plot is very similar to the actual result. We are in State College, rather a weather capital because the Penn State Meteorology Department has long been a top-rated one and because AccuWeather is located here as a result.  Our snowfall was somewhere between 7 and 10 inches.  The top prediction map shows us in the very light area, with somewhere between 1-5" and 7-10" expected, and the forecasts were for there to be a sharp boundary between virtually no snowfall, and a large dump.  A town only a few miles north of us got only a few inches.

So was the forecast a good one, or a dud?

How good is a good forecast?
The answer to this fair question depends on the consequences.  No forecast can be perfect--not even in physics where deterministic mathematical theory seems to apply.  At the very least, there will always be measurement errors, meaning you can never tell exactly how good a prediction was.

As a lead-up to the storm's arrival in the east, I began checking a variety of commercial weather companies (AccuWeather, WeatherUnderground, the Weather Channel, WeatherBug) as well as the US National and the European Weather Services, interested in how similar they were.

This is an interesting question, because they all rely on a couple of major computer models of the weather, including an 'ensemble' of their forecasts. The local companies all use basically the same global data sources, and the same physical theory of fluid dynamics, and the same resulting numerical models.  They try to be original (that's the nature of the commercial outfits, of course, since they need to make sales, and even the government services want to show that they're in the public eye).

In the vast majority of cases, as in this one, the shared data from weather balloons, radar, ground reports, and satellite imagery, as well as the same physical theory, means that there really are only minor differences in the application of the theory to the computed models.  Data resources allow retrospective analysis to make corrections to the various models and see how each has been doing and adjust them.  For the curious, most of this is, rightly, freely available on the internet (thanks to its ultimately public nature).  Even the commercial services, as well as many universities, make data conveniently available.

In this case, the forecasts did vary. All more or less had us (State College) on a sharp edge of the advancing snow front.  Some forecasts had us getting almost no snow, others 1-3", others in the 5-8" range.  These varied within any given organization over time, as of course they should when better data become available.  But that usually happens when D-day is closer and the models need less extrapolation; the earlier forecasts are, in that sense, less accurate or useful from a precision point of view.  At the same time, all made it clear that a big storm was coming and our location was near to the edge of real snowfall. They all also agreed about the big dump in the Washington area, but varied in terms of what they foresaw for New York and, especially, Boston.  Where most snow and disruption occurred, they gave plenty of notice, so in that sense the rest can be said to be details.  But if you expected 3" of snow and got a foot, you might not feel that way.

If you're in the forecasting business--be it for the weather or health risks based on, say, your genome or lifestyle exposures--you need to know how accurate forecasts are since they can lead to costly or even life-or-death consequences.  Crying wolf--and weather companies seem ever tempted to be melodramatic to retain viewers--is not good of course, but missing a major event could be worse, if people were not warned and didn't take precautions.  So it is important to have comparative predictions by various sources based on similar or even the same data, and for them to keep an eye on each other's reasons, and to adjust.
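The trade-off between crying wolf and missing a major event can be made concrete with the classic cost-loss rule used in studies of forecast value. What follows is only a minimal sketch, not how any forecaster mentioned here actually decides: if taking precautions costs C and an unprotected hit costs L, then acting pays off on average whenever the forecast probability exceeds C/L. The function names and the dollar figures are invented for illustration.

```python
def should_protect(p_event, cost_protect, loss_if_hit):
    """Cost-loss rule: protect when the expected loss exceeds the cost of protecting."""
    return p_event * loss_if_hit > cost_protect


def expected_expense(p_event, cost_protect, loss_if_hit):
    """Average expense for someone who follows the rule above."""
    if should_protect(p_event, cost_protect, loss_if_hit):
        return cost_protect          # pay for protection whether or not the storm hits
    return p_event * loss_if_hit     # gamble: pay nothing unless the storm hits


# Hypothetical numbers: staying home costs 100, a highway pileup costs 5000.
threshold = 100 / 5000   # act whenever the forecast probability exceeds 2%
for p in (0.01, 0.05, 0.50):
    print(p, should_protect(p, 100, 5000), expected_expense(p, 100, 5000))
```

With a small C/L ratio, even a low-probability forecast justifies acting, which is one reason a warning that turns out to be a false alarm is not necessarily a bad forecast.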

As far as accuracy and distance (time) is concerned, precision is a different sort of thing.  Here is the forecast by our local, excellent AccuWeather company for the next several days:

This and figure below from AccuWeather.com

And here is their forecast for the days after that.



How useful are these predictions, and how would you decide?  What minor or major decisions would you make, based on your answers?  Here nothing nasty is in the forecast, so if they blow the temperature or cloud over on the out-days of this span, you might grumble but you won't really care.

However, I'm writing this on Sunday, January 24.  The consensus of several online forecasts was all roughly like the above figures.  Basically smooth sailing for the week, with a southerly and hence warm but not very stormy air flow, and no significant weather.  But late yesterday, I saw one forecast for the possibility of another Big One like what we just had.  The forecaster outlined the similarities today with conditions ten days ago, and in a way played up the possibility of another one like it.  So I looked at the upper-air steering winds and found that they seem to be split between one branch that will steer cold arctic air down towards the southern and eastern US, and another branch that will sweep across the south including the moist Gulf of Mexico and join up with the first branch in the eastern US, which is basically what happened last week!

Now, literally as I write, one online forecast outfit has changed its forecast for the coming week-end (just 5 days from now) to rain and possibly ice pellets.  Another site now asks "Could the eastern US face more snow later this week?" Another makes no such projection.  Go figure!

Now it's Monday.  One commercial site is forecasting basically nothing coming.  Another forecasts the probability of rain starting this weekend.  NOAA is forecasting basically nothing through Friday.

But here are screenshots from an AccuWeather video on Monday morning, discussing the coming week.  First, there is doubt as to whether the Low pressure system (associated with precipitation) will move up the east coast or farther out to sea.  The actual path taken, steered by upper-level winds, will make a big difference in the weather experienced in the east.

Source: AccuWeather.com

The difference in outcomes would essentially be because the relevant wind will be across the top of the Low, moving from east to west, that is, coming off the ocean onto land (air circulates as a counter-clockwise eddy around the center of the Low).  Rain or possibly snow will fall on land as the result.  How much, or how cold it will be depends on which path is taken.  This next shot shows a possible late-week scenario.

Source:  AccuWeather.com
The grey is the upper-level steering winds, but their actual path is not certain, as the prior figure showed, meaning that exactly where the Low will go is uncertain at present.  There just isn't enough data, and so there's too much uncertainty in the analysis, to be more precise at this stage.  The dry and colder air shown coming from the west would flow underneath the moist air flowing in from offshore, pushing it up and causing precipitation.  If the flow is the more eastward of the alternatives in the previous figure, the 'action' will mainly be out at sea.

Well, it's now Monday afternoon, and two sites I check are predicting little if anything as of the weekend....but another site is predicting several days in a row of rain.  And....(my last 'update'), a few hours later, the site is predicting 'chance of rain' for the same days.

To me, with my very rusty, and by now semi-amateur checking of various things, it looks as if there won't be anything dropping on us.  We'll see!

The point here is how much things change and how fast on little prior indication--and we are only talking about predicting a few days, not weeks, ahead.  The above AccuWeather video shows the uncertainty explicitly, so we're not being misled, just advised.

This level of uncertainty is relevant to biology, because meteorology is based on sophisticated, sound physics theory (hydrodynamics, etc.).  It lends itself to high-quality, very extensive and even exotic instrumentation and mathematical computer simulation modeling.  Most of the time, for most purposes, it is already an excellent system.  And yet, while major events like the Big Blizzard this January are predictable in general, if you want specific geographic details, things fall short.  It's a subjective judgment as to when one would say "short of perfection" rather than "short but basically right."

With more instrumentation (satellites, radar, air-column monitoring techniques, and faster computers) it will inevitably get better.  Here's a reasonable case for Big Data.  However, because of measurement errors and minor fluctuations that can't be detected, inaccuracies accumulate (that is an early example of what is meant by 'chaotic' systems: the farther down the line you want to predict, the greater your errors).  Today, in meteorology, except in areas like deserts where things hardly change, I've been told by professional colleagues who are up to date that a week ahead is about the limit.  After that, at least under conditions and locations where weather change is common, forecasts based on specific conditions today are no better than the climate average for that location and time of year.
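How quickly a tiny measurement error can swamp a prediction is easy to see in a toy chaotic system. The sketch below uses the logistic map, a textbook example chosen only for illustration; it is not a weather model, and the starting value, the size of the 'measurement error', and the function names are all assumptions made here. Two runs start almost identically, and the gap between them is tracked step by step.

```python
def logistic_orbit(x0, steps, r=4.0):
    """Iterate x -> r*x*(1-x), a textbook chaotic map (not a weather model)."""
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs


def error_growth(x0=0.3, measurement_error=1e-8, steps=40):
    """Gap between a 'true' run and one started from a slightly mismeasured state."""
    truth = logistic_orbit(x0, steps)
    forecast = logistic_orbit(x0 + measurement_error, steps)
    return [abs(t - f) for t, f in zip(truth, forecast)]


errors = error_growth()
for step in (0, 10, 20, 30, 40):
    print(step, errors[step])
```

The gap starts far below anything one could measure and grows until the 'forecast' run says nothing useful about the 'true' one, which is the toy-system analogue of a forecast horizon.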

The more dynamic a situation--changing seasons, rapidly altering air and moisture movement patterns, mountains or other local effects on air flow--the less predictable it is over more than a few days. You have to take such longer-range predictions with a huge grain of salt, understanding that they're the best theory and intuition and experience can do at present (and taking into account that it is better to be safe--warned--than sorry, and that companies need to promote their services with what we might charitably call energetic presentations).  The realities are that under all but rather stable conditions, such long-term predictions are misleading and probably shouldn't even be made: weather services should 'just say no' to offering them.

An important aspect of prediction these days, where 'precision' has recently become a widely canted promise, is in health.  Epidemiologists promise prediction based on lifestyle data.  Geneticists promise prediction based on genotypes.  How reliable or accurate are they now, or likely to become in the predictable future?  At what point does population average do as well as sophisticated models? We'll discuss that in tomorrow's installment.

Monday, January 25, 2016

"The Blizzard of 2016" and predictability: Part I--the value of prediction

Mark Twain famously quipped, "Everybody talks about the weather but nobody does anything about it." But these days, that's far from accurate.  At least, an army of specialists try to predict the weather so that we can be prepared for it.  The various media, as well as governmental agencies, publicize forecasts.  But how good are those forecasts?

As a former meteorologist myself (back--way back--when I was an Air Force weather officer), I take an interest, partly professional but also conceptual, in how accurate forecasting has become in our computer and satellite era.

Last week, a storm developed over the southwest and combined with an atmospheric disturbance barreling down from the Canadian arctic to cause huge rain and wind damage across the south.  It then veered north, where it turned into "The Blizzard of 2016", as the exaggeration-hungry media dubbed it.  How well was it forecast and did that do any societal good?

Here is a past-few-days summary page of mapped conditions at upper air (upper left), surface (upper right) and other levels.  On a web page called eWall ( http://mp1.met.psu.edu/~fxg1/ewall.html ) you can scroll these for the prior 5 days.  The double Low pressure (red L's) on the right panel represent the center of the storm, steered in part by the winds aloft (other panels).



If you followed the forecasting over the week leading to the storm's storming up the east coast to wreak havoc there, you would say it was exceedingly well forecast, and many days in advance. Was it worth the cost?  One has to say that probably many lives were saved, huge damage avoided, and disruption minimized: people emptied grocery store shelves and hunkered down to watch the Weather Channel (and State College's own AccuWeather).  Urgent things, including shopping for supplies in case of being house-bound, were done in advance and probably many medical and other similar procedures were done or rescheduled and the like.  Despite the very heavy snowfall, as predicted, the forecast was accurate enough to have been life-saving.

Lots of people still don't do anything about it!
And yet....
Despite a lot of people talking about the weather, on all sorts of public media, masses of people, acting like Mark Twain, don't do anything about it, even with the information in hand.  At least 12 people died in this storm in accidents, and others from coronaries while shoveling, and this is just what I've seen in a quick check of the online news outlets.  Thousands upon thousands were stranded for many hours in freezing cold on snow-sodden highways.  There were things like 25-mile-long stationary lines of vehicles on interstates and thousands of car and truck accidents.  That's a lot of people paying the price for their own stubbornness or ignorance.  This is what such a jam looks like:

A typical snowstorm traffic jam (www.breakingnews.com)
People were warned in the clearest terms for days in advance.  Our fine National Weather Service, in collaboration with complementary services in other countries, scoped out the situation and let everyone know about it, as is their very important job.  Some states, like New York,  properly closed their roads to all but necessary traffic. Their governments did their jobs.  Other states, like Kentucky, failed to do that.  So then, how is it that there was so much of what seems like avoidable damage?

Let's put the issue another way: My auto insurance rates will reflect the thousands of costly claims that will be filed because of those who failed to heed the warnings and were out on the highways anyway. So I paid for the forecasts first through my taxes, and then through the purchase prices of goods whose makers pay to advertise on weather channels, but then I also have to pay for those whose foolhardiness led to the many accidents they'll make claims for.  That's similar to people knowingly enjoying an unhealthy lifestyle, and then expecting health insurance to cover their medical bills--that insurance, too, is amortized over the population of insured including those who watch their lifestyles conscientiously.  That's the nature of insurance.

Some people, of course, simply can't stay home.  But many just won't.  Countless truckers were stranded on the roads.  They surely knew of the coming storm.  Did commercial pressure keep them on the road?  Then shame on their companies!  They surely could have pulled over or into Walmart parking lots to wait out the snowfall and its clearance--a day or so, say.  Maybe there aren't enough parking lots for that, but surely, surely they should not have been on the Interstates!  And while some people probably had strong legitimate reasons for being out, and a few may not have seen the strong, repeated forecasts over the many preceding days, most and I would say by far the most, just decided to take their trips anyway.

Nobody can say they aren't aware of pileups, crashes, and hours-long stalls that happen on Interstates during snowstorms.  It is not a new phenomenon!  Yet, again, we all will have to pay for their foolhardiness.  Maybe insurance should refuse to cover those on the road for unnecessary trips. Maybe those who clog the roads in this way should be taxed to cover the costs of, say, increased insurance rates on everyone else or emergencies that couldn't be dealt with because service vehicles couldn't get to the scene.

The National Weather Service, and companies who use their data, did a terrific job of alerting people of the coming storm, and surely saved many lives and prevented damage as a result.  Just as they do when they forecast hurricanes and warn of tornadoes.  Still, there are always people who ignore the warnings, at their own cost, and at cost to society, but that's not the fault of the NWS.

But what about predictability? Did they get it right?  What is 'right'?
It is a fair and important question to ask how closely the actual outcome of the storm was predicted.   The focus is on the accuracy in detail, not the overall result, and that leads one to examine the nature of the science and--of course in our case here on this blog--to compare it with the state of the art of epidemiological, including genetic, predictions.  Not all forecasts are as dramatic and in a sense clear-cut as a major storm like this one.

I have been in the 'prediction' business for decades, first as a meteorologist and subsequently in trying to understand the causal relationships, genetic and evolutionary, that explain our individual traits.  Tomorrow, we'll discuss aspects of the Big Storm's forecasts that weren't so accurate and compare that with the situation in these biological areas.

Monday, January 18, 2016

Blogs and public science: nothing new

Universities generally say their faculty have three major responsibilities: teaching, research, and service.  That's the usual listing order, though in our time it would be perhaps more accurate to reverse that.  Service comes first, in the form first and foremost of grants and fund-raising, and secondly, time-eating bureaucracy.  Research, meaning raising funds (again!) and lots of publication, comes second.  Research is worshipped as a public good, but arcane research counts in many fields (like the humanities or very micro-focused science).  Teaching, well, only if you can't find a way to get out of it.

Actually, we're being cynical (just a bit).  'Service' isn't only about hawking for money.  Public education is also part of that.  And again we're not being wholly cynical about that.  The universities want to write about your work in their PR magazines sent to alumni and in press releases (of course, this is also largely about money).  But educating the general public about what we're doing in research and scholarship is, in fact, an important role even if part of that is self-serving.  Popular science books, for example, draw attention but also can provide the non-specialist citizenry a way to get a general understanding or even a fascination with scholarly and scientific discoveries.

Science can be very technical, specialized, and arcane.  Much of what we ask about is quite remote from direct application or from things the non-scientific public cares about, much less knows anything about. That means that if you don't have the detailed background or time to continue keeping up to date in a professional sense, as most people clearly don't, having a professional explain the gist of the issues can be quite valuable and also quite interesting.

The idea of 'popular' science has changed over time, of course, because in the past only a small segment of the public was involved, perhaps only the aristocracy. This was true of much of music, philosophy, literature and so on.  But there has also long been at least somewhat of a tradition of specialists explaining things to the public.

Probably among the most common instances would be religious leaders explaining the technicalities of scripture.  Travelers have long told tales of what they saw in far-away places.  Even ancient itinerant speakers--Homer, perhaps, in ancient Greece--came around and did this, 'performing' in a sense.  Maybe most of this was for the upper classes to witness, but how restrictive that would have been probably varied.

The physician Galen was a performer of this sort.  He did dissections (or, worse, vivisections) to show off anatomy and attract attention to his medical knowledge and services.  I don't know about Marco Polo, but would expect he regaled many with his tales.  Boyle, Thomas Edison, and others here and in Europe regularly put on demonstration shows for the public, or at least those whose support they might want.  Probably phrenologists and alchemists did the same.

In the 18th and 19th centuries in Europe, travelers certainly entertained audiences with their tales, often for fund-raising purposes.  A major exploratory expedition to the South Pole was funded in this way (not by government grants).  Thomas Huxley, Darwin's famous 'bulldog' (outspoken, aggressive advocate) loved to give public lectures, and was especially dedicated to educating the working man. In the mid-1800s, Michael Faraday gave public lectures about phenomena related to electricity and magnetism.  Not all of these were university faculty by any means, and perhaps most of the latter stayed within their classrooms.  But there were several leading academics who became quite popular among the reading and museum/lecture-attending public.  Tales of fossil exploration in the American west were given, back east, and public intellectuals in the major coastal universities were well-known.

Popular science and popularizing scientists: only the formats are new
The tradition has proliferated as faculty have more and more been expected to do 'service', including public education.  For much of the 20th century and even more into the present, the public scientist has been a TV staple, and one widely seen in magazines and newspaper science sections.  Indeed, many now are not really scientists any longer but drop-outs who became journalists or documentary makers.  But a few famous ones, like Stephen Jay Gould, Ed Wilson, Richard Feynman, Neil deGrasse Tyson, Sean Carroll, Carl Sagan, Richard Dawkins, and others essentially kept their academic groundings.  Often one could argue they lost much of their rigorous credentials in the process, but not always.

As media changed, so did the means by which scientists, professional and once-were, could convey the gist of technical science to the public.  Among other things, funding has become media-driven and so organizations like NIH, NASA, and universities (and their counterparts in other countries) have major PR departments of their own, to spin as well as educate.

We now have at least two relatively new media: the blogosphere and open-source publishing.  The latter is largely still arcanely professional, but is opening up to unmonitored commentary.  Popular science magazines online allow commentary and discussion.  Q and A sites, like Quora in physics, Reddit, and many others, provide interactions among scientists, students, and public freely and without being restricted to classrooms.

Blogs, such as this one, are increasingly being used as regular outlets for faculty, to communicate to the web-addicted public.  Here, we can write technical or popular or tweener posts.  We can Tweet a post to reach a desired audience.  We can mix technical, whimsical, speculative and specific commentaries.

Universities need to get on board faster
Universities are themselves now waking up to the value of these media as part of faculty members' 'service' responsibilities.  But universities, which should pioneer what's new, are often stodgy and fearsomely conservative.  Deans and chairs tend to stick to what's known, the traditional, even if most research, in the sciences as well as humanities, mainly collects dust in library archive annexes. Students go to the library to work on computers, online data bases, and the like.

It's not just that most research will quickly be dust-collecting.  The idea of peer review is way over-rated as a way to purge the bad and only publish the good and important.  Peer review is creaking under the weight of its hoary insider tradition, and because reviewers are so overloaded that they rarely can give proper scrutiny.  Overloaded 'supplemental information' doesn't help, nor does the need to review grant proposals (or write them).  Time is short.

It's true that traditional publishing of research in peer-reviewed journals, even burnished with online (pay as you go) open-access routes, still has first priority in administrators' eyes.  But things are changing, as they should, must, and will.  Online publishing also has online open reviewing, and comments by readers.  There may be far too many journals, but weird ideas do have a chance to be seen, and online searching makes them available.  Much is junk, of course, but at least you, not a panel of insider reviewers, get to judge.

It's a different kind of arena, and recalcitrant institutions will have to modernize.  Some faculty we know (including our own, fantastic Holly Dunsworth) have successfully, and deservedly, achieved tenure with public media being a substantial part of their records.  As they age into administrative roles, the changing landscape will be built into their world-views.  That, too, will mature and perhaps become stodgy, to be replaced or supplemented by whatever the future holds.  But it's likely to be much more dynamic and flexible than the legacy of the past that we still have to live with too much today.

As open-source and online media increase their fraction of publication, we will likely become a more widely integrated and aware society.  The local classroom is opening up to a global forum, where anyone, not just the elite few, can gather round, and hear whichever oracle or orator they choose.

Homer would probably recognize the phenomenon.

Tuesday, January 12, 2016

Cancer--luck or environment? Part II: Nothing to food-fight over

Yesterday we commented on the 'controversy' over whether cancer is mainly due to environmentally (lifestyle) or inherently (randomly) arising mutations.  This is a tempest in a teaspoon.

Mutations, whatever their individual cause, must accumulate among dividing cells until one cell has the bad luck to accumulate a set of changes that 'transforms' it into a misbehaving cancer cell.  The set of changes varies even among tumors of the same organ, because many different genes and their expression-regulation contribute to the growth, or restraint of growth, even within the same tissue. That is, not all breast, colon, or lung cancers are caused by the same set of mutations.   The transformed cell then proliferates, rapidly dividing and thus indubitably acquiring more mutational changes that enable its descendants to do things like metastasize to other parts of the body, or develop resistance to drug treatment.  The more rapidly the tumor grows and spreads, the more rapidly such things can happen.

Even if the first transformational cause were due entirely to environmentally-induced mutations, the real dangers that ensue during the tumor's lifespan are relatively rapid additions to the original tumorigenesis process, and so in a sense the main dangers of cancer are primarily, if not nearly exclusively, due to inherent mutation among cancer cells.  If you get lung cancer and then stop smoking, your lung cancer will still evolve. Indeed, if environment contributes, it may make things worse--if that "environment" is radiation or chemotherapy: radiation definitely causes mutations, and chemotherapy weeds out cells that haven't experienced resistance mutations, leaving or even making room for tumor lineage cells that do have resistance mutations.  Finally, things that stimulate cell division can facilitate new mutations or even just make a tumor spread more rapidly.

So clearly cancer is not all due to environmental, nor to inherently occurring changes.  These and other factors comprise multiple, interacting causative effects.  Attributing cause to environment or inherency is misleading.  But what if cancer were in fact even entirely due to lifestyle factors that stimulate cell division or directly cause mutation?  Of course this would be very good for the Big Data epidemiologists and their studies, and threatening to industries and so on that produce mutagenic waste or products etc.  But suppose epidemiologists were to continue to find major carcinogenic environmental factors (that is, that the major ones, like smoking, aren't already known). Let us further suppose that avoidance behavior were to follow the announcement of the risks (not an obvious thing to assume, actually; the tobacco industry is still thriving, after all).  Then what?

Epidemiologists would say their work has prevented cancer and would claim victory over the to-them strange idea that cancer is due to inherent mistakes in DNA replication and is inevitable if one were to live long enough.  A lifestyle-change-based reduction in cancer would be clearly a very good thing.  But in fact, it would not be an unalloyed victory: one thing it would do is keep the non-exposed person alive (because s/he didn't get cancer!) and that in turn means that s/he would be at higher risk of (1) other age-related deteriorative diseases that dying of cancer would have precluded, many of which are waiting in the wings at ages when cancers arise, and (2) eventually getting cancer at some older age.  In the first case, the rates of other diseases like stroke and diabetes etc. would necessarily go up.  The risk of slowly petering out in increasingly bad shape in an intensive nursing unit would go up.  That would, of course, lower the lifetime cancer risk, but not in a very pleasant way.

In other words, lifestyle changes can delay cancer, but whether they would decrease the lifetime risk of cancer is an open question (even assuming that the per-year exposure to environmental mutagens were reduced, the consequently longer exposure to those mutagens might mean their lifetime total would go up).  However, what this would do, by removing environmental causes, would be to raise the fraction of cancers that are due to inherent mutation, strengthening the fraction of Vogelstein-Tomasetti cases!
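The point about the rising fraction is simple arithmetic, sketched below under a deliberately crude assumption: that environmental and inherent causes contribute separate, additive case counts. Real causation is interacting rather than additive, so this is illustration only, and the case counts and the function name are invented, not data.

```python
def inherent_fraction(environmental_cases, inherent_cases):
    """Share of all cases attributed to inherent (replication) mutation."""
    return inherent_cases / (environmental_cases + inherent_cases)


# Hypothetical counts per 100,000 people; not real data.
before = inherent_fraction(environmental_cases=60, inherent_cases=40)
after = inherent_fraction(environmental_cases=20, inherent_cases=40)  # exposures removed
print(before, after)
```

The number of inherent cases is the same in both lines; only the denominator shrinks, so the 'inherent' share goes up even though nothing about inherent mutation has changed.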

It's undoubtedly good to get cancer later rather than earlier in life, but not an unalloyed good.  In any case, what these points show is that the argument over the particular fraction of cancers that are due to environment vs inherent mutation is rather needless.  At most it might be relevant to ask how much of the funding invested in big epidemiological studies is going to pay off, compared with spending on some other, clearer issues (especially if the major environmental mutagens are already known).  There have already been scads of massive long-term studies of almost anything you can name, to identify carcinogenic exposures.  With some very important exceptions that are by now well known, these studies have largely come up empty, or with now-it-is/now-it-isn't conclusions, in the sense that risk factors are either weak or, if strong, are rare and hard to find embedded in the broad mix of chronic disease risk factors.  Environments are always changing, with new possible carcinogenic exposures arising, but basically those with strong effects usually show up on their own, such as through multiple cases of a particular cancer type in some specific location or among workers in a particular industry, or through in vitro mutagenesis studies and the like.

If causation is too generic, don't get your hopes up
If comparisons among countries, for example, show that the same cancer can have very different age patterns or incidence rates, this may suggest lifestyles as major risk differences. But that's far from saying that the causal elements are individually strong or simple enough to be enumerated by the usual Big Study epidemiological approach. One can be extremely doubtful that this would be the case.

Saying something is 'environmental' because, for example, it varies among populations is like saying something is 'genetic' because it varies among relatives.  If, as with genetic factors documented by countless GWAS studies, there are many different, correlated or even independent contributors, then each person's cancer will be due to a different set or complex set of experiences and the luck of the mutational draw.  As with GWAS and related approaches, it is far from clear that large, long-term environmental studies, more mega than we've already had for decades, will be the appropriate way to approach the problem.

Indeed, to a considerable extent, if each case is causally unique, by some different combination of factors and their respective strengths in that individual, then it's epistemologically not very different from saying that cancer occurs randomly, which, though for a different sort of reason, is what V and T said.  There won't, for example, be a specific environmental change you can make, any more than a specific gene you can re-engineer, to make the disease go away or even to change much in frequency or age of onset.

Food fights like this one are normal in science and often have to do with egos, investment in one 'paradigm' or another, and how research is supported or how advice from experts is conveyed to the public.  But such disputes, though very human, are rather off the point.  We often basically ignore risks we know, as in the proliferation of CT and other radiation-based scanning and medical testing which can be carcinogenic.  Life is mutagenic, one way or another.  So while you have life, enjoy your food--don't waste it by throwing it at each other!  There are better questions to argue about.

Monday, January 11, 2016

Food-Fight Alert!! Is cancer bad luck or environment? Part I: the basic issues

Not long ago Vogelstein and Tomasetti stirred the pot by suggesting that most cancer was due to the bad luck of inherent mutational events in cell duplication, rather than to exposure to environmental agents.  We wrote a pair of posts on this at the time. Of course, we know that many environmental factors, such as ionizing radiation and smoking, contribute causally to cancer because (1) they are known mutagens, and (2) there are dose or exposure relationships with subsequent cancer incidence. However, most known or suspected environmental exposures do not change cancer risk very much or if they do it is difficult to estimate or even prove the effect.  For the purposes of this post we'll simplify things and assume that what transforms normal cells into cancer cells is genetic mutations; though causation isn't always so straightforward, that won't change our basic storyline here.

Vogelstein and Tomasetti upset the environmental epidemiologists' apple cart by using some statistical analysis of cancer risks related, essentially, to the number of cells at risk, their normal time of renewal by cell division, and age (time as correlated with number of cell divisions).  Again simplifying, the number of at-risk actively dividing cells is correlated with the risk of cancer, as a function of age (reflecting time for cell mutational events), and with a couple of major exceptions like smoking, this result did not require including data on exposure to known mutagens.  V and T suggested that the inherently imperfect process of DNA replication in cell division could, in itself, account for the age- and tissue-specific patterns of cancer.  V and T estimated that except for the clear cases like smoking, a large fraction of cancers were not 'environmental' in the primary causal sense, but were just due, as they said, to bad luck: the wrong set of mutations occurring in some line of body cells due to inherent mutation when DNA is copied before cell division, and not detected or corrected by the cell.  Their point was that, excepting some clear-cut environmental risks such as ionizing and ultraviolet radiation and smoking, cancer can't be prevented by life-style changes, because its occurrence is largely due to the inherent mutations arising from imperfect DNA replication.

Boy, did this cause a stink among environmental epidemiologists!  Now one factor in this food fight that we think is undeniable is that environmental epidemiologists and the schools of public health that support them (or, more accurately, that the epidemiologists support with their grants) would be put out of business if their very long, very large, and very expensive studies of environmental risk (and the huge percent of additional overhead that pays the schools' members' meal-tickets) were undercut--that is, not funded, with the money going elsewhere.  There is also a sense of lost pride, which is always a factor in science because it's run by humans: all that epidemiological work would go to waste, to the chagrin of many, if it were based on misunderstanding the basic nature of the mutagenic and hence carcinogenic processes.

So naturally the V and T explanation has been heavily criticized from within the industry.  But the critics also raise the point, and it's a valid one, that we clearly are exposed to many different agents and chemicals that are the result of our culture, are not inevitable, and are known to cause mutations in cell culture, and these certainly must contribute to cancer risk.  The environmentalists naturally want the bulk of causation to be due to such lifestyle factors because (1) they do exist, and (2) they are preventable at least in principle.  They don't in principle object to the reality that inherent mutations do arise and can contribute to cancer risk, but they assert that most cancer is due to bad behavior rather than bad luck and hence we should concentrate on changing our behavior.

Now in response, a paper in Nature ("Substantial contribution of extrinsic risk factors to cancer development," Wu et al.) provides a statistical analysis of cancer data that is a rebuttal to V and T's assertions.  The authors present various arguments to rebut V and T's assertion that most cancer can be attributed to inherent mutation, and argue instead that external factors account for 70 to 90% of risk.  So there!

In fact, these are a variety of technical arguments, and you can judge which seem more persuasive (many blog and other commentaries are also available as this question hits home to important issues--including vested interests).  But nobody can credibly deny that both environment and inherent DNA replication errors are involved.  DNA replication is demonstrably subject to uncorrected mutational change, and that (for example) is what has largely driven evolution--unless epidemiologists want to argue that for all species in history, lifestyle factors were the major mutagens, which is plausible but very hard to prove in any credible sense.  

At the same time, environmental agents do include mutational effects of various sorts and higher doses generally mean more mutations and higher risk.  So the gist of the legitimate argument (besides professional pride or territoriality and preservation of public health's mega-studies) is really the relative importance of environment vs inherent processes.  The territoriality component of this is reminiscent of the angry assertion among geneticists, about 30 years ago, that environmental epidemiologists and their very expensive studies were soaking up all the money so geneticists couldn't get much of it.  That is one reason geneticists were so delighted when cheap genome sequencing and genetic epidemiological studies (like GWAS) came along, promising to solve problems that environmental epidemiology wasn't answering--to show that it's all in the genes (and so that's where the funding should go).  

But back to basic biology 
Cells in each of our tissues have their own life history.  Many or most tissues include specialized stem cells that divide, with one of the daughter cells differentiating into a mature cell of that tissue type.  This is how, for example, the actively secreting or absorbing cells in the gut are produced and replaced during life.  Various circumstances, both inherent and environmentally derived, can affect the rate of such cell division.  Stimulating division is not the same as being a direct mutagen, but there is confounding because more cell division means more inherent mutational accumulation.  That is, an environmental component can increase risk without being a mutagen, in which case the mutation itself is due to inherent DNA replication error.  Cell division rates among our different tissues vary quite a lot, as some tissues are continually renewing during life, others less so, some renew under specific circumstances (e.g., pregnancy or hormonal cycles), and so on.
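The confounding between mitogens and mutagens can be put in toy arithmetic.  The sketch below is only an illustration under stated assumptions: the function `expected_mutations`, the division rates, and the error rates are all hypothetical, chosen so the sums are easy to follow.

```python
def expected_mutations(divisions_per_year, error_rate_per_division, years,
                       induced_per_year=0.0):
    """Expected mutations in one cell lineage under a deliberately simple model.

    replication: errors made when DNA is copied (divisions x error rate x time).
    induced: mutations caused directly by a mutagen, independent of division.
    All inputs are hypothetical illustration values.
    """
    replication = divisions_per_year * error_rate_per_division * years
    induced = induced_per_year * years
    return {"replication": replication, "induced": induced,
            "total": replication + induced}


baseline = expected_mutations(10, 1e-3, 50)
# An exposure that only stimulates division (a mitogen): twice the divisions.
mitogen = expected_mutations(20, 1e-3, 50)
# An exposure that damages DNA directly (a mutagen): same divisions as baseline.
mutagen = expected_mutations(10, 1e-3, 50, induced_per_year=0.01)

for label, result in (("baseline", baseline), ("mitogen", mitogen), ("mutagen", mutagen)):
    print(label, result)
```

In this made-up case the two exposures add the same number of expected mutations, yet under the mitogen every one of them is an 'inherent' replication error.  That is why a tally of mutations alone cannot separate environmental from inherent causation.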

As we age, cell divisions slow down, also in patterned ways.  So mutations will accumulate more slowly and they may be less likely to cause an affected cell to divide rapidly.  After menopause, breast cells slow or stop dividing.  Other cells, as in the gut or other organs, may still divide, but less often.  Since mutation, whether caused by bad luck or by mutagenic agents, affects cells when they divide and copy their DNA, mutation rates and hence cancer rates often slow with advancing age.  So the rate of cancer incidence is age-specific as well as related to the size of organs and to lifestyle stimulants of growth or mutation.  These are, at least, general characteristics of cancer epidemiology.

It would be very surprising if there were no age-related aspect to cancer (as there is with most degenerative disease).  The absolute risk might diminish with lower exposure to environmental mutagens or mitogens, but the replicability and international consistency of basic patterns suggest an inherent cytological etiology.  It does not, of course, in any sense rule out environmental factors working in concert with normal tissue activity, so that as noted above it's not easy to isolate environment from inherent causes.

Wu et al.'s analysis makes many assumptions, the data (on exposures and cell-counts) are suspect in many ways, and it is difficult to accept that any particular analysis is definitive.  And in any case, since both types of causation are clearly at work, where is the importance of the particular percentages of risk due to each?  Clearly strong avoidable risks should be avoided, but just as clearly we should not chase down every minuscule risk or complex unavoidable lifestyle aspect, when we know inherent mutations arise and we have a lot of important diseases to try to understand better, not just cancer.

Given this, and without discussing the fine points of the statistical arguments, the obvious bottom line that both camps agree on is that both inherent and environmental mutagenic factors contribute to cancer risk. However, having summarized these points generally, we would like to make a more subtle point about this, that in a sense shows how senseless the argument is (except for the money that's at stake). As we've noted before, if you take into account the age-dependency of risk of diseases of this sort, and the competing causes that are there to take us away, both sides in this food fight come away with egg on their face.  We'll explain what we mean, tomorrow.

Wednesday, December 30, 2015

Wee willies that no longer respond to a warm globe

               "If the idea spreads that pollution is affecting not just whales but also willies, 
                   I think we might witness sudden conversions to environmentalism."**

Environmental effects of human activity remain controversial, particularly because of the active opposition of conservative political groups.  We've been slow to halt or reverse air pollution, and we're dragging our feet on climate change--the two are of course connected because of the burning of fossil fuels.  Because it has economic implications, the problem clearly mixes science and social politics.  But if some consequences of climate change were to be more in hand, and thus very clear, the otherwise rather vague idea might, by being brought so closely home, be a springboard to corrective action.  A recent report** alerted us to a consequence of environmental degradation that, if more widely known, might help reduce the controversy, and get everyone moving toward the same goal, reversing the damage.

Global warming has many consequences, but most of them are rather general, gradual and of only ambiguous long-term implications.  Thus, perhaps no implication can hit home more poignantly and persuasively than one that directly impacts our most intimate personal lives.  As background to this report, we know that carbon (denoted C) and oxygen (O) in molecular combination are greenhouse gases that are accumulating in our atmosphere, arguably at least in major part because of human combustion of fossil fuels.

Briefly, excess carbon dioxide (CO2) in the upper atmosphere, and carbon monoxide (CO) lower down have been steadily increasing and are widely believed to have caused many changes in both natural and human ecology.  But the data are so complex, involving sensitive measurement challenges of many different global factors, that they are rather hard to get one's head around.  The result has enabled opponents of environmental action to dispute whether climate change is real, and even if so, whether it has done more than cause things like occasional smog alerts in Beijing and overall mean global temperature to increase, with consequent glacial melting.  The skeptical opposing view is that all that the data are suggesting, at most, is temporary natural variation in the normal earthly ecology.

Those who resist the climate scientists' idea that we need to change our behavior to prevent further damage, or who may even think the whole idea is some sort of plot by Democrats (or worse, tree-huggers), do not react to these climate changes with alarm. Even if they believe the data, and indeed even if they say, for the purposes of argument, that climate change is caused by humans, they simply point out that people have always had to deal with one sort of crisis or another, often even ecological ones like the decline of Mayan or Mesopotamian or Indus Valley civilizations.  It's the natural course of events, whose personal consequences for us are hardly experienced.  As for future human generations, the argument is, they will just have to adapt, as humans have always done, even if that means major dislocation, food or resource wars, or societal disruption.  We ourselves should not be asked to give up our quality of life to give future generations a kind of free pass that neither we nor our forebears ever had.  Future quality of life may be different from ours, but people will recover or adjust in their own way.

However, the new report** hits right at the heart of the latter assumption, the very notion of future generations!

COnic section (dotted line); schematic
'Does size matter?' is no longer just a joke!
The new data differ from the existing climate reports because they finally show an important effect of ecological damage.  It is a rather sensitive or awkward finding to discuss and perhaps has received less publicity as a result.  But the fact is that there is a strong correlation between reduction in penis size and global warming.  We find that this is based on COnic section samples from the organ at birth (shown in a hopefully respectful schematic way in the above figure).  These appear to be hard data, not just whimsical speculation with a political agenda in mind.

The second figure, below, shows the relationship over the past 50 years, the period for which there are reliable data.  Clearly, as air pollution levels rose, the cross-section size declined.  The smaller size may be a direct ecological effect on the quality of life or, indeed, on the very future of humanity.  Whereas in the past, climate change data were rather abstract, the new data hit so close to home that one would expect sensitive people finally to stand up and take note.  Even scientists care about such things in a way that goes beyond the impersonal nature of Big Data spewing from their computers.  "These wee willies just give me the willies!" one investigator said as part of our inquiry.


Pollution (black) vs organ size (red), 1957-2013

Genital size is relatively easy to measure, a single, simple indicator that does not require the expensive instrumentation, not to mention the computer modeling, that is required to analyze more general figures on global warming.  While scientists are careful to caution that it is very difficult to claim that global warming is causally responsible for the observed organic change, the clarity of the data suggests that one can at least hope that some parts of our society will try to rise to the challenge.  Even if one dismisses the association as not clearly being a directly causal one because, for example, it is ultimately due to a correlation with some unmeasured factor(s), a reduction in global warming could also reduce those intermediate influences, and thus halt the observed trend.

Many effects of ecological change have been dismissed as fads, falsely reported 'trends', or even faked evidence, a kind of self-supporting conspiracy for funding and attention among the climate and ecological scientists, who urge their view as a political tactic to rankle their politically conservative opponents.  Communication on the subject has become so angry that in our society today the one hand doesn't really know what the other hand is up to.

The importance of the new data is that even after many years of steady stories on climate change and its implications, as a society in general, we seem not to have been able to take abstract facts, like sea-level height or the bushel yield of wheat, seriously, because they don't appeal to our deepest or more immediate emotions.  We must acknowledge even as we urge respect for the sciences that face complex analytic data problems, that not all the abstract science in the world can change that.  But wee willies may entail the emotions that one needs to reach, if widespread political change is to be hoped for.

This new report gets to the nub of the effects on human behavior of our wanton destruction of the environment.  But, sensitive though the subject is, or perhaps because of that, we may finally have the kind of hard-hitting report required to shake the complacency and stir up a call to action.


**This post reflects my reading of original reporting by U. Eco, as published in the recent issue of Numero Zero, which is widely available.  Interpretations are of course my own.

Wednesday, December 23, 2015

Good Tidings from Jolly Ol' St Nickase!

Here are some good tidings for the season!


I SAW THREE SNPS

I saw three SNPs come sequenced in
On CRISPR day, on CRISPR day
Yes all three SNPs came sequenced in
On CRISPR day in the morning
On CRISPR day in the morning
And what was with those SNPs all three
On CRISPR day, on CRISPR day?
And what was with those SNPs all three
On CRISPR day in the morning?
On CRISPR day in the morning
The TracrRNAs were there
On CRISPR day, on CRISPR day
The Spacers and Repeats were there
On CRISPR day in the morning
And HDR the changes made
On CRISPR day, on CRISPR day
Oh, with all the target changes made!
On CRISPR day in the morning

Then let all the lab rejoice again
On CRISPR day, on CRISPR day
Then let all the lab rejoice again
On CRISPR day in the morning!


GOOD NEWS,  YE MERRY GENTLEMEN

Good news, ye merry gentlemen,
Now nothing you’ll dismay,
Remember that our Savior
Was born on CRISPR-day
To save poor souls from Mutant’s power,
Which long had gone awry.
And it is tidings of comfort and joy.

In genes that were our father's
The blessed changes came
Unto some certain Cas9 kit,
With tidings of the same;
That he was born in perfect health
The Son-of-CRISPR, named.   
Oh! tidings of comfort and joy, etc.