Thursday, June 11, 2015

Occasionality, probability, ..., and grantsmanship

In a previous post some time ago, we used the term occasionality to refer to events or outcomes that arise occasionally, but are not the result of the kinds of replicable phenomena around which the physical sciences developed and for which probability concepts and statistical inference are constructed.  Here, we want to extend the idea as it relates to research funding.

A kind of physics envy has long been recognized among biologists, who wish for a precise, rigorous theory of life to match theories of motion, atomic chemistry, and the like. But we argue that we don't yet have such a theory; or, perhaps, the theory of evolution and genetics that we do have, which is in a sense already a theory of occasionality, is close to the truth.

Instead of an occasionality approach, assumptions of repeatability are used to describe life and to justify the kinds of research being done, when a core part of our science is that evolution, which generates genomic function, largely works by generating diversity and difference rather than replication.  Since individual genetic elements are transmitted and can have some frequency in the population, there is also some degree of repetition at the nucleotide level, even if no two genomes, that is, the rest of a given element's genomic environmental context, are entirely alike.  The net result is a spectrum of causal strength or regularity.  Because many factors contribute, the distribution of properties in samples or populations may be well-behaved, that is, may look quite orderly, even if the underlying causal spectrum is one of occasionality rather than probability.
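As a rough sketch of that last point (our own illustration, with made-up numbers, not data from any study): suppose each individual's trait value arises from its own, essentially unique, combination of many weak factors.  The resulting population distribution can still look perfectly smooth and 'statistical', even though no two individuals are causal repeats of one another.

```python
# Hypothetical sketch: unique causal combinations, orderly-looking distribution.
# All quantities (pool size, effect sizes, factors per individual) are
# illustrative assumptions, not estimates of anything real.
import numpy as np

rng = np.random.default_rng(1)
n_individuals = 10_000
pool_size = 500
pool_effects = rng.normal(0.0, 0.5, pool_size)   # each potential factor's weak effect

traits = np.empty(n_individuals)
for i in range(n_individuals):
    k = rng.integers(5, 50)                               # how many factors act in this individual
    which = rng.choice(pool_size, size=k, replace=False)  # a combination essentially unique to them
    traits[i] = pool_effects[which].sum() + rng.normal(0.0, 0.5)  # plus idiosyncratic noise

# The trait distribution looks smooth and bell-shaped despite the lack of
# repeated causation underneath.
print(round(traits.mean(), 2), round(traits.std(), 2))
```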

Strongly causal factors, like individual variants in a particular gene, are those whose effects are usually manifest whenever the factor occurs; such a factor generates repeatability.  It, and analysis of it, fit standard statistical concepts that rely on, indeed are built upon, the idea of repeatable causation with fixed parameters. But that is a deception whose practice weaves the proverbial tangled web over deeper realities.  More often, and more realistically, each occurrence of an 'occasional' event arises from an essentially unique combination of causal factors.  The event may arise frequently, but the instances are not really repeats at the causal level.

This issue is built into daily science in various, sometimes subtle, ways.  For example, it appears as a fundamental factor in research funding.  To get a grant, you have to specify the sample you will collect (whether by observational sampling, experimental replicates, etc.), and you usually must show with some sort of 'power' calculation that if an effect you specify as being important is taking place, you'll have a good chance of finding it with the study design you're proposing.  But making power computations has become an industry in itself; that is, there are standard software packages and standard formulas for doing such computations.  They are, candid people quietly acknowledge, usually based on fictitiously favorable conditions in which the causal landscape is routinely oversimplified, the strength of the hypothesized causal factors exaggerated, and so on.
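To make the point concrete, here is a minimal sketch of the routine kind of sample-size calculation involved, using the standard normal-approximation formula for comparing two group means.  The effect sizes plugged in are our own illustrative assumptions; the point is how sensitive the answer is to the assumed strength of the causal factor.

```python
# Sketch of a routine power/sample-size calculation (normal approximation,
# two-sample comparison of means).  Effect sizes below are illustrative only.
from scipy.stats import norm

def n_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate n per group, where effect_size is the standardized
    mean difference (Cohen's d)."""
    z_alpha = norm.ppf(1 - alpha / 2)
    z_beta = norm.ppf(power)
    return 2 * (z_alpha + z_beta) ** 2 / effect_size ** 2

# An optimistically large assumed effect makes the required sample look modest;
# a weaker, perhaps more realistic, effect demands far more.
print(round(n_per_group(0.5)))   # ~63 per group
print(round(n_per_group(0.1)))   # ~1570 per group
```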

Power calculations and their like rest on axioms or assumptions of replicability, which is why they can be expressed in terms of probability, from which power and significance types of analysis are derived.  Hence study designs, and the decisions granting agencies make, often if not typically rest on simplifications we know very well are not accurate, not usually close to what the evidence suggests are realistic truths, and that are based on untested assumptions, such as probability rather than occasionality.  Indeed, much of 'omics research today is 'hypothesis free', in that the investigator can avoid having to, or perhaps is not allowed to, specify any specific causal hypothesis beyond something safely vague like 'genes are involved and I'm going to find them'.  But how is this tested?  With probabilistic 'significance' or conceptually similar testing of various kinds, justified by some variant of 'power' computations.
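A hedged illustration of what such 'hypothesis-free' testing can yield: scan a large number of genomic features in which, by construction, there is no true effect at all, and nominal 'hits' appear anyway.  The feature count, sample sizes, and threshold below are illustrative assumptions.

```python
# Simulated 'hypothesis-free' scan: many features, no true effects, and yet
# plenty of nominally 'significant' findings.  All numbers are illustrative.
import numpy as np
from scipy.stats import ttest_ind

rng = np.random.default_rng(7)
n_features, n_per_group = 20_000, 50

cases    = rng.normal(0.0, 1.0, size=(n_features, n_per_group))   # no real group difference
controls = rng.normal(0.0, 1.0, size=(n_features, n_per_group))

p_values = ttest_ind(cases, controls, axis=1).pvalue
print((p_values < 0.05).sum())   # roughly 1,000 features pass p < 0.05, all false positives
```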

If you are too speculative, you simply don't get funded.
Power computations often are constructed to fit available data or what investigators think can be done within fundable cost limits.  This is strategy, not science, and everybody knows it.  Nowhere near the promised fraction of successes occurs, except in the sense that authors can always find at least something in their data that they can assert shows a successful result.  The fact that essentially fabulous power calculations are accepted is also one reason that really innovative proposals are rarely funded, despite the agencies' expressed intentions to fund real science: power computations are hard to do for something that's innovative, because you don't know what the sampling or causal basis of your idea is.  But the routine ones described above are safe.  That's why it's hard to provide that kind of justification for something really different--and, to be fair, it makes it hard to tell when something really different is really, well, whacko.

A rigorous kind of funding environment might say that you must present something at least reasonably realistic in your proposed study, including open acknowledgment of causal complexity or weakness.  But our environment leads the petitioning sheep to huddle together in the safety of appearances rather than substance.  If this is the environment in which people must work, can you blame them?

Well, one might say, we just need to tighten up the standards for grants, and not fund weak grant proposals.  It is true that the oversubscribed system does often ruthlessly cut out proposals that reviewers can find any excuse to remove from consideration, if for no other reason than massive work overload.  Things that don't pass the oversimplified but requisite kinds of 'power' or related computation can easily be dropped from consideration.  But the routine masquerade of occasionality as if it were probability is not generally a criterion for turning down a proposal.

What is done, to some extent at least, is to consider proposals that are not outright rejectable, and score them based on their relative quality as seen by the review panel.  One might say that this is the proper way to do things: reject those with obvious flaws (relative to current judgment criteria), but then rank the remaining proposals, so that those with, say, weaker power (given the assumptions of probability) are just not ranked as high as those with bigger samples or whatever.

But this doesn't serve us that well, either.  That's because of the way bureaucracies work: administrators' careers depend on getting more funding each year, or at least on keeping the portfolio they have.  That means that proposals will always be funded from the top-ranked score downward until the money runs out.  This guarantees that non-innovative ideas will be funded if there aren't enough strong ideas. And it's part of the reason we see stories based on weak (sometimes ludicrously weak) studies blared across the news almost every single day.

We have a government-university-research complex that must be fed.  We let it grow to become that way.  Given what we've crafted, one cannot really push hard enough to get deeply insightful work funded and yet stop paying for run-of-the-mill work; political budget-protection is also why a great many studies of large and costly scale simply will not be stopped.  This is not restricted to genetics. Or to science.  It's the same sort of process by which big banks or auto companies get bailed out.

How novel might it be if it were announced that only really innovative or more deeply powerful grants were going to be funded, and that institute grant budgets wouldn't be spent otherwise!  They'd be saved and rolled over until truly creative projects were proposed.  In a way, that's how it would be if industry had, once again, to fund its own research rather than farm it out to the public to pay for via university labs.

For those types of research that require major databases, such as DNA sequence and medical data (e.g., to help set up a truly nationwide single medical-records system and avoid various costs and inefficiencies), the government could obligate funds to an agency, like NCBI or CDC or others that currently exist, to collect and maintain the data.  Then, without the burden of collecting the data, university investigators with better ideas, or even ideas about more routine analysis, would only have to be supported for the analysis.

History has basically shown that Big Data won't yield the really innovative leaps we all wish for; they have to come from Big Ideas, and those may not require the Big Expense that is, to a great extent, what is driving the system now, a system in which, regardless of how big your ideas are, if you only have small budgets you won't also have tenure. That is a major structural reason why people want to propose big projects even if important, focused questions could be answered by small projects: you have to please your Dean, and s/he is judged by the bottom line of his/her faculty.  We've set this system up over the years, but few as yet seem ready to fight it.

Of course this will never happen!
We know that not spending all available resources is naive even to suggest.  It won't happen.  First, on the negative side, we have peer review, and peers hesitate to give weak scores to their peers if it would mean loss of funding overall. If for no other reason (and there is some of this already), panel members know that the tables will be turned in the future and their proposals will then be reviewed by the people they're reviewing now.  Insiders looking out for each other is to some extent an inherent part of the 'peer' review process, although tight times do mean that even senior investigators are not getting their every wish.

But secondly, we have far more people seeking funding than are being funded, or than there are funds for, and we have the well-documented way in which the established figures keep the funds largely locked up, so they can't go to younger, newer investigators.  The system we've had for decades had exponential growth in funding, and in the numbers of people being trained, built into it.  In the absence of a cap on funding amount or, better yet, on investigator age, the power pyramids will not be easy to dislodge (they never are).  And, one might say generically, the older the investigator, the less innovative and the more deeply and safely entrenched the ideas--such as probability-based criteria for things for which such criteria aren't apt--will be.  More than that, the powerful are the same ones inculcating their thoughts--and the grantsmanship they entrain--into the new up-and-coming who will constitute the system's future.

With the current inertial impediments, and the momentum of our conceptual world of probability rather than occasionality, science faces a slow evolutionary rather than a nimble future.

3 comments:

Anonymous said...

> With the current inertial impediments, and the momentum of our conceptual world
> of probability rather than occasionality, science faces a slow evolutionary
> rather than a nimble future.

To me it appears like Permian-Triassic extinction ahead for US science :)

http://en.wikipedia.org/wiki/Permian%E2%80%93Triassic_extinction_event

- Manoj

Ken Weiss said...

These things are part and parcel of complex societies, but acquiescence is the same as approval, so one must at least try to stir the pot of reform. Dinosaurs didn't have the internet to give warnings.....

Anonymous said...

Speaking of stirring the pot, this post and the following comments are worth taking a look, if you did not already -

https://liorpachter.wordpress.com/2015/05/26/pachters-p-value-prize/

Manoj