Suppose that most of the standing genetic variation in natural populations is neutral (a la Kimura and Ohta 1971). Here, the principal forces modulating neutral variation are drift and mutation, with drift removing variation at a rate 1/(2N) (where N is population size) and mutation adding variation at a rate 2Nu (where u is mutation rate to neutral alleles). Over time, an equilibrium state will be reached that balances these dynamics. At equilibrium, the heterozygosity, H-hat, in a population may be described by: H-hat = (4Nu)/(1+4Nu).
Notice that H-hat is extremely sensitive to population size, N. Indeed, if 4Nu is small, then one expects little to no heterozygosity and, conversely, if 4Nu is large, then one expects almost complete heterozygosity. Try it; you'll see.
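Here is one quick way to try it, as a minimal sketch. The mutation rate `u` below is a hypothetical value picked only for illustration; it is not taken from any data set. The function name is likewise my own.

```python
def neutral_heterozygosity(N, u):
    """Equilibrium heterozygosity under drift and mutation: 4Nu / (1 + 4Nu)."""
    theta = 4.0 * N * u
    return theta / (1.0 + theta)


u = 1e-6  # hypothetical neutral mutation rate, for illustration only

# Sweep population size over several orders of magnitude.
for N in (10**3, 10**4, 10**5, 10**6, 10**7, 10**8):
    H = neutral_heterozygosity(N, u)
    print(f"N = {N:>11,d}   4Nu = {4 * N * u:>8.3f}   H-hat = {H:.4f}")
```

With u held fixed, H-hat climbs from nearly zero to nearly one as N runs over these five orders of magnitude.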
Now suppose you have surveys of genic heterozygosity for a variety of natural populations and that they tell you that heterozygosity ranges from 5% to 18%. What happens? On the assumptions we started with, Nu will have to be pretty uniform across the species surveyed: inverting the formula gives 4Nu = H-hat/(1 - H-hat), so 4Nu must fall between about 0.05 and 0.22, only about a fourfold range. Try it; you'll see.
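To try this one, invert the equilibrium formula and plug in the two ends of the observed range. This is a small sketch; the function name is my own, and the only inputs are the 5% and 18% figures quoted above.

```python
def implied_theta(H):
    """Invert H = theta / (1 + theta) to recover theta = 4Nu = H / (1 - H)."""
    if not 0.0 <= H < 1.0:
        raise ValueError("heterozygosity must lie in [0, 1)")
    return H / (1.0 - H)


low = implied_theta(0.05)   # 4Nu implied by the least variable populations
high = implied_theta(0.18)  # 4Nu implied by the most variable populations
ratio = high / low

print(f"4Nu ranges from {low:.4f} to {high:.4f}")
print(f"so Nu can differ by at most a factor of about {ratio:.1f}")
```

Since mutation rates are not thought to differ wildly among these organisms, nearly all of that narrow range would have to be absorbed by N.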
Does this intersection between data and theory make you uncomfortable? It should. Richard Lewontin did the work in The Genetic Basis of Evolutionary Change in 1974. His data concerned a number of populations including horseshoe crabs, Drosophila, mice, and humans. And he thought the consequences of reconciling the data with the theory were absurd. Rightly so! Surely we know that the population sizes of horseshoe crabs, Drosophila, mice, and humans vary pretty widely.
This paradox is one of several that Lewontin lumps under the name "paradox of variation." And while we now know that the data he uses to drive our specific problem, as well as the others, are not exactly representative of the major taxonomic groups, more recent data don't make the "population size paradox" go away (Gillespie 1991). Of the population size paradox, Lewontin said:
"The only escape [from the population size paradox] would be to show somehow that the stochastic theory of random drift was incomplete in an important way and that in a correct theory the predicted heterozygosity would be independent of population size." Lewontin (1974, p. 210).
John Gillespie (2000a, 2000b, 2001) offers a solution that is exactly the escape for which Lewontin asks. (Gillespie himself doesn't say so. But history does.) Gillespie's solution involves the proposal of a new stochastic evolutionary process called "genetic draft." Draft produces drift-like dynamics in that it removes genetic variation as drift does. But draft does it independently of population size. And in so doing, it can remove genetic variation at a rate faster than drift can. Where drift's ability to remove variation decreases as population size increases, draft's ability to do the same has no such dependency. Consequently, as population size increases, the effect of draft on genetic variation swamps the effect of drift.
Actually, draft is a form of linked selection, or hitchhiking. Now, John Maynard Smith and John Haigh offered hitchhiking as a solution to Lewontin's population size paradox in 1974. Indeed, on their deterministic model, hitchhiking removes genetic variation faster than drift does as population size increases. But Maynard Smith and Haigh's solution languished because it depended on the apparently problematic assumption that populations have high linkage disequilibrium. By the end of the 1980s, some relevant supporting data turned up (Gillespie 2004). And Gillespie got interested. His exploration of draft is a re-investigation of hitchhiking using a stochastic model (with random variables for the timing of a hitchhiking event and the probability that the neutral allele is linked to a selectively advantageous mutation).
As I said, draft (purportedly) solves the population size paradox by removing genetic variation faster than drift does. The analogue to the measure of H-hat above for the balance of drift, mutation, and draft is H-hat* = (4Nu)/(1 + 4Nu + 2N rho E{y^2}), where the rate of substitution rho = 4Nus, s is the selection coefficient, and y is the final frequency of the hitchhiking allele. (Setting rho = 0 recovers the drift-mutation formula above.) Draft will be more significant than drift when the rate at which draft decreases heterozygosity in a population is greater than the rate at which drift does the same, i.e., when rho E{y^2} > 1/(2N). This happens when N gets sufficiently large so that draft dominates drift, i.e., when N > 10^4. The pressing problem is understanding where the rate of substitution comes from. This is no easy problem.
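To see how draft cuts heterozygosity loose from population size, here is a rough sketch. It assumes the denominator carries the neutral 4Nu term alongside the draft term, so that rho = 0 gives back the drift-mutation result. It also treats rho as a free parameter rather than deriving it from N, u, and s. The values of `u`, `rho`, and `Ey2` are hypothetical, chosen only so the numbers are easy to read, and the function names are mine.

```python
def heterozygosity_with_draft(N, u, rho, Ey2):
    """H-hat* = 4Nu / (1 + 4Nu + 2*N*rho*E{y^2}); rho = 0 is the neutral case."""
    theta = 4.0 * N * u
    return theta / (1.0 + theta + 2.0 * N * rho * Ey2)


def draft_dominates(N, rho, Ey2):
    """True when draft removes variation faster than drift: rho*E{y^2} > 1/(2N)."""
    return rho * Ey2 > 1.0 / (2.0 * N)


# Hypothetical parameter values, for illustration only.
u = 1e-6     # neutral mutation rate
rho = 1e-4   # rate of hitchhiking events (substitutions at the linked locus)
Ey2 = 0.1    # E{y^2}, second moment of the post-hitchhiking frequency y

for N in (10**3, 10**4, 10**5, 10**6, 10**7, 10**8, 10**9):
    neutral = heterozygosity_with_draft(N, u, 0.0, Ey2)
    drafted = heterozygosity_with_draft(N, u, rho, Ey2)
    print(f"N = {N:>13,d}   drift only: {neutral:.4f}   "
          f"with draft: {drafted:.4f}   draft dominates: {draft_dominates(N, rho, Ey2)}")

# As N grows, H-hat* approaches 2u / (2u + rho*E{y^2}), which does not involve N.
limit = 2.0 * u / (2.0 * u + rho * Ey2)
print(f"large-N limit with draft: {limit:.4f}")
```

Under these made-up values the drift-only column keeps climbing toward one, while the draft column levels off at a plateau set by u, rho, and E{y^2} alone.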
I find this stuff utterly remarkable. I encourage reading Gillespie's papers, which are very difficult but also very rewarding. Presently, I'm working on a paper that traces hitchhiking as a solution to the population size paradox from Maynard Smith and Haigh to Gillespie. I introduced genetic draft to philosophers of biology in a symposium called "Four Case Studies of Chance in Evolution" at the 2004 PSA meetings. The talk was ... confusing. But the resulting paper, in press at Philosophy of Science, I think does a good job of articulating the differences between drift and draft.
References
Gillespie, J. H. (1991), The Causes of Molecular Evolution. New York: Oxford University Press.
Gillespie, J. H. (2000a), "Genetic Drift in an Infinite Population: The Pseudohitchhiking Model", Genetics 155: 909-919.
Gillespie, J. H. (2000b), "The Neutral Theory in an Infinite Population", Gene 261: 11-18.
Gillespie, J. H. (2001), "Is the Population Size of a Species Relevant to its Evolution?", Evolution 55: 2161-2169.
Gillespie, J. H. (2004), Population Genetics: A Concise Guide. Baltimore, MD: Johns Hopkins University Press.
Kimura, M. & T. Ohta (1971), "Protein Polymorphism as a Phase of Molecular Evolution", Nature 229: 467-469.
Lewontin, R. C. (1974), The Genetic Basis of Evolutionary Change. New York: Columbia University Press.
Maynard Smith, J. & J. Haigh (1974), "The Hitch-hiking Effect of a Favourable Gene", Genetical Research 23: 23-35.


Aren't Lewontin's data that you quote based on allozymes? The expected heterozygosity formula assumes neutrality of the loci. It seems to me that assuming neutrality of allozyme alleles would run into major problems. The Begun and Aquadro data make hitchhiking a very plausible scenario for explaining sequence polymorphism (although Charlesworth has argued that background purifying selection could produce the same patterns).
When we look at non-coding sequence polymorphism don't we see effects of population size? Wouldn't this overcome Lewontin's paradox?
Posted by: RPM | March 02, 2006 at 07:31 AM
Aren't Lewontin's data that you quote based on allozymes?
Yes. But see Gillespie (1991).
The expected heterozygosity formula assumes neutrality of the loci. It seems to me that assuming neutrality of allozyme alleles would run into major problems.
The assumption of neutrality is crucial: it's the neutral theory (as an extended form of the classical view) that's under scrutiny here. Lewontin includes the problem you bring up as one of the problems lumped under the "paradox of variation." So good call.
The Begun and Aquadro data make hitchhiking a very plausible scenario for explaining sequence polymorphism (although Charlesworth has argued that background purifying selection could produce the same patterns).
This is one of the key sets of data that got Gillespie interested in re-investigating draft.
Posted by: Robert Skipper | March 02, 2006 at 07:38 PM