
Ev: Something I don't understand

Paul C. Anagnostopoulos

This is another thread in the continuing saga of the Ev evolution simulation program. I have run across a behavior I don't understand, and I thought perhaps someone here might have a flash of insight.

Ev simulates the evolution of DNA binding sites:

http://www.lecb.ncifcrf.gov/~toms/paper/ev/

Certain critics have been complaining that the mutation rates used in experiments are much too high to be realistic. They insist that we run experiments with mutation rates on the order of 1/million DNA bases, rather than the 1/256 bases we often run.

So I ran a set of experiments with a fixed population (64) and a fixed chromosome length (1000). I varied the mutation rate from 1/1 million bases down to 1/200 bases. The number of generations to converge on a creature with perfect DNA binding varied according to the equation $g = 0.9m^{1.02}$, which is just about linear, as one would expect. I therefore conclude that high mutation rates can be extrapolated to lower ones with no fuss.
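For anyone who wants to play with the fit, here is a minimal sketch of evaluating it. It assumes m is the number of bases per mutation (so a rate of 1/256 gives m = 256); the post does not define m, so that reading and the function name are assumptions for illustration only.

```python
def generations_to_converge(m, coeff=0.9, exponent=1.02):
    """Evaluate the fitted power law g = coeff * m**exponent.

    Assumption: m is the number of bases per mutation (e.g. 256 for a
    rate of 1/256). The thread does not state this explicitly.
    """
    return coeff * m ** exponent


# Halving the mutation rate (doubling m) multiplies g by 2**1.02,
# which is only slightly more than 2, hence "just about linear".
doubling_factor = generations_to_converge(2000) / generations_to_converge(1000)

if __name__ == "__main__":
    for m in (200, 256, 16_000, 1_000_000):
        print(f"m = {m:>9,}: g ~ {generations_to_converge(m):,.0f}")
    print(f"doubling m multiplies g by {doubling_factor:.3f}")
```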

Then, for some reason, I decided to run another series of experiments with a fixed population (64) and a fixed number of mutations per base (1/16,000). I'm varying the chromosome length from 512 bases by factors of 2. I would expect the number of generations to evolve a perfect creature to remain constant, because the probability of mutating any non-junk DNA base in the chromosome remains constant. However, I'm seeing what appears to be a factor of 2 increase in the number of generations required as the chromosome increases in length by a factor of 2.
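The reasoning behind that expectation can be written down as a small calculation. This is a sketch under stated assumptions, not Ev's code: mutations are treated as independent per base, and the number of non-junk bases (96 here) is a made-up illustrative figure that stays fixed while junk is added.

```python
RATE = 1 / 16_000   # mutations per base per generation, from the post
FUNCTIONAL = 96     # hypothetical count of non-junk bases, held fixed


def p_hit_functional(rate=RATE, functional=FUNCTIONAL):
    """Chance that at least one non-junk base mutates in one creature.

    Depends only on the per-base rate and the number of non-junk bases,
    not on the total chromosome length.
    """
    return 1 - (1 - rate) ** functional


def expected_mutations(length, rate=RATE):
    """Expected mutations per creature per generation over the whole chromosome."""
    return length * rate


if __name__ == "__main__":
    for length in (512, 1024, 2048, 4096):
        print(length, round(p_hit_functional(), 6), expected_mutations(length))
```

Under these assumptions the first quantity is the same at every length, which is the basis for expecting a constant number of generations; only the total number of mutations per creature grows with length.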

Does anyone have any thoughts on why this should be the case?

~~ Paul
 
Paul said:
So I ran a set of experiments with a fixed population (64) and a fixed chromosome length (1000). I varied the mutation rate from 1/1 million bases down to 1/200 bases. The number of generations to converge on a creature with perfect DNA binding varied according to the equation $g = 0.9m^{1.02}$, which is just about linear, as one would expect. I therefore conclude that high mutation rates can be extrapolated to lower ones with no fuss. Then, for some reason, I decided to run another series of experiments with a fixed population (64) and a fixed number of mutations per base (1/16,000). I'm varying the chromosome length from 512 bases by factors of 2. I would expect the number of generations to evolve a perfect creature to remain constant, because the probability of mutating any non-junk DNA base in the chromosome remains constant. However, I'm seeing what appears to be a factor of 2 increase in the number of generations required as the chromosome increases in length by a factor of 2.
Paul, I believe you now understand why the number of generations for convergence does not increase linearly with genome length even when you use a mutation rate that is fixed to a specific number of bases. For the sake of others who are trying to follow this issue, the reason the number of generations increases supra-linearly is the form of the selection process Dr. Schneider uses in his program.

Selection is based on the number of errors that occur on a genome. There are two types of errors. The first type of error is the failure of the weight matrix to identify a binding site in the binding site region of the genome. The second type of error is the weight matrix identifying a binding site in the non-binding site region of the genome.
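The two error types can be sketched as a simple count. This is only an illustration, not code from ev: the weight-matrix evaluation is abstracted away into a set of positions it flags, and the function and argument names are hypothetical.

```python
def count_mistakes(recognized, site_positions):
    """Count both error types for one creature.

    recognized     -- positions the weight matrix flags as binding sites
    site_positions -- positions where binding sites are supposed to be
    Returns (missed_sites, spurious_sites, total_mistakes).
    """
    recognized = set(recognized)
    site_positions = set(site_positions)
    missed = len(site_positions - recognized)    # type 1: real site not found
    spurious = len(recognized - site_positions)  # type 2: hit outside any site
    return missed, spurious, missed + spurious


# Two of four real sites found, plus one hit in the non-binding site region.
example = count_mistakes(recognized={10, 20, 77}, site_positions={10, 20, 30, 40})
```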

As you lengthen the genome in the model while keeping the mutation rate per locus constant, you are increasing the possible number of errors which can occur in the non-binding site region. That is what causes the generations for convergence to increase at a greater than linear rate.
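To put rough numbers on this, here is a back-of-the-envelope sketch. The 16 sites and the per-position false-hit probability are illustrative assumptions, not values taken from these experiments, and treating positions as independent is a simplification.

```python
N_SITES = 16          # hypothetical number of binding sites, held fixed
P_FALSE_HIT = 0.001   # hypothetical chance a non-site position is misrecognized


def spurious_opportunities(length, n_sites=N_SITES):
    """Positions where a type 2 error (hit outside a binding site) can occur."""
    return length - n_sites


def expected_spurious(length, p=P_FALSE_HIT, n_sites=N_SITES):
    """Expected number of type 2 errors, assuming independent positions."""
    return p * spurious_opportunities(length, n_sites)


def p_no_spurious(length, p=P_FALSE_HIT, n_sites=N_SITES):
    """Chance that a creature has no type 2 errors at all."""
    return (1 - p) ** spurious_opportunities(length, n_sites)


if __name__ == "__main__":
    for length in (512, 1024, 2048, 4096):
        print(length, expected_spurious(length), round(p_no_spurious(length), 4))
```

Each doubling of the genome roughly doubles the places where a spurious hit can occur, while the number of binding sites stays the same.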

Paul likes to complain that ev does not model reality in its entirety, and I am going to have to agree with him on this point when it comes to the selection process Dr. Schneider employs in his model.

In the early phase of the evolutionary process in ev, an error in the non-binding site region is less likely to select out that organism because there are so many errors in the binding site region. As the evolutionary process continues and more and more binding sites are properly identified, a single error in the non-binding site region can cause the selection out of that organism. Because of this effect, the early stages of evolution occur much more rapidly than the later stages. In short, errors in the non-binding site region of the genome have a greater effect on selection late in the evolution of the binding sites than early in the process.

A selection process in which any error in the non-binding site region causes that creature to be selected out, regardless of how many or how few binding sites are properly recognized in the binding site region, would more accurately represent reality. If you employ this condition, the rate of evolution would slow markedly, but I believe the model would be more representative of reality.
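The difference between the two rules can be sketched as two ranking keys. This is only an illustration of the proposal, not code from ev: the function names, the (missed, spurious) tuples, and the keep-the-better-half replacement step are assumptions made for the example.

```python
def total_mistakes_key(creature):
    """Pooled rule: rank by total mistakes, both error types counted alike."""
    missed, spurious = creature
    return missed + spurious


def strict_key(creature):
    """Proposed rule: any spurious hit ranks below every creature with none."""
    missed, spurious = creature
    return (spurious > 0, missed + spurious)


def survivors(population, key):
    """Keep the better half of the population under the given ranking key."""
    ranked = sorted(population, key=key)
    return ranked[: len(population) // 2]


# Each creature is (missed_sites, spurious_sites).
population = [(10, 1), (12, 0), (3, 1), (14, 0)]
pooled = survivors(population, total_mistakes_key)
strict = survivors(population, strict_key)
```

With this toy population the pooled rule keeps the two creatures with the fewest total mistakes even though both have a spurious hit, while the strict rule keeps only the creatures with no spurious hits, however many sites they still miss.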
 
