A very short intro to evolutionary game theory
Game theory was developed to study the strategic interaction among rational, self-regarding players (players seeking to maximize their own payoffs). By the early 1970s, however, the theory underwent a transformation: part of it morphed into evolutionary game theory, which increases our understanding of dynamical systems, especially in biology and, more recently, in psychology and the social sciences, with significant ramifications for philosophy. The players are not required to be rational at all, but only to have (perhaps hardwired) strategies that are passed on to their progeny. In short, the notion of player is displaced by that of strategy, and consequently the notion of a player's knowledge, complete or incomplete, is dispensed with. What drives the system is not the rationality of the players but the differential success of the strategies.
As before, we consider only two-player games. A game with strategies s1, …, sn available to both players is strategy symmetric (symmetric, in brief) if E1(si, sj) = E2(sj, si) for all i and j; that is, the payoff for playing si against sj is the same whether one plays it as the row player or as the column player.
For example, the Prisoners' Dilemma is a symmetric game. Along the main diagonal, the payoffs in each box are the same, that is, (1, 1) and (-6, -6); moreover, we have (-10, 10) in the top right box and (10, -10) in the bottom left, which are mirror images of each other.
         S            C
S     (1, 1)      (-10, 10)
C    (10, -10)    (-6, -6)
A symmetric matrix can be simplified by writing only the payoffs of the row player, as those of the column player can be easily obtained by exploiting the symmetry of the game. So, the previous matrix can be simplified as
        S      C
S       1    -10
C      10     -6
Evolutionarily Stable Strategies (ESS)
An important concept of evolutionary game theory is that of evolutionarily stable strategy (ESS). To understand it, we need some new notions.
Imagine now that we keep repeating a symmetric game (each round is called a ‘stage game’) with random pairing in an infinite population in which the only relevant consideration is that successful players get to multiply more rapidly than unsuccessful ones. (The demand that the population is theoretically infinite excludes random drift). Suppose that all the players (the incumbents) play strategy X, which can be a pure or a mixed strategy. If X is stable in the sense that a mutant playing a different strategy Y (pure or mixed) cannot successfully invade, then X is an ESS. More precisely, X is an ESS if either
(1) E(X,X) > E(Y,X),
that is, the payoff for playing X against (another player playing) X is greater than that for playing any other strategy Y against X,
or
(2) E(X,X) = E(Y,X) and E(X,Y) > E(Y,Y),
that is, the payoff of playing X against itself is equal to that of playing Y against X, but the payoff of playing Y against Y is less than that of playing X against Y.
Note that either (1) or (2) will do.
Obviously, if (1) obtains, the Y invader typically loses against X, and therefore it cannot persist. If (2) obtains, the Y invader does as well against X as X does, but it does worse against other Y invaders than X does, and therefore it cannot multiply. In short, Y players cannot successfully invade a population of X players.
It is possible to introduce a strategy that is stronger than an ESS, namely, an unbeatable strategy. Strategy X is unbeatable if, given any other strategy Y
E(X,X)>E(Y,X) and E(X,Y)>E(Y,Y).
An unbeatable strategy is the most powerful strategy there is because it strictly dominates any other strategy; however, it is also rare, and therefore of very limited use.
Given a strategy X and any other strategy Y, let us call X
Nash if E(X,X)≥E(Y,X)
(X is a best reply to itself)
and
Strict Nash if E(X,X)>E(Y,X)
(X is a strict best reply to itself)
Then the following relations obtain, with the arrow indicating entailment:
Unbeatable → Strict Nash → ESS → Nash
In sum, an unbeatable strategy is also strict Nash, and so on.
A few final points about ESS should be noted:
Often, ESSs are associated with mixed-strategy equilibria. For example, Chicken (Snowdrift is essentially the same game) has two pure-strategy Nash equilibria, neither of which rests on an ESS. However, the mixed strategy resulting in a Nash equilibrium is an ESS. (Note that in this context mixed strategies are understood in terms of the frequencies of players in a population each playing a pure strategy.) A very famous version of ESS is the mixed strategy resulting in a Nash equilibrium in Hawk-Dove, a biology-oriented version of Chicken.
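To make condition (2) concrete, here is a minimal numerical sketch of the Hawk-Dove mixed ESS. The payoff values V = 2 (resource value) and C = 4 (cost of fighting) are illustrative assumptions, not taken from the text; with them the mixed ESS plays Hawk with probability V/C = 1/2.

```python
# Assumed Hawk-Dove payoffs: V = 2 (resource value), C = 4 (cost of fighting).
V, C = 2.0, 4.0
E = {('H', 'H'): (V - C) / 2, ('H', 'D'): V,   # row payoffs E(row, column)
     ('D', 'H'): 0.0,         ('D', 'D'): V / 2}

def ep(p, q):
    """Expected payoff of a p-mixture of Hawk against a q-mixture of Hawk."""
    return (p * q * E[('H', 'H')] + p * (1 - q) * E[('H', 'D')]
            + (1 - p) * q * E[('D', 'H')] + (1 - p) * (1 - q) * E[('D', 'D')])

x = V / C                                   # the candidate mixed ESS
for y in (0.0, 0.25, 0.75, 1.0):            # a few mutant mixtures Y != X
    assert abs(ep(x, x) - ep(y, x)) < 1e-9  # E(X,X) = E(Y,X): condition (2)...
    assert ep(x, y) > ep(y, y)              # ...and E(X,Y) > E(Y,Y)
print(f"x* = {x} passes ESS condition (2) against the sampled mutants")
```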
We now turn to a more general approach to evolutionary games.
Evolutionary Dynamics
We just saw that in a population in which an ESS has already taken over, invasion does not occur successfully. However, under which conditions does a strategy take over in a population? What happens if a game in an infinite population is repeated indefinitely? The answer comes from evolutionary dynamics, which studies the behavior of systems evolving under some specific evolutionary rule. The basic idea here is that of the replicator, an entity capable of reproducing, that is, of making (relevantly) accurate copies of itself. Examples of replicators are living organisms, genes, strategies in a game, ideas (silly or not), as well as political, moral, religious, or economic customs (silly or not). A replicator system is a set of replicators in a given environment together with a given pattern of interactions among them. An evolutionary dynamics of a replicator system is the process of change in the frequency of the replicators brought about by the fact that replicators which are more successful reproduce more quickly than those which are less successful. Crucially, this process must take into account the fact that the success, and therefore the reproduction rate, of a replicator is due in part to its distribution (proportion) in the population. For example, when playing Chicken, although drivers do well against swervers, in a population of drivers a single swerver does better than anybody else. So, it will reproduce more quickly than the others. However, at some point there will be enough swervers that the drivers will start to do better again. It would be nice to know whether there is some equilibrium point, and if so what it is.
Since the differential rate of reproduction determines the dynamics of the system, we need to be more precise and specify what we mean by 'more quickly'. This is determined by the dynamics of the system; the one we are going to study is replicator dynamics. There are other models that plausibly apply to evolution, but replicator dynamics is the easiest and the most often used, at least as a first approach. Replicator dynamics makes three crucial assumptions:
· the population is infinite;
· individuals are paired at random (complete mixing);
· strategies breed true: each replicator's offspring inherit its strategy, with no mutation.
In addition, we shall restrict ourselves to studying repeated games whose stage games are symmetric and have only two players, so that the math remains easy.
To understand replicator dynamics, we need to introduce a few notions. The first is that of rate of change. Experience teaches that things often change at different rates. For example, sometimes accidentally introduced species multiply more quickly than native species: in other words, their rate of growth is greater than that of the natives, and this typically results in a decline of the native species. So, if at the moment of introduction the frequency of the non-native species was p and that of the native species was q = 1-p, with q >> p, after some time the situation may change and become p > q. This means that p has a positive rate of change (it increases) while q has a negative rate of change (it decreases). Mathematically, we express this by writing
D(p) > 0 and D(q) < 0,
where D(p) (the derivative of p with respect to time) means ‘the rate of change of p’, and similarly for q.
So, suppose a population p increases by 1/3 every second; then if in the beginning p = 100, after one second p1 = 100 + (1/3)(100) ≈ 133.3, after 2 seconds p2 = p1 + (1/3)p1 ≈ 177.8, and so on.
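In code, the compounding looks like this (a trivial sketch; three steps are enough to see the pattern):

```python
p = 100.0
for t in range(1, 4):
    p += p / 3                            # D(p) > 0: growth by 1/3 per second
    print(f"after {t} s: p = {p:.1f}")    # 133.3, 177.8, 237.0
```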
The second notion, which we have already met, is that of the expected payoff of a pure strategy. Suppose a strategy s is played against strategies S1, …, Sn, and that Pr(Si) is the probability that Si is played. Then,
EP(s) = E(s,S1)Pr(S1) + … + E(s,Sn)Pr(Sn).
That is, if by Si we denote a generic S, the expected payoff of s is the sum of the payoffs of s against each of the Si times the probability that Si is played.
The third notion is that of the average payoff of a set of strategies, and to understand this, we need the notion of the mean. Suppose that in a group of boxes, 1/3 weigh 30 kilos, 1/2 weigh 20, and 1/6 weigh 60. Then the average weight ĀW (notice the bar above 'A') is:
ĀW = 30×(1/3) + 20×(1/2) + 60×(1/6) = 10 + 10 + 10 = 30.
In words, the average weight is the sum of all the weights, each multiplied by its frequency. (Since 1/3 of the boxes weigh 30 kilos, we multiply 30 by 1/3, and so on).
Similarly, if S (Stag) and H (Hare) are the two available strategies, the average payoff is
ĀEP= EP(S)Pr(S) + EP(H) Pr(H).
For example, consider the following Stag Hunt matrix, and suppose that Pr(S)=p, so that Pr(H)=1-p.
        S         H
S    (3, 3)    (0, 2)
H    (2, 0)    (1, 1)

Table 12
Then, the EP of S, when played with frequency p against another S and with frequency 1-p against an H is:
EP(S)=3p+0(1-p)=3p.
Analogously, the EP of H, when played with frequency p against another S and with frequency 1-p against another H is
EP(H)=2p+1(1-p)=p+1.
So, the average expected payoff is EP(S) times the probability that S is played plus EP(H) times the probability that H is played:
ĀEP = EP(S)·Pr(S) + EP(H)·Pr(H) = 3p² + (p+1)(1-p) = 2p² + 1.
In replicator dynamics, if Pr(S)=p, the dynamical equation (the equation ruling the behavior of the system through time) is:
D(p) = [EP(S) - ĀEP]p.
In other words, the rate of change of the frequency of a strategy (in this case S) is determined by the difference between the expected payoff of S and the average payoff. Consequently, when S's expected payoff is greater than the average payoff, the frequency of S increases, and when it is smaller, the frequency of S decreases. Hence, in our example we have:
D(p) = [EP(S) - ĀEP]p = [-2p² + 3p - 1]p.
Obviously, when D(p)=0 the frequency of S (that is, p) does not change. The values of p for which the frequency of S does not change are called “fixed points”. So, let us find the fixed points in our example, that is, let us find when
[-2p² + 3p - 1]p = 0.
Obviously, one fixed point is p=0. For the other two, we need to solve
-2p² + 3p - 1 = 0,
which gives p=1 and p=1/2. So, when p=0, or p=1, or p=1/2, the frequency of S does not change. But what happens when p is not equal to any of the three fixed points? Let us study the plot of
D(p) = [-2p² + 3p - 1]p.
As we can verify by substituting 1/3 for p, when 0 < p < 1/2 the rate of growth of p is negative, and when 1/2 < p < 1 it is positive (just substitute 2/3 for p, for example). Since the growth rate is zero at p = 1/2, the plot of D(p) dips below the horizontal axis on (0, 1/2) and rises above it on (1/2, 1).
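Since the plot is easy to regenerate, here is a minimal sketch that evaluates D(p) at sample points and follows one Euler trajectory per basin (the step size and starting points are arbitrary choices):

```python
def D(p):
    return (-2 * p**2 + 3 * p - 1) * p

for p in (1/3, 1/2, 2/3):
    print(f"D({p:.2f}) = {D(p):+.4f}")    # negative, zero, positive

for p0 in (0.4, 0.6):                     # one starting point in each basin
    p, dt = p0, 0.01
    for _ in range(10_000):               # crude forward Euler integration
        p += D(p) * dt
    print(f"p0 = {p0} -> p = {p:.4f}")    # tends to 0 or to 1
```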
If at some time p < 1/2, as the growth rate is negative p will eventually become zero, that is, the strategy S will disappear; by contrast, if at some time p > 1/2, then S will become fixated, that is, it will remain the only strategy (H will disappear). If p = 1/2, then exactly half of the population will play S and half H. However, this equilibrium is not stable, in the sense that even a minor deviation from it will push one of H or S to extinction and the other to fixation, both of which are stable outcomes.
The interval (0, 1/2) is the basin of attraction whose attractor is H (p = 0), and the interval (1/2, 1) is that of S (p = 1). The fixed points p = 0 and p = 1 are asymptotically stable because each is the attractor of a basin of attraction. If the basin of attraction of an attractor contains the whole interval in which p is defined, or at least all its interior points, then the attractor is globally stable. In our example, no fixed point is globally stable. The interior fixed point is 1/2 and, as we saw, it is unstable.
Replicator dynamics of a generalized 2×2 symmetric game
A symmetric game with 2 strategies, A and B, can be represented by the following payoff matrix, where the payoffs are those of the row strategies:
      A    B
A     a    b
B     c    d
It turns out that the replicator dynamics does not change if in any column we add or subtract the same quantity from all the boxes. (For examples, see the exercises). Hence, we can reduce the matrix by subtracting c from the first column and d from the second, obtaining
       A      B
A     a-c    b-d
B      0      0
Let us now determine the dynamics, with Pr(A)=p.
EP(A)=(a-c)p+(b-d)(1-p).
Because of our matrix manipulation EP(B)=0, and consequently the average expected payoff is simply
ĀEP = p[(a-c)p+(b-d)(1-p)]+0.
Hence,
D(p) = p[(a-c)p + (b-d)(1-p) - p²(a-c) - p(b-d)(1-p)].
After a bit of algebra, we get
D(p) = p(1-p)[(a-c)p+(b-d)(1-p)],
that is,
D(p)=p(1-p)[EP(A)].
Hence, D(p) = 0 when p=0, or p=1, or EP(A) = 0. Note that solving EP(A) = 0 gives the interior fixed point.
So, to find the interior point:
· Reduce the matrix by setting strategy B’s payoffs equal to zero and modifying the other payoffs accordingly
· Solve EP(A) = 0.
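As a quick sketch of this recipe (the function name is ours):

```python
def interior_fixed_point(a, b, c, d):
    """Root of EP(A) = (a-c)p + (b-d)(1-p) after reducing B's row to zero.
    Returns p* if it lies strictly between 0 and 1, else None."""
    denom = (a - c) - (b - d)
    if denom == 0:
        return None                    # EP(A) is constant: no interior root
    p = (d - b) / denom
    return p if 0 < p < 1 else None

print(interior_fixed_point(3, 0, 2, 1))  # the Stag Hunt of Table 12: 0.5
```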
Depending on the signs of the reduced payoffs a-c and b-d, there are five cases in a game:
(1) a-c > 0 and b-d > 0: A dominates B, and p = 1 is the only attractor;
(2) a-c < 0 and b-d < 0: B dominates A, and p = 0 is the only attractor;
(3) a-c < 0 and b-d > 0: the interior fixed point is stable, and the two strategies coexist (as in Chicken);
(4) a-c > 0 and b-d < 0: the interior fixed point is unstable, and the system is bistable (as in Stag Hunt);
(5) a-c = b-d = 0: selection is neutral, and every frequency is a fixed point.
There are some interesting connections between dominance, Nash equilibria, ESS, and replicator dynamics.
The quasi-replicator dynamics of the iterated Prisoners' Dilemma
In replicator dynamics, two players meet randomly, play a one-shot game, and then separate, each to be randomly paired with a player again. As dominated strategies do not survive replicator dynamics, defection reaches fixation in the Prisoners' Dilemma. What happens if we play the Prisoners' Dilemma with the evolution equation of replicator dynamics but with direct reciprocity, namely by having the same two players repeat the game more than once, with random drift, with the presence of mutations, and with occasional strategy-execution errors? To remove the temptation of using backward induction to defect every time, let us suppose that the players do not know how many times they are playing each other; all they know is that after each round they have a probability p of playing again, so that the average play lasts 1/(1-p) rounds. We may consider a general matrix for cooperation and defection in which only the row payoffs are given:
      C    D
C     R    S
D     T    P
One gets R(eward) for mutual cooperation, P(unishment) for mutual defection, S(ucker) for cooperating against a defector, and T(emptation) for defecting against a cooperator. The Prisoners' Dilemma obtains if T > R > P > S. We could think of the game as follows. The cooperator helps at a cost c, and the receiver of the help gets a benefit b. Defectors do not help and therefore incur no costs. Then: R = b-c; S = -c; T = b; P = 0 (with b > c > 0, so that T > R > P > S).
To make things more interesting, in addition to ALLC (always cooperate) and ALLD (always defect), let us consider some reactive strategies that act on the basis of what happened in the previous stage.
TFT (tit-for-tat) acts as follows: it starts by cooperating and then it considers the opponent’s last move; if the opponent cooperated TFT cooperates, and if the opponent defected, it defects.
GTFT (generous tit-for-tat) is like TFT with one difference: with some probability (say, 1/3 of the time) it cooperates even if in the previous stage the opponent defected.
WSLS (win-stay; lose-shift) looks at its own payoff in the last stage; if it is equal to T or R, it counts it a success and repeats its previous move; if not, it shifts. In short, if the previous payoff was one of the two highest, it keeps doing the same thing; if it wasn't, it switches.
Martin Nowak has run programs modeling the following scenario. The matrix has R=3,T=5,P=1, and S=0. There is a large number (100 in the run) of randomly chosen and uniformly distributed strategies. There is direct reciprocity; occasionally, the strategies make mistakes, simulating human behavior; new strategies are put into play, simulating mutations, and neutral drift is allowed. What typically happens (M. Nowak, Evolutionary Dynamics, ch. 5) can be visualized as follows:
In a random mix of strategies, ALLD does very well, almost taking over. At that point, even a small cluster of TFT, already present or introduced by mutation, will start expanding because it will defect against defectors (ALLD) but cooperate with cooperators (other TFT's, mostly). Once TFT becomes abundant, its unforgiving nature makes it succumb to GTFT. The reason is that since mistakes sometimes occur, a TFT playing another TFT might defect instead of cooperating. This will prompt the latter to defect as well in the next stage, thus starting a cycle of cooperation/defection. By contrast, GTFT will at some point try to cooperate again, breaking the cycle and receiving a higher payoff. In short, GTFT quickly recovers from mistakes while TFT does not. What happens now depends on how generous GTFT is. If it is sufficiently vindictive, it takes over and becomes stable. However, if it is too generous, once it has taken over it will do no better than an ALLC, which may arise by mutation. If the game is played long enough, ALLC will take over by neutral drift. (In a population of N individuals with equal fitness, the probability that eventually the whole population will descend from a given individual A is 1/N. Hence, if the game is played long enough, this eventuality will come about.) At this point, an ALLD mutation will result in an ALLD explosion. The cycle will start again. However, if WSLS arises as a mutant while the frequency of ALLC's is high, the cycle is broken. When two WSLS's A and B play each other, if they cooperated in the previous stage, they'll keep cooperating. If A makes a mistake and defects, here's what happens:
A: CCCDDCCC…
B: CCCCDCCC…
Cooperation, in other words, will resume after two stages.
If A is a WSLS and B an ALLC, if they cooperated in the previous stage they’ll keep cooperating. However, if A makes a mistake and defects, it will keep defecting against the much too nice B:
A: CCCDDDD…
B: CCCCCCC…
In short, WSLS exploits ALLC’s goodness while cooperating with other WSLS’s.
When a WSLS meets an ALLD, the ALLD is better off, as we have
WSLS: CDCD….
ALLD: DDDD…
Note that against WSLS, ALLD averages (P+T)/2 per round. Hence, as long as R > (P+T)/2, WSLS's playing each other will average more than an ALLD playing a WSLS. In other words,
E(WSLS,WSLS)>E(ALLD,WSLS),
a strict Nash inequality that comes close to making WSLS an ESS. (It only comes close because the population is finite and random events, random drift for example, are allowed.) In other words, one (or just a few) ALLD mutant will not invade. If R ≤ (P+T)/2, then a stochastic variant of WSLS that cooperates after mutual defection only with probability less than 1 will take over. Occasionally, the system cycles back to ALLD, but the mechanism is unclear.
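The cycle just described is easy to probe with a small Monte Carlo sketch. This is not Nowak's actual program: the round count, the 5% error rate, and the 1/3 forgiveness probability are our illustrative assumptions; the payoffs are the R=3, T=5, P=1, S=0 used above.

```python
import random

PAYOFF = {('C', 'C'): 3, ('C', 'D'): 0, ('D', 'C'): 5, ('D', 'D'): 1}

def tft(my_hist, other_hist):
    return 'C' if not other_hist else other_hist[-1]

def gtft(my_hist, other_hist):
    if not other_hist or other_hist[-1] == 'C':
        return 'C'
    return 'C' if random.random() < 1 / 3 else 'D'    # forgive 1/3 of the time

def wsls(my_hist, other_hist):
    if not my_hist:
        return 'C'
    won = PAYOFF[(my_hist[-1], other_hist[-1])] >= 3  # got R or T last round
    return my_hist[-1] if won else ('D' if my_hist[-1] == 'C' else 'C')

def alld(my_hist, other_hist):
    return 'D'

def play(s1, s2, rounds=200, error=0.05):
    h1, h2, pay1, pay2 = [], [], 0, 0
    for _ in range(rounds):
        m1, m2 = s1(h1, h2), s2(h2, h1)
        if random.random() < error:                   # execution mistakes
            m1 = 'D' if m1 == 'C' else 'C'
        if random.random() < error:
            m2 = 'D' if m2 == 'C' else 'C'
        pay1 += PAYOFF[(m1, m2)]
        pay2 += PAYOFF[(m2, m1)]
        h1.append(m1)
        h2.append(m2)
    return pay1 / rounds, pay2 / rounds

random.seed(1)
for name, s in [('TFT', tft), ('GTFT', gtft), ('WSLS', wsls)]:
    print(f"{name} vs itself: {play(s, s)[0]:.2f}")   # GTFT and WSLS recover
print("WSLS vs ALLD:", play(wsls, alld))              # ALLD exploits WSLS
```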
The limits of replicator dynamics
Replicator dynamics has some features that limit its application.
The dynamics we considered has no mutation, in the sense that every replicator produces an identical copy of itself. A side consequence of this is that if a strategy has disappeared, it will never reappear. (By the way, this is why p = 1 and p = 0 are always fixed points, even when they are not stable.) There are ways of dealing with this. For example, one could modify the replicator equation by introducing a term that makes, say, S turn into H with some probability q > 0, the simplest case of mutation; alternatively, one can use Markov chains. However, as this complicates things, we shall not do it. Even so, we can note the following. In our Stag Hunt example, p = 1/2 is an unstable interior point, which means that random mutations will move the system into either of the two attraction basins with equal probability. However, if we change the payoff matrix to (5, 5) when S meets another S, the interior point becomes p = 1/4 (check it out!), which means that, given an initial uniform distribution, on average random mutations will push systems more often into the basin of attraction with p = 1 as the attractor; hence, on average systems will spend most of their lives in that basin. Of course, if there are n individuals in the population, there is a probability q^n that in the transition from one generation to another all the S's (or enough of them) become H, thus bringing H to fixation; however, even for relatively small n, that probability is likely to be negligible. For example, if q = 10% and n = 17, the resulting probability is 10^-17, which is phenomenally low, as 10^17 is roughly the age of the universe in seconds.
In addition, the population must be infinite (in practice, very large); in finite populations, random drift is unavoidable. Consequently, if we are modeling a population that is not very large, we need to use a different, and more complex, procedure than replicator dynamics.
The Dynamics of Finite Populations
In replicator dynamics, the population must be infinite. Of course, this requirement is unrealistic, but replicator dynamics has the advantage of being straightforward and easy to work with. Hence, as long as the population is large enough it is typically the first modeling approach.
However, often populations are not large enough, and this requires a different approach that involves stochasticity; in other words, realistic finite populations entail jettisoning deterministic evolutionary rules: in finite populations chance matters. The easiest way to see this is to consider the Moran process in the case of neutral drift, where selection plays no role.
Consider a population of N individuals reproducing at the very same rate, which means that selection does not favor one over another: they are neutral variants. At each time step, one individual is randomly chosen for reproduction and one for elimination. The two individuals may be the same individual X, in which case X produces another X and dies in the same step. (Note that this requirement makes the process of choosing the same as drawing with replacement.) It follows that N remains constant. Suppose now that there are i individuals of type A and consequently N-i individuals of type B. If we indicate with XD the fact that X is chosen for death and with XR the fact that X is chosen for reproduction, there are 4 cases:
(1) AR and AD: an A reproduces and an A dies, so i does not change;
(2) AR and BD: an A reproduces and a B dies, so i increases by 1;
(3) BR and AD: a B reproduces and an A dies, so i decreases by 1;
(4) BR and BD: a B reproduces and a B dies, so i does not change.
Now let us indicate with pi,i+1 the probability that i increases to i+1, and similarly for i decreasing to i-1 and for i remaining the same. Then we obtain the following state transition rules:
pi,i+1 = (i/N)((N-i)/N)
pi,i-1 = ((N-i)/N)(i/N)
pi,i = 1 - pi,i+1 - pi,i-1
p0,0 = pN,N = 1.
(The last is true because when i = 0 there are only B's and when i = N there are only A's.) States i = 0 and i = N are absorbing states because when the system reaches one of them it remains there forever; the other states are transient states. If we wait long enough, the system will end up with all A's or all B's; that is, the whole population will descend from the very same individual. Hence, although there is no selection, eventually A or B will become fixated. There are techniques, Markov chains for example, to determine interesting facts such as in how many steps the system will reach the fixation of A. However, we shall leave that aside and ask instead a different question we can answer directly: if there are i A's, what is the probability that A will become fixated? As there is no selection, each individual has the same chance of reproducing and leaving a lineage, namely 1/N; hence, that probability is i/N, as i is the number of A's.
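A minimal simulation sketch of the neutral process (population size, starting count, and number of runs are arbitrary choices) confirms the i/N fixation probability:

```python
import random

def moran_neutral(i, N):
    """Run the neutral Moran process from i A's until fixation or extinction."""
    while 0 < i < N:
        reproduces_A = random.random() < i / N
        dies_A = random.random() < i / N       # an independent uniform draw
        if reproduces_A and not dies_A:
            i += 1
        elif dies_A and not reproduces_A:
            i -= 1
    return i == N                              # True if A took over

random.seed(0)
N, i, runs = 20, 5, 10_000
fixed = sum(moran_neutral(i, N) for _ in range(runs))
print(fixed / runs, "vs predicted", i / N)     # roughly 0.25
```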
Suppose now that we add selection to the Moran process as follows. Let A's fitness be r and B's fitness be 1. Obviously, if r > 1, A is favored by selection; if r < 1, B is favored; and if r = 1 we have neutral drift. We may work the new quantity r into the previous formulas as follows:
(1) Pr(AR) = ri/(ri + N - i)
(2) Pr(BR) = (N - i)/(ri + N - i)
(3) Pr(AD) = i/N
(4) Pr(BD) = (N - i)/N.
The reason for ri in the numerator of the first formula, Pr(AR), is obvious: we simulate selection by making the effective number of A's higher or lower than it actually is, by making it sensitive to r. The numerator of Pr(BR) is given by the fact that B's fitness is 1 by definition. The rationale for the denominator being ri+N-i rather than merely N is normalization: since Pr(AR) + Pr(BR) = 1, the numerators determine the denominators. Note that when r = 1 the formulas revert to those for neutral drift. From (1)-(4) we can obtain the state transition rules, just as before. (What are they?) It turns out that if A's fitness is r and the number of A's is i, A's fixation probability is:
P = [1 - 1/r^i]/[1 - 1/r^N].
If the population is large and r > 1, r^N will be very large, 1/r^N will be very small, and therefore the denominator will be very close to 1. Hence, in a large B-population the fixation probability of a single A mutant with fitness r > 1 will be approximately
ρA = 1- (1/r).
For example, if N = 100, r = 1.1 and i = 1, the numerator becomes .091. In the denominator, r^N ≈ 13780, so that the denominator becomes 13779/13780, which is very close to 1. Hence, ρA = .091 is a good approximation.
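In code (a two-line check of the formula and its approximation):

```python
def fixation_prob(r, i, N):
    """Fixation probability of i A's with relative fitness r in a population of N."""
    if r == 1:
        return i / N                    # neutral drift
    return (1 - r**-i) / (1 - r**-N)

print(fixation_prob(1.1, 1, 100))       # ~0.0909
print(1 - 1 / 1.1)                      # the large-N approximation, also ~0.0909
```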
The generic 2×2 symmetric game in finite populations
What happens if we play a game in a finite population? Let us consider the generic 2×2 symmetric game
      A    B
A     a    b
B     c    d
in a population of size N, with i A's and therefore N-i B's. For any A, there are i-1 other A's, and for any B, there are N-i-1 other B's. Hence, given any A, the probability that it interacts with another A is (i-1)/(N-1); the probability that it interacts with a B is (N-i)/(N-1), and so on. So, A's expected payoff is
EP(A)= a[(i-1)/(N-1)] + b[(N-i)/(N-1)],
and B’s is
EP(B)= c[i/(N-1)] + d[(N-i-1)/(N-1)].
In replicator dynamics, fitness is totally determined by EP, and were we to apply it to this finite population, we would determine the average payoff and set up the replicator equation. Here, however, fitness is given by a modification of EP. Let us introduce a variable s measuring the intensity of selection, with s=0 indicating that selection is absent and s=1 indicating that fitness is completely given by EP. When there are i A’s, let us indicate the fitness of an A with Fi and that of a B with Gi and define the two as:
Fi = 1 – s + sEP(A)
and
Gi = 1 – s + sEP(B).
Note that when s = 0, the fitness becomes 1 for every individual, which means that we have neutral drift. By contrast, when s = 1, EP totally determines fitness. For 0 < s < 1, part of fitness is determined by EP and part by drift. A's reproduction chances must depend on its share of the total fitness, iFi/[iFi + (N-i)Gi]. (The same, of course, applies to B's.)
Suppose now that we superimpose a Moran process on our population, so that at each step one individual is chosen for reproduction and one for elimination. The state transition rule for an A to be added is given by the probability that an A reproduces times the probability that a B is chosen for elimination:

pi,i+1 = [(i/N)Fi / F̄]·[(N-i)/N],

where F̄ = (i/N)Fi + ((N-i)/N)Gi. Here (i/N)Fi is A's fitness times A's frequency; F̄ is the average fitness; (N-i)/N is the probability that a B is chosen for elimination. Analogously:

pi,i-1 = [((N-i)/N)Gi / F̄]·(i/N).

As usual,

pi,i = 1 - pi,i+1 - pi,i-1.
As before, i = 0 and i = N are the two absorbing states, which means that in the long run A or B will become fixated.
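A simulation sketch of this process, using the Stag Hunt payoffs of Table 12 (identifying A with S and B with H) and the arbitrary choices N = 50, s = 0.1, and 2000 trials, estimates the fixation probability of a single A mutant:

```python
import random

def step(i, N, a, b, c, d, s):
    """One Moran step: fitness-weighted birth, uniform death."""
    ep_a = (a * (i - 1) + b * (N - i)) / (N - 1)
    ep_b = (c * i + d * (N - i - 1)) / (N - 1)
    F = 1 - s + s * ep_a                      # fitness of an A
    G = 1 - s + s * ep_b                      # fitness of a B
    reproduces_A = random.random() < i * F / (i * F + (N - i) * G)
    dies_A = random.random() < i / N          # death is uniform at random
    return i + (reproduces_A and not dies_A) - (dies_A and not reproduces_A)

random.seed(2)
N, s, runs = 50, 0.1, 2000
fixations = 0
for _ in range(runs):                         # a single A (Stag) mutant per run
    i = 1
    while 0 < i < N:
        i = step(i, N, a=3, b=0, c=2, d=1, s=s)
    fixations += (i == N)
print("estimated rho_A:", fixations / runs, "vs 1/N =", 1 / N)
```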
There are some general rules determining the behavior of this system when:
(1) the population size N is large;
(2) selection is weak (s is close to 0);
(3) both A and B are best replies to themselves, so that the replicator dynamics is bistable, with an unstable interior point p*.
The first rule is that if A is risk dominant (A's basin is larger than B's, that is, the unstable interior point p* < 1/2), then ρA > ρB, where ρA is the probability that the offspring of a single A in a B-population will achieve fixation, and analogously for ρB. In short, the probability of A replacing B through a single mutant is greater than that of B replacing A.
However, note that ρA > 1/N does not entail ρB < 1/N: both ρA and ρB can be greater or smaller than 1/N. In the first case, selection favors both the fixation of an A in a B-population and that of a B in an A-population; in the second case, selection opposes replacement in either direction.
So, can we determine when, say, ρA > 1/N, namely when a single A in a B-population has a greater probability of becoming fixated (its descendants taking over) than under neutral drift? It turns out that when (1)-(3) above apply, unexpectedly the "1/3 Law" holds. Determine the basins of attraction for A and B by using replicator dynamics. Then,
if the basin of attraction of B is less than 1/3, then ρA > 1/N.
In other words, if in replicator dynamics A would become fixated for some initial p<1/3, then ρA > 1/N, that is, strategy A can be deemed advantageous under conditions (1)-(3). For example, if
      A    B
A     5    1
B     2    2
is played, B’s basin is ¼, and therefore A is advantageous in that ρA > 1/N. The 1/3 law applies to many (possibly all) processes in addition to the Moran process. The intuitive rationale, which we shall not prove, for this is that an A invader in a B-population plays on average 2/3 of the times with a B and 1/3 of the times with an A.
One can easily see that if A dominates B, then in replicator dynamics EP(A) = 0 has no root between 0 and 1: the interior point p* is discarded, p has only two fixed points, 0 and 1, and A reaches fixation from any starting point with p > 0. In this case, B's basin of attraction is empty, so the condition of the 1/3 law is trivially satisfied and A is always advantageous, which is not unexpected. However, there is a surprise: even if A dominates B, as long as c > b (B does better against A than A does against B), there will be a critical Nc such that if N < Nc then ρB > 1/N: the probability that a single B mutant will reach fixation is greater than under neutral drift! This is in sharp contrast with replicator dynamics, in which dominated strategies necessarily disappear.
One can define something analogous to an ESS in a finite population of size N. (Remember that the notion of ESS involves infinite populations.) For a large N, B is an ESSN against an A invader if
(1) d > b, that is, B is a strict best reply to itself, and
(2) the unstable interior point satisfies p* > 1/3, that is, B's basin of attraction is larger than 1/3.
Condition (1) entails that selection is against A, as B is a strict best reply to itself, and condition (2) entails that ρA, the probability of fixation of a single A in a B-population, is smaller than 1/N, which means that selection favors B. Of course, since the system is stochastic, (1)-(2) do not guarantee that A will not invade, as ρA need not be zero.
For example, consider the following game:
|
A |
B |
A |
3 |
1 |
B |
2 |
4 |
Both A and B are best replies to themselves. The unstable interior point is p* = 3/4. Hence, B's basin is larger than 1/3, which entails that A's fixation probability is less than 1/N. Hence, B is an ESSN.
Note that if we increase E(A,A) to 5 and diminish E(B,B) to 2, then p* = 1/4, and B is no longer an ESSN.
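A small helper capturing this large-N test (the function name is ours):

```python
def is_ess_n(a, b, c, d):
    """Large-N ESS_N test for B: strict best reply plus a basin above 1/3."""
    p_star = (d - b) / ((a - c) - (b - d))
    return d > b and p_star > 1/3

print(is_ess_n(3, 1, 2, 4))   # True:  p* = 3/4
print(is_ess_n(5, 1, 2, 2))   # False: p* = 1/4
```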
Spatial Games
Replicator dynamics assumes random interactions among strategies. But this, as we noted, is unrealistic in many contexts. In Stag Hunt, for example, we can think of groups of S's in an S-structure interacting with H's grouped together in an H-structure. Then things can be dramatically different from random-interaction situations, as the increase or decrease of S depends not on what happens inside the S-structure but on what happens at the border between the S's and the H's.
In such contexts, types of evolutionary dynamics different from replicator dynamics suggest themselves; for example, a simple dynamics could be: every individual (remember that only those at the border matter) looks at its payoff and at those of its immediate neighbors, and in the next round all individuals simultaneously adopt the strategy that produced the highest payoff.
The way to make these ideas more precise is to look at spatial games.
Consider a spatial grid in which each individual occupies a cell and interacts with all of its neighbors. The payoffs of each interaction are summed, and in the next round each individual adopts the strategy that earned the highest total payoff in its neighborhood (including its own):
|     | D1  | D2  | D3  | D4  |     |
|     | D12 | C1  | C2  | D5  |     |
| D13 | D11 | C3  | C4  | D6  |     |
|     | D10 | D9  | D8  | D7  |     |
Here we have 4 cooperators from Stag Hunt surrounded by 12 defectors. We may imagine this grid as a small part of a larger one that is wrapped around so that there are no boundary effects. A cell's neighborhood is the von Neumann neighborhood, constituted by the 4 cells sharing a side with it. For example, C3's neighborhood is constituted by D11, C1, C4, and D9. Hence, the fate of C3 depends on its strategy, those of its neighbors D11, C1, C4, D9, and those of its neighbors' neighbors. Let us look at C3's fate. It will obtain a payoff of 6 from cooperating with C1 and C4 and a payoff of 0 from attempting, and failing, to cooperate with D9 and D11. In short, its payoff will be 6. The same is true for the remaining 3 cooperating cells. Consider now D11. It will have a payoff of 3 from its interactions with D13, D12, and D10, and a payoff of 2 from its interaction with C3, for a total of 5. The same holds for the other defectors bordering a cooperator. Hence, in the next round the defecting cells adjacent to a cooperator will turn into cooperators, the cooperator square will expand, and cooperation will eventually take over. Note that cooperation is more successful in this spatial game than under replicator dynamics. This is true of most, but by no means all, interesting spatial games.
A standard way to study an evolutionary game is to consider the conditions for invasion. So, imagine that cooperators have taken over and that one mutates into a defector. Its payoff will be 8, while that of each of its cooperating neighbors will be 9, which means that the defector will vanish in the next round. Two neighboring defectors will each have a payoff of 7, while their neighboring cooperators will get 9; hence, the defectors will vanish. With 3 neighboring defectors, the defectors bordering three cooperators get a payoff of 7, against one of 9 for the cooperators. More defectors will fare even worse. So, a community of cooperators is immune from invasion by defectors.
Imagine now a community of defectors with one mutant cooperator. The cooperator will have a payoff of 0 while each defecting neighbor will have one of 5, with the result that the cooperator will vanish in the next round. Two neighboring cooperators will each obtain a payoff of 3, while each defecting neighbor will get one of 5; consequently, the two cooperators will not survive to the next round. Three neighboring cooperators will evolve into a group of four cooperators with the shape of a Latin cross ('+'), with the central cooperator getting 9; this group will hold its own. As we saw, a square of four cooperators will take over. In short, in the spatial version of Stag Hunt we considered, cooperation is much more successful than defection when compared with replicator dynamics.
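A compact simulation sketch of this spatial Stag Hunt (the grid size, the wrap-around, and the initial 2×2 cooperator square are our illustrative choices; ties are broken in favor of the first best-scoring neighbor found):

```python
N = 12
PAY = {('C', 'C'): 3, ('C', 'H'): 0, ('H', 'C'): 2, ('H', 'H'): 1}
grid = [['H'] * N for _ in range(N)]
for r in (5, 6):
    for c in (5, 6):
        grid[r][c] = 'C'                   # the initial 2x2 cooperator square

def neighbors(r, c):
    # von Neumann neighborhood on a wrapped (toroidal) grid
    return [((r - 1) % N, c), ((r + 1) % N, c), (r, (c - 1) % N), (r, (c + 1) % N)]

def payoff(g, r, c):
    return sum(PAY[(g[r][c], g[nr][nc])] for nr, nc in neighbors(r, c))

for _ in range(30):                        # 30 synchronous update rounds
    scores = [[payoff(grid, r, c) for c in range(N)] for r in range(N)]
    new = [row[:] for row in grid]
    for r in range(N):
        for c in range(N):
            best = max(neighbors(r, c) + [(r, c)],
                       key=lambda rc: scores[rc[0]][rc[1]])
            new[r][c] = grid[best[0]][best[1]]  # imitate the best performer
    grid = new
print(sum(row.count('C') for row in grid), "cooperators out of", N * N)
```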
In addition to von Neumann neighborhoods, one can use Moore neighborhoods, which add the four diagonal cells, so that each individual has eight neighbors.