Possible models for the development of human prosociality
A model of the evolution of human cooperation should mirror basic features of the hunter-gatherer societies within which we spent 95% of our existence and within which our social evolution originally took place. A good case can be made that such features are similar to those present in most hunter-gatherer societies for which we have anthropological records. Hence, the model should take into account crucial features of late Pleistocene Homo sapiens societies.
Finally, the model must describe as its outcome an equilibrium that is both attainable and stable.
With this in mind, let us look at
some proposed models of the emergence and persistence of cooperation.
THE RATIONAL MODEL
This model is based on the strategic
interaction among players who are not merely self-regarding but rational as well, as classical game
theory assumes. The basic idea is to appeal to repeated games and to Folk
Theorems, to which we now turn.
Suppose we repeat a game G, the Prisoners’ Dilemma for example, an indefinite number of times. Each round of G is called a “stage” of the repeated game. It turns out that the repeated game has different properties than G; for example, there are Nash equilibria of the repeated game that are not Nash equilibria of G. (A Nash equilibrium obtains when each player’s strategy is a best reply to the others’. For example, consider driving on the right side or on the left side of the road. If you drive on the right side, my best strategy is to do the same instead of driving on the left. The same applies to you. So, when all players drive on the right side we have a Nash equilibrium.) To understand the import of this we need the
notions of discount factor and of signaling.
The discount factor δ is the equivalent
in present units of one unit of value to be received one time unit from now.
So, in general, to you $1 received one year from now is worth δ of a
present dollar; when δ=1 one values goods to be received one time unit
from now exactly as much as present goods. Hence, when life is uncertain or the
future looks grim the discount factor is low; in general, patient agents act on the basis of a high discount factor, impatient ones on the basis of a low one. In monetary terms, when inflation is high, the discount rate is high and the discount factor small.
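A quick worked illustration (the numbers are ours): under a constant discount factor δ, a perpetual stream of one unit per period, starting now, is worth 1 + δ + δ² + δ³ + … = 1/(1 − δ) present units. So an agent with δ = 0.9 values the stream at 10, while an agent with δ = 0.5 values it at only 2. This geometric sum is what gives long-run punishment threats their force in the repeated games below.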
In determining the expected payoff of a strategy in a repeated game in which defection matters, it is important to have reliable signals telling one whether other players have defected or not. A
signal is public if all the players receive it (otherwise
it’s private), and it is perfect if
it correctly reports whether a player has defected or not (otherwise it’s
imperfect).
Consider now the following Prisoners’ Dilemma (payoffs listed as Player 1’s; Player 2’s):

                    Player 2
                 S            T
Player 1   S   +5; +5      -10; +10
           T   +10; -10    -5; -5
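As a minimal sketch (the encoding and the brute-force check are ours), one can verify that mutual defection (T, T) is the stage game’s only Nash equilibrium in pure strategies:

```python
# Stage game above: (row move, column move) -> (row payoff, column payoff).
PAYOFFS = {
    ("S", "S"): (5, 5),
    ("S", "T"): (-10, 10),
    ("T", "S"): (10, -10),
    ("T", "T"): (-5, -5),
}
MOVES = ("S", "T")

def is_nash(row, col):
    """True if neither player gains by unilaterally switching moves."""
    row_pay, col_pay = PAYOFFS[(row, col)]
    no_row_gain = all(PAYOFFS[(r, col)][0] <= row_pay for r in MOVES)
    no_col_gain = all(PAYOFFS[(row, c)][1] <= col_pay for c in MOVES)
    return no_row_gain and no_col_gain

print([cell for cell in PAYOFFS if is_nash(*cell)])  # -> [('T', 'T')]
```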
If the players can use mixed strategies (for example, Player 1 might use S 30% of the time and T 70% of the time), it turns out that any point in the quadrilateral below, where the abscissa represents player 1’s payoff and the ordinate player 2’s, corresponds to a possible payoff outcome. However, since by defecting each player can guarantee himself a payoff of at least -5, only the points in the quadrilateral ABCD represent strategies with payoffs greater than those resulting from universal defection. As the players are rational, they’ll never settle for any payoff smaller than -5, and consequently only the points in ABCD represent feasible outcomes.
Suppose now that each player is given a list of moves that will result, if both follow it accurately, in an average payoff of a for player 1 and b for player 2, corresponding to point P in the graph. Then, player 1 could adopt the following strategy: “As long as player 2 follows the list, follow the list as well; however, if 2 deviates, then maximally punish him (in this specific case, defect) forever after.” (This type of strategy is called a “trigger strategy.”)
Imagine now that player 2 follows the equivalent strategy. Clearly, the mixed strategies leading to P constitute a Nash equilibrium (they are best replies to each other), as any deviation from them will result in the application of the trigger strategy, which will assure that the deviator (and everybody else once the deviator retaliates) will get -5 ever after. Importantly, there are three assumptions at work: (1) the game is repeated indefinitely; (2) the players’ discount factor is high enough that the future losses from punishment outweigh the one-time gain from deviating; (3) signals are public and perfect, so that deviations are reliably detected.
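To see what assumption (2) demands in this game, here is a minimal sketch (our own calculation, using the payoff matrix above):

```python
# Value of cooperating forever (5 per round): 5 / (1 - delta).
def comply(delta, reward=5):
    return reward / (1 - delta)

# Value of a one-shot deviation against a trigger strategy:
# 10 now, then -5 in every later round.
def deviate(delta, temptation=10, punishment=-5):
    return temptation + delta * punishment / (1 - delta)

for delta in (0.2, 1/3, 0.5, 0.9):
    print(f"delta={delta:.2f}  comply={comply(delta):7.2f}  "
          f"deviate={deviate(delta):7.2f}")

# Solving comply(d) >= deviate(d) gives 15d >= 5, i.e. d >= 1/3: the
# trigger equilibrium holds only for sufficiently patient players.
```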
Two things are worth noting. First, in any game a player can always hold an opponent down to the worst payoff that opponent cannot avoid, his minimax value (here, -5 is the minimum to which the original deviator may be pushed); hence, the above argument is general. Second, all of this applies to games with many players as well. Since P can be any point whatever in ABCD, we have the most basic version of the Folk Theorem:
If (1)-(3) obtain, any point in ABCD (any payoff outcome in ABCD) can be reached by a Nash equilibrium in the repeated game.
So, when the Folk Theorem applies,
saying that a certain outcome is a Nash equilibrium is not saying much, as just
about any outcome can be a Nash equilibrium. The theorem can be extended to
many cases of public but imperfect information, and even to cases of private,
but almost public, information.
In providing a genesis of cooperation one must show that the relevant equilibrium is both attainable and stable. At first inspection, Folk Theorems seem to do this straightforwardly: any feasible payoff above the maximin of -5 is a Nash equilibrium, and a Nash equilibrium is, in a way, self-fulfilling, in the sense that if others stick to it, then it is not to one’s advantage to deviate. Such an equilibrium is thus, in principle, self-perpetuating, and therefore stable. However, there are serious problems.
· Attainability. Since there is an infinite number of Nash equilibria, how do separate individuals coordinate to settle on one? The easiest way is to assume that coordination rules are already present as social rules. In our example, the players were given a list of moves. But obviously this will not do, as it posits what needs to be explained. Moreover, social rules are often broken when contrary to immediate self-interest, and therefore they are discretionary unless enforceable, which already presupposes social coordination. Of course, one might argue that the enforcement (through punishment) is carried out by individual members without any need to coordinate because it is advantageous to the enforcer, for example by withdrawing cooperation (a form of punishment) and thereby reducing the cost associated with cooperation. However, this solution is problematic: if punishment is advantageous to the punisher, then why punish only rule breakers? And if it is disadvantageous, why punish at all? It seems we are back in the swamps of the Prisoners’ Dilemma, where defection (free-riding) dominates. A solution might be the introduction of bargaining, but this seems already to presuppose coordination rules.
· Cognitive requirements. It is doubtful that the cognitive requirements of the Folk Theorems can be realistically satisfied. Public information can be achieved either in very small groups where everyone sees what everyone else does, or with an information distribution system in larger groups. But the presence of such a system already presupposes considerable levels of coordination.
· Homo economicus. Evidence from behavioral game theory shows that the assumption that humans are merely self-interested is mistaken; this, in turn, eliminates the need to use purely self-interested individuals in models for the development of altruism.
POSITIVE ASSORTMENT MODELS
In standard replicator dynamics,
interactions are random. For example, if 20% of the group members are
self-regarding, then any given individual will interact with a self-regarding
member 20% of the time. However, when it comes to human interaction, such a requirement seems implausible. Hence, many models try to explain the emergence of other-regarding behavior by appealing to positive assortment: those with a tendency to be other-regarding interact with each other more frequently than by mere chance. Here are some interesting cases.
Kin altruism
If I increase my identical twin’s fitness at a cost to me, my altruistic genes will be transmitted, through my twin, to the next generation. So, if I behave altruistically towards my kin, we have a case of positive assortment. The key equation here is Hamilton’s rule,

r > c/b,

where r is the degree of relatedness between actor and recipient, c the fitness cost to the altruist, and b the fitness benefit to the recipient.
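As a worked illustration (standard relatedness values, not from the text): for full siblings r = 1/2, so altruism spreads only if the act delivers more than twice as much benefit as it costs (b > 2c); for first cousins r = 1/8, requiring b > 8c. The more distant the kin, the steeper the benefit-to-cost requirement.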
Problem:
The value of r is unlikely to be high in a population that is not highly inbred.
In short, although there is no question that kin altruism is a force in evolution, it cannot explain the simple fact that among primates, and especially human hunter-gatherers, altruistic cooperation typically extends well beyond kin, at times even trumping it.
Reciprocal Altruism
If two individuals are randomly paired to play the Prisoners’ Dilemma for many rounds, then cooperation becomes probable if a strategy of reciprocal altruism is introduced. The idea here is that X cooperates with Y if Y has cooperated with X, and vice versa. This is an example of positive assortment in that cooperators tend to cooperate with other cooperators more than with the generic player. The simplest of these strategies is Tit-For-Tat (TFT), which says to cooperate if the other player cooperated in the previous round and to defect if the other player defected in the previous round: TFT has a short memory. Although the evidence for reciprocal altruism outside humans is largely confined to other primates, there is no doubt that it played an important role in human interaction, as food sharing in our ancestral past was probably network-based rather than common-pot-based; in other words, one primarily shared with the individual who had previously shared with one. It turns out that if a few TFTs are present, TFT beats “Always Defect”, and therefore produces an accessible equilibrium. There is evidence that it is not stable, however, as TFT is itself supplanted by “Generous TFT” (GTFT), a strategy involving some degree of forgiveness towards shirkers, which becomes stable if it is not too generous. (Of course, what counts as too generous depends on the payoff matrix.)
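The invasion claim can be checked with a minimal simulation sketch (payoffs from the matrix above; the round count and TFT fraction are hypothetical choices of ours):

```python
PAYOFFS = {("S", "S"): (5, 5), ("S", "T"): (-10, 10),
           ("T", "S"): (10, -10), ("T", "T"): (-5, -5)}

def match(strat1, strat2, rounds):
    """Total payoffs when two strategies play the stage game repeatedly."""
    last1 = last2 = "S"  # both open as if the opponent had cooperated
    total1 = total2 = 0
    for _ in range(rounds):
        move1, move2 = strat1(last2), strat2(last1)
        pay1, pay2 = PAYOFFS[(move1, move2)]
        total1 += pay1
        total2 += pay2
        last1, last2 = move1, move2
    return total1, total2

def tft(opp_last):
    """Tit-For-Tat: copy the opponent's previous move."""
    return opp_last

def alld(_opp_last):
    """Always Defect, regardless of history."""
    return "T"

rounds, p = 11, 0.06  # p: fraction of TFT players in the population
tft_vs_tft, _ = match(tft, tft, rounds)
tft_vs_alld, alld_vs_tft = match(tft, alld, rounds)
alld_vs_alld, _ = match(alld, alld, rounds)

exp_tft = p * tft_vs_tft + (1 - p) * tft_vs_alld
exp_alld = p * alld_vs_tft + (1 - p) * alld_vs_alld
print(exp_tft > exp_alld)  # True: with these payoffs TFT wins once p > 1/20
```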
Problem:
Computer models show that TFT is not
accessible and/or not stable when interactions involve more than two players (typically, six players are enough to reduce
cooperation drastically) unless there are no mistakes and information is public
and accurate, each of which is an
unrealistic requirement.
Indirect Reciprocity
Indirect reciprocity is based on reputation. If X benefits Y, then X has a greater chance of being benefited by Z than if he benefits nobody. Positive assortment comes about because those with a good reputation will cooperate with each other more than those without it. A strategy embodying indirect reciprocity is the good standing strategy. Players who cooperated with others in the past are in good standing; otherwise, they are in bad standing. The strategy is to cooperate only with those who are in good standing if one is in good standing, and to cooperate unconditionally if one is in bad standing due to a previous mistake, so as to reacquire good standing. In incarnations of this model involving random interaction, indirect reciprocity will succeed if the probability p of knowing the score of another player exceeds the ratio between cost and benefit:

p > c/b.
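For instance (the numbers are ours): if helping costs the donor 1 unit and confers 3 units on the recipient, reputation-based cooperation pays only if players know a random partner’s standing more than one time in three (p > 1/3); the larger and more anonymous the group, the harder this threshold is to meet.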
Problem:
This information requirement is
high, and exceedingly difficult to satisfy if interactions involve more than a
few players. Although the use of language may facilitate the attainment of the
relevant information, the strategy-based equilibrium is unstable if errors are
allowed and information is imperfect, as it is bound to be given the incentive
to convey false information.
Possible solution:
The information problem in a large group may be overcome if one engages in costly signaling, namely signaling that cannot be faked. For example, an act of bravery, a public sharing of food, or some ritual scarring may increase one’s reputation. So, indirect reciprocity can produce a stable cooperative equilibrium in small groups where everyone knows all the relevant information about everyone else (a fact that can be plausibly modeled with various types of spatial games), or when costly signaling is present.
Multi-Level (Group) Selection Models
The population is divided into
groups. The idea here is that although
cooperation lowers one’s fitness within one’s group, it sufficiently increases
the group’s average fitness with respect to the population to render one
reproductively successful. So, in general, in a group selection model, there
are two forces:
· within-group selection, which typically disadvantages altruists because they tend to do worse than the selfish members of their group;
· between-group selection, which advantages altruists because the average member of a group with many cooperators will do better than the average member of other groups with little cooperation.
When cooperators suffer no
disadvantage within their group (this may happen because of strong leveling
factors), the model is one of weak
multi-level (or group) selection; when cooperators do suffer a disadvantage,
the model is one of strong
multi-level (group) selection.
There are different types of multi-level selection; some work and some don’t.
Consider a purely gene-based model. Groups are reproductively isolated, but individuals replicate in proportion to how much better or worse their payoffs are relative to the population average. Individuals interact only with members of their group. Given the presence of cooperators, the more internally homogeneous groups with more cooperators will increase; for example, individuals in a group made entirely of cooperators will have payoffs much above the population average. The destiny of cooperators depends on whether
1. Pr(C|C) – Pr(C|~C) > c/b,
2. Pr(C|C) – Pr(C|~C) = c/b,
3. Pr(C|C) – Pr(C|~C) < c/b,
where Pr(C|C) stands for the probability that one interacts with a cooperator given that one is a cooperator, and Pr(C|~C) is the probability that one interacts with a cooperator given that one is not a cooperator. In (1), cooperation will increase; in (2) it will remain stationary; and in (3) it will decrease. The quantity

F = Pr(C|C) – Pr(C|~C)

is Wright’s inbreeding coefficient, which measures the level of genetic differentiation among the groups and also the degree of positive assortment. It turns out that the evidence from foraging populations indicates that F is quite low, on the order of 1/12, which would require the benefit b to be at least 12 times greater than the cost c, a condition too stringent to make the emergence of cooperation likely. Ethnographic evidence points to high levels of migration among groups, resulting in low genetic differentiation between groups and hence insufficient disparity in average payoff levels among them. So, such group selection mechanisms are unlikely.
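The arithmetic can be checked with a minimal sketch (the F value is the one cited above; the cost and benefit figures are hypothetical):

```python
def cooperation_increases(F, c, b):
    """Condition (1) above: F = Pr(C|C) - Pr(C|~C) must exceed c/b."""
    return F > c / b

F = 1 / 12  # Wright's coefficient for foraging populations, as cited above
print(cooperation_increases(F, c=1, b=13))  # True: b just over 12x the cost
print(cooperation_increases(F, c=1, b=10))  # False: b only 10x the cost
```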
Networks
In human societies, individuals are part of networks, so that they interact only with members to whom they are directly linked. In other words, each individual interacts only with a group of neighbors. After each stage of the game, individuals update their strategy, adopting the strategy C of neighboring cooperators with probability equal to the sum of the payoffs of all the neighboring cooperators divided by the sum of the payoffs of all the neighbors. Nowak and others have shown that cooperation will increase if

1/k > c/b,

where k is the average number of neighbors an individual has. This obtains because the smaller the neighborhoods, the more likely they are to differ from each other, thus effecting positive assortment. This means that the smaller the average neighborhood, the greater the chance of an expansion of cooperators.
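Rearranging (our restatement): 1/k > c/b is equivalent to b > kc, so the benefit of cooperation must exceed k times its cost. With k = 4 neighbors this means b > 4c; with a neighborhood of 45, b > 45c. The condition thus becomes rapidly more demanding as neighborhoods grow.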
Problem: in foraging societies the whole group, typically about 30
to 60 or so individuals, often constitutes the neighborhood, which requires c/b
to be too small to be realistic.