A quick intro to the main interpretations of probability
Since probability calculus has been axiomatized, Kolmogorov’s axiomatization being the standard one, and the one we briefly considered in this course, one might simply say that probability is whatever satisfies the axioms of probability, much in the same way in which, say, Euclidean items are whatever satisfies Hilbert’s axiomatization of geometry. Many quantities, such as normalized length, satisfy the axioms of probability. However, such quantities do not provide an interpretation of probability in the sense of an analysis of the notion of probability, which, presumably, is what one has in mind when one asks what probability is. Hence, assuming that the question is not ill-posed, one may feel the need to engage in some mathematical/philosophical considerations.
The main interpretations of probability are best divided into two groups:
The Classical interpretation (Bernoulli, Laplace, and most everyone up to the 1800’s)
This interpretation was developed first in the late 17th century, especially by Jacob Bernoulli (Ars Conjectandi, 1713), but codified by
NOTE: This is required by the fact that in science probabilities can be expressed by irrational numbers.
What possible cases? For example, in tossing a fair die, one could have a sample universe of 2 and not-2 and claim that consequently Pr(2)=Pr(not-2)=1/2, which will not do.
The answer is to require that the possible cases be equiprobable. To avoid circularity (defining probability by appealing to equiprobability) one defines equiprobable cases as those for which there are no relevant rational grounds for choosing among them (Principle of Indifference). Hence, the case not-2 subdivides into 5 equiprobable cases. So, for
NOTE: for the objective interpretation this is nonsense. If the coin is loaded, certainly Pr(H) ≠ Pr(T), so the two cannot both equal 1/2.
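The die example can be made concrete with a short sketch. It assumes the usual classical definition (probability as the ratio of favorable cases to equally possible cases); the function name classical_probability is hypothetical and chosen only for illustration.

```python
from fractions import Fraction

def classical_probability(favorable, possible):
    """Ratio of favorable cases to possible cases, all assumed equiprobable."""
    possible = set(possible)
    favorable = set(favorable) & possible
    return Fraction(len(favorable), len(possible))

# Coarse partition {2, not-2}: wrongly treating the two cases as equiprobable.
coarse = classical_probability({"2"}, {"2", "not-2"})

# Refined partition: not-2 subdivides into five cases, giving six equiprobable faces.
faces = {1, 2, 3, 4, 5, 6}
fine = classical_probability({2}, faces)
fine_not_2 = classical_probability(faces - {2}, faces)

print(coarse, fine, fine_not_2)  # 1/2 1/6 5/6
```

The arithmetic is the same in both calls; only the choice of cases differs, which is why the Principle of Indifference carries the whole weight of the definition.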
Problems:
The frequency interpretation (Venn, Reichenbach, von Mises)
Probability theory is taken to be a mathematical science dealing with mass random events, which are unpredictable in detail but whose numerical proportions in the long run with respect to a given set of events (the reference class) are predictable.
Example: the proportion of heads when flipping a coin many times; births (deaths) of, say, males in a population; raindrop distributions, etc.
NOTE: analogy with, say, dynamics, whose subject matter is force.
Gamblers and statisticians have long known of the intimate relation between probability and frequency: if Pr(2)=1/6, then in the long run the frequency of 2 within the class of all the outcomes tends towards 1/6. The frequency interpretation holds that the probability of an event or property M in a reference class B is (perhaps in an idealized way) the frequency of M within B.
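The link between probability and long-run frequency can be illustrated by simulation. The following is only a sketch: it assumes a pseudo-random generator is an adequate stand-in for a fair die, and the function name relative_frequency and the fixed seed are choices made here for reproducibility, not part of the interpretation itself.

```python
import random

def relative_frequency(n_rolls, target=2, seed=0):
    """Frequency of `target` within a reference class of n_rolls simulated fair-die rolls."""
    rng = random.Random(seed)  # fixed seed so repeated runs give the same sequence
    hits = sum(1 for _ in range(n_rolls) if rng.randint(1, 6) == target)
    return hits / n_rolls

# The frequency of 2 should drift towards 1/6 as the reference class grows.
for n in (100, 10_000, 100_000):
    print(n, relative_frequency(n))
```

Any finite run only approximates 1/6, which is why the frequency is said to equal the probability "perhaps in an idealized way".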
NOTE:
Problems:
NOTE: von Mises did not consider this a serious objection: as the mechanical definition of work does not apply to the everyday notion of work, so this interpretation does not agree with our everyday notion of probability.
NOTE: this may be a problem that affects other interpretations as well, however.
The Logical Interpretation (Keynes, Jeffreys, Carnap)
The basic idea of the logical interpretation is that probability is the measurement of partial entailment (with probabilities 1 and 0 as limiting cases), that is, the measurement of the evidential link between evidence E and the hypothesis H supported by E. As such the logical interpretation tries to provide a framework for inductive logic. We have already seen this in our discussion of entailment in terms of conditional probability.
There are several versions of this interpretation, but the most famous is by Carnap (Logical Foundations of Probability, 1950).
Consider a language with 3 names, a, b, c, and a predicate F. This language has 8 state descriptions, that is, statements saying for each individual whether it has F or not:
When we look at the state descriptions, we note that some differ only by permutation of names. For example, (2), (3), (4) all have two things with F and one with –F. (2), (3), and (4) constitute a structure description. There are four structure descriptions:
Now one defines a function m* that assigns weights to structure and state descriptions in two steps:
Note that:
At this point, given any two statements h and e, one can introduce a confirmation function c* such that
c* (h,e) = [m*(h&e)]/m*(e).
Clearly, c* (h,e) does the job of Pr(h|e). c* is introduced expressly to account for our ability to learn from experience. c* can be generalized to a family of functions, but considering that is beyond our goals here.
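The construction can be checked on the three-name language with a small sketch. The two steps defining m* are not spelled out in the text as it stands, so the code assumes the standard Carnapian choice: each structure description receives equal weight, and that weight is then divided equally among the state descriptions belonging to it. All identifiers (STATES, m_star, c_star, and so on) are hypothetical names introduced for illustration.

```python
from fractions import Fraction
from itertools import product

NAMES = ("a", "b", "c")

# A state description says, for each individual, whether it has F (True) or not (False).
STATES = list(product([True, False], repeat=len(NAMES)))

def structure(state):
    """State descriptions differing only by a permutation of names share the same number of Fs."""
    return sum(state)

STRUCTURES = sorted({structure(s) for s in STATES})

def m_star_state(state):
    """Assumed weighting: equal weight per structure, split equally among its states."""
    same = [s for s in STATES if structure(s) == structure(state)]
    return Fraction(1, len(STRUCTURES)) / len(same)

def m_star(statement):
    """Weight of a statement: sum of the weights of the state descriptions where it holds."""
    return sum((m_star_state(s) for s in STATES if statement(s)), Fraction(0))

def c_star(h, e):
    """c*(h,e) = m*(h&e) / m*(e)."""
    return m_star(lambda s: h(s) and e(s)) / m_star(e)

Fa = lambda s: s[0]
Fb = lambda s: s[1]
Fc = lambda s: s[2]

prior = m_star(Fc)                                   # no evidence yet
after_one = c_star(Fc, Fa)                           # evidence: a has F
after_two = c_star(Fc, lambda s: Fa(s) and Fb(s))    # evidence: a and b have F
print(prior, after_one, after_two)  # 1/2 2/3 3/4
```

Under these assumptions the support for Fc rises from 1/2 to 2/3 to 3/4 as positive instances accumulate, which is the sense in which c* accounts for learning from experience.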
Most of the problems of the logical theory center on the attempt to provide a framework for inductive logic:
Propensity interpretation (Popper)
In this view, probability is a physical disposition to produce outcomes of a certain kind.
NOTE:
For some, the outcomes are long run (but not infinite) frequencies: A fair coin has a propensity to land with T half the time in the long run. Note that 1/2 does not measure this tendency, whose strength, as it were, is close to 1.
For others, the outcomes are single outcomes: the propensity of a fair coin to come up with T is 1/2.
Problems:
The Subjectivist interpretation (de Finetti, Jeffrey)
Probability is degree of belief held by a rational agent, that is, an agent whose degrees of belief (minimally):
NOTE:
Many subjectivists (e.g., de Finetti) analyze degrees of belief (probabilities) in terms of (possible) betting behavior. Consider a bet where one wins W=1 if A is true and loses L if it is false. The probability you attribute to A is determined by what you think is the fair value of L expressed in units of W, that is, the value of L you would set if you did not know which side of the bet you would have to take. For example, suppose you consider fair the arrangement whereby one wins $1 if A is true and loses $1/3 if A is false. Then you believe that Pr(A) = 1/4. In fact, the arrangement is fair when
1 · Pr(A) = (1/3) · (1 − Pr(A)),
that is,
(4/3) · Pr(A) = 1/3,
or
Pr(A) = 1/4.
Here probability is understood in terms of utility (in the example, dollars) and rational preference. (Since you don’t know which side of the bet you’ll get, you’ll settle for a fair bet.)
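The betting computation generalizes directly: solving W · Pr(A) = L · (1 − Pr(A)) gives Pr(A) = L/(W + L). The sketch below assumes utility is linear in money, as in the dollar example; the function names implied_probability and expected_gain are hypothetical.

```python
from fractions import Fraction

def implied_probability(win, loss):
    """Solve win * p = loss * (1 - p) for p, the degree of belief that makes the bet fair."""
    win, loss = Fraction(win), Fraction(loss)
    return loss / (win + loss)

def expected_gain(p, win, loss):
    """Expected gain for the side that wins `win` if A is true and loses `loss` otherwise."""
    return p * win - (1 - p) * loss

# The example from the text: win $1 if A is true, lose $1/3 if A is false.
p = implied_probability(1, Fraction(1, 3))
print(p, expected_gain(p, 1, Fraction(1, 3)))  # 1/4 0
```

An expected gain of zero for either side is what "fair" means here, so neither side of the bet is preferable to someone who holds Pr(A) = 1/4.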
Others (e.g., Ramsey) try to obtain both probability and utility from rational preference in a two-step procedure whereby one first obtains probability from rational preference, and then utility from probability and rational preference. Roughly, the procedure is as follows. First, Ramsey introduces ‘ethically neutral’ statements, namely statements which are per se indifferent to you, so that their only significance lies in their association with the outcomes of gambles. Suppose now that you prefer A over B, that statement P is ethically neutral, and that you are indifferent between two gambles: (i) get A if P is true and B if P is false, and (ii) get A if P is false and B if P is true. Then, by definition, Pr(P)=1/2. Note that P can now be used to set up a lottery in just the same way a fair coin can. The probabilities of many other ethically neutral statements can be obtained analogously. Once the set of ethically neutral statements whose probability we know is large enough, they can be used in place of lotteries in determining utilities. Hence, one determines utilities and then the probabilities of the remaining ethically neutral statements. Finally, knowing utilities, one can obtain the probabilities of statements that are not ethically neutral by appealing to the expected values of bets.
Other constraints beyond (1)-(2) have been proposed, but dealing with them would take us too far afield.