Tuesday, October 23, 2012

Positive Quiddity: Occam's Razor

Occam's razor (also written as Ockham's razor, Latin lex parsimoniae) is the law of parsimony, economy, or succinctness. It is a principle stating that among competing hypotheses, the one that makes the fewest assumptions should be selected. Example:

It is possible to describe the other planets in the Solar System as revolving around the Earth, but that explanation is unnecessarily complex compared to the contemporary consensus that all planets in the Solar System revolve around the Sun.

Overview
The principle is often incorrectly summarized as "other things being equal, a simpler explanation is better than a more complex one." In practice, the application of the principle often shifts the burden of proof in a discussion. The razor states that one should proceed to simpler theories until simplicity can be traded for greater explanatory power. The simplest available theory need not be most accurate. Philosophers point out also that the exact meaning of simplest may be nuanced.

Solomonoff's inductive inference is a mathematically formalized Occam's razor: shorter computable theories have more weight when calculating the probability of the next observation, using all computable theories which perfectly describe previous observations.
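
The weighting idea can be illustrated with a toy sketch. This is not Solomonoff induction itself, which ranges over all computable theories and is uncomputable. Here the "theories" are assumed to be repeating bit patterns, the length of a pattern stands in for the length of a theory, and the names consistent, predict_next and max_len are hypothetical choices made for this illustration.

```python
from itertools import product


def consistent(pattern, observed):
    """True if repeating `pattern` forever reproduces every observed bit."""
    return all(observed[i] == pattern[i % len(pattern)]
               for i in range(len(observed)))


def predict_next(observed, max_len=8):
    """Weighted vote on the next bit; a pattern of length L gets weight 2**-L."""
    weights = {"0": 0.0, "1": 0.0}
    for length in range(1, max_len + 1):
        for bits in product("01", repeat=length):
            pattern = "".join(bits)
            if consistent(pattern, observed):
                # The bit this pattern predicts at the next position.
                next_bit = pattern[len(observed) % length]
                weights[next_bit] += 2.0 ** -length
    total = weights["0"] + weights["1"]
    if total == 0.0:
        raise ValueError("no pattern up to max_len fits the observations")
    return {bit: w / total for bit, w in weights.items()}


prediction = predict_next("010101")
print(prediction)
```

After observing 010101, the short pattern 01 dominates the vote, so most of the probability goes to 0 as the next bit. Longer patterns that fit the data but predict 1 still count, only with much smaller weight.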

In science, Occam's razor is used as a heuristic (a general guiding rule or an observation) to guide scientists in the development of theoretical models rather than as an arbiter between published models. In the scientific method, Occam's razor is not considered an irrefutable principle of logic or a scientific result.

Justifications
Beginning in the 20th century, epistemological justifications based on induction, logic, pragmatism, and especially probability theory have become more popular among philosophers.

Aesthetic
Prior to the 20th century, it was a commonly-held belief that nature itself was simple and that simpler hypotheses about nature were thus more likely to be true. This notion was deeply rooted in the aesthetic value simplicity holds for human thought and the justifications presented for it often drew from theology. Thomas Aquinas made this argument in the 13th century, writing, "If a thing can be done adequately by means of one, it is superfluous to do it by means of several; for we observe that nature does not employ two instruments [if] one suffices."

Empirical

Occam's razor has gained strong empirical support as far as helping to converge on better theories (see "Applications" section below for some examples).

In the related concept of overfitting, excessively complex models are affected by statistical noise (a problem also known as the bias-variance trade-off), whereas simpler models may capture the underlying structure better and may thus have better predictive performance. It is, however, often difficult to deduce which part of the data is noise (cf. model selection, test set, minimum description length, Bayesian inference, etc.).
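
A small experiment can make the overfitting point concrete. The sketch below is an illustration under assumed conditions, not a general result: the data are synthetic (a straight line plus Gaussian noise), the "simple" model is a least-squares line, and the "complex" model is a nearest-neighbour lookup that memorizes every training point. All function names are hypothetical.

```python
import random


def fit_line(xs, ys):
    """Ordinary least-squares line: two parameters."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept


def fit_memorizer(xs, ys):
    """Returns the y of the nearest training x: one 'parameter' per point."""
    points = list(zip(xs, ys))
    return lambda x: min(points, key=lambda p: abs(p[0] - x))[1]


def mse(model, xs, ys):
    """Mean squared error of a model on a data set."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def make_data(rng, n):
    """Synthetic data: y = 2x + 1 plus unit-variance Gaussian noise."""
    xs = [rng.uniform(0.0, 10.0) for _ in range(n)]
    ys = [2.0 * x + 1.0 + rng.gauss(0.0, 1.0) for x in xs]
    return xs, ys


rng = random.Random(0)  # fixed seed so the run is repeatable
train_x, train_y = make_data(rng, 200)
test_x, test_y = make_data(rng, 500)

models = {"line": fit_line(train_x, train_y),
          "memorizer": fit_memorizer(train_x, train_y)}
train_err = {name: mse(m, train_x, train_y) for name, m in models.items()}
test_err = {name: mse(m, test_x, test_y) for name, m in models.items()}
print("train:", train_err)
print("test: ", test_err)
```

The memorizer fits the training data perfectly because it has also fitted the noise, and that is why it does worse than the line on fresh data drawn from the same process.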

Testing the razor

The razor's statement that "simpler explanations are, other things being equal, generally better than more complex ones" is amenable to empirical testing. The procedure to test this hypothesis would compare the track records of simple and comparatively complex explanations. The validity of Occam's razor as a tool would then have to be rejected if the more complex explanations were more often correct than the less complex ones (while the converse would lend support to its use).

Possible explanations can get needlessly complex. It is coherent, for instance, to add the involvement of Leprechauns to any explanation, but Occam's razor would prevent such additions unless they were necessary.


In the history of competing explanations, however, complex explanations have not been correct more often than simple ones, at least not generally (some increases in complexity are sometimes necessary), and so there remains a justified general bias towards the simpler of two competing explanations. To understand why, consider that, for each accepted explanation of a phenomenon, there is always an infinite number of possible, more complex, and ultimately incorrect alternatives. This is so because one can always burden failing explanations with ad hoc hypotheses. Ad hoc hypotheses are justifications that prevent theories from being falsified. Even other empirical criteria like consilience can never truly eliminate such explanations as competition. Each true explanation, then, may have had many alternatives that were simpler and false, but also an infinite number of alternatives that were more complex and false.

Put another way, any new, and even more complex, theory can still possibly be true. For example: if an individual makes supernatural claims that Leprechauns were responsible for breaking a vase, the simpler explanation would be that he is mistaken, but ongoing ad hoc justifications (e.g. "And, that's not me on film, they tampered with that too") successfully prevent outright falsification. This endless supply of elaborate competing explanations cannot be ruled out except by using Occam's razor.
 
Applications
Science and the scientific method

In science, Occam's razor is used as a heuristic (rule of thumb) to guide scientists in the development of theoretical models rather than as an arbiter between published models. In physics, parsimony was an important heuristic in the formulation of special relativity by Albert Einstein, the development and application of the principle of least action by Pierre Louis Maupertuis and Leonhard Euler, and the development of quantum mechanics by Ludwig Boltzmann, Max Planck, Werner Heisenberg and Louis de Broglie. In chemistry, Occam's razor is often an important heuristic when developing a model of a reaction mechanism. However, while it is useful as a heuristic in developing models of reaction mechanisms, it has been shown to fail as a criterion for selecting among some published models. In this context, Einstein himself expressed caution when he formulated Einstein's Constraint: "It can scarcely be denied that the supreme goal of all theory is to make the irreducible basic elements as simple and as few as possible without having to surrender the adequate representation of a single datum of experience". An often-quoted version of this constraint (which cannot be verified as having been posited by Einstein himself) says "Everything should be kept as simple as possible, but no simpler."

In the scientific method, parsimony is an epistemological, metaphysical or heuristic preference, not an irrefutable principle of logic or a scientific result. As a logical principle, Occam's razor would demand that scientists accept the simplest possible theoretical explanation for existing data. However, science has shown repeatedly that future data often support more complex theories than existing data do. Science prefers the simplest explanation that is consistent with the data available at a given time, but the simplest explanation may be ruled out as new data become available. That is, science is open to the possibility that future experiments might support more complex theories than demanded by current data, and it is more interested in designing experiments to discriminate between competing theories than in favoring one theory over another based merely on philosophical principles.

When scientists use the idea of parsimony, it only has meaning in a very specific context of inquiry. A number of background assumptions are required for parsimony to connect with plausibility in a particular research problem. The reasonableness of parsimony in one research context may have nothing to do with its reasonableness in another.

Biology
Biologists or philosophers of biology use Occam's razor in either of two contexts, both in evolutionary biology: the units of selection controversy and systematics. George C. Williams, in his book Adaptation and Natural Selection (1966), argues that the best way to explain altruism among animals is based on low-level (i.e. individual) selection as opposed to high-level group selection. Altruism is defined as behavior that is beneficial to the group but not to the individual, and group selection is thought by some to be the evolutionary mechanism that selects for altruistic traits. Others posit individual selection as the mechanism, which explains altruism solely in terms of the behaviors of individual organisms acting in their own self-interest without regard to the group. The basis for Williams's contention is that, of the two, individual selection is the more parsimonious theory. In doing so he is invoking a variant of Occam's razor known as Lloyd Morgan’s Canon: "In no case is an animal activity to be interpreted in terms of higher psychological processes, if it can be fairly interpreted in terms of processes which stand lower in the scale of psychological evolution and development" (Morgan 1903).

However, more recent biological analyses, such as Richard Dawkins’s The Selfish Gene, have contended that Williams's view is not the simplest and most basic. Dawkins argues that evolution works in such a way that the genes propagated in the most copies end up determining the development of that particular species; that is, natural selection turns out to select specific genes, and this is the fundamental underlying principle, which automatically gives individual and group selection as emergent features of evolution.

Zoology provides an example. Muskoxen, when threatened by wolves, will form a circle with the males on the outside and the females and young on the inside. This is an example of a behavior by the males that seems to be altruistic. The behavior is disadvantageous to them individually but beneficial to the group as a whole and was thus seen by some to support the group selection theory.

However, a much better explanation immediately offers itself once one considers that natural selection works on genes. If the male musk ox runs off, leaving his offspring to the wolves, his genes will not be propagated. If, however, he takes up the fight, his genes will live on in his offspring. And thus the "stay-and-fight" gene prevails. This is an example of kin selection. An underlying general principle thus offers a much simpler explanation, without retreating to special principles such as group selection.

Medicine
When discussing Occam's razor in contemporary medicine, doctors and philosophers of medicine speak of diagnostic parsimony. Diagnostic parsimony advocates that when diagnosing a given injury, ailment, illness, or disease, a doctor should strive to look for the fewest possible causes that will account for all the symptoms. This philosophy is one of several demonstrated in the popular medical adage "when you hear hoofbeats behind you, think horses, not zebras". While diagnostic parsimony might often be beneficial, credence should also be given to the counter-argument now known as Hickam’s dictum, which succinctly states that "patients can have as many diseases as they damn well please". It is often statistically more likely that a patient has several common diseases than a single rarer disease that explains their myriad symptoms.
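
The statistical claim can be checked with simple arithmetic. The prevalences below are hypothetical numbers chosen only for illustration, not clinical data, and the sketch assumes the two common diseases occur independently. It also compares prior probabilities only, ignoring how well each diagnosis would explain the symptoms.

```python
from math import prod


def prob_all(prevalences):
    """Probability of having every listed condition, assuming independence."""
    return prod(prevalences)


# Hypothetical prevalences (assumptions for this example only).
common_a = 0.10        # a common disease
common_b = 0.05        # another common disease
rare_single = 0.0001   # one rare disease that would explain everything

p_two_common = prob_all([common_a, common_b])
ratio = p_two_common / rare_single
print(f"two common diseases: {p_two_common:.4f}")
print(f"one rare disease:    {rare_single:.4f}")
print(f"ratio:               {ratio:.0f}")
```

With these assumed figures, the two-disease explanation is far more probable beforehand than the single rare disease, even though it posits more causes.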

Also, independently of statistical likelihood, some patients do in fact turn out to have multiple diseases, which by common sense undermines any insistence on explaining a given collection of symptoms with a single disease. These misgivings emerge from simple probability theory (which is already taken into account in many modern variations of the razor) and from the fact that the loss incurred by an error is much greater in medicine than in most of general science. Because misdiagnosis can result in the loss of a person's health and potentially life, it is considered better to test and pursue all reasonable theories even if there is some theory that appears the most likely.

Religion

In the philosophy of religion, Occam's razor is sometimes applied to the existence of God; if the concept of a God does not help to explain the universe better, then the idea is that atheism should be preferred (Schmitt 2005). Some such arguments are based on the assertion that belief in God requires more complex assumptions to explain the universe than non-belief (e.g. the Ultimate Boeing 747 gambit). On the other hand, there are various arguments in favor of a God which attempt to establish a God as a useful explanation. Philosopher Del Ratzsch suggests that the application of the razor to God may not be so simple, least of all when we are comparing that hypothesis with theories postulating multiple invisible universes.

In speaking on religion in God Is Not Great, Christopher Hitchens espoused his variation named Hitchens’ Razor, which states "What can be asserted without evidence can be dismissed without evidence."

Penal ethics
In penal theory and the philosophy of punishment, parsimony refers specifically to taking care in the distribution of punishment in order to avoid excessive punishment. In the utilitarian approach to the philosophy of punishment, Jeremy Bentham’s "parsimony principle" states that any punishment greater than is required to achieve its end is unjust. The concept is related but not identical to the legal concept of proportionality. Parsimony is a key consideration of the modern restorative justice, and is a component of utilitarian approaches to punishment, as well as the prison abolition movement. Bentham believed that true parsimony would require punishment to be individualised to take account of the sensibility of the individual—an individual more sensitive to punishment should be given a proportionately lesser one, since otherwise needless pain would be inflicted. Later utilitarian writers have tended to abandon this idea, in large part due to the impracticality of determining each alleged criminal's relative sensitivity to specific punishments.

Probability Theory and Statistics
One intuitive justification of Occam's razor's admonition against unnecessary hypotheses is a direct result of basic probability theory. By definition, all assumptions introduce possibilities for error; if an assumption does not improve the accuracy of a theory, its only effect is to increase the probability that the overall theory is wrong.
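
The step behind this argument can be written out explicitly. As a sketch, let $T$ be a theory and $A$ an additional assumption (both symbols are introduced here for illustration and do not appear in the original text). The product rule gives:

```latex
P(T \wedge A) \;=\; P(T)\,P(A \mid T) \;\le\; P(T)
```

Equality holds only when $A$ is certain given $T$, or when $P(T) = 0$. In every other case, adding the assumption lowers the probability that the whole theory is right, so an assumption that buys no extra accuracy has only this cost.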

There are various papers in scholarly journals deriving formal versions of Occam's razor from probability theory and applying them in statistical inference, as well as papers proposing various criteria for penalizing complexity in statistical inference. Recent papers have suggested a connection between Occam's razor and Kolmogorov complexity.

One of the problems with the original formulation of the principle is that it only applies to models with the same explanatory power (i.e. prefer the simplest of equally good models). A more general form of Occam's razor can be derived from Bayesian model comparison and Bayes factors, which can be used to compare models that don't fit the data equally well. These methods can sometimes optimally balance the complexity and power of a model. Generally the exact Ockham factor is intractable, but approximations such as Akaike Information Criterion, Bayesian Information Criterion, Variational Bayes, False discovery rate and Laplace approximation are used. Many artificial intelligence researchers are now employing such techniques.

William H. Jefferys and James O. Berger (1991) generalise and quantify the original formulation's "assumptions" concept as the degree to which a proposition is unnecessarily accommodating to possible observable data. The model they propose balances the precision of a theory's predictions against their sharpness; theories which sharply made their correct predictions are preferred over theories which would have accommodated a wide range of other possible results. This, again, reflects the mathematical relationship between key concepts in Bayesian inference (namely marginal probability, conditional probability and posterior probability).
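
A coin-tossing example shows how a Bayes factor rewards sharp predictions. This is a minimal sketch under assumptions chosen for illustration: the simple model says the coin is exactly fair, the flexible model allows any bias with a uniform prior, and the function names are hypothetical. It is not the model proposed by Jefferys and Berger.

```python
from fractions import Fraction
from math import factorial


def evidence_fair(n, k):
    """P(a specific sequence of n tosses with k heads | coin is exactly fair)."""
    return Fraction(1, 2 ** n)


def evidence_unknown_bias(n, k):
    """Same sequence under a uniform prior on the heads probability p.

    Integrating p**k * (1 - p)**(n - k) over p in [0, 1] gives
    k! * (n - k)! / (n + 1)!.
    """
    return Fraction(factorial(k) * factorial(n - k), factorial(n + 1))


def bayes_factor(n, k):
    """Evidence for the fair coin divided by evidence for the flexible model."""
    return evidence_fair(n, k) / evidence_unknown_bias(n, k)


for heads in (5, 9):
    print(heads, "heads in 10 tosses:", float(bayes_factor(10, heads)))
```

With 5 heads in 10 tosses the factor is above 1: the fair-coin model is preferred because the flexible model spread its probability over outcomes that did not occur. With 9 heads the factor falls below 1, and the extra complexity is paid for by the better fit.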

The statistical view leads to a more rigorous formulation of the razor than previous philosophical discussions. In particular, it shows that "simplicity" must first be defined in some way before the razor may be used, and that this definition will always be subjective. For example, in the Kolmogorov-Chaitin Minimum description length approach, the subject must pick a Turing machine whose operations describe the basic operations believed to represent "simplicity" by the subject. However one could always choose a Turing machine with a simple operation that happened to construct one's entire theory and would hence score highly under the razor. This has led to two opposing views of the objectivity of Occam's razor.
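
The dependence on a chosen description method can be seen in a rough sketch. Kolmogorov complexity is uncomputable, so the example below uses the size of zlib-compressed text as a crude, computable stand-in. That substitution is an assumption of this illustration and is not part of the Kolmogorov-Chaitin approach itself. The name description_length is hypothetical.

```python
import random
import zlib


def description_length(text):
    """Bytes needed to store `text` after zlib compression (a rough proxy)."""
    return len(zlib.compress(text.encode("ascii"), 9))


# A highly regular string: "01" repeated 500 times.
regular = "01" * 500

# An irregular string of the same length, from a seeded generator.
rng = random.Random(0)
irregular = "".join(rng.choice("01") for _ in range(1000))

print("regular:  ", description_length(regular))
print("irregular:", description_length(irregular))
```

The regular string compresses to far fewer bytes than the irregular one. A different compressor would give different numbers, much as a different choice of Turing machine changes what counts as simple.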

In literature and writing
Occam's razor has been recommended as a measure of how good the plot of a novel is. Simple and logical plots are easy to explain and this enhances the experience of the reader. The writer is also less likely to make an error while explaining the plot to the reader.

In psychology and humor
Hanlon’s razor: to explain human behavior, assume incompetence before malice.

Controversial aspects of the razor
Occam's razor is not an embargo against the positing of any kind of entity, or a recommendation of the simplest theory come what may. Occam's razor is used to adjudicate between theories that have already passed "theoretical scrutiny" tests, and which are equally well-supported by the evidence. Furthermore, it may be used to prioritize empirical testing between two equally plausible but unequally testable hypotheses, thereby minimizing cost and waste while increasing the chances of falsifying the simpler-to-test hypothesis.

The "other things" in the phrase "other things being equal" refer to the evidential support for the theory. Therefore, according to the principle, a simpler but less correct theory should not be preferred over a more complex but more correct one. It is this fact which gives the lie to the common misinterpretation of Occam's razor that "the simplest" theory is usually the correct one. For instance, classical physics is simpler than more recent theories; nonetheless it may not be preferred over them, because it produces inaccurate predictions in some circumstances.

Another contentious aspect of the razor is that a theory can become more complex in terms of its structure (or syntax), while its ontology (or semantics) becomes simpler, or vice versa. Quine, in a discussion on definition, referred to these two perspectives as "economy of practical expression" and "economy in grammar and vocabulary", respectively. The theory of relativity is often given as an example of the proliferation of complex words to describe a simple concept.

Galileo Galilei lampooned the misuse of Occam's razor in his Dialogue. The principle is represented in the dialogue by Simplicio. The telling point that Galileo presented ironically was that if you really wanted to start from a small number of entities, you could always consider the letters of the alphabet as the fundamental entities, since you could construct the whole of human knowledge out of them.


This and much more at: http://en.wikipedia.org/wiki/Occam%27s_razor
