Quantitative, statistics-based research can be approached in two quite different ways: Bayesian inference, which takes into account the prior probability of a hypothesis, and frequentist inference, which makes use only of the likelihood of the evidence.
Orthodox quant agencies (such as Ipsos MORI or Gallup) start from the assumption that we know nothing and therefore draw as large a sample as possible – an expensive and time-consuming approach. The result is usually expressed within confidence limits or as a probability of error.
By contrast, the more academic Bayesian approach starts by collating existing knowledge from multiple sources, which is then built into a statistical model. It uses the available information to assign a probability that a hypothesis is correct, then tests that hypothesis against a sample. Bayesian research uses credibility intervals rather than confidence limits.
We believe that the Bayesian statistical approach is better suited to market research than the traditional commercial models for a number of reasons:
Because we have used informed opinion to create our hypothesis, we can test it with a surprisingly small sample (calculated using the Bayesian Stats equation) and still stay within reasonable credibility intervals.
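The effect of an informed prior on the precision obtainable from a small sample can be illustrated with a simple conjugate model. The sketch below is not the "Bayesian Stats equation" referred to above; it assumes a Beta prior on a proportion and a binomial sample, and every number in it (10 positive answers out of 50, a prior worth roughly 100 earlier observations) is hypothetical. The narrower interval is only trustworthy if the prior information is itself sound.

```python
import random


def beta_credible_interval(a, b, level=0.95, draws=20000, seed=0):
    """Approximate an equal-tailed credible interval for Beta(a, b) by simulation."""
    rng = random.Random(seed)
    samples = sorted(rng.betavariate(a, b) for _ in range(draws))
    lo_idx = int((1 - level) / 2 * draws)
    hi_idx = int((1 + level) / 2 * draws) - 1
    return samples[lo_idx], samples[hi_idx]


def posterior_interval(successes, n, prior_a, prior_b):
    """Beta prior plus binomial data gives a Beta(prior_a + successes, prior_b + failures) posterior."""
    return beta_credible_interval(prior_a + successes, prior_b + n - successes)


# Hypothetical survey: 10 of 50 respondents say yes.
flat = posterior_interval(10, 50, prior_a=1, prior_b=1)        # "we know nothing"
informed = posterior_interval(10, 50, prior_a=20, prior_b=80)  # assumed prior knowledge

flat_width = flat[1] - flat[0]
informed_width = informed[1] - informed[0]
print(f"flat prior:     {flat[0]:.3f} to {flat[1]:.3f}")
print(f"informed prior: {informed[0]:.3f} to {informed[1]:.3f}")
```

With the same 50 responses, the interval under the informed prior is narrower than the one under the flat prior, which is the sense in which prior knowledge can substitute for sample size.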
"Bayesian" refers to the Reverend Thomas Bayes (c.1702-1761). Probability theory developed in the early 18th century to answer questions surrounding gambling, and to underpin the new and related ideas of insurance. A problem arose, known as the question of inverse probability: the mathematicians of the time knew how to find the probability that, say, 4 people aged 50 die in a given year out of a sample of 60, if the probability of any one of them dying was known. But they did not know how to find the probability of one 50-year-old dying based on the observation that 4 had died out of 60. The answer was found by Thomas Bayes, and was published posthumously in 1763. Like many educated men of his time, Bayes was both a clergyman and an amateur scientist/mathematician. His solution, known as Bayes’ theorem, underlies, and gave its name to, the modern Bayesian approach to the analysis of all kinds of data.
What we now know as Bayesian statistics has not had a clear run since 1763. Although Bayes’ method was enthusiastically taken up by Laplace and other leading probabilists of the day, it fell into disrepute in the 19th century because statisticians did not yet know how to handle prior probabilities properly. The first half of the 20th century saw the development of a completely different theory, now called frequentist statistics. But the flame of Bayesian thinking was kept alive by a few thinkers such as Bruno de Finetti in Italy and Harold Jeffreys in England.
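The two directions of the problem can be made concrete with the figures used above. The following is a minimal sketch: the forward calculation is the ordinary binomial probability, while the inverse calculation assumes a uniform prior on the unknown death probability, under which the posterior is a Beta distribution. The value p = 0.05 is purely illustrative, and the function names are hypothetical.

```python
from math import comb


def binomial_pmf(k: int, n: int, p: float) -> float:
    """Forward problem: probability of exactly k deaths among n people,
    each dying independently with known probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def posterior_mean_uniform_prior(k: int, n: int) -> float:
    """Inverse problem under an assumed uniform prior on p:
    the posterior is Beta(k + 1, n - k + 1), with mean (k + 1) / (n + 2)."""
    return (k + 1) / (n + 2)


forward = binomial_pmf(4, 60, 0.05)            # p = 0.05 is an illustrative value
inverse = posterior_mean_uniform_prior(4, 60)  # estimate of p from the data alone

print(f"P(4 deaths out of 60 | p = 0.05) = {forward:.4f}")
print(f"Posterior mean of p given 4 deaths out of 60 = {inverse:.4f}")
```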
The modern Bayesian movement began in the second half of the 20th century, spearheaded by Leonard J. "Jimmie" Savage in the USA and Dennis Lindley in Britain. Bayesian inference nevertheless remained extremely difficult to implement until the late 1980s and early 1990s, when powerful computers became widely accessible and new computational methods were developed. The subsequent explosion of interest in Bayesian statistics has led not only to extensive research in Bayesian methodology but also to the use of Bayesian methods to address pressing questions in diverse application areas such as astrophysics, weather forecasting, health care policy, and criminal justice.
Powerful computational tools allow Bayesian methods to tackle large and complex statistical problems with relative ease, where frequentist methods can only approximate or fail altogether. Bayesian modelling methods provide natural ways for people in many disciplines to structure their data and knowledge, and they yield direct and intuitive answers to the practitioner’s questions.
Scientific hypotheses typically are expressed through probability distributions for observable scientific data. These probability distributions depend on unknown quantities called parameters. In the Bayesian paradigm, current knowledge about the model parameters is expressed by placing a probability distribution on the parameters, called the "prior distribution", often written as $p(\theta)$, where $\theta$ denotes the model parameters.
When new data $y$ become available, the information they contain regarding the model parameters is expressed in the "likelihood," which is proportional to the distribution of the observed data given the model parameters, written as $p(y \mid \theta)$.
This information is then combined with the prior to produce an updated probability distribution called the "posterior distribution," on which all Bayesian inference is based. Bayes' Theorem, an elementary identity in probability theory, states how the update is done mathematically: the posterior is proportional to the prior times the likelihood, or more precisely,
$$p(\theta \mid y) = \frac{p(\theta)\, p(y \mid \theta)}{\int p(\theta^{*})\, p(y \mid \theta^{*})\, d\theta^{*}}.$$
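As a rough numerical illustration of the rule that the posterior is proportional to the prior times the likelihood, the sketch below evaluates both on a grid of parameter values and normalises the product so that it sums to one. It assumes a binomial likelihood and, by default, a flat prior; the data (4 events out of 60) reuse the earlier mortality example, and the function name and grid size are arbitrary choices.

```python
from math import comb


def grid_posterior(k, n, prior, grid_size=1001):
    """Approximate the posterior for a binomial proportion on an evenly spaced grid.

    prior is any non-negative function of theta on [0, 1]; it need not be normalised.
    """
    thetas = [i / (grid_size - 1) for i in range(grid_size)]
    # Unnormalised posterior: prior times likelihood at each grid point.
    unnormalised = [
        prior(t) * comb(n, k) * t**k * (1 - t) ** (n - k) for t in thetas
    ]
    # Dividing by the total plays the role of the integral in the denominator.
    total = sum(unnormalised)
    weights = [u / total for u in unnormalised]
    return thetas, weights


thetas, weights = grid_posterior(4, 60, prior=lambda t: 1.0)  # flat prior (assumed)
posterior_mean = sum(t * w for t, w in zip(thetas, weights))
print(f"Approximate posterior mean: {posterior_mean:.4f}")
```

Swapping in a different `prior` function changes the weights but not the procedure, which is the practical content of the theorem.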
In theory, the posterior distribution is always available, but in realistically complex models, the required analytic computations often are intractable. Over several years, in the late 1980s and early 1990s, it was realized that methods for drawing samples from the posterior distribution could be very widely applicable.
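One family of such sampling methods is Markov chain Monte Carlo. The sketch below is a bare random-walk Metropolis sampler, chosen here only as a simple illustration rather than as the specific method the text has in mind. It assumes a binomial likelihood with a flat prior, a problem simple enough that the exact answer is known and the sampler can be checked against it. The step size, starting value, and burn-in length are arbitrary.

```python
import math
import random


def log_unnormalised_posterior(theta, k, n):
    """log(prior * likelihood) for a binomial likelihood and a flat prior on (0, 1)."""
    if not 0.0 < theta < 1.0:
        return -math.inf  # zero posterior density outside the unit interval
    return k * math.log(theta) + (n - k) * math.log(1.0 - theta)


def metropolis(k, n, n_samples=20000, step=0.05, seed=0):
    """Random-walk Metropolis: only ratios of the posterior are needed,
    so the normalising integral never has to be computed."""
    rng = random.Random(seed)
    theta = 0.5
    current = log_unnormalised_posterior(theta, k, n)
    samples = []
    for _ in range(n_samples):
        proposal = theta + rng.gauss(0.0, step)
        candidate = log_unnormalised_posterior(proposal, k, n)
        # Accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, candidate - current)):
            theta, current = proposal, candidate
        samples.append(theta)
    return samples


draws = metropolis(4, 60)[2000:]  # discard an initial burn-in period
sample_mean = sum(draws) / len(draws)
print(f"Posterior mean estimated from samples: {sample_mean:.4f}")
```

For this model the exact posterior mean is 5/62, so the printed estimate should land close to 0.08; in realistic models no such closed form exists, and the samples are the only practical route to the posterior.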
There are many reasons for adopting Bayesian methods, and their applications appear in diverse fields. Many people advocate the Bayesian approach because of its philosophical consistency. Various fundamental theorems show that if a person wants to make consistent and sound decisions in the face of uncertainty, then the only way to do so is to use Bayesian methods. Others point to logical problems with frequentist methods that do not arise in the Bayesian framework. On the other hand, prior probabilities are intrinsically subjective – your prior information is different from mine – and many statisticians see this as a fundamental drawback to Bayesian statistics. Advocates of the Bayesian approach argue that this is inescapable, and that frequentist methods also entail subjective choices; this has been a basic source of contention between the ‘fundamentalist’ supporters of the two statistical paradigms for at least the last 50 years. In contrast, it is more the pragmatic advantages of the Bayesian approach that have fuelled its strong growth over the last 20 years, and are the reason for its adoption in a rapidly growing variety of fields.
There are many varieties of Bayesian analysis. The fullest version of the Bayesian paradigm casts statistical problems in the framework of decision making. It entails formulating subjective prior probabilities to express pre-existing information, careful modelling of the data structure, checking and allowing for uncertainty in model assumptions, and formulating a set of possible decisions together with a utility function that expresses how the value of each alternative decision is affected by the unknown model parameters. But each of these components can be omitted. Many users of Bayesian methods do not employ genuine prior information, either because it is insubstantial or because they are uncomfortable with subjectivity. The decision-theoretic framework is also widely omitted, with many feeling that statistical inference should not really be formulated as a decision.
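The decision-theoretic version can be sketched in a few lines: draw parameter values from the posterior, average the utility of each candidate decision over those draws, and choose the decision with the highest expected utility. Everything specific below is hypothetical, including the Beta(11, 41) posterior for an adoption rate, the two decisions, and the payoff figures.

```python
import random


def utility(decision, theta):
    """Hypothetical payoffs: launching earns 1000 per unit of adoption rate
    and costs 150; not launching earns nothing."""
    if decision == "launch":
        return 1000.0 * theta - 150.0
    return 0.0


def expected_utility(decision, theta_samples):
    """Average the utility of a decision over draws from the posterior."""
    return sum(utility(decision, t) for t in theta_samples) / len(theta_samples)


rng = random.Random(0)
# Assumed posterior for the adoption rate, e.g. a flat prior plus 10 yes out of 50.
posterior_draws = [rng.betavariate(11, 41) for _ in range(10000)]

decisions = ["launch", "do not launch"]
best = max(decisions, key=lambda d: expected_utility(d, posterior_draws))
print(f"Expected utility of launching: {expected_utility('launch', posterior_draws):.1f}")
print(f"Preferred decision: {best}")
```

Omitting the utility function and stopping at `posterior_draws` corresponds to the purely inferential style of analysis described above.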
So there are varieties of Bayesian analysis and varieties of Bayesian analysts. But the common strand that underlies this variation is the basic principle of using Bayes' theorem and expressing uncertainty about unknown parameters probabilistically.