Tuesday 27 March 2012

Judea Pearl's Foreword for the Book

We are delighted to announce that Judea Pearl, who has just won the 2011 Turing Award for work on AI reasoning, has written the following Foreword for the book:

Probabilistic models based on directed acyclic graphs have a long and rich tradition, beginning with work by the geneticist Sewall Wright in the 1920s. Variants have appeared in many fields. Within statistics, such models are known as directed graphical models; within cognitive science and artificial intelligence, such models are known as Bayesian networks. The name honours the Rev. Thomas Bayes (1702-1761), whose rule for updating probabilities in the light of new evidence is the foundation of the approach. The initial development of Bayesian networks in the late 1970s was motivated by the need to model the top-down (semantic) and bottom-up (perceptual) combination of evidence in reading. The capability for bi-directional inferences, combined with a rigorous probabilistic foundation, led to the rapid emergence of Bayesian networks as the method of choice for uncertain reasoning in AI and expert systems, replacing earlier ad-hoc rule-based schemes.

Perhaps the most important aspect of Bayesian networks is that they are direct representations of the world, not of reasoning processes. The arrows in the diagrams represent real causal connections and not the flow of information during reasoning (as in rule-based systems or neural networks). Reasoning processes can operate on Bayesian networks by propagating information in any direction. For example, if the sprinkler is on, then the pavement is probably wet (prediction); if someone slips on the pavement, that also provides evidence that it is wet (abduction, or reasoning to a probable cause). On the other hand, if we see that the pavement is wet, that makes it more likely that the sprinkler is on or that it is raining (abduction); but if we then observe that the sprinkler is on, that reduces the likelihood that it is raining. It is the ability to perform this last form of reasoning, called explaining away, that makes Bayesian networks so powerful compared to rule-based systems and neural networks. They are especially useful and important for risk assessment and decision-making.

Although Bayesian networks are now used widely in many disciplines, those responsible for developing (as opposed to using) Bayesian network models typically require highly specialist knowledge of mathematics, probability, statistics and computing. Part of the reason for this is that, although there have been several excellent books dedicated to Bayesian networks and related methods, these books tend to be aimed at readers who already have a high level of mathematical sophistication: typically they are books that would be used at graduate or advanced undergraduate level in mathematics, statistics or computer science. As such they are not really accessible to readers who are not already proficient in those subjects. This book is an exciting development because it addresses this problem. While I am sure it would be suitable for undergraduate courses on probability and risk, it should be understandable by any numerate reader interested in risk assessment and decision-making. The book provides sufficient motivation and examples (as well as the maths and probability where needed from scratch) to enable readers to understand the core principles and power of Bayesian networks. However, the focus is on ensuring that readers can build practical Bayesian network models, rather than understand in depth the underlying propagation algorithms and theory. Indeed, readers are provided with a tool that performs the propagation, so they will be able to build their own models to solve real-world risk assessment problems.
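
For readers who want to see the "explaining away" pattern from the Foreword in numbers, here is a minimal sketch in Python. It is not taken from the book or from any Bayesian network tool: the three-node structure (rain and sprinkler as independent causes of a wet pavement) follows the Foreword's example, but every probability value and every function name below is an assumption chosen purely for illustration. Inference is done by brute-force enumeration of the eight possible worlds rather than by a propagation algorithm.

```python
from itertools import product

# Illustrative (assumed) probabilities; none of these numbers come from the book.
P_RAIN = 0.2          # prior probability that it is raining
P_SPRINKLER = 0.3     # prior probability that the sprinkler is on
# P(pavement wet | rain, sprinkler), keyed by (rain, sprinkler)
P_WET = {
    (True, True): 0.99,
    (True, False): 0.90,
    (False, True): 0.90,
    (False, False): 0.01,
}


def joint(rain, sprinkler, wet):
    """Joint probability of one complete assignment to the three variables."""
    p = P_RAIN if rain else 1 - P_RAIN
    p *= P_SPRINKLER if sprinkler else 1 - P_SPRINKLER
    p_wet = P_WET[(rain, sprinkler)]
    return p * (p_wet if wet else 1 - p_wet)


def prob_rain(**evidence):
    """P(rain | evidence), summing the joint over all worlds that match the evidence."""
    numerator = denominator = 0.0
    for rain, sprinkler, wet in product([True, False], repeat=3):
        world = {"rain": rain, "sprinkler": sprinkler, "wet": wet}
        if any(world[name] != value for name, value in evidence.items()):
            continue  # this world contradicts what we observed
        p = joint(rain, sprinkler, wet)
        denominator += p
        if rain:
            numerator += p
    return numerator / denominator


if __name__ == "__main__":
    print("P(rain)                      =", round(prob_rain(), 3))
    print("P(rain | wet)                =", round(prob_rain(wet=True), 3))
    print("P(rain | wet, sprinkler on)  =", round(prob_rain(wet=True, sprinkler=True), 3))
```

With these assumed numbers, seeing a wet pavement raises the probability of rain from 0.2 to roughly 0.46, and then learning that the sprinkler is on pulls it back down to roughly 0.22. The sprinkler "explains away" the wet pavement, even though rain and sprinkler are independent before any evidence is observed.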

The danger of p-values and statistical significance testing

I have just come across an article in the Financial Times (it is not new - it was published in 2007) titled "The Ten Things Everyone Should Know About Science". Although the article is not new, the place where I found the link to it is: right at the top of the home page for the 2011-12 course on Probabilistic Systems Analysis at MIT. In fact the top bullet point says:
The concept of statistical significance (to be touched upon at the end of this course) is considered by the Financial Times as one of "The Ten Things Everyone Should Know About Science".
The FT article does indeed list "Statistical significance" as one of the ten things, along with: Evolution, Genes and DNA, Big Bang, Quantum Mechanics, Relativity, Radiation, Atomic and Nuclear Reactions, Molecules and Chemical Reactions, and Digital data. That is quite illustrious company, and in the sense that it helps promote the importance of correct probabilistic reasoning I am delighted. However, as is fairly common, the article assumes that 'statistical significance' is synonymous with p-values. The article does hint at the fact that there might be some scientists who are sceptical of this approach when it says:
Some critics claim that contemporary science places statistical significance on a pedestal that it does not deserve. But no one has come up with an alternative way of assessing experimental outcomes that is as simple or as generally applicable.
In fact, that first sentence is a gross understatement, while the second is simply not true. To see why the first sentence is a gross understatement, look at this summary (which explains what p-values are) that appears in Chapter 1 of our forthcoming book (you can see full draft chapters of the book here). To see why the second sentence is not true, look at this example from Chapter 5 of the book (which also shows why Bayes offers a much better alternative). Also look at this (taken from Chapter 10), which explains why the related 'confidence intervals' are not what most people think (and how this dreadful approach can also be avoided using Bayes).

Hence it is very disappointing that an institute like MIT should be perpetuating the myths about this kind of significance testing. These myths have had (and continue to have) a profound negative impact on all empirical research - see, for example, the article "Why Most Published Research Findings Are False". Not only does it mean that 'false' findings are published, but also that more scientifically rigorous empirical studies are rejected because authors have not performed the dreaded significance tests demanded by journal editors or reviewers. This is something we see all the time, and I can share an interesting anecdote on this. I was recently discussing a published paper with its author. The paper was specifically about using the Bayesian Information Criterion to determine which model was producing the best prediction in a particular application. The Bayesian analysis was the 'significance test' (only a lot more informative). Yet at the end of the paper was a section with a p-value significance test analysis that was redundant and uninformative. I asked the author why she had included this section, as it rather undermined the value of the rest of the paper. She told me that the paper she submitted did not have this section, but that the journal editors had demanded a p-value analysis as a requirement for publishing the paper.