Bayes was a past master at code breaking ... probably

Thomas Bayes

Thomas Bayes, an 18th-century clergyman, is not well known outside the realms of academia but his name is attached to a theorem with important applications in the modern world.

He introduced two ideas: that uncertainty can be represented by probabilities, and that evidence can be used to update probabilities about events.

Gambling, code breaking, and the detection of spam e-mails have two things in common. The first is that they all involve thinking about uncertainty. The second is that the uncertainty can be described by the mathematical theory introduced by Bayes.

Bayes studied logic and theology at the University of Edinburgh in preparation for a career in the ministry. Although he was a Presbyterian minister in Tunbridge Wells for 18 years, he maintained a lifelong interest in mathematics. His mathematical accomplishments were sufficient for him to be elected to the Royal Society.

He died 250 years ago, in 1761, and the University of Edinburgh is commemorating this anniversary with a public lecture today.

Bayes is remembered for his work on probability, which is contained in a single article that was retrieved from his papers and published only after his death.

His ideas were known by academics, but only really became practical in the 20th century once powerful computers became available.

Today, Bayesian methods are used by scientists who study the stars and the human genome, and by computer programmers who build systems that automatically process e-mail and place advertisements on search engines. They are also used to predict volcanic eruptions and the outcomes of US presidential elections, and even for gambling.

Consider a forensic problem. A bag containing a large number of identical looking white tablets is found in suspicious circumstances. They can be tested chemically to discover if they are illicit, but we are only allowed to test one tablet at a time.

Suppose we pick a tablet randomly, test it, and find it to be illicit. Surely other tablets must be illicit as well? What are the chances that we picked the only bad one from the bag? On the other hand, the tablets might not be all bad.

The courts are interested in the proportion that are bad and wish to know this while allowing the scientists to examine as few tablets as possible. Using Bayes’ ideas, we can determine the proportion of the consignment that is illicit, to a reasonable degree of certainty, by examining very few tablets. Amazingly, the number to examine remains the same no matter how large the number of tablets in the bag. The courts get the answer they want and the scientists save resources by examining no more tablets than necessary.
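The arithmetic behind this claim can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the exact procedure used in casework: it assumes we start with no opinion about the proportion (a uniform prior), that the bag is large, and that every tablet tested turns out to be illicit. The function name and the 50 per cent and 95 per cent figures are chosen purely for the example.

```python
def tablets_needed(threshold=0.5, certainty=0.95):
    """Smallest number n of tablets to test such that, if all n are illicit,
    the probability that more than `threshold` of the bag is illicit
    is at least `certainty`.

    Assumes a uniform Beta(1, 1) prior on the illicit proportion and a
    large consignment. The posterior is then Beta(n + 1, 1), for which
    P(proportion > t) = 1 - t ** (n + 1).
    """
    n = 0
    while 1 - threshold ** (n + 1) < certainty:
        n += 1
    return n


# Hypothetical question: how many tablets must all test illicit before we
# are 95 per cent sure that more than half the bag is illicit?
print(tablets_needed(0.5, 0.95))
```

Note that the size of the bag appears nowhere in the calculation, which is why, under these assumptions, the answer does not grow with the consignment.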

Despite their power, Bayesian ideas remained somewhat impractical until the advent of computers, because keeping track of lots of probabilities requires large amounts of arithmetic. One modern application is in computer programs for detecting spam e-mail. A simple and surprisingly effective way to identify spam is called the Naive Bayes classifier.

The method starts with an initial probability that a message is spam, which is then updated in light of each individual word in the message. Suppose half the messages that you receive are spam. Then, for any new message you receive, the Naive Bayes classifier will have an initial estimate that there is a 50 per cent probability the new message is spam. It then processes your e-mail one word at a time, modifying the message’s spam probability after each word.

Some words, like “coriander”, are very rarely used in spam messages. So if the suspect e-mail contains the word “coriander”, we want to decrease its spam probability.

Other words, such as the names of some medications, tend to be used often in spam e-mails, so their presence will increase the spam probability. Bayes’ theorem tells us how to combine all these probabilities, but the short answer is that you multiply them. Depending on the final value of the spam probability, the message is placed in your spam box or inbox.
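The word-by-word update described above can be sketched as follows. This is a minimal illustration rather than how any particular e-mail system works: the function name and the word frequencies are made up for the example, and words the tables do not know about are simply skipped. The multiplication is done by adding logarithms, a common trick to stop very small numbers from vanishing on long messages.

```python
import math


def spam_probability(words, prior_spam, p_word_given_spam, p_word_given_ham):
    """Naive Bayes sketch: start from the prior, then multiply in one
    probability per word, separately for 'spam' and 'not spam' ('ham')."""
    log_spam = math.log(prior_spam)
    log_ham = math.log(1 - prior_spam)
    for word in words:
        # Skip words with no recorded frequency (a simplifying assumption).
        if word in p_word_given_spam and word in p_word_given_ham:
            log_spam += math.log(p_word_given_spam[word])
            log_ham += math.log(p_word_given_ham[word])
    # Convert the two scores back into a single probability of spam.
    top = max(log_spam, log_ham)
    spam = math.exp(log_spam - top)
    ham = math.exp(log_ham - top)
    return spam / (spam + ham)


# Made-up frequencies: how often each word appears in spam and in normal mail.
P_WORD_GIVEN_SPAM = {"coriander": 0.001, "pills": 0.05}
P_WORD_GIVEN_HAM = {"coriander": 0.01, "pills": 0.001}

message = "fresh coriander for the soup".split()
print(spam_probability(message, 0.5, P_WORD_GIVEN_SPAM, P_WORD_GIVEN_HAM))
```

With these invented numbers, "coriander" pulls the probability well below the 50 per cent starting point, while a message containing "pills" would be pushed well above it.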

Bayes’ ideas were used in code breaking during the Second World War. In this case, probabilities were associated with code tables, which map the letters in a coded message to those in the original confidential message.

Every time we attempt to decode a new message, it provides more evidence on whether we have managed to determine the code table correctly or not.

One of the most amazing things about mathematics is that ideas turn up in unexpected places. In the case of Bayes’ ideas, an 18th-century clergyman led us to crime investigation, computer programming, and the shortening of a world war. Who said mathematics wasn’t useful?

Colin Aitken is Professor of Forensic Statistics, Chris Watson is Professor of Machine Learning and Charles Sutton is Assistant Professor at the School of Informatics, all at Edinburgh University.