Not long ago, most anti-spam products simply used a list of keywords to identify spam. A good set of keywords could catch a lot of spam. However, a keyword-based anti-spam filter requires manual updating and can easily be fooled by tweaking the message a little. Spammers simply examine the latest anti-spam techniques and find ways to bypass them. As a result, you are left with a high number of false positives.
A need arose for a new, effective technique to fight spam. Experience showed that this new method had to adapt itself to spammers’ tactics, which change over time.
Bayesian filtering is based on the principle that most events are dependent and that the probability of an event occurring in the future can be inferred from previous occurrences of that event. The same approach is used to identify spam: if some piece of text occurs often in spam emails but rarely in legitimate mail, then it is reasonable to suppose that an email containing it is probably spam.
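The reasoning above can be made concrete with a small sketch. It estimates how likely a message is to be spam given that it contains a particular word, using Bayes' rule on counts from past mail. The function names, the tiny sample messages, and the assumption that spam and legitimate mail are equally likely beforehand are all illustrative choices made here, not details of any particular product.

```python
from collections import Counter


def count_tokens(messages):
    """Count, for each word, how many messages contain it at least once."""
    counts = Counter()
    for message in messages:
        counts.update(set(message.lower().split()))
    return counts


def token_spam_probability(token, spam_counts, ham_counts, n_spam, n_ham):
    """Estimate P(spam | token) with Bayes' rule.

    Assumption (illustrative): equal prior probability for spam and
    legitimate mail, so the priors cancel out of the formula.
    """
    p_token_given_spam = spam_counts[token] / n_spam
    p_token_given_ham = ham_counts[token] / n_ham
    total = p_token_given_spam + p_token_given_ham
    if total == 0:
        return 0.5  # word never seen before: no evidence either way
    return p_token_given_spam / total


# Hypothetical training mail, invented for this example.
spam = ["cheap pills online", "cheap watches", "win cheap prize"]
ham = ["meeting at noon", "cheap lunch today", "project report attached", "see you at noon"]

spam_counts = count_tokens(spam)
ham_counts = count_tokens(ham)

for word in ("cheap", "noon", "hello"):
    p = token_spam_probability(word, spam_counts, ham_counts, len(spam), len(ham))
    print(f"{word}: {p:.2f}")
```

In this toy data, "cheap" appears in every spam message but in only one of four legitimate ones, so it scores high. "noon" appears only in legitimate mail and scores zero, and an unseen word stays neutral at 0.5. Because the counts come from the mail itself, the estimates shift as new spam and legitimate messages are added.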
To filter mail using the Bayesian technology, you need to generate a database of words collected from spam and legitimate mail. Then a probability value is assigned to...