FutureQuest, Inc.
Knowledgebase
What is SpamAssassin "Auto-Learn"
Posted on 15 April 2015 06:23 PM

Question:

What is the option within the SpamAssassin settings "Use auto-learned data:" and how will that affect my filtering?

Answer:

The SpamAssassin auto-learned data, also known as Bayesian learning or
simply "bayes", is a system-wide adaptive learning system. It learns emerging
characteristics of both spam and non-spam messages that the
pre-configured tests don't catch.



To determine which messages to learn from, it performs a second scoring pass on every message passed through SpamAssassin. This pass excludes blacklists, whitelists, and a few other settings that individuals might manipulate. If the resulting score is below one configured threshold, the message is learned as non-spam; if it is above another threshold, it is learned as spam. This learning step happens regardless of whether you have the "Use auto-learned data" flag enabled, and it does not affect the final score.
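
The threshold decision described above can be sketched roughly as follows. This is only an illustration, not FutureQuest's actual configuration: the function name, the enum, and both threshold values are assumptions chosen for the example, since the article does not state the configured thresholds.

```python
from enum import Enum


class Learn(Enum):
    NON_SPAM = "learned as non-spam"
    SPAM = "learned as spam"
    SKIP = "not learned"


# Placeholder thresholds (assumptions for this sketch, not FutureQuest's values).
NONSPAM_THRESHOLD = 0.1
SPAM_THRESHOLD = 12.0


def auto_learn_decision(learning_score: float,
                        nonspam_threshold: float = NONSPAM_THRESHOLD,
                        spam_threshold: float = SPAM_THRESHOLD) -> Learn:
    """Decide whether a message is learned, based on the second scoring pass.

    learning_score is the score from the learning pass, which leaves out
    blacklists, whitelists, and other user-adjustable settings.
    """
    if learning_score < nonspam_threshold:
        return Learn.NON_SPAM
    if learning_score > spam_threshold:
        return Learn.SPAM
    # Scores between the two thresholds are too ambiguous to learn from.
    return Learn.SKIP


if __name__ == "__main__":
    for score in (-1.0, 5.0, 20.0):
        print(score, "->", auto_learn_decision(score).value)
```

Messages that score between the two thresholds are left out of learning in this sketch, so only clearly classified mail influences the learned data.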

When the "Use auto-learned data" flag is enabled, the following happens:

1. The values of the individual test scores used within SpamAssassin are changed to take into account that additional scores from the bayes filtering may be added. These scores can be seen in the SpamAssassin Test and Scoring Chart under the bayes column.

2. The SpamAssassin Bayesian classifier analyzes the message to compute a probability that the contents are spam, based on previously learned messages.

3. A score is chosen based on that calculated probability and added to the final score.
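
As a rough sketch of the three steps above, the example below picks a score from a probability bucket and adds it to the total. Everything named here is hypothetical: the bucket labels, the bucket boundaries, the score values, and the two-column rule representation are assumptions made for illustration, and the real values are the ones shown in the SpamAssassin Test and Scoring Chart.

```python
from typing import List, Tuple

# Hypothetical buckets: (upper bound of probability range, label, score).
# Labels and scores are made up for this sketch.
BAYES_BUCKETS: List[Tuple[float, str, float]] = [
    (0.05, "BAYES_VERY_LOW", -2.0),
    (0.40, "BAYES_LOW", -0.5),
    (0.60, "BAYES_NEUTRAL", 0.0),
    (0.95, "BAYES_HIGH", 2.0),
    (1.00, "BAYES_VERY_HIGH", 3.5),
]


def bayes_score(probability: float) -> Tuple[str, float]:
    """Step 3: choose a score based on the calculated spam probability."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    for upper_bound, label, score in BAYES_BUCKETS:
        if probability <= upper_bound:
            return label, score
    raise AssertionError("unreachable: the last bucket ends at 1.0")


def final_score(rule_hits: List[Tuple[float, float]],
                probability: float,
                use_auto_learned: bool) -> float:
    """Combine rule scores with the optional bayes score.

    Each rule hit is a (score_without_bayes, score_with_bayes) pair,
    mirroring step 1: enabling the flag switches to the bayes column.
    """
    if not use_auto_learned:
        return sum(without for without, _ in rule_hits)
    _, extra = bayes_score(probability)
    return sum(with_bayes for _, with_bayes in rule_hits) + extra


if __name__ == "__main__":
    hits = [(1.0, 0.5), (2.0, 1.5)]
    print(final_score(hits, 0.995, use_auto_learned=False))
    print(final_score(hits, 0.995, use_auto_learned=True))
```

With the flag disabled, the probability is ignored and only the non-bayes column is summed, which matches the statement that learning itself does not affect the final score.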

All messages received within the FutureQuest network that pass through
SpamAssassin are used to "Auto-Learn", as well as messages from blacklisted IP
addresses, as compiled from Realtime Blackhole Lists.