Preventing email spam/Spam identification tactics



Several techniques can be used to identify potential spam.

Identification can be based on a combination of the following:

* Content type identification, such as the presence of MIME data
* Regular expression matching for specific textual patterns in the message headers or body
* Data pattern analysis in attachments (for example, program code)
* Bayesian statistical analysis of message headers or text (building upon simple regexp pattern matching)
* Language pattern analysis
* Traffic pattern analysis over a large group of messages
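
The first two techniques in the list can be illustrated with a minimal Python sketch that uses only the standard library. The function name `inspect_message` and the patterns in `SUBJECT_PATTERNS` and `BODY_PATTERNS` are hypothetical and chosen purely for illustration; a real filter would rely on a much larger, maintained rule set.

```python
import re
from email import message_from_string

# Hypothetical patterns for illustration only; not taken from any real filter.
SUBJECT_PATTERNS = [
    re.compile(r"\bfree\s+money\b", re.IGNORECASE),
    re.compile(r"\bact\s+now\b", re.IGNORECASE),
]
BODY_PATTERNS = [
    re.compile(r"\bclick\s+here\b", re.IGNORECASE),
]


def inspect_message(raw):
    """Return simple spam indicators for one raw email message."""
    msg = message_from_string(raw)
    subject = msg.get("Subject", "")

    # Content type identification: list the MIME type of every part.
    content_types = [part.get_content_type() for part in msg.walk()]

    # Gather the decoded text of all text/plain parts for body matching.
    body_chunks = []
    for part in msg.walk():
        if part.get_content_type() == "text/plain":
            payload = part.get_payload(decode=True)
            if payload is not None:
                body_chunks.append(payload.decode("utf-8", errors="replace"))
    body = "\n".join(body_chunks)

    return {
        "is_multipart": msg.is_multipart(),
        "content_types": content_types,
        "subject_hits": [p.pattern for p in SUBJECT_PATTERNS if p.search(subject)],
        "body_hits": [p.pattern for p in BODY_PATTERNS if p.search(body)],
    }
```

The returned dictionary only reports indicators; deciding whether a message is spam is left to whatever policy consumes them.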

Most current spam filtering systems use one or more of the techniques listed above, often in combination.
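
As a rough sketch of how two techniques might be combined, the Python example below adds a regular expression rule score to the log-odds produced by a small naive Bayes classifier. The class `NaiveBayesScorer`, the `RULES` table, its weight of 2.0, and the choice to simply add the two scores are all assumptions made for illustration; they do not describe how any particular filter works.

```python
import math
import re
from collections import Counter

TOKEN_RE = re.compile(r"[a-z']+")

# Hypothetical rule table: (pattern, weight). The weight is arbitrary.
RULES = [(re.compile(r"\bclick\s+here\b", re.IGNORECASE), 2.0)]


def tokenize(text):
    """Split text into lowercase word tokens."""
    return TOKEN_RE.findall(text.lower())


class NaiveBayesScorer:
    """Tiny naive Bayes text scorer with add-one (Laplace) smoothing."""

    def __init__(self):
        self.counts = {"spam": Counter(), "ham": Counter()}
        self.docs = {"spam": 0, "ham": 0}

    def train(self, text, label):
        """Record one training message labelled 'spam' or 'ham'."""
        self.counts[label].update(tokenize(text))
        self.docs[label] += 1

    def log_odds(self, text):
        """Log of P(spam | text) / P(ham | text); positive leans spam.

        Requires at least one training message of each label.
        """
        vocab = set(self.counts["spam"]) | set(self.counts["ham"])
        totals = {k: sum(c.values()) for k, c in self.counts.items()}
        score = math.log(self.docs["spam"] / self.docs["ham"])
        for tok in tokenize(text):
            p_spam = (self.counts["spam"][tok] + 1) / (totals["spam"] + len(vocab))
            p_ham = (self.counts["ham"][tok] + 1) / (totals["ham"] + len(vocab))
            score += math.log(p_spam / p_ham)
        return score


def combined_score(scorer, text):
    """Add the weights of matching rules to the Bayesian log-odds."""
    rule_score = sum(weight for pattern, weight in RULES if pattern.search(text))
    return rule_score + scorer.log_odds(text)
```

A caller would train the scorer on labelled messages first, then compare `combined_score` against a threshold of its own choosing; picking that threshold is a tuning decision outside the scope of this sketch.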