Blocking Spam with Exim

Recent reports indicate that spam is increasing again. I have been using Exim to filter spam for several years, and some recent tuning has decreased the percentage of spam that reaches my spam filters. This article discusses the techniques used and provides implementation examples. Spambots tend to be simple programs that don’t handle slow servers very well. Greylisting is an effective method of blocking them, as they usually don’t retry. My latest changes use delays to cause many spambots to abandon their attempts. Greylisting is now used only for poorly configured servers that make it to the Recipient command.

Configuration Modifications

I use the Debian split configuration on an Ubuntu system.  This makes it easy to add ACLs and supporting configuration.   The changes do not alter the standard configuration, and can be easily altered or disabled if required.  The example ACLs provided here have been edited to remove some local logging code.

The split configuration uses sub-directories under /etc/exim4/conf.d for the various sections of the configuration files. Each attached file has a header giving a suggested name, prefixed with the sub-directory it belongs in. Files in these directories are processed in alpha-numeric order. Comments are stripped when the final configuration is generated, so comment liberally.

I use the file main/00_localmacros. Among other things, this includes the Internet IP and addresses for the server. This file is processed very early, so you can add additional overrides to it. Omit the ACL specifications if you do not provide the implementation files.

The RFCs indicate that sending servers should expect receiving servers to delay responses for significant periods. In most cases senders are required to handle delays of up to 5 minutes. Only in handling Data are minimum timeouts lower. We define short, standard, and long delays for use in different situations. Our configuration uses cumulative delays (several can be triggered before a single response), so the long delay is limited accordingly.
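To illustrate why cumulative delays limit how long the long delay can be, here is a small Python sketch that totals the delays that could all trigger before one response and compares the worst case with the 5 minute timeout. The delay values, the stage layout, and the 60 second safety margin are hypothetical numbers chosen for illustration; they are not the values from my configuration.

```python
# Hypothetical delay values in seconds (assumptions, not the article's settings).
SHORT_DELAY = 10
STANDARD_DELAY = 20
LONG_DELAY = 60

# Senders are expected to wait at least this long for most replies.
CLIENT_TIMEOUT = 300


def worst_case_wait(stages):
    """Return the largest total delay that can build up before one reply.

    `stages` maps a stage name to the list of delays that may all
    trigger before the reply for that stage is sent.
    """
    return max((sum(delays) for delays in stages.values()), default=0)


def within_budget(stages, timeout=CLIENT_TIMEOUT, margin=60):
    """True if the worst case still leaves `margin` seconds for DNS lookups."""
    return worst_case_wait(stages) <= timeout - margin


# Assumed layout: which delays can stack up at each stage.
stages = {
    "connect": [SHORT_DELAY, LONG_DELAY, LONG_DELAY],
    "helo": [LONG_DELAY],
    "mail": [STANDARD_DELAY, LONG_DELAY],
}
print(worst_case_wait(stages), within_budget(stages))
```

If raising the long delay makes within_budget return False, well-behaved senders could start timing out, which is the limit referred to above.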

ACLs (Access Control Lists)

Exim uses ACLs to determine if the email is to be accepted.  The ACLs described below are optional additions to the standard Debian/Ubuntu configuration.   If they are not provided, the default action is to accept.   We use three ACL result codes:

  • accept (return a success response);
  • deny (return a permanent error); and
  • defer (return a temporary error).
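The three results correspond to the standard SMTP reply classes, which is what makes defer useful for greylisting and for testing: a well-behaved sender retries after a temporary error but not after a permanent one. The mapping can be sketched in Python as follows. This is a model for illustration only; the names are my own and are not part of Exim.

```python
from enum import Enum


class Verdict(Enum):
    ACCEPT = "accept"
    DENY = "deny"
    DEFER = "defer"


# First digit of the SMTP reply code that each verdict leads to:
# 2xx success, 5xx permanent failure, 4xx temporary failure.
REPLY_CLASS = {Verdict.ACCEPT: 2, Verdict.DENY: 5, Verdict.DEFER: 4}


def reply_class(verdict):
    """Return the SMTP reply class (first digit) for an ACL verdict."""
    return REPLY_CLASS[verdict]


def sender_should_retry(verdict):
    """Only a temporary (4xx) failure asks the sender to try again later."""
    return verdict is Verdict.DEFER
```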

With the exception of the connect ACL, the ACLs described here are run after the command is received, and determine the response to the command. The connect ACL is run before the banner message is given, and determines the status of the connect response.

The ACLs run at connect time and before the HELO/EHLO command apply to all SMTP connections.  We use conditions at the head of the ACL to exempt locally trusted connections, and MUAs using the Submission port.

The mail ACL is a replacement for the existing ACL. The default mail ACL consists of an optional check to ensure that a HELO was done first. This check is included in our configuration.

The recipient and data ACLs described here are run by the existing ACLs for corresponding commands.   The existing ACLs provide basic checks.   The data ACL handles all incoming messages regardless of source.

Connection Delays

The acl/25_local-config_check_connect file contains a connect-time ACL. The banner response is delayed until the ACL returns a result. The delay includes all triggered delays and the time required by DNS lookups and other processing. With the delays specified in our configuration, the total should rarely exceed a minute. More than 40% of servers listed with Spamhaus abandon their connections at this point.

A check is done to verify that the address has the appropriate PTR and A records for a mail server. If the correct DNS entries are missing, a short delay is applied and pipelining is denied. If the DNS lookups can’t be done, the connection is deferred after a short delay. Unfortunately, a number of legitimate servers don’t respect the timeout requirements and have poorly configured DNS records. As these are mostly marketing mail servers, you may want to risk some of this mail by increasing the delay.

The primary action used is to apply a long delay for servers that are listed in a couple of trusted DNS blacklists. This delay is repeated in most ACLs. Servers listed in a DNS whitelist run by dnswl.org are exempted from the delay. We also add a long delay to servers that are locally blacklisted.
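The connect-time decisions described in this section can be modelled roughly as below. This is a Python sketch of the logic, not the Exim ACL itself. The field names, the delay values, and the order of the checks are assumptions, as is the choice that the dnswl.org whitelist exempts a host only from the DNS blacklist delay and not from the local blacklist delay.

```python
from dataclasses import dataclass


@dataclass
class ConnectInfo:
    trusted: bool = False            # local host or Submission-port client
    dns_lookup_failed: bool = False  # DNS lookups could not be completed
    rdns_ok: bool = True             # PTR and matching A record found
    dnswl_listed: bool = False       # listed in the DNS whitelist
    dnsbl_listed: bool = False       # listed in a trusted DNS blacklist
    locally_blacklisted: bool = False


def connect_acl(info, short=10, long=60):
    """Return (verdict, total_delay_seconds, pipelining_allowed)."""
    if info.trusted:
        return ("accept", 0, True)
    if info.dns_lookup_failed:
        # Temporary DNS trouble: defer after a short delay.
        return ("defer", short, True)

    delay = 0
    pipelining = True
    if not info.rdns_ok:
        delay += short
        pipelining = False
    if info.dnsbl_listed and not info.dnswl_listed:
        delay += long
    if info.locally_blacklisted:
        delay += long
    # The banner is sent only after the accumulated delay has elapsed.
    return ("accept", delay, pipelining)
```

Note that a blacklisted host is still accepted at this point; the delay alone is what causes many spambots to give up.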

EHLO / HELO Checks

The acl/25_local-config_check_helo file contains an ACL for the HELO command. Once we have received the remote server’s identification, we can begin to verify that it is legitimate or at least properly configured. The HELO identity is easily forged, and the checks here are designed to block many of these forgeries. The ACL enforces these checks:

  • The sender identity must be a fully qualified domain name.
  • The sender identity must not be a local identity.
  • The sender identity must not be an IP address.
  • The sender identity must not be a domain literal.  (Added because we have enabled Domain Literals for statistical purposes.)
  • If an SPF record exists for the server, the server must be approved by the SPF record, or the SPF return must be neutral.
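The syntactic checks in this list (everything except SPF) can be sketched in Python as shown below. This only illustrates the rules and is not the ACL. The function name and the default local names are hypothetical, and the SPF check is left out because it needs a DNS lookup.

```python
import ipaddress


def helo_failures(helo, local_names=("localhost", "mail.example.org")):
    """Return the list of HELO rules that the given identity breaks.

    `local_names` stands in for this server's own names (hypothetical values).
    """
    name = helo.strip().lower().rstrip(".")

    # A domain literal such as [192.0.2.1] is rejected outright.
    if name.startswith("[") and name.endswith("]"):
        return ["domain literal"]

    # A bare IP address is not an acceptable identity either.
    try:
        ipaddress.ip_address(name)
        return ["ip address"]
    except ValueError:
        pass

    failures = []
    if "." not in name:
        failures.append("not fully qualified")
    if name in local_names:
        failures.append("local identity")
    return failures
```

An empty list means the identity passed these checks; any entry would trigger the failure handling for the HELO command.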

As much as I would like to verify that a valid domain is used, I receive too much valid email with invalid domains.  Unfortunately these are servers so poorly configured that they do not know their own Internet identity.  I am continuing to research this case and will likely begin conditionally blocking servers.

Long delays are applied after all failures. SPF results other than fail that are not accepted incur short delays.

MAIL Checks

More than 75% of the servers listed with Spamhaus abandon the connection before sending a mail command.

The mail command provides an opportunity to check for forged sender identities. Like the HELO identity, the sender address is easy to forge. The acl/30_local-config_check_mail file contains an ACL to be applied to the MAIL command. This ACL enforces these checks:

  • A HELO command must be issued before the MAIL command.
  • The sender must not use a local domain unless the server hosts an approved mailing list.
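As a rough model of these two rules, consider the Python sketch below. It is an illustration only. The parameter names are my own, and the trusted flag (for local and Submission clients, which legitimately send from local domains) is an assumption about how such connections would be exempted.

```python
def mail_acl(helo_seen, sender, local_domains,
             hosts_approved_list=False, trusted=False):
    """Return "accept" or "deny" for a MAIL command.

    `sender` is the envelope sender; an empty string is the null sender
    used by bounce messages.
    """
    if trusted:
        # Assumption: trusted clients may use local sender domains.
        return "accept"
    if not helo_seen:
        return "deny"

    # Everything after the last "@" is the sender's domain.
    domain = sender.rpartition("@")[2].lower()
    if domain in local_domains and not hosts_approved_list:
        # An outside host claiming one of our domains is a likely forgery.
        return "deny"
    return "accept"
```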

The default configuration does sender checks in the recipient ACL.  These checks include:

  • Optionally, verify that the sender address has a domain to which email can be routed.
  • Optionally, verify that the sender address can receive mail.
  • Optionally, verify that the sender address is permitted by SPF.

Due to information forgery by legitimate organizations, a number of checks are difficult to use without whitelists.  Some of these are conditionally being tested using the freeze control.  This causes the email to be held in the delivery queue if it is accepted.  These rules, which may initially require significant administration time, include:

  • The HELO identity does not have a valid domain at the second level (example.com rather than com).
  • The sender address has a domain part to which email cannot be routed.
  • The sender is not permitted by SPF policy.

Recipient Checks

The 30_local-config_check-rcpt file contains local additions to the ACL for the Recipient command, which is issued once for each recipient. Our ACL is included in the existing ACL just before it accepts the current recipient. This mechanism is designed to allow local checks to be added easily to either configuration setup. Other than the delays, the checks in the prior ACLs could be implemented here. The same mechanism is used for the data command.

The following rules are implemented:

  • Accept signed return path addresses when the sender is either empty, or the targeted recipient.  Other senders are handled by normal checks. Signing return paths has been described in another post.
  • Flag bogus notifications for handling in the Data ACL.  This allows incoming callouts to work.
  • Deny the recipient if the server is listed in the Spamhaus blacklist.
  • Accept the message if it is a notification.
  • Greylist the message if the server looks bogus.  (The server fails rDNS and HELO validations and is not whitelisted. This could be more aggressive and greylist if either validation fails.) Greylisting is implemented using the MySQL method described on a variety of sites.
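The greylisting rule can be sketched as below. This Python model keeps its records in an in-memory dictionary instead of MySQL, so it is a simplified stand-in and not the implementation referred to above. The triplet key (host, sender, recipient), the 5 minute minimum delay, and the 4 hour expiry are common greylisting conventions that I am assuming here.

```python
import time


def looks_bogus(rdns_ok, helo_ok, whitelisted):
    """Greylist only when both validations fail and the host is not whitelisted."""
    return not whitelisted and not rdns_ok and not helo_ok


class Greylist:
    def __init__(self, min_delay=300, expiry=4 * 3600):
        self.min_delay = min_delay  # seconds a sender must wait before retrying
        self.expiry = expiry        # seconds after which a record is forgotten
        self.first_seen = {}        # (ip, sender, recipient) -> first attempt time

    def check(self, ip, sender, recipient, now=None):
        """Return "defer" for a first or too-early attempt, else "accept"."""
        now = time.time() if now is None else now
        key = (ip, sender.lower(), recipient.lower())
        first = self.first_seen.get(key)
        if first is None or now - first > self.expiry:
            self.first_seen[key] = now  # new or expired: start the clock
            return "defer"
        if now - first < self.min_delay:
            return "defer"              # retried too quickly
        return "accept"
```

A real mail server retries after the temporary error and is accepted on a later attempt; most spambots never come back.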

The checks applied to this point eliminate over 90% of our spam load. This occurs with little overhead, and requires only a few DNS lookups. The lookups end up locally cached for use by the spam filter when the Data ACL is run.

DATA Checks

The 40_local-config_check-data file contains local additions to the ACL for the DATA command. The data ACL is the last chance to reject spam. The following rules are implemented:

  • Reject messages flagged as a bogus notification in the recipient ACL.
  • Scan the message for malware and freeze if any is found.
  • Accept messages from local senders and senders using the Submission protocol.
  • Invoke SpamAssassin to check the message content for spam.

To feed our database, we scan all Internet messages. Otherwise we would skip the SpamAssassin scan based on whitelists and other criteria. Once we have a spam score from SpamAssassin, we run several rules. These include:

  • Add a spam status header to the message.
  • Accept the message if it is ham.  (Lowest spam scores.)
  • Reject all spam from postmasters and mailer-daemons.
  • Reject all Spam which exceeds a high limit.
  • Flag Spam in the subject header if we are going to accept it.
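The order in which these score-based rules apply can be sketched in Python as follows. The thresholds, the header value format, and the "[SPAM]" subject tag are hypothetical values chosen for illustration; the actual limits are set in the configuration. Identifying postmasters and mailer-daemons by the local part of the sender address is also an assumption.

```python
def classify(score, sender, subject,
             ham_limit=0.0, spam_limit=5.0, reject_limit=12.0):
    """Return (verdict, subject, added_headers) for a scanned message.

    All three limits are hypothetical example values.
    """
    # A status header is always added, whatever happens next.
    headers = {"X-Spam-Status": f"score={score:.1f}"}

    if score <= ham_limit:
        return ("accept", subject, headers)  # clearly ham

    is_spam = score >= spam_limit
    local_part = sender.partition("@")[0].lower()
    if is_spam and local_part in ("postmaster", "mailer-daemon"):
        return ("deny", subject, headers)    # spam posing as a notification
    if score > reject_limit:
        return ("deny", subject, headers)    # too spammy to accept at all
    if is_spam:
        subject = "[SPAM] " + subject        # accepted, but flagged
    return ("accept", subject, headers)
```

Messages scoring between the ham and spam limits are accepted without a subject flag; only the status header records their score.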

Notes

When testing new rules, I use two techniques to prevent loss of email. With both, I add hosts to a whitelist or a blacklist as appropriate. The techniques are:

  • Defer rather than deny the message.
  • Use control = freeze to prevent final delivery if the email is eventually accepted.

The logcheck program checks Exim’s mainlog file and reports lines of interest; normal messages are excluded from the report. All messages which are frozen or deferred are reported. This allows for timely followup.

My experience has shown that servers for legitimate bulk and automated email are often poorly configured. This prevents me from applying some rules I would prefer to implement. I do attempt to notify some operators, but in some cases their configuration is so poor that contacting them is nearly impossible. (Thomas Cook Travel, this means you, among others.)