Distributed Checksum Clearinghouse

Top Dog gives you access to public Distributed Checksum Clearinghouse (DCC) servers.


What is the DCC?

  • DCC provides information as to the "bulkiness" of your incoming email by telling you how many people other than yourself have already received it.
  • DCC is similar to the Apocgraphy database in that it computes checksums of emails and checks them against a server, which responds with information about the email.
  • More information on the DCC is available at the DCC home page.
  • A general description of DCC is available here.
  • DCC is written and generously given away by Vernon Schryver.


Using DCC with Top Dog

Important: If you are using Top Dog in Lite or Advanced rather than Power User mode, the DCC is used in the background and you do not have to worry about any of the following.

The recommended method of using DCC with Top Dog is to explicitly whitelist any bulk mail that you want to receive (e.g. mailing lists) using a list of approved senders or one of the other tests. You can then have Top Dog check one of the public DCC servers to see whether the remaining messages are bulk. This allows you to screen out any bulk mail that you are not interested in receiving.
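As a rough illustration of this order of checks (approved senders first, DCC counts second), here is a minimal Python sketch. It is not Top Dog's implementation: the function name `should_screen`, the `dcc_counts` dictionary and the threshold of 50 are all assumptions made for the example.

```python
def should_screen(sender, approved_senders, dcc_counts, bulk_threshold=50):
    """Return True if the message looks like unwanted bulk mail.

    sender           -- address from the "From:" header
    approved_senders -- set of lowercase addresses you want mail from
    dcc_counts       -- hypothetical dict: checksum name -> count reported by DCC
    bulk_threshold   -- assumed cut-off; pick a value that suits your mail
    """
    # Step 1: approved senders (e.g. mailing lists) are never screened.
    if sender.lower() in approved_senders:
        return False
    # Step 2: everything else is screened if any checksum was seen often.
    return any(count >= bulk_threshold for count in dcc_counts.values())


approved = {"list@example.org"}
print(should_screen("list@example.org", approved, {"Body": 1000}))
print(should_screen("stranger@example.com", approved, {"Body": 1000}))
```

Checking the approved list first means that a wanted mailing list is kept no matter how large its DCC counts are.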

When a query is made to a DCC server, a header tag is added to your email indicating the server's response, in the format described in the documentation.

Top Dog buffers server responses. This means that if you have more than one test for a server, the server is queried only the first time; later tests reuse the buffered response.
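The buffering idea can be sketched as a small cache in Python. This is only an illustration under assumed names: `BufferedDccClient`, `lookup` and the `query_server` callable are hypothetical and do not describe how Top Dog is actually built.

```python
class BufferedDccClient:
    """Remember each server's response so repeated tests do not re-query it."""

    def __init__(self, query_server):
        # query_server is a hypothetical callable: (server, message_key) -> response
        self._query_server = query_server
        self._responses = {}

    def lookup(self, server, message_key):
        key = (server, message_key)
        if key not in self._responses:
            # First test for this server and message: ask the server.
            self._responses[key] = self._query_server(server, message_key)
        # Later tests get the buffered response.
        return self._responses[key]


calls = []

def fake_query(server, message_key):
    # Stand-in for a real network query, used only for this example.
    calls.append(server)
    return {"Body": "many"}

client = BufferedDccClient(fake_query)
client.lookup("dcc.example.net", "msg-1")
client.lookup("dcc.example.net", "msg-1")
print(len(calls))
```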


DCC Checksums

There are 7 different types of checksums that DCC generates:

  • IP Address of Sender
    This checksum is generated using the last "Received:" header tag.
    In other words it is the IP address of the server that sent the message to your server.
    This functionality is similar to a DNS database query.
  • From Tag
    This checksum is generated using the email address in the "From:" header tag.
    The reply to this checksum indicates the number of emails that have been received by DCC using the same from address as the current email.
  • Message-ID
    This checksum is generated using the value of the "Message-ID:" header tag.
    This value uniquely identifies an email message.
    Large values returned for this checksum are a good indication that the email is bulk.
  • Received Line
    This checksum is generated using the last "Received:" header tag.
    It is similar to the IP Address checksum above except that it is more inclusive.
  • Body
    This checksum is generated using the text found in the body of the email.
    Large values returned for this checksum are a good indication that the exact same email has been sent out to a large number of people without modification.
    This is the safest value to screen on since it will only match identical emails.
  • Fuz1
    This checksum is generated using the text found in the body of the email after applying some filtering of the text to remove insignificant differences.
    Large values returned for this checksum are a good indication that the email has been sent out to a large number of people with minor modifications.
    Spammers often change their emails only slightly by inserting your name into the text or by adding "hash busters" or "unique text insertions" to avoid being detected by anti-spam measures. This checksum attempts to return the same value for such emails, giving you a better idea of how many people the email was sent to.
  • Fuz2
    This checksum is identical to the Fuz1 checksum except that a different filter is used.
    The result is that some emails that manage to elude Fuz1 are caught by Fuz2.

Note: Not every checksum is generated for every email, since an email may not contain enough of the appropriate information to generate a given checksum.
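The difference between an exact body checksum and a fuzzy one can be illustrated with a short Python sketch. The filtering shown here (lowercasing, dropping digits, collapsing punctuation and whitespace) and the use of SHA-256 are assumptions chosen for the example; they are not the filters or hash that DCC uses for Body, Fuz1 or Fuz2.

```python
import hashlib
import re


def body_checksum(body: str) -> str:
    """Exact checksum: any change to the text changes the result."""
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def fuzzy_checksum(body: str) -> str:
    """Illustrative fuzzy checksum: filter out small differences, then hash."""
    text = body.lower()
    text = re.sub(r"\d+", "", text)       # drop digits such as random "hash buster" numbers
    text = re.sub(r"[^a-z]+", " ", text)  # collapse punctuation and whitespace
    return hashlib.sha256(text.strip().encode("utf-8")).hexdigest()


a = "Buy now!!  Offer 12345"
b = "buy now! offer 99871"
print(body_checksum(a) == body_checksum(b))    # exact checksums differ
print(fuzzy_checksum(a) == fuzzy_checksum(b))  # fuzzy checksums match
```

Two messages that differ only in the filtered-out details share a fuzzy checksum, so their counts accumulate together on the server.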

A server will return one of four different values for each checksum type submitted:

  • OK
    This value returned by the server indicates that the checksum is known to the server but the server has certified that it is not an indication of spam.
  • OK2
    This value returned by the server indicates that the checksum is known to the server but the server is only partly sure that it is not an indication of spam.
  • Many
    This value returned by the server indicates that the checksum is known to the server and the server considers it an indication of spam.
    This value is assigned to a checksum if the email that generated the checksum landed in a spam trap or the administrator of the server has some other reason to expect that it is a reliable indicator of spam.
    It is worth noting that what one person considers spam you may not, and that the criteria for a "Many" listing are at the discretion of the server administrator. It is therefore recommended that you only filter such messages after having first checked your approved lists.
  • Number
    If the server has no opinion on what a checksum may indicate, it will simply return the number of times that the checksum has been encountered.
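The four kinds of response can be handled with a small Python sketch like the one below. The function name `interpret`, the returned labels and the threshold of 50 are hypothetical choices for illustration, and the exact spelling of the values in a real response may differ from what is assumed here.

```python
def interpret(value: str, bulk_threshold: int = 50) -> str:
    """Map one checksum's response value to an illustrative label."""
    v = value.strip().lower()
    if v == "ok":
        return "cleared"           # server certifies this is not an indication of spam
    if v == "ok2":
        return "probably-cleared"  # server is less sure than for OK
    if v == "many":
        return "listed"            # server considers this an indication of spam
    # Otherwise the server returned a plain count with no opinion attached.
    count = int(v)                 # raises ValueError for unrecognised values
    return "bulk" if count >= bulk_threshold else "not-bulk"


for value in ("OK", "OK2", "Many", "3", "2500"):
    print(value, "->", interpret(value))
```

Keeping "listed" separate from "bulk" lets you treat an administrator's "Many" listing differently from a count that simply exceeds your own threshold.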