ChangeLog

The domains.txt file is regularly updated to include new domains and cleaned to remove domains that have expired.

About

The file below is a comprehensive and regularly updated list of domains and IP addresses that are used for delivering advertisements. Please check the file description on this page for the most recent update and feel free to sync. This list is frequently updated in the following ways:

  • new entries are added when new advertisements are spotted on websites.
  • old entries are regularly deleted to avoid cluttering the file with domains and addresses that no longer exist.

These spam organisations usually register or lease domains for a short while and then unregister them, so that spam filters such as this one do not pick them up. If you check the domains and addresses in other spam lists, you will notice that they often contain dead domains.

Since this list is comprehensive, additions will appear in the changelog. It is important to note that this domain list does not censor "self-spam" (banners or advertisements that are displayed using the same domain name or address path of a legitimate website). For example, a website that chooses to spam itself by adding banners and other annoyances using its own URL (either as a subdomain or a website path from the root) is not censored by this list.

This database tries very hard not to block service providers that may be part of the hosting policy for hosted websites. The database does block "related spam", for example on sourceforge.net or slashdot.org.

Spam Classification

| Type | Description | Example | Blocked |
|------|-------------|---------|---------|
| out of context spam | any banner, image, or content that is completely out of the context of a website | numerous: pills, porn, forex, etc. | Yes |
| related spam | advertisements that appear on websites and are related to the website content but advertise for other domains | technical websites that advertise for large websites like Oracle, Google, IBM, etc. | Yes |
| hosting spam | spam enforced by hosting providers | forums that are hosted on a free service and, as part of the TOS, are forced (either manually or automatically) to display a banner linking to the hosting company | No |
| self spam | any banner, image or content that is served by a website from its own address space in order to promote itself or a different domain | several | No |

Using this List

This list can be used in the following manner:

  • In the /etc/hosts file (or using tcp-wrappers, /etc/hosts.deny) in order to redirect requests to the localhost or some invalid address. This method is slow even on powerful machines and, if the situation permits, it should be avoided.
  • As part of a DNS system, by creating zones. This method may hog or stall the DNS server because it would require a large number of dummy zones to be created. On the other hand, this method is the most efficient because these domains will be blocked at the DNS level and the browser (or proxy) will never even contact the domain.
  • In a proxying set-up, for example the proxy chaining method in order to filter spam for an entire network.

Generating Hosts File

On Unix systems, you can download domains.txt to a folder and then run, as root:

sed 's/^/127.0.0.1 /' domains.txt >> /etc/hosts

which will append all the spam domains to the /etc/hosts text file.
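
Because the command above appends, running it a second time duplicates every entry. A slightly more defensive variant is sketched below; it only prints the entries, so they can be reviewed before being appended. The function name make_hosts_entries is made up for this example, and it assumes that the list holds one domain per line and that lines starting with '#' are comments.

```shell
#!/bin/bash
# Print one "127.0.0.1 <domain>" line for every usable line of the file in $1.
# Assumptions: one domain per line; blank lines and lines whose first
# non-blank character is '#' are skipped.
make_hosts_entries() {
  awk 'NF && $1 !~ /^#/ { print "127.0.0.1 " $1 }' "$1"
}
```

It could then be used along the lines of make_hosts_entries domains.txt > hosts.spam, followed by appending hosts.spam to /etc/hosts as root once the output looks right.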

Generating Zone File

On Unix systems, you can download domains.txt to a folder and then run:

sed 's/^\(.*\)$/zone "\1" { type master; file "db.spam."; };/' domains.txt >> spamzones.txt

to generate zone lines to be used with a DNS service.
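
Every generated zone line points at the same zone file, which has to exist for the DNS server to load the zones. This page does not show its contents, so the following is only a sketch of what a catch-all zone file for BIND might look like: it assumes that every blocked name should resolve to 127.0.0.1, and the SOA values are placeholders. The function name write_dummy_zone is made up for this example.

```shell
#!/bin/bash
# Write a minimal catch-all zone file to the path given in $1.
# Assumptions: blocked names should resolve to 127.0.0.1; the SOA
# timers below are placeholder values, not taken from the original setup.
write_dummy_zone() {
  cat > "$1" <<'EOF'
$TTL 86400
@   IN  SOA localhost. root.localhost. (
        1       ; serial
        3600    ; refresh
        900     ; retry
        604800  ; expire
        86400 ) ; negative-cache TTL
@   IN  NS  localhost.
@   IN  A   127.0.0.1
*   IN  A   127.0.0.1
EOF
}
```

The function would be called with the path named in the zone lines, and spamzones.txt could then be pulled into the BIND configuration with an include statement. Checking the result with named-checkconf before reloading the server is probably a good idea.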

Domain Maintenance

This script was used on Debian Squeeze 6.0.6!

The following bash script will query each domain and then clean out the domains that are dead. It is perhaps a good idea to run it every week via cron to remove dead domains.

clean_domains.sh
#!/bin/bash
###########################################################################
##  Copyright (C) Wizardry and Steamworks 2014 - License: GNU GPLv3      ##
##  Please see: http://www.gnu.org/licenses/gpl.html for legal details,  ##
##  rights of fair usage, the disclaimer and warranty conditions.        ##
###########################################################################
 
if [[ -z "$1" ]]; then
  echo "Syntax: $0 spam_domains.txt"
  exit 1
fi
 
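DOMAIN_FILE="$1"
# Collect live domains in a fresh temporary file so that leftovers
# from an earlier run cannot leak into the result.
LIVE_FILE=$(mktemp) || exit 1

# Cull dead domains.
while IFS= read -r DOMAIN; do
  if [[ -z "$DOMAIN" ]]; then
    continue
  fi
  # A domain counts as alive if nslookup prints a "Name" line for it.
  DNS_LOOKUP=$(nslookup -timeout=1 -retry=0 -fail "$DOMAIN" | grep Name)
  if [[ -n "$DNS_LOOKUP" ]]; then
    echo "$DOMAIN" >> "$LIVE_FILE"
  fi
done < "$DOMAIN_FILE"

# Sort and clean
sort -u "$LIVE_FILE" > "$DOMAIN_FILE"
rm "$LIVE_FILE"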

The code above can be saved to a file called clean_domains.sh and then executed with:

./clean_domains.sh domains.txt

This assumes that the script is executable and that domains.txt is in the directory from which you run it.
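
The weekly cron run suggested earlier could be set up roughly as follows. This is only a sketch: the function name weekly_cron_entry and the schedule (Sundays at 03:00) are choices made for this example, and the two paths passed to it are placeholders for wherever the script and the list actually live.

```shell
#!/bin/bash
# Print a crontab entry that runs the cleaning script every Sunday at 03:00.
# $1: absolute path to clean_domains.sh (placeholder)
# $2: absolute path to domains.txt (placeholder)
weekly_cron_entry() {
  printf '0 3 * * 0 %s %s\n' "$1" "$2"
}
```

The printed line can be added by hand with crontab -e, or appended to the current crontab with something like ( crontab -l 2>/dev/null; weekly_cron_entry /path/to/clean_domains.sh /path/to/domains.txt ) | crontab -.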

Files

| Filename | Filesize | Last modified |
|----------|----------|---------------|
| domains.txt | 1.3 MiB | 2015/10/06 07:42 |

assets/databases/spam.txt · Last modified: 2017/02/22 18:30 (external edit)


