The file below is a comprehensive and regularly updated list of domains and IP addresses that are used for delivering advertisements. Please check the file description on this page for the most recent update and feel free to sync. This list is frequently updated in the following ways:
- The domains.txt file is regularly updated to include new domains and cleaned so that expired domains are removed.
The usual method of operation of these spam organisations is to register or lease domains for a short while and then drop them frequently, so that spam filters such as this one do not pick them up. If you look at other spam lists and check the domains and addresses, you will notice that they may contain dead domains.
Since this list is comprehensive, additions will appear in the changelog. It is important to note that this domain list does not censor "self-spam" (banners or advertisements that are displayed using the same domain name or address path of a legitimate website). For example, a website that chooses to spam itself by adding banners and other annoyances using its own URL (either as a subdomain or a website path from the root) is not censored by this list.
This database tries very hard not to block service providers whose banners may be part of the hosting policy for hosted websites. The database does block "related spam", for example, on sourceforge.net or slashdot.org.
Type | Description | Example | Blocked |
---|---|---|---|
out of context spam | any banner, image, or content that is completely out of the context of a website | numerous: pills, porn, forex, etc. | Yes |
related spam | advertisements that appear on a website and are related to its content but advertise for other domains | technical websites that advertise for large companies such as Oracle, Google, IBM, etc. | Yes |
hosting spam | spam enforced by hosting providers | forums hosted on a free service that are required by the TOS (either manually or automatically) to display a banner linking to the hosting company | No |
self spam | any banner, image or content that is served by a website from its own address space in order to promote itself or a different domain | several | No |
This list can be used in the following manner:
- with the /etc/hosts file (or, using tcp-wrappers, /etc/hosts.deny) in order to redirect requests to the localhost or some invalid address. This method is slow even on powerful machines and, if the situation permits, it should be avoided.
- with the DNS system, by creating zones. This method may hog or stall the DNS server because it requires a large number of dummy zones to be created. On the other hand, this method is the most efficient because these domains will be blocked at the DNS level and the browser (or proxy) will never even contact the domain.
On Unix systems, you can download domains.txt to a folder and then run, as root:
sed 's/^/127.0.0.1 /' domains.txt >> /etc/hosts
which will append all the spam domains to the /etc/hosts text file.
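Because the command above appends on every run, repeating it after each list update would duplicate entries. A minimal sketch of one way to avoid that is to generate a clean, de-duplicated block of entries first and review it before it goes anywhere near /etc/hosts. The function name make_hosts_entries and the sample domains are illustrative assumptions, not part of the original page.

```shell
#!/bin/bash
# Sketch: print hosts entries for a domain list, one per line.
# Blank lines are skipped and duplicate domains are removed.
make_hosts_entries() {
    # $1: path to the domain list (one domain per line)
    grep -v '^[[:space:]]*$' "$1" | sort -u | sed 's/^/127.0.0.1 /'
}

# Demonstration on a throwaway list; the domains are placeholders.
SAMPLE=$(mktemp)
printf 'ads.example.com\n\nads.example.com\nbanners.example.net\n' > "$SAMPLE"
ENTRIES=$(make_hosts_entries "$SAMPLE")
echo "$ENTRIES"
rm -f "$SAMPLE"
```

With the real list, the same function could be called as make_hosts_entries domains.txt and its output redirected to a separate file (for instance a hypothetical spam_hosts.txt) that is inspected before being merged into /etc/hosts.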
On Unix systems, you can download domains.txt to a folder and then run:
sed 's/^\(.*\)$/zone "\1" { type master; file "db.spam."; };/' domains.txt >> spamzones.txt
to generate zone lines to be used with a DNS server that accepts this zone syntax, such as BIND.
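The generated zone lines all point at a single shared zone file, which this page does not show. As a rough sketch only, assuming a BIND-style server and that every blocked name should resolve to 127.0.0.1, such a file could be written as follows. The ZONE_FILE path, the SOA timer values and the localhost. name server are illustrative assumptions; the file name must match whatever the zone lines actually reference.

```shell
#!/bin/bash
# Sketch: write a minimal shared zone file that answers every query
# with 127.0.0.1. ZONE_FILE is a hypothetical path chosen for this example.
ZONE_FILE="${ZONE_FILE:-./db.spam.example}"

# The heredoc delimiter is quoted so that $TTL is written literally.
cat > "$ZONE_FILE" <<'EOF'
$TTL 86400
@   IN  SOA localhost. root.localhost. (
        1       ; serial
        3600    ; refresh
        900     ; retry
        604800  ; expire
        86400 ) ; negative-caching TTL
@   IN  NS  localhost.
@   IN  A   127.0.0.1
*   IN  A   127.0.0.1
EOF
```

Sharing one small file across all dummy zones keeps the on-disk footprint low, although the server still has to load one zone per blocked domain, which is the cost described earlier.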
The following bash script will query each domain and then clean out the domains that are dead. It is perhaps a good idea to run it every week via cron to remove dead domains.
#!/bin/bash
###########################################################################
##  Copyright (C) Wizardry and Steamworks 2014 - License: GNU GPLv3      ##
##  Please see: http://www.gnu.org/licenses/gpl.html for legal details,  ##
##  rights of fair usage, the disclaimer and warranty conditions.        ##
###########################################################################

if [[ -z "$1" ]]; then
    echo "Syntax: $0 spam_domains.txt"
    exit 1
fi

DOMAIN_FILE="$1"
# Use a fresh temporary file so leftovers from an interrupted run are not reused.
TMP_FILE=$(mktemp /tmp/spam_domains.XXXXXX)

# Cull dead domains.
while read -r DOMAIN; do
    # Skip empty lines.
    if [[ -z "$DOMAIN" ]]; then
        continue
    fi
    # A live domain produces at least one "Name" line in the nslookup output.
    DNS_LOOKUP=$(nslookup -timeout=1 -retry=0 -fail "$DOMAIN" | grep Name)
    if [[ -n "$DNS_LOOKUP" ]]; then
        echo "$DOMAIN" >> "$TMP_FILE"
    fi
done < "$DOMAIN_FILE"

# Sort and clean
sort -u "$TMP_FILE" > "$DOMAIN_FILE"
rm "$TMP_FILE"
The code above can be saved to a file called clean_domains.sh, made executable, and then executed with:
./clean_domains.sh domains.txt
provided that domains.txt is in the directory from which you are running the script.
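The weekly cron run suggested above can be sketched as a single crontab entry. The snippet below only builds and prints that entry; both paths are hypothetical and the schedule (03:00 every Sunday) is an arbitrary choice for illustration.

```shell
#!/bin/bash
# Hypothetical locations; adjust to where the script and list actually live.
SCRIPT="/usr/local/bin/clean_domains.sh"
LIST="/var/lib/spam/domains.txt"

# Crontab fields: minute hour day-of-month month day-of-week command
# "0 3 * * 0" means 03:00 every Sunday.
CRON_LINE="0 3 * * 0 $SCRIPT $LIST"

# Print the entry; to install it, paste it into the editor opened by "crontab -e".
echo "$CRON_LINE"
```

Absolute paths are used because cron jobs do not start in the directory where the script was saved.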
Filename | Filesize | Last modified |
---|---|---|
domains.txt | 1.3 MiB | 2015/10/06 06:42 |