Description

Geolocation charts of website accesses can be generated with a shell script that processes the web server's access_log. This document shows how to generate a data file containing the number of hits per ccTLD, covering all the ccTLDs that existed at the time of writing.

Generating Data File

counter.sh
#!/bin/bash
# Copyright (C) Wizardry and Steamworks.
#
#  Licensed to Wizardry and Steamworks under
# the GPLv3 GNU License which can be found at:
#    http://www.gnu.org/licenses/gpl.html
#
 
TLDS=(ac ad ae af ag ai al am an ao aq ar as at au aw ax az ba bb bd be bf bg bh bi bj bm bn bo br bs bt bv bw by bz ca cc cd cf cg ch ci ck cl cm cn co cr cs cu cv cx cy cz dd de dj dk dm do dz ec ee eg eh er es et eu fi fj fk fm fo fr ga gb gd ge gf gg gh gi gl gm gn gp gq gr gs gt gu gw gy hk hm hn hr ht hu id ie il im in io iq ir is it je jm jo jp ke kg kh ki km kn kp kr kw ky kz la lb lc li lk lr ls lt lu lv ly ma mc md me mg mh mk ml mm mn mo mp mq mr ms mt mu mv mw mx my mz na nc ne nf ng ni nl no np nr nu nz om pa pe pf pg ph pk pl pm pn pr ps pt pw py qa re ro rs ru rw sa sb sc sd se sg sh si sj sk sl sm sn so sr ss st su sv sy sz tc td tf tg th tj tk tl tm tn to tp tr tt tv tw tz ua ug uk us uy uz va vc ve vg vi vn vu wf ws ye yt yu za zm zw)
SCORE=(0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0)
 
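# The client address is the first field of every access_log entry.
for ip in $(awk '{ print $1 }' access_log); do
  # Reverse lookup. The returned name ends in a dot, so the TLD is the
  # second-to-last field; only the first answer line is used.
  XX=$(dig -x "$ip" +noall +answer +short | awk 'BEGIN { FS="." } NF > 1 { print tolower($(NF-1)); exit }')
  for TLD in "${!TLDS[@]}"; do
    CC=${TLDS[$TLD]}
    if [ "$XX" == "$CC" ]; then
      (( SCORE[$TLD]++ )) # bash increment
    fi
  done
done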
 
for TLD in ${!TLDS[*]}; do
  printf "%s = %d\n" ${TLDS[$TLD]} ${SCORE[$TLD]} >> stats.dat
done

The script goes through the access_log of the web server and extracts the IP address of every page hit. It then performs a reverse DNS lookup on each address and takes the TLD of the resulting hostname as the country of origin. While doing so it keeps a score for every TLD, incrementing the value each time a lookup matches that TLD. Addresses without a reverse DNS record, or whose hostname falls under a non-country TLD such as com or net, are not counted. After the whole access_log has been examined, the script writes all the data to a file called stats.dat, which can then be used to generate charts. Since the output is appended to stats.dat, delete the file before running the script again.
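The counting step described above can also be sketched with an associative array, which avoids scanning the whole TLD list for every hit. This is only a minimal sketch, assuming bash 4 or later; the helper names tld_of and tally_tlds are hypothetical and not part of counter.sh, and the hostnames are read from standard input rather than looked up with dig.

```shell
#!/bin/bash
# Hypothetical helper: print the TLD of a reverse DNS name such as "host.example.de."
tld_of() {
  local name=${1%.}            # drop the trailing dot that dig appends
  name=${name##*.}             # keep only the last label
  printf '%s\n' "${name,,}"    # lowercase (needs bash 4+)
}

# Hypothetical helper: read one hostname per line on stdin and
# print "tld = count" lines, sorted by TLD.
tally_tlds() {
  local -A score=()
  local host tld
  while read -r host; do
    [ -n "$host" ] || continue
    tld=$(tld_of "$host")
    [ -n "$tld" ] || continue
    score[$tld]=$(( ${score[$tld]:-0} + 1 ))
  done
  for tld in "${!score[@]}"; do
    printf '%s = %d\n' "$tld" "${score[$tld]}"
  done | sort
}
```

Unlike counter.sh, this sketch only lists the TLDs that were actually seen, and it does not check them against the ccTLD list, so generic TLDs such as com would be counted as well.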

Output Format

The output format is given by the following line:

  printf "%s = %d\n" ${TLDS[$TLD]} ${SCORE[$TLD]} >> stats.dat

which can be changed, for example to obtain a CSV datafile.
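As an illustration of such a change, the following sketch prints the scores as CSV. It assumes a header row named tld,hits and uses a hypothetical helper called print_csv; the shortened TLDS and SCORE arrays are sample data standing in for the full lists from counter.sh.

```shell
#!/bin/bash
# Shortened sample arrays; counter.sh uses the full TLDS and SCORE lists.
TLDS=(de fr uk)
SCORE=(12 0 3)

# Hypothetical helper: emit the scores as CSV, preceded by a header row.
print_csv() {
  local i
  printf 'tld,hits\n'
  for i in "${!TLDS[@]}"; do
    printf '%s,%d\n' "${TLDS[$i]}" "${SCORE[$i]}"
  done
}

# Usage (assumed file name): print_csv > stats.csv
```

Redirecting with > instead of >> overwrites the file on every run, so repeated runs do not accumulate duplicate rows.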

With the formatting above, the output looks like this:

...
wf = 0
ws = 0
ye = 0
yt = 0
...
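Most ccTLDs will usually have a score of zero, so it can be convenient to drop them before charting. The following is a minimal sketch, assuming the "tld = count" format shown above; the helper name top_tlds is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: read "tld = count" lines on stdin, drop zero scores
# and print the rest ordered by descending hit count.
top_tlds() {
  awk '$3 > 0 { print $3, $1 }' |      # keep non-zero rows as "count tld"
    sort -k1,1nr -k2,2 |               # highest count first, ties by TLD
    awk '{ print $2 " = " $1 }'        # restore the "tld = count" layout
}

# Usage: top_tlds < stats.dat
```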