Parsing Apache Server Log for Unique IPs

I created this short Perl script to tally the number of unique IPs in the Apache server logs. Because Ubuntu's logrotate configuration for Apache compresses rotated logs by default, and the script reads the log files as plain text, compression has to be disabled for the Apache logs first. On Ubuntu, this setting lives in /etc/logrotate.d/apache2. To disable compression, just comment out the following lines:

        #compress
        #delaycompress
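
As an aside, if you would rather leave compression enabled, one possible alternative is to let Perl decompress the rotated files while reading them. The sketch below is not what the script in this post does. It assumes the core module IO::Uncompress::Gunzip is available, and the helper names open_log and count_ips are made up for illustration:

```perl
use strict;
use warnings;
use IO::Uncompress::Gunzip qw($GunzipError);

# Hypothetical helper: return a readable handle for a log file,
# decompressing on the fly when the name ends in .gz.
sub open_log {
        my ($path) = @_;
        if ($path =~ /\.gz$/) {
                my $z = IO::Uncompress::Gunzip->new($path)
                        or die "Cannot gunzip $path: $GunzipError";
                return $z;
        }
        open(my $fh, '<', $path) or die "Cannot open $path: $!";
        return $fh;
}

# Hypothetical helper: add the per-IP request counts of one log file
# to the hash referenced by $counts and return that reference.
sub count_ips {
        my ($path, $counts) = @_;
        my $fh = open_log($path);
        while (defined(my $line = <$fh>)) {
                # The client IP is assumed to be the first field on the line.
                my ($ip) = split(/\s+/, $line);
                $counts->{$ip}++ if defined $ip && length $ip;
        }
        close($fh);
        return $counts;
}
```

Calling count_ips once per file with the same hash reference accumulates the counts across plain and gzipped logs alike.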

The Perl script is quite straightforward: it builds a hash table that uses the IP address as the key and counts the requests seen from each address:

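#!/usr/bin/perl
use strict;
use warnings;

my $apacheLogDir = "/var/log/apache2";

# Collect the access logs, skipping any files that are still gzipped.
opendir(my $dir, $apacheLogDir) or die "Cannot open $apacheLogDir: $!";
my @files = grep { /^access/ && !/\.gz$/ } readdir($dir);
closedir($dir);

my %hashTable = ();     # IP address => number of requests
my $uniqueIP = 0;
my $resolved;           # only used if the lookup lines below are enabled

foreach my $file (@files) {
        open(my $fh, '<', "$apacheLogDir/$file") or die "Cannot open $apacheLogDir/$file: $!";
        # Read line by line instead of loading the whole file into memory.
        while (my $line = <$fh>) {
                # The client IP is the first whitespace-separated field.
                my @IP = split(/\s+/, $line);
                next unless defined $IP[0] && length $IP[0];
                if (exists($hashTable{$IP[0]})) {
                        $hashTable{$IP[0]}++;
                } else {
                        $hashTable{$IP[0]} = 1;
                        $uniqueIP++;
                        #$resolved = qx{resolveip $IP[0]};
                        #print "$resolved";
                        print "$IP[0]\n";
                }
        }
        close($fh);
}
print "Total unique IPs: $uniqueIP.\n";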
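
The hash built by the script also holds the number of requests per IP, although the script only prints the total. As a rough sketch of how those counts could be put to use, the snippet below lists the busiest addresses. The function name top_ips and the sample counts are invented for illustration and do not come from a real log:

```perl
use strict;
use warnings;

# Hypothetical helper: return up to $n [ip, count] pairs, busiest first.
# Ties are broken by comparing the IP strings so the order is stable.
sub top_ips {
        my ($counts, $n) = @_;
        my @sorted = sort { $counts->{$b} <=> $counts->{$a} || $a cmp $b }
                     keys %$counts;
        $n = @sorted if $n > @sorted;
        return map { [ $_, $counts->{$_} ] } @sorted[0 .. $n - 1];
}

# Made-up counts using documentation-range addresses.
my %hashTable = ('192.0.2.1' => 5, '192.0.2.7' => 12, '198.51.100.3' => 1);
foreach my $pair (top_ips(\%hashTable, 2)) {
        printf "%-15s %d\n", @$pair;
}
```

In the real script, the same call could be made on %hashTable after the main loop has finished.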

 

The two commented-out lines:

                        #$resolved = qx{resolveip $IP[0]};
                        #print "$resolved";

can be enabled to look up the host name associated with each IP. Because a reverse DNS lookup is rather slow, resolving every IP this way takes a long time. I may change the script to use multiple threads to speed up the DNS lookups in the future.
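
One way the lookups could be run concurrently is sketched below. It is only an illustration under a few assumptions: it uses forked worker processes rather than threads (thread support depends on how perl was built), it calls Perl's built-in gethostbyaddr instead of the external resolveip command, and it handles IPv4 addresses only. The names resolve_ip and resolve_parallel are hypothetical:

```perl
use strict;
use warnings;
use POSIX ();
use Socket qw(inet_aton AF_INET);

# Reverse-resolve one IPv4 address; returns the host name or undef.
sub resolve_ip {
        my ($ip) = @_;
        return undef unless $ip =~ /^\d{1,3}(?:\.\d{1,3}){3}$/;
        my $packed = inet_aton($ip) or return undef;
        return scalar gethostbyaddr($packed, AF_INET);
}

# Resolve the IPs in @$ips using $workers child processes.
# Returns a hash reference mapping IP => host name ('' if unresolved).
# $resolver is optional and defaults to resolve_ip.
sub resolve_parallel {
        my ($ips, $workers, $resolver) = @_;
        $resolver ||= \&resolve_ip;
        $workers = 1 if $workers < 1;
        my @children;
        for my $w (0 .. $workers - 1) {
                pipe(my $reader, my $writer) or die "pipe failed: $!";
                my $pid = fork();
                die "fork failed: $!" unless defined $pid;
                if ($pid == 0) {
                        # Child: handle every $workers-th IP, starting at index $w.
                        close($reader);
                        for (my $i = $w; $i < @$ips; $i += $workers) {
                                my $name = $resolver->($ips->[$i]);
                                $name = '' unless defined $name;
                                print {$writer} "$ips->[$i]\t$name\n";
                        }
                        close($writer);
                        POSIX::_exit(0);
                }
                close($writer);
                push @children, [ $pid, $reader ];
        }
        # Parent: collect the "ip<TAB>name" lines from each child in turn.
        my %names;
        for my $child (@children) {
                my ($pid, $reader) = @$child;
                while (my $line = <$reader>) {
                        chomp $line;
                        my ($ip, $name) = split(/\t/, $line, 2);
                        $names{$ip} = $name;
                }
                close($reader);
                waitpid($pid, 0);
        }
        return \%names;
}
```

With the script above, something like resolve_parallel([keys %hashTable], 8) after the main loop would resolve all unique IPs at once; the worker count of 8 is an arbitrary choice and the actual speedup depends on the DNS server.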
