Real-time top-like requests-per-second average for httpd request logs - similar to apachetop
DESCRIPTION
logstat.pl and logtop.pl
These scripts show current and average requests-per-second counts, based on
Apache access logs, in real time.
You can see the total requests-per-second as well as a breakdown by:
* vhost
* URL
* user agent
* or referrer.
You can sort by:
* number of requests in the last second
* time of last stats update
* average requests-per-second over the last one, five or fifteen minutes
The output shows:
* the date of the last stats update
* the last second's hits per second (hps)
* the one, five and fifteen minute hps average
* the seconds since last request
* an asterisk to mark requests hit in the last 5 seconds
* and the request's vhost, URL, user agent or referrer.
See also the apachetop tool - http://freshmeat.net/projects/apachetop/ and
http://packages.debian.org/search?keywords=apachetop . These scripts are not
based on or intended to mimic apachetop, but the functionality is broadly
similar.
SYNOPSIS
To use:
In one screen, run the log parser. It writes out stats files which the
viewer script, logtop.pl, reads.
$ cd /dev/shm # use shared memory
$ tail -F /var/log/httpd/access.log | ~/logstat.pl # centralized logging helps here
In another screen, view your stats:
$ cd /dev/shm # use shared memory
# defaults to vhosts sorted by five minute average
$ watch -n 5 -- ~/logtop.pl
or
# URLs sorted by five minute average, ignoring images, JS and CSS
$ watch -n 5 -- '~/logtop.pl urls \
--skip '\''\.(jpg|png|gif|js|css)$|^\S+\s(/robots\.txt$|/favicon\.ico$)'\'''
or
# user agents sorted by requests in the last second
$ watch -n 5 -- ~/logtop.pl ua --sort hps
or
# referrers sorted by one minute average
$ watch -n 5 -- ~/logtop.pl referrer --sort one
or
# URLs sorted by last stats update
$ watch -n 5 -- ~/logtop.pl urls --sort time
etc.
EXAMPLE
Every 5.0s: ~/logtop.pl Fri Jun 24 03:31:59 2011
Fri Jun 24 03:31:58 2011 hps: 2, average: 5.35, 4.45, 2.37, 0 seconds ago, total
vhosts sorted by five
Fri Jun 24 03:31:52 2011 hps: 0, average: 0.41, 0.59, 0.35, 6 seconds ago, example.com
Fri Jun 24 03:31:54 2011 hps: 1, average: 1.17, 0.58, 0.23, 4 seconds ago, * example.net
Fri Jun 24 03:31:34 2011 hps: 2, average: 0.99, 0.33, 0.12, 24 seconds ago, example.org
...
NOTES
* The script's tally sub is properly abstracted, so it could quite easily be
modified to tally and report averages on anything you can count (not just
httpd logs); see the sketch after this list
* The log parsing regex matches the "v-combined" log format (combined with
the virtual host at the front)
* The logtop script accepts "match" and/or "skip" arguments to only show
events that match (or do not match) a given regex.
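As a minimal sketch of that reuse (this error-counting loop is illustrative
and not part of these scripts), tally() works on any pair of data/shared
hashes:
my (%err_data, %err_shared);
while (<>) {
next unless m/error/i;
# with no category/key arguments, tally() only maintains the counts and
# the hps/one/five/fifteen averages in %err_shared; it never calls save()
show(\%err_shared) if tally(\%err_data, \%err_shared, time());
}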
TODO
* Persist historical data and reload on restart (this can be done
asynchronously with another client script)
* Persist position in log file (using ftell) to recover from where we left
off on restart (this should account for log file rotation)
logstat.pl
#!/usr/bin/perl -w
use strict;
use warnings;
use Time::HiRes 'time';
use List::Util qw/sum/;
use Storable;
use Digest::MD5 'md5_hex';
#use Data::Dumper;
my %data; # internal (server-only) data
my %shared; # data shared with clients
my %keys; # keys to data shared with clients
sub init
{
my $data = $_[0];
my $shared = $_[1];
$data->{count} = 0;
$data->{counts} = [0];
$shared->{hps} = 0;
$shared->{one} = 0;
$shared->{five} = 0;
$shared->{fifteen} = 0;
init_time($data, $shared, $_[2]);
return;
}
sub init_time
{
# mode 0 == init all
# mode 1 == init time and second only
# mode 2 == init minute only
my $data = $_[0];
my $shared = $_[1];
my $mode = $_[3] || 0;
unless ($mode == 2) {
$shared->{time} = $_[2] || time();
}
return if $mode == 1;
$data->{minute} = $shared->{time} + 60;
return;
}
my $prune_every = 60 * 20; # prune every 20 minutes
my $prune_time = time() + $prune_every;
my $last_prune_took = "prune hasn't happened yet";
# TODO: read previous data from persistent store on restart
# (have a separate client that persists data)
my $stotal = 0; # FIXME debug
# update this when adding new categories
my @data = qw/vhosts urls ua referrer/;
foreach my $key (@data) {
mkdir $key;
}
while (<>) {
my $nt = time();
my ($vhost, $date, $method, $url, $protocol, $alt_url, $code, $bytes,
$referrer, $ua) = (m/
^(\S+)\s # vhost
\S+\s # IP
\S+\s+ # remote logname
(?:\S+\s+)+ # remote user
\[([^]]+)\]\s # date
"(\S*)\s? # method
(?:((?:[^"]*(?:\\")?)*)\s # URL
([^"]*)"\s| # protocol
((?:[^"]*(?:\\")?)*)"\s) # or, possibly URL with no protocol
(\S+)\s # status code
(\S+)\s # bytes
"((?:[^"]*(?:\\")?)*)"\s # referrer
"(.*)"$ # user agent
/x);
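# NOTE: the regex above expects the "combined" log format with the virtual
# host prepended, i.e. an Apache LogFormat roughly like the following
# (this directive is an illustration, not taken from any particular config):
# LogFormat "%v %h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" v-combined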
die "Couldn't match $_" unless $vhost;
$alt_url ||= '';
$url ||= $alt_url;
# vhost counts
$data{vhosts}{$vhost} ||= $shared{vhosts}{$vhost} ||= {};
tally($data{vhosts}{$vhost}, $shared{vhosts}{$vhost}, $nt,
'vhosts', $vhost);
# url counts
my $vhost_url = "$vhost $url";
$data{urls}{$vhost_url} ||= $shared{urls}{$vhost_url} ||= {};
tally($data{urls}{$vhost_url}, $shared{urls}{$vhost_url}, $nt,
'urls', $vhost_url);
# user agent counts
$data{ua}{$ua} ||= $shared{ua}{$ua} ||= {};
tally($data{ua}{$ua}, $shared{ua}{$ua}, $nt, 'ua', $ua);
# referrer counts
$data{referrer}{$referrer} ||= $shared{referrer}{$referrer} ||= {};
tally($data{referrer}{$referrer}, $shared{referrer}{$referrer}, $nt,
'referrer', $referrer);
# TODO: status codes? bytes? (bytes would require some additional work to
# be useful)
# total hit count
print '.'; # FIXME debug
next unless tally(\%data, \%shared, $nt);
# show the last line we parsed; this can be used to confirm we're parsing
# data in real-time (and that we're parsing the log correctly)
# NOTE: the date from the apache log is the date the HTTP request
# started, but apache writes the request to the log when the request
# finishes. This means if the request took a long time to serve (e.g.
# due to a slow network) the date may be seconds or even minutes in the
# past, even though the line was actually written to the log much more
# recently than that (i.e. just now)
print "\n\nlast line parsed: [$vhost] [$date] [$method] [$url] " .
"[$protocol] [$alt_url] [$code] [$bytes] [$referrer] [$ua]\n\n";
# show the last second's total hits-per-second and the historical average
# and the result of the last prune
my $pt = $prune_time - $nt;
show(\%shared, sprintf(
" total\nnext prune in %.0f seconds; " .
"last $last_prune_took\n", $pt));
# store the total and keys to the categorized data for clients to read.
# NOTE: change the working directory to /dev/shm before starting these
# scripts to use shared memory; this can be considerably faster than
# writing the shared data to local disk
# NOTE: rename is used to atomically update the data file so clients do
# not try to read the data while we're writing it
my $nst = time(); # FIXME debug
$keys{total} = {
hps => $shared{hps},
one => $shared{one},
five => $shared{five},
fifteen => $shared{fifteen},
time => $shared{time},
};
my $entry_count = 0;
foreach my $key (@data) {
my $count = keys %{ $keys{$key} };
$entry_count += $count;
print "$count $key; ";
}
store(\%keys, 'logstats.keys.tmp');
rename('logstats.keys.tmp', 'logstats.keys');
$stotal += time() - $nst; # FIXME debug
printf "$entry_count total entries; store took %.3f seconds\n\n", $stotal;
$stotal = 0;
# prune old data to prevent using too much memory
# NOTE: this does block (for a hopefully short amount of time); we
# should catch back up quickly, though
next unless $pt <= 0;
$pt = time();
print "pruning data...\n";
$entry_count = 0;
my $prune_count = 0;
$prune_time = $nt + $prune_every;
foreach my $k (keys %data) {
my $v = $data{$k};
next unless (ref($v) || '') eq 'HASH';
foreach my $key (keys %$v) {
my $val = $shared{$k}{$key};
$entry_count++;
next unless (ref($val) || '') eq 'HASH'
and $nt - $val->{time} > $prune_every;
unlink("$k/" . $keys{$k}{$key});
delete $keys{$k}{$key};
delete $v->{$key};
delete $shared{$k}{$key};
$prune_count++;
}
}
my $elapsed = time() - $pt;
$last_prune_took =
sprintf "pruned $prune_count stale out of $entry_count total entries in %.3f seconds.", $elapsed;
print "$last_prune_took\n\n";
}
sub tally
{
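# $data->{counts}[0] accumulates the current (partial) minute's per-second
# rates; elements [1..15] hold completed per-minute totals, which the
# one/five/fifteen minute averages below are computed from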
my $nt = $_[2] || time();
my $init = defined $_[0]->{count};
init($_[0], $_[1], $nt) unless $init;
# count the event
$_[0]->{count}++;
# save data if this is the first time we've seen this event
save($_[0], $_[1], $_[3], $_[4]) if $_[3] && !$init;
# add $count to per-minute count and reset $count every second
my $diff = $nt - $_[1]->{time};
return 0 unless $diff >= 1;
my $data = $_[0];
my $shared = $_[1];
init_time($data, $shared, $nt, 1);
$shared->{hps} = $data->{count} / $diff;
$data->{count} = 0;
$data->{counts}[0] += $shared->{hps};
# add per-minute count to 15-minute historical data set and reset
# per-minute count every minute
$diff = $nt - $data->{minute};
if ($diff >= 0) {
init_time($data, $shared, $nt, 2);
# log "0" counts to historical data set if this is an infrequent stat
splice(@{$data->{counts}}, 1, 0, (0) x int($diff / 60));
# FIXME: reduce value of per-minute count by remainder if it's been
# over a minute (that is, add a fractional "0" count)
# compute historical average
unshift @{$data->{counts}}, 0;
no warnings qw/uninitialized misc/;
splice @{$data->{counts}}, 16;
$shared->{one} = $data->{counts}[1] / 60;
$shared->{five} = sum(@{$data->{counts}}[1..5]) / 5 / 60;
$shared->{fifteen} = sum(@{$data->{counts}}[1..15]) / 15 / 60;
save($data, $shared, $_[3], $_[4]) if $_[3];
return 1;
}
# if it hasn't been a minute yet, extrapolate a running historical average
my $count = $data->{counts}[0];
# this gets more accurate as the minute progresses,
# but it does tend to over-estimate infrequent events
$count *= 60 / ($diff + 60);
# egregious HACK - but this only matters for the first minute(s)
# after we start up
defined($data->{counts}[1]) or $data->{counts}[1] = $count;
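# average the extrapolated current-minute count with the last full minute,
# then divide by 60 to convert per-minute totals back to per-second rates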
$shared->{one} = sum($count, $data->{counts}[1]) / 2 / 60;
no warnings 'uninitialized';
$shared->{five} = sum($count, @{$data->{counts}}[1..5]) / 6 / 60;
$shared->{fifteen} = sum($count, @{$data->{counts}}[1..15]) / 16 / 60;
save($data, $shared, $_[3], $_[4]) if $_[3];
return 1;
}
sub show
{
my $shared = $_[0];
print scalar localtime($shared->{time});
printf " hps: %6.0f, average: %.2f, %.2f, %.2f, ",
$shared->{hps}, $shared->{one}, $shared->{five}, $shared->{fifteen};
print $_[1] || "\n";
return;
}
sub save
{
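# each event's shared stats go in a file under its category directory,
# named by the md5 of the event key; write to a temp file and rename so
# readers never see a partially-written file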
my $nst = time(); # FIXME debug
my $file = $_[2] . '/' . ($keys{$_[2]}{$_[3]} ||= md5_hex($_[3]));
store($_[1], "$file.tmp");
rename("$file.tmp", $file);
$stotal += time() - $nst; # FIXME debug
return;
}
logtop.pl
#!/usr/bin/perl -w
use strict;
use warnings;
use Time::HiRes 'time';
use Storable;
use Getopt::Long;
my ($match, $skip, $match_re, $skip_re, $sort);
GetOptions(
'match=s' => \$match,
'skip=s' => \$skip,
'sort=s' => \$sort,
);
# time, hps, one, five or fifteen
$sort ||= 'five';
# vhosts, urls, ua or referrer
my $type = shift() || 'vhosts';
$match_re = qr/$match/i if $match;
$skip_re = qr/$skip/i if $skip;
# load data
my $rt = time();
my $datakeys = retrieve('logstats.keys');
my $data = $datakeys->{total};
foreach my $key (keys %{ $datakeys->{$type} }) {
my $md5 = $datakeys->{$type}{$key};
my $d;
# file may have been pruned; skip if we can't read it
eval { $d = retrieve("$type/$md5") };
next if $@;
$data->{$type}{$key} = $d;
}
my $now = time();
show($data, " total\n", $now - $data->{time});
printf "loaded data in %.3f seconds - $type sorted by $sort\n", $now - $rt;
my @stale;
foreach my $event (sort
{ $data->{$type}{$b}{$sort} <=> $data->{$type}{$a}{$sort} }
keys %{ $data->{$type} }) {
next if $skip and $event =~ m/$skip_re/;
next if $match and $event !~ m/$match_re/;
my $d = $data->{$type}{$event};
my $t = $now - $d->{time};
if ($t > 60) {
push @stale, $event;
next;
}
my $text = ' ';
if ($now - $d->{time} < 6) {
$text = " * ";
}
show($d, "$text $event\n", $t);
}
print "\nstale:\n" if @stale;
my $i = 0;
foreach my $event (@stale) {
my $d = $data->{$type}{$event};
my $t = $now - $d->{time};
show($d, " $event\n", $t);
last if ++$i > 100;
}
sub show
{
my $data = $_[0];
print scalar localtime($data->{time});
my $s = "s";
$s = " " if sprintf("%.0f", $_[2]) eq '1';
printf " hps: %6.0f, average: %.2f, %.2f, %.2f, %2.0f second$s ago, ", $data->{hps}, $data->{one}, $data->{five}, $data->{fifteen}, $_[2];
print $_[1] || "\n";
}