Created March 31, 2012 17:08
One-Liner to get approximate size of remote Apache directory listing using wget and perl
wget -r -nd -np --spider http://URL_GOES_HERE 2>&1 | perl -ne '$size += $1 if $_ =~ m/^Length: (\d+)/; END{print $size . "\n";}'
This assumes that the remote Apache directory is served by the standard index module (mod_autoindex). With --spider, wget issues a HEAD request for every URL found in the listing without saving anything, and Perl sums the byte count from each "Length:" line that wget prints.
I compared the output of this one-liner with the output of 'du -sb' on the target directory, and the two totals agreed to within about 2 MB. The difference comes from the column-sorting links that a default Apache index offers, which wget follows as well; it could be removed with a few additional lines. The target remote directory was nearly 3 GB in size and had 29 subdirectories.