@tobbez
Created November 21, 2010 11:38
Domain stats in redis
* Redis key/value storage: http://code.google.com/p/redis/
* credis C bindings for redis: http://code.google.com/p/credis/
* redis-py python client for redis: https://github.com/andymccurdy/redis-py/
A Python client was used to display the stats because of problems with zrevrange in credis.
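For readers without a Redis server at hand, the counting scheme used here can be sketched in plain Python. This is only an illustration, assuming a sorted set behaves like a mapping from member to score. The functions `zincrby` and `zrevrange` below are hypothetical local stand-ins named after the Redis commands, not the redis-py or credis API, and the domain names are made up.

```python
from collections import defaultdict

# Hypothetical in-memory stand-in for one Redis sorted set (member -> score).
scores = defaultdict(float)

def zincrby(member, amount=1):
    """Add `amount` to the member's score, starting from 0 if it is new."""
    scores[member] += amount
    return scores[member]

def zrevrange(start, stop):
    """Return (member, score) pairs ordered from highest to lowest score.

    `stop` is inclusive, as in Redis, so (0, 10) yields up to 11 entries.
    Only non-negative indexes are handled in this sketch.
    """
    ranked = sorted(scores.items(), key=lambda kv: (kv[1], kv[0]), reverse=True)
    return ranked[start:stop + 1]

# Made-up input: one increment per occurrence, as the C program does per line.
for domain in ["a.example", "b.example", "a.example",
               "c.example", "a.example", "b.example"]:
    zincrby(domain)

top = zrevrange(0, 1)
print(top)
```

The inclusive end index is why the script at the end, which asks for ranks 0 to 10, prints eleven domains.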
tobbez@sagiri ~/redis $ LD_LIBRARY_PATH=./credis-0.2.3 ./domain-test
reading list of domains... done
inserting into redis...
inserted 231249 records in 8.690000 seconds (26610 records/s)
tobbez@sagiri ~/redis $ ./show-top-domains.py
top domains are:
www.netLibrary.com: 39846
purl.access.gpo.gov: 39383
en.wikipedia.org: 6519
www.netlibrary.com: 6270
www.youtube.com: 5639
search.epnet.com: 3933
etymonline.com: 2037
link.springer-ny.com: 1778
blinkenshell.org: 1653
youtube.com: 1387
tinyurl.com: 1351
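The listing above contains both www.netLibrary.com and www.netlibrary.com, because sorted-set members are compared as exact strings while domain names themselves are case-insensitive. If merged totals are wanted, one option is to lower-case each domain before counting. The sketch below reuses four counts from the output above and a plain `Counter` rather than Redis; `merge_case` is a hypothetical helper name.

```python
from collections import Counter

def merge_case(counts):
    """Sum counts of domains that differ only in letter case."""
    merged = Counter()
    for domain, n in counts.items():
        merged[domain.lower()] += n
    return merged

# Counts copied from the output above.
observed = {
    "www.netLibrary.com": 39846,
    "www.netlibrary.com": 6270,
    "www.youtube.com": 5639,
    "youtube.com": 1387,
}

merged = merge_case(observed)
print(merged["www.netlibrary.com"])  # 39846 + 6270 = 46116
```

Hosts such as www.youtube.com and youtube.com stay separate here, since they are different names and not case variants.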
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>
#include "credis.h"
int main(int argc, char **argv)
{
    REDIS rh;
    double _crap; /* receives the new score from credis_zincrby; not used */
    FILE *f;
    char *s[250000]; /* yeah, yeah, i know */
    char input[1024];
    int scount = 0;
    clock_t cstart, cend;
    float seconds;
    int i;
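    printf("reading list of domains...");
    fflush(stdout); /* no newline yet, so flush to show the message now */
    f = fopen("domains.txt", "r");
    if (f == NULL) {
        perror("domains.txt");
        return 1;
    }
    /* stop at the capacity of s[] instead of writing past its end */
    while (scount < (int) (sizeof(s) / sizeof(s[0])) &&
           fgets(input, sizeof(input), f) != NULL) {
        input[strcspn(input, "\n")] = '\0'; /* remove the newline, if any */
        s[scount] = malloc(strlen(input) + 1);
        if (s[scount] == NULL) {
            perror("malloc");
            return 1;
        }
        strcpy(s[scount], input);
        scount++;
    }
    fclose(f);
    printf(" done\n");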
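    rh = credis_connect(NULL, 0, 2000); /* NULL/0 select the default host and port */
    if (rh == NULL) {
        fprintf(stderr, "could not connect to redis\n");
        return 1;
    }
    credis_del(rh, "domains"); /* start from an empty sorted set */
    printf("inserting into redis...\n");
    /* note: clock() measures CPU time used by this process, not wall-clock time */
    cstart = clock();
    for (i = 0; i < scount; ++i) {
        credis_zincrby(rh, "domains", 1, s[i], &_crap);
    }
    cend = clock();
    seconds = (cend - cstart) / (float) CLOCKS_PER_SEC;
    printf("inserted %i records in %f seconds (%i records/s)\n",
           scount, seconds, (int) (scount / seconds));
    credis_close(rh);
    return 0;
}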
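#!/usr/bin/env python3
import redis

# decode_responses=True makes redis-py return str instead of bytes on Python 3
r = redis.Redis(decode_responses=True)
print("top domains are:")
# ranks 0..10 inclusive, i.e. the 11 highest-scoring members, with their scores
for name, score in r.zrevrange('domains', 0, 10, withscores=True):
    print(name + ":" + (25 - len(name)) * ' ' + str(int(score)))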