@kiyoto
kiyoto / 140_bytes_not_chars
Created February 25, 2013 08:53
Languages whose characters take several bytes in UTF-8, such as Japanese, pack more meaning into 140 characters than English does. Here is one farcical attempt to keep them honest: truncate the tweet after 140 bytes, not 140 characters.
javascript:(function(d){
var tbox, text, len = 140, ii = 0;
tbox = d.getElementById('tweet-box-global');
if (!tbox) return;
if (!(tbox = tbox.firstChild)) return;
if (!(text = tbox.innerHTML)) return;
while (true) {
len -= encodeURI(text[ii]).replace(/%[A-F\d]{2}/g, 'U').length;
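The preview above is cut off, so the rest of the bookmarklet is not shown. As a rough, standalone sketch of the idea in the description (keep only as many characters as fit in a byte budget), the following may help. `truncateToBytes` is a hypothetical helper name, and it uses `TextEncoder` rather than the `encodeURI` trick in the gist, so it is an illustration under those assumptions, not the author's code.

```javascript
// Hypothetical helper, not part of the original bookmarklet.
// Returns the longest prefix of `text` whose UTF-8 encoding fits in `maxBytes`.
function truncateToBytes(text, maxBytes) {
  const encoder = new TextEncoder(); // always encodes as UTF-8
  let used = 0;
  let out = '';
  // for...of walks the string by code point, so a surrogate pair
  // (for example an emoji) is never split in half.
  for (const ch of text) {
    const size = encoder.encode(ch).length;
    if (used + size > maxBytes) break;
    used += size;
    out += ch;
  }
  return out;
}

// Each of these Japanese characters takes 3 bytes in UTF-8,
// so only two of them fit in a 7-byte budget.
console.log(truncateToBytes('日本語', 7));
```

Counting by code point rather than by UTF-16 unit is a deliberate choice here: indexing with `text[ii]` would hand a lone surrogate to the encoder for characters outside the Basic Multilingual Plane.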
javascript:(function(d){
var cls, ii, cl, ct = 0;
cls = d.getElementsByClassName('discussion-bubble');
for (ii = 0; ii < cls.length; ii++) {
cl = cls[ii];
if (/IETF/i.test(cl.innerHTML)) {
cl.style.display = 'none';
ct++;
}
}
#!/bin/bash
repo=$1
project=$2
curl -sL "https://api.github.com/repos/$repo/$project/issues" | perl -e '$nc=0; while(<>) { if(/"comments": (\d+)/ && ($nc < ($c=int($1)))) { $nc=$c; } } print "$nc\n";'
@kiyoto
kiyoto / in_twitterstream.rb
Created November 2, 2012 22:43
TD-agent input plugin to get data from Twitter (thanks to jyuan)
module Fluent
class TwitterStreamInput < Fluent::Input
# Register plugin
Plugin.register_input('twitterstream', self)
# required auth params
config_param :consumer_key, :string
config_param :consumer_secret, :string
config_param :access_token_key, :string
@kiyoto
kiyoto / anonconvo.rb
Created October 10, 2012 01:12
anonymize/filter out timestamps from Adium chat logs
#!/usr/bin/env ruby
# to parse Adium conversation
regex = /^[\d:]+ [AP]M (?<name>[^:]+): (?<msg>.*)$/
for line in STDIN
m = regex.match(line)
next if not m
name = m['name']
msg = m['msg']
@kiyoto
kiyoto / book_crossing_action_query.sql
Created July 23, 2012 22:13
Book Crossing Dataset Action Queries
-- Query 1: Counting Harsh, Generous and Lazy
td query -w -d book_crossing_dataset "
SELECT rating_type, COUNT(*) AS cnt
FROM
(
SELECT user_id, MIN(book_rating) AS stat, COUNT(book_rating) AS cnt, 'Generous' AS rating_type
FROM ratings
WHERE 0 < book_rating
GROUP BY user_id
HAVING 5 < COUNT(book_rating)
@kiyoto
kiyoto / book_crossing_status_query.sql
Created July 16, 2012 23:37
Book Crossing Dataset Status Queries
-- Query 1
td query -w -d book_crossing_dataset "
SELECT t AS type, cnt
FROM
(
SELECT COUNT(*) AS cnt, 'only in users' AS t
FROM
(
SELECT user_id
@kiyoto
kiyoto / basic_information.sql
Created July 10, 2012 19:33 — forked from doryokujin/basic_information.sql
Book-Crossing Dataset
-- 1.1.1 all users v. active users --
td query -w -d book_crossing_dataset "
SELECT t1.cnt AS all_users, t2.cnt AS active_users, ROUND(t2.cnt/t1.cnt*100) AS active_rate
FROM
(
SELECT COUNT(DISTINCT user_id) AS cnt, 1 AS one
FROM users
) t1
JOIN
(
@kiyoto
kiyoto / random_array.php
Created June 15, 2012 00:38
creating random associative arrays
<?php
function random_array(&$current) {
foreach ($current as $k => &$v) {
if ((rand() % 10) / 10 >= 0.68) {
$v = array(0, 'aoiyu', 1337);
random_array($v);
}
}
unset($v);
@kiyoto
kiyoto / hn-stats
Created June 2, 2012 22:21
Just a random script to scrape info from HN's top page.
#!/bin/sh
HN_FILE='hn.html'
HN="http://news.ycombinator.com"
rm -f "$HN_FILE"
wget "$HN" -O "$HN_FILE"
if [ -f "$HN_FILE" ]
then
perl -ne 'while (m!<td class="title"><a href="(http[^"]+)">(?:.*?)</a><span class="comhead"> \(([^)]+)\) </span></td></tr><tr><td colspan=2></td>(?:<td class="subtext"><span id=score_\d+>(\d+) points</span> by <a href="user\?id=[^"]+">(?:.*?)</a> (\d+) (minute|hour|day)s? ago \| <a href="item\?id=\d+">(\d+) comments?</a>)?!g) { print "$1 $2"; if (defined $3) { print " $3 $4 $5 $6"; } print "\n"; }' < $HN_FILE
else