Created
August 25, 2010 08:53
Upload #photos to renren.com and fetch news #python #sh
#!/bin/sh
### get news from your friends at renren.com
# Copyright: (c) 2009,2010 Jakukyo Friel <[email protected]>
# License: GPL v2

# Config area

# We put everything in an hg repo. You should have set it up already.
renrepo=$HOME/repo/renren
# Log in to m.renren.com with your account and get the bookmark URL.
privateURL='http://m.renren.com/home.do?bm=yourPrivateURLHere'
# The cookies file.
rencookie=rencookie.txt

# Main script

# functions

# fetch pages: fetch url > outputfile
fetch() {
    wget -O - --load-cookies "$rencookie" \
        --save-cookies "$rencookie" --keep-session-cookies "$1"
}

# get the link to the next page (reads HTML from stdin):
# '下一页' is the "next page" link text on the site.
# It's dirty to use a global variable, but after all this is a sh script.
getlink() {
    local rawurl=$(grep -Eo '<a[^>]+>下一页</a>' |
        grep -Eo 'href="[^"]+"' |
        sed -e 's/^href=//' |
        sed -e 's/"//g')
    # Relative links get the site prefix prepended.
    if (echo $rawurl | grep -e '^http://' > /dev/null); then
        echo $rawurl
    else
        echo http://3g.renren.com$rawurl
    fi
}

# procedures

cd "$renrepo" || exit 1

# fetch front page
fetch "$privateURL" >0.html

# fetch more pages: follow the "next page" link until a page has none
newspage=0.html
i=1
while grep -q '下一页' "$newspage"; do
    output=$i.html
    fetch $(cat $newspage | getlink) >$output
    newspage=$output
    i=$(expr $i + 1)
done
# Note: at this point $newspage has no next-page link, so getlink
# returns only the bare http://3g.renren.com prefix.
fetch $(cat $newspage | getlink) >$i.html
# We have fetched all the pages we want.

# convert to text file
for file in *.html; do
    w3m -dump $file > $file.txt
done

# extract news: keep the lines between the '新鲜事' ("news feed") heading
# and the '上一页'/'下一页' ("previous page"/"next page") links
for file in [0-9]*.html.txt; do
    sed -n '/新鲜事/,/[上下]一页/p' $file >> news.all
done
# drop the marker lines themselves, then deduplicate
sed -e '/新鲜事/d' -e '/[上下]一页/d' news.all | sort | uniq > news

# commit changes
# You should have already run 'hg init' and 'hg add news'.
hg ci -m 'automated commit'
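
The link-extraction step in getlink can also be sketched in Python, which may be easier to test than the grep/sed pipeline. This is only an illustration under assumptions: `next_page_url`, `NEXT_LABEL` and `BASE_URL` are hypothetical names that are not part of the script, and unlike getlink this version returns None when a page has no next-page link instead of returning the bare prefix.

```python
import re

NEXT_LABEL = '下一页'               # "next page" link text the shell script greps for
BASE_URL = 'http://3g.renren.com'  # same prefix getlink prepends to relative links


def next_page_url(html, base=BASE_URL):
    """Return the absolute URL of the next-page link, or None if there is none."""
    # Same pattern as the shell script: an <a ...> tag whose text is the label.
    anchor = re.search(r'<a[^>]+>' + re.escape(NEXT_LABEL) + r'</a>', html)
    if anchor is None:
        return None
    href = re.search(r'href="([^"]+)"', anchor.group(0))
    if href is None:
        return None
    url = href.group(1)
    # Absolute links are returned unchanged; relative ones get the prefix.
    return url if url.startswith('http://') else base + url
```

Calling `next_page_url(page_text)` in a loop until it returns None would mirror the while loop above.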
#!/usr/bin/python2.6
# by weakish <[email protected]>, licensed under GPL v2.

'''
Upload photos to renren.com

This script uploads all JPEG files in the current directory to an existing
album at renren.com.

Todo:

- read albumid from command line arguments
- support files other than jpg
'''

import glob
from twill.commands import *

# config
email = '[email protected]'
password = 'password'
albumid = 'albumid'

# auth
go('http://www.renren.com/')
formvalue('1', 'email', email)
formvalue('1', 'password', password)
submit()
code('200')

# upload
for photo in glob.glob('*.jpg'):
    go('http://upload.renren.com/addphotoPlain.do?id='+albumid)
    formfile('uploadForm', 'photo1', photo)
    submit()
    code('200')
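
The two Todo items in the docstring (reading albumid from the command line and supporting files other than jpg) could be approached roughly as follows. This is a sketch in Python 3 using only the standard library, not the script's actual code: `parse_args`, `find_photos` and the `--ext` option are hypothetical names introduced here, and the twill upload calls are deliberately left out.

```python
import argparse
import glob
import os


def parse_args(argv=None):
    """Parse the album ID and an optional list of file extensions."""
    parser = argparse.ArgumentParser(
        description='Upload photos to an existing renren.com album')
    parser.add_argument('albumid', help='ID of an existing album')
    # --ext is a hypothetical option; it defaults to the script's current behaviour.
    parser.add_argument('--ext', nargs='+', default=['jpg'],
                        help='file extensions to upload (default: jpg)')
    return parser.parse_args(argv)


def find_photos(directory, extensions):
    """Return a sorted list of files in directory matching any extension."""
    photos = []
    for ext in extensions:
        photos.extend(glob.glob(os.path.join(directory, '*.' + ext)))
    return sorted(photos)
```

With these helpers, the upload loop could iterate over `find_photos('.', args.ext)` and use `args.albumid` in place of the hard-coded value, for example after an invocation such as `upload.py 12345 --ext jpg png` (the script name is assumed).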