@xiangze
Created December 15, 2014 17:14
title & authors of NIPS
# -*- coding: utf-8 -*-
"""
Created on Tue Dec 16 00:16:31 2014
@author: xiangze
"""
import glob
from BeautifulSoup import BeautifulSoup  # BeautifulSoup 3 (Python 2 only)

# Local HTML files to parse.
fns=["advances-in-neural-information-processing-systems-8-1995"]
for fname in fns:
    with open(fname) as f:
        soup=BeautifulSoup(f.read())
    papers=[]
    # Each <li> is treated as one paper entry.
    for l in soup.findAll("li"):
        p={}
        p["author"]=[]
        for i in l.contents:
            si=str(i)
            # Markup mentioning "author" is an author link; any other link is the title.
            if "author" in si:
                txt=i.text.encode("utf-8")
                p["author"].append(txt)
            elif "href" in si:
                txt=i.text.encode("utf-8")
                p["title"]=txt
        papers.append(p)
    print papers
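
The script above targets Python 2 and BeautifulSoup 3. For readers on Python 3, here is a minimal sketch of the same extraction using only the standard library's html.parser instead of BeautifulSoup. The names PaperListParser and extract_papers are hypothetical, and the sketch assumes that author links carry class="author" and that any other link inside an <li> is the paper title. That is a guess at the page markup, not something the script confirms.

```python
from html.parser import HTMLParser


class PaperListParser(HTMLParser):
    """Collect one {"title": ..., "author": [...]} dict per <li> element."""

    def __init__(self):
        super().__init__()
        self.papers = []
        self._paper = None  # dict for the <li> currently open, if any
        self._field = None  # "title" or "author" while inside a link
        self._text = []     # text fragments of the current link

    def handle_starttag(self, tag, attrs):
        if tag == "li":
            self._paper = {"author": []}
        elif tag == "a" and self._paper is not None:
            attrs = dict(attrs)
            # Assumption: author links have class="author"; any other
            # link with an href inside the <li> is taken as the title.
            if "author" in (attrs.get("class") or "").split():
                self._field = "author"
            elif "href" in attrs:
                self._field = "title"
            self._text = []

    def handle_data(self, data):
        if self._field is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._field is not None:
            text = "".join(self._text).strip()
            if self._field == "author":
                self._paper["author"].append(text)
            else:
                self._paper["title"] = text
            self._field = None
        elif tag == "li" and self._paper is not None:
            self.papers.append(self._paper)
            self._paper = None


def extract_papers(html):
    """Return the list of paper dicts found in an HTML string."""
    parser = PaperListParser()
    parser.feed(html)
    parser.close()
    return parser.papers
```

To use it on the same input, read the file as text (for example with open(fname, encoding="utf-8")) and pass the contents to extract_papers. As in the original script, every <li> on the page yields an entry, so navigation items may show up as dicts with an empty author list.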