titipata · March 30, 2015 05:34
diff --git a/gs_scoreboard.ipynb b/gs_scoreboard.ipynb
 {
 "metadata": {
  "name": "",
  "signature": "sha256:8918c770e5f0319be5b9003a936c20e387589b76d64fca0d1a9d2d130e695954"
 },
 "nbformat": 3,
 "nbformat_minor": 0,
 "worksheets": [
  {
   "cells": [
    {
     "cell_type": "heading",
     "level": 1,
     "metadata": {},
     "source": [
      "Google Scholar - Kording Lab"
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "This IPython notebook creates a real-time web scraper that pulls citation information from Google Scholar.\n",
      "It uses the Google Scholar profiles of our lab members. The code is divided into 3 sections: libraries and functions, running the scholar update, and closing the figures and stopping the timer.\n",
      "\n",
      "- Original code by Daniel Acuna (in Mathematica)\n",
      "- Created by Titipat Achakulvisut with great help from Daniel Acuna\n",
      "\n",
      "HISTORY:\n",
      "- Created on: 27 Aug 2014\n",
      "- Updated:\n",
      "   - 29 Aug 2014: switched web scraping from regular expressions to lxml\n",
      "   - 9 Sep 2014: minor changes in order to publish on GitHub\n",
      "- Version 0.1"
     ]
    },
    {
     "cell_type": "heading",
     "level": 1,
     "metadata": {},
     "source": [
      "Libraries and Functions"
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "Import all libraries and define the functions needed to run the real-time Google Scholar display.\n",
      "\n",
      "Requirements:\n",
      "- IPython with the notebook\n",
      "- numpy\n",
      "- lxml\n",
      "- pygame\n",
      "- pandas\n",
      "- urllib2\n",
      "- matplotlib\n",
      "- PIL (imported below to load the displayed image)\n",
      "\n",
      "Notes:\n",
      "- A real-time Twitter feed is also available if you install the 'twitter' library and obtain Twitter API credentials; the corresponding code is commented out below\n",
      "- The text styles of the plot can be changed through the 'params_*' dictionaries, depending on the size of the screen you display on\n",
      "\n",
      "P.S.\n",
      "- If you installed Python with Anaconda, only 'pygame' needs to be installed separately\n",
      "- If you installed Python on Mac OS X using MacPorts, see our guide, which has a section on installing packages from the command line: http://klab.smpp.northwestern.edu/wiki/images/e/e6/Macport.pdf"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "%pylab qt4"
     ],
     "language": "python",
     "metadata": {},
     "outputs": []
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "# import library to scrape website\n",
      "from urllib2 import urlopen\n",
      "import numpy as np\n",
      "import pandas as pd\n",
      "import time\n",
      "from lxml import etree # read html\n",
      "from lxml import html\n",
      "\n",
      "# import library to plot and draw\n",
      "import matplotlib.pyplot as plt\n",
      "import cStringIO # get image from website\n",
      "from PIL import Image\n",
      "\n",
      "# use pygame to play music\n",
      "import pygame\n",
      "\n",
      "\n",
      "#import twitter\n",
      "#api = twitter.Api(consumer_key='',\n",
      "#                    consumer_secret='',\n",
      "#                    access_token_key='',\n",
      "#                    access_token_secret='')\n",
      "\n",
      "#### CONSTANT ####\n",
      "# name list\n",
      "NAME = ['Konrad', \n",
      "        'Daniel', \n",
      "        'Pavan',\n",
      "        'Josh', \n",
      "        'Ted', \n",
      "        'Pat',\n",
      "        'Eva', \n",
      "        'Mohammad', \n",
      "        'Hugo',\n",
      "        'Luca',\n",
      "        'Sohrob',\n",
      "        'Steve',\n",
      "        'Iris']\n",
      "\n",
      "# url of each lab member (based on name lists)\n",
      "BASE_URL = ['http://scholar.google.com/citations?user=MiFqJGcAAAAJ&hl=en',\n",
      "            'http://scholar.google.com/citations?user=GAi23ssAAAAJ&hl=en',\n",
      "            'http://scholar.google.com/citations?user=JtltLUAAAAAJ&hl=en',\n",
      "            'http://scholar.google.com/citations?user=tbfWCDgAAAAJ&hl=en',\n",
      "            'http://scholar.google.com/citations?user=T8W-5LsAAAAJ&hl=en',\n",
      "            'http://scholar.google.com/citations?user=jjvixpcAAAAJ&hl=en',\n",
      "            'http://scholar.google.com/citations?user=wdFV87UAAAAJ&hl=en',\n",
      "            'http://scholar.google.com/citations?user=AlTQrFcAAAAJ&hl=en',\n",
      "            'http://scholar.google.com/citations?user=JG7xb2AAAAAJ&hl=en',\n",
      "            'http://scholar.google.com/citations?user=xxDk3-EAAAAJ&hl=en',\n",
      "            'http://scholar.google.co.uk/citations?user=9jqURCEAAAAJ&hl=en',\n",
      "            'http://scholar.google.com/citations?user=uwpOnSAAAAAJ&hl=en&oi=sra',\n",
      "            'http://scholar.google.com/citations?user=Ztwn608AAAAJ&hl=en']\n",
      "\n",
      "\n",
      "# image link that we want to display if someone get cited\n",
      "IMG_LINK = {'Konrad': 'http://www.qwantz.com/patreon/p3.png',\n",
      "            'Daniel': 'http://www.quickmeme.com/img/45/451ab8e56df6f66c37c7eda8e36765f743cbabf1e1dbee5dffd648f47dde54d1.jpg',\n",
      "            'Mohammad': 'http://mybroadband.co.za/vb/attachment.php?s=c864c202183cf1b3d2c57f738e78fce8&attachmentid=103452&d=1394014893',\n",
      "            'Pavan': 'http://www.nbc.com/sites/nbcunbc/files/files/styles/nbc_bio_image/public/images/2013/11/08/azizAnsari_tomHaverford.jpg?itok=PCowY6uk',\n",
      "            'Hugo': 'http://public.media.smithsonianmag.com/legacy_blog/dinosaur-comic-strip.jpg',\n",
      "            'Ted': 'http://lovestats.files.wordpress.com/2012/07/r-square-success-kid.jpg',\n",
      "            'Pat': 'http://m.memegen.com/x6259d.jpg',\n",
      "            'Josh': 'http://cdn.mhpbooks.com/uploads/2013/10/Success-Kid.jpg',\n",
      "            'Eva': 'http://veryhilarious.com/wp-content/uploads/2012/07/indy-hipster.jpg',\n",
      "            'Luca': 'http://i1.cpcache.com/product_zoom/510200164/bunga_bunga_berlusconi_classic_thong.jpg?color=White&height=460&width=460&padToSquare=true',\n",
      "            'Sohrob': 'http://4.bp.blogspot.com/-rjyBJpjUizw/U35ON10k8eI/AAAAAAAAfUM/KMFJEgwJhM8/s1600/stupid-meme-stalin-obama-2.jpg',\n",
      "            'Steve': 'http://ct.fra.bz/ol/fz/sw/i58/2/5/25/frabz-giant-burrito-man-will-make-you-fire-torpedoes-of-another-kind-a01c69.jpg',\n",
      "            'Iris': 'http://public.media.smithsonianmag.com/legacy_blog/dinosaur-comic-strip.jpg'\n",
      "            }\n",
      "\n",
      "\n",
      "# music snippet that you want to play if anyone got cited\n",
      "MUSIC_DIR = '/Users/titipat/Desktop/Amazon Web Service/snippet.wav'\n",
      "\n",
      "# parameter for plotting\n",
      "params_cite = {'fontsize': 22, 'color': 'green', 'fontweight':'bold'}\n",
      "params_hindex = {'fontsize': 22, 'color': 'red', 'fontweight':'bold'}\n",
      "params_date = {'fontsize': 20, 'color': 'blue', 'fontweight':'bold'}\n",
      "params_others = {'fontsize': 30, 'color': 'black'}\n",
      "params_gini = {'fontsize': 20, 'color': 'blue', 'fontweight':'bold'}\n",
      "params_gini_val = {'fontsize': 20, 'color': 'black'}\n",
      "params_tweet = {'fontsize': 20, 'color': 'red', 'fontweight':'bold'}\n",
      "\n",
      "def get_citation_matrix():\n",
      "    ''' Get all citation datafram from Google Scholar '''\n",
      "    all_people = pd.DataFrame(columns=['name', 'citation', 'h_index','url', 'hn_index'])\n",
      "    all_people['name'] = NAME\n",
      "    all_people['url'] = BASE_URL\n",
      "\n",
      "    for i in range(len(all_people)):\n",
      "        tree = html.parse(all_people.url[i])\n",
      "        cit = tree.xpath(\"/html/body/div[@id='gs_top']/div[@id='gsc_bdy']/div[@id='gsc_rsb']/div[@class='gsc_rsb_s']/table[@id='gsc_rsb_st']//tr//td[@class='gsc_rsb_std']\")\n",
      "        citations = np.int(cit[0].text)\n",
      "        h_index = np.int(cit[2].text)\n",
      "        \n",
      "        all_people['citation'][i] = np.int(citations)\n",
      "        all_people['h_index'][i] = np.int(h_index)\n",
      "        all_people['hn_index'][i] = float(h_index**2)/(float(citations) + 1) # suggested ratio by Mohammad and Hugo\n",
      "\n",
      "    return all_people # return matrix of citation\n",
      "\n",
      "def sort_citation(all_people):\n",
      "    ''' Sort the given dataframe by citation '''\n",
      "    all_people_sorted = all_people.sort(columns=['h_index', 'citation'], ascending=False)\n",
      "    all_people_sorted.index = np.arange(len(all_people))\n",
      "    return all_people_sorted\n",
      "\n",
      "def get_options(all_people, all_people_new):\n",
      "    ''' Get citation and new citation then return option dataframe (including difference) '''\n",
      "    # get difference\n",
      "    cite_diff = all_people_new.citation - all_people.citation\n",
      "    index = cite_diff.nonzero()[0]\n",
      "    name_diff = list(all_people.name[index]) # list of name different\n",
      "    \n",
      "    # get table of options if there is different\n",
      "    options = pd.DataFrame(columns=['name', 'citation_diff', 'h_index_diff','sign'])\n",
      "    options['name'] = all_people['name'] # refer to new update\n",
      "    options['citation_diff'] = 0\n",
      "    options['h_index_diff'] = 0\n",
      "    options['sign'] = ''\n",
      "\n",
      "    # for one people in name_diff\n",
      "    for j in range(len(name_diff)):\n",
      "        idx_new = np.where(name_diff[j] == all_people_new['name'])[0]\n",
      "        idx_old = np.where(name_diff[j] == all_people['name'])[0]\n",
      "        diff_citation = all_people_new.citation[idx_new] - all_people.citation[idx_old]\n",
      "        diff_hindex = all_people_new.h_index[idx_new] - all_people.h_index[idx_old]\n",
      "        #print sign(all_people_sorted_new.citation[idx_new] - all_people_sorted.citation[idx_old]) # get sign +1 or -1\n",
      "        options['citation_diff'][idx_new] = diff_citation\n",
      "        options['h_index_diff'][idx_new] = diff_hindex\n",
      "        sign_pm = np.int(np.sign(diff_citation))\n",
      "        if sign_pm == 1:\n",
      "            options['sign'][idx_new] = '+'\n",
      "        else:\n",
      "            options['sign'][idx_new] = '-'\n",
      "\n",
      "    # change index\n",
      "    options.index = options.name # do this one!\n",
      "    \n",
      "    # return dataframe of options and name that have different citation\n",
      "    return options, name_diff\n",
      "    \n",
      "def is_different(all_people_sorted, all_people_sorted_new):\n",
      "    ''' Find if two dataframe are equal or not '''\n",
      "    result = np.max(all_people_sorted_new.h_index != all_people_sorted.h_index) or np.max(all_people_sorted_new.citation != all_people_sorted.citation)\n",
      "    return result\n",
      "\n",
      "def play_music():\n",
      "    ''' Play Everything is Awesome wavfile track '''\n",
      "    pygame.init()\n",
      "    pygame.mixer.music.load(MUSIC_DIR)\n",
      "    pygame.mixer.music.play()\n",
      "    \n",
      "def compute_gini(y):\n",
      "    ''' function to compute Gini index '''\n",
      "    N = len(y)\n",
      "    gini = double(2*np.dot(sorted(y, reverse=False), np.arange(1, N+1)))/double(double(N)*np.sum(y)) - ((N+1.0)/double(N))\n",
      "    gini = np.ceil(gini * 1000) / 1000.0\n",
      "    return gini\n",
      "    \n",
      "def draw_citation_table(): #all_people_new_sorted\n",
      "    global all_people, all_people_new, all_people_sorted, all_people_new_sorted, options, options_old, name_diff # see the outer variable\n",
      "    \n",
      "    # get new citation\n",
      "    all_people_new = get_citation_matrix()\n",
      "    all_people_new_sorted = sort_citation(all_people_new)\n",
      "    \n",
      "    # if different optain new options\n",
      "    if is_different(all_people_sorted, all_people_new_sorted):\n",
      "        options_old, name_diff = get_options(all_people, all_people_new)\n",
      "        options = options_old\n",
      "    else:\n",
      "        options = options_old\n",
      "     \n",
      "    \n",
      "    # DRAW TITLE/ HEADERS\n",
      "    fig = plt.gcf() # get current figure\n",
      "    fig.clf() # clear current figure\n",
      "    fig.suptitle('Bayesian Behavior Lab Citations',\n",
      "                 fontsize=35, fontweight='bold',\n",
      "                 color='gray', style='italic')\n",
      "    \n",
      "    plt.text(0.2, 0.94, 'Citations', **params_cite)\n",
      "    plt.text(0.5, 0.94, 'h-index', **params_hindex)\n",
      "    plt.axis('off')\n",
      "\n",
      "    for i in range(len(all_people_new_sorted)):\n",
      "        name = all_people_new_sorted.name[i] # get name\n",
      "        if (name in name_diff):\n",
      "            option_cit = '('+ options.loc[name].sign + str(options.loc[name].citation_diff) + ')'\n",
      "            option_h = '(' + options.loc[name].sign + str(options.loc[name].h_index_diff) + ')'\n",
      "            # if citation or h-index different is 0, turn to blank\n",
      "            if np.int(options.loc[name].citation_diff) == 0:\n",
      "                option_cit = ''\n",
      "            if np.int(options.loc[name].h_index_diff) == 0:\n",
      "                option_h = ''\n",
      "        else:\n",
      "            option_cit = ''\n",
      "            option_h = ''\n",
      "\n",
      "        # DRAW Name, distance between shown name lists is here\n",
      "        plt.text(-0.1, 0.85-0.065*i, str(all_people_new_sorted.name[i]) , **params_others)\n",
      "        # DRAW Citation\n",
      "        plt.text(0.2, 0.85-0.065*i, str(all_people_new_sorted.citation[i]) + option_cit, **params_others)\n",
      "        # DRAW H-index\n",
      "        plt.text(0.5, 0.85-0.065*i, str(all_people_new_sorted.h_index[i]) + option_h, **params_others)\n",
      "\n",
      "    plt.text(0.0, -0.1, 'Last citation update: ' + time.strftime('%X %b %d, %Y'), **params_date)\n",
      "    plt.text(0.65, 0.6, 'Gini (citation)', **params_gini)\n",
      "    plt.text(0.92, 0.6, str(compute_gini(all_people_new_sorted.citation)), **params_gini_val)\n",
      "    plt.text(0.65, 0.53, 'Gini (h-index)', **params_gini)\n",
      "    plt.text(0.92, 0.53, str(compute_gini(all_people_new_sorted.h_index)), **params_gini_val)\n",
      "\n",
      "    # DRAW New Twitter Feed from Kording Lab\n",
      "    #statuses = api.GetUserTimeline(screen_name=\"KordingLab\") # getting all tweets\n",
      "    #plt.text(-0.1, -0.05, 'Tweets: ', **params_gini)\n",
      "    #try:\n",
      "    #    new_tweet = str(statuses[0].text.encode('ascii', 'ignore'))\n",
      "    #    plt.text(0.1, -0.05, new_tweet, **params_gini_val)\n",
      "    #except ValueError:\n",
      "    #    plt.text(0.1, -0.05, \"Can't retrive tweet...\", **params_gini_val)\n",
      "    \n",
      "    \n",
      "    # DRAW IMAGE\n",
      "    if len(name_diff) > 0:\n",
      "        URL = IMG_LINK[name_diff[-1]] # use lowested name to show image\n",
      "    else:\n",
      "        URL = 'http://www.qwantz.com/patreon/p3.png' # default dinosaur images \n",
      "    file = cStringIO.StringIO(urlopen(URL).read())\n",
      "\n",
      "    img = Image.open(file)\n",
      "    axicon = fig.add_axes([0.6,0.15,0.33,0.33])\n",
      "    plt.imshow(img)\n",
      "    plt.axis('off')\n",
      "    \n",
      "    fig.canvas.draw()\n",
      "    fig.canvas.activateWindow()\n",
      "    plt.draw()\n",
      "    plt.show()\n",
      "    \n",
      "    \n",
      "    # Play music and update part!\n",
      "    if is_different(all_people_sorted, all_people_new_sorted):\n",
      "        play_music()\n",
      "        all_people = all_people_new # replace all people with new one\n",
      "        all_people_sorted = sort_citation(all_people) # sorted again\n"
     ],
     "language": "python",
     "metadata": {},
     "outputs": []
    },
    {
     "cell_type": "heading",
     "level": 1,
     "metadata": {},
     "source": [
      "Run Google Scholar"
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "After running the libraries and functions section, run this cell to show the citation display and start the periodic updates"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "# Draw the first citation image\n",
      "plt.close('all')\n",
      "fig = plt.figure(facecolor='white') # figure background color; change to suit the display\n",
      "\n",
      "all_people = get_citation_matrix() # scrape citations from the URLs in BASE_URL\n",
      "options, name_diff = get_options(all_people, all_people)\n",
      "options_old = options # initialize options_old so draw_citation_table can reference it\n",
      "all_people_sorted = sort_citation(all_people)\n",
      "all_people_new = get_citation_matrix()\n",
      "all_people_new_sorted = sort_citation(all_people_new)\n",
      "draw_citation_table() # draw the first K-lab citation table\n",
      "\n",
      "# timer to rerun the update at a fixed interval\n",
      "timer = fig.canvas.new_timer(interval=1000*60*15) # run every 15 minutes (1000*60*15 milli-seconds)\n",
      "timer.add_callback(draw_citation_table)\n",
      "timer.start()"
     ],
     "language": "python",
     "metadata": {},
     "outputs": []
    },
    {
     "cell_type": "heading",
     "level": 1,
     "metadata": {},
     "source": [
      "Close figures and Stop Timer"
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "This section stops the timer and closes the real-time citation display."
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "# stop the timer and close all figures\n",
      "timer.stop()\n",
      "plt.close('all')"
     ],
     "language": "python",
     "metadata": {},
     "outputs": []
    },
    {
     "cell_type": "heading",
     "level": 2,
     "metadata": {},
     "source": [
      "Adding a custom CSS file for NBViewer"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "from IPython.core.display import HTML\n",
      "HTML(open(\"./custom_nb.css\", \"r\").read())"
     ],
     "language": "python",
     "metadata": {},
     "outputs": []
    }
   ],
   "metadata": {}
  }
 ]
 }
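
A note on `compute_gini` in the notebook above: it calls `double`, which is only available because `%pylab` pulls NumPy names into the namespace. The same rank-weighted formula can be sketched in plain Python. This is a minimal illustration, not the notebook's code: the function name `gini` and the error handling are assumptions, the input is assumed to be non-negative with a positive sum, and the notebook's rounding up to three decimals is left out.

```python
def gini(values):
    """Gini index via the rank-weighted sum used in the notebook's compute_gini.

    G = 2 * sum(rank_i * y_i) / (N * sum(y)) - (N + 1) / N,
    where y is sorted ascending and rank_i runs from 1 to N.
    """
    ordered = sorted(values)
    n = len(ordered)
    total = sum(ordered)
    if n == 0 or total <= 0:
        # Assumption of this sketch: refuse inputs the formula cannot handle.
        raise ValueError("need at least one value and a positive sum")
    weighted = sum(rank * value for rank, value in enumerate(ordered, start=1))
    return 2.0 * weighted / (n * total) - (n + 1.0) / n


# Everyone equal gives 0; one person holding everything gives (N - 1) / N.
print(gini([1, 1, 1, 1]))  # 0.0
print(gini([0, 0, 0, 1]))  # 0.75
```

Because the values are sorted inside the function, the result does not depend on the order in which lab members are listed.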
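
The bookkeeping done by `get_options` (compare two scrapes, keep only the people whose numbers moved, and format the change for display) can be sketched without pandas. This is an illustration under stated assumptions, not the notebook's implementation: each snapshot is assumed to be a plain dict mapping a name to a `(citations, h_index)` tuple, `citation_changes` and `format_delta` are hypothetical helper names, and the numbers in the example are made up.

```python
def citation_changes(old, new):
    """Return {name: (citation_diff, h_index_diff)} for people whose numbers changed."""
    changes = {}
    for name, (citations_new, h_new) in new.items():
        if name not in old:
            continue  # no earlier snapshot to compare against
        citations_old, h_old = old[name]
        delta = (citations_new - citations_old, h_new - h_old)
        if delta != (0, 0):
            changes[name] = delta
    return changes


def format_delta(value):
    """Format a difference for display: '(+3)', '(-2)', or '' when nothing changed."""
    return "" if value == 0 else "({:+d})".format(value)


# Made-up snapshots: only person_a gained citations between the two scrapes.
old = {"person_a": (100, 10), "person_b": (50, 5)}
new = {"person_a": (103, 10), "person_b": (50, 5)}

for name, (cite_diff, h_diff) in citation_changes(old, new).items():
    print(name, format_delta(cite_diff), format_delta(h_diff))
```

Formatting with `{:+d}` writes the sign exactly once, so a negative difference is never given a second minus sign.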