Rafael Calsaverini rcalsaverini

@rcalsaverini
rcalsaverini / sample.py
Created September 1, 2014 00:28
Note to self
import pandas as pnd
import numpy as np
import numpy.random as rnd
from collections import namedtuple


class Sampler(list):
    def __init__(self, items, weights=None):
        super(Sampler, self).__init__(items)
@rcalsaverini
rcalsaverini / strip_accents.py
Created August 30, 2014 15:05
Removing accents from Unicode strings in Python
import unicodedata


def strip_accents(unicode_string):
    """
    Strip accents (nonspacing combining marks, Unicode category 'Mn')
    from a unicode string.
    """
    # NFD decomposes each accented character into a base character plus marks.
    nfd_string = unicodedata.normalize('NFD', unicode_string)
    is_not_accent = lambda char: unicodedata.category(char) != 'Mn'
    return ''.join(
        char for char in nfd_string if is_not_accent(char)
    )
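A short, self-contained sketch of how this function behaves on a few inputs. The sample strings are illustrative assumptions, not taken from the gist. It also shows one limit of the approach: letters such as `ø` have no canonical decomposition, so NFD leaves them whole and they pass through unchanged.

```python
import unicodedata


def strip_accents(text):
    """Return text with nonspacing combining marks (category 'Mn') removed."""
    # NFD splits an accented character into base character + combining marks.
    decomposed = unicodedata.normalize('NFD', text)
    return ''.join(
        char for char in decomposed
        if unicodedata.category(char) != 'Mn'
    )


if __name__ == '__main__':
    # Illustrative inputs (assumed for this example).
    for sample in (u'S\u00e3o Paulo', u'cora\u00e7\u00e3o', u'S\u00f8ren'):
        print(strip_accents(sample))
```

If characters like `ø` or `ß` also need an ASCII form, an explicit translation table would have to be added on top of this; that is outside what the snippet does.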
@rcalsaverini
rcalsaverini / python_resources.md
Created April 3, 2014 14:20 — forked from jookyboi/python_resources.md
Python-related modules and guides.

Packages

  • lxml - Pythonic binding for the C libraries libxml2 and libxslt.
  • boto - Python interface to Amazon Web Services
  • Django - Django is a high-level Python Web framework that encourages rapid development and clean, pragmatic design.
  • Fabric - Library and command-line tool for streamlining the use of SSH for application deployment or systems administration tasks.
  • PyMongo - Tools for working with MongoDB; the recommended way to work with MongoDB from Python.
  • Celery - Task queue to distribute work across threads or machines.
  • pytz - pytz brings the Olson tz database into Python. This library allows accurate and cross platform timezone calculations using Python 2.4 or higher.

Guides

@rcalsaverini
rcalsaverini / crawl.py
Last active August 22, 2017 17:43
Crawl Futpedia for Brazilian soccer data.
"""
Crawling Brazilian soccer results from
http://futpedia.globo.com/campeonato/campeonato-brasileiro/2011#/fase=fase-unica/rodada=1
With:
http://www.clips.ua.ac.be/pages/pattern-web
"""
import pandas
from pattern import web
@rcalsaverini
rcalsaverini / InvalidXMLFilter.py
Last active December 20, 2015 23:29
To filter invalid XML characters in Python
#Based on an answer by John Machin on Stack Overflow (http://stackoverflow.com/users/84270/john-machin)
#http://stackoverflow.com/questions/8733233/filtering-out-certain-bytes-in-python
def isValidXMLChar(char):
    codepoint = ord(char)
    return 0x20 <= codepoint <= 0xD7FF or \
        codepoint in (0x9, 0xA, 0xD) or \
        0xE000 <= codepoint <= 0xFFFD or \
        0x10000 <= codepoint <= 0x10FFFF
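A minimal sketch of how a predicate like this might be applied to clean a whole string before serializing it as XML. The helper name `filter_invalid_xml` and the snake_case predicate name are assumptions made for this example; the gist preview only shows the predicate itself.

```python
def is_valid_xml_char(char):
    """True if char falls in the XML 1.0 character ranges used by the gist."""
    codepoint = ord(char)
    return (
        0x20 <= codepoint <= 0xD7FF
        or codepoint in (0x9, 0xA, 0xD)      # tab, line feed, carriage return
        or 0xE000 <= codepoint <= 0xFFFD
        or 0x10000 <= codepoint <= 0x10FFFF
    )


def filter_invalid_xml(text):
    """Hypothetical helper: drop every character the predicate rejects."""
    return ''.join(char for char in text if is_valid_xml_char(char))


if __name__ == '__main__':
    # NUL and vertical tab are rejected; the newline is kept.
    print(repr(filter_invalid_xml(u'a\x00b\x0bc\n')))
```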
from mpl_toolkits.basemap import Basemap
import numpy as np
import matplotlib.pyplot as plt
m = Basemap(llcrnrlon=-73.45,llcrnrlat=-35,urcrnrlon=-30.1,urcrnrlat=6, resolution='i',projection='lcc',lon_0=-45,lat_0=-23.7)
m.drawcoastlines()
m.fillcontinents(color='beige',lake_color='aqua')
# draw parallels and meridians.
m.drawparallels(np.arange(-40,10,1.))
m.drawmeridians(np.arange(-80.,-0,1.))
@rcalsaverini
rcalsaverini / countourplot.py
Created March 22, 2013 16:56
Contourplot with matplotlib
import pandas
import datetime
from pylab import *
data = pandas.read_csv('file.csv', index_col = 0, header = 0, sep = '\t')
timesIndex = data.index
dynamapsIndex = data.columns
# Build a real list so this also works on Python 3, where map() returns an iterator.
levels = [round(x, 1) for x in linspace(data.min().min(), data.max().max(), 101)]
contourf(data, levels = levels, cmap=plt.cm.jet_r, interpolation = 'bicubic')
@rcalsaverini
rcalsaverini / Singleton.py
Created October 8, 2012 00:16
A Singleton metaclass in python:
class Singleton(type):
    """ This is a Singleton metaclass. All classes affected by this metaclass
    have the property that only one instance is created for each set of arguments
    passed to the class constructor."""

    def __init__(cls, name, bases, dict):
        # Pass the class name (not cls a second time) on to type.__init__.
        super(Singleton, cls).__init__(name, bases, dict)
        cls._instanceDict = {}

    def __call__(cls, *args, **kwargs):
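The preview cuts off before the body of `__call__`. One way the behaviour described in the docstring could be completed is sketched below. The cache key, the `Config` example class, and the Python 3 `metaclass=` syntax are all assumptions for illustration, not the author's code.

```python
class Singleton(type):
    """Metaclass sketch: one instance per distinct set of constructor arguments."""

    def __init__(cls, name, bases, namespace):
        super(Singleton, cls).__init__(name, bases, namespace)
        cls._instanceDict = {}

    def __call__(cls, *args, **kwargs):
        # Assumed cache key: positional args plus sorted keyword items.
        # This requires every argument to be hashable.
        key = (args, tuple(sorted(kwargs.items())))
        if key not in cls._instanceDict:
            # type.__call__ runs __new__ and __init__ as usual.
            cls._instanceDict[key] = super(Singleton, cls).__call__(*args, **kwargs)
        return cls._instanceDict[key]


class Config(metaclass=Singleton):  # hypothetical example class
    def __init__(self, name):
        self.name = name


if __name__ == '__main__':
    print(Config('a') is Config('a'))
    print(Config('a') is Config('b'))
```

With this key, `Config('a')` and `Config(name='a')` are cached separately, since the arguments are compared as passed rather than as bound to parameters.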
@rcalsaverini
rcalsaverini / mongowatch.py
Created September 26, 2012 13:19
Script to monitor an indexing operation on mongodb
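import pymongo
import re
import time
from datetime import datetime, timedelta

# Raw string so the regex escapes are not treated as string escapes.
patt = re.compile(r"\d*/\d*\s\d*%")


class Operations(object):
    def __init__(self, db, host='localhost'):
        # Note: pymongo.Connection was removed in PyMongo 3; newer code uses pymongo.MongoClient.
        self.conn = pymongo.Connection(host)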
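The rest of the script is not shown, but the compiled pattern suggests it pulls a "done/total percent%" fragment out of an operation's progress message. A rough sketch of that parsing step follows. The `parse_progress` helper, the stricter pattern with capture groups, and the sample message are assumptions made for this example.

```python
import re

# Stricter variant of the gist's pattern: \d+ instead of \d*, with capture groups.
progress_patt = re.compile(r"(\d+)/(\d+)\s(\d+)%")


def parse_progress(message):
    """Hypothetical helper: return (done, total, percent) or None if absent."""
    match = progress_patt.search(message)
    if match is None:
        return None
    done, total, percent = (int(group) for group in match.groups())
    return done, total, percent


if __name__ == '__main__':
    # Sample message format is assumed, not taken from MongoDB output.
    print(parse_progress("Index Build: 1200/5000 24%"))
```

From successive `(done, total)` readings and their timestamps, a monitoring loop could estimate the remaining time, which would explain the `datetime` and `timedelta` imports.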
@rcalsaverini
rcalsaverini / testpymc.py
Created September 20, 2012 20:06
Testing pymc
import pymc
import numpy


def model(disasters):
    size = len(disasters)
    low = pymc.Exponential('low', beta=1.)
    high = pymc.Exponential('high', beta=2.)

    @pymc.stochastic(dtype=int)