Skip to content

Instantly share code, notes, and snippets.

View MlataIbrahim's full-sized avatar
🏠
Working from home

Mlata Ibrahim MlataIbrahim

🏠
Working from home
View GitHub Profile
@cjdd3b
cjdd3b / fingerprint.py
Created February 22, 2015 14:17
Python implementation of Google Refine fingerprinting algorithms here: https://github.com/OpenRefine/OpenRefine/wiki/Clustering-In-Depth
# -*- coding: utf-8 -*-
import re, string
from unidecode import unidecode
PUNCTUATION = re.compile('[%s]' % re.escape(string.punctuation))
class Fingerprinter(object):
'''
Python implementation of Google Refine fingerprinting algorithm described here: