Skip to content

Instantly share code, notes, and snippets.

View MlataIbrahim's full-sized avatar
🏠
Working from home

Mlata Ibrahim MlataIbrahim

🏠
Working from home
View GitHub Profile
@MlataIbrahim
MlataIbrahim / fingerprint.py
Created February 3, 2020 11:38 — forked from cjdd3b/fingerprint.py
Python implementation of Google Refine fingerprinting algorithms here: https://github.com/OpenRefine/OpenRefine/wiki/Clustering-In-Depth
# -*- coding: utf-8 -*-
import re, string
from unidecode import unidecode
PUNCTUATION = re.compile('[%s]' % re.escape(string.punctuation))
class Fingerprinter(object):
'''
Python implementation of Google Refine fingerprinting algorithm described here: