@sharkdeng
sharkdeng / slk-581.py
Created October 18, 2020 06:02
SLK-581 code
def slk581(name, birthdate, gender):
    result = ''
    first_name, last_name = name.split(' ')
    # Take the 2nd, 3rd, and 5th letters of a record's family name (surname)
    if len(last_name) >= 5:
        result += last_name[1] + last_name[2] + last_name[4]
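The preview above stops after the surname step. A minimal sketch of how the whole key might be assembled follows. It assumes the usual SLK-581 layout (surname letters 2, 3, 5; given-name letters 2, 3; date of birth as DDMMYYYY; a one-digit sex code) and, as an assumption not shown in the gist, pads missing letter positions with '2' and ignores non-alphabetic characters. The names `pick_letters` and `slk581_sketch` are hypothetical, and a completely missing name is not handled.

```python
from datetime import date


def pick_letters(name, positions, missing="2"):
    """Return the letters at the given 1-based positions of `name`.

    Non-alphabetic characters (hyphens, apostrophes, spaces) are skipped, and
    positions past the end of the name are padded with `missing` (assumption).
    """
    letters = [c for c in name.upper() if c.isalpha()]
    return "".join(
        letters[p - 1] if p <= len(letters) else missing for p in positions
    )


def slk581_sketch(first_name, last_name, birthdate, sex_code):
    """Assemble a 14-character key: 3 surname + 2 given + 8 date + 1 sex."""
    surname_part = pick_letters(last_name, (2, 3, 5))
    given_part = pick_letters(first_name, (2, 3))
    return surname_part + given_part + birthdate.strftime("%d%m%Y") + sex_code


print(slk581_sketch("Jane", "Citizen", date(1963, 5, 27), "2"))  # ITZAN270519632
```

Taking the two name parts as separate arguments avoids the `name.split(' ')` step, which fails to unpack when a name has more or fewer than two space-separated parts.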
@sharkdeng
sharkdeng / jaro.py
Last active October 19, 2020 02:46
Implementations of Jaro similarity
def jaro(val1, val2):
    '''
    Computes the Jaro similarity between 2 sequences from:
    Matthew A. Jaro (1989). Advances in record-linkage methodology
    as applied to matching the 1985 census of Tampa, Florida. Journal
    of the American Statistical Association. 84 (406): 414–20.
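The function body is cut off in this preview. As a rough sketch of the standard Jaro computation (not necessarily the gist author's code): count characters that match within a window of `max(len) // 2 - 1` positions, count half the out-of-order matches as transpositions, and average three ratios. The name `jaro_sketch` is hypothetical.

```python
def jaro_sketch(s1, s2):
    """Jaro similarity in [0.0, 1.0]; identical inputs score 1.0."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0

    # Two characters match only if they are equal and at most `window` apart.
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    for i, ch in enumerate(s1):
        lo = max(0, i - window)
        hi = min(len2, i + window + 1)
        for j in range(lo, hi):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0

    # Matched characters that appear in a different order, counted in pairs.
    seq1 = [c for c, m in zip(s1, matched1) if m]
    seq2 = [c for c, m in zip(s2, matched2) if m]
    transpositions = sum(a != b for a, b in zip(seq1, seq2)) // 2

    return (
        matches / len1 + matches / len2 + (matches - transpositions) / matches
    ) / 3


print(round(jaro_sketch("MARTHA", "MARHTA"), 4))  # 0.9444
```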
@sharkdeng
sharkdeng / jaro-wrinkle.py
Created October 19, 2020 02:47
Jaro-Winkler string similarity (an extension of Jaro that favors strings sharing a common prefix)
def jaro(val1, val2):
    '''
    Computes the Jaro similarity between 2 sequences from:
    Matthew A. Jaro (1989). Advances in record-linkage methodology
    as applied to matching the 1985 census of Tampa, Florida. Journal
    of the American Statistical Association. 84 (406): 414–20.
    Returns a value between 0.0 and 1.0.
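The Winkler step itself is not visible in this preview. A minimal sketch of the usual adjustment, assuming the common defaults of a 0.1 scaling factor and a prefix capped at 4 characters, takes an already computed Jaro score and raises it in proportion to the shared prefix. The name `winkler_adjust` and its parameters are hypothetical.

```python
def winkler_adjust(jaro_score, s1, s2, scaling=0.1, max_prefix=4):
    """Boost a Jaro score for strings that share a common prefix.

    `scaling` and `max_prefix` are the commonly used defaults (assumption);
    scaling * max_prefix should not exceed 1, or the result can pass 1.0.
    """
    prefix = 0
    for a, b in zip(s1[:max_prefix], s2[:max_prefix]):
        if a != b:
            break
        prefix += 1
    return jaro_score + prefix * scaling * (1.0 - jaro_score)


# "MARTHA" and "MARHTA" have a Jaro score of 17/18 and share the prefix "MAR".
print(round(winkler_adjust(17 / 18, "MARTHA", "MARHTA"), 4))  # 0.9611
```

Because the boost is a fraction of the remaining distance to 1.0, a score of 1.0 stays at 1.0 and pairs with no shared prefix are left unchanged.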
@sharkdeng
sharkdeng / crop.py
Created November 2, 2020 05:56
Crop the white background from an image
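import numpy as np

def crop_white(image: np.ndarray, value: int = 255) -> np.ndarray:
    # Expects an H x W x 3 uint8 image.
    assert image.shape[2] == 3
    assert image.dtype == np.uint8
    # Rows and columns with at least one channel value below `value`.
    ys, = (image.min((1, 2)) < value).nonzero()
    xs, = (image.min(0).min(1) < value).nonzero()
    # Nothing darker than the background: return the image unchanged.
    if len(xs) == 0 or len(ys) == 0:
        return image
    return image[ys.min():ys.max() + 1, xs.min():xs.max() + 1]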
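To see what the two reductions do, here is a small sketch that applies the same steps to a synthetic image. The 5 x 6 canvas and the patch position are made up for illustration.

```python
import numpy as np

# White 5 x 6 RGB canvas with a black patch at rows 1-2, columns 2-4.
image = np.full((5, 6, 3), 255, dtype=np.uint8)
image[1:3, 2:5] = 0

# A row (or column) has content if any channel of any pixel in it is below 255.
row_has_content = image.min(axis=(1, 2)) < 255         # shape (5,)
col_has_content = image.min(axis=0).min(axis=1) < 255  # shape (6,)

ys, = row_has_content.nonzero()
xs, = col_has_content.nonzero()

# Slice from the first to the last non-white row and column, inclusive.
cropped = image[ys.min():ys.max() + 1, xs.min():xs.max() + 1]
print(ys, xs, cropped.shape)  # [1 2] [2 3 4] (2, 3, 3)
```

The result is a bounding-box crop, so white pixels that lie between non-white ones are kept.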