import numpy as np

def crop_white(image: np.ndarray, value: int = 255) -> np.ndarray:
    # Expect an H x W x 3 uint8 image.
    assert image.shape[2] == 3
    assert image.dtype == np.uint8
    # Rows and columns holding at least one channel value below `value`.
    ys, = (image.min((1, 2)) < value).nonzero()
    xs, = (image.min(0).min(1) < value).nonzero()
    if len(xs) == 0 or len(ys) == 0:
        return image
    return image[ys.min():ys.max() + 1, xs.min():xs.max() + 1]
def jaro(val1, val2):
    '''
    Computes the Jaro similarity between 2 sequences from:
    Matthew A. Jaro (1989). Advances in record linkage methodology
    as applied to the 1985 census of Tampa Florida. Journal of the
    American Statistical Association. 84 (406): 414–20.
    Returns a value between 0.0 and 1.0.
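The preview above stops inside the docstring, so the author's implementation is not visible. As a minimal sketch of the standard Jaro similarity (not necessarily the author's code), the function below counts characters that match within a sliding window, counts how many of those matches are out of order, and averages three ratios. The name `jaro_similarity` and the choice to return 1.0 for two equal (including two empty) sequences are assumptions made for illustration.

```python
def jaro_similarity(s1, s2):
    """Sketch of the standard Jaro similarity; returns a float in [0.0, 1.0]."""
    if s1 == s2:
        return 1.0  # assumption: identical sequences (even empty ones) score 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0

    # Two items can only match if they are at most `window` positions apart.
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0

    for i in range(len1):
        lo = max(0, i - window)
        hi = min(len2, i + window + 1)
        for j in range(lo, hi):
            if not matched2[j] and s1[i] == s2[j]:
                matched1[i] = matched2[j] = True
                matches += 1
                break

    if matches == 0:
        return 0.0

    # Walk both sequences of matched items in order and count mismatches;
    # every two mismatches make one transposition.
    k = 0
    out_of_order = 0
    for i in range(len1):
        if matched1[i]:
            while not matched2[k]:
                k += 1
            if s1[i] != s2[k]:
                out_of_order += 1
            k += 1
    transpositions = out_of_order / 2

    return (matches / len1
            + matches / len2
            + (matches - transpositions) / matches) / 3
```

For the textbook pair "MARTHA" and "MARHTA", all six letters match and one transposition is counted, giving (1 + 1 + 5/6) / 3.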
def slk581(name, birthdate, gender):
    result = ''
    first_name, last_name = name.split(' ')
    # Take the 2nd, 3rd, and 5th letters of a record's family name (surname)
    if len(last_name) >= 5:
        result += last_name[1] + last_name[2] + last_name[4]
def soundex(string):
    # keep first letter of a string
    string = string.lower()
    result = string[0]
    string = string[1:]
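Only the first step of the function above is visible. As a minimal sketch of the commonly described American Soundex rules (not necessarily what the author wrote): keep the first letter, map the remaining consonants to digits, skip a digit that repeats the previous code, let `h` and `w` pass without separating repeated codes, and pad or truncate to four characters. The name `soundex_sketch` and the upper-cased first letter are assumptions; the original keeps the letter in lower case.

```python
def soundex_sketch(name):
    """Return a 4-character Soundex-style code such as 'R163'."""
    groups = (
        ("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
        ("l", "4"), ("mn", "5"), ("r", "6"),
    )
    codes = {ch: digit for letters, digit in groups for ch in letters}

    name = "".join(ch for ch in name.lower() if ch.isalpha())
    if not name:
        return ""

    result = name[0].upper()
    prev = codes.get(name[0], "")  # code of the previous significant letter
    for ch in name[1:]:
        if ch in "hw":
            continue  # h and w do not separate letters with the same code
        digit = codes.get(ch, "")  # vowels and y map to the empty code
        if digit and digit != prev:
            result += digit
            if len(result) == 4:
                break
        prev = digit  # a vowel resets prev, so a repeated code counts again
    return result.ljust(4, "0")
```

With these rules "Robert" and "Rupert" both encode to the same key, which is the point of a phonetic code for record linkage.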
<html>
<head>
</head>
<body>
<form>
<input type='text'>
<select id='voiceSelect'></select>
<input type="submit">
</form>
<script>
# Detect whether two photos are duplicates
# Step 1: choose the hash functions
import cv2
import imagehash
funcs = [
    imagehash.average_hash,
    imagehash.phash,
# You have to run this code from a .py file (it does not work inside an .ipynb notebook),
# and you have to wrap it in the `if __name__ == '__main__':` guard.
import multiprocessing
import time
from multiprocessing import Pool

if __name__ == '__main__':
    start = time.time()
    img_ids = df.image_id.values  # NumPy array of image ids
    pool = Pool(processes=multiprocessing.cpu_count())
    pool.map(crop_all_img, img_ids)  # (function, iterable)
# Source: https://www.kaggle.com/lopuhin/panda-2020-level-1-2
def crop_white(image: np.ndarray, value: int = 255) -> np.ndarray:
    assert image.shape[2] == 3
    assert image.dtype == np.uint8
    ys, = (image.min((1, 2)) < value).nonzero()
    xs, = (image.min(0).min(1) < value).nonzero()
    if len(xs) == 0 or len(ys) == 0:
        return image