🏗️
Building information retrieval systems...

Habeeb Shopeju HAKSOAT

HAKSOAT / SearchEngineer.md
Created July 17, 2021 22:11 — forked from morria/SearchEngineer.md
Search Engineer

Search Relevance Engineer

Working with the Search team, you'll be applying your background in Information Retrieval, Machine Learning or Data Mining to run experiments and develop products that have a provable impact on the Etsy marketplace. You'll be analyzing data, understanding language, developing new algorithms and building large-scale distributed systems.

Our team is responsible for creating and optimizing the best experiences for buyers and getting the best performance for sellers. Our work focuses on improvements to search ranking, query understanding, spelling correction, autocompletion and query intent recognition.

Requirements

  • Strong background in Machine Learning, Statistics, Information Retrieval
HAKSOAT / text_preprocessing.py
Created April 15, 2023 10:47 — forked from jiahao87/text_preprocessing.py
Full code for preprocessing text
from bs4 import BeautifulSoup
import spacy
import unidecode
from word2number import w2n
import contractions
nlp = spacy.load('en_core_web_md')
# Negation words to exclude from spaCy's stop-word list so they survive stop-word removal
deselect_stop_words = ['no', 'not']
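
The preview above stops before `deselect_stop_words` is used, so here is a minimal, dependency-free sketch of why such a list can matter. It is not the gist's actual code: `STOP_WORDS` is a tiny hypothetical stand-in for spaCy's much larger list, and `remove_stop_words` is an illustrative helper invented for this example.

```python
# Hypothetical stand-in for spaCy's English stop-word list (the real one is far larger).
STOP_WORDS = {"a", "an", "the", "is", "was", "no", "not", "this", "it"}

# Same name as in the gist: words that should NOT be treated as stop words.
deselect_stop_words = ["no", "not"]


def remove_stop_words(tokens, stop_words, keep=()):
    """Drop stop words from tokens, except those listed in keep."""
    effective = set(stop_words) - set(keep)
    return [t for t in tokens if t.lower() not in effective]


tokens = "this is not a good movie".split()

# Without the exclusion list, the negation is lost and the meaning flips.
naive = remove_stop_words(tokens, STOP_WORDS)

# With the exclusion list, "not" is preserved.
negation_aware = remove_stop_words(tokens, STOP_WORDS, keep=deselect_stop_words)

print(naive)           # ['good', 'movie']
print(negation_aware)  # ['not', 'good', 'movie']
```

With spaCy itself, one common way to get the same effect is to clear the `is_stop` flag on each word's vocabulary entry, for example `nlp.vocab[word].is_stop = False`; whether the full gist does exactly this is not visible in the preview.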