A Pen by Shubhanshu Mishra on CodePen.
"""Faster Implementation of Unsupervised Query Segmentation.
Uses vectorized operations
- author: @napsternxg
Unsupervised Query Segmentation Using only Query Logs [Mishra et al. 2011]
https://www.microsoft.com/en-us/research/wp-content/uploads/2011/01/pp0295-mishra.pdf
from flask import Flask, jsonify, request, render_template
from queued_map import example_items

app = Flask(__name__)


@app.get("/")
@app.get("/<int:n>")
def home(n: int = 10):
    output = example_items(n)
import asyncio


def async_decorator(acreate_fn):
    async def _f(*args, **kwargs):
        print(f"Decorated fn: {args=}, {kwargs=}. Sleeping.")
        await asyncio.sleep(0.1)
        return await acreate_fn(*args, **kwargs)

    return _f
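A minimal usage sketch for the decorator above. The coroutine `acreate` and its arguments are hypothetical, added only for illustration, and `functools.wraps` is an assumption that is not in the original snippet (it keeps the wrapped function's name and docstring).

```python
import asyncio
import functools


def async_decorator(acreate_fn):
    """Wrap a coroutine function: log the call, pause briefly, then await it."""

    @functools.wraps(acreate_fn)  # assumption: not in the original snippet
    async def _f(*args, **kwargs):
        print(f"Decorated fn: {args=}, {kwargs=}. Sleeping.")
        await asyncio.sleep(0.1)
        return await acreate_fn(*args, **kwargs)

    return _f


@async_decorator
async def acreate(prompt, n=1):  # hypothetical coroutine, for illustration only
    return [f"{prompt}-{i}" for i in range(n)]


# The decorated function is still a coroutine function, so it must be awaited
# (here via asyncio.run).
result = asyncio.run(acreate("hello", n=2))
print(result)
```

Because `_f` is itself declared with `async def`, callers use the decorated function exactly as they would the original coroutine.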
from copy import deepcopy
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
from scipy import sparse
from joblib import dump, load
import joblib
import time
mkdir food.com
cd food.com
wget https://www.food.com/sitemap.xml
for url in $(cat sitemap.xml | grep "<loc>https://www.food.com/sitemap-" | sed -n 's:.*<loc>\(.*\)</loc>.*:\1:p');
  do echo "Download: $url";
done
for url in $(cat sitemap.xml | grep "<loc>https://www.food.com/sitemap-" | sed -n 's:.*<loc>\(.*\)</loc>.*:\1:p');
  do wget "$url";
done
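The `grep` and `sed` pipeline above extracts nested sitemap URLs from `<loc>` elements. The same extraction can be sketched in Python with the standard library's XML parser. This is an illustration only: the function name `nested_sitemap_urls`, the sample XML, and the sample file names are made up, and the sketch assumes the index uses the standard sitemaps.org namespace.

```python
import xml.etree.ElementTree as ET

# Assumption: the sitemap index declares the standard sitemaps.org namespace.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def nested_sitemap_urls(xml_text, prefix="https://www.food.com/sitemap-"):
    """Return the <loc> URLs in a sitemap index that start with `prefix`."""
    root = ET.fromstring(xml_text)
    locs = (
        loc.text.strip()
        for loc in root.iterfind(".//sm:loc", SITEMAP_NS)
        if loc.text
    )
    return [url for url in locs if url.startswith(prefix)]


# Made-up sample data; the real sitemap.xml is not reproduced here.
sample = """<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap><loc>https://www.food.com/sitemap-1.xml.gz</loc></sitemap>
  <sitemap><loc>https://www.food.com/other.xml</loc></sitemap>
</sitemapindex>"""

urls = nested_sitemap_urls(sample)
print(urls)
```

Parsing the XML avoids depending on each `<loc>` element sitting on its own line, which the line-oriented `sed` expression requires.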
from pathlib import Path
import torch
from transformers import CLIPProcessor, CLIPTextModelWithProjection
from accelerate import Accelerator
from datasets import Dataset
import pandas as pd
import numpy as np
"""
pip install pypdf
"""
from pypdf import PdfWriter


def main(args):
    merger = PdfWriter()
    file_paths = args.input_files
    for pdf in file_paths:
diff --git a/sentence_transformers/SentenceTransformer.py b/sentence_transformers/SentenceTransformer.py
index e44e573..ae4dea4 100644
--- a/sentence_transformers/SentenceTransformer.py
+++ b/sentence_transformers/SentenceTransformer.py
@@ -16,6 +16,7 @@ from torch.optim import Optimizer
 from torch.utils.data import DataLoader
 import torch.multiprocessing as mp
 from tqdm.autonotebook import trange
+from tqdm.autonotebook import tqdm
 import math
"""Spacy Embedding Transformer for Sklearn pipeline
Install spacy and floret
```bash
pip install spacy floret scikit-learn
```
First download the vectors from:
```bash |