Andrew Kornilov frutik

frutik / bloated_indices
Last active October 22, 2024 20:48
useful postgresql queries - when your postgres went crazy
SELECT
c.relname AS index_name,
pg_size_pretty(pg_relation_size(c.oid)) AS index_size,
pg_size_pretty(pg_total_relation_size(c.oid) - pg_relation_size(c.oid)) AS index_bloat_size,
ROUND((pg_total_relation_size(c.oid) - pg_relation_size(c.oid)) / pg_relation_size(c.oid)::numeric * 100, 2) AS bloat_percentage
FROM
pg_class c
JOIN
pg_namespace n ON c.relnamespace = n.oid
WHERE
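import onnxruntime as ort
from transformers import AutoTokenizer

# Load the exported ONNX model and its tokenizer from the local directory.
session = ort.InferenceSession("./bge-small-en/model.onnx")
tokenizer = AutoTokenizer.from_pretrained("./bge-small-en")
# Tokenize to NumPy arrays, padding to the longest sequence in the batch.
inputs = tokenizer("hello world.", padding="longest", return_tensors="np")
# Wrap each input array in an OrtValue, keyed by its tokenizer output name.
inputs_onnx = {key: ort.OrtValue.ortvalue_from_numpy(value) for key, value in inputs.items()}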
https://brandur.org/fragments/postgres-partitioning-2022
https://pganalyze.com/blog/postgresql-partitioning-django
https://django-postgres-extra.readthedocs.io/en/master/table_partitioning.html
https://hevodata.com/learn/postgresql-partitions/
https://www.2ndquadrant.com/en/blog/postgresql-12-foreign-keys-and-partitioned-tables/
https://www.postgresql.org/docs/current/ddl-partitioning.html#DDL-PARTITIONING-DECLARATIVE-LIMITATIONS
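def divide_chunks(l, n):
    # Yield successive n-sized chunks from list l.
    for i in range(0, len(l), n):
        yield l[i:i + n]

# Placeholder input (assumption): the original text is not shown in the source.
a = 'One. Two. Three. Four. Five.'
# Split into non-empty, trimmed sentences.
b = [i.strip() for i in a.split('.') if i.strip()]
# Group the sentences three at a time.
c = list(divide_chunks(b, 3))
# Rejoin each group; the appended '' restores the trailing period.
d = ['. '.join(i + ['']).strip() for i in c]
y = '\n\n'.join(d)
print(y)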
kubectl get --raw "/apis/metrics.k8s.io/v1beta1/nodes" -v=8 | python -m json.tool
Must have
- Google Analytics / Tag Manager / Funnels
- Set up and maintain pipelines for importing analytical data into a data lake of your choice
- Define KPIs and metrics that will help us understand our visitors better
Nice to have
- BigQuery
https://medium.com/@kelvin.lu.au/compare-pdf-question-answering-with-openai-and-google-vertexai-46638d62327b
https://medium.com/@kelvin.lu.au/what-we-need-to-know-before-adopting-a-vector-database-85e137570fbb
https://medium.com/@kelvin.lu.au/disadvantages-of-rag-5024692f2c53
https://medium.com/@Ratnaparkhi/how-the-search-technology-is-evolving-88607f5efb9e
cat en_esci.json | grep '"Clothing"' | grep -E '"Men"|"Women"' | jq -c '. | [.category, .image]' | grep -v '],""]' > clothing.txt
>>> from transformers.tools import HfAgent
>>> a = HfAgent("https://api-inference.huggingface.co/models/bigcode/starcoder")
>>> text = """Ukraine says Friday's missile strike on the headquarters of Russia's Black Sea fleet in Crimea was timed to coincide with a meeting of naval officials.
The fleet, based in the port city of Sevastopol, is seen as the best of Russia's navy.
A Ukrainian military source told the BBC that Friday's attack was carried out using Storm Shadow missiles, which are supplied by Britain and France."""
>>> a.run("Can you summarize `text` for me", text=text)
import spacy
nlp = spacy.blank("nl")
ruler = nlp.add_pipe("entity_ruler")
ruler.from_disk('fb.jsonl')
doc = nlp("I like Ralph Lauren and fruit of THE LOOM")
print([(ent.text, ent.label_) for ent in doc.ents])