cdxker cdxker
[
{
"name": "SofleKeyboard",
"author": "Josef Adamcik",
"switchMount": "cherry"
},
[
{
"y": 0.2,
"x": 3,
@cdxker
cdxker / jobs_scraper.py
Created January 23, 2024 05:23
Scrapes YC job information and puts the descriptions into a Trieve collection.
import bs4 as bs
import urllib.request
import time
import trieve_python_client as trieve
from trieve_python_client.api_client import ApiClient
from trieve_python_client.rest import ApiException
from trieve_python_client.configuration import Configuration
api_key = "tr-********************************"
dataset_id = "************************************"
@cdxker
cdxker / reindex_collection.py
Last active February 20, 2024 01:53
How to reindex a collection if Qdrant point IDs get lost, or to change the embedding model.
import os
import csv
from trieve_client import AuthenticatedClient
from trieve_client.api.chunk import update_chunk
from trieve_client.models import UpdateChunkData, ReturnCreatedChunk
from trieve_client.models.error_response_body import ErrorResponseBody
api_key = "tr-XXXXXXXXXXXXXXXXXXXXXXXXXXX"
@cdxker
cdxker / diesel_async_with_struct.rs
Last active March 13, 2024 10:45
Async Diesel with Vec error
use diesel_async::RunQueryDsl;
struct A;
impl A {
fn first() {
print!("hi");
}
}
fn main() {
@cdxker
cdxker / .env
Last active April 1, 2024 20:44
Supa fast bulk_create script with https://trieve.ai and bun.js (unsupported)
API_URL="http://api.trieve.ai/api"
API_KEY="tr-***************"
ORGANIZATION_ID="************************************"
# Optional
DATASET_ID="*************" # If it doesn't exist, one is created from the organization ID
# If QDRANT information doesn't exist, it just uses the defaults in trieve
QDRANT_URL="https://<my-qdrant-ip>:6334"
QDRANT_API_KEY="my-qdrant-api-key"
@cdxker
cdxker / amazonAbo.ts
Created May 7, 2024 22:07
AmazonAboWithImages
import * as fs from "fs";
import * as readline from "readline";
import * as path from "path";
import { ChunkApi, Configuration } from "@devflowinc/trieve-js-ts-client";
interface LanguageTaggedValue {
language_tag: string;
value: string;
}
@cdxker
cdxker / kubedf
Created May 22, 2024 20:27 — forked from redmcg/kubedf
Bash script to show k8s PVC usage
#!/usr/bin/env bash
NODESAPI=/api/v1/nodes
function getNodes() {
kubectl get --raw $NODESAPI | jq -r '.items[].metadata.name'
}
function getPVCs() {
jq -s '[flatten | .[].pods[].volume[]? | select(has("pvcRef")) | '\
@cdxker
cdxker / index.html
Last active June 15, 2024 03:13
Website
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>cdxker</title>
<script src="https://cdn.tailwindcss.com"></script>
<style>
@cdxker
cdxker / index.html
Created June 19, 2024 06:53
Trieve Search Spam test
wrk https://api.trieve.ai/api/chunk/search -t12 -d30s -c800 -s post_search.lua
@cdxker
cdxker / ingestion_guide.md
Last active July 25, 2024 21:02
How to make any ingestion pipeline scale fast and how to run the pipeline.

Take, for example, this simplified pseudocode for a Hacker News ingest.

import requests

def get_post(id):
  
  return requests.get(f"https://hacker-news.firebaseio.com/v0/item/{id}.json?print=pretty").json()
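
The preview of the guide ends here, so what follows is not the author's method. As a minimal sketch of one common way to speed up a per-item fetch like `get_post`, the calls can be fanned out over a thread pool. The helper name `fetch_many` and the `max_workers` default are assumptions made for illustration. The fetch function is passed in as an argument so the helper can be exercised without network access.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List


def fetch_many(
    fetch: Callable[[int], dict],
    ids: Iterable[int],
    max_workers: int = 8,  # hypothetical default; tune for the API being called
) -> List[dict]:
    """Call fetch(id) for every id concurrently and return results in input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map yields results in the same order as the input ids,
        # even when the underlying calls finish out of order.
        return list(pool.map(fetch, ids))
```

With the `get_post` function above, this would be called as `fetch_many(get_post, range(1, 101))`. Threads suit this case because each call spends most of its time waiting on the network rather than on the CPU.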