We typically do this through Hetzner, but EC2 or Compute Engine instances on GCP also work. Hetzner is preferred unless you have a strong reason to avoid it.
The preferred region is US-West.
For the operating system, pick Ubuntu.
const createDatasetPayload = JSON.parse("{\"dataset_name\":\"test14\",\"organization_id\":\"95b7c53e-2c24-49a1-97fa-c87188c7324b\",\"server_configuration\":{\"LLM_BASE_URL\":\"\",\"LLM_DEFAULT_MODEL\":\"\",\"EMBEDDING_BASE_URL\":\"https://embedding.trieve.ai\",\"EMBEDDING_MODEL_NAME\":\"jina-base-en\",\"MESSAGE_TO_QUERY_PROMPT\":\"\",\"RAG_PROMPT\":\"\",\"EMBEDDING_SIZE\":768,\"N_RETRIEVALS_TO_INCLUDE\":8,\"DUPLICATE_DISTANCE_THRESHOLD\":1.1,\"DOCUMENT_UPLOAD_FEATURE\":true,\"DOCUMENT_DOWNLOAD_FEATURE\":true,\"COLLISIONS_ENABLED\":false,\"FULLTEXT_ENABLED\":true,\"QDRANT_COLLECTION_NAME\":null,\"EMBEDDING_QUERY_PREFIX\":\"Search for: \",\"USE_MESSAGE_TO_QUERY_PROMPT\":false,\"FREQUENCY_PENALTY\":null,\"TEMPERATURE\":null,\"PRESENCE_PENALTY\":null,\"STOP_TOKENS\":null,\"INDEXED_ONLY\":false,\"LOCKED\":false},\"client_configuration\":\"{}\"}");
for (let i = 0; i < 500; i++) {
  createDatasetPayload.dataset_name = `test_${i}`;
  fetch("http://localhost:8090/api/dataset", {
    "headers": {
      "accept":
git branch | grep -v main | xargs git branch -D
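The one-liner above deletes every local branch whose name does not contain "main", which also means a branch such as maintenance is silently skipped. A slightly more defensive variant could look like the sketch below. It assumes the branch to keep is named exactly `main` by default, and the helper name `branches_to_delete` is made up for illustration.

```shell
# Hypothetical helper: print every branch name read from stdin except the one to keep.
# Usage: branches_to_delete [KEEP]   (KEEP defaults to "main")
branches_to_delete() {
    keep="${1:-main}"
    # -x matches the whole line, -F treats the name as a literal string,
    # so "maintenance" is not mistaken for "main".
    # grep exits non-zero when nothing is printed; "|| true" hides that.
    grep -vxF -- "$keep" || true
}

# Dry run: list what would be deleted.
#   git for-each-ref --format='%(refname:short)' refs/heads/ | branches_to_delete
# Delete (xargs -r skips the call when the list is empty; -r is a GNU extension):
#   git for-each-ref --format='%(refname:short)' refs/heads/ | branches_to_delete | xargs -r git branch -D
```

`git for-each-ref` prints plain branch names without the `* ` marker that `git branch` puts in front of the current branch. Git still refuses to delete the branch that is currently checked out, so run this from the branch you want to keep.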
{
  "params": {
    "vectors": {
      "1024_vectors": {
        "size": 1024,
        "distance": "Cosine",
        "hnsw_config": {
          "on_disk": false
        },
        "quantization_config": {
[
  {
    "id": "df1a46f4-0737-427e-890c-69b10d5a4833",
    "link": "api-reference/chunk/get-recommended-chunks",
    "qdrant_point_id": "99ed2fa9-1bad-45be-b13b-171e9bd0c8f2",
    "created_at": "2024-06-21T05:04:57.406160",
    "updated_at": "2024-06-21T05:04:57.406160",
    "chunk_html": "Get Recommended Chunks\nGet Recommended Chunks\n\nGet recommendations of chunks similar to the positive samples in the request and dissimilar to the negative. You must provide at least one of either positive_chunk_ids or positive_tracking_ids.",
    "metadata": {
      "openapi": "post /api/chunk/recommend",
Trieve consists of several services, and it is convenient to start them all with tmuxp.
First, enter the trieve repository by running cd trieve from the directory where you cloned it.
Then, I recommend killing all of your running Docker containers with docker ps -q | xargs docker kill.
while IFS= read -r line; do redis-cli -u redis://:thisredispasswordisverysecureandcomplex@localhost:6379 LPUSH file_ingestion "$line"; done < ./output.txt
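The loop above pushes each line of output.txt onto the file_ingestion list. If you need this more than once, it could be wrapped in a small function along the lines of the sketch below. The name `push_lines` and the `REDIS_URL` variable in the usage comment are hypothetical; only `redis-cli -u` and `LPUSH` come from the command above.

```shell
# Hypothetical helper: run COMMAND [ARGS...] "<line>" once per non-empty line of FILE.
# Usage: push_lines FILE COMMAND [ARGS...]
push_lines() {
    file="$1"
    shift
    # Refuse to run without a command, otherwise each line would be executed itself.
    [ "$#" -ge 1 ] || { echo "usage: push_lines FILE COMMAND [ARGS...]" >&2; return 2; }
    # IFS= and -r keep leading whitespace and backslashes intact;
    # the "|| [ -n ... ]" part handles a last line with no trailing newline.
    while IFS= read -r line || [ -n "$line" ]; do
        [ -n "$line" ] || continue
        # </dev/null stops the command from consuming the rest of FILE.
        "$@" "$line" < /dev/null
    done < "$file"
}

# Example (REDIS_URL is an assumed environment variable holding the redis:// URL):
#   push_lines ./output.txt redis-cli -u "$REDIS_URL" LPUSH file_ingestion
```

Passing the command as arguments makes a dry run easy: replace `redis-cli -u "$REDIS_URL"` with `echo` to see exactly what would be pushed.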
docker ps -q | xargs docker kill
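When no containers are running, `docker ps -q` prints nothing and GNU `xargs` still calls `docker kill` with no arguments, which produces an error. A minimal sketch of a guard around the same two Docker commands follows. The function name `kill_running_containers` is made up for illustration.

```shell
# Hypothetical helper: kill all running containers, or do nothing if there are none.
kill_running_containers() {
    ids="$(docker ps -q)"
    if [ -z "$ids" ]; then
        echo "no running containers"
        return 0
    fi
    # $ids is intentionally unquoted so that each container ID becomes its own argument.
    docker kill $ids
}
```

After the cleanup, the services can be started from inside the trieve directory with `tmuxp load` pointed at the repository's tmuxp session file. The file name is not given in these notes, so check the repository for it.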
The first step is to list the available databases so you can find the one you want to describe and copy the connectionName from.
gcloud sql instances list
Assuming the database you want is called foo, copy its connectionName as follows.
gcloud sql instances describe foo | grep connectionName | awk '{print $2}' | xclip -selection clipboard
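As an alternative to the grep and awk pipeline, gcloud can print a single field directly through its `--format` flag. The sketch below wraps that in a function. The name `sql_connection_name` is hypothetical, and foo is the example instance name used above.

```shell
# Hypothetical helper: print only the connectionName of a Cloud SQL instance.
# Usage: sql_connection_name INSTANCE
sql_connection_name() {
    instance="$1"
    if [ -z "$instance" ]; then
        echo "usage: sql_connection_name INSTANCE" >&2
        return 2
    fi
    # value(...) prints just the requested field, so no post-processing is needed.
    gcloud sql instances describe "$instance" --format='value(connectionName)'
}

# Examples:
#   sql_connection_name foo | xclip -selection clipboard
#   cloud-sql-proxy "$(sql_connection_name foo)"   # assumes the v2 proxy binary
```

The second example assumes the v2 `cloud-sql-proxy` binary, which takes the connection name as a positional argument. The older v1 binary uses a different invocation.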
Then download the cloud-sql-proxy binary.
-- the output of this command will have a table with a pid column
-- find the migration you want to stop and copy its pid for step 2
SELECT *
FROM pg_stat_activity
WHERE state = 'active';
-- use the pid you copied from the previous output here instead of 7844
SELECT pg_cancel_backend(7844);
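If you would rather run the two steps from a shell than from a SQL console, they could be wrapped roughly as in the sketch below. It assumes `psql` is installed and that a `DATABASE_URL` environment variable holds the connection string. Both the variable and the function name `cancel_backend` are made up for illustration.

```shell
# Step 1 (run manually): list active backends and note the pid of the migration.
#   psql "$DATABASE_URL" -c "SELECT pid, state, query_start, query FROM pg_stat_activity WHERE state = 'active';"

# Step 2: hypothetical helper that cancels the query running on the given pid.
# Usage: cancel_backend PID
cancel_backend() {
    pid="$1"
    # Accept digits only, so nothing else can end up inside the SQL string.
    case "$pid" in
        ''|*[!0-9]*)
            echo "pid must be a positive integer" >&2
            return 2
            ;;
    esac
    psql "$DATABASE_URL" -c "SELECT pg_cancel_backend($pid);"
}

# Example: cancel_backend 7844
```

`pg_cancel_backend` only cancels the query that is currently running on that backend. If the session itself has to go, PostgreSQL also provides `pg_terminate_backend`, which closes the whole connection.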