DIY - build your own duckdb local helper with aichat and ollama using RAG

Caveat: The following "works for me"™, and it is kept simple because I implemented it on my 2016 refurbished Thinkpad P50. Also, I've done only minor testing of prompts; most have worked, but YMMV.

This recipe assumes that you already have ollama and aichat installed on your local machine. If you don't have them installed, you can follow the instructions at the following links:

- ollama: https://ollama.com/download
- aichat: https://github.com/sigoden/aichat

There are different ways to install those two, so pick the one you are most comfortable with. I installed them directly on my laptop, but you might prefer to use docker instead; a sketch of that route is shown below.
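
For reference, this is a minimal sketch of the docker route, assuming the official ollama/ollama image (adjust the port and volume to taste):

$ docker run -d -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama
$ docker exec -it ollama ollama pull smollm2:1.7b  # pull models inside the container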

Once you have those two installed, you will need at least one model for content embedding (e.g. mxbai-embed-large) and an LLM to interact with and ask duckdb-related questions (e.g. smollm2).

I am also going to assume that you already know how to use the command line, git, and GitHub, which we'll use to retrieve the content needed to create the embeddings.

Step 1: Get the ollama models

Assuming that your ollama is already installed, you can pull the models you need by running the following commands:

$ ollama pull mxbai-embed-large # This will be used for content embedding
$ ollama pull smollm2:1.7b # This will be used for the LLM model

Make sure that the ollama service is running, because aichat requires it. To check, run the following command:

$ ollama run smollm2:1.7b
>>> Send a message (/? for help)

If you get a prompt, the service is running, and you can exit it using /bye. If, on the other hand, you get an error, check the documentation to see whether you followed all the recommended steps. The docs for Linux explain how to set up the service, and even how to configure CUDA if you have an NVIDIA GPU.
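
You can also check the service non-interactively: ollama exposes a REST API on port 11434 by default, and asking it to list the local models should return a JSON document:

$ curl -s http://localhost:11434/api/tags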

Step 2: Configure aichat

You will need to configure aichat to use the models you just downloaded. On Linux, you can do this by editing the ~/.config/aichat/config.yaml file (see the aichat documentation for the locations on other operating systems).

Open the file with your favorite text editor and add the models we just downloaded. A minimal version is shown below.

# see https://github.com/sigoden/aichat/blob/main/config.example.yaml

clients:
- type: openai-compatible
  name: ollama
  api_base: http://localhost:11434/v1
  models:
  - name: smollm2:1.7b
  - name: mxbai-embed-large:latest
    type: embedding

keybindings: vi  # You can also use: emacs if you prefer
save_session: true
model: ollama:smollm2:1.7b  # Our default LLM model
rag_embedding_model: ollama:mxbai-embed-large:latest  # Our default embedding model
rag_top_k: 5  # Consider the top 5 results

Make sure that this configuration is working by running:

$ aichat
Welcome to aichat 0.28.0
Type ".help" for additional help.
> .info
model                   ollama:smollm2:1.7b
temperature             null
top_p                   null
use_tools               null
max_output_tokens       null
save_session            true
compress_threshold      4000
rag_reranker_model      null
rag_top_k               5
dry_run                 false
function_calling        true
stream                  true
save                    false
keybindings             vi
wrap                    no
wrap_code               false
highlight               true
light_theme             false
config_file             /home/username/.config/aichat/config.yaml
env_file                /home/username/.config/aichat/.env
roles_dir               /home/username/.config/aichat/roles
sessions_dir            /home/username/.config/aichat/sessions
rags_dir                /home/username/.config/aichat/rags
macros_dir              /home/username/.config/aichat/macros
functions_dir           /home/username/.config/aichat/functions
messages_file           /home/username/.config/aichat/messages.md

> 

Step 3: Clone the duckdb-web repository

$ cd /path/to/project # Change to a path of your liking
# Perform a shallow clone
$ git clone --depth 1 https://github.com/duckdb/duckdb-web.git

We are doing a shallow clone because for this purpose we do not need the full history of the documentation. The full path to the repository should be /path/to/project/duckdb-web, and we will use the Markdown documents in there to create the embeddings.

Caveat: The duckdb-web repository contains documentation for current and older versions of duckdb, as well as other Markdown files that you might not want to include in the embeddings. If you want to be more careful than I was, filter the documents you use for the embeddings, perhaps by copying them to a new directory and doing some cleanup, as sketched below.
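
As an example, here is a minimal sketch of such a cleanup, assuming you only want the Markdown files under the docs/ subtree (inspect the repository first, since its exact layout may differ):

$ mkdir -p /path/to/project/duckdb-docs
$ rsync -am --include='*/' --include='*.md' --exclude='*' \
    /path/to/project/duckdb-web/docs/ /path/to/project/duckdb-docs/

You would then point the next step at /path/to/project/duckdb-docs/**/*.md instead.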

Step 4: Create the embeddings

We will do something analogous to what is shown in the aichat RAG documentation, using the repo we cloned in the previous step:

$ aichat --rag duckdb-web
⚙ Initializing RAG...
Select embedding model: ollama:mxbai-embed-large:latest
> Set chunk size: 1500
> Set chunk overlay: 75
> Add documents: /path/to/project/duckdb-web/**/*.md
Load /path/to/project/duckdb-web/microbenchmarks/README.md [1/1912]
...  # after some time you should see the following message
✓ Saved RAG to '/home/username/.config/aichat/rags/duckdb-web.yaml'.
@duckdb-web> .exit  # Exit for now

Let's take a look at the RAG file:

$ less /home/username/.config/aichat/rags/duckdb-web.yaml

embedding_model: ollama:mxbai-embed-large:latest
chunk_size: 1500
chunk_overlap: 75
reranker_model: null
top_k: 5
batch_size: 100
next_file_id: 1
document_paths:
- /path/to/project/duckdb-web/**/*.md
files:
  0:
    hash: e76c4f0c9124a1213c94ccbc85f04885a79e8c3efc989ab87657fdb9fc68b493
... # The rest of the file

Step 5: Test your local helper

First, let's try something simple:

$ aichat --rag duckdb-web
@duckdb-web> how can I load one or more parquet files into a table in duckdb?
    To load one or more Parquet files into a table in DuckDB, you can use the `read_parquet` function. Here's an example of how to do it:

    1. Specify which Parquet files you want to read using a list parameter. For instance, if your three Parquet files are named "file1.parquet", "file2.parquet", and "file3.parquet", you can use the following code:
    ```sql
    SELECT * FROM read_parquet(['file1.parquet', 'file2.parquet', 'file3.parquet']);
    ```

    2. Also, ensure that the Parquet files have the same schema, or any errors will occur when trying to import them into a database table.
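
As a side note to that answer: if your Parquet files do not share exactly the same schema, DuckDB can still combine them by column name via the union_by_name parameter of read_parquet. A quick way to try it from the shell (assuming the duckdb CLI is installed; the file names are placeholders):

$ duckdb -c "SELECT * FROM read_parquet(['file1.parquet', 'file2.parquet'], union_by_name = true);"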

Now we will ask for something a bit more complex:

@duckdb-web> Show me how I can do vertical stacking of results in duckdb

DuckDB allows you to vertically stack results by column name rather than order position. You can use `SELECT` statements like the ones above or a query that references columns with their names instead of positions, like this:

1. Stack using `UNION BY NAME`:

    ```sql
    SELECT ...
    FROM table;
    
    UNION ALL BY NAME 
    SELECT ...
    FROM other_table;
    ```

2. The new syntax `INSERT INTO table BY NAME` also allows you to stack data by column name:

    ```sql
    CREATE TABLE year_info (year INTEGER, status VARCHAR);
    
    INSERT INTO year_info BY NAME 
    SELECT ...
    FROM some_other_table;
    ```

You can now ask your local helper any questions you have about duckdb, and it will try to provide you with the best answer it can find in the documentation.

Extras:

Keeping your local helper up-to-date

To keep the helper up-to-date, first pull the latest version of the documentation:

$ cd /path/to/project/duckdb-web
$ git pull origin

Then, you can rebuild the RAG with the new documents using aichat:

$ aichat --rag duckdb-web
@duckdb-web> .rebuild rag

After that, you can test your helper with the new information.
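
If you update often, you could wrap both steps in a small script. This is only a sketch: the path is a placeholder, and feeding REPL commands to aichat via stdin is an assumption you should verify with your aichat version (otherwise, run the two REPL commands manually):

#!/usr/bin/env bash
# update-duckdb-rag.sh - refresh the docs, then rebuild the RAG
set -euo pipefail
cd /path/to/project/duckdb-web
git pull origin
# assumption: aichat reads REPL commands from stdin
printf '.rebuild rag\n.exit\n' | aichat --rag duckdb-web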

Making your helper respond in multiple languages

A good model that can understand and respond in multiple languages is gemma3. I've used gemma3:12b for testing and it works well.

First, download the model:

$ ollama pull gemma3:12b

Then, configure aichat to use the model:

clients:
- type: openai-compatible
  name: ollama
  api_base: http://localhost:11434/v1
  models:
  - name: smollm2:1.7b
  - name: gemma3:12b   # Add this line
  - name: mxbai-embed-large:latest
    type: embedding

keybindings: vi  # You can also use: emacs if you prefer
save_session: true
model: ollama:smollm2:1.7b  # Our default LLM model
rag_embedding_model: ollama:mxbai-embed-large:latest  # Our default embedding model
rag_top_k: 5  # Consider the top 5 results

Let's test it by asking questions in Spanish and Italian (my Italian is very basic, so it might not be the best example); English translations of the questions are given in brackets:

$ aichat
Welcome to aichat 0.28.0 
Type ".help" for additional help.
> .model ollama:gemma3:12b
> .rag duckdb-web
@duckdb-web> Explicame como puedo crear una tabla en duckdb leyendo un archivo CSV
[English: "Explain to me how I can create a table in duckdb by reading a CSV file"]

Para crear una tabla en DuckDB leyendo un archivo CSV, puedes usar la función `read_csv`. Por ejemplo, para leer el archivo `duckbox-cleaned.csv` con un delimitador '|', usarías el siguiente comando SQL:

    ```sql
    FROM read_csv('duckbox-cleaned.csv', delim = '|');
    ```

  También puedes exportar una tabla a un archivo CSV usando el comando `COPY`:

    ```sql
    COPY (FROM read_csv('duckbox-cleaned.csv', delim = '|')) TO 'out.csv';
    ```
@duckdb-web> Sai se duckdb supporta i filtri bloom per gli archivi parquet?
[English: "Do you know if duckdb supports bloom filters for parquet files?"]

Sì, DuckDB supporta i filtri bloom per gli archivi Parquet. Secondo il testo, DuckDB ora è in grado di leggere e scrivere filtri bloom Parquet.

@duckdb-web> Come posso leggere un file JSON in una tabella in duckdb?
[English: "How can I read a JSON file into a table in duckdb?"]

Ci sono diversi modi per leggere un file JSON in una tabella in DuckDB:

1.  **Utilizzando la sintassi implicita:** Puoi semplicemente selezionare da un file JSON come se fosse una tabella: `SELECT * FROM 'todos.json';` DuckDB riconoscerà l'estensione `.json` e chiamerà `read_json('todos.json')`.
2.  **Utilizzo di `read_json` con opzioni personalizzate:** Puoi usare la funzione `read_json` per specificare il formato e i tipi di colonna: `SELECT * FROM read_json('todos.json', format='array', columns={userId: 'UBIGINT', id: 'UBIGINT', title: 'VARCHAR', completed: 'BOOLEAN'});`
3.  **Creazione di una tabella e copia dei dati:**  Puoi creare una tabella con uno schema definito e poi copiare i dati dal file JSON nella tabella:
    ```sql
    CREATE TABLE todos(userId UBIGINT, id UBIGINT, title VARCHAR, completed BOOLEAN);
    COPY todos FROM 'todos.json';
    ```
    In alternativa, puoi creare la tabella direttamente dalla query: `CREATE TABLE todos AS SELECT * FROM 'todos.json';`
4.  **Utilizzo dell'estensione JSON:** Puoi installare e caricare l'estensione JSON con i comandi `INSTALL 'json';` e `LOAD 'json';` e poi semplicemente selezionare dal file JSON.  Tuttavia, l'estensione è già inclusa nella distribuzione binaria di DuckDB.

Using aichat through a web browser

Simply run:

$ aichat --serve
Chat Completions API: http://127.0.0.1:8000/v1/chat/completions
Embeddings API:       http://127.0.0.1:8000/v1/embeddings
Rerank API:           http://127.0.0.1:8000/v1/rerank
LLM Playground:       http://127.0.0.1:8000/playground
LLM Arena:            http://127.0.0.1:8000/arena?num=2

Then go to http://localhost:8000 in your browser, and you will be able to interact with your RAG, as well as use a number of other features.
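
Because the server speaks the OpenAI-compatible API, you can also query it programmatically. A sketch with curl (the model name in the payload is an assumption; check the playground to see which names your setup exposes):

$ curl -s http://127.0.0.1:8000/v1/chat/completions \
    -H 'Content-Type: application/json' \
    -d '{"model": "ollama:smollm2:1.7b", "messages": [{"role": "user", "content": "What is duckdb?"}]}'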

A note on rerankers

Even though aichat supports using a reranker, and there are reranker models available for ollama, ollama itself does not (yet) support running rerankers [1], so I omitted that part from the notes above.

[1] See: ollama/ollama#4510 and ollama/ollama#3749
