Caveat: The following "works for me"™, and it is kept simple because I implemented it on my 2016 refurbished ThinkPad P50. Also, I've done only minor testing of prompts; the majority have worked, but YMMV.
This recipe assumes that you already have ollama and aichat installed on your local machine. If you don't have them installed, you can follow the instructions in the following links:
There are different ways to install those two, so pick the one you are most comfortable with. I installed them directly on my laptop, but you might prefer to use Docker instead.
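For reference, this is roughly what I would expect to work on Linux (a sketch only; check each project's installation docs, since the exact commands may change):
$ curl -fsSL https://ollama.com/install.sh | sh # Ollama's official install script
$ cargo install aichat # One of several ways to install aichat (needs a Rust toolchain)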
Once you have those two installed, you will need at least one model for content embedding (e.g. mxbai-embed-large) and an LLM to interact with and ask duckdb-related questions (e.g. smollm2).
I am also going to assume that you already know how to use the command line, Git, and GitHub, which we'll use for retrieving the content needed to create the embeddings.
Assuming that your ollama is already installed, you can pull the models you need by running the following commands:
$ ollama pull mxbai-embed-large # This will be used for content embedding
$ ollama pull smollm2:1.7b # This will be used for the LLM model
Make sure that the ollama service is running, because aichat will require it to be up. To check, run the following command:
$ ollama run smollm2:1.7b
>>> Send a message (/? for help)
If you got a prompt, then the service is running, and you can exit it using /bye. If, on the other hand, you got an error, check the documentation to see if you followed all the recommended steps. The docs for Linux indicate how to set up the service, and even how to configure CUDA if you have an NVIDIA GPU.
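You can also check the service without opening an interactive session. For example, on a default Linux install (systemd service, default port 11434):
$ systemctl status ollama # Should report 'active (running)'
$ curl http://localhost:11434/api/version # Should return a small JSON payload with the version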
You will need to configure aichat to use the models you just downloaded. You can do this by editing the ~/.config/aichat/config.yaml file on Linux (see the aichat documentation for other operating systems).
Open the file with your favorite text editor and add the models we just downloaded. A minimal version is shown below.
# see https://github.com/sigoden/aichat/blob/main/config.example.yaml
clients:
  - type: openai-compatible
    name: ollama
    api_base: http://localhost:11434/v1
    models:
      - name: smollm2:1.7b
      - name: mxbai-embed-large:latest
        type: embedding

keybindings: vi # You can also use emacs if you prefer
save_session: true
model: ollama:smollm2:1.7b # Our default LLM model
rag_embedding_model: ollama:mxbai-embed-large:latest # Our default embedding model
rag_top_k: 5 # Consider the top 5 results
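As a side note, you can override the default model for a single invocation with the --model flag, which is handy for quick comparisons:
$ aichat --model ollama:smollm2:1.7b "Say hello" # One-off query with an explicit model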
Make sure that this configuration is working by running:
$ aichat
Welcome to aichat 0.28.0
Type ".help" for additional help.
> .info
model ollama:smollm2:1.7b
temperature null
top_p null
use_tools null
max_output_tokens null
save_session true
compress_threshold 4000
rag_reranker_model null
rag_top_k 5
dry_run false
function_calling true
stream true
save false
keybindings vi
wrap no
wrap_code false
highlight true
light_theme false
config_file /home/username/.config/aichat/config.yaml
env_file /home/username/.config/aichat/.env
roles_dir /home/username/.config/aichat/roles
sessions_dir /home/username/.config/aichat/sessions
rags_dir /home/username/.config/aichat/rags
macros_dir /home/username/.config/aichat/macros
functions_dir /home/username/.config/aichat/functions
messages_file /home/username/.config/aichat/messages.md
>
Next, get a local copy of the DuckDB documentation by cloning the duckdb-web repository:
$ cd /path/to/project # Change to a path of your liking
# Perform a shallow clone
$ git clone --depth 1 https://github.com/duckdb/duckdb-web.git
We are doing a shallow clone because, for this purpose, we do not need the full history of the documentation. The full path to the repository should be /path/to/project/duckdb-web, and we will use the Markdown documents in there to create the embeddings.
Caveat: The duckdb-web repository contains documentation for current and older versions of duckdb, as well as other Markdown files that you might not want to use for the embeddings. If you want to be more careful than I was, you will need to filter the documents used for the embeddings, perhaps by copying them to a new directory and doing some cleanup.
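For example, a rough sketch of that cleanup could look like this (the docs/ subdirectory is an assumption on my part; inspect the repository layout first and adjust the paths):
$ mkdir -p /path/to/project/duckdb-web-clean
$ rsync -am --include='*/' --include='*.md' --exclude='*' \
    /path/to/project/duckdb-web/docs/ /path/to/project/duckdb-web-clean/ # Copy only the Markdown files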
We will do something analogous to what is shown in the aichat RAG documentation, using the repo we cloned in the previous step:
$ aichat --rag duckdb-web
⚙ Initializing RAG...
Select embedding model: ollama:mxbai-embed-large:latest
> Set chunk size: 1500
> Set chunk overlay: 75
> Add documents: /path/to/project/duckdb-web/**/*.md
Load /path/to/project/duckdb-web/microbenchmarks/README.md [1/1912]
... # after some time you should see the following message
✓ Saved RAG to '/home/username/.config/aichat/rags/duckdb-web.yaml'.
@duckdb-web> .exit # Exit for now
Let's take a look at the RAG file:
$ less /home/username/.config/aichat/rags/duckdb-web.yaml
embedding_model: ollama:mxbai-embed-large:latest
chunk_size: 1500
chunk_overlap: 75
reranker_model: null
top_k: 5
batch_size: 100
next_file_id: 1
document_paths:
- /path/to/project/duckdb-web/**/*.md
files:
0:
hash: e76c4f0c9124a1213c94ccbc85f04885a79e8c3efc989ab87657fdb9fc68b493
... # The rest of the file
First, let's try something simple:
$ aichat --rag duckdb-web
@duckdb-web> how can I load one or more parquet files into a table in duckdb?
To load one or more Parquet files into a table in DuckDB, you can use the `read_parquet` function. Here's an example of how to do it:
1. Specify which Parquet files you want to read using a list parameter. For instance, if your three Parquet files are named "file1.parquet", "file2.parquet", and "file3.parquet", you can use the following code:
```sql
SELECT * FROM read_parquet(['file1.parquet', 'file2.parquet', 'file3.parquet']);
```
2. Also, ensure that the Parquet files have the same schema, or any errors will occur when trying to import them into a database table.
Now we will ask for something a bit more complex:
@duckdb-web> Show me how I can do vertical stacking of results in duckdb
DuckDB allows you to vertically stack results by column name rather than order position. You can use `SELECT` statements like the ones above or a query that references columns with their names instead of positions, like this:
1. Stack using `UNION BY NAME`:
```sql
SELECT ...
FROM table;
UNION ALL BY NAME
SELECT ...
FROM other_table;
```
2. The new syntax `INSERT INTO table BY NAME` also allows you to stack data by column name:
```sql
CREATE TABLE year_info (year INTEGER, status VARCHAR);
INSERT INTO year_info BY NAME
SELECT ...
FROM some_other_table;
```
You can now ask your local helper any questions you have about duckdb, and it will try to provide you with the best answer it can find in the documentation.
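You can also ask one-off questions non-interactively, which is handy for scripting; aichat accepts the question directly on the command line:
$ aichat --rag duckdb-web "How do I export a table to a Parquet file?"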
To keep the helper up-to-date, you first need to pull the latest version of the documentation:
$ cd /path/to/project/duckdb-web
$ git pull origin
Then, you can rebuild the RAG with the new documents using aichat:
$ aichat --rag duckdb-web
@duckdb-web> .rebuild rag
After that, you can test your helper with the new information.
If you also want to interact with the helper in languages other than English, you will need a multilingual model. A good model that can understand and respond in multiple languages is gemma3; I've used gemma3:12b for testing and it works well.
First, download the model:
$ ollama pull gemma3:12b
Then, configure aichat to use the model:
clients:
  - type: openai-compatible
    name: ollama
    api_base: http://localhost:11434/v1
    models:
      - name: smollm2:1.7b
      - name: gemma3:12b # Add this line
      - name: mxbai-embed-large:latest
        type: embedding

keybindings: vi # You can also use emacs if you prefer
save_session: true
model: ollama:smollm2:1.7b # Our default LLM model
rag_embedding_model: ollama:mxbai-embed-large:latest # Our default embedding model
rag_top_k: 5 # Consider the top 5 results
Let's test it, asking questions in Spanish and Italian (my Italian is very basic, so it might not be the best example):
$ aichat
Welcome to aichat 0.28.0
Type ".help" for additional help.
> .model ollama:gemma3:12b
> .rag duckdb-web
@duckdb-web> Explicame como puedo crear una tabla en duckdb leyendo un archivo CSV
Para crear una tabla en DuckDB leyendo un archivo CSV, puedes usar la función `read_csv`. Por ejemplo, para leer el archivo `duckbox-cleaned.csv` con un delimitador '|', usarías el siguiente comando SQL:
```sql
FROM read_csv('duckbox-cleaned.csv', delim = '│');
```
También puedes exportar una tabla a un archivo CSV usando el comando `COPY`:
```sql
COPY (FROM read_csv('duckbox-cleaned.csv', delim = '│')) TO 'out.csv';
```
@duckdb-web> Sai se duckdb supporta i filtri bloom per gli archivi parquet?
Sì, DuckDB supporta i filtri bloom per gli archivi Parquet. Secondo il testo, DuckDB ora è in grado di leggere e scrivere filtri bloom Parquet.
@duckdb-web> Come posso leggere un file JSON in una tabella in duckdb?
Ci sono diversi modi per leggere un file JSON in una tabella in DuckDB:
1. **Utilizzando la sintassi implicita:** Puoi semplicemente selezionare da un file JSON come se fosse una tabella: `SELECT * FROM 'todos.json';` DuckDB riconoscerà l'estensione `.json` e chiamerà `read_json('todos.json')`.
2. **Utilizzo di `read_json` con opzioni personalizzate:** Puoi usare la funzione `read_json` per specificare il formato e i tipi di colonna: `SELECT * FROM read_json('todos.json', format='array', columns={userId: 'UBIGINT', id: 'UBIGINT', title: 'VARCHAR', completed: 'BOOLEAN'});`
3. **Creazione di una tabella e copia dei dati:** Puoi creare una tabella con uno schema definito e poi copiare i dati dal file JSON nella tabella:
```sql
CREATE TABLE todos(userId UBIGINT, id UBIGINT, title VARCHAR, completed BOOLEAN);
COPY todos FROM 'todos.json';
```
In alternativa, puoi creare la tabella direttamente dalla query: `CREATE TABLE todos AS SELECT * FROM 'todos.json';`
4. **Utilizzo dell'estensione JSON:** Puoi installare e caricare l'estensione JSON con i comandi `INSTALL 'json';` e `LOAD 'json';` e poi semplicemente selezionare dal file JSON. Tuttavia, l'estensione è già inclusa nella distribuzione binaria di DuckDB.
To expose all of this through a web UI, simply run:
$ aichat --serve
Chat Completions API: http://127.0.0.1:8000/v1/chat/completions
Embeddings API: http://127.0.0.1:8000/v1/embeddings
Rerank API: http://127.0.0.1:8000/v1/rerank
LLM Playground: http://127.0.0.1:8000/playground
LLM Arena: http://127.0.0.1:8000/arena?num=2
Then go to http://localhost:8000 in your browser, where you will be able to interact with your RAG, as well as explore a bunch of other options.
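Since the served endpoints are OpenAI-compatible, you can also query them from the command line. A minimal sketch (the model name must match one defined in your config):
$ curl http://127.0.0.1:8000/v1/chat/completions \
    -H 'Content-Type: application/json' \
    -d '{"model": "ollama:smollm2:1.7b", "messages": [{"role": "user", "content": "Hello"}]}'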
Even though aichat supports using a reranker, and there are reranker models for ollama, ollama itself does not (yet) support rerankers [1], so I omitted that part from the notes above.

[1] See: ollama/ollama#4510 and ollama/ollama#3749