= Vim’s `libcall()` builtin function
:author: Peter Kenny
:doctype: article
:icons: font
ifdef::env-github[]
:tip-caption: :bulb:
:note-caption: :information_source:
:caution-caption: :fire:
:warning-caption: :warning:
:white-check-mark: :white_check_mark:
endif::env-github[]

Vim’s `libcall()` is used to call a function in
either a Windows `.dll` or Linux `.so` library.

== Help?

Vim’s https://vimhelp.org/builtin.txt.html#libcall%28%29[builtin.txt]:

----
libcall({libname}, {funcname}, {argument})
		Call function {funcname} in the run-time library {libname}
		with single argument {argument}.
		This is useful to call functions in a library that you
		especially made to be used with Vim.  Since only one argument
		is possible, calling standard library functions is rather
		limited.
		The result is the String returned by the function.  If the
		function returns NULL, this will appear as an empty string ""
		to Vim.
		If the function returns a number, use libcallnr()!
		If {argument} is a number, it is passed to the function as an
		int; if {argument} is a string, it is passed as a
		null-terminated string.
		This function will fail in restricted-mode.

		libcall() allows you to write your own 'plug-in' extensions to
		Vim without having to recompile the program.  It is NOT a
		means to call system functions!  If you try to do so Vim will
		very probably crash.

		For Win32, the functions you write must be placed in a DLL
		and use the normal C calling convention (NOT Pascal which is
		used in Windows System DLLs).  The function must take exactly
		one parameter, either a character pointer or a long integer,
		and must return a character pointer or NULL.  The character
		pointer returned must point to memory that will remain valid
		after the function has returned (e.g. in static data in the
		DLL).  If it points to allocated memory, that memory will
		leak away.  Using a static buffer in the function should work,
		it's then freed when the DLL is unloaded.
----
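
To make the rules in that help text concrete, here is a minimal sketch of a function that `libcall()` could call.  The name `eg_upper` and the 256-byte buffer size are assumptions made for illustration; they are not part of Vim or of the example further below.  The function takes a single string argument and returns a pointer to a static buffer, so the returned memory remains valid after the function has returned and nothing leaks.

[source,c]
----
#include <ctype.h>
#include <stddef.h>

// Hypothetical example function (not part of Vim): returns an
// upper-cased copy of its single string argument.
const char *eg_upper(const char *arg) {
    // Static storage stays valid after the function returns.
    static char buf[256];
    size_t i = 0;

    if (arg == NULL) {
        return NULL; // Vim shows a NULL result as an empty string.
    }
    // Copy at most sizeof(buf) - 1 bytes, leaving room for the NUL.
    for (; arg[i] != '\0' && i < sizeof(buf) - 1; i++) {
        buf[i] = (char)toupper((unsigned char)arg[i]);
    }
    buf[i] = '\0';
    return buf;
}
----

If this were compiled into a (hypothetical) `upper.so`, `libcall('./upper.so', 'eg_upper', 'abc')` would be expected to return `ABC`.  Because the buffer is shared between calls, each call overwrites the previous result.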

== Why use `libcall()`?

Some scenarios are well suited to `libcall()`.  One use
case is a large dictionary.  Although Vim9 script is compiled, Vim
cannot _pre-compile_ it.  So, a dictionary or list used by a
plugin has to be built again in each and every new Vim instance.
That’s fine, usually, but what if you have a dictionary or list
that is several, or even hundreds of, megabytes?

NOTE: The Unicode character database, for example, is a few
hundred megabytes.

Especially if you are writing it for your own setup, where the vagaries
of operating systems, versions, etc., are of concern only to you, using
a pre-compiled `.dll` or `.so` may have advantages.

== An example (using a dictionary extract)

As noted in the help, `libcall()` always returns a string
(and `libcallnr()` a number), and only one argument may be passed.
Those are limitations, but they won’t matter in all cases.
Keep in mind, returning a string means that even
dictionaries of dictionaries are feasible (using `eval()` on
that returned string, which is what is demonstrated in the
following example).
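
Because the returned string is passed to `eval()`, it has to be a valid Vim expression.  In a Vim literal string (single quotes), a single quote inside the text is written as two single quotes.  The following is a minimal sketch of that escaping, not the code used for the example below: the helper name `eg_vim_quote` is hypothetical, and it assumes the caller supplies the output buffer.

[source,c]
----
#include <stddef.h>

// Hypothetical helper: writes `text` into `out` as a Vim literal string,
// i.e. wrapped in single quotes with each embedded ' doubled.
// Returns the number of bytes written (excluding the NUL), or 0 if
// `out` is too small.
size_t eg_vim_quote(const char *text, char *out, size_t out_size) {
    size_t n = 0;

    if (out_size < 3) {
        return 0; // Need room for at least '' plus the NUL.
    }
    out[n++] = '\'';
    for (; *text != '\0'; text++) {
        // Worst case: 2 bytes for this character, plus the closing
        // quote and the NUL.
        if (n + 4 > out_size) {
            return 0;
        }
        if (*text == '\'') {
            out[n++] = '\''; // Double the quote.
        }
        out[n++] = *text;
    }
    out[n++] = '\'';
    out[n] = '\0';
    return n;
}
----

For example, the text `It's` is written as `'It''s'`, which `eval()` turns back into `It's`.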

Depending on how it’s written, the compiled `.dll` or `.so` should be
very fast, though some factors may significantly impact performance.
One such factor is using WSL and having the `.so` in a location
on the Windows file system (`/mnt/c/Users/...`, for example).

CAUTION: I am no C programmer!  The following C code is potentially not
great!  (_And it was created with some AI help_.)  That is not an issue for
this example and, at any rate, the same concept has been tested on
a 300 MB `.dll` and `.so`, with the string for any requested key
returned in under 0.1 seconds.  So, there may well be room for improvement,
but it does the job for this illustration.

== gcc installation

=== Windows 64-bit

NOTE: Skip this if a C compiler already exists on your Windows PC (though the
scripts calling the compiler may need adjustment, perhaps).

MSYS2 with gcc is a relatively simple means of adding the gcc C compiler
to a Windows 64-bit PC.  Steps:

1. Install MSYS2 from https://www.msys2.org/
2. In MSYS2 UCRT64, install gcc with:
+
[source,sh]
----
pacman -S mingw-w64-ucrt-x86_64-gcc
----
3. Verify that gcc is installed:
+
[source,sh]
----
gcc --version
----
+
Something like this should be returned:
+
----
gcc.exe (Rev3, Built by MSYS2 project) 14.1.0
Copyright (C) 2024 Free Software Foundation, Inc.
----

=== Debian-based Linux

There are many sites explaining how to do this.  Commonly,
using `sudo apt install build-essential` is recommended.
Search for that if you do not have gcc installed on your Debian-based
Linux machine (or, also, WSL: Debian, Ubuntu, _et al_).

== C code example

The following `eg.c` file creates a static array of key-value pairs
where the key is the Unicode code point and the value is a tiny
subset of the Unicode database (i.e.,
the https://www.unicode.org/Public/16.0.0/ucdxml/[XML] version)
content for each associated code point.
Each value is a string, which subsequently may be turned into a Vim
dictionary with `eval()`.  As noted in the comments, the code has been
left as-is for this demo, aside from the addition of more comments and
the removal of _over 155,000 pairs_.

[source,c]
----
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <stdint.h>

// NB: The _real_ Unicode code points table has >155k entries.
//     It has been left as-is.
#define TABLE_SIZE 262144

typedef struct {
    const char *key;
    const char *value;
} KeyValue;

typedef struct HashNode {
    const char *key;
    const char *value;
    struct HashNode *next;
} HashNode;

static HashNode *hash_table[TABLE_SIZE] = {0};
static int initialized = 0;

// Static array of key-value pairs.
// The 'real' data is >155,000 key-value pairs.
static KeyValue dictionary[] = {
  {"00A0", "{'na': 'NO-BREAK SPACE', 'gc': 'Zs', 'bc': 'CS', 'dt': 'nb', 'dm': '0020'}"},
  {"00A1", "{'na': 'INVERTED EXCLAMATION MARK', 'gc': 'Po', 'bc': 'ON'}"},
  {"00A2", "{'na': 'CENT SIGN', 'gc': 'Sc', 'bc': 'ET'}"},
  {"00A3", "{'na': 'POUND SIGN', 'gc': 'Sc', 'bc': 'ET'}"},
  {"00A4", "{'na': 'CURRENCY SIGN', 'gc': 'Sc', 'bc': 'ET'}"},
  {"00A5", "{'na': 'YEN SIGN', 'gc': 'Sc', 'bc': 'ET'}"},
  {"00A6", "{'na': 'BROKEN BAR', 'gc': 'So', 'bc': 'ON'}"},
  {"00A7", "{'na': 'SECTION SIGN', 'gc': 'Po', 'bc': 'ON'}"},
  {"00A8", "{'na': 'DIAERESIS', 'gc': 'Sk', 'bc': 'ON', 'dt': 'com', 'dm': '0020 0308'}"},
  {"00A9", "{'na': 'COPYRIGHT SIGN', 'gc': 'So', 'bc': 'ON'}"},
  {"00AA", "{'na': 'FEMININE ORDINAL INDICATOR', 'dt': 'sup', 'dm': '0061'}"},
  {"00AB", "{'na': 'LEFT-POINTING DOUBLE ANGLE QUOTATION MARK', 'gc': 'Pi', 'bc': 'ON', 'bm': 'Y'}"},
  {"00AC", "{'na': 'NOT SIGN', 'gc': 'Sm', 'bc': 'ON'}"},
  {"00AD", "{'na': 'SOFT HYPHEN', 'gc': 'Cf', 'bc': 'BN'}"},
  {"00AE", "{'na': 'REGISTERED SIGN', 'gc': 'So', 'bc': 'ON'}"},
  {"00AF", "{'na': 'MACRON', 'gc': 'Sk', 'bc': 'ON', 'dt': 'com', 'dm': '0020 0304'}"},
  {NULL, NULL} // End marker
};


// FNV-1a hash function, reduced to an index into hash_table.
// Originally produced by Claude AI for the full table of ~155k
// key/value pairs, which have been omitted here.
uint32_t hash(const char *key) {
    uint32_t h = 2166136261u; // FNV offset basis (32-bit)
    for (; *key; key++) {
        h ^= (unsigned char)*key; // Avoid sign extension of char.
        h *= 16777619u;           // FNV prime (32-bit)
    }
    return h % TABLE_SIZE;
}

// Builds the chained hash table from the static array, once.
void init_hash_table(void) {
    if (initialized) return;
    for (int i = 0; dictionary[i].key != NULL; i++) {
        uint32_t index = hash(dictionary[i].key);
        HashNode *new_node = malloc(sizeof(HashNode));
        if (new_node == NULL) {
            continue; // Out of memory: skip this entry, do not crash Vim.
        }
        new_node->key = dictionary[i].key;
        new_node->value = dictionary[i].value;
        new_node->next = hash_table[index];
        hash_table[index] = new_node;
    }
    initialized = 1;
}

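// Entry point called by libcall(): returns the value string for key,
// or "{}" when the key is not found.
const char* get_value(const char *key) {
    if (key == NULL) {
        return "{}";
    }
    if (!initialized) {
        init_hash_table();
    }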
    uint32_t index = hash(key);
    HashNode *current = hash_table[index];
    while (current != NULL) {
        if (strcmp(current->key, key) == 0) {
            return current->value;
        }
        current = current->next;
    }
    return "{}";
}
----

If this is saved as `eg.c`, the command `gcc -O3 -shared -o eg.dll eg.c`
(Windows) or `gcc -O3 -fPIC -shared -o eg.so eg.c` (Linux) should create
the associated `eg.dll` or `eg.so` file.

TIP: These commands are saved to `eg_dll.sh` and `eg_so.sh` in
the `libcall_Vim_builtin.7z` file, below.  The former should be run with MSYS2 UCRT64 Shell and
the latter in Linux / WSL.
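
As one possible simplification (a sketch only, not what `eg.c` above does), a table whose keys are sorted can be searched with the standard C `bsearch()` function instead of a hash table, which avoids `malloc()` and the initialization step.  This assumes the array is kept sorted in `strcmp()` order of its keys.  The names `eg_table`, `eg_compare` and `eg_lookup` are hypothetical, and the table is cut down to three entries.

[source,c]
----
#include <stdlib.h>
#include <string.h>

typedef struct {
    const char *key;
    const char *value;
} KeyValue;

// Must stay sorted by key (strcmp order) for bsearch() to work.
static const KeyValue eg_table[] = {
    {"00A1", "{'na': 'INVERTED EXCLAMATION MARK', 'gc': 'Po', 'bc': 'ON'}"},
    {"00A2", "{'na': 'CENT SIGN', 'gc': 'Sc', 'bc': 'ET'}"},
    {"00A3", "{'na': 'POUND SIGN', 'gc': 'Sc', 'bc': 'ET'}"},
};

// Comparison callback: the first argument is the search key,
// the second is a table element.
static int eg_compare(const void *key, const void *element) {
    return strcmp((const char *)key, ((const KeyValue *)element)->key);
}

// Hypothetical entry point, callable via libcall() like get_value().
const char *eg_lookup(const char *key) {
    const KeyValue *hit;

    if (key == NULL) {
        return "{}";
    }
    hit = bsearch(key, eg_table, sizeof(eg_table) / sizeof(eg_table[0]),
                  sizeof(eg_table[0]), eg_compare);
    return hit != NULL ? hit->value : "{}";
}
----

The trade-off is that each lookup takes a number of comparisons proportional to the logarithm of the table size, rather than the roughly constant time of a hash table, in exchange for no start-up cost and no allocated memory.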

== How to use `libcall()` with `eg.dll` or `eg.so`

The following Windows and Linux instructions show how to call
`get_value` in `eg.dll` or `eg.so` directly with `libcall()`.

=== Windows

Open Vim, enter command-line mode (with `:`), then enter the following,
replacing `{FULLPATH}` with the full path of the directory containing the `.dll`:

[source,vim]
----
call append('$', libcall('{FULLPATH}\eg', 'get_value', '00A1'))
----

NOTE: In Windows, the `.dll` extension is omitted from `{libname}`, hence
it is `{FULLPATH}\eg`, with no `.dll` extension.  This is explained
in https://vimhelp.org/builtin.txt.html#libcall%28%29[builtin.txt].

=== Linux

Using Linux, open Vim, ensure your current working directory is where
the `.so` is, then:

[source,vim]
----
call append('$', libcall('./eg.so', 'get_value', '00A1'))
----

In either case, the following should be appended to the end of the
active buffer:

----
{'na': 'INVERTED EXCLAMATION MARK', 'gc': 'Po', 'bc': 'ON'}
----

== Demo files

The demo files `eg_dll_test.vim` and `eg_so_test.vim` may be used to see
this in action.  Their content is not reproduced here, nor are
they essential.  They use `eval()` to turn the returned
string into a dictionary, then return the value for
a specified key.  They are also shown working in the
animated `.gif` files, below.

== 7z

All the files related to this gist are in `libcall_Vim_builtin.7z`.

== .gif demos

Demos in Windows (`eg_dll_test.vim` in Neovim 0.10.2)
and in Debian WSL (`eg_so_test.vim` in Vim 9.1.90) are shown, below.