@kdrakon
Last active July 29, 2021 21:38
Demonstrates how I got bindgen to generate the bindings to libfuse

Rust's Bindgen + Fuse in 2019

I will quickly show how I got bindgen (https://rust-lang.github.io/rust-bindgen) to generate the bindings to Fuse (libfuse) with the current stable release of Rust. By doing so, this should demonstrate how to bootstrap writing your own Fuse file system in Rust.

I do realise that there are some crates that already exist that aid in making Fuse drivers in Rust, but this was more or less an excuse to also try out bindgen, which I don't believe those existing libraries utilise.

I will be using:

Assuming you already have cargo, you can get bindgen with cargo install bindgen.

Steps

New project

  1. Init a project, like cargo init rust-bindgen-fuse.
  2. We need the libfuse header files. Rather than download them manually or refer to them via a package manager (e.g. Debian/Ubuntu's libfuse-dev), I'll take them directly from GitHub. Within the project, you can create a git submodule, e.g. git submodule add git@github.com:libfuse/libfuse.git.

Generating bindings

To use bindgen and to generate the code we want, we create a "wrapper" C header file to encapsulate the headers we want, including all the transitive dependent headers. For example, this is what I have in src/fuse_wrapper.h:

#define FUSE_USE_VERSION 31

#include "../libfuse/include/fuse.h"

FUSE_USE_VERSION is a macro constant that another header file (I believe fuse_common.h) requires to be defined for API compatibility. fuse.h contains all the high-level APIs an implementor needs to work with, so I started there.

To test bindgen out, you can run the following:

bindgen --distrust-clang-mangling src/fuse_wrapper.h

You should see some Rust code fly by. Those are the bindings to the libfuse library we will be linking to later. I used --distrust-clang-mangling because of some linker names that bindgen generated; the linker couldn't resolve them for some reason, and this flag fixed that.

For this project, we want the generated bindings to be dynamic though, so they are built for whatever Rust toolchain is currently in use. To do that, we use a cargo build script—i.e. a file named build.rs in the project root. First, add bindgen to your Cargo.toml file:

[build-dependencies]
bindgen = "0.48.1"

Note that it's a build dependency, not a code dependency. This allows its use from the build script. As for the script, here's what I used based on the documentation online.

extern crate bindgen;

use std::env;
use std::path::PathBuf;

fn main() {
    println!("cargo:rustc-link-search=/usr/local/lib");
    println!("cargo:rustc-link-lib=osxfuse");

    let bindings = bindgen::Builder::default()
        .header("src/fuse_wrapper.h")
        .trust_clang_mangling(false) // disabled due to linker name problem
        .generate()
        .expect("Unable to generate bindings");

    // Write the bindings to the $OUT_DIR/bindings.rs file.
    let out_path = PathBuf::from(env::var("OUT_DIR").unwrap());
    bindings
        .write_to_file(out_path.join("bindings.rs"))
        .expect("Couldn't write bindings!");
}

By default, cargo will execute build.rs. Note that for now, I've hardcoded this to my current local environment, which is an OS X machine. The println! calls emit directives that cargo passes on to rustc so that it links my local dynamic library. On Ubuntu, I tested this with the dynamic library fuse after installing libfuse-dev via apt. I imagine it would be trivial to make this script smarter based on some environment variables, etc. The rest of this code calls bindgen programmatically and writes the Rust code to a file that can be referenced in your own code.
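As a rough idea of what "smarter" could look like, here is a minimal sketch that picks the link directives from the target OS. It assumes the library names mentioned above (osxfuse on OS X, fuse on Ubuntu) and that cargo sets CARGO_CFG_TARGET_OS for build scripts; link_directives is a hypothetical helper of mine, not something bindgen or cargo provides, and the bindgen call from the script above would follow unchanged.

```rust
use std::env;

/// Returns the `cargo:` directives for a given target OS.
/// The library names are the ones used in this write-up; adjust them
/// to whatever your local Fuse installation is called.
fn link_directives(target_os: &str) -> Vec<String> {
    match target_os {
        "macos" => vec![
            "cargo:rustc-link-search=/usr/local/lib".to_string(),
            "cargo:rustc-link-lib=osxfuse".to_string(),
        ],
        // Assumption: on other systems the linker already searches the
        // directory where the package manager installed the library.
        _ => vec!["cargo:rustc-link-lib=fuse".to_string()],
    }
}

fn main() {
    // Fall back to the host OS so the sketch also runs outside of cargo.
    let target_os = env::var("CARGO_CFG_TARGET_OS")
        .unwrap_or_else(|_| env::consts::OS.to_string());

    for line in link_directives(&target_os) {
        println!("{}", line);
    }
}
```

Keeping the decision in a small function makes it easy to add more cases later (for example, an environment variable that overrides the library name).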

Putting it all together

Finally, with the generated code, I can call the Fuse bindings. The general way Fuse works is that you provide an instance of a C struct called fuse_operations. It holds function pointers that libfuse invokes as callbacks whenever the Fuse kernel module forwards a file system request. In other words, you inject the behaviour your file system should exhibit directly into a runtime, if you're so inclined to call it that.
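To make the callback idea concrete, here is a small self-contained sketch. MiniOperations is a hypothetical, cut-down stand-in for the generated fuse_operations (the real struct has many more fields and its callbacks take more parameters), and my_getattr is an illustrative name. It only shows the shape: optional C-ABI function pointers that return 0 or a negative errno value.

```rust
use std::ffi::CString;
use std::os::raw::{c_char, c_int};

// errno value for "no such file or directory" (2 on Linux and OS X).
const ENOENT: c_int = 2;

/// Hypothetical stand-in for the generated `fuse_operations` struct.
/// A NULL function pointer on the C side becomes `None` on the Rust side.
#[repr(C)]
struct MiniOperations {
    getattr: Option<unsafe extern "C" fn(path: *const c_char) -> c_int>,
    unlink: Option<unsafe extern "C" fn(path: *const c_char) -> c_int>,
}

/// A callback with the C calling convention. The library would call it
/// with the path being looked up; this one reports every path as missing.
unsafe extern "C" fn my_getattr(_path: *const c_char) -> c_int {
    -ENOENT
}

fn main() {
    let ops = MiniOperations {
        getattr: Some(my_getattr),
        unlink: None,
    };

    // Stand in for the library and invoke the callback ourselves.
    let path = CString::new("/hello").expect("no interior NUL byte");
    if let Some(callback) = ops.getattr {
        let rc = unsafe { callback(path.as_ptr()) };
        println!("getattr returned {}", rc);
    }
    println!("unlink implemented: {}", ops.unlink.is_some());
}
```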

Normally, once you hand over your operations, Fuse mounts the file system and your program detaches into the background, where it keeps serving requests through your callbacks until the file system is unmounted; the command you typed returns straight away (pass -f to stay in the foreground). The following code (in src/main.rs) shows my first rudimentary attempt at calling Fuse.

#![allow(non_upper_case_globals)]
#![allow(non_camel_case_types)]
#![allow(non_snake_case)]
include!(concat!(env!("OUT_DIR"), "/bindings.rs"));

use std::env::args_os;
use std::ffi::CString;
use std::mem::size_of;

fn main() {
    let ops: fuse_operations = fuse_operations {
        getattr: None,
        readlink: None,
        mknod: None,
        mkdir: None,
        unlink: None,
        rmdir: None,
        symlink: None,
        rename: None,
        link: None,
        chmod: None,
        chown: None,
        truncate: None,
        open: None,
        read: None,
        write: None,
        statfs: None,
        flush: None,
        release: None,
        fsync: None,
        setxattr: None,
        getxattr: None,
        listxattr: None,
        removexattr: None,
        opendir: None,
        readdir: None,
        releasedir: None,
        fsyncdir: None,
        init: None,
        destroy: None,
        access: None,
        create: None,
        lock: None,
        utimens: None,
        bmap: None,
        ioctl: None,
        poll: None,
        write_buf: None,
        read_buf: None,
        flock: None,
        fallocate: None,
        copy_file_range: None,
    };

    let argc: i32 = args_os().len() as i32;
    let args: Vec<CString> = args_os()
        .into_iter()
        .map(|arg| {
            arg.to_str()
                .and_then(|s| {
                    CString::new(s)
                        .map(|c_string| {
                            dbg!(&c_string);
                            c_string
                        })
                        .ok()
                })
                .expect("Expected valid arg input")
        })
        .collect();

    let mut argv: Vec<*const ::std::os::raw::c_char> =
        args.iter().map(|arg| arg.as_ptr()).collect();

    unsafe {
        fuse_main_real(
            argc,
            argv.as_mut_ptr() as *mut *mut ::std::os::raw::c_char,
            &ops,
            size_of::<fuse_operations>(),
            std::ptr::null_mut(),
        );
    }
}

As you can see, my file system does bupkis. I pretty much do four things:

  1. Create my fuse_operations that does nothing
  2. Count the number of args from the command-line (argc)
  3. Turn all the UTF-8 args into C-strings (argv)
  4. Pass all those things into fuse_main_real, which is the entry-point that calls the Fuse library to mount a path among other things.

Note all the old-school C pointers; bindgen can translate a lot of things into nice Rust (e.g. Options in the fuse_operations struct), but you still need to do a lot of C-like stuff, like providing pointers to pointers of strings.
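The argument handling can be pulled into a helper. The following is a sketch under a few assumptions: to_c_argv is a hypothetical name of mine, it converts the raw argument bytes (so non-UTF-8 paths are accepted, unlike the code above), and it appends a trailing NULL because C conventionally expects argv[argc] to be NULL. The pointers are only valid while the returned Vec<CString> is alive, so keep it around for the duration of the call.

```rust
use std::ffi::{CString, OsString};
use std::os::raw::{c_char, c_int};
use std::os::unix::ffi::OsStrExt;
use std::ptr;

/// Converts arguments into owned C strings plus a NULL-terminated array
/// of pointers into them. The first Vec owns the memory; the second one
/// only borrows it through raw pointers.
fn to_c_argv<I>(args: I) -> (Vec<CString>, Vec<*mut c_char>)
where
    I: IntoIterator<Item = OsString>,
{
    let owned: Vec<CString> = args
        .into_iter()
        // Only an interior NUL byte can make this conversion fail.
        .map(|arg| CString::new(arg.as_bytes()).expect("argument contains a NUL byte"))
        .collect();

    let mut argv: Vec<*mut c_char> = owned
        .iter()
        .map(|s| s.as_ptr() as *mut c_char)
        .collect();
    argv.push(ptr::null_mut()); // C convention: argv[argc] is NULL

    (owned, argv)
}

fn main() {
    let (owned, mut argv) = to_c_argv(std::env::args_os());
    let argc = owned.len() as c_int;

    // `argc` and `argv.as_mut_ptr()` are the values a call such as
    // fuse_main_real would receive; here we only print them.
    println!("argc = {}, argv at {:p}", argc, argv.as_mut_ptr());
}
```

Casting the pointers to *mut mirrors the signature bindgen generated for fuse_main_real; the sketch assumes the callee does not write into the strings themselves.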

At the very beginning, you can see a macro, include!; that is pulling in all the generated bindings. Unfortunately, my IDE doesn't know how to handle that, so I didn't get much help trying to look through the generated code. As a trick, I did the following temporarily:

bindgen --distrust-clang-mangling src/fuse_wrapper.h > src/delete/mod.rs

Then, I modified the main.rs like so:

// include!(concat!(env!("OUT_DIR"), "/bindings.rs"));
mod delete;
use delete::*;

That way I could use my IDE to look through the generated code and reference the structs and functions. Uncommenting the include! and commenting out the delete module again allowed me to verify I was calling the right stuff.

Trying it out

Running:

cargo run -- --help

should give you the usage info output by Fuse:

usage: target/debug/rust-bindgen-fuse mountpoint [options]

general options:
    -o opt,[opt...]        mount options
    -h   --help            print help
    -V   --version         print version

FUSE options:
    -d   -o debug          enable debug output (implies -f)
    -f                     foreground operation
    -s                     disable multi-threaded operation

fuse: no mount point

You should be able to do this then:

cargo run -- /tmp/mount

And if you run mount, you should see an entry like this:

rust-bindgen-fuse@osxfuse0 on /private/tmp/mount (osxfuse, nodev, nosuid, synchronous, mounted by kdrakon)

Obviously, since we didn't implement anything, you'll see this after running ls /tmp/mount:

ls: /tmp/mount: Function not implemented
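That message is the standard text for the errno value ENOSYS, which, as far as I understand, is what libfuse answers with when the matching callback is missing. A tiny sketch to see where the text comes from; the numeric values are the ones I know for Linux (38) and OS X (78), and enosys is a hypothetical helper:

```rust
use std::io;

/// errno value for "function not implemented".
/// Assumption: 38 on Linux, 78 on OS X; other systems may differ.
fn enosys() -> i32 {
    if cfg!(target_os = "macos") {
        78
    } else {
        38
    }
}

fn main() {
    // Ask the operating system for the message that belongs to the code.
    let err = io::Error::from_raw_os_error(enosys());
    println!("{}", err);
}
```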

So how do we know what wasn't implemented? Fortunately Fuse has a debug-mode:

cargo run -- /tmp/mount -d

This will spit out all the callbacks to your fuse_operations, specifically, the ones that failed/were missing.

Next?

I'm going to decipher more of the Fuse operations and what is expected of them. Have fun!
