Rust modules and how to use them

So, how tf do rust modules work? and why tf are they not doing what I want right now???

Note: A bunch of things in here are defined by cargo and a few others by rust. I won't generally care about that distinction because two languages without a standard build system is more than enough for my sanity.

➡️ Feel free to ask me on twitter if you have questions or remarks.

Core concepts

One common misconception is that cargo will eagerly scan everything under <root>/src/. That's not the case. Cargo will only read/discover files as needed.
Rust is one of those weird languages that recognize the existence of files. Filenames are load bearing.
mod vs use: mod is about structure and declaring what your modules are. use is about visibility and bringing names into scope.

Declaring the module structure

Entry points

So, I said cargo doesn't look up files eagerly, but it does for some. Here are the various entry points, you need at least one of them but you can mix & match:

<root>/src/lib.rs : root module for a library
<root>/src/main.rs : root module for an application
<root>/src/bin/*.rs <root>/src/bin/*/main.rs: alternative binaries for an application or lib
<root>/benches/*.rs <root>/benches/*/main.rs: benchmarks
<root>/examples/*.rs <root>/examples/*/main.rs: examples
<root>/tests/*.rs <root>/tests/*/main.rs: integration tests

Note: <root> here means the package root, aka the directory containing Cargo.toml.

Moar modules

Ok so, let's assume you have your root main.rs or lib.rs. Which one doesn't really matter, but you might be trying to write one of those exotic projects with multiple files in them, so let's do that. First of all, keep in mind one file means one module. Rust modules aren't like C++ namespaces in that they're closed: you cannot reopen them later.

Let's ignore use and visibility for now and focus on adding new modules. I'll name the current module current_mod and its directory <current_dir>. If you just started, <current_dir> is <root>/src and current_mod is somewhat anonymous, but you can refer to it as crate. Let's also name the file where that module is defined <current_file>. At the start that'll be either <root>/src/lib.rs or <root>/src/main.rs.

Let's try and create a new module current_mod::MODNAME. You have two options:

Use an inline mod declaration using mod MODNAME { /* ... */ }. This declares a module named MODNAME and its content will be whatever is inside the braces. This doesn't solve your problem of adding new files but is useful when you want modules for namespacing purposes.

Use a dedicated file for that module. Inside <current_file>, use mod MODNAME;. It's at this point and only at this point that cargo starts looking for new source files. In particular it's gonna look into two possible places:

<current_dir>/MODNAME.rs
<current_dir>/MODNAME/mod.rs

The former is what the cool kids do, and the latter is the historical one. It doesn't matter which one you prefer. Now, one nuance that's somewhat intuitive but can be a bitch when you don't realize it: regardless of which form you use, the new module's directory is <current_dir>/MODNAME in both cases. Yes, that means that with the MODNAME.rs form, the module's file sits next to its directory rather than inside it.
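Here's a minimal sketch of the first option, the inline form, since it's the only one that fits in a single file. The names shapes and square_area are made up for illustration, and the pub is only there so the root can call the function (visibility is covered further down).

```rust
// Option 1: an inline module. Everything between the braces is the module body.
// `shapes` and `square_area` are hypothetical names used only for this sketch.
mod shapes {
    // `pub` so the parent module (here, the crate root) can call it.
    pub fn square_area(side: u32) -> u32 {
        side * side
    }
}

// Option 2 would replace the block above with `mod shapes;` and move the
// function into src/shapes.rs or src/shapes/mod.rs.

fn main() {
    println!("{}", shapes::square_area(3));
}
```

Either way, the rest of the crate refers to the function as shapes::square_area. Which option you pick only changes where the source text lives.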

Example

We have a library crate, and we want the modules crate::a, crate::a::b, crate::x and crate::x::y, with a dedicated file for each of them. Here's the file hierarchy:

📁
 ├─📄 Cargo.toml
 └─📁 src
    ├─📄 lib.rs
    ├─📄 a.rs
    ├─📁 a
    │  └─📄 b.rs
    └─📁 x
       ├─📄 mod.rs
       └─📄 y.rs

The mod declarations are as follows:

// inside lib.rs
mod a;
mod x;

// inside a.rs
mod b;

// inside x/mod.rs
mod y;
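If you want to poke at that tree without creating files, here's a rough single-file stand-in where each inline mod plays the role of one file. The where_am_i helper is hypothetical and only there to print the module path. The pub markers are an assumption added so main can reach inside, see the Visibility section.

```rust
// Single-file stand-in for the hierarchy above: each inline `mod` plays the
// role of one file. `where_am_i` is a hypothetical helper added for the demo.
mod a {
    // would be src/a.rs
    pub mod b {
        // would be src/a/b.rs
        pub fn where_am_i() -> &'static str {
            module_path!()
        }
    }
}

mod x {
    // would be src/x/mod.rs
    pub mod y {
        // would be src/x/y.rs
        pub fn where_am_i() -> &'static str {
            module_path!()
        }
    }
}

fn main() {
    // module_path!() starts with the crate name, which depends on how you build this.
    println!("{}", a::b::where_am_i());
    println!("{}", x::y::where_am_i());
}
```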

Lookup and visibility

Note: I'm gonna assume at least rust 2018. If you're using rust 2015 (why ??), I'll go briefly over that, just so that people stop getting baited by all the rust 2015 examples online, but you should really upgrade. If you're wondering, that's in your Cargo.toml under package.edition.

Ok so that's where use and pub come in. We'll reuse the structure from the example above for example purposes.

Paths

Assume you're inside the module crate::a and you want to refer to an entity somewhere. There are a few possible path formats:

Absolute path inside your crate. They start with crate, so b could be crate::a::b.
Relative path under your current module. They start with self, so b could be self::b. Note that self on its own gives you no way to go "up".
Relative path to your module's parent. They start with super, so b could be super::a::b. You are allowed to use multiple super in a row to keep going up modules.
Undecorated relative path under your current module, so b would be just b.

The first 3 formats with a keyword prefix will always refer to paths in your crate unambiguously. The last one may clash with external crates.
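The four formats can be sketched side by side like this. It reuses a and b from the example above as inline modules, and the hello and all_paths functions are hypothetical, added just so there's something to call.

```rust
mod a {
    pub mod b {
        pub fn hello() -> &'static str {
            "hello from b"
        }
    }

    // All four calls below name the same function, seen from inside crate::a.
    pub fn all_paths() -> [&'static str; 4] {
        [
            crate::a::b::hello(), // absolute, starting from the crate root
            self::b::hello(),     // relative, with an explicit `self`
            super::a::b::hello(), // up to the parent (the crate root), then back down
            b::hello(),           // undecorated relative path
        ]
    }
}

fn main() {
    for s in a::all_paths().iter() {
        println!("{}", s);
    }
}
```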

External crates

External crates are crates you reference in your Cargo.toml. Just doing that will bring them in scope in your code.

A note about extern crate: You might have seen code such as extern crate my_crate; on the interwebs. Fuck that. This is a remnant of Rust 2015 and, for crates listed in your Cargo.toml, does absolutely nothing in edition 2018 and later besides being half placebo half bait.

Let's assume a crate lol containing one function, lmao. There are two ways to refer to it:

Unqualified path like lol::lmao. This isn't a relative path, and is the most common way to refer to it.
Qualified crate path like ::lol::lmao. This is unambiguous and will always refer to an external crate.

We can see there's one ambiguity between lol::lmao looking up either inside a crate named lol or an entity named lol inside the current module. So, how do we solve ✨ c o l l i s i o n s ✨ ?

First of all, rust does mostly the right thing here. For use declarations, it yells at you and errors if anything is ambiguous. This is good. Unqualified paths like lol::lmao used directly in the code tho will default to being module-relative if there's an entity with that name, otherwise external crate. The fix is to use one of the various flavours of unambiguous paths.

One useful tool here is use <a-path> as <identifier> to rename stuff and avoid collisions.
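A small sketch of both escape hatches. There's no lol crate to depend on in a single file, so std stands in for the external crate here. The local max function and the std_max alias are made up for the example.

```rust
use std::cmp::max as std_max; // renamed on import so it cannot clash with ours

// A local function that happens to share its name with std::cmp::max.
// Hypothetical behaviour: it never returns anything below zero.
fn max(a: i32, b: i32) -> i32 {
    std_max(std_max(a, b), 0)
}

fn main() {
    println!("{}", max(-5, -2)); // our local max
    println!("{}", std_max(-5, -2)); // the std one, through its alias
    println!("{}", ::std::cmp::max(-5, -2)); // the std one, through a qualified crate path
}
```

The alias keeps call sites short, and the leading :: form needs no use at all, at the cost of spelling the whole path each time.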

Wildcard and composite paths

Those are only valid as arguments to a use.

Wildcard paths allow importing all names under an entity, usually a module, but it could be a type or a trait. They look like use path::to::entity::*;

Composite paths allow merging various paths into a single use, here's a few examples:

a::b, a::c → use a::{b, c};
a, a::b, a::c → use a::{self, b, c};
a::b::x, a::b::y, a::c → use a::{b::{x, y}, c}; or use a::{b::x, b::y, c};

Idiomatic Rust code tends to import everything (esp. types and traits) explicitly, and go as composite as possible, so you'll often see some ridiculously chonky uses at the start of files. This is normal, don't panic!
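Here's a quick sketch of both forms using only std, so it runs without any dependency. The demo function is a hypothetical helper, it only exists to touch every imported name.

```rust
// Composite path: one `use` for three names; `self` imports `collections` itself.
use std::collections::{self, BTreeMap, HashSet};
// Wildcard path: every public name under std::f64::consts (PI, E, ...).
use std::f64::consts::*;

// Hypothetical helper that uses each imported name once.
fn demo() -> (usize, usize, usize, bool) {
    let mut seen: HashSet<&str> = HashSet::new();
    seen.insert("a");
    seen.insert("a"); // duplicate, the set still holds one element

    let mut counts: BTreeMap<&str, u32> = BTreeMap::new();
    *counts.entry("a").or_insert(0) += 1;

    // `collections` is in scope thanks to `self` in the composite path.
    let queue: collections::VecDeque<u8> = collections::VecDeque::new();

    // `PI` comes from the wildcard import.
    (seen.len(), counts.len(), queue.len(), PI > 3.0)
}

fn main() {
    println!("{:?}", demo());
}
```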

Visibility

By default, entities are visible to any code in their module, or one of their module's children. To allow code from parent modules or other crates to access them, you need to qualify their declaration with pub. In particular, note that this applies to modules too.

By default, pub makes an entity visible to everyone, but you can also control that to get some pretty fine grained options:

pub makes an entity visible to everyone.
pub(in <path>) makes an entity visible to everyone under a given path. This path must start with one of crate, super or self, and it must name a module that contains the entity (an ancestor).
pub(crate) makes an entity visible to any code inside your crate
pub(super) is a shortcut for pub(in super)
pub(self) is a shortcut for pub(in self) and is the same as not using pub to begin with.
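To make those options a bit more concrete, here's a sketch with three nested inline modules. All the module and function names (outer, inner, deeper and friends) are hypothetical. The two commented-out calls at the end are the ones the compiler should reject.

```rust
mod outer {
    pub mod inner {
        pub fn everyone() -> &'static str { "pub" }
        pub(crate) fn crate_only() -> &'static str { "pub(crate)" }
        pub(in crate::outer) fn outer_only() -> &'static str { "pub(in crate::outer)" }
        pub(super) fn parent_only() -> &'static str { "pub(super)" }
        fn private() -> &'static str { "private" }

        pub mod deeper {
            // Child modules can see private items of their ancestors.
            pub fn peek() -> &'static str { super::private() }
        }
    }

    // `outer` is both the parent of `inner` and the path named in `pub(in ...)`.
    pub fn from_outer() -> [&'static str; 2] {
        [inner::outer_only(), inner::parent_only()]
    }
}

fn main() {
    println!("{}", outer::inner::everyone());
    println!("{}", outer::inner::crate_only());
    println!("{}", outer::inner::deeper::peek());
    println!("{:?}", outer::from_outer());
    // outer::inner::parent_only(); // would not compile: only visible inside `outer`
    // outer::inner::private();     // would not compile: private to `inner`
}
```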

Reexports

In general, you can mostly treat use as being a declaration. That means it follows the same visibility rules as any other entity, and that also means you can use pub on it. In particular, pub use ... is how you can reexport code from your modules, either as a shortcut, or to hide your module structure entirely.
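A minimal sketch of the "hide your module structure" case. The shapes and square modules and the square_area alias are invented for the example.

```rust
mod shapes {
    // Private submodule: code outside `shapes` cannot name it.
    mod square {
        pub fn area(side: u32) -> u32 {
            side * side
        }
    }

    // Reexport: `shapes::square_area` now names `shapes::square::area`.
    pub use self::square::area as square_area;
}

fn main() {
    println!("{}", shapes::square_area(4));
    // shapes::square::area(4); // would not compile: `square` is private
}
```

Callers only ever see shapes::square_area, so the square module can be renamed or split later without breaking them.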

🌈 That's all, folks 🌈

edhebi/rust_modules.md