So, how tf do rust modules work? and why tf are they not doing what I want right now???
Note: A bunch of things in here are defined by cargo and a few others by rust. I won't generally care about that distinction because two languages without a standard build system is more than enough for my sanity.
Feel free to ask me on Twitter if you have questions or remarks.
- One common misconception is that cargo eagerly scans everything under <root>/src/. That's not the case. Cargo will only read/discover files as needed.
- Rust is one of those weird languages that recognize the existence of files. Filenames are load bearing.
- mod vs use: mod is about structure and declaring what your modules are. use is about visibility and bringing names into scope.
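To make the mod vs use split concrete, here's a minimal single-file sketch. The module is written inline so it fits in one file, and the names `greetings` and `hello` are made up for illustration.

```rust
// `mod` declares a module: this is the structure part.
mod greetings {
    // `pub` so code outside `greetings` is allowed to see the function.
    pub fn hello() -> String {
        String::from("hello")
    }
}

// `use` declares nothing new; it only brings an existing name into scope.
use greetings::hello;

fn main() {
    // Same function both times: once by its full path, once by the short name.
    println!("{}", greetings::hello());
    println!("{}", hello());
}
```

Delete the `use` line and `greetings::hello()` still works; delete the `mod` block and nothing does.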
So, I said cargo doesn't look up files eagerly, but it does for some. Here are the various entry points. You need at least one of them, but you can mix & match:
- <root>/src/lib.rs: root module for a library
- <root>/src/main.rs: root module for an application
- <root>/src/bin/*.rs, <root>/src/bin/*/main.rs: alternative binaries for an application or lib
- <root>/benches/*.rs, <root>/benches/*/main.rs: benchmarks
- <root>/examples/*.rs, <root>/examples/*/main.rs: examples
- <root>/tests/*.rs, <root>/tests/*/main.rs: integration tests
Note: <root> here means the package root, aka the directory containing Cargo.toml.
Ok so, let's assume you have your root main.rs or lib.rs. Which one doesn't really matter, but you might be trying to write one of those exotic projects with multiple files in them, so let's do that. First of all, keep in mind one file means one module. Rust modules aren't like C++ namespaces in that they're closed: you cannot reopen them later.
Let's ignore use and visibility for now and focus on adding new modules. I'll name the current module current_mod and its root <current_dir>. If you just started, <current_dir> is <root>/src and current_mod is somewhat anonymous, but you can refer to it as crate. Let's also name the file where that module is defined <current_file>. At the start that'll be either <root>/src/lib.rs or <root>/src/main.rs.
Let's try and create a new module current_mod::MODNAME. You have two options:
- Use an inline mod declaration using mod MODNAME { /* ... */ }. This declares a module named MODNAME, and its content will be whatever is inside the braces. This doesn't solve your problem of adding new files but is useful when you want modules for namespacing purposes.
- Use a dedicated file for that module. Inside <current_file>, use mod MODNAME;. It's at this point and only at this point that cargo starts looking for new source files. In particular it's gonna look into two possible places:
  - <current_dir>/MODNAME.rs
  - <current_dir>/MODNAME/mod.rs
The former is what the cool kids do, and the latter is the historical one. It doesn't matter which one you prefer. Now, one nuance that's somewhat intuitive but can be a bitch when you don't realize it: regardless of which form you use, the new module's directory for both of those is <current_dir>/MODNAME. Yes, that means they don't have to be inside each other.
We have a library crate, and we want the modules crate::a, crate::a::b, crate::x and crate::x::y, and use dedicated files for all of them. Here's the file hierarchy:
<root>
├─ Cargo.toml
└─ src
   ├─ lib.rs
   ├─ a.rs
   ├─ a
   │  └─ b.rs
   └─ x
      ├─ mod.rs
      └─ y.rs
The mod declarations are as follows:
// inside lib.rs
mod a;
mod x;
// inside a.rs
mod b;
// inside x/mod.rs
mod y;
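If you want to poke at that structure without creating files, here's a rough single-file equivalent using inline modules. Each braced body stands in for the file named in its comment. The `whoami` functions and the `pub` markers are my additions so that `main` has something to call (visibility is covered further down).

```rust
// Stands in for src/a.rs
mod a {
    // Stands in for src/a/b.rs
    pub mod b {
        pub fn whoami() -> &'static str {
            "crate::a::b"
        }
    }

    pub fn whoami() -> &'static str {
        "crate::a"
    }
}

// Stands in for src/x/mod.rs
mod x {
    // Stands in for src/x/y.rs
    pub mod y {
        pub fn whoami() -> &'static str {
            "crate::x::y"
        }
    }

    pub fn whoami() -> &'static str {
        "crate::x"
    }
}

fn main() {
    // Four modules, four distinct paths, regardless of a.rs vs x/mod.rs style.
    println!("{}", a::whoami());
    println!("{}", a::b::whoami());
    println!("{}", x::whoami());
    println!("{}", x::y::whoami());
}
```

Turning this back into the multi-file layout means moving each body into its file and shrinking the declaration to `mod NAME;` (or `pub mod NAME;`).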
Note: I'm gonna assume at least rust 2018. If you're using rust 2015 (why ??), I'll go briefly over that, just so that people stop getting baited by all the rust 2015 examples online, but you should really upgrade. If you're wondering, that's in your Cargo.toml under package.edition.
Ok so that's where use and pub come in. We'll reuse the structure from the example above for example purposes.
Assume you're inside the module crate::a and you want to refer to an entity somewhere. There are a few possible path formats:
- Absolute path inside your crate. They start with crate, so b could be crate::a::b.
- Relative path under your current module. They start with self, so b could be self::b. Note there's no way to go "up".
- Relative path to your module's parent. They start with super, so b could be super::a::b. You are allowed to use multiple super in a row to keep going up modules.
- Undecorated relative path under your current module, so b would be just b.
The first 3 formats with a keyword prefix will always refer to paths in your crate unambiguously. The last one may clash with external crates.
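Here are the four path formats side by side, as a small sketch written from inside `crate::a`. The module is inline to keep it in one file, and the function names (`name`, `absolute`, and friends) are invented for the example.

```rust
mod a {
    pub mod b {
        pub fn name() -> &'static str {
            "b"
        }
    }

    // Absolute path: starts at the crate root.
    pub fn absolute() -> &'static str {
        crate::a::b::name()
    }

    // Relative path anchored at the current module.
    pub fn via_self() -> &'static str {
        self::b::name()
    }

    // Relative path anchored at the parent (here the crate root), then back down.
    pub fn via_super() -> &'static str {
        super::a::b::name()
    }

    // Undecorated relative path: `b` is looked up in the current scope.
    pub fn undecorated() -> &'static str {
        b::name()
    }
}

fn main() {
    // All four resolve to the same function.
    println!("{}", a::absolute());
    println!("{}", a::via_self());
    println!("{}", a::via_super());
    println!("{}", a::undecorated());
}
```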
External crates are crates you reference in your Cargo.toml. Just doing that will bring them in scope in your code.
A note about extern crate: You might have seen code such as extern crate my_crate; on the interwebs. Fuck that. This is a remnant of Rust 2015 and, for crates listed in your Cargo.toml, does absolutely nothing in edition 2018 and later besides being half placebo, half bait.
Let's assume a crate lol containing one function, lmao. There are two ways to refer to it:
- Unqualified path like lol::lmao. This isn't a relative path, and is the most common way to refer to it.
- Qualified crate path like ::lol::lmao. This is unambiguous and will always refer to an external crate.
We can see there's one ambiguity between lol::lmao looking up either inside a crate named lol or an entity named lol inside the current module. So, how do we solve ✨ c o l l i s i o n s ✨ ?
First of all, rust does mostly the right thing here. For use declarations, it yells at you and errors if anything is ambiguous. This is good. Qualified entities inside the code tho will default to being module-relative if there's an entity with that name, otherwise external crate. The fix is to use one of the various flavours of unambiguous paths.
One useful tool here is use <a-path> as <identifier> to rename stuff and avoid collisions.
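A small sketch of both tools. Since lol isn't a real crate, `std` plays the external crate here, and the `metric` / `imperial` modules are made up to get two items with the same name.

```rust
mod metric {
    pub fn unit() -> &'static str {
        "km"
    }
}

mod imperial {
    pub fn unit() -> &'static str {
        "mi"
    }
}

// Both functions are called `unit`; renaming lets them coexist in one scope.
use imperial::unit as imperial_unit;
use metric::unit as metric_unit;

// The leading `::` makes this a qualified crate path: it names the external
// crate `std`, not whatever else might be called `std` nearby.
fn largest(a: i32, b: i32) -> i32 {
    ::std::cmp::max(a, b)
}

fn main() {
    println!("{} vs {}", metric_unit(), imperial_unit());
    println!("{}", largest(3, 7));
}
```

Without the `as`, the second `use` would be rejected because the name `unit` would be defined twice in the same scope.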
Wildcard and composite paths are only valid as arguments to a use.
Wildcard paths allow importing all names under an entity, usually a module, but it could be a type or a trait. They look like use path::to::entity::*;
Composite paths allow merging various paths into a single use. Here are a few examples:
- a::b, a::c → use a::{b, c};
- a, a::b, a::c → use a::{self, b, c};
- a::b::x, a::b::y, a::c → use a::{b::{x, y}, c}; or use a::{b::x, b::y, c};
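Here's what a wildcard and a composite use look like against real paths from the standard library. The helper functions (`count_words`, `describe`, `last_two`) are only there to exercise the imported names and are my own invention.

```rust
// Composite path: the module itself (`self`) plus a nested group, in one `use`.
use std::collections::{self, hash_map::{Entry, HashMap}};
// Wildcard path: every variant of the enum (`Less`, `Equal`, `Greater`).
use std::cmp::Ordering::*;

fn count_words(text: &str) -> HashMap<String, usize> {
    let mut counts: HashMap<String, usize> = HashMap::new();
    for word in text.split_whitespace() {
        // `Entry` and `HashMap` both came from the nested group above.
        match counts.entry(word.to_string()) {
            Entry::Occupied(mut slot) => *slot.get_mut() += 1,
            Entry::Vacant(slot) => {
                slot.insert(1);
            }
        }
    }
    counts
}

fn describe(a: i32, b: i32) -> &'static str {
    // The variants are usable unqualified thanks to the wildcard import.
    match a.cmp(&b) {
        Less => "less",
        Equal => "equal",
        Greater => "greater",
    }
}

fn last_two(items: &[i32]) -> Vec<i32> {
    // `collections` is in scope because of `self` in the composite path.
    let mut window = collections::VecDeque::new();
    for &item in items {
        window.push_back(item);
        if window.len() > 2 {
            window.pop_front();
        }
    }
    window.into_iter().collect()
}

fn main() {
    println!("{:?}", count_words("a b a").get("a"));
    println!("{}", describe(1, 2));
    println!("{:?}", last_two(&[1, 2, 3]));
}
```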
Idiomatic Rust code tends to import everything (esp. types and traits) explicitly, and go as composite as possible, so you'll often see some ridiculously chonky uses at the start of files. This is normal, don't panic!
By default, entities are visible to any code in their module, or one of their module's children. To allow code from parent modules or other crates to access them, you need to qualify their declaration with pub. In particular, note that this applies to modules too.
By default, pub makes an entity visible to everyone, but you can also control that to get some pretty fine-grained options:
- pub makes an entity visible to everyone.
- pub(in <path>) makes an entity visible to everyone under a given path. This path must start with one of crate, super, or self, and must name an ancestor of the current module.
- pub(crate) makes an entity visible to any code inside your crate.
- pub(super) is a shortcut for pub(in super).
- pub(self) is a shortcut for pub(in self) and is the same as not using pub to begin with.
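A quick sketch of those visibility levels in action. The `outer` / `inner` modules and all function names are hypothetical; the interesting part is who is allowed to call what.

```rust
mod outer {
    pub mod inner {
        // Visible to everyone who can reach `inner`.
        pub fn everyone() -> &'static str {
            "pub"
        }

        // Visible anywhere inside this crate, but not to other crates.
        pub(crate) fn crate_only() -> &'static str {
            "pub(crate)"
        }

        // Visible to the parent module (`outer`) and anything under it.
        pub(super) fn parent_only() -> &'static str {
            "pub(super)"
        }

        // Same reach as above here, spelled with an explicit path.
        pub(in crate::outer) fn outer_only() -> &'static str {
            "pub(in crate::outer)"
        }

        // No `pub`: visible only inside `inner` (and its children).
        fn private() -> &'static str {
            "private"
        }

        pub fn call_private() -> &'static str {
            private()
        }
    }

    // `outer` is the parent of `inner`, so the restricted functions are reachable.
    pub fn relay() -> [&'static str; 2] {
        [inner::parent_only(), inner::outer_only()]
    }
}

fn main() {
    println!("{}", outer::inner::everyone());
    println!("{}", outer::inner::crate_only());
    println!("{}", outer::inner::call_private());
    println!("{:?}", outer::relay());
    // outer::inner::parent_only(); // would not compile: the crate root is outside `outer`
}
```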
In general, you can mostly treat use as being a declaration. That means it follows the same visibility rules as any other entity, and that also means you can use pub on it. In particular, pub use ... is how you can reexport code from your modules, either as a shortcut, or to hide your module structure entirely.
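A minimal reexport sketch, assuming a made-up `shapes` module that wants to keep its internal `square` module hidden:

```rust
mod shapes {
    // Private module: code outside `shapes` cannot name `shapes::square`.
    mod square {
        pub fn area(side: u32) -> u32 {
            side * side
        }
    }

    // Reexport: `area` becomes reachable as `shapes::area`,
    // and callers never learn that `square` exists.
    pub use self::square::area;
}

fn main() {
    println!("{}", shapes::area(3));
    // shapes::square::area(3); // would not compile: `square` is private
}
```

The internal layout can now be reshuffled freely as long as `shapes::area` keeps pointing at something.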
That's all, folks!