@Kimundi
Last active December 21, 2015 14:48
Crates and the module system

Rust's module system is very powerful, but because of that also somewhat complex. Nevertheless, this section will try to explain every important aspect of it.

Crates

In order to speak about the module system, we first need to define the medium it exists in:

Let's say you've written a program or a library, compiled it, and got the resulting binary. In Rust, the content of all source code that the compiler directly had to compile in order to end up with that binary is collectively called a 'crate'.

For example, for a simple hello world program your crate only consists of this code:

// main.rs

fn main() {
    println("Hello world!");
}

A crate is also the unit of independent compilation in Rust: rustc always compiles a single crate at a time, from which it produces either a library or an executable.

Note that merely using an already compiled library in your code does not make it part of your crate.

The module hierarchy

For every crate, all the code in it is arranged in a hierarchy of modules starting with a single root module. That root module is called the 'crate root'.

All modules in a crate below the crate root are declared with the mod keyword:

// This is the crate root

mod farm {
    // This is the body of module 'farm' declared in the crate root.

    fn chicken() { println("cluck cluck"); }
    fn cow() { println("mooo"); }

    mod barn {
        // Body of module 'barn'

        fn hay() { println("..."); }
    }
}

fn main() {
    println("Hello farm!");
}

As you can see, your module hierarchy is now three modules deep: There is the crate root, which contains your main() function, and the module farm. The module farm also contains two functions and a third module barn, which contains a function hay.
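To make the nesting visible, here is a minimal sketch written in present-day Rust syntax, which differs from the 2013 syntax used in the rest of this tutorial. It relies on the standard `module_path!()` macro to report where each function lives. The `where_am_i` functions are hypothetical names added for illustration, and they are marked `pub` only so that `main` can call them (visibility is covered in the next section).

```rust
mod farm {
    // Returns the path of the module this function is defined in.
    pub fn where_am_i() -> &'static str {
        module_path!()
    }

    pub mod barn {
        // Same helper, one level deeper in the hierarchy.
        pub fn where_am_i() -> &'static str {
            module_path!()
        }
    }
}

fn main() {
    // The leading path segment is the crate name, which depends on how
    // the file is built, so only the tail of each path is predictable.
    println!("{}", farm::where_am_i());
    println!("{}", farm::barn::where_am_i());
}
```

The second path ends in `farm::barn`, mirroring the way `barn` is nested inside `farm` in the source.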

(In case you have already stumbled over extern mod: it isn't directly related to a bare mod. We'll get to it later.)

Paths and visibility

We've now defined a nice module hierarchy. But how do we access the items in it from our main function? One way is to simply fully qualify them:

mod farm {
    fn chicken() { println("cluck cluck"); }
    // ...
}

fn main() {
    println("Hello chicken!");

    ::farm::chicken(); // Won't compile yet, see further down
}

The ::farm::chicken construct is what we call a 'path'.

Because it starts with ::, it's also a 'global path', which qualifies an item by its full path in the module hierarchy relative to the crate root.

If the path were to start with a regular identifier, like farm::chicken, it would be a 'local path' instead. We'll get to them later.

Now, if you actually try to compile this code example, you'll get an unresolved name: 'farm::chicken' error. That's because, by default, items (fn, struct, static, mod, ...) are only visible inside the module they are defined in.

To make them visible outside their containing modules, you need to mark them public with pub:

mod farm {
    pub fn chicken() { println("cluck cluck"); }
    pub fn cow() { println("mooo"); }
    // ...
}

fn main() {
    println("Hello chicken!");
    ::farm::chicken(); // This compiles now
}

Visibility restrictions in Rust exist only at module boundaries. This is quite different from most object-oriented languages that also enforce restrictions on objects themselves. That's not to say that Rust doesn't support encapsulation: both struct fields and methods can be private. But this encapsulation is at the module level, not the struct level.

For convenience, fields are public by default, and can be made private with the priv keyword:

mod farm {
# pub type Chicken = int;
# type Cow = int;
# struct Human(int);
# impl Human { fn rest(&self) { } }
# pub fn make_me_a_farm() -> Farm { Farm { chickens: ~[], cows: ~[], farmer: Human(0) } }
    pub struct Farm {
        priv chickens: ~[Chicken],
        priv cows: ~[Cow],
        farmer: Human
    }

    impl Farm {
        fn feed_chickens(&self) { ... }
        fn feed_cows(&self) { ... }
        pub fn add_chicken(&self, c: Chicken) { ... }
    }

    pub fn feed_animals(farm: &Farm) {
        farm.feed_chickens();
        farm.feed_cows();
    }
}

fn main() {
    let f = make_me_a_farm();
    f.add_chicken(make_me_a_chicken());
    farm::feed_animals(&f);
    f.farmer.rest();

    // This wouldn't compile:
    f.feed_cows();
    f.cows.len();
}
# fn make_me_a_farm() -> farm::Farm { farm::make_me_a_farm() }
# fn make_me_a_chicken() -> farm::Chicken { 0 }

Note: Visibility rules are currently buggy and not fully defined; you might have to add or remove pub along a path until it works.
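The module-level encapsulation described above can also be sketched in present-day Rust. Note the assumptions: current Rust has no `priv` keyword and struct fields are private unless marked `pub`, which is the opposite default from the one this tutorial describes. The types and the `census` and `first_chicken` helpers are hypothetical, chosen only to keep the sketch self-contained.

```rust
mod farm {
    pub struct Farm {
        chickens: Vec<u32>, // private: no `pub` on the field
        pub farmer: String, // public field
    }

    impl Farm {
        pub fn new(farmer: &str) -> Farm {
            Farm { chickens: Vec::new(), farmer: farmer.to_string() }
        }

        pub fn add_chicken(&mut self, id: u32) {
            self.chickens.push(id);
        }

        // Private method: callable anywhere inside `farm`, nowhere outside.
        fn count_chickens(&self) -> usize {
            self.chickens.len()
        }
    }

    // Free functions in the same module may use private methods...
    pub fn census(farm: &Farm) -> usize {
        farm.count_chickens()
    }

    // ...and private fields, because privacy is checked per module.
    pub fn first_chicken(farm: &Farm) -> Option<u32> {
        farm.chickens.first().copied()
    }
}

fn main() {
    let mut f = farm::Farm::new("Alice");
    f.add_chicken(7);
    println!("{} has {} chicken(s)", f.farmer, farm::census(&f));

    // Neither of these would compile outside of `farm`:
    // f.count_chickens();
    // f.chickens.len();
}
```

The point of the sketch is that `census` and `first_chicken` are not methods of `Farm`, yet they can reach its private parts simply because they live in the same module.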

Files and modules

One important aspect of Rust's module system is that source files are not important: You define a module hierarchy, populate it with all your definitions, define visibility, maybe put in a fn main(), and that's it: No need to think about source files.

The only file that's relevant is the one that contains the body of your crate root, and it's only relevant because you have to pass that file to rustc to compile your crate.

And in principle, that's all you need: You can write any Rust program as one giant source file that contains your crate root and everything below it in mod ... { ... } declarations.

However, in practice you usually want to split your code up into multiple source files to make it more manageable. In order to do that, Rust allows you to move the body of any module into its own source file, which works like this:

If you declare a module without its body, like mod foo;, the compiler looks in the current directory for the files foo.rs and foo/mod.rs. If it finds either, it uses the content of that file as the body of the module. If it finds both, that's a compile error.

So, if we want to move the content of mod farm into its own file, it would look like this:

// main.rs - contains body of the crate root

mod farm; // Compiler will look for 'farm.rs' and 'farm/mod.rs'

fn main() {
    println("Hello farm!");
    ::farm::cow();
}

// farm.rs - contains body of module 'farm' in the crate root

pub fn chicken() { println("cluck cluck"); }
pub fn cow() { println("mooo"); }

pub mod barn {
    pub fn hay() { println("..."); }
}

# fn main() { }

So, in short, mod foo; is just syntactic sugar for mod foo { /* include content of foo.rs or foo/mod.rs here */ }.

This also means that having two or more identical mod foo; somewhere in your crate hierarchy is generally a bad idea, just like copy-and-paste-ing a module into two or more places is one. Both will result in duplicate and mutually incompatible definitions.

Importing names into the local scope

Always referring to definitions in other modules with their global path gets old really fast, so Rust has a way to import them into the local scope of your module: use-statements.

They work like this: At the beginning of any module body, fn body, or any other block you can write a list of use-statements, consisting of the keyword use and a global path to an item without the :: prefix. For example:

use farm::cow;
# mod farm { pub fn cow() { println("I'm a hidden ninja cow!") } }
# fn main() { cow() }

Now, if you refer to cow anywhere in the corresponding module/block, rustc will know where to find it.

More generally, for each name you use in your code, rustc will first look at all names that are defined locally, and only if that fails look at all names you brought in scope with use.

In other words, use-d items are shadowed by local definitions:

# mod farm { pub fn cow() { println("Hidden ninja cow is hidden.") } }
use farm::cow;
fn cow() { println("Mooo!") }

fn main() {
    cow() // resolves to the locally defined cow() function
}

To make this behavior more obvious, the rule is that use-statements always need to be written before any declaration, like in the example above.

One odd consequence of that rule is that use statements also go in front of any mod declaration, even if they refer to things inside them:

use farm::cow;
mod farm {
    pub fn cow() { println("Moooooo?") }
}

fn main() { cow() }
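For comparison, here is a small sketch of the same shadowing idea in present-day Rust. It rests on an assumption worth stating: as far as I know, current compilers reject an explicit `use farm::cow;` next to a local `fn cow` as a duplicate definition, and only glob imports are shadowed by local definitions. The sketch therefore uses a glob import, and the functions return strings instead of printing so the result is easy to check.

```rust
mod farm {
    pub fn cow() -> &'static str { "farm cow" }
    pub fn chicken() -> &'static str { "farm chicken" }
}

// Glob import: brings both `cow` and `chicken` into scope.
use farm::*;

// A local definition takes precedence over the glob-imported `cow`.
fn cow() -> &'static str { "local cow" }

fn main() {
    println!("{}", cow());     // resolves to the local function
    println!("{}", chicken()); // resolves to the import from `farm`
}
```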

It's also possible to use things relatively: Adding a super:: in front of the path will start in the parent module, while adding a self:: prefix will start in the current module.
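A minimal sketch of both prefixes, written in present-day Rust syntax (the `visitor` and `lunch` functions are hypothetical names added for illustration):

```rust
mod farm {
    pub fn cow() -> &'static str { "mooo" }

    pub mod barn {
        // `super::` starts in the parent module, which is `farm` here.
        use super::cow;

        pub fn hay() -> &'static str { "..." }

        pub fn visitor() -> &'static str { cow() }
    }

    // `self::` starts in the current module, which is also `farm`.
    use self::barn::hay;

    pub fn lunch() -> &'static str { hay() }
}

fn main() {
    println!("{}", farm::barn::visitor());
    println!("{}", farm::lunch());
}
```

Relative paths like these keep working if `farm` is later moved somewhere else in the hierarchy, since neither import mentions where `farm` itself lives.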

This is what our farm example looks like with use statements:

use farm::chicken;
use farm::cow;
use farm::barn;

mod farm {
    pub fn chicken() { println("cluck cluck"); }
    pub fn cow() { println("mooo"); }

    pub mod barn {
        pub fn hay() { println("..."); }
    }
}

fn main() {
    println("Hello farm!");

    // Can now refer to those names directly:
    chicken();
    cow();
    barn::hay();
}

There also exist two short forms for use-ing multiple names at once:

  1. Explicitly importing multiple names as the last element of a use path:
use farm::{chicken, cow};
# mod farm {
#     pub fn cow() { println("Did I already mention how hidden and ninja I am?") }
#     pub fn chicken() { println("I'm Bat-chicken, guardian of the hidden tutorial code.") }
# }
# fn main() { cow(); chicken() }
  2. Importing everything in a module with a wildcard:
use farm::*;
# mod farm {
#     pub fn cow() { println("Bat-chicken? What a stupid name!") }
#     pub fn chicken() { println("Says the 'hidden ninja' cow.") }
# }
# fn main() { cow(); chicken() }

However, that's not all. You can also rename an item while you're bringing it into scope:

use egg_layer = farm::chicken;
# mod farm { pub fn chicken() { println("Laying eggs is fun!")  } }
// ...

fn main() {
    egg_layer();
}

Locally renaming an item that way creates an alias: An alternate way to access the same item, which is also still accessible under its regular path, and which is interchangeable with it.
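The following sketch shows the alias and the regular path side by side. It assumes present-day Rust, where the rename is spelled `use path as name;` rather than `use name = path;` as in this tutorial.

```rust
mod farm {
    pub fn chicken() -> &'static str { "cluck cluck" }
}

// Present-day spelling of the rename shown above.
use farm::chicken as egg_layer;

fn main() {
    // Both names refer to the same function.
    println!("{}", egg_layer());
    println!("{}", farm::chicken());
}
```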

Reexporting names

It is also possible to reexport items to be accessible under your module.

For that, you write pub use:

mod farm {
    pub use self::barn::hay;

    pub fn chicken() { println("cluck cluck"); }
    pub fn cow() { println("mooo"); }

    mod barn {
        pub fn hay() { println("..."); }
    }
}

fn main() {
    farm::chicken();
    farm::cow();
    farm::hay();
}

Just like in normal use statements, the exported names merely represent an alias to the same thing and can also be renamed.

The above example also demonstrates what you can use pub use for: The nested barn module is private, but the pub use allows users of the module farm to access a function from barn without needing to know that barn exists.

In other words, you can use reexports to decouple a public API from its internal implementation.
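As a sketch of that decoupling, a reexport can also give the item a different public name. This is written in present-day Rust syntax, and the name `fodder` is a hypothetical choice made for illustration.

```rust
mod farm {
    // The public name stays `fodder` no matter where the implementation lives.
    pub use self::barn::hay as fodder;

    // Private module: an implementation detail of `farm`.
    mod barn {
        pub fn hay() -> &'static str { "hay" }
    }
}

fn main() {
    println!("{}", farm::fodder());

    // This would not compile, because `barn` is private:
    // farm::barn::hay();
}
```

If `hay` were later moved out of `barn`, only the `pub use` line would need to change, and callers of `farm::fodder` would be unaffected.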

Using libraries

So far we've only talked about how to define and structure your own crate.

However, most code out there will want to use preexisting libraries, as there really is no reason to start from scratch each time you start a new project.

In Rust terminology, we need a way to refer to other crates.

For that, Rust offers you the extern mod declaration:

extern mod extra;
// extra ships with Rust, you'll find more details further down.

fn main() {
    // The rational number '1/2':
    let one_half = ::extra::rational::Ratio::new(1, 2);
}

Despite its name, extern mod is a distinct construct from regular mod declarations: A statement of the form extern mod foo; will cause rustc to search for the crate foo, and if it finds a matching binary it lets you use it from inside your crate.

The effect it has on your module hierarchy mirrors aspects of both mod and use:

  • Like mod, it causes rustc to actually emit code: The linkage information the binary needs to use the library foo.

  • But like use, all extern mod statements that refer to the same library are interchangeable, as each one really just presents an alias to an external module (the crate root of the library you're linking against).

Remember how use-statements have to go before local declarations because the latter shadow the former? Well, extern mod statements also have their own rules in that regard: Both use and local declarations can shadow them, so the rule is that extern mod has to go in front of both use and local declarations.

Which can result in something like this:

extern mod extra;

use farm::dog;
use extra::rational::Ratio;

mod farm {
    pub fn dog() { println("woof"); }
}

fn main() {
    farm::dog();
    let a_third = Ratio::new(1, 3);
}

It's a bit weird, but it's the result of shadowing rules that have been set that way because they model most closely what people expect to shadow.

Package ids

If you use extern mod, by default rustc will look for libraries in the library search path (which you can extend with the -L switch).

However, Rust also ships with rustpkg, a package manager that is able to automatically download and build libraries if you use it for building your crate. How it works is explained here, but for this tutorial it's only important to know that you can optionally annotate an extern mod statement with a package id that rustpkg can use to identify it:

extern mod rust = "github.com/mozilla/rust"; // pretend rust is just a library

A minimal example for extern mod

Now for something that you can actually compile yourself.

We have these two files:

// world.rs
#[link(name = "world", vers = "1.0")];
pub fn explore() -> &'static str { "world" }

// main.rs
extern mod world;
fn main() { println("hello " + world::explore()); }

Now compile and run like this (adjust to your platform if necessary):

> rustc --lib world.rs  # compiles libworld-<HASH>-1.0.so
> rustc main.rs -L .    # compiles main
> ./main
"hello world"

Notice that the library produced contains the version in the file name as well as an inscrutable string of alphanumerics. These are both part of Rust's library versioning scheme. The alphanumerics are a hash representing the crate metadata.

The standard library and the prelude

While reading the examples in this tutorial, you might have asked yourself where all those magical predefined items like println() are coming from.

The truth is, there's nothing magical about them: They are all defined normally in the std library, which is a crate that ships with Rust.

The only magical thing that happens is that rustc automatically inserts this line into your crate root:

extern mod std;

As well as this line into every module body:

use std::prelude::*;

The role of the prelude module is to re-export common definitions from std.

This allows you to use common types and functions like Option<T> or println without needing to import them. And if you need something from std that's not in the prelude, you just have to import it with a use statement.

For example, it re-exports println, which is defined as std::io::println:

use puts = std::io::println;

fn main() {
    println("println is imported per default.");
    puts("Doesn't hinder you from importing it under a different name yourself.");
    ::std::io::println("Or from not using the automatic import.");
}

Both auto-insertions can be disabled with an attribute if necessary:

// In the crate root:
#[no_std];
// In any module:
#[no_prelude];
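To illustrate the difference between prelude names and everything else in std, here is a minimal sketch in present-day Rust. The assumptions: `std::io::println` no longer exists in current Rust, so the sketch uses `Option` and `Vec` as the prelude examples and `HashMap` as the item that needs an explicit import. The `count_legs` function is a hypothetical helper.

```rust
// Not in the prelude, so it needs an explicit import.
use std::collections::HashMap;

fn count_legs() -> Option<u32> {
    let mut legs: HashMap<&str, u32> = HashMap::new();
    legs.insert("chicken", 2);
    legs.insert("cow", 4);

    // `Option`, `Some`, and `Vec` are available without any `use`.
    let animals: Vec<&str> = vec!["chicken", "cow", "cow"];
    let mut total = 0;
    for animal in animals {
        // `?` returns `None` early if an animal is missing from the map.
        total += *legs.get(animal)?;
    }
    Some(total)
}

fn main() {
    // The prelude name `Option` is an alias for this fully qualified path.
    let total: ::std::option::Option<u32> = count_legs();
    println!("{:?}", total);
}
```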

The standard library in detail

The Rust standard library provides runtime features required by the language, including the task scheduler and memory allocators, as well as library support for Rust built-in types, platform abstractions, and other commonly used features.

std includes modules corresponding to each of the integer types, each of the floating point types, the bool type, tuples, characters, strings, vectors, managed boxes, owned boxes, and unsafe and borrowed pointers. Additionally, std provides some pervasive types (option and result), task creation and communication primitives, platform abstractions (os and path), basic I/O abstractions (io), containers like hashmap, common traits (kinds, ops, cmp, num, to_str, clone), and complete bindings to the C standard library (libc).

The full documentation for std can be found here: standard library.

The extra library

Rust also ships with the extra library, an accumulation of useful things that are not important enough to deserve a place in the standard library. You can use them by linking to extra with an extern mod extra;.

Right now extra contains those definitions directly, but in the future it will likely just re-export a bunch of 'officially blessed' crates that get managed with rustpkg.

What next?

Now that you know the essentials, check out any of the additional tutorials on individual topics.

There is further documentation on the wiki; however, it tends to be even more out of date than this document.


'concept' vs concept ? Rust vs rust ?


To have a nested directory structure for your source files, you can nest mods:

mod poultry {
    mod chicken;
    mod turkey;
}

The compiler will now look for poultry/chicken.rs and poultry/turkey.rs, and export their content in poultry::chicken and poultry::turkey. You can also provide a poultry.rs to add content to the poultry module itself.


A typical crate file declares attributes associated with the crate that may affect how the compiler processes the source. Crate attributes specify metadata used for locating and linking crates, the type of crate (library or executable), and control warning and error behavior, among other things. Crate files additionally declare the external crates they depend on as well as any modules loaded from other files.

// Crate linkage metadata
#[link(name = "farm", vers = "2.5", author = "mjh")];

// Make a library ("bin" is the default)
#[crate_type = "lib"];

// Turn on a warning
#[warn(non_camel_case_types)]

// Link to the standard library
extern mod std;

// Load some modules from other files
mod cow;
mod chicken;
mod horse;

fn main() {
    ...
}

Compiling this file will cause rustc to look for files named cow.rs, chicken.rs, and horse.rs in the same directory as the crate file, compile them all together, and, based on the presence of the crate_type = "lib" attribute, output a shared library or an executable. (If the line #[crate_type = "lib"]; were omitted, rustc would create an executable.)

The #[link(...)] attribute provides meta information about the module, which other crates can use to load the right module. More about that later.


When a comma-separated list of name/value pairs appears after extern mod, the compiler front-end matches these pairs against the attributes provided in the link attribute of the crate file. The front-end will only select this crate for use if the actual pairs match the declared attributes. You can provide a name value to override the name used to search for the crate.

Our example crate declared this set of link attributes:

#[link(name = "farm", vers = "2.5", author = "mjh")];

Which you can then link with any (or all) of the following:

extern mod farm;
extern mod my_farm (name = "farm", vers = "2.5");
extern mod my_auxiliary_farm (name = "farm", author = "mjh");

If any of the requested metadata do not match, then the crate will not be compiled successfully.

@sfackler

"Rusts" -> "Rust's" in the first line.

@adrientetar

I would send a PR for this already.

@emberian

Replace "Yeah, it's horrible, but" with "It's a bit weird, but"

@Kimundi
Author

Kimundi commented Aug 24, 2013

irc notes:

<doomlord> good information there,thanks.  For someone lazily speedreading, and assuming eveything works like in other programs, maybe add words to the effect of "remember use paths are relative to the crate root by default", early on in the block "files and modules"
<doomlord> lazily speedreading and assuming everything works how i expect, thats me :)
<cmr> I know the #[path="foo/mod.rs"] was common
<doomlord> perhaps add to the files/modules section an example showing a.rs->b.rs   b.rs-> c.rs  showing the crate root relative thing    when b tries to use c :)
<nsf> kimundi: I see, I know how modules work in rust (at least I think so), what I meant is that it's unintuitive and confusing, hence mentioning it early is a good idea

@dobkeratops

i learned something reading it thanks.
Suggestion:-
the bit i was always getting wrong when starting out was not realizing things are crate relative with 'use'
perhaps you can explain that as quickly as possible under the major heading 'files/modules'.
use directives are crate relative is the first piece of missing info they need to see.

IMO what a new user coming from a.n.other language has likely tried is this:-
a imports b
b import c
"oh no, why can't i actually use things from c in b?!" .. expecting it to work like import or c headers etc.

So maybe add that to the farm example- a 3rd file being brought in and show how the 2nd would use it... then explain further how and why that works ("One important aspect about Rusts module system is that source files are not important: .. etc)
