Rust Module System Encourages Poor Practices (Comparing to Go)

Note that I'm not gonna argue that it's confusing, hard to understand, etc. Those things are subjective, and even though I agree that it's not too intuitive, it's still totally learnable and not a big deal. In this article I'm gonna talk about matters which are more objective.

Also note that this is not a tutorial on how the Rust module system works; I'm assuming the reader is already familiar with it.

Rust: Workspaces, Crates and Modules

I like Rust-the-language a lot; I've spent an enormous amount of time in my life debugging memory corruption bugs in C, and so I'm fascinated by the ideas behind Rust, which gives us an incredible level of memory safety without compromising performance. That's truly unique. However, there's one thing in Rust that really bothers me: the design of crates and modules.

As you probably know, Rust projects are organized as workspaces, crates and modules. A workspace is what you'd typically check into a single repository; it contains one or more crates.

The unit of compilation is a crate. A crate can depend on other crates, and that dependency graph must be acyclic, i.e. two crates can't depend on each other. That makes total sense: this is what makes it easy for crates to be compiled independently, potentially in parallel, etc.

A crate can contain one or more modules, and if at least one module has changed since the last build, the whole crate needs to be rebuilt (because, again, the unit of compilation is a crate). To be fair, let me mention that there is “incremental compilation”, which does help somewhat when a single module out of a hundred has changed, but it's by no means 100 times faster than compiling the crate from scratch. At best, it's about 30-40% faster, which means that as a crate grows in size, compiling and recompiling it becomes significantly slower.

“Oh but you should split your project into multiple crates”, I hear someone saying now. And yeah, I don't disagree: I definitely should, and I do. That's not my problem with Rust. My problem is that, in a large workspace, instead of encouraging numerous reasonably-sized crates, Rust subtly encourages big crates. Let me explain.

Modules within a crate can be organized as a tree: a module can contain submodules, every submodule can in turn contain its own sub-submodules, etc. So on the surface it seems to be neatly organized. And all those modules can use each other in whatever way: unlike crates, cyclic dependencies between modules within a crate are not a problem, so e.g. a module A can use a sub-submodule A/B/C, which in turn uses A. Those two features actively promote adding more and more modules to a crate, since it's very convenient (cyclic deps are allowed, no need to think about it) and looks neat (organized as a tree).
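To make the cyclic part concrete, here is a minimal sketch of a single crate in which a module and its sub-submodule use each other. All the names here (a, b, c, name, describe, leaf_label) are made up for illustration; the point is only that the compiler accepts the cycle without complaint:

```rust
// One crate, one file: module `a` and its sub-submodule `a::b::c`
// depend on each other. All names are hypothetical.
mod a {
    pub mod b {
        pub mod c {
            // The sub-submodule reaches back up to its grandparent `a`...
            pub fn leaf_label() -> String {
                format!("c (child of {})", super::super::name())
            }
        }
    }

    pub fn name() -> &'static str {
        "a"
    }

    // ...while `a` itself calls into `a::b::c`. That's a dependency cycle
    // between modules, and it compiles fine.
    pub fn describe() -> String {
        format!("{} uses {}", name(), b::c::leaf_label())
    }
}

fn main() {
    println!("{}", a::describe());
}
```

If a and a::b::c were two separate crates instead, the same pair of calls would require each crate to list the other as a dependency, which is exactly the kind of cycle that crates don't allow.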

That alone is enough to dislike it: it's just too easy for a crate to grow. But unfortunately, Rust doesn't stop there: besides making it too easy to inflate a crate to an unreasonable size, it also makes it a bit too hard to manage numerous crates! First of all, as you know, creating a crate is kind of a big deal: it needs a separate Cargo.toml file with all its dependencies specified explicitly (the exact versions can thankfully be inherited from the workspace, but every dependency still has to be listed there). Maintaining that adds a noticeable burden. And second, unlike modules, crates can't be hierarchical: even if we put them at various levels of a directory tree, we still have to add them all to a flat namespace in the workspace. Those two things make it harder to manage numerous crates, so one naturally refrains from having too many of them.
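As a rough sketch of that overhead, assume a hypothetical workspace with two member crates; the crate names (myproj-http, myproj-storage), the directory paths and the serde dependency are all made up for illustration. The workspace root declares the members and the shared versions:

```toml
# Cargo.toml at the workspace root
[workspace]
resolver = "2"
# Members may live at any depth of the directory tree...
members = ["crates/net/http", "crates/storage"]

# Versions are declared once here and can be inherited by members.
[workspace.dependencies]
serde = "1"
```

And then every single crate needs its own manifest on top of that:

```toml
# crates/net/http/Cargo.toml
[package]
name = "myproj-http"   # ...but crate names still share one flat namespace
version = "0.1.0"
edition = "2021"

[dependencies]
# Each dependency must still be listed per crate, even when the version is inherited.
serde = { workspace = true }
myproj-storage = { path = "../../storage" }
```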

You might have noticed that when I talk about modules being organized as a tree, I say that it seems to be neatly organized. Because dependencies between modules can be cyclic, the modules aren't actually organized: the tree structure merely provides an illusion of order, while underneath there is likely a dependency mess, possibly spanning hundreds of files. And Rust seemingly encourages that; otherwise, why would we need modules to be organized in a tree in the first place? And when one finally resolves to factor things out into separate crate(s), good luck untangling that mess.

For comparison, let's take a look at how Go goes about this problem.

Go: Modules, Packages and Files

In Go, instead of workspaces, crates and modules, we have modules, packages and files. In a sense, their design is actually similar to Rust's; they are just named differently:

  • In Go, a project (what you'd typically check into a single repository) is called a module (just like a workspace in Rust)
  • In Go, the unit of compilation is a package (just like a crate in Rust)
  • Go packages can depend on other Go packages, and that dependency graph must be acyclic (just like with crates in Rust)
  • Go files within a package can use things from other files within the same package, and those dependencies between files can be cyclic (just like with modules in Rust)

So as you can see, there are quite a lot of similarities. Go can also compile our packages in parallel thanks to the acyclic dependency graph, and since the unit of compilation is a package, the same principle applies (albeit less prominently): the bigger a package is, the longer it takes to compile. However, the experience of using them is significantly different:

  • A Go package is literally a single directory in the filesystem, without subdirectories (a subdirectory would just be a separate package); so even though one could technically create a huge package with hundreds of files, it's just not convenient to have that many files in a single directory. So there is a certain mental barrier that keeps one from creating a huge package: we naturally seek to split things up;
  • Creating a new Go package is a total no-brainer: again, a package is just a directory, so all we need to do is create a directory. Dependencies are only specified once for the whole project (“module” in Go parlance), in a single go.mod file, so there is no overhead in creating a package at all. Just create a directory;
  • Packages can form a hierarchy, e.g. a package foo can contain a subpackage foo/bar, etc., so if one has a huge project with a hundred packages or more, it's not a problem and it scales well.
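The three points above can be pictured with a small, purely hypothetical project layout (the module path example.com/myproj and all directory and file names are made up for illustration):

```text
myproj/                  # the module: one go.mod for the whole project
├── go.mod               # declares: module example.com/myproj
├── cmd/
│   └── server/          # package main
│       └── main.go
└── storage/             # package storage
    ├── storage.go
    └── cache/           # package cache, imported as example.com/myproj/storage/cache
        └── cache.go
```

Adding one more package to this tree means creating one more directory with a .go file in it; go.mod doesn't need to be touched unless a new external dependency comes along.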

As we can tell at this point, those choices actually turn out to be pretty wise: in those subtle ways, Go encourages us to keep our packages reasonably sized, by making big packages inconvenient to work with, and by making it super easy to create new packages, and lots of them. One could technically still abuse it (by creating a package with tons of files in it), but it would feel wrong right away. That's a trait of a well-designed system: it's difficult to abuse.

And we can also notice that Rust reverses all those points! As explained above:

  • A crate can contain hundreds of modules hidden within a neat directory tree, so it feels natural to keep adding them;
  • Creating a new crate requires maintaining a separate Cargo.toml file, which adds burden and so it's natural to avoid it;
  • Crates can't be hierarchical, which makes it even harder to manage e.g. a hundred or more of them.

Conclusion

So while Go encourages having numerous reasonably sized packages, Rust unfortunately encourages having few large crates.

It becomes even more ironic if we consider that, regardless of all that, Rust is just inherently slower to compile. Don't get me wrong here btw: it definitely makes sense that the Rust compiler takes more time than Go's, if only because it does a lot of work which Go simply doesn't do: borrow checking, monomorphization of generics, heavy optimization via LLVM. So yeah, of course Rust is inherently slower to compile, and the borrow-checker is probably the main reason why Rust is so awesome.

But then, knowing that, it would only make sense to mitigate that inherent slowness wherever we can, and yet Rust does the opposite. Instead of mitigating the slowness, the design of crates and modules magnifies it even more. So in the end, Rust encourages devs to do things which will make them suffer, for no good reason. That's half of why I called it a “poor practice” in the title.

And the other half is: even regardless of compile times, in my experience, having to split software components up properly into an acyclic dependency graph results in better-designed systems. One could say that Rust crates and modules are more flexible than Go packages and files - yeah, that's true, but by restricting what we can do, Go enforces better code organization in practice. In fact, Rust as a language is a lot stricter than Go, C++ and many others, and the Rust community itself likes to brag that by restricting what we can do in safe Rust, we end up with better software - I agree here, but then I think it'd make sense to take a similar approach to dependency management as well.

articles/rust_module_system_encourages_bad_practices.txt · Last modified: 2023/06/16 19:13 by dfrank