Levans' workshop

Experiments and thoughts about Machine Learning, Rust, and other stuff...

Beware the rust cache on Travis


Today I was working on speeding up my travis builds, mostly by looking at caching. Travis lets you set cache: cargo in your .travis.yml to enable caching for rust projects, and caching is cool: it avoids rebuilding all the dependencies every time, which speeds up the builds.

Or does it?

The travis documentation states this about caching:

Note that this makes our cache not network-local, it is still bound to network bandwidth and DNS resolutions. That impacts what you can and should store in the cache. If you store archives larger than a few hundred megabytes in the cache, it is unlikely that you’ll see a big speed improvement.

Out of curiosity, I ran du -sh in the target/ directory of my local checkout of wayland-rs. The result was not quite what I expected: 4.2 GB. That is the kind of size that likely does not play well with travis caching. And sure enough, the travis caches of my project were enormous: between 5 GB and 7 GB each.
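A minimal sketch of the same check, extended to the ~/.cargo/ subdirectories so you can see where the weight sits in your own project. The function name report_cache_sizes is made up for this example, and the paths assume the default cargo home; run it from the root of your project.

```shell
#!/bin/sh
# Print the disk usage of each directory given as argument.
# Directories that do not exist are silently skipped.
# (report_cache_sizes is a hypothetical helper, not part of cargo or travis.)
report_cache_sizes() {
    for dir in "$@"; do
        if [ -d "$dir" ]; then
            du -sh "$dir"
        fi
    done
}

# Assumes the default cargo home in $HOME/.cargo
report_cache_sizes target "$HOME/.cargo/registry" "$HOME/.cargo/bin"
```

Comparing the three numbers gives a rough idea of which parts would dominate a cache.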

Why are these caches so large? Because travis unconditionally caches the target/ and ~/.cargo/ directories, which tend to grow very big, for two main reasons:

First of all, cargo and rustc generate a lot of build artifacts, especially since incremental compilation was introduced. These artifacts are stored by the travis cache and are mostly useless, as a large part of them will likely be rebuilt in your travis build anyway.

Secondly, outdated data accumulates: both dependencies that have since released a new version, and build artifacts that were created with a previous version of the compiler. This data is never deleted and just sits around, making the caches bigger and bigger over time.

As a result of this observation, I have drastically changed my caching policy:

  • target/ is not worth caching: it is huge, and only the build artifacts of up-to-date dependencies are useful in it. I don't know of any way to selectively keep only the relevant part of it, so I can't afford to cache it.
  • ~/.cargo/bin/ is worth caching. I really don't want to recompile the cargo subcommands I use in my builds every time, as they generally take a long time to compile. Combined with the use of cargo-update, I can make sure I recompile them only when an update is needed. But for this to work, the file ~/.cargo/.crates.toml must be cached too.
  • ~/.cargo/registry/ is not worth caching either: it accumulates a lot of dead weight as dependencies are updated, and downloading its contents from crates.io rather than from the travis cache will likely hardly make a difference.
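The ~/.cargo/bin/ point can be sketched as follows: install a subcommand only when its binary is not already in the restored cache, then let cargo-update rebuild what is out of date. This is a sketch, not my exact setup; ensure_subcommand is a hypothetical helper name, and the commented-out calls assume the binary and command names that the cargo-update crate provides (cargo-install-update, cargo install-update -a).

```shell
#!/bin/sh
# Install a cargo subcommand only if its binary is not already on PATH
# (for instance because ~/.cargo/bin was restored from the cache).
# $1: name of the binary to look for, $2: name of the crate providing it.
# (ensure_subcommand is a hypothetical helper, not part of cargo.)
ensure_subcommand() {
    if ! command -v "$1" >/dev/null 2>&1; then
        cargo install "$2"
    fi
}

# In a travis install step one might then run (not executed here):
#   ensure_subcommand cargo-install-update cargo-update
#   cargo install-update -a   # rebuild installed binaries only if outdated
```

This way a cold cache pays the full compile time once, and later builds only recompile a subcommand when a new version has been published.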

With that, all my builds run between 25% and 50% faster than with cache: cargo, and my travis caches now only weigh 68 MB. The cache portion of my .travis.yml now contains the following:

# Need to cache the whole `.cargo` directory to keep .crates.toml for
# cargo-update to work
cache:
  directories:
    - /home/travis/.cargo

# But don't cache the cargo registry
before_cache:
  - rm -rf /home/travis/.cargo/registry

Of course, what applies to my project does not necessarily apply to yours, but this quick analysis might be worth doing before adding cache: cargo to your travis configuration.