Extension traits in Rust

Posted on Tue 20 June 2017 in Code • Tagged with Rust, C#, methods, extension methods, traitsLeave a comment

In a few object-oriented languages, it is possible to add methods to a class after it’s already been defined.

This feature arises quite naturally if the language has a dynamic type system that’s modifiable at runtime. In those cases, even replacing existing methods is perfectly possible1.

In addition to that, some statically typed languages — most notably in C# — offer extension methods as a dedicated feature of their type systems. The premise is that you would write standalone functions whose first argument is specially designated (usually by this keyword) as a receiver of the resulting method call:

public static int WordCount(this String str) {
    return str.Split(new char[] { ' ', '.', '?' },
                     StringSplitOptions.RemoveEmptyEntries).Length;
}

At the call site, the new method is indistinguishable from any of the existing ones:

string s = "Alice has a cat.";
int n = s.WordCount();

That’s assuming you have imported both the original class (or it’s a built-in like String), as well as the module in which the extension method is defined.

Rewrite it in Rust

The curious thing about Rust‘s type system is that it permits extension methods solely as a side effect of its core building block: traits.

In this post, I’m going to describe a certain design pattern in Rust which involves third-party types and user-defined traits. Several popular crates — like itertools or unicode-normalization — utilize it very successfully to add new, useful methods to the language standard types.

I’m not sure if this pattern has an official or widely accepted name. Personally, I’ve taken to calling it extension traits.

Let’s have a look at how they are commonly implemented.

Ingredients

We can use the extension trait pattern if we want to have additional methods in a type that we don’t otherwise control (or don’t want to modify).

Common cases include:

  • Rust standard library types, like Result, String, or anything else inside the std namespace
  • types imported from third-party libraries
  • types from the current crate if additional methods only make sense in certain scenarios (e.g. conditional compilation / testing)2

The crux of this technique is really simple. Like with most design patterns, however, it involves a certain degree of boilerplate and duplication.

So without further ado… In order to “patch” some new method(s) into an external type you will need to:

  1. Define a trait with signatures of all the methods you want to add.
  2. Implement it for the external type.
  3. There is no step three.

As an important note on the usage side, the calling code needs to import your new trait in addition to the external type. Once that’s done, it can proceed to use the new methods is if they were there to begin with.

I’m sure you are keen on seeing some examples!

Broadening your Options

We’re going to add two new methods to Rust’s standard Option type. The goal is to make it more convenient to operate on mutable Options by allowing to easily replace an existing value with another one3.

Here’s the appropriate extension trait4:

/// Additional mutation methods for `Option`.
pub trait OptionMutExt<T> {
    /// Replace the existing `Some` value with a new one.
    ///
    /// Returns the previous value if it was present, or `None` if no replacement was made.
    fn replace(&mut self, val: T) -> Option<T>;

    /// Replace the existing `Some` value with the result of given closure.
    ///
    /// Returns the previous value if it was present, or `None` if no replacement was made.
    fn replace_with<F: FnOnce() -> T>(&mut self, f: F) -> Option<T>;
}

It may feel at little bit weird to implement it.
You will basically have to pretend you are inside the Option type itself:

impl<T> OptionMutExt<T> for Option<T> {
    fn replace(&mut self, val: T) -> Option<T> {
        self.replace_with(move || val)
    }

    fn replace_with<F: FnOnce() -> T>(&mut self, f: F) -> Option<T> {
        if self.is_some() {
            let result = self.take();
            *self = Some(f());
            result
        } else {
            None
        }
    }
}

Unfortunately, this is just an illusion. Extension traits grant no special powers that’d allow you to bypass any of the regular visibility rules. All you can use inside the new methods is still just the public interface of the type you’re augmenting (here, Option).

In our case, however, this is good enough, mostly thanks to the recently introduced Option::take.

To use our shiny new methods in other places, all we have to do is import the extension trait:

use ext::rust::OptionMutExt;  // assuming you put it in ext/rust.rs

// ...somewhere...
let mut opt: Option<u32> = ...;
match opt.replace(42) {
    Some(x) => debug!("Option had a value of {} before replacement", x),
    None => assert_eq!(None, opt),
}

It doesn’t matter where it was defined either, meaning we can ship it away to crates.io and let it accrue as many happy users as Itertools has ;-)

Are you hyper::Body ready?

Our second example will demonstrate attaching more methods to a third-party type.

Last week, there was a new release of Hyper, a popular Rust framework for HTTP servers & clients. It was notable because it marked a switch from synchronous, straightforward API to a more complex, asynchronous one (which I incidentally wrote about a few weeks ago).

Predictably, there has been some confusion among its new and existing users.

We’re going to help by pinning a more convenient interface on hyper’s Body type. Body here is a struct representing the content of an HTTP request or response. After the ‘asyncatastrophe’, it doesn’t allow to access the raw incoming bytes as easily as it did before.

Thanks to extension traits, we can fix this rather quickly:

use std::error::Error;

use futures::{BoxFuture, future, Future, Stream};
use hyper::{self, Body};

pub trait BodyExt {
    /// Collect all the bytes from all the `Chunk`s from `Body`
    /// and return it as `Vec<u8>`.
    fn into_bytes(self) -> BoxFuture<Vec<u8>, hyper::Error>;

    /// Collect all the bytes from all the `Chunk`s from `Body`,
    /// decode them as UTF8, and return the resulting `String`.
    fn into_string(self) -> BoxFuture<String, Box<Error + Send>>;
}

impl BodyExt for Body {
    fn into_bytes(self) -> BoxFuture<Vec<u8>, hyper::Error> {
        self.concat()
            .and_then(|bytes| future::ok::<_, hyper::Error>(bytes.to_vec()))
            .boxed()
    }

    fn into_string(self) -> BoxFuture<String, Box<Error + Send>> {
        self.into_bytes()
            .map_err(|e| Box::new(e) as Box<Error + Send>)
            .and_then(|bytes| String::from_utf8(bytes)
                .map_err(|e| Box::new(e) as Box<Error + Send>))
            .boxed()
    }
}

With these new methods in hand, it is relatively straightforward to implement, say, a simple character-counting service:

use std::error::Error;

use futures::{BoxFuture, future, Future};
use hyper::server::{Service, Request, Response};

use ext::hyper::BodyExt;  // assuming the above is in ext/hyper.rs

pub struct Length;
impl Service for Length {
    type Request = Request;
    type Response = Response;
    type Error = Box<Error + Send>;
    type Future = BoxFuture<Self::Response, Self::Error>;

    fn call(&self, request: Request) -> Self::Future {
        let (_, _, _, _, body) = request.deconstruct();
        body.into_string().and_then(|s| future::ok(
            Response::new().with_body(s.len().to_string())
        )).boxed()
    }
}

Replacing Box<Error + Send> with an idiomatic error enum is left as an exercise for the reader :)

Extra credit bonus explanation

Reading this section is not necessary to use extension traits.

So far, we have seen what extension traits are capable of. It is only right to mention what they cannot do.

Indeed, this technique has some limitations. They are a conscious choice on the part of Rust authors, and they were decided upon in an effort to keep the type system coherent.

Coherence isn’t an everyday topic in Rust, but it becomes important when working with traits and types that cross package boundaries. Rules of trait coherence (described briefly towards the end of this section of the Rust book) state that the following combinations of “local” (this crate) and “external” (other crates5) are legal:

  • implement a local trait for a local type.
    This is common in larger programs that use polymorphic abstractions.
  • implement an external trait for a local type.
    We do this often to integrate with third-party libraries and frameworks, just like with hyper above.
  • implement a local trait for an external type.
    That’s extension traits for you!

What is not possible, however, is to:

  • implement an external trait for an external type

This case is prohibited in order to make the choice of trait implementations more predictable, both for the compiler and for the programmer. Without this rule in place, you could introduce many instances of impl Trait for Type (same Trait and same Type), each one with different functionality, leaving the compiler to “guess” the right impl for any given situation6.

The decision was thus made to disallow the impl ExternalTrait for ExternalType case altogether. If you like, you can read some more extensive backstory behind it.

Bear in mind, however, that this isn’t the unequivocally “correct” solution. Some languages choose to allow this so-called orphan case, and try to resolve the potential ambiguities in various different ways. It is a genuinely useful feature, too, as it makes easier it to glue together two unrelated libraries7.

Thankfully for extension traits, the coherence restriction doesn’t apply as long as you keep those traits and their impls in the same crate.


  1. This practice is often referred to as monkeypatching, especially in Python and Ruby. 

  2. In this case, a more common solution is to just open another impl Foo block, annotated with #[cfg(test)] or similar. An extension trait, however, makes it easier to extract Foo into a separate crate along with some handy, test-only API

  3. Note that this is not the same as the unstable (as of 1.18) Option methods guarded behind the options_entry feature gate

  4. My own convention is to call those traits FooExt if they are meant to enhance the interface of type Foo. The other practice is to mirror the name of the crate that the trait is packaged in; both Itertools and UnicodeNormalization are examples of this style. 

  5. Standard library (std or core namespaces) counts as external crate for this purpose. 

  6. Or throw an error. However, trait impls are always imported implicitly, so this could essentially prevent some combination of different modules/libraries in the ecosystem from being used together, and generally create an unfathomable mess. 

  7. The usual workaround for coherence/orphan rules in Rust involves creating a wrapper around the external type in order to make it “local”, and therefore allow external trait impls for it. This is called the newtype pattern and there are some crates to support it. 

Continue reading

Package managers’ appreciation day

Posted on Sat 26 March 2016 in Programming • Tagged with packages, package manager, node.js, npm, C++Leave a comment

By now you have probably heard about the infamous “npm-gate” that swept through the developer community over the last week. It has been brought up, discussed, covered, meta-discussed, satirized, and even featured by some mainstream media. Evidently the nerds have managed to stir up some serious trouble again, and it only took them 11 lines of that strange thing they call “code”.

No good things in small packages

When looking for a culprit, the one party that everyone pounced on immediately was of course the npm itself. With its myriad of packages that could each fit in a tweet, it invites to create the exact house of cards we’ve seen collapse.

This serves as a good wake-up call, of course. But it also compels to throw the baby out with the bathwater, and draw a conclusion that may be a little too far-fetched. Like perhaps declaring the entire idea of managing dependencies “the npm way” suspect. If packages tend to degenerate into something as ludicrous as isArray — to say nothing of left-pad, which started the whole debacle — then maybe this approach to software reusability has simply bankrupted itself?

A world without *pm

I’m right away responding to that with a resounding “No!”. Package management as a concept is not responsible for the poor decision making of one specific developer collective. And anyone who might think tools like npm do more harm than good I ask: have you recently written any C++?

See, C++ is the odd one among languages that at least pretend to be keeping up with the times. It doesn’t present a package management story at all. That’s right — the C++ “ecosystem”, as it stands now, has:

  • no package manager
  • no repository of packages
  • no unified way of managing dependencies
  • no way to isolate development environments of different projects from one another

Adding any kind of third-party dependency to a C++ project — especially a portable one, which is allegedly one of C++’s strengths — is a considerable pain, even when it doesn’t require any additional libraries by itself. And environment isolation? Some people are using Linux containers (!) for this, which is like dealing with a mosquito by shooting it with a howitzer.

Billions upon billions of lines
To build a C++ binary, you must first build the userspace.

But hey, at least they can use apt-get, right?…

So, string padding incidents aside, package managers are absolutely essential. Sure, we can and should discuss the merits of their particular flavors and implementation details —like whether it’s prudent to allow “delisting” of packages. As a whole, however, package managers deserve recognition as a crucial part of modern language tooling that we cannot really do without.

Continue reading