Karol Kuczmarski's Blog

Extension traits in Rust

Posted on Tue 20 June 2017 in Code • Tagged with Rust, C#, methods, extension methods, traits • Leave a comment

In a few object-oriented languages, it is possible to add methods to a class after it’s already been defined.

This feature arises quite naturally if the language has a dynamic type system that’s modifiable at runtime. In those cases, even replacing existing methods is perfectly possible¹.

In addition to that, some statically typed languages — most notably in C# — offer extension methods as a dedicated feature of their type systems. The premise is that you would write standalone functions whose first argument is specially designated (usually by this keyword) as a receiver of the resulting method call:

public static int WordCount(this String str) {
    return str.Split(new char[] { ' ', '.', '?' },
                     StringSplitOptions.RemoveEmptyEntries).Length;
}

At the call site, the new method is indistinguishable from any of the existing ones:

string s = "Alice has a cat.";
int n = s.WordCount();

That’s assuming you have imported both the original class (or it’s a built-in like String), as well as the module in which the extension method is defined.

Rewrite it in Rust

The curious thing about Rust‘s type system is that it permits extension methods solely as a side effect of its core building block: traits.

In this post, I’m going to describe a certain design pattern in Rust which involves third-party types and user-defined traits. Several popular crates — like itertools or unicode-normalization — utilize it very successfully to add new, useful methods to the language standard types.

I’m not sure if this pattern has an official or widely accepted name. Personally, I’ve taken to calling it extension traits.

Let’s have a look at how they are commonly implemented.

Ingredients

We can use the extension trait pattern if we want to have additional methods in a type that we don’t otherwise control (or don’t want to modify).

Common cases include:

Rust standard library types, like Result, String, or anything else inside the std namespace
types imported from third-party libraries
types from the current crate if additional methods only make sense in certain scenarios (e.g. conditional compilation / testing)²

The crux of this technique is really simple. Like with most design patterns, however, it involves a certain degree of boilerplate and duplication.

So without further ado… In order to “patch” some new method(s) into an external type you will need to:

Define a trait with signatures of all the methods you want to add.
Implement it for the external type.
There is no step three.

As an important note on the usage side, the calling code needs to import your new trait in addition to the external type. Once that’s done, it can proceed to use the new methods is if they were there to begin with.

I’m sure you are keen on seeing some examples!

Broadening your `Option`s

We’re going to add two new methods to Rust’s standard Option type. The goal is to make it more convenient to operate on mutable Options by allowing to easily replace an existing value with another one³.

Here’s the appropriate extension trait⁴:

/// Additional mutation methods for `Option`.
pub trait OptionMutExt<T> {
    /// Replace the existing `Some` value with a new one.
    ///
    /// Returns the previous value if it was present, or `None` if no replacement was made.
    fn replace(&mut self, val: T) -> Option<T>;

    /// Replace the existing `Some` value with the result of given closure.
    ///
    /// Returns the previous value if it was present, or `None` if no replacement was made.
    fn replace_with<F: FnOnce() -> T>(&mut self, f: F) -> Option<T>;
}

It may feel at little bit weird to implement it.
You will basically have to pretend you are inside the Option type itself:

impl<T> OptionMutExt<T> for Option<T> {
    fn replace(&mut self, val: T) -> Option<T> {
        self.replace_with(move || val)
    }

    fn replace_with<F: FnOnce() -> T>(&mut self, f: F) -> Option<T> {
        if self.is_some() {
            let result = self.take();
            *self = Some(f());
            result
        } else {
            None
        }
    }
}

Unfortunately, this is just an illusion. Extension traits grant no special powers that’d allow you to bypass any of the regular visibility rules. All you can use inside the new methods is still just the public interface of the type you’re augmenting (here, Option).

In our case, however, this is good enough, mostly thanks to the recently introduced Option::take.

To use our shiny new methods in other places, all we have to do is import the extension trait:

use ext::rust::OptionMutExt;  // assuming you put it in ext/rust.rs

// ...somewhere...
let mut opt: Option<u32> = ...;
match opt.replace(42) {
    Some(x) => debug!("Option had a value of {} before replacement", x),
    None => assert_eq!(None, opt),
}

It doesn’t matter where it was defined either, meaning we can ship it away to crates.io and let it accrue as many happy users as Itertools has ;-)

Are you `hyper::Body` ready?

Our second example will demonstrate attaching more methods to a third-party type.

Last week, there was a new release of Hyper, a popular Rust framework for HTTP servers & clients. It was notable because it marked a switch from synchronous, straightforward API to a more complex, asynchronous one (which I incidentally wrote about a few weeks ago).

Predictably, there has been some confusion among its new and existing users.

We’re going to help by pinning a more convenient interface on hyper’s Body type. Body here is a struct representing the content of an HTTP request or response. After the ‘asyncatastrophe’, it doesn’t allow to access the raw incoming bytes as easily as it did before.

Thanks to extension traits, we can fix this rather quickly:

use std::error::Error;

use futures::{BoxFuture, future, Future, Stream};
use hyper::{self, Body};

pub trait BodyExt {
    /// Collect all the bytes from all the `Chunk`s from `Body`
    /// and return it as `Vec<u8>`.
    fn into_bytes(self) -> BoxFuture<Vec<u8>, hyper::Error>;

    /// Collect all the bytes from all the `Chunk`s from `Body`,
    /// decode them as UTF8, and return the resulting `String`.
    fn into_string(self) -> BoxFuture<String, Box<Error + Send>>;
}

impl BodyExt for Body {
    fn into_bytes(self) -> BoxFuture<Vec<u8>, hyper::Error> {
        self.concat()
            .and_then(|bytes| future::ok::<_, hyper::Error>(bytes.to_vec()))
            .boxed()
    }

    fn into_string(self) -> BoxFuture<String, Box<Error + Send>> {
        self.into_bytes()
            .map_err(|e| Box::new(e) as Box<Error + Send>)
            .and_then(|bytes| String::from_utf8(bytes)
                .map_err(|e| Box::new(e) as Box<Error + Send>))
            .boxed()
    }
}

With these new methods in hand, it is relatively straightforward to implement, say, a simple character-counting service:

use std::error::Error;

use futures::{BoxFuture, future, Future};
use hyper::server::{Service, Request, Response};

use ext::hyper::BodyExt;  // assuming the above is in ext/hyper.rs

pub struct Length;
impl Service for Length {
    type Request = Request;
    type Response = Response;
    type Error = Box<Error + Send>;
    type Future = BoxFuture<Self::Response, Self::Error>;

    fn call(&self, request: Request) -> Self::Future {
        let (_, _, _, _, body) = request.deconstruct();
        body.into_string().and_then(|s| future::ok(
            Response::new().with_body(s.len().to_string())
        )).boxed()
    }
}

Replacing Box<Error + Send> with an idiomatic error enum is left as an exercise for the reader :)

Extra credit bonus explanation

Reading this section is not necessary to use extension traits.

So far, we have seen what extension traits are capable of. It is only right to mention what they cannot do.

Indeed, this technique has some limitations. They are a conscious choice on the part of Rust authors, and they were decided upon in an effort to keep the type system coherent.

Coherence isn’t an everyday topic in Rust, but it becomes important when working with traits and types that cross package boundaries. Rules of trait coherence (described briefly towards the end of this section of the Rust book) state that the following combinations of “local” (this crate) and “external” (other crates⁵) are legal:

implement a local trait for a local type.
This is common in larger programs that use polymorphic abstractions.
implement an external trait for a local type.
We do this often to integrate with third-party libraries and frameworks, just like with hyper above.
implement a local trait for an external type.
That’s extension traits for you!

What is not possible, however, is to:

implement an external trait for an external type

This case is prohibited in order to make the choice of trait implementations more predictable, both for the compiler and for the programmer. Without this rule in place, you could introduce many instances of impl Trait for Type (same Trait and same Type), each one with different functionality, leaving the compiler to “guess” the right impl for any given situation⁶.

The decision was thus made to disallow the impl ExternalTrait for ExternalType case altogether. If you like, you can read some more extensive backstory behind it.

Bear in mind, however, that this isn’t the unequivocally “correct” solution. Some languages choose to allow this so-called orphan case, and try to resolve the potential ambiguities in various different ways. It is a genuinely useful feature, too, as it makes easier it to glue together two unrelated libraries⁷.

Thankfully for extension traits, the coherence restriction doesn’t apply as long as you keep those traits and their impls in the same crate.

This practice is often referred to as monkeypatching, especially in Python and Ruby. ↩
In this case, a more common solution is to just open another impl Foo block, annotated with #[cfg(test)] or similar. An extension trait, however, makes it easier to extract Foo into a separate crate along with some handy, test-only API. ↩
Note that this is not the same as the unstable (as of 1.18) Option methods guarded behind the options_entry feature gate. ↩
My own convention is to call those traits FooExt if they are meant to enhance the interface of type Foo. The other practice is to mirror the name of the crate that the trait is packaged in; both Itertools and UnicodeNormalization are examples of this style. ↩
Standard library (std or core namespaces) counts as external crate for this purpose. ↩
Or throw an error. However, trait impls are always imported implicitly, so this could essentially prevent some combination of different modules/libraries in the ecosystem from being used together, and generally create an unfathomable mess. ↩
The usual workaround for coherence/orphan rules in Rust involves creating a wrapper around the external type in order to make it “local”, and therefore allow external trait impls for it. This is called the newtype pattern and there are some crates to support it. ↩

Rewrite it in Rust

Ingredients

Broadening your Options

Are you hyper::Body ready?

Extra credit bonus explanation

Broadening your `Option`s

Are you `hyper::Body` ready?