Rust as a gateway drug to Haskell

Posted on Tue 13 June 2017 in Programming • Tagged with Rust, Haskell, traits, typeclasses, monads, ADTs, FPLeave a comment

For work-related reasons, I had to recently get up to speed on programming in Haskell.

Before that, I had very little actual experience with the language, clocking probably at less than a thousand lines of working code over a couple of years. Nothing impressive either: some wrapper script here, some experimental rewrite there…

These days, I heard, there are a few resources for learning Haskell1 that don’t require having a PhD in category theory2. They may be quite helpful when your exposure to the functional programming is limited. In my case, however, the one thing that really enabled me to become (somewhat) productive was not even related to Haskell at all.

It was Rust.

In theory, this shouldn’t really make much of a sense. If you compare both languages by putting checkmarks in a feature chart, you won’t find them to have much in common.

Some of the obvious differences include:

  • predominantly functional vs. mostly imperative
  • garbage collection vs. explicit memory management
  • lazy vs. eager evaluation
  • rich runtime3 vs. almost no runtime
  • global vs. localized type inference
  • indentation vs. braces
  • two decades (!) vs. barely two years since release

Setting aside syntax, most of those differences are pretty significant.

You probably wouldn’t use Haskell for embedded programming, for instance, both for performance (GC) and memory usage reasons (laziness). Similarly, Rust’s ownership system can be too much of a hassle for high level code that isn’t subject to real time requirements.

But if you look a little deeper, beyond just the surface descriptions of both languages, you can find plenty of concepts they share.

Traits: they are typeclasses, essentially

Take Haskell’s typeclasses, for example — the cornerstone of its rich and expressive type system.

A typeclass is, simply speaking, a list of capabilities: it defines what a type can do. There exist analogs of typeclasses in most programming languages, but they are normally called interfaces or protocols, and remain closely tied to the object-oriented paradigm.

Not so in Haskell.

Or in Rust for that matter, where the equivalent concept exists under the name of traits. What typeclasses and traits have in common is that they’re used for all kinds of polymorphism in their respective languages.

Generics

For example, let’s consider parametrized types, sometimes also referred to as templates (C++) or generics (C#).

In many cases, a generic function or type requires its type arguments to exhibit certain characteristics. In some languages (like the legacy C++), this is checked only implicitly: as long as the template type-checks after its expansion, everything is okay:

template <typename T> T min(T a, T b) {
    return a > b ? b : a;
}

struct Foo {};

int main() {
    min(1, 2);  // OK
    min(Foo(), Foo());  // ERROR, no operator `>`
}

More advanced type systems, however, allow to specify the generic constraints explicitly. This is the case in Rust:

fn min<T: Ord>(a: T, b: T) -> T {
    if a > b { b } else { a }
}

as well as in Haskell:

min :: (Ord a) => a -> a -> a
min a b = if a > b then b else a

In both languages, the notion of a type supporting certain operations (like comparison/ordering) is represented as its own, first-class concept: a trait (Rust) or a typeclass (Haskell). Since the compiler is aware of those constraints, it can verify that the min function is used correctly even before it tries to generate code for a specific substitution of T.

Dynamic dispatch

On the other hand, let’s look at runtime polymorphism: the one that OO languages implement through abstract base classes and virtual methods. It’s the tool of choice if you need a container of objects of different types, which nevertheless all expose the same interface.

To offer it, Rust has trait objects, and they work pretty much exactly like base class pointers/references from Java, C++, or C#.

// Trait definition
trait Draw {
    fn draw(&self);
}

// Data type implementing the trait
struct Circle { radius: i32 }
impl Draw for Circle {
    fn draw(&self) { /* omitted */ }
}

// Usage
fn draw_all(objects: &Vec<Box<Draw>>) {
    for &obj in objects {
        obj.draw();
    }
}

The Haskell analogue is, in turn, based on typeclasses, though the specifics can be a little bit trickier:

{-# LANGUAGE ExistentialQuantification #-}

-- Typeclass definition
class Draw a where
    draw :: a -> IO ()

-- Polymorphic wrapper type
data Draw' = forall a. Draw a => Draw' a
instance Draw Draw' where
    draw (Draw' d) = draw d

-- Data types instantiating ("implementing") the typeclass
data Circle = Circle ()
instance Draw Circle where draw = undefined -- omitted
data Square = Square ()
instance Draw Square where draw = undefined -- omitted

-- Usage
drawAll :: (Draw a) => [a] -> IO ()
drawAll ds = mapM_ draw ds

main = do
    let shapes = [Draw' Circle (), Draw' Square ()]
    drawAll shapes

Here, the generic function can use typeclass constraints directly ((Draw a) => ...), but creating a container of different object types requires a polymorphic wrapper4.

Differences

All those similarities do not mean that Rust traits and Haskell typeclasses are one and the same. There are, in fact, quite a few differences, owing mostly to the fact that Haskell’s type system is more expressive:

  • Rust lacks higher kinded types, making certain abstractions impossible to encode as traits. It is possible, however, to implement a trait for infinitely many types at once if the implementation itself is generic (like here).

  • When defining a trait in Rust, you can ask implementors to provide some auxiliary, associated types in addition to just methods5. A similar mechanism in Haskell is expanded into type families, and requires enabling a GHC extension.

  • While typeclasses in Haskell can be implemented for multiple types simultaneously via a GHC extension, Rust’s take on this feature is to make traits themselves generic (e.g. trait Foo<T>). The end result is roughly similar; however, the “main implementing type” (one after for in impl ... for ...) is still a method receiver (self), just like in OO languages.

  • Rust enforces coherence rules on trait implementations. The topic is actually rather complicated, but the gist is about local (current package) vs. remote (other packages / standard library) traits and types.
    Without too much detail, coherence demands that there be a local type or trait somewhere in the impl ... for ... construct. Haskell doesn’t have this limitation, although it is recommended not to take advantage of this.

The M-word

Another area of overlap between Haskell and Rust exists in the data model utilized by those languages. Both are taking heavy advantage of algebraic data types (ADT), including the ability to define both product types (“regular” structs and records) as well as sum types (tagged unions).

Maybe you’d like Some(T)?

Even more interestingly, code in both languages makes extensive use of the two most basic ADTs:

  • Option (Rust) or Maybe (Haskell) — for denoting a presence or absence of a value
  • Result (Rust) or Either (Haskell) — for representing the alternative of “correct” and “erroneous” value

These aren’t just simple datatypes. They are deeply interwoven into the basic semantics of both languages, not to mention their standard libraries and community-provided packages.

The Option/Maybe type, for example, is the alternative to nullable references: something that’s been heavily criticized for making programs prone to unexpected NullReferenceExceptions. The idea behind both of those types is to make actual values impossible to confuse with nulls by encoding the potential nullability into the type system:

enum Option<T> { Some(T), None }
data Maybe a = Just a | Nothing

Result and Either, on the other hand, can be thought as an extension of this idea. They also represent two possibilities, but the “wrong” one isn’t just None or Nothing — it has some more information associated with it:

enum Result<T, E> { Ok(T), Err(E) }
data Either e a = Left e | Right a

This dichotomy between the Ok (or Right) value and the Error value (or the Left one) makes it a great vehicle for carrying results of functions that can fail.

In Rust, this replaces the traditional error handling mechanisms based on exceptions. In Haskell, the exceptions are present and sometimes necessary, but Either is nevertheless the preferred approach to dealing with errors.

What to do?

One thing that Haskell does better is composing those fallible functions into bigger chunks of logic.

Relatively recently, Rust has added the ? operator as a replacement for the try! macro. This is now the preferred way of error propagation, allowing for a more concise composition of functions that return Results:

/// Read an integer from given file.
fn int_from_file(path: &Path) -> io::Result<i32> {
    let mut file = fs::File::open(path)?;
    let mut s = String::new();
    file.read_to_string(&mut s)?;
    let result = s.parse().map_err(|e| io::Error::new(io::ErrorKind::InvalidData, e))?;
    Ok(result)
}

But Haskell had it for much longer, and it’s something of a hallmark of the language and functional programming in general — even though it looks thoroughly imperative:

intFromFile :: FilePath -> IO Int
intFromFile path = do
    s <- readFile path
    i <- readIO s
    return i

If you haven’t seen it before, this is of course a monad — the IO monad, to be precise. While discussing monads in detail is way outside of the scope of this article, we can definitely notice some analogies with Rust. The do notation with <- arrows is evidently similar to how in Rust you’d assign the result of a fallible operation after “unpacking” it with ?.

But of course, there’s plenty of different monads in Haskell: not just IO, but also Either, Maybe, Reader, Writer, Cont, STM, and many others. In Rust (at least as of 1.19), the ? operator only works for Result types, although there is some talk about extending it to Option as well6.

Eventually, we may see the language adopt some variant of the do notation, though the motivation for this will most likely come from asynchronous programming with Futures rather than plain Results. General monads, however, require support for higher kinded types which isn’t coming anytime soon.

A path through Rust?

Now that we’ve discussed those similarities, the obvious question arises.

Is learning Rust worthwhile if your ultimate goal is getting proficient at functional programming in general, or Haskell in particular?

My answer to that is actually pretty straightforward.

If “getting to FP” is your main goal, then Rust will not help you very much. Functional paradigm isn’t the main idea behind the language — its shtick is mostly memory safety, and zero-cost abstractions. While it succeeds somewhat at being “Haskell Lite”, it really strives to be safer C++7.

But if, on the other hand, you regard FP mostly as a curiosity that seems to be seeping into your favorite imperative language at an increasing rate, Rust can be a good way to gain familiarity with this peculiar beast.

At the very least, you will learn the functional way of modeling programs, with lots of smart enums/unions and structs but without inheritance.

And the best part is: you will be so busy fighting the borrow checker you won’t even notice when it happens ;-)


  1. Just ask in #haskell-beginners on Freenode if you’re interested. 

  2. Though ironically, I found the CT lectures by Bartosz Milewski very helpful in developing the right intuitions, even though they’re very abstract. 

  3. For example, Haskell has green threads (created with forkIO) which are somewhat similar to goroutines from Go. To get anything remotely similar in Rust, you need to use external libraries

  4. Note that such containers aren’t very idiomatic Haskell. A more typical solution would be to just curry the draw function, implicitly putting the Draw object inside its closure. 

  5. This mechanisms expands to associated constants in Rust 1.20. 

  6. Those two types also have a form of monadic bind (>>= in Haskell) exposed as the and_then method

  7. If you want another language for easing into the concept of functional programming, I’ve heard that Scala fills that niche quite well. 

Continue reading

Rust: first impressions

Posted on Thu 10 December 2015 in Code • Tagged with Rust, pointers, types, FP, OOP, traitsLeave a comment

Having recently been writing some C++ code at work, I had once again experienced the kind of exasperation that this cumbersome language evokes on regular basis. When I was working in it less sporadically, I was shrugging it off and telling myself it’s all because of the low level it operates on. Superior performance was the other side of the deal, and it was supposed to make all the trade-offs worthwhile.

Now, however, I realized that running close to the metal by no means excuses the sort of clunkiness that C++ permits. For example, there really is no reason why the archaically asinine separation of header & source files — with its inevitable redundancy of declarations and definitions, worked around with Java-esque contraptions such as pimpl — is still the bread and butter of C++ programs.
Same goes for the lack of sane dependency management, or a universal, portable build system. None of those would be at odds with native compilation to machine code, or runtime speeds that are adequate for real-time programs.

Rather than dwelling on those gripes, I thought it’d be more productive to look around and see what’s the modern offerring in the domain of lower level, really fast languages. The search wasn’t long at all, because right now it seems there is just one viable contender: Rust1.

Rusty systems

Rust introduces itself as a “systems programming language”, which is quite a bold claim. What followed the last time this phrase has been applied to an emerging language — Go — was a kind of word twisting that’s more indicative of politics, not computer science.

But Rust’s pretense to the system level is well justified. It clearly provides the requisite toolkit for working directly with the hardware, be it embedded controllers or fully featured computers. It offers compilation to native machine code; direct memory access; running time guarantees thanks to the lack of GC-incuded stops; and great interoperability through static and dynamic linkage.

In short, with Rust you can wreak havoc against the RAM and twiddle bits to your heart’s content.

Safe and sound

To be fair, though, the “havoc” part is not entirely accurate. Despite its focus on the low level, efficient computing, Rust aims to be a very safe language. Unlike C, it actively tries to prevent the programmer from shooting themselves in the foot — though it will hand you the gun if you but ask for it.

The safety guarantees provided by Rust apply to resource management, with the specific emphasis on memory and pointers to it. The way that most contemporary languages deal with memory is by introducing a garbage collector which mostly (though not wholly) relieves the programmer from thinking about allocations and deallocations. However, the kind of global, stop-the-world garbage collections (e.g. mark-and-sweep) is costly and unpredictable, ruling it out as a mechanism for real-time systems.

For this reason, Rust doesn’t mandate a GC of this kind2. And although it offers mechanisms that are similar to smart pointers from C++ (e.g. std::shared_ptr), it is actually preferable and safer to use regular, “naked” pointers: &Foo versus Cell<Foo> or RefCell<Foo> (which are some of the Rust’s “smart pointer” types).

The trick is in the clever compiler. As long as we use regular pointers, it is capable of detecting potential memory bugs at compilation time. They are referred to as “data races” in Rust’s terminology, and include perennial problems that will segfault any C code which wasn’t written with utmost care.

Part of those safety guarantees is also the default immutability of references (pointers). The simplest reference of type &Foo in Rust translates to something like const Foo * const in C3. You have to explicitly request mutability with the mut keyword, and Rust ensures there is always at most one mutable reference to any value, thus preventing problems caused by pointer aliasing.

But what if you really must sling raw pointers, and access arbitrary memory locations? Maybe you are programming a microcontroller where I/O is done through a special memory region. For those occasions, Rust has got you covered with the unsafe keyword:

// Read the state of a diode in some imaginary uC.
fn get_led_state(i: isize) -> bool {
    assert!(i >= 0 && i <= 4, "There are FOUR lights!");
    let p: *const u8 = 0x1234 as *const u8;  // known memory location
    unsafe { *p .offset(i) != 0 }
}

Its usage, like in the above example, can be very localized, limited only to those places where it’s truly necessary and guarded by the appropriate checks. As a result, the interface exposed by the above function can be considered safe. The unrestricted memory access can be contained to where it’s really inevitable.

Typing counts

Ensuring memory safety is not the only way in which Rust differentiates itself from C. What separates those two languages is also a few decades of practice and research into programming semantics. It’s only natural to expect Rust to take advantage of this progress.

And advantage it takes. Although Rust’s type system isn’t nearly as advanced and complex like — say — Scala’s, it exhibits several interesting properties that are indicative of its relatively modern origin.

First, it mixes the two most popular programming paradigms — functional and object-oriented — in roughly equal concentrations, as opposed to being biased towards the latter. Rust doesn’t have interfaces or classes: it has traits and their implementations. Even though they often fulfill similar purposes of abstraction and encapsulation, these constructs are closer to the concepts of type classes and their instances, which are found for example in Haskell.

Still, the more familiar notions of OOP aren’t too far off. Most of the key functionality of classes, for example, can be simulated by implementing “default” traits for user-defined types:

struct Person {
    first_name: String,
    last_name: String,
}

impl Person {
    fn new(first_name: &str, last_name: &str) -> Person {
        Person {
            first_name: first_name.to_string(),
            last_name: last_name.to_string(),
        }
    }

    fn greet(&self) {
        println!("Hello, {}!", self.first_name);
    }
}

// usage
let p = Person::new("John", "Doe");
p.greet();

The second aspect of Rust’s type system that we would come to expect from a new language is its expressive power. Type inference is nowadays a staple, and above we can observe the simplest form of it. But it extends further, to generic parameters, closure arguments, and closure return values.

Generics, by the way, are quite nice as well. Besides their applicability to structs, type aliases, functions, traits, trait implementations, etc., they allow for constraining their arguments with traits. This is similar to the abandoned-and-not-quite-revived-yet idea of concepts in C++, or to an analogous mechanism from C#.

The third common trend in contemporary language design is the use of type system to solve common tasks. Rust doesn’t go full Haskell and opt for monads for everything, but its Option and Result types are evidently the functional approach to error handling4. To facilitate their use, a powerful pattern matching facility is also present in Rust.

Unexpectedly pythonic

If your general go-to language is Python, you will find Rust a very nice complement and possibly a valuable instrument in your coding arsenal. Interoperability between Python and Rust is stupidly easy, thanks to both the ctypes module and the extreme simplicity of creating portable, shared libraries in Rust. Offloading some expensive, GIL-bypassing computation to a fast, native code written in Rust can thus be a relatively painless way of speeding up crucial parts of a Python program.

But somewhat more surprisingly, Rust has quite a few bits that seem to be directly inspired by Python semantics. Granted, those two languages are conceptually pretty far apart in general, but the analogies are there:

  • The concept of iterators in Rust is very similar to iterables in Python. Even the for loop is basically identical: rather than manually increment a counter, both in Rust and Python you iterate over a range of numbers.
    Oh, and both languages have an enumerate method/ function that yields pairs of (index, element).

  • Syntax for method definition in Rust uses the self keyword as first argument to distinguish between instance methods and “class”/”static” methods (or associated functions in Rust’s parlance). This is even more pythonic than in actual Python, where self is technically just a convention, albeit an extremely strong one.

  • In either language, overloading operators doesn’t use any new keywords or special syntax, like it does in C++, C#, and others. Python accomplishes it through __magic__ methods, whereas Rust has very similarly named operator traits.

  • Rust basically has doctest. If you don’t know, the doctest module is a standard Python testing utility that can run usage examples found in documentation comments and verify their correctness. Rust version (rustdoc) is even more powerful and flexible, allowing for example to mark additional boilerplate lines that should be run when testing examples, but not included in the generated documentation.

I’m sure the list doesn’t end here and will grow over time. As of this writing, for example, nightly builds of Rust already offer advanced slice pattern matching which are very similar to the extended iterable unpacking from Python 3.

Is it worth it?

Depending on your background and the programming domain you are working in, you may be wondering if Rust is a language that’s worth looking into now, or in the near future.

Firstly, let me emphasize that it’s still in its early stages. Although the stable version 1.0 has been released a good couple of months ago, the ecosystem isn’t nearly as diverse and abundant as in some of the other new languages.

If you are specifically looking to deploying Rust-written API servers, backends, and other — shall I use the word — microservices, then right now you’ll probably be better served by more established solutions, like Java with fibers, asynchronous Python on PyPy, Erlang, Go, node.js, or similar. I predict Rust catching up here in the coming months, though, because the prospect of writing native speed JSON slingers with relative ease is just too compelling to pass.

The other interesting area for Rust is game programming, because it’s one of the few languages capable of supporting even the most demanding AAA+ productions. The good news is that portable, open source game engines are already here. The bad news is that most of the existing knowledge about designing and coding high performance games is geared towards writing (stripped down) C++. The community is also rather stubborn reluctant to adopt anything that may carry even a hint of potentially unknown performance implications. Although some inroads have been made (here’s, for example, an entity component system written in Rust), and I wouldn’t be surprised to see indie games written in Rust, it probably won’t take over the industry anytime soon.

When it comes to hardware, though, Rust may already have the upper hand. It is obviously much easier language to program in than pure C. Along with its toolchain’s ability to produce minimal executables, it makes for a compelling language for programming microcontrollers and other embedded devices.

So in short, Rust is pretty nice. And if you have read that far, I think you should just go ahead and have a look for yourself :)


  1. Because as much as we’d like for D to finally get somewhere, at this point we may have better luck waiting for the Year of Linux on Desktop to dawn… 

  2. Of course, nobody has stopped the community from implementing it

  3. Strictly speaking, it’s the binding such as let x = &foo; that translates to it. Unadorned C pointer type Foo* would correspond to mutable binding to a mutable reference in Rust, i.e. let mut x = &mut foo;

  4. Their Haskell equivalents are Maybe and Either type classes, respectively. 

Continue reading