A Haskell retrospective

Posted on Sat 18 August 2018 in Programming

Approximately a year ago, I had the opportunity to work on Sigma — a large, distributed system that protects Facebook users from spam and other kinds of abuse.

One reason it was a pretty unique experience is that Sigma is almost entirely a Haskell codebase. It was the first time I got to work with the language in a professional setting, so I was eager to see how it performs in a real-world, production-grade application.

In this (rather long) post, I’ll draw on this experience and highlight Haskell’s notable features from a practical, engineering standpoint. In other words, I’ll be interested in how much does it help with solving actual problems that arise in the field of software development & maintenance.

Haskell Who?

Before we start, however, it seems necessary to clarify what “Haskell” are we actually talking about.

Granted, this may be a little surprising. From a far-away vantage point, Haskell is typically discussed as a rather uniform language, and it is often treated as synonymous with functional programming in general.

But if you look closer, that turns out to be a bit of a misrepresentation. In reality, Haskell is a complex manifold of different components, some of which can be thought as their own sublanguages. Roughly speaking, Haskell — as it’s used in the industry and in the OSS world today — should be thought of as a cake with at least the following layers:

The base Haskell language, as defined by the Haskell ‘98 and 2010 reports. At least in theory, this is the portable version of the language that any conforming compiler is supposed to accept. In practice, given the absolute monopoly of GHC, it is merely a theoretical base that needs to be significantly augmented in order to reach some level of practical usability.
A bunch of GHC extensions that are widely considered mandatory for any real-world project. Some, like TupleSections or MultiParamTypeClasses, are mostly there to fix some surprising feature gaps that would be more confusing if you had worked around them instead. Others, like GADTs or DataKinds, open up completely new avenues for type-level abstractions.
A repertoire of common third-party libraries with unique DSLs, like conduit, pipes, or lens. Unlike many “regular” packages that merely bring in some domain-specific API, these fundamental libraries shape both the deeper architecture and the surface-level look & feel of any Haskell codebase that uses them.
A selection of less common extensions which are nevertheless encountered in Haskell code with some regularity.
Template Haskell, the language for compile-time metaprogramming whose main application is probably generics.
To be clear, neither “template” nor “generics” have anything to do with the usual meanings of those terms in C++ and Java/C#/Go¹. Rather, it refers to a kind of AST-based “preprocessing” that allows Haskell code to operate on the generic structure of user-defined types: their constructors, parameters, and record fields².
Direct use of TH in application code is extremely rare, but many projects rely on libraries which utilize it behind the scenes. A great example would be Persistent, a database interface library where the ORM uses Template Haskell to construct record types from a DB schema at compile time.

There is a language in my type system

What’s striking about this ensemble of features and ideas is that most of them don’t seem to follow from the ostensible premise of the language: that it is functional, pure / referentially transparent, and non-strict / lazily evaluated. Instead, they are mostly a collection of progressively more sophisticated refinements and applications of Haskell’s type system.

This singular focus on type theory — especially in the recent years³ — is probably why many people in the wider programming world think it is necessary to grok advanced type system concepts if you even want to dabble in functional programming

That is, of course, patently untrue⁴. Some features of a strong static type system are definitely useful to have in a functional language. You can look at Elm to see how awkward things become when you deprive an FP language of its typeclasses and composition sugar.

But when the focus on type systems becomes too heavy, the concepts keep piling up and the language becomes increasingly impenetrable. Eventually, you may end up with an ecosystem where the recommended way to implement an HTTP API is to call upon half a dozen compiler extensions in order to specify it as one humongous type.

But hey, isn’t it desirable to have this kind of increased type safety?

In principle, the answer would of course be yes. However, the price we pay here is in the precious currency of complexity, and it often turns out to be way too high. When libraries, frameworks, and languages get complicated and abstract, it’s not just safety and/or productivity that can (hopefully) increase — it is also the burden on developers’ thought processes. While the exact threshold of diminishing or even negative returns is hard to pinpoint, it can definitely be reached even by the smartest and most talented teams. Add in the usual obstacles of software engineering — shifting requirements, deadlines, turnover — and you may encounter it much sooner than you think.

For some, this is a sufficient justification to basically give up on type systems altogether. And while I’d say such a knee-jerk reaction is rather excessive and unwarranted, it is at least equally harmful to letting your typing regime grow in boundless complexity. Both approaches are just too extreme to stand the test of practicality.

The legacy of bleeding edge

In other words, Haskell is hard and this does count as one of its serious problems. This conclusion isn’t exactly novel or surprising, even if some people would still argue with it.

Suppose, however, that we have somehow caused this issue to disappear completely. Let’s say that through some kind of divine intervention, it was made so that the learning curve of Haskell is no longer a problem for the majority of programmers. Maybe we found a magic lamp and — for the lack of better ideas — we wished that everyone be as proficient in applicative parsers as they are in inheritance hierarchies.

Even in this hypothetical scenario, I posit that the value proposition of Haskell would still be a tough sell.

There is this old quote from Bjarne Stroustrup (creator of C++) where he says that programming languages divide into those everyone complains about, and those that no one uses.
The first group consists of old, established technologies that managed to accrue significant complexity debt through years and decades of evolution. All the while, they’ve been adapting to the constantly shifting perspectives on what are the best industry practices. Traces of those adaptations can still be found today, sticking out like a leftover appendix or residual tail bone — or like the built-in support for XML in Java.

Languages that “no one uses”, on the other hand, haven’t yet passed the industry threshold of sufficient maturity and stability. Their ecosystems are still cutting edge, and their future is uncertain, but they sometimes champion some really compelling paradigm shifts. As long as you can bear with things that are rough around the edges, you can take advantage of their novel ideas.

Unfortunately for Haskell, it manages to combine the worst parts of both of these worlds.

On one hand, it is a surprisingly old language, clocking more than two decades of fruitful research around many innovative concepts. Yet on the other hand, it bears the signs of a fresh new technology, with relatively few production-grade libraries, scarce coverage of some domains (e.g. GUI programming), and not too many stories of commercial successes.

There are many ways to do it

Nothing shows better the problems of Haskell’s evolution over the years than the various approaches to handling strings and errors that it now has.⁵

String theory

Historically, String has been defined as a list of Characters, which is normally denoted as the [Char] type. The good thing about this representation is that many string-based algorithms can simply be written using just the list functions.

The bad thing is that Haskell lists are the so-called cons lists. They consist of the single element (called head), followed by another list of the remaining elements (called tail). This makes them roughly equivalent to what the data structures theory calls a singly-linked list — a rarely used construct that has a number of undesirable characteristics:

linear time (O(n)) for finding a specific element in the list
linear time for finding an element with a specific index in the list
linear time for insertion in the middle of the list
poor cache coherency due to scattered allocations of list nodes⁶

On top of that, keeping only a single character inside each node results in a significant waste of memory.

Given those downsides, it isn’t very surprising that virtually no serious Haskell program uses Strings for any meaningful text processing. The community-accepted replacement is the text package, whose implementation stores strings inside packed arrays, i.e. just as you would expect. As a result, Haskell has at least two main types of “strings” — or even three, since Text has both lazy and strict variants.

That’s not all, however: there is also the bytestring package. Although technically it implements generic byte buffers, its API has been pretty rich and enticing. As a result, many other packages would rather use ByteStrings directly in their interfaces than to incur the conversions to and from Text.
And just like in case of Text, separate lazy and strict variants of ByteString are also available. But unlike Text, byte strings also have Word8 and Char8 versions, where the latter is designed to handle legacy cases of ASCII-exclusive text support.

Well, I hope you kept count of all these types! I also hope you can memorize the correct way of converting between them, because it’s commonplace to see them used simultaneously. This may sometimes happen even within the same library, but it definitely occurs in application code that utilizes many different dependencies. What it usually results in are numerous occurrences of something like Text.pack . foo . Text.unpack, with conversion functions copiously sprinkled in to help win in the Type Tetris.

Errors and how to handle them

A somewhat similar issue applies to error handling. Over the years, Haskell has tried many approaches to this problem, often mixing techniques that are very rarely found in a single language, like exceptions combined with result types.

Nowadays, there is some consensus about those mistakes of the past, but the best we got is their deprecation: the current version of GHC still supports them all.

What are all those techniques? Here’s an abridged list:

the error function, terminating the program with a message (which is obviously discouraged)
the fail method of the Monad typeclass (which is now deprecated and moved to MonadFail)
the MonadError class with the associated ErrorT transformer, now deprecated in favor of…
a different MonadError class, with ExceptT as the new transformer
exceptions in the IO monad, normally raised by the standard I/O calls to signal abnormal conditions and error; however, libraries and application code are free to also throw them and use for their own error handling
the Either sum type / monad, which is essentially a type-safe version of the venerable return codes

If you really stretched the definition of error handling, I could also imagine counting Maybe/MaybeT as yet another method. But even without it, that’s half a dozen distinct approaches which you are likely to encounter in the wild in one form or another.

Implicit is better than explicit

The other kind of troublesome legacy of Haskell relates to the various design choices in the language itself. They reflect ideas straight from the time they were conceived in, which doesn’t necessarily agree with the best engineering practices as we understand them now.

Leaky modules

Take the module system, for example.

Today, it is rather uncontroversial that the purpose of splitting code into multiple submodules is to isolate it as much as possible and prevent accidental dependencies. The benefit of such isolation is better internal cohesion for each module. This can simplify testing, improve readability, foster simplicity, and reduce cognitive burden on the maintainers.

Contemporary languages help achieving this goal by making inter-module dependencies explicit. If you want to use a symbol (functions, class) from module A inside another module B, you typically have to both:

declare it public in module A
explicitly import its name in module B

The first step helps to ensure that the API of module A is limited and tractable. The second step does the same to the external dependencies of module B.

Unfortunately, Haskell requires neither of those steps. In fact, it encourages precisely the opposite of well-defined, self-contained modules, all by the virtue of its default behaviors:

the default module declaration (module Foo where ...) implicitly declares every symbol defined in the module Foo as public and importable by others
the default import statement (import Foo) brings in every public symbol from the module Foo into the global namespace of the current module

In essence, this is like putting public on each and every class or method that you’d define in a Java project, while simultaneously using nothing but wildcard (star) imports. In a very short order, you will end up with project where everything depends on everything else, and nothing can be developed in isolation.

Namespaces are apparently a bad idea

Thankfully, it is possible to avoid this pitfall by explicitly declaring both your exported and imported symbols:

-- Foo.hs --
module Foo ( foo, bar ) where

foo = ...
bar = ...
baz = ...  -- not exported

-- Bar.hs --
import Foo (foo)
-- `bar` is inaccessible here, but `foo` is available

But while this helps fighting the tangle of dependencies, it still results in cluttering the namespace of any non-trivial module with a significant number of imported symbols.

In many other languages, you can instead import the module as a whole and only refer to its members using qualified names. This is possible in Haskell as well, though it requires yet another variant of the import statement:

import qualified Data.Text as Text

duplicateWords :: Text.Text -> Text.Text
duplicateWords = Text.unwords . map (Text.unwords . replicate 2) . Text.words

What if you want both, though? In the above code, for example, the qualified name Text.Text looks a little silly, especially when it’s such a common type. It would be nice to import it directly, so that we can use it simply as Text.

Unfortunately, this is only possible when using two import statements:

import Data.Text (Text)
import qualified Data.Text as Text

duplicateWords :: Text -> Text
duplicateWords = Text.unwords . map (Text.unwords . replicate 2) . Text.words

You will find this duplication pervasive throughout Haskell codebases. Given how it affects the most important third-party packages (like text and bytestring), there have been a few proposals to improve the situation⁷, but it seems that none can go through the syntax bikeshedding phase.

Contrast this with Rust, for example, where it’s common to see imports such as this:

use std::io::{self, Read};

fn read_first_half(path: &Path) -> io::Result<String> {
    // (omitted)
}

where self conveniently stands for the module as a whole.

Wild records

Another aspect of the difficulties with keeping your namespaces in check relates to Haskell record types — its rough equivalent of structs from C and others.

When you define a record type:

data User = User { usrFirstName :: String
                 , usrLastName :: String
                 , usrEmail :: String
                 } deriving (Show)

you are declaring not one but multiple different names, and dumping them all straight into the global namespace. These names include:

the record type (here, User)
its type constructor (also User, second one above)
all of its fields (usrFirstName, usrLastName, usrEmail)

Yep, thats right. Because Haskell has no special syntax for accessing record fields, each field declaration creates an unqualified getter function. Combined with the lack of function overloading, this creates many opportunities for name collisions.

This is why in the above example, Hungarian notation is used to prevent those clashes. Despite its age and almost complete disuse in basically every other language, it is still a widely accepted practice in Haskell⁸.

Purity beats practicality

We have previously discussed the multiple ways of working with strings and handling errors in Haskell. While somewhat confusing at times, there at least appears to be an agreement in the community as to which one should generally be preferred.

This is not the case for some subtler and more abstract topics.

Haskell is, famously, a purely functional programming language. Evaluating functions, in a mathematical sense, is all a Haskell program is supposed to be doing. But the obvious problem is that such a program wouldn’t be able to do anything actually useful; there needs to be some way for it to effect the environment it runs in, if only to print the results it computed.

How to reconcile the functional purity with real-world applications is probably the most important problem that the Haskell language designers have to contend with. After a couple of decades of research and industrial use it still doesn’t have a satisfactory answer.

Yes, there is the IO monad, but it is a very blunt instrument. It offers a distinction between pure code and “effectful” code, but allows for no granularity or structure for the latter. An IO-returning function can do literally anything, while a pure function can only compute some value based on its arguments. Most code, however, is best placed somewhere between those two extremes.

How to represent different varieties of effects (filesystem, logging, network, etc.)?
How to express them as function constraints that can be verified by the compiler?
How to compose them? How to extend them?

These (and others) are still very much open questions in the Haskell community. The traditional way of dealing with them are monad transformers, but they suffer from many shortcomings⁹. More recent solutions like effects or free monads are promising, but exhibit performance issues that likely won’t be solvable without full compiler support. And even so, you can convincingly argue against those new approaches, which suggests that we may ultimately need something else entirely.

Of course, this state of affairs doesn’t really prevent anyone from writing useful applications in Haskell. “Regular” monads are still a fine choice. Indeed, even if you end up stuffing most of your code inside plain IO, it will already be a step up compared to most other languages.

Good Enough™

Incidentally, something similar could probably be said about the language as a whole.

Yes, it has numerous glaring flaws and some not-so-obvious shortcomings.
Yes, it requires disciplined coding style and attention to readability.
Yes, it will force you to courageously tackle problems that are completely unknown to programmers using other languages.
In the end, however, you will probably find it better than most alternatives.

Basically, Haskell is like pizza: even when it’s bad, it is still pretty good.

But what’s possibly the best thing about it is that you don’t even really need to adopt Haskell in order to benefit from its innovations (and avoid the legacy warts).

There is already a breed of mainstream languages that can aptly be characterized as “Haskell-lite”: heavily influenced by FP paradigms but without subscribing to them completely. The closest example in this category is of course Scala, while the newest one would be Rust.
In many aspects, they offer a great compromise that provides some important functional features while sparing you most of the teething issues that Haskell still has after almost 30 years. Functional purists may not be completely satisfied, but at least they’ll get to keep their typeclasses and monoids.

And what if you don’t want to hear about this FP nonsense at all?… Well, I’m afraid it will get harder and harder to avoid. These days, it’s evidently fine for a language to omit generics but it seems inconceivable to ditch first-class functions. Even the traditional OOP powerhouse like Java cannot do without support for anonymous (“lambda”) functions anymore. And let’s not forget all the numerous examples of monadic constructs that pervade many of the mature APIs, libraries, and languages.

So even if you, understandably, don’t really want to come to Haskell, it’s looking more and more likely that Haskell will soon come to you :)

In case of Go, I’m of course referring to a feature that’s notoriously missing from the language. ↩
For a close analogue in languages other than Haskell, you can look at the current state of procedural macros in Rust (commonly known as “custom derives”). ↩
What seems to excite the Haskell community in 2018, for example, are things like linear types and dependent types. ↩
The obvious counterexample is Clojure and its cousins in the Lisp family of languages. ↩
Although the abundance of pretty-printing libraries is high up there, too :) ↩
This can be mitigated somewhat by using a contiguous chunk of memory through a dedicated arena allocator, or implementing the list as an array. ↩
See for example this project. ↩
Some GHC extensions like DisambiguateRecordFields allow for correct type inference even in case of “overloaded” field names, though. ↩
To name a few: they don’t compose well (e.g. can only have one instance of a particular monad in the stack); they can cause some extremely tricky bugs; they don’t really cooperate with the standard library which uses IO everywhere (often requiring tricks like this). ↩