The Printer Monad in Haskell
Posted on Fri 04 August 2017 in Code • Tagged with Haskell, Writer, monads, monad transformers, WriterT • Leave a comment
Quite recently, I have encountered an interesting case of monad-based refactoring in Haskell.
Suppose you have a ComplicatedRecord
that holds the results of some lengthy and important process in your program.
You want to present that data to the user in a nicely formatted way,
so you write a function which begins somewhat like this:
{-# LANGUAGE RecordWildcards #-}
-- | Pretty-print the content of the record.
ppRecord :: ComplicatedRecord -> IO ()
ppRecord ComplicatedRecord{..} = do
-- ...
Inside, there is plenty of putStrLn
calls, likely hidden inside more specific subfunctions
that format all the numerous parts of ComplicatedRecord
.
But the IO
monad isn’t there just for printing:
because the code went through multiple iterations,
some of this logic actually takes advantage of it by making additional system & network calls.
So yeah, it’s not particularly pretty.
Now, however, we find out that the output we’re printing here shouldn’t always go directly to stdout. In some cases, unsurprisingly, we actually want it back as a single string, without having it sent to the standard output at all.
Just $ return . it
Your first instinct here may be to simply give back the final string (well, Text
)
as the function result1:
ppRecord :: ComplicatedRecord -> IO Text
However, this turns out to be rather awkward. While in most other languages we would simply accumulate output by progressively adding more data to a mutable result, this would be much more inconvenient (and somewhat weird) to do in Haskell.
This is where the stdout-based approach seems cleaner; instead of straightforward, sequential code like this:
{-# LANGUAGE OverloadedStrings #-}
import Control.Monad.Extra (whenJust)
import Data.Text.IO
import TextShow
ppOrder Order{..} = do
putStrLn $ "Order #" <> ordNumber
ppAddress ordDeliveryAddress
forM_ (zip [1..] ordItems) $ \(i, Item{..}) -> do
putStrLn $ showt (i::Int) <> ". " <> itName <> " x" <> showt itQuantity
whenJust ordBillingAddress ppAddress
ppAddress Address{..} = do
putStrLn $ addrFirstName <> " " <> addrLastName
putStrLn addrLine1
whenJust addrLine2 putStrLn
putStrLn $ addrCity <> ", " <> addrPostalCode
we have to overhaul each function and turn it into a much less pleasant “mappend
age”:
ppOrder Order{..} = unlines $ mconcat
[ "Order #" <> ordNumber
, ppAddress ordDeliveryAddress
, ppItems ordItems
] + maybe [] (\addr -> [ppAddress addr]) ordBillingAddress
where
ppItems = mconcat . map (uncurry ppItem) . zip [1..]
ppItem i Item{..} = showt i <> ". " <> itName <> " x" <> itQuantity
One may argue that this is, in fact, the more idiomatic approach,
but I’m not very fond of all those commas.
Plus, it shows rather clearly that any conditional logic (like with ordBillingAddress
here)
is going to get pretty cumbersome.
Along comes the Writer
What I’m saying here is that even in pure code,
it is sometimes very desirable to have a do
notation.
For that, however, we need a suitable Monad
2 to provide the meaning of “invisible semicolon” in a do
block.
And Text
, obviously, isn’t one.
Neither is [Text]
(lines of text),
nor any other type we’d use to represent the final output of formatting & printing.
They are unsuitable, because they cannot encode the computation that eventually produces said output —
either the top-level one (ppRecord
) or any of its building blocks (like the ppOrder
or ppAddress
),
down to a most elementary putStrLn
.
The only thing they can stand for is the result itself.
Fortunately, the pattern of executing code and occassionally producing some “additional” output
has been abstracted over in the Haskell standard library.
This is exactly the use case for the Writer
monad!
The definition of Writer
is roughly equivalent to the following:
newtype Writer w a = ... -- omitted
Of the two type parameters it takes, the w
one signifies what output it can produce “on the side”.
This is contrasted with a
which is the regular result of a monadic expression or function.
In our case, a
will basically always be ()
(unit/”empty” type),
but it is nonetheless necessary for the Writer
to behave as a monad.
To complement the above definition, Writer
comes with several useful functions.
Among those, the most interesting one is tell
:
tell :: w -> Writer w ()
write
would’ve probably been a better name for it,
as it’s definitely the main and defining operation of Writer
.
Looking at its signature, we can see it takes a bit of the Writer
‘s output (w
)
and results in a Writer
action.
Internally, it will simply add the argument to the already accumulated output of the writer3.
To make everything more concrete,
here’s a literal “Hello world” example coded very verbosly as a Writer
action:
import Control.Monad.Writer
hello :: Writer Text ()
hello = do
tell "Hello"
tell " "
tell "world"
main :: IO ()
main = do
let (_, greeting) = runWriter hello
Text.putStrLn greeting
It also contains the last element of the Writer
puzzle:
runWriter :: Writer w a -> (a, w)
Like its name suggests, this function will “run” any Writer
action that we give it,
returning both the “regular” result (a
) plus any output passed in tell
s (w
).4
My little monad: transformers are magic
The last example may be very simple,
but it contains all the building blocks for many of the printing functions we need.
If we define a convenience wrapper for tell
:
putLn :: Text -> Writer Text ()
putLn line = tell $ line <> "\n"
then both ppAddress
and ppOrder
can be translated
through a mere mechanical substitution of putStrLn
with putLn
:
ppAddress Address{..} = do
putLn $ addrFirstName <> " " <> addrLastName
putLn addrLine1
whenJust addrLine2 putLn
putLn $ addrCity <> ", " <> addrPostalCode
-- ppOrder omitted
Unfortunately, a bare Writer
like this can only work for pure code,
which isn’t a luxury we can expect in every situtation.
In my case, some of the printing logic was tied pretty strongly to IO
,
and it would be difficult and time consuming to decouple it.
Thankfully, the reliance on IO
isn’t a complete deal breaker.
While we cannot ensure that nothing calls putStrLn
anymore,
we can provide the tell
/putLn
capabilities alongside whatever other IO
calls
our code has to make (for now).
To achieve that, we need to create a monad stack with WriterT
:
newtype WriterT w m a = ... -- omitted
WriterT
is a monad transformer, one of those scary Haskell concepts
that are actually simpler than they appear on the surface.
This is because transfomers like WriterT
are mere wrappers.
The only difference between it and a regular Writer
is the additional m
parameter,
which is the inner monad we’re packaging inside a new Writer
.
Here (and in many other cases), m
will be substituted with IO
:
type Printer a = WriterT Text IO a -- w == Text, m == IO
thus creating the titular Printer
monad.
This hybrid beast can both output Text
through the Writer
API,
as well as perform any additional IO
operations
that the code may (still) require.
Below is an example;
the User
record requires an I/O call to get the size of its $HOME
directory:
import Control.Monad.IO.Class (liftIO)
import System.Directory (getFileSize)
-- To print this data type nicely, we sadly require I/O :(
data User = User { usrName :: Text
, usrHomeDir :: FilePath
}
ppUser :: User -> IO Text
ppUser User{..} = snd <$> runWriterT $ do
putLn $ "Name: " <> usrName
homeSize <- liftIO $ getFileSize usrHomeDir
putLn $ "$HOME: " <> showt usrHomeDir <> "(" <> showt homeSize <> " bytes)"
As a bit of necessary cruft,
we have to use liftIO
to “lift” (wrap) IO
actions such as getFileSize
in a full Printer
monad before executing them.
Besides everything else you can think of,
this is yet another argument for eventually getting rid of the IO
:)
Making the monads coexist
But our job isn’t done yet.
Despite looking very reasonable, this version of ppUser
doesn’t actually compile!
The actual type error may vary a little,
but it all boils down to a difference between WriterT Text IO ()
(i.e. Printer ()
)
and Writer Text ()
at each call site of putLn
.
GHC is obviously correct.
However, the problem lies not in how we’re calling putLn
,
but rather the way it’s been defined:
putLn :: Text -> Writer Text ()
This type can only produce a specific, pure Writer
action.
But to fit inside the do
block of our compound monad,
we need the Writer
+ IO
combo from WriterT Text IO
(i.e. Printer
).
We can try to address the mismatch by changing the signature to:
putLn :: Text -> WriterT Text IO () -- or just: Printer ()
but this will only result in the opposite problem.
Now, all the pure printers like ppAddress
are facing the fact
that putLn
is a (wrapped) IO
action, despite not actually doing any I/O whatsoever.
The obvious question is, can we have something that fits both?
Earlier on, I’ve said that both vanilla Writer
and the IO
-spruced Printer
support the “Writer
API”,
most notably the tell
function.
This notion of a “monadic interface” isn’t just hand-waving, though,
and Haskell (obviously!) provides a way to express it programmatically.
Meet the MonadWriter
typeclass:
class (Monad m, Monoid w) => MonadWriter w m
Any monad that can work as a Writer
will be an instance of it,
regardless of whether it wraps over IO
or anything else.
Functions like tell
are defined to be
polymorphic over it,
enabling us to leverage the same technique they use when we define putLn
:
{-# LANGUAGE FlexibleContexts #-}
import Control.Monad.Writer.Class
putLn :: MonadWriter Text m => Text -> m ()
putLn line = tell $ line <> "\n"
If you aren’t very familiar with this syntax,
the part before =>
is a typeclass constraint, or context.
It defines the requirements to be satisfied by types
which are later used in the function signature.
Here, we request a MonadWriter
instance — one where Text
is the output
but anything can be the inner monad.
We refer to that unknown monad only as m
, a type variable.
The compiler will figure out what to substitute for it at every call site of putLn
.
As a result, both a pure Writer
and the IO
-bound Printer
can now use it.
In the second case, the relevant instance of MonadWriter
will,
naturally, have IO
fill in the m
position.
But curiously, the “pure” Writer
also has
an inner monad.
It just literally does nothing but wrap some other value:
newtype Identity a = Identity { runIdentity :: a }
In most cases, this fact is hidden behind
the real definition of Writer
,
though runIdentity
may sometimes come handy for some on-the-spot type hacks5.
The wrap
The many things we’ve talked about here could of course be a starting point
for even more advanced stuff, but obviously we have to stop somewhere!
But don’t worry: knowing about MonadWriter
and other monad typeclasses like this
is enough to write quite idiomatic code…
…at least until you learn about free monads, effects, and the like ;-)
In any case, you can check this gist for the complete code from this post.
-
IO
is still necessary due to ad-hoc network fetches and syscalls mentioned earlier. ↩ -
Or at least an
Applicative
, via theApplicativeDo
GHC extension. ↩ -
The adding is done via
mappend
, requiringw
to be aMonoid
. ↩ -
There is also the
execWriter
variant which is actually more practical here as it only returns the accumulated output. ↩ -
We could, for example, use it alongside
mapWriterT
to “fix” the calls toputLn
if we didn’t have control over its definition. ↩