Karol Kuczmarski's Blog

Recap of the gisht project

Posted on Fri 24 November 2017 in Programming • Tagged with Rust, gisht, CLI, GitHub, Python, testing • Leave a comment

In this post, I want to discuss some of the experiences I had with a project that I recently finished, gisht. By “finished” I mean that I don’t anticipate developing any new major features for it, though smaller things, bug fixes, or non-code stuff, is of course still very possible.

I’m thinking this is as much “done” as most software projects can ever hope to be. Thus, it is probably the best time for a recap / summary / postmortem / etc. — something to recount the lessons learned, and assess the choices made.

Some context

The original purpose of gisht was to facilitate download & execution of GitHub gists straight from the command line:

$ gisht Xion/git-outgoing  # run the https://gist.github.com/Xion/git-outgoing gist

I initially wrote its first version in Python because I’ve accumulated a sizable number of small & useful scripts (for Git, Unix, Python, etc.) which were all posted as gists. Sure, I could download them manually to ~/bin every time I used a new machine but that’s rather cumbersome, and I’m quite lazy.

Well, lazy and impatient :) I noticed pretty fast that the speed tax of Python is basically unacceptable for a program like gisht.

What I’m referring to here is not the speed of code execution, however, but only the startup time of Python interpreter. Irrespective of the machine, operating system, or language version, it doesn’t seem to go lower than about one hundred milliseconds; empirically, it’s often 2 or 3 times higher than that. For the common case of finding a cached gist (no downloads) and doing a simple fork+exec, this startup time was very noticeable and extremely jarring. It also precluded some more sophisticated uses for gisht, like putting its invocation into the shell’s $PROMPT¹.

Speed: delivered

And so the obvious solution emerged: let’s rewrite it in Rust!…

Because if I’m executing code straight from the internet, I should at least do it in a safe language.

But jokes aside, it is obvious that a language compiling to native code is likely a good pick if you want to optimize for startup speed. So while the choice of Rust was in large part educational (gisht was one of my first projects to be written in it), it definitely hasn’t disappointed there.

Even without any intentional optimization efforts, the app still runs instantaneously. I tried to take some measurements using the time command, but it never ticked into more than 0.001s. Perceptively, it is at least on par with git, so that’s acceptable for me :)

Can’t segfault if your code doesn’t build

Achieving the performance objective wouldn’t do us much good, however, if the road to get there involved excessive penalties on productivity. Such negative impact could manifest in many ways, including troublesome debugging due to a tricky runtime², or difficulty in getting the code to compile in the first place.

If you had even a passing contact with Rust, you’d expect the latter to be much more likely than the former.

Indeed, Rust’s very design eschews runtime flexibility to a ridiculous degree (in its “safe” mode, at least), while also forcing you to absorb subtle & complex ideas to even get your code past the compiler. The reward is increased likelihood your program will behave as intended — although it’s definitely not on the level of “if it compiles, it works” that can be offered by Haskell or Idris.

But since gisht is hardly mission critical, I didn’t actually care too much about this increased reliability. I don’t think it’s likely that Rust would buy me much over something like modern C++. And if I were to really do some kind of cost-benefit analysis of several languages — rather than going with Rust simply to learn it better — then it would be hard to justify it over something like Go.

It scales

So the real question is: has Rust not hampered my productivity too much? Having the benefit of hindsight, I’m happy to say that the trade-off was definitely acceptable :)

One thing I was particularly satisfied with was the language’s scalability. What I mean here is the ability to adapt as the project grows, but also to start quickly and remain nimble while the codebase is still pretty small.

Many languages (most, perhaps) are naturally tailored towards the large end, doing their best to make it more bearable to work with big codebases. In turn, they often forget about helping projects take off in the first place. Between complicated build systems and dependency managers (Java), or a virtual lack of either (C++), it can be really hard to get going in a “serious” language like this.

On the other hand, languages like Python make it very easy to start up and achieve relatively impressive results. Some people, however, report having encountered problems once the code evolves past certain size. While I’m actually very unsympathetic to those claims, I realize perception plays a significant role here, making those anecdotal experiences into a sort of self-fulfilling prophecy.

This perception problem should almost certainly spare Rust, as it’s a natively compiled and statically typed language, with a respectable type system to boot. There is also some evidence that the language works well in large projects already. So the only question that we might want to ask is: how easy it is to actually start a project in Rust, and carry it towards some kind of MVP?

Based on my experiences with gisht, I can say that it is, in fact, quite easy. Thanks mostly to the impressive Swiss army knife of cargo — acting as both package manager and a rudimentary build system — it was almost Python-trivial to cook a “Hello World” program that does something tangible, like talk to a JSON API. From there, it only took a few coding sessions to grow it into a functioning prototype.

Abstractions galore

As part of rewriting gisht from Python to Rust, I also wanted to fix some longstanding issues that limited its capabilities.

The most important one was the hopeless coupling to GitHub and their particular flavor of gists. Sure, this is where the project even got its name from, but people use a dozen of different services to share code snippets and it should very possible to support them all.

Here’s where it became necessary to utilize the abstraction capabilities that Rust has to offer. It was somewhat obvious to define a Host trait but of course its exact form had to be shaped over numerous iterations. Along the way, it even turned out that Result<Option<T>> and Option<Result<T>> are sometimes both necessary as return types :)

Besides cleaner architecture, another neat thing about an explicit abstraction is the ability to slice a concept into smaller pieces — and then put some of them back together. While the Host trait could support a very diverse set of gist services and pastebins, many of them turned out to be just a slight variation of one central theme. Because of this similarity, it was possible to introduce a single Basic implementation which handles multiple services through varying sets of URL patterns.

Devices like these aren’t of course specific to Rust: interfaces (traits) and classes are a staple of OO languages in general. But some other techniques were more idiomatic; the concept of iterators, for example, is flexible enough to accommodate looping over GitHub user’s gists, even as they read directly from HTTP responses.

Hacking time

Not everything was sunshine and rainbows, though.

Take clap, for example. It’s mostly a very good crate for parsing command line arguments, but it couldn’t quite cope with the unusual requirements that gisht had. To make gisht Foo/bar work alongside gisht run Foo/bar, it was necessary to analyze argv before even handing it over to clap. This turned out to be surprisingly tricky to get right. Like, really tricky, with edges cases and stuff. But as it is often the case in software, the answer turned out to be yet another layer of indirection plus a copious amount of tests.

In another instance, however, a direct library support was crucial.

It so happened that hyper, the crate I’ve been using for HTTP requests, didn’t handle the Link: response header out of the box³. This was a stumbling block that prevented the gist iterator (mentioned earlier) from correctly handling pagination in the responses from GitHub API. Thankfully, having the Header abstraction in hyper meant it was possible to add the missing support in a relatively straighforward manner. Yes, it’s not a universal implementation that’d be suitable for every HTTP client, but it does the job for gisht just fine.

Test-Reluctant Development

And so the program kept growing steadily over the months, most notably through more and more gist hosts it could now support.

Eventually, some of them would fall into a sort of twilight zone. They weren’t as complicated as GitHub to warrant writing a completely new Host instance, but they also couldn’t be handled via the Basic structure alone. A good example would be sprunge.us: mostly an ordinary pastebin, except for its optional syntax highlighting which may add some “junk” to the otherwise regular URLs.

In order to handle those odd cases, I went for a classic wrapper/decorator pattern which, in its essence, boils down to something like this:

pub struct Sprunge {
    inner: Basic,
}

impl Sprunge {
    pub fn new() -> {
        Sprunge{inner: Basic::new(ID, "sprunge.us",
                                  "http://sprunge.us/${id}", ...)}
    }
}

impl Host for Sprunge {
    // override & wrap methods that require custom logic:
    fn resolve_url(&self, url: &str) -> Option<io::Result<Gist>> {
        let mut url_obj = try_opt!(Url::parse(url).ok());
        url_obj.set_query(None);
        inner.resolve_url(url_obj.to_string().as_str())
    }

    // passthrough to the `Basic` struct for others:
    fn fetch_gist(&self, gist: &Gist, mode: FetchMode) -> io::Result<()> {
        self.inner.fetch_gist(gist, mode)
    }
    // (etc.)
}

Despite the noticeable boilerplate of a few pass-through methods, I was pretty happy with this solution, at least initially. After a few more unusual hosts, however, it became cumbersome to fix all the edge cases by looking only at the final output of the inner Basic implementation. The code was evidently asking for some tests, if only to check how the inner structure is being called.

Shouldn’t be too hard, right?… Yeah, that’s what I thought, too.

The reality, unfortunately, fell very short of those expectations. Stubs, mocks, fakes — test doubles in general — are a dark and forgotten corner of Rust that almost no one seems to pay any attention to. Absent a proper library support — much less a language one — the only way forward was to roll up my sleeves and implement a fake Host from scratch.

But that was just the beginning. How do you seamlessly inject this fake implementation into the wrapper so that it replaces the Basic struct for testing? If you are not careful and go for the “obvious” solution — a trait object:

pub struct Sprunge {
    inner: Box<Host>,
}

you’ll soon realize that you need not just a Box, but at least an Rc (or maybe even Arc). Without this kind of shared ownership, you’ll lose your chance to interrogate the test double once you hand it over to the wrapper. This, in turn, will heavily limit your ability to write effective tests.

What’s the non-obvious approach, then? The full rationale would probably warrant a separate post, but the working recipe looks more or less like this:

First, parametrize the wrapper with its inner type: pub struct Sprunge<T: Host> { inner: T }.

Put that in an internal module with the correct visibility setup:

mod internal {
    pub struct Sprunge<T: Host> {
        pub(super) inner: T,
    }
}

Make the regular (“production”) version of the wrapper into an alias, giving it the type parameter that you’ve been using directly⁴:
```
pub type Sprunge = internal::Sprunge<Basic>;
```
Change the new constructor to instantiate the internal type.
In tests, create the wrapper with a fake inner object inside.

As you can see in the real example, this convoluted technique removes the need for any pointer indirection. It also permits you to access the out-of-band interface that a fake object would normally expose.

It’s a shame, though, that so much work is required for something that should be very simple. As it appears, testing is still a neglected topic in Rust.

Packing up

It wasn’t just Rust that played a notable role in the development of gisht.

Pretty soon after getting the app to a presentable state, it became clear that a mere cargo build won’t do everything that’s necessary to carry out a complete build. It could do more, admittedly, if I had the foresight to explore Cargo build scripts a little more thoroughly. But overall, I don’t regret dropping back to my trusty ol’ pick: Python.

Like in a few previous projects, I used the Invoke task runner for both the crucial and the auxiliary automation tasks. It is a relatively powerful tool — and probably the best in its class in Python that I know of — though it can be a bit capricious if you want to really fine-tune it. But it does make it much easier to organize your automation code, to reuse it between tasks, and to (ahem) invoke those tasks in a convenient manner.

In any case, it certainly beats a collection of disconnected Bash scripts ;)

What have I automated in this way, you may ask? Well, a couple of small things; those include:

embedding of the current Git commit hash into the binary, to help identify the exact revision in the logs of any potential bug reports⁵
after a successful build, replacing the Usage section in README with the program’s --help output
generating completion scripts for popular shells by invoking the binary with a magic hidden flag (courtesy of clap)

Undoubtedly the biggest task that I relegated to Python/Invoke, was the preparation of release packages. When it comes to the various Linuxes (currently Debian and Red Hat flavors), this wasn’t particularly complicated. Major thanks are due to the amazing fpm tool here, which I recommend to anyone who needs to package their software in a distro-compatible manner.

Homebrew, however — or more precisely, OS X itself — was quite a different story. Many, many failed attempts were needed to even get it to build on Travis, and the additional dependency on Python was partially to blame. To be fair, however, most of the pain was exclusively due to OpenSSL; getting that thing to build is always loads of “fun”, especially in such an opaque and poorly debuggable environment as Travis.

The wrap

There’s probably a lot of minor things and tidbits I could’ve mentioned along the way, but the story so far has most likely covered all the important topics. Let’s wrap it up then, and highlight some interesting points in the classic Yay/Meh/Nay manner.

Yay

It was definitely a good choice to rewrite gisht specifically in Rust. Besides all the advantages I’ve mentioned already, it is also worth noting that the language went through about 10 minor version bumps while I was working on this project. Of all those new releases, I don’t recall a single one that would introduce a breaking change.
Most of the Rust ecosystem (third-party libraries) was a joy to use, and very easy to get started with. Honorable mention goes to serde_json and how easy it was to transition the code from rustc_serialize that I had used at first.
With a possible exception of sucking in node.js as a huge dependency of your project and using Grunt, there is probably no better way of writing automation & support code than Python. There may eventually be some Rust-based task runners that could try to compete, but I’m not very convinced about using a compiled language for this purpose (and especially one that takes so long to build).

Meh

While the clap crate is quite configurable and pretty straightforward to use, it does lack at least one feature that’d be very nice for gisht. Additionally, working with raw clap is often a little tedious, as it doesn’t assist you in translating parsed flags into your own configuration types, and thus requires shuffling those bits manually⁶.
Being a defacto standard for continuous integration in open-source projects, Travis CI could be a little less finicky. In almost every project I decide to use it for, I end up with about half a dozen commits that frantically try to fix silly configuration issues, all before even a simple .travis.yml works as intended. Providing a way to test CI builds locally would be an obvious way to avoid this churn.

Nay

Testing in Rust is such a weird animal. On one hand, there is a first-class, out-of-the-box support for unit tests (and even integration tests) right in the toolchain. On the other hand, the relevant parts of the ecosystem are immature or lacking, as evidenced by the dreary story of mocking and stubbing. It’s no surprise that there is a long way to catch up to languages with the strongest testing culture (Java and C#/.NET⁷), but it’s disappointing to see Rust outclassed even by C++.
Getting anything to build reliably on OSX in a CI environment is already a tall order. But if it involves things as OpenSSL, then it quickly goes from bad to terrible. I’m really not amused anymore how this “Just Works” system often turns out to hardly work at all.

Since I don’t want to end on such a negative note, I feel compelled to state the obvious fact: every technology choice is a trade-off. In case of this project, however, the drawbacks were heavily outweighed by the benefits.

For this reason, I can definitely recommend the software stack I’ve just described to anyone developing non-trivial, cross-platform command line tools.

This is not an isolated complaint, by the way, as the interpreter startup time has recently emerged as an important issue to many developers of the Python language. ↩
Which may also include a practical lack thereof. ↩
It does handle it now, fortunately. ↩
Observant readers may notice that we’re exposing a technically private type (internal::Sprunge) through a publicly visible type alias. If that type was actually private, this would trigger a compiler warning which is slated to become a hard error at some point in the future. But, amusingly, we can fool the compiler by making it a public type inside a private module, which is exactly what we’re doing here. ↩
This has since been rewritten and is now done in build.rs — but that’s only because I implemented the relevant Cargo feature myself :) ↩
For an alternative approach that doesn’t seem to have this problem, check the structopt crate. ↩
Dynamically typed languages, due to their rich runtime, are basically a class of their own when it comes to testing ease, so it wouldn’t really be fair to hold them up for comparison. ↩

Better location for unit tests in Rust

Posted on Fri 06 January 2017 in Code • Tagged with Rust, unit tests, testing, modules • Leave a comment

For a unit test to be comprehensive, it must often access some private symbols from the module it checks.

In Rust, this is permitted for submodules: they can freely refer to anything defined “upwards” in the module hierarchy. The only requirement is that they import it explicitly by name, using statements such as use super::foo.

To illustrate this, here’s an example of a ridiculously well-factored FizzBuzz along with its accompanying unit test:

use std::borrow::Cow;

pub fn fizzbuzz(n: u32) {
    for i in 1..n+1 {
        println!("{}", fizzbuzz_string(i));
    }
}

fn fizzbuzz_string(i: u32) -> Cow<'static, str> {
    let by3 = i % 3 == 0;
    let by5 = i % 5 == 0;
    if by3 && by5 { "FizzBuzz".into() }
    else if by3   { "Fizz".into() }
    else if by5   { "Buzz".into() }
    else          { format!("{}", i).into() }
}


#[cfg(test)]
mod tests {
    use super::fizzbuzz_string;

    #[test]
    fn single_numbers() {
        assert_eq!("1", fizzbuzz_string(1));
        assert_eq!("2", fizzbuzz_string(2));
        assert_eq!("Fizz", fizzbuzz_string(3));
        assert_eq!("Buzz", fizzbuzz_string(5));
        assert_eq!("7", fizzbuzz_string(7));
        assert_eq!("Fizz", fizzbuzz_string(9));
        assert_eq!("Buzz", fizzbuzz_string(10));
        assert_eq!("FizzBuzz", fizzbuzz_string(15));
        # etc.
    }
}

The internal function, as shown above, can be imported and verified independently of the public one. This is done through a #[test] procedure in an inline submodule.

Such factorization and granular testing is commonplace, especially when the public API may cause unwanted side effects, such as printing stuff to stdout here.

The issue of length

But if you are like me and prefer your modules to be short and sweet, you may feel justifiably concerned about this inline submodule business.

In the toy example above, tests have already taken at least as many lines as the actual code. Real world usually matches this ratio. A module with a couple hundred lines of regular code starts to be measured in KLOCs if we also include its tests.

While this could be taken as a strong hint to split things up, it can just as easily disincentivize testing instead.

The obvious solution is to move those tests somewhere else. What is not so evident is how to preserve this crucial module-submodule relation, enabling us to write comprehensive tests in the first place.

Looking for inspiration

I must quickly disappoint anyone who would like to round up all their unit tests and sequester them in some distant tests/ directory. Such layout is reserved for crate-level (“integration”) tests. Unit tests, on the other hand, are predestined to live among production code¹.

So let’s at least relocate them to separate files.

To make this goal more concrete, we will try to emulate the project layout described in Google’s C++ style guide. By this convention, a conceptual “module” or “unit” consists of the following files:

foo.h
foo.cc
foo_test.cc

Translating this to Rust, we get:

foo.rs
foo_test.rs

The first one is obviously our production code. The second file, foo_test.rs, contains all the tests we would previously put in the mod tests { } construct.

Seems pretty clean and straightforward, right? Unfortunately, Rust will not accept this setup without some convincing.

Family problems

To understand why, recall that the mere presence of some .rs files is not enough for the Rust compiler to care. If we want them picked up and included in the project, we also need to add some module declarations first.

In other words, there must also be a mod.rs file in this directory, containing at the very least the following content:

// (mod.rs)

mod foo;
#[cfg(test)]
mod foo_test;

Now it should be clearer that something is wrong.

We got two modules here, but they are siblings. Both foo and foo_test are on the same level, children of whatever parent module contains them both. More to the point, it’s foo_test that’s not a child module of foo, meaning it can only see the public symbols of the latter.

This is not quite enough to write a proper unit test. It definitely isn’t for our initial FizzBuzz example, because the fizzbuzz_string function cannot even be imported!

Existential crises

Okay, so how about we move the mod foo_test; declaration to foo.rs? This should be enough to establish the proper hierarchy. After all, this is how the module tree is normally reconstructed: from the appropriate placement of the mod statements.

So, here we go:

// (foo.rs)

#[cfg(test)]
mod foo_test;

error: cannot declare a new module at this location
  --> src/parent/foo.rs:4:5
   |
 4 | mod foo_test;

…Really?

Well, yes. A declaration like this simply isn’t allowed. The reason for this is actually much less arbitrary than the error message would indicate.

To put it bluntly, foo_test simply cannot exist if it’s introduced there. To deliver on its declaration promise, the submodule would have to reside within foo itself. But of course, foo.rs is just a file, so this setup is evidently impossible.

All in all, Rust seems to be looking for our module in all the wrong places.

Perhaps we can just tell it where it should be going instead?…

The right path

Enter the #[path] attribute, which fulfills this exact purpose:

// (foo.rs)

#[cfg(test)]
#[path = "./foo_test.rs"]
mod foo_test;

#[path] tells the Rust compiler where to look for the module it is attached to. Its argument is relative to the location of the outer module (like foo here), and can be either a single file, or a directory with mod.rs.

Conceptually, this is similar to a custom ClassLoader in Java, or the common sys.path hacks in Python. Unlike those two languages, however, the #[path] attribute is only relevant at compile time.

Additionally, and somewhat confusingly, #[path] can also be applied retroactively to a module that the compiler has already located. In such case, it will affect the lookup of any child modules by making rustc search for them in the new location.

With #[path] handy, it is therefore possible to implement custom layouts of regular source modules and test files.

But like with every tool that can be used to defy conventions, it should be used with the appropriate care. While a straightforward and self-documenting approach described here is unlikely to raise any eyebrows, rewriting module paths willy-nilly is most certainly a bad idea.

Okay, technically it is possible to completely isolate them, essentially by abusing the approach I describe later in this post. ↩

all and wild imports in Python

Posted on Mon 26 December 2016 in Code • Tagged with Python, modules, imports, testing • Leave a comment

An often misunderstood piece of Python import machinery is the __all__ attribute. While it is completely optional, it’s common to see modules with the __all__ list populated explicitly:

__all__ = ['Foo', 'bar']

class Foo(object):
    # ...

def bar():
    # ...

def baz():
    # ...

Before explaining what the real purpose of __all__ is (and how it relates to the titular wild imports), let’s deconstruct some common misconceptions by highlighting what it isn’t:

__all__ doesn’t prevent any of the module symbols (functions, classes, etc.) from being directly imported. In our the example, the seemingly omitted baz function (which is not included in __all__), is still perfectly importable by writing from module import baz.
Similarly, __all__ doesn’t influence what symbols are included in the results of dir(module) or vars(module). So in the case above, a dir call would result in a ['Foo', 'bar', 'baz'] list, even though 'baz' does not occur in __all__.

In other words, the content of __all__ is more of a convention rather than a strict limitation. Regardless of what you put there, every symbol defined in your module will still be accessible from the outside.

This is a clear reflection of the common policy in Python: assume everyone is a consenting adult, and that visibility controls are not necessary. Without an explicit __all__ list, Python simply puts all of the module “public” symbols there anyway¹.

The meaning of it `all`

So, what does __all__ actually effect?

This is neatly summed up in this brief StackOverflow answer. Simply speaking, its purpose is twofold:

It tells the readers of the source code — be it humans or automated tools — what’s the conventional public API exposed by the module.
It lists names to import when performing the so-called wild import: from module import *.

Because of the default content of __all__ that I mentioned earlier, the public API of a module can also be defined implicitly. Some style guides (like the Google one) are therefore relying on the public and _private naming exclusively. Nevertheless, an explicit __all__ list is still a perfectly valid option, especially considering that no approach offers any form of actual access control.

Import star

The second point, however, has some real runtime significance.

In Python, like in many other languages, it is recommended to be explicit about the exact functions and classes we’re importing. Commonly, the import statement will thus take one of the following forms:

import random
import urllib.parse
from random import randint
from logging import fatal, warning as warn
from urllib.parse import urlparse
# etc.

In each case, it’s easy to see the relevant name being imported. Regardless of the exact syntax and the possible presence of aliasing (as), it’s always the last (qualified) name in the import statement, before a newline or comma.

Contrast this with an import that ends with an asterisk:

from itertools import *

This is called a star or wild import, and it isn’t so straightforward. This is also the reason why using it is generally discouraged, except for some very specific situations.

Why? Because you cannot easily see what exact names are being imported here. For that you’d have to go to the module’s source and — you guessed it — look at the __all__ list².

Taming the wild

Barring some less important details, the mechanics of import * could therefore be expressed in the following Python (pseudo)code:

import module as __temp
for __name in module:
    globals()[name] = getattr(__temp, __name)
del __temp
del __name

One interesting case to consider is what happens when __all__ contains a wrong name.

What if one of the strings there doesn’t correspond to any name within the module?…

# foo.py
__all__ = ['Foo']

def bar():
    pass

>>> import foo
>>> from foo import *
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'module' object has no attribute 'Foo'

Quite predictably, import * blows up.
Notice, however, that regular import still works.

All in all (ahem), this hints at a cute little trick which is also very self-evident:

__all__ = ['DO_NOT_WILD_IMPORT']

Put this in a Python module, and no one will be able to import * from it!
Much more effective than any lint warning ;-)

Test `all` the things

Jokes aside, this phenomenon (__all__ with an out-of-place name in it) can also backfire. Especially when reexporting, it’s relatively easy to introduce stray 'name' into __all__: one which doesn’t correspond to any name that’s actually present in the namespace.

If we commit such a mishap, we are inadvertently lying about the public API of our package. What’s worse is that this mistake can propagate through documentation generators, and ultimately mislead our users.

While some linters may be able to catch this, a simple test like this one:

def test_all(self):
    """Test that __all__ contains only names that are actually exported."""
    import yourpackage

    missing = set(n for n in yourpackage.__all__
                  if getattr(yourpackage, n, None) is None)
    self.assertEmpty(
        missing, msg="__all__ contains unresolved names: %s" % (
            ", ".join(missing),))

is a quick & easy way to ensure this never happens.

“Public” symbols have names that don’t begin with underscore (_). Of course, “non-public” ones are still accessible but are treated as implicitly unstable & discouraged. ↩
Or check what symbols there don’t have a leading underscore. ↩

Introducing callee: matchers for unittest.mock

Posted on Sun 20 March 2016 in News • Tagged with Python, testing, mocking, argument matchers • Leave a comment

Combined with dynamic typing, the lax object model of Python makes it pretty easy to write unit tests that involve mocking some parts of the tested code. Still, there’s plenty of third party libraries — usually with “mock” somewhere in the name — which promise to make it even shorter and simpler. There’s evidently been a need here that the standard library didn’t address adequately.

It changed, however, with Python 3.3. This version introduced a new unittest.mock module which essentially obsoleted all the other, third party solutions. The module, available also as a backport for Python 2.6 and above, is flexible, easy to use, and offers most if not all of the requisite functionality.

Called it, maybe?

Well, with one notable exception. If we mock a function, or any other callable object:

# (production code)
class ProductionClass(object):
    def foo(self):
        self.florb(...)

# (test code)
something = ProductionClass()
something.florb = mock.Mock()

we have very limited capabilities when it comes to verifying how they were called. Sure, we can be extremely specific:

something.florb.assert_called_with('this particular string')

or very lenient:

something.florb.assert_called_with(mock.ANY)  # works for _any_ argument

but there is virtually nothing in between. Suppose we don’t care too much what exact string the method was called with, only that it was some string that’s long enough? Things get awkward very fast then:

for posargs, kwargs in something.florb.call_args_list:
    arg = posargs[0]
    self.assertIsInstance(arg, str)
    self.assertGreaterEqual(len(arg), 16)
    break
else:
    self.fail('required call to %r not found' % (something.florb,))

Yes, this is what’s required: an actual loop through all the mock’s calls¹; unpacking tuples of positional and keyword arguments; and manual assertions that won’t even emit useful error messages when they fail.

Eww! Surely Python can do better than that?

Meet `callee`

Why, of course it can! Today, I’m releasing callee: a library of argument matchers compatible with unittest.mock. It enables you to write much simpler, more readable, and delightfully declarative assertions:

from callee import String, LongerOrEqual
something.florb.assert_called_with(String() & LongerOrEqual(16))

The wide array of matchers offered by callee includes numeric, string, and collection ones. And if none suits your particular needs, it’s a trivial matter to implement a custom one.

Check the library’s documentation for more examples, or jump right in to the installation guide or a full matcher reference.

A loop is not necessary if we simulate assert_called_once_with rather than assert_called_with, but we still have to dig into mock.call objects to eke out the arguments. ↩