Karol Kuczmarski's Blog

Recap of the gisht project

Posted on Fri 24 November 2017 in Programming • Tagged with Rust, gisht, CLI, GitHub, Python, testing • Leave a comment

In this post, I want to discuss some of the experiences I had with a project that I recently finished, gisht. By “finished” I mean that I don’t anticipate developing any new major features for it, though smaller things, bug fixes, or non-code stuff, is of course still very possible.

I’m thinking this is as much “done” as most software projects can ever hope to be. Thus, it is probably the best time for a recap / summary / postmortem / etc. — something to recount the lessons learned, and assess the choices made.

Some context

The original purpose of gisht was to facilitate download & execution of GitHub gists straight from the command line:

$ gisht Xion/git-outgoing  # run the https://gist.github.com/Xion/git-outgoing gist

I initially wrote its first version in Python because I’ve accumulated a sizable number of small & useful scripts (for Git, Unix, Python, etc.) which were all posted as gists. Sure, I could download them manually to ~/bin every time I used a new machine but that’s rather cumbersome, and I’m quite lazy.

Well, lazy and impatient :) I noticed pretty fast that the speed tax of Python is basically unacceptable for a program like gisht.

What I’m referring to here is not the speed of code execution, however, but only the startup time of Python interpreter. Irrespective of the machine, operating system, or language version, it doesn’t seem to go lower than about one hundred milliseconds; empirically, it’s often 2 or 3 times higher than that. For the common case of finding a cached gist (no downloads) and doing a simple fork+exec, this startup time was very noticeable and extremely jarring. It also precluded some more sophisticated uses for gisht, like putting its invocation into the shell’s $PROMPT¹.

Speed: delivered

And so the obvious solution emerged: let’s rewrite it in Rust!…

Because if I’m executing code straight from the internet, I should at least do it in a safe language.

But jokes aside, it is obvious that a language compiling to native code is likely a good pick if you want to optimize for startup speed. So while the choice of Rust was in large part educational (gisht was one of my first projects to be written in it), it definitely hasn’t disappointed there.

Even without any intentional optimization efforts, the app still runs instantaneously. I tried to take some measurements using the time command, but it never ticked into more than 0.001s. Perceptively, it is at least on par with git, so that’s acceptable for me :)

Can’t segfault if your code doesn’t build

Achieving the performance objective wouldn’t do us much good, however, if the road to get there involved excessive penalties on productivity. Such negative impact could manifest in many ways, including troublesome debugging due to a tricky runtime², or difficulty in getting the code to compile in the first place.

If you had even a passing contact with Rust, you’d expect the latter to be much more likely than the former.

Indeed, Rust’s very design eschews runtime flexibility to a ridiculous degree (in its “safe” mode, at least), while also forcing you to absorb subtle & complex ideas to even get your code past the compiler. The reward is increased likelihood your program will behave as intended — although it’s definitely not on the level of “if it compiles, it works” that can be offered by Haskell or Idris.

But since gisht is hardly mission critical, I didn’t actually care too much about this increased reliability. I don’t think it’s likely that Rust would buy me much over something like modern C++. And if I were to really do some kind of cost-benefit analysis of several languages — rather than going with Rust simply to learn it better — then it would be hard to justify it over something like Go.

It scales

So the real question is: has Rust not hampered my productivity too much? Having the benefit of hindsight, I’m happy to say that the trade-off was definitely acceptable :)

One thing I was particularly satisfied with was the language’s scalability. What I mean here is the ability to adapt as the project grows, but also to start quickly and remain nimble while the codebase is still pretty small.

Many languages (most, perhaps) are naturally tailored towards the large end, doing their best to make it more bearable to work with big codebases. In turn, they often forget about helping projects take off in the first place. Between complicated build systems and dependency managers (Java), or a virtual lack of either (C++), it can be really hard to get going in a “serious” language like this.

On the other hand, languages like Python make it very easy to start up and achieve relatively impressive results. Some people, however, report having encountered problems once the code evolves past certain size. While I’m actually very unsympathetic to those claims, I realize perception plays a significant role here, making those anecdotal experiences into a sort of self-fulfilling prophecy.

This perception problem should almost certainly spare Rust, as it’s a natively compiled and statically typed language, with a respectable type system to boot. There is also some evidence that the language works well in large projects already. So the only question that we might want to ask is: how easy it is to actually start a project in Rust, and carry it towards some kind of MVP?

Based on my experiences with gisht, I can say that it is, in fact, quite easy. Thanks mostly to the impressive Swiss army knife of cargo — acting as both package manager and a rudimentary build system — it was almost Python-trivial to cook a “Hello World” program that does something tangible, like talk to a JSON API. From there, it only took a few coding sessions to grow it into a functioning prototype.

Abstractions galore

As part of rewriting gisht from Python to Rust, I also wanted to fix some longstanding issues that limited its capabilities.

The most important one was the hopeless coupling to GitHub and their particular flavor of gists. Sure, this is where the project even got its name from, but people use a dozen of different services to share code snippets and it should very possible to support them all.

Here’s where it became necessary to utilize the abstraction capabilities that Rust has to offer. It was somewhat obvious to define a Host trait but of course its exact form had to be shaped over numerous iterations. Along the way, it even turned out that Result<Option<T>> and Option<Result<T>> are sometimes both necessary as return types :)

Besides cleaner architecture, another neat thing about an explicit abstraction is the ability to slice a concept into smaller pieces — and then put some of them back together. While the Host trait could support a very diverse set of gist services and pastebins, many of them turned out to be just a slight variation of one central theme. Because of this similarity, it was possible to introduce a single Basic implementation which handles multiple services through varying sets of URL patterns.

Devices like these aren’t of course specific to Rust: interfaces (traits) and classes are a staple of OO languages in general. But some other techniques were more idiomatic; the concept of iterators, for example, is flexible enough to accommodate looping over GitHub user’s gists, even as they read directly from HTTP responses.

Hacking time

Not everything was sunshine and rainbows, though.

Take clap, for example. It’s mostly a very good crate for parsing command line arguments, but it couldn’t quite cope with the unusual requirements that gisht had. To make gisht Foo/bar work alongside gisht run Foo/bar, it was necessary to analyze argv before even handing it over to clap. This turned out to be surprisingly tricky to get right. Like, really tricky, with edges cases and stuff. But as it is often the case in software, the answer turned out to be yet another layer of indirection plus a copious amount of tests.

In another instance, however, a direct library support was crucial.

It so happened that hyper, the crate I’ve been using for HTTP requests, didn’t handle the Link: response header out of the box³. This was a stumbling block that prevented the gist iterator (mentioned earlier) from correctly handling pagination in the responses from GitHub API. Thankfully, having the Header abstraction in hyper meant it was possible to add the missing support in a relatively straighforward manner. Yes, it’s not a universal implementation that’d be suitable for every HTTP client, but it does the job for gisht just fine.

Test-Reluctant Development

And so the program kept growing steadily over the months, most notably through more and more gist hosts it could now support.

Eventually, some of them would fall into a sort of twilight zone. They weren’t as complicated as GitHub to warrant writing a completely new Host instance, but they also couldn’t be handled via the Basic structure alone. A good example would be sprunge.us: mostly an ordinary pastebin, except for its optional syntax highlighting which may add some “junk” to the otherwise regular URLs.

In order to handle those odd cases, I went for a classic wrapper/decorator pattern which, in its essence, boils down to something like this:

pub struct Sprunge {
    inner: Basic,
}

impl Sprunge {
    pub fn new() -> {
        Sprunge{inner: Basic::new(ID, "sprunge.us",
                                  "http://sprunge.us/${id}", ...)}
    }
}

impl Host for Sprunge {
    // override & wrap methods that require custom logic:
    fn resolve_url(&self, url: &str) -> Option<io::Result<Gist>> {
        let mut url_obj = try_opt!(Url::parse(url).ok());
        url_obj.set_query(None);
        inner.resolve_url(url_obj.to_string().as_str())
    }

    // passthrough to the `Basic` struct for others:
    fn fetch_gist(&self, gist: &Gist, mode: FetchMode) -> io::Result<()> {
        self.inner.fetch_gist(gist, mode)
    }
    // (etc.)
}

Despite the noticeable boilerplate of a few pass-through methods, I was pretty happy with this solution, at least initially. After a few more unusual hosts, however, it became cumbersome to fix all the edge cases by looking only at the final output of the inner Basic implementation. The code was evidently asking for some tests, if only to check how the inner structure is being called.

Shouldn’t be too hard, right?… Yeah, that’s what I thought, too.

The reality, unfortunately, fell very short of those expectations. Stubs, mocks, fakes — test doubles in general — are a dark and forgotten corner of Rust that almost no one seems to pay any attention to. Absent a proper library support — much less a language one — the only way forward was to roll up my sleeves and implement a fake Host from scratch.

But that was just the beginning. How do you seamlessly inject this fake implementation into the wrapper so that it replaces the Basic struct for testing? If you are not careful and go for the “obvious” solution — a trait object:

pub struct Sprunge {
    inner: Box<Host>,
}

you’ll soon realize that you need not just a Box, but at least an Rc (or maybe even Arc). Without this kind of shared ownership, you’ll lose your chance to interrogate the test double once you hand it over to the wrapper. This, in turn, will heavily limit your ability to write effective tests.

What’s the non-obvious approach, then? The full rationale would probably warrant a separate post, but the working recipe looks more or less like this:

First, parametrize the wrapper with its inner type: pub struct Sprunge<T: Host> { inner: T }.

Put that in an internal module with the correct visibility setup:

mod internal {
    pub struct Sprunge<T: Host> {
        pub(super) inner: T,
    }
}

Make the regular (“production”) version of the wrapper into an alias, giving it the type parameter that you’ve been using directly⁴:
```
pub type Sprunge = internal::Sprunge<Basic>;
```
Change the new constructor to instantiate the internal type.
In tests, create the wrapper with a fake inner object inside.

As you can see in the real example, this convoluted technique removes the need for any pointer indirection. It also permits you to access the out-of-band interface that a fake object would normally expose.

It’s a shame, though, that so much work is required for something that should be very simple. As it appears, testing is still a neglected topic in Rust.

Packing up

It wasn’t just Rust that played a notable role in the development of gisht.

Pretty soon after getting the app to a presentable state, it became clear that a mere cargo build won’t do everything that’s necessary to carry out a complete build. It could do more, admittedly, if I had the foresight to explore Cargo build scripts a little more thoroughly. But overall, I don’t regret dropping back to my trusty ol’ pick: Python.

Like in a few previous projects, I used the Invoke task runner for both the crucial and the auxiliary automation tasks. It is a relatively powerful tool — and probably the best in its class in Python that I know of — though it can be a bit capricious if you want to really fine-tune it. But it does make it much easier to organize your automation code, to reuse it between tasks, and to (ahem) invoke those tasks in a convenient manner.

In any case, it certainly beats a collection of disconnected Bash scripts ;)

What have I automated in this way, you may ask? Well, a couple of small things; those include:

embedding of the current Git commit hash into the binary, to help identify the exact revision in the logs of any potential bug reports⁵
after a successful build, replacing the Usage section in README with the program’s --help output
generating completion scripts for popular shells by invoking the binary with a magic hidden flag (courtesy of clap)

Undoubtedly the biggest task that I relegated to Python/Invoke, was the preparation of release packages. When it comes to the various Linuxes (currently Debian and Red Hat flavors), this wasn’t particularly complicated. Major thanks are due to the amazing fpm tool here, which I recommend to anyone who needs to package their software in a distro-compatible manner.

Homebrew, however — or more precisely, OS X itself — was quite a different story. Many, many failed attempts were needed to even get it to build on Travis, and the additional dependency on Python was partially to blame. To be fair, however, most of the pain was exclusively due to OpenSSL; getting that thing to build is always loads of “fun”, especially in such an opaque and poorly debuggable environment as Travis.

The wrap

There’s probably a lot of minor things and tidbits I could’ve mentioned along the way, but the story so far has most likely covered all the important topics. Let’s wrap it up then, and highlight some interesting points in the classic Yay/Meh/Nay manner.

Yay

It was definitely a good choice to rewrite gisht specifically in Rust. Besides all the advantages I’ve mentioned already, it is also worth noting that the language went through about 10 minor version bumps while I was working on this project. Of all those new releases, I don’t recall a single one that would introduce a breaking change.
Most of the Rust ecosystem (third-party libraries) was a joy to use, and very easy to get started with. Honorable mention goes to serde_json and how easy it was to transition the code from rustc_serialize that I had used at first.
With a possible exception of sucking in node.js as a huge dependency of your project and using Grunt, there is probably no better way of writing automation & support code than Python. There may eventually be some Rust-based task runners that could try to compete, but I’m not very convinced about using a compiled language for this purpose (and especially one that takes so long to build).

Meh

While the clap crate is quite configurable and pretty straightforward to use, it does lack at least one feature that’d be very nice for gisht. Additionally, working with raw clap is often a little tedious, as it doesn’t assist you in translating parsed flags into your own configuration types, and thus requires shuffling those bits manually⁶.
Being a defacto standard for continuous integration in open-source projects, Travis CI could be a little less finicky. In almost every project I decide to use it for, I end up with about half a dozen commits that frantically try to fix silly configuration issues, all before even a simple .travis.yml works as intended. Providing a way to test CI builds locally would be an obvious way to avoid this churn.

Nay

Testing in Rust is such a weird animal. On one hand, there is a first-class, out-of-the-box support for unit tests (and even integration tests) right in the toolchain. On the other hand, the relevant parts of the ecosystem are immature or lacking, as evidenced by the dreary story of mocking and stubbing. It’s no surprise that there is a long way to catch up to languages with the strongest testing culture (Java and C#/.NET⁷), but it’s disappointing to see Rust outclassed even by C++.
Getting anything to build reliably on OSX in a CI environment is already a tall order. But if it involves things as OpenSSL, then it quickly goes from bad to terrible. I’m really not amused anymore how this “Just Works” system often turns out to hardly work at all.

Since I don’t want to end on such a negative note, I feel compelled to state the obvious fact: every technology choice is a trade-off. In case of this project, however, the drawbacks were heavily outweighed by the benefits.

For this reason, I can definitely recommend the software stack I’ve just described to anyone developing non-trivial, cross-platform command line tools.

This is not an isolated complaint, by the way, as the interpreter startup time has recently emerged as an important issue to many developers of the Python language. ↩
Which may also include a practical lack thereof. ↩
It does handle it now, fortunately. ↩
Observant readers may notice that we’re exposing a technically private type (internal::Sprunge) through a publicly visible type alias. If that type was actually private, this would trigger a compiler warning which is slated to become a hard error at some point in the future. But, amusingly, we can fool the compiler by making it a public type inside a private module, which is exactly what we’re doing here. ↩
This has since been rewritten and is now done in build.rs — but that’s only because I implemented the relevant Cargo feature myself :) ↩
For an alternative approach that doesn’t seem to have this problem, check the structopt crate. ↩
Dynamically typed languages, due to their rich runtime, are basically a class of their own when it comes to testing ease, so it wouldn’t really be fair to hold them up for comparison. ↩

Long Live Dynamic Languages!

Posted on Wed 24 May 2017 in Programming • Tagged with Python, Rust, dynamic languages, dynamic typing, static typing • Leave a comment

If you followed the few (or a dozen) of my recent posts, you’ve probably noticed a sizable bias in the choice of topics. The vast majority were about Rust — a native, bare metal, statically typed language with powerful compile time semantics but little in the way of runtime flexibility.

Needless to say, Rust is radically different than (almost the exact opposite of) Python, the other language that I’m covering sometimes. Considering this topical shift, it would fair to assume that I, too, have subscribed to the whole Static Typing™ trend.

But that wouldn’t be very accurate.

Don’t get me wrong. As far as fashion cycles in the software industry go, the current trend towards static/compiled languages is difficult to disparage. Strong in both hype and merit, it has given us some really innovative & promising solutions (as well as some not-so-innovative ones) that are poised to shape the future of programming for years, if not decades to come. In many ways, it is also correcting mistakes of the previous generation: excessive boilerplate, byzantine abstractions, and software bloat.

What about dynamic languages, then? Are they slowly going the way of the dodo?

Trigger warning: `TypeError`

Some programmers would certainly wish so.

Indeed, it’s not hard at all to find articles and opinions about dynamic languages that are, well, less than flattering.

The common argument echoed in those accounts points to supposed unsuitability of Python et al. for any large, multi-person project. The reasoning can be summed up as “good for small scripts and not much else”. Without statically checked types, the argument goes, anything bigger than a quick hack or a prototype shall inevitably become hairy and dangerous monstrosity.

And when that happens, a single typo can go unchecked and bring down the entire system!…

At the very end of this spectrum of beliefs, some pundits may eventually make the leap from languages to people. If dynamically typed languages (or “untyped” ones, as they’re often mislabeled) are letting even trivial bugs through, then obviously anyone who wants to use them is dangerously irresponsible. It must follow that all they really want is to hack up some shoddy code, yolo it over to production, and let others worry about the consequences.

Mind the gap

It’s likely unproductive to engage with someone who’s that extreme. If the rhetoric is dialed down, however, we can definitely find the edge of reason.

In my opinion, this fine line goes right through the “good in small quantities” argument. I can certainly understand the apprehension towards large projects that utilize dynamically typed languages throughout their codebases. The prospect of such a project is scary, because it contains an additional element of uncertainty. More so than with many other technologies, you ought to know what you’re doing.

Some people (and teams) do. Others, not so much.

I would therefore refine the argument so that it better reflects the strengths and weaknesses of dynamic languages. They are perfectly suited for at least the following cases:

anyone writing small, standalone applications or scripts
any project (large or small) with a well-functioning team of talented individuals

The sad reality of the software industry is the vast, gaping chasm of calamity and despair that stretches between those two scenarios.

Within lies the bulk of commercial software projects, consistently hamstrung by the usual suspects: incompetent management, unclear and shifting requirements, under- or overstaffing, ancient development practices, lack of coding standards, rampant bureaucracy, inexperienced developers, and so on.

In such an environment, it becomes nigh impossible to capitalize on the strengths of dynamic languages. Instead, the main priority is to protect from even further productivity losses, which is what bog-standard languages like Java, C#, or Go tend to be pretty good at. Rather than to move fast, the objective is to remain moving at all.

Freedom of choice

“But that’s backwards”, the usual retort goes. “Static typing and compilation checks are what enables me to be productive!”

I have no doubt that most people saying this do indeed believe they’re better off programming in static languages. Regardless of what they think, however, there exists no conclusive evidence to back up such claims as a universal rule.

This is of course the perennial problem with software engineering in general, and the project management aspect of it in particular. There is very little proper research on optimal and effective approaches to it, which is why any of the so-called “best practices” are quite likely to stem from unsubstantiated hearsay.

We can lament this state of affairs, of course. But on the other hand, we can also find it liberating. In the absence of rigid prescriptions and judgments about productivity, we are free to explore, within technical limitations, what language works best for us, our team, and our projects.

Sometimes it’ll be Go, Java, Rust, or even Haskell.
A different situation may be best handled by Python, Ruby, or even JavaScript.

As the old adage goes, there is no silver bullet. We should not try to polish static typing into one.

Arguments to Python generator functions

Posted on Tue 14 March 2017 in Code • Tagged with Python, generators, functions, arguments, closures • Leave a comment

In Python, a generator function is one that contains a yield statement inside the function body. Although this language construct has many fascinating use cases (PDF), the most common one is creating concise and readable iterators.

A typical case

Consider, for example, this simple function:

def multiples(of):
    """Yields all multiples of given integer."""
    x = of
    while True:
        yield x
        x += of

which creates an (infinite) iterator over all multiples of given integer. A sample of its output looks like this:

>>> from itertools import islice
>>> list(islice(multiples(of=5), 10))
[5, 10, 15, 20, 25, 30, 35, 40, 45, 50]

If you were to replicate in a language such as Java or Rust — neither of which supports an equivalent of yield — you’d end up writing an iterator class. Python also has them, of course:

class Multiples(object):
    """Yields all multiples of given integer."""

    def __init__(self, of):
        self.of = of
        self.current = 0

    def __iter__(self):
        return self

    def next(self):
        self.current += self.of
        return self.current

    __next__ = next  # Python 3

but they are usually not the first choice¹.

It’s also pretty easy to see why: they require explicit bookkeeping of any auxiliary state between iterations. Perhaps it’s not too much to ask for a trivial walk over integers, but it can get quite tricky if we were to iterate over recursive data structures, like trees or graphs. In yield-based generators, this isn’t a problem, because the state is stored within local variables on the coroutine stack.

Lazy!

It’s important to remember, however, that generator functions behave differently than regular functions do, even if the surface appearance often says otherwise.

The difference I wanted to explore in this post becomes apparent when we add some argument checking to the initial example:

def multiples(of):
    """Yields all multiples of given integer."""
    if of < 0:
        raise ValueError("expected a natural number, got %r" % (of,))

    x = of
    while True:
        yield x
        x += of

With that if in place, passing a negative number shall result in an exception. Yet when we attempt to do just that, it will seem as if nothing is happening:

>>> m = multiples(-10)
>>>

And to a certain degree, this is pretty much correct. Simply calling a generator function does comparatively little, and doesn’t actually execute any of its code! Instead, we get back a generator object:

>>> m
<generator object multiples at 0x10f0ceb40>

which is essentially a built-in analogue to the Multiples iterator instance. Commonly, it is said that both generator functions and iterator classes are lazy: they only do work when we asked (i.e. iterated over).

Getting eager

Oftentimes, this is perfectly okay. The laziness of generators is in fact one of their great strengths, which is particularly evident in the immense usefulness of theitertools module.

On the other hand, however, delaying argument checks and similar operations until later may hamper debugging. The classic engineering principle of failing fast applies here very fittingly: any errors should be signaled immediately. In Python, this means raising exceptions as soon as problems are detected.

Fortunately, it is possible to reconcile the benefits of laziness with (more) defensive programming. We can make the generator functions only a little more eager, just enough to verify the correctness of their arguments.

The trick is simple. We shall extract an inner generator function and only call it after we have checked the arguments:

def multiples(of):
    """Yields all multiples of given integer."""
    if of < 0:
        raise ValueError("expected a natural number, got %r" % (of,))

    def multiples():
        x = of
        while True:
            yield x
            x += of

    return multiples()

From the caller’s point of view, nothing has changed in the typical case:

>>> multiples(10)
<generator object multiples at 0x110579190>

but if we try to make an incorrect invocation now, the problem is detected immediately:

>>> multiples(-5)

Traceback (most recent call last):
  File "<pyshell#2>", line 1, in <module>
    multiples(of=-5)
  File "<pyshell#0>", line 4, in multiples
    raise ValueError("expected a natural number, got %r" % (of,))
ValueError: expected a natural number, got -5

Pretty neat, especially for something that requires only two lines of code!

The last (micro)optimization

Indeed, we didn’t even have to pass the arguments to the inner (generator) function, because they are already captured by the closure.

Unfortunately, this also has a slight performance cost. A captured variable (also known as a cell variable) is stored on the function object itself, so Python has to emit a different bytecode instruction (LOAD_DEREF) that involves an extra pointer dereference. Normally, this is not a problem, but in a tight generator loop it can make a difference.

We can eliminate this extra work² by passing the parameters explicitly:

    # (snip)

    def multiples(of):
        x = of
        while True:
            yield x
            x += of

    return multiples(of)

This turns them into local variables of the inner function, replacing the LOAD_DEREF instructions with (aptly named) LOAD_FAST ones.

Technically, the Multiples class is here is both an iterator (because it has the next/__next__ methods) and iterable (because it has __iter__ method that returns an iterator, which happens to be the same object). This is common feature of iterators that are not associated with any collection, like the ones defined in the built-in itertools module. ↩
Note that if you engage in this kind of microoptimizations, I’d assume you have already changed your global lookup into local ones :) ↩

all and wild imports in Python

Posted on Mon 26 December 2016 in Code • Tagged with Python, modules, imports, testing • Leave a comment

An often misunderstood piece of Python import machinery is the __all__ attribute. While it is completely optional, it’s common to see modules with the __all__ list populated explicitly:

__all__ = ['Foo', 'bar']

class Foo(object):
    # ...

def bar():
    # ...

def baz():
    # ...

Before explaining what the real purpose of __all__ is (and how it relates to the titular wild imports), let’s deconstruct some common misconceptions by highlighting what it isn’t:

__all__ doesn’t prevent any of the module symbols (functions, classes, etc.) from being directly imported. In our the example, the seemingly omitted baz function (which is not included in __all__), is still perfectly importable by writing from module import baz.
Similarly, __all__ doesn’t influence what symbols are included in the results of dir(module) or vars(module). So in the case above, a dir call would result in a ['Foo', 'bar', 'baz'] list, even though 'baz' does not occur in __all__.

In other words, the content of __all__ is more of a convention rather than a strict limitation. Regardless of what you put there, every symbol defined in your module will still be accessible from the outside.

This is a clear reflection of the common policy in Python: assume everyone is a consenting adult, and that visibility controls are not necessary. Without an explicit __all__ list, Python simply puts all of the module “public” symbols there anyway¹.

The meaning of it `all`

So, what does __all__ actually effect?

This is neatly summed up in this brief StackOverflow answer. Simply speaking, its purpose is twofold:

It tells the readers of the source code — be it humans or automated tools — what’s the conventional public API exposed by the module.
It lists names to import when performing the so-called wild import: from module import *.

Because of the default content of __all__ that I mentioned earlier, the public API of a module can also be defined implicitly. Some style guides (like the Google one) are therefore relying on the public and _private naming exclusively. Nevertheless, an explicit __all__ list is still a perfectly valid option, especially considering that no approach offers any form of actual access control.

Import star

The second point, however, has some real runtime significance.

In Python, like in many other languages, it is recommended to be explicit about the exact functions and classes we’re importing. Commonly, the import statement will thus take one of the following forms:

import random
import urllib.parse
from random import randint
from logging import fatal, warning as warn
from urllib.parse import urlparse
# etc.

In each case, it’s easy to see the relevant name being imported. Regardless of the exact syntax and the possible presence of aliasing (as), it’s always the last (qualified) name in the import statement, before a newline or comma.

Contrast this with an import that ends with an asterisk:

from itertools import *

This is called a star or wild import, and it isn’t so straightforward. This is also the reason why using it is generally discouraged, except for some very specific situations.

Why? Because you cannot easily see what exact names are being imported here. For that you’d have to go to the module’s source and — you guessed it — look at the __all__ list².

Taming the wild

Barring some less important details, the mechanics of import * could therefore be expressed in the following Python (pseudo)code:

import module as __temp
for __name in module:
    globals()[name] = getattr(__temp, __name)
del __temp
del __name

One interesting case to consider is what happens when __all__ contains a wrong name.

What if one of the strings there doesn’t correspond to any name within the module?…

# foo.py
__all__ = ['Foo']

def bar():
    pass

>>> import foo
>>> from foo import *
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'module' object has no attribute 'Foo'

Quite predictably, import * blows up.
Notice, however, that regular import still works.

All in all (ahem), this hints at a cute little trick which is also very self-evident:

__all__ = ['DO_NOT_WILD_IMPORT']

Put this in a Python module, and no one will be able to import * from it!
Much more effective than any lint warning ;-)

Test `all` the things

Jokes aside, this phenomenon (__all__ with an out-of-place name in it) can also backfire. Especially when reexporting, it’s relatively easy to introduce stray 'name' into __all__: one which doesn’t correspond to any name that’s actually present in the namespace.

If we commit such a mishap, we are inadvertently lying about the public API of our package. What’s worse is that this mistake can propagate through documentation generators, and ultimately mislead our users.

While some linters may be able to catch this, a simple test like this one:

def test_all(self):
    """Test that __all__ contains only names that are actually exported."""
    import yourpackage

    missing = set(n for n in yourpackage.__all__
                  if getattr(yourpackage, n, None) is None)
    self.assertEmpty(
        missing, msg="__all__ contains unresolved names: %s" % (
            ", ".join(missing),))

is a quick & easy way to ensure this never happens.

“Public” symbols have names that don’t begin with underscore (_). Of course, “non-public” ones are still accessible but are treated as implicitly unstable & discouraged. ↩
Or check what symbols there don’t have a leading underscore. ↩

The brave “new” world of Python 3

Posted on Mon 15 August 2016 in Code • Tagged with Python, Python 3, Unicode, lazy evaluation, iterables • Leave a comment

I’ll blurt it straight up: I’m not a big fan of Python 3.

For a long time, I resisted the appeal of various incremental improvements that early 3.x releases offered. And the world agreed with me: a mere two years ago, Python 3 wasn’t even a blip on the PyPI radar.

Lately, however, things seem to be picking up some steam.

As if to compensate for years of “good enough”, Python 3 development team has given in to a steadily accelerating feature creep. Sure, some of it results in bad ideas (or even ideas you’d hope are jokes), but it nevertheless causes an increasingly wide functional gap between the 2.x and 3.x series.

Starting from around Python 3.5, this gap becomes really noticeable, even when partially bridged with many excellent backports. The ecosystem support is also mostly there, at least insofar as “not breaking horribly when a package is used in Python 3”.

And then, of course, there is the 2.7 EoL date looming ever closer.

Given all those portents, even old curmudg… ahem… seasoned developers cannot really ignore Python 3 anymore. For better or for worse, 3.x is how Python will look like in the coming years and decades. Might as well prepare for it.

In this post, I will discuss some important issues one should be aware of before trying to switch from Python 2 to 3. I won’t be talking about all the minute changes and additions, but cover the more significant, broader concepts that mark the divide between the 2.x and 3.x generations.

The two concepts I’ll be mentioning here are Unicode (obviously) and lazy vs. eager computation.

Unicode handling

You have probably heard it before. Python 3 was going to solve your Unicode problems once and for all. You haven’t believed it, of course, like you wouldn’t believe in any other silver bullet.

Still, it may be rather surprising to learn that in Python 3, you’ll actually see much more Unicode-related errors.

And strange as it may sound, it is a good thing.

In any case, either version of Python gets the most important thing about Unicode right. They both distinguish, at the type level, between strings (of Unicode codepoints) and their encodings (sequences of bytes). The type that holds the latter is called bytes in both versions, while strings are stored within the str type in Python 3 and unicode in Python 2.

It is from this crucial distinction — or rather, failing to account for it — where all the dreaded Unicode errors ultimately stem.

But where Python 2 does poorly is in the choice of defaults. You probably know all too well that bytes there is just an alias for str. That str is a fully functional string type, even though it can only contain ASCII characters. Moreover, it is also the default: quoted string literals, for example, will be of this type unless specially marked.

This poor choice of defaults is the primary source of latent Unicode bugs in Python 2 programs.

What Python 3 does here is to help expose those bugs sooner. If you already deal with Unicode correctly in your programs — maybe because you watched this excellent talk by Ned Batchelder — your main benefit will be not having to write that u"" quotes anymore. Otherwise, it’ll force you to consider the issue from the very beginning, rather than letting you write “working” programs that crash the moment they have to process some non-ASCII input.

Laziness by default

The second major change that Python 3 brings is of similar nature. It is also a change of defaults, but the impetus for it is much less evident.

What’s different in Python 3 is that many built-in functions and methods which used to return lists are now giving out bespoke objects that only mostly behave like lists. Included in these are functions like map or filter, as well as common dictionary methods such as keys or values.

This change is usually presented as removal of unnecessary cruft:

itertools.ifilter is now just filter
xrange is now just range
dict.iteritems is now just dict.items

and so on.

In some cases, this is exactly what happens. For example, there is virtually no downside to the new implementation of range, especially considering the way it is used most often.

But not every built-in managed to preserve all the functionality of lists. Indeed, many have downgraded their API guaratees to those of mere generators, i.e. the most simplistic and limited flavor of Python iterables. Working with them is trickier and more error-prone than with lists, which is due to various pitfalls that generators expose us to.

Navigating around those gotchas used to be something that Python code had to opt-in to, by explicitly importing the itertools module and using its functions in place of the built-ins. What you could gain in return was increased performance, and a lesser memory footprint. All those benefits came from making the computations lazy and refraining from storage of the intermediate results.

In Python 3, however, laziness is preordained. Even if we don’t need or care about the aforementioned perks, we have to devise some way of dealing with the pervasive generators.

One option is to embrace lazy evaluation fully, and adapt to handling unspecified iterables throughout our code bases.
The risk is an increased frequency of bugs stemming from generator misuse — including a common mistake of trying to iterate over lazy foos the second time, deeper down a long function, after it’s been already exhausted.

The alternative is to engage in a lot of “defensive listing”: wrapping of unknown (or known-but-lazy) iterables in list() calls in order to “sanitize” them for later (re)use.
Examples include immediate listification of a generator object:

primes = list(filter(is_prime, range(1000)))

or preemptive conversion of an incoming iterable argument:

def do_something(foos):
    foos = list(foos)
    # ...the rest of a long function...

Even if you choose the first path, and somehow use lazy generators everywhere, conversions are still required at the serialization boundaries:

d = {'foo': 42}
json.dumps({'keys': d.keys()})  # TypeError: dict_keys(['foo']) is not JSON serializable
json.dumps({'keys': list(d.keys())})  # works

At least in this case, the lazy iterable will vocally fail with an exception, rather than silently doing nothing (in case of repeated iteration) or always posing as truthy even when it’s empty (in if iterable: checks).

from future import doubts

So, here they are: the highlights of Python 3. If you are disappointed they all turned out to be mixed blessings, don’t worry: you are in a good company.

The truth is that Python 3 is more finnicky, less forgiving, and much less beginner-friendly than its predecessor. Its various superficial simplifications are almost squarely balanced by many new concerns that are thrust upon an unsuspecting programmer from the very beginning.

In one possible view, this is simply a sign that the language has matured. Perhaps it’s not a coincidence that almost exactly 18 years has passed between the first public version of Python (0.9) and the release of Python 3.0. By no conceivable means it is a toy language anymore, and it’s adequately equipped to tackle challenges presented by the computing world of today.

But on the other hand, it’s clear something is being gradually lost in the process.

It’s becoming harder to claim the language favors simplicity over complexity.
It is no longer so easy to pick which way is the obvious way to do it.
It is increasingly often that ugly replaces beautiful and nested replaces flat.

Little by little, Python itself is becoming less and less pythonic. The pace isn’t breakneck, but it’s definitely noticeable. But who knows? Maybe after two decades, a wholesale redefinition of the language’s core principles really is in order.

…Well, certainly that’s necessary if some of the latest ideas are about to get in!

str.startswith() with tuple argument

Posted on Tue 28 June 2016 in Code • Tagged with Python, strings, tuples • Leave a comment

Here’s a little known trick that’s applicable to Python’s startswith and endswith methods of str (and unicode).

Suppose you’re checking whether a string starts with some prefix:

if s.startswith('http://'):
    # totally an URL

You eventually add more possible prefixes (or suffixes) to your condition:

if s.startswith('http://') or s.startswith('https://'):
    # ...

Later on you notice the repetition and refactor it into something like this:

SCHEMES = ['http://', 'https://', 'ftp://', 'git://']
if any(s.startswith(p) for p in SCHEMES):
    # ...

or if you’re feeling extra functional:

if any(map(s.startswith, SCHEMES)):
    # ...

Turns out, however, that startswith (and endswith) support this use case natively. Rather than passing just a single string as the argument, you can provide a tuple of strings instead:

SCHEMES = ('http://', 'https://', 'ftp://', 'git://')
if s.startswith(SCHEMES):
    # ...

Either method will then check the original string against every element of the passed tuple. Both will only return True if at least one of the strings is recognized as prefix/suffix. As you can see, that’s exactly what we would previously do with any.

Somewhat surprisingly, however, the feature only works for actual tuples. Trying to pass a seemingly equivalent iterable — a list or set, for example — will be met with interpreter’s refusal:

>>> is_jpeg = filename.endswith(['.jpg', '.jpeg'])

TypeError: endswith first arg must be str, unicode, or tuple, not list

If you dig into it, there doesn’t seem to be a compelling reason for this behavior. The relevant feature request talks about consistency with the built-in isinstance function, but it’s quite difficult to see how those two are related.

In any case, this can be worked around without much difficulty:

PROTOCOLS = ('http', 'https', 'ftp', 'git')
if s.startswith(tuple(p + '://' for p in PROTOCOLS)):
    # ...

though ideally, you’d want to pack the prefixes in a tuple to begin with.

…or lambda?

Posted on Mon 20 June 2016 in Code • Tagged with Python, syntax, lambda, operators • Leave a comment

a.k.a. Curious Facts about Python Syntax

In Python 3.3, a new method has been added to the str type: casefold. Its purpose is to return a “sanitized” version of the string that’s suitable for case-insensitive comparison. For older versions of Python, an alternative way that’s mostly compatible is to use the str.lower method, which simply changes all letters in the string to lowercase.

Syntax is hard

Easy enough for a compatibility shim, right? That’s exactly what I thought when I came up with this:

casefold = getattr(str, 'casefold', None) or lambda s: s.lower()

Let’s ignore for a moment the fact that for a correct handling of unicode objects in Python 2, a much more sophisticated approach is necessary. What’s rather more pertinent is that this simple code doesn’t parse:

  File "foo.py", line 42
    getattr(str, 'casefold', None) or lambda s: s.lower()
                                      ^
SyntaxError: invalid syntax

It’s not very often that you would produce a SyntaxError with code that looks perfectly valid to most pythonistas. The last time I had it happen, the explanation was rather surprising and not exactly trivial to come by.

Fortunately, there is always one place where we can definitively resolve any syntactic confusion. That place is the full grammar specification of the Python language.

It may be a little intimidating at first, especially if you’re not familiar with the ENBF notation it uses. All the Python’s language constructs are there, though, so the SyntaxError from above should be traceable to a some rule of the grammar¹.

The culprit

And indeed, the offending bit is right here:

or_test: and_test ('or' and_test)*
and_test: not_test ('and' not_test)*
...

It says, essentially, that Python defines the or expression (or_test) as a sequence of and expressions (and_test). If you follow the syntax definition further, however, you will notice that and_test expands to comparisons (a < b, etc.), arithmetic expressions (x + y, etc.), list & dict constructors ([foo, bar], etc.), and finally to atoms such as literal strings and numbers.

What you won’t see along the way are lambda definitions:

lambdef: 'lambda' [varargslist] ':' test

In fact, the branch to allow them is directly above the or_test:

test: or_test ['if' or_test 'else' test] | lambdef

As you can see, the rule puts lambdas at the same syntactical level as conditional expressions (x if a else b), which is very high up. The only thing you can do with a lambda to make a larger expression is to add a yield keyword before it², or follow it with a comma to create a tuple³.

You cannot, however, pass it as an argument to a binary operator, even if it otherwise makes sense and even looks unambiguous. This is also why the nonsensical expressions such as this one:

1 + lambda: None

will fail not with TypeError, but also with SyntaxError, as they won’t even be evaluated.

More parentheses

Savvy readers may have noticed that this phenomenon is very much reminiscent of the issue of operator precedence.

Indeed, in Python and in many other languages it is the grammar that ultimately specifies the order of operations. It does so simply by defining how expressions can be constructed.

Addition, for example, will be of lower priority than multiplication simply because a sum is said to comprise of terms that are products:

arith_expr: term (('+'|'-') term)*
term: factor (('*'|'/'|'%'|'//') factor)*

This makes operator precedence a syntactic feature, and its resolution is baked into the language parser and handled implicitly⁴.

We know, however, that precedence can be worked around where necessary by enclosing the operator and its arguments in a pair of parenthesis. On the syntax level, this means creating an entirely new, top-level expression:

atom: '(' [yield_expr|testlist_comp] ')' |  # parenthesized expression
       '[' [listmaker] ']' |
       '{' [dictorsetmaker] '}' |
       '`' testlist1 '`' |
       NAME | NUMBER | STRING+)

There, it is again possible to use even the highest-level constructs, including also the silly stuff such as trying to add a number to a function:

1 + (lambda: None)

This expression will now parse correctly, and produce TypeError as expected.

In the end, the resolution of our initial dilemma is therefore rather simple:

casefold = getattr(str, 'casefold', None) or (lambda s: s.lower())

Such rules are sometimes called productions of the grammar, a term from computational linguistics. ↩
Yes, yield foo is an expression. Its result is the value sent to the generator by outer code via the send method. Since most generators are used as iterables, typically no values are passed this way so the result of a yield expression is None. ↩
There are also a legacy corner cases of lambdas in list/dict/etc. comprehensions, but those only apply under Python 2.x. ↩
This saying, there are languages where the order is resolved at later stage, after the expressions have already been parsed. They usually allow the programmer to change the precedence of their own operators, as it’s the case in Haskell. ↩

Please don’t use Click

Posted on Fri 20 May 2016 in Programming • Tagged with Python, CLI, UI, Click • Leave a comment

…not for standalone programs anyway.

Chances are, you have written some command line programs in Python. This is quite probable even if you normally code in some other language. And if you have, it is not unlikely that you needed to parse the argv of your program at one point or another.

There are plenty of options here, both in the standard library as well as among third party packages. One does stand out, however, and it’s mostly for how it is often overused. I’m talking about Click here.

If you wanted to use it in your next Python program, I hereby urge you to reconsider.

What’s the fuss?

The somewhat bizarrely named Click library is described as a “package for creating beautiful command line interfaces”. Its main trick is the ability to create subcommands by adorning Python functions with the @click.command() decorator¹. It then makes them coalesce into an argument parser, equipped with the necessary dispatching logic.

This idea isn’t new, of course. Prior art goes back at least seven years to the now-abandoned opster package. Click, however, was the first one of its kind to garner noticeable popularity, which is easily attributed to whom it’s been authored by.

So while my arguments against using this kind of CLI framework would apply to any package implementing the paradigm, it just happens that Click is currently its most prominent example. Purely for the sake of convenience, I will therefore refer to it as if it was interchangeable with the whole concept. Because why not? Whatever you may say about the library’s name, it’s hard to imagine a more concise moniker than a simple Click.

What’s wrong, then, with the way Click handles command line interfaces?

CLI: Little Interfaces

It’s how it encourages to treat them as an accidental afterthought rather than a deliberate design decision.

For applications invoked repeatedly from a terminal, their command line arguments and flags are the primary means of user interaction². It is how users communicate their intent to perform an action; provide the neccessary input data to carry it throgh; decide how they want to receive the output; and control many other aspects of the programs execution. Absent graphical components and widgets, the command line is virtually the only way to interact with a terminal program.

In other words, it is the UI.

And how important the UI is for any application? It seems to be important enough that entire fields of study are devoted to reducing friction of human-computer interaction. In many projects, the emphasis on user interface design is on par with that of actual software engineering.
Like everything, of course, it is susceptible to trends and fads (such as the recent “mobile/responsive everything!” craze). But its significance remains undiminished. Quite the opposite: in the age of ubiquitous computing, user interfaces are probably more important than ever.

Yes, this includes CLI. One of the main reasons we turn to the command line are speed and efficacy. Common tasks must utilize short and convenient syntax that is quick to integrate into user’s muscle memory. Others should not only be possible, but discoverable and accessible without going through reams of man pages.

Any terminal program intended for frequent use by humans should therefore strive to excel in those two qualities. But except for the simplest of cases, it won’t happen by itself. Designing an efficient CLI for any non-trivial application is a challenging and demanding task.

It doesn’t click

With Click, however, we’re encouraged to just wing it.

Click tells us to slap some decorators on our top-level functions and call it a day. Sure, you can dig deep enough and uncover the underlying layers of abstraction that may eventually allow you do things for which argparse has a first-class support.

By default, however, Click shoehorns your programs into predefined patterns that, incidentally, mirror those of some least intuitive command-line tools in existence.

Indeed, the whole idea of subdiving your program into several distinct is already suspect, for it appears at odds with the fundamental Unix philosophy of doing one thing well. While it is occasionally justified, it shouldn’t be the first thing that comes to your mind. But that’s completely at odds with the Click’s approach, where not ending up with multiple distinct commands is something you have to consciously avoid.

…though it sometimes might

So, what am I suggesting you use instead libraries such as Click?… Nothing outrageous, really.

If you care about your command line interface, consider just using the argparse module. Yes, it will force you to create parser objects, add arguments & flags to it, and in general pay some attention to the whole business. When it comes to UI, it’s always good to make it an explicit concern, maybe even sufficient to warrant its own module.

Alternatively, the docopt library provides another take on the UI-first approach to CLI, though it is more limited in its capabilities³.

Finally, I’m not advocating to ditch Click in all scenarios. There’s plenty of situations when we’re interested in getting any CLI up and running, and not so much in making the most efficient and intuitive interface possible. The prime example is any kind of automation scripts that are ancillary to some bigger project, like manage.py is in Django⁴. The Python ecosystem doesn’t really have dedicated task runners that are as featureful as Grunt or Gulp, and that makes Click a viable and compelling option⁵.

But for standalone programs whose CLI is the main interface? Yeah, not really.

Oddly enough, that pair of parentheses seems to be mandatory. ↩
Environment variables and config files deserve a honorary mention, of course. But those are usually derivatives of the command line arguments, containing e.g. the default values for flags. ↩
Click’s own documentation actually describes quite nicely how theirs and docopt’s philosophies differ in a way that’s consistent with this article. ↩
Incidentally, this appears to be a major motivation behind creating Click in the first place: to support web applications built upon on the Flask framework, and possibly obviate the need for extensions such as Flask-Script. ↩
This saying, there are some task runners which offer similar experience, like Invoke. ↩

Older Posts