Karol Kuczmarski's Blog

all and wild imports in Python

Posted on Mon 26 December 2016 in Code • Tagged with Python, modules, imports, testing • Leave a comment

An often misunderstood piece of Python import machinery is the __all__ attribute. While it is completely optional, it’s common to see modules with the __all__ list populated explicitly:

__all__ = ['Foo', 'bar']

class Foo(object):
    # ...

def bar():
    # ...

def baz():
    # ...

Before explaining what the real purpose of __all__ is (and how it relates to the titular wild imports), let’s deconstruct some common misconceptions by highlighting what it isn’t:

__all__ doesn’t prevent any of the module symbols (functions, classes, etc.) from being directly imported. In our the example, the seemingly omitted baz function (which is not included in __all__), is still perfectly importable by writing from module import baz.
Similarly, __all__ doesn’t influence what symbols are included in the results of dir(module) or vars(module). So in the case above, a dir call would result in a ['Foo', 'bar', 'baz'] list, even though 'baz' does not occur in __all__.

In other words, the content of __all__ is more of a convention rather than a strict limitation. Regardless of what you put there, every symbol defined in your module will still be accessible from the outside.

This is a clear reflection of the common policy in Python: assume everyone is a consenting adult, and that visibility controls are not necessary. Without an explicit __all__ list, Python simply puts all of the module “public” symbols there anyway¹.

The meaning of it `all`

So, what does __all__ actually effect?

This is neatly summed up in this brief StackOverflow answer. Simply speaking, its purpose is twofold:

It tells the readers of the source code — be it humans or automated tools — what’s the conventional public API exposed by the module.
It lists names to import when performing the so-called wild import: from module import *.

Because of the default content of __all__ that I mentioned earlier, the public API of a module can also be defined implicitly. Some style guides (like the Google one) are therefore relying on the public and _private naming exclusively. Nevertheless, an explicit __all__ list is still a perfectly valid option, especially considering that no approach offers any form of actual access control.

Import star

The second point, however, has some real runtime significance.

In Python, like in many other languages, it is recommended to be explicit about the exact functions and classes we’re importing. Commonly, the import statement will thus take one of the following forms:

import random
import urllib.parse
from random import randint
from logging import fatal, warning as warn
from urllib.parse import urlparse
# etc.

In each case, it’s easy to see the relevant name being imported. Regardless of the exact syntax and the possible presence of aliasing (as), it’s always the last (qualified) name in the import statement, before a newline or comma.

Contrast this with an import that ends with an asterisk:

from itertools import *

This is called a star or wild import, and it isn’t so straightforward. This is also the reason why using it is generally discouraged, except for some very specific situations.

Why? Because you cannot easily see what exact names are being imported here. For that you’d have to go to the module’s source and — you guessed it — look at the __all__ list².

Taming the wild

Barring some less important details, the mechanics of import * could therefore be expressed in the following Python (pseudo)code:

import module as __temp
for __name in module:
    globals()[name] = getattr(__temp, __name)
del __temp
del __name

One interesting case to consider is what happens when __all__ contains a wrong name.

What if one of the strings there doesn’t correspond to any name within the module?…

# foo.py
__all__ = ['Foo']

def bar():
    pass

>>> import foo
>>> from foo import *
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'module' object has no attribute 'Foo'

Quite predictably, import * blows up.
Notice, however, that regular import still works.

All in all (ahem), this hints at a cute little trick which is also very self-evident:

__all__ = ['DO_NOT_WILD_IMPORT']

Put this in a Python module, and no one will be able to import * from it!
Much more effective than any lint warning ;-)

Test `all` the things

Jokes aside, this phenomenon (__all__ with an out-of-place name in it) can also backfire. Especially when reexporting, it’s relatively easy to introduce stray 'name' into __all__: one which doesn’t correspond to any name that’s actually present in the namespace.

If we commit such a mishap, we are inadvertently lying about the public API of our package. What’s worse is that this mistake can propagate through documentation generators, and ultimately mislead our users.

While some linters may be able to catch this, a simple test like this one:

def test_all(self):
    """Test that __all__ contains only names that are actually exported."""
    import yourpackage

    missing = set(n for n in yourpackage.__all__
                  if getattr(yourpackage, n, None) is None)
    self.assertEmpty(
        missing, msg="__all__ contains unresolved names: %s" % (
            ", ".join(missing),))

is a quick & easy way to ensure this never happens.

“Public” symbols have names that don’t begin with underscore (_). Of course, “non-public” ones are still accessible but are treated as implicitly unstable & discouraged. ↩
Or check what symbols there don’t have a leading underscore. ↩

The meaning of it __all__

Import star

Taming the wild

Test __all__ the things

The meaning of it `all`

Test `all` the things