Actually, Python enums are pretty OK

Posted on Sun 25 October 2015 in Code • Tagged with Python, enums, serializationLeave a comment

Little over two years ago, I was writing about my skepticism towards the addition of enum types to Python. My stance has changed somewhat since then, and I now think there is some non-trivial value that enums can add to Python code. It becomes especially apparent in the circumstances involving any kind of persistence, from pickling to databases.

I will elaborate on that through the rest of this post.


First, let’s recap (or perhaps introduce) the important facts about enums in Python. An enum, or an enumeration type, is a special class that inherits from enum.Enum1 and defines one or more constants:

from enum import Enum

class Cutlery(Enum):
    knife = 'knife'
    fork = 'fork'
    spoon = 'spoon'

Their values are then converted into singleton objects with distinct identity. They available as attributes of the enumeration type:

>>> Cutlery.fork
<Cutlery.fork: 'fork'>

As a shorthand for when the values themselves are not important, the Enum class can be directly invoked with constant names as strings:

Cutlery = Enum('Cutlery', ['knife', 'fork', 'spoon'])

Resulting class offers some useful API that we’d expect from an enumeration type in any programming language:

>>> list(Cutlery)
[<Cutlery.knife: 'knife'>, <Cutlery.fork: 'fork'>, <Cutlery.spoon: 'spoon'>]
>>> [c.value for c in Cutlery]
['knife', 'fork', 'spoon']
>> Cutlery.knife in Cutlery
>>> Cutlery('spoon')
<Cutlery.spoon: 'spoon'>
>>> Cutlery('spork')
Traceback (most recent call last):
# (...)
ValueError: 'spork' is not a valid Cutlery

How it was done previously

Of course, there is nothing really groundbreaking about assigning some values to “constants”. It’s been done like that since times immemorial, and quite often those constants have been grouped within finite sets.

Here’s an example of a comment model in some imaginary ORM, complete with a status “enumeration”:

class Comment(Model):
    APPROVED = 'approved'
    REJECTED = 'rejected'
    IN_REVIEW = 'in_review'

    author = StringProperty()
    text = TextProperty()
    state = String(choices=STATES)

# usage
comment = Comment(author="Xion", text="Boo")
comment.state = Comment.IN_REVIEW

Converting it to use an Enum is relatively straightforward:

class Comment(Model):
    class State(Enum):
        approved = 'approved'
        rejected = 'rejected'
        in_review = 'in_review'

    author = StringProperty()
    text = TextProperty()
    state = StringProperty(choices=[s.value for s in State])

comment = Comment(author="Xion", text="Meh.")
comment.state = Comment.State.approved.value

It is not apparent, though, if there is any immediate benefit of doing so. True, we no longer need to define the STATES set explicitly, but saving on this bit of boilerplate is balanced out by having to access the enum’s value when assigning it to a string property.

All in all, it seems like a wash — though at least we’ve established that enums are no worse than their alternatives :)

Enums are interoperable

Obviously, this doesn’t exactly sound like high praise. Thing is, we haven’t really replaced the previous solution completely. Remnants of it still linger in the way the state property is declared. Even though it’s supposed to hold enum constants, it is defined as string, which at this point is more of an implementation detail of how those constants are serialized.

What were really need here is a kind of EnumProperty that’d allow us to work solely with enum objects. Before the introduction of a standard Enum base, however, there was little incentive for ORMs and other similiar libraries to provide such a functionality. But now, it makes much more sense to support enums as first-class citizens, at least for data exchange and serialization, because users can be expected to already prefer the standard Enum in their own code.

Thus, the previous example changes into something along these lines:

class Comment(Model):
    # ...
    state = EnumProperty(State)

comment = Comment(author="Xion", text="Yay!")
comment.state = Comment.State.approved  # no .value!

Details of EnumProperty, or some equivalent construct, are of course specific to any given data management library. In SQLAlchemy, for example, a custom column type can handle the necessary serialization and deserialization between Python code and SQL queries, allowing you to define your models like this2:

class Comment(Model):
    class State(Enum):
        # ...

    author = Column(String(255))
    text = Column(Text)
    state = Column(EnumType(State, impl=String(32)))

# usage like above

In any case, the point is to have Python code operate solely on enum objects, while the persistence layer takes care of converting between them and their serializable values.

Enums are extensible

The other advantage Enums have over loose collections of constants is that they are proper types. Like all user-defined types in Python, they can have additional methods and properties defined on their instances, or even on the type itself.

Although this capability is (by default) not as powerful as e.g. in Java — where each enum constant can override a method in its own way — it can nevertheless be quite convenient at times. Typical use cases include constant classification:

class Direction(Enum):
    left = 37
    up = 38
    right = 39
    down = 40

    def is_horizontal(self):
        return self in (Direction.left, Direction.right)

    def is_vertical(self):
        return self in (Direction.down, Direction.up)

and conversion:

    def as_vector(self):
        return {
            Direction.left: (-1, 0),
            Direction.up: (0, -1),
            Direction.right: (1, 0),
            Direction.down: (0, 1),

For the latter, it would be handy to have the Java’s ability to attach additional data to an enum constant. As it turns out, Python supports this feature natively in a very similar way. We simply have to override enum’s __new__ method to parse out any extra values from the initializer and turn them into attributes of the enum instance:

class Direction(Enum):
    left = 37, (-1, 0)
    up = 38, (0, -1)
    right = 39, (1, 0)
    down = 40, (0, 1)

    def __new__(cls, keycode, vector):
        obj = object.__new__(cls)
        obj._value_ = keycode
        obj.vector = vector
        return obj

It’s possible, in fact, to insert any arbitrary computation here that yields the final _value_ of an enum constant3. This trick can be used to, for example, construct enums that automatically number themselves.

Finally, we can add static methods, class methods, or class properties to the Enum subclass, just like we would do with any other class:

class MyEnum(Enum):
    # ...

    def __values__(cls):
        return [m.value for m in cls]

Enums just are

All these things are possible primarly because of the most notable aspect of Python enums: their existence as an explicit concept. A syntactically unorganized bunch of constants cannot offer half of the highlighted features because there is nowhere to pin them on.

For that reason alone, using enums as an explicit — rather than implicit — pattern seems worthwhile. The one benefit we’re certain to reap is better code structure through separation of important concepts.

  1. The enum module is part of the Python standard library since version 3.4 but a fully functional backport is available for Python 2.x as well. 

  2. It is even possible to instruct SQLAlchemy how to map Python enums to ENUM types in database engines that support it, but details are outside of the scope of this article. 

  3. If you’re fine with the enum’s value being the whole tuple (everything after the = sign), you can override __init__ instead of __new__ (cf. the planet example from standard docs). 

Continue reading