Actually, Python enums are pretty OK
Posted on Sun 25 October 2015 in Code • Tagged with Python, enums, serialization • Leave a comment
Little over two years ago, I was writing about my skepticism towards the addition of enum types to Python. My stance has changed somewhat since then, and I now think there is some non-trivial value that enums can add to Python code. It becomes especially apparent in the circumstances involving any kind of persistence, from pickling to databases.
I will elaborate on that through the rest of this post.
Revision
First, let’s recap (or perhaps introduce) the important facts about enums in Python. An enum, or an enumeration type,
is a special class that inherits from enum.Enum
1 and defines one or more constants:
from enum import Enum
class Cutlery(Enum):
knife = 'knife'
fork = 'fork'
spoon = 'spoon'
Their values are then converted into singleton objects with distinct identity. They available as attributes of the enumeration type:
>>> Cutlery.fork
<Cutlery.fork: 'fork'>
As a shorthand for when the values themselves are not important, the Enum
class can be directly invoked
with constant names as strings:
Cutlery = Enum('Cutlery', ['knife', 'fork', 'spoon'])
Resulting class offers some useful API that we’d expect from an enumeration type in any programming language:
>>> list(Cutlery)
[<Cutlery.knife: 'knife'>, <Cutlery.fork: 'fork'>, <Cutlery.spoon: 'spoon'>]
>>> [c.value for c in Cutlery]
['knife', 'fork', 'spoon']
>> Cutlery.knife in Cutlery
True
>>> Cutlery('spoon')
<Cutlery.spoon: 'spoon'>
>>> Cutlery('spork')
Traceback (most recent call last):
# (...)
ValueError: 'spork' is not a valid Cutlery
How it was done previously
Of course, there is nothing really groundbreaking about assigning some values to “constants”. It’s been done like that since times immemorial, and quite often those constants have been grouped within finite sets.
Here’s an example of a comment model in some imaginary ORM, complete with a status “enumeration”:
class Comment(Model):
APPROVED = 'approved'
REJECTED = 'rejected'
IN_REVIEW = 'in_review'
STATES = frozenset([APPROVED, REJECTED, IN_REVIEW])
author = StringProperty()
text = TextProperty()
state = String(choices=STATES)
# usage
comment = Comment(author="Xion", text="Boo")
comment.state = Comment.IN_REVIEW
comment.save()
Converting it to use an Enum
is relatively straightforward:
class Comment(Model):
class State(Enum):
approved = 'approved'
rejected = 'rejected'
in_review = 'in_review'
author = StringProperty()
text = TextProperty()
state = StringProperty(choices=[s.value for s in State])
comment = Comment(author="Xion", text="Meh.")
comment.state = Comment.State.approved.value
comment.save()
It is not apparent, though, if there is any immediate benefit of doing so. True, we no longer need to define the STATES
set explicitly, but saving on this bit of boilerplate is balanced out by having to access the enum’s value
when assigning it to a string property.
All in all, it seems like a wash — though at least we’ve established that enums are no worse than their alternatives :)
Enums are interoperable
Obviously, this doesn’t exactly sound like high praise. Thing is, we haven’t really replaced the previous solution
completely. Remnants of it still linger in the way the state
property is declared. Even though it’s supposed to hold
enum constants, it is defined as string, which at this point is more of an implementation detail of how those constants
are serialized.
What were really need here is a kind of EnumProperty
that’d allow us to work solely with enum objects.
Before the introduction of a standard Enum
base, however, there was little incentive for ORMs and other similiar
libraries to provide such a functionality. But now, it makes much more sense to support enums as first-class citizens,
at least for data exchange and serialization, because users can be expected to already prefer the standard Enum
in their own code.
Thus, the previous example changes into something along these lines:
class Comment(Model):
# ...
state = EnumProperty(State)
comment = Comment(author="Xion", text="Yay!")
comment.state = Comment.State.approved # no .value!
comment.save()
Details of EnumProperty
, or some equivalent construct, are of course specific to any given data management library.
In SQLAlchemy, for example, a custom column type
can handle the necessary serialization and deserialization between Python code and SQL queries,
allowing you to define your models like this2:
class Comment(Model):
class State(Enum):
# ...
author = Column(String(255))
text = Column(Text)
state = Column(EnumType(State, impl=String(32)))
# usage like above
In any case, the point is to have Python code operate solely on enum objects, while the persistence layer takes care of converting between them and their serializable values.
Enums are extensible
The other advantage Enum
s have over loose collections of constants is that they are proper types.
Like all user-defined types in Python, they can have additional methods and properties defined on their instances,
or even on the type itself.
Although this capability is (by default) not as powerful as e.g. in Java — where each enum constant can override a method in its own way — it can nevertheless be quite convenient at times. Typical use cases include constant classification:
class Direction(Enum):
left = 37
up = 38
right = 39
down = 40
@property
def is_horizontal(self):
return self in (Direction.left, Direction.right)
@property
def is_vertical(self):
return self in (Direction.down, Direction.up)
and conversion:
def as_vector(self):
return {
Direction.left: (-1, 0),
Direction.up: (0, -1),
Direction.right: (1, 0),
Direction.down: (0, 1),
}.get(self)
For the latter, it would be handy to have the Java’s ability to
attach additional data to an enum constant.
As it turns out, Python supports this feature natively in a very similar way.
We simply have to override enum’s __new__
method to parse out any extra values from the initializer
and turn them into attributes of the enum instance:
class Direction(Enum):
left = 37, (-1, 0)
up = 38, (0, -1)
right = 39, (1, 0)
down = 40, (0, 1)
def __new__(cls, keycode, vector):
obj = object.__new__(cls)
obj._value_ = keycode
obj.vector = vector
return obj
It’s possible, in fact, to insert any arbitrary computation here that yields the final _value_
of an enum constant3.
This trick can be used to, for example, construct enums that
automatically number themselves.
Finally, we can add static methods, class methods, or class properties to the
Enum
subclass, just like we would do with any other class:
class MyEnum(Enum):
# ...
@classproperty
def __values__(cls):
return [m.value for m in cls]
Enums just are
All these things are possible primarly because of the most notable aspect of Python enums: their existence as an explicit concept. A syntactically unorganized bunch of constants cannot offer half of the highlighted features because there is nowhere to pin them on.
For that reason alone, using enums as an explicit — rather than implicit — pattern seems worthwhile. The one benefit we’re certain to reap is better code structure through separation of important concepts.
-
The
enum
module is part of the Python standard library since version 3.4 but a fully functional backport is available for Python 2.x as well. ↩ -
It is even possible to instruct SQLAlchemy how to map Python enums to
ENUM
types in database engines that support it, but details are outside of the scope of this article. ↩ -
If you’re fine with the enum’s
value
being the whole tuple (everything after the=
sign), you can override__init__
instead of__new__
(cf. the planet example from standard docs). ↩