str.startswith() with tuple argument
Posted on Tue 28 June 2016 in Code
Here’s a little-known trick that applies to the startswith and endswith methods of Python’s str (and unicode).
Suppose you’re checking whether a string starts with some prefix:
if s.startswith('http://'):
    # totally a URL
You eventually add more possible prefixes (or suffixes) to your condition:
if s.startswith('http://') or s.startswith('https://'):
    # ...
Later on you notice the repetition and refactor it into something like this:
SCHEMES = ['http://', 'https://', 'ftp://', 'git://']
if any(s.startswith(p) for p in SCHEMES):
    # ...
or if you’re feeling extra functional:
if any(map(s.startswith, SCHEMES)):
    # ...
Turns out, however, that startswith (and endswith) support this use case natively.
Rather than passing just a single string as the argument,
you can provide a tuple of strings instead:
SCHEMES = ('http://', 'https://', 'ftp://', 'git://')
if s.startswith(SCHEMES):
    # ...
Either method will then check the original string against every element of the passed tuple.
Both return True only if at least one of the strings is recognized as a prefix or suffix, respectively.
As you can see, that’s exactly what we would previously do with any.
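To make the equivalence concrete, here is a minimal sketch that compares the tuple form against the explicit any version on a few made-up strings. The helper name starts_with_any and the sample values are assumptions for illustration only, not part of any library.

```python
SCHEMES = ('http://', 'https://', 'ftp://', 'git://')


def starts_with_any(s, prefixes):
    """Hypothetical helper: the explicit any()-based check, for comparison."""
    return any(s.startswith(p) for p in prefixes)


# Made-up sample inputs: two that match a scheme, two that do not.
samples = [
    'http://example.com',
    'git://host/repo.git',
    'mailto:someone@example.com',
    '',
]

for s in samples:
    # The tuple form and the any() form agree on every sample.
    assert s.startswith(SCHEMES) == starts_with_any(s, SCHEMES)

# An empty tuple never matches, just like any() over an empty sequence.
assert 'http://example.com'.startswith(()) is False

# endswith accepts a tuple of suffixes in the same way.
assert 'photo.jpeg'.endswith(('.jpg', '.jpeg'))
```

The empty-tuple case is worth keeping in mind: with no candidates to try, both forms simply report no match rather than raising an error.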
Somewhat surprisingly, however, the feature only works for actual tuples.
Trying to pass a seemingly equivalent iterable (a list or set, for example) will be met with the interpreter’s refusal:
>>> is_jpeg = filename.endswith(['.jpg', '.jpeg'])
TypeError: endswith first arg must be str, unicode, or tuple, not list
If you dig into it, there doesn’t seem to be a compelling reason for this behavior.
The relevant feature request talks about consistency with the built-in isinstance function,
but it’s quite difficult to see how those two are related.
In any case, this can be worked around without much difficulty:
PROTOCOLS = ('http', 'https', 'ftp', 'git')
if s.startswith(tuple(p + '://' for p in PROTOCOLS)):
    # ...
though ideally, you’d want to pack the prefixes in a tuple to begin with.
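When the prefixes arrive as a list or set from code you don’t control, the conversion can be wrapped once instead of repeated at every call site. The following is only a sketch: startswith_any is a hypothetical helper name, and it is written for Python 3, where str is the only text type to guard against.

```python
def startswith_any(s, prefixes):
    """Hypothetical helper: like s.startswith(), but accepts any iterable of prefixes."""
    if isinstance(prefixes, str):
        # A lone string is itself iterable; wrap it so it is not split into characters.
        prefixes = (prefixes,)
    return s.startswith(tuple(prefixes))


url = 'https://example.com'

# Passing a list directly is rejected with a TypeError...
try:
    url.startswith(['http://', 'https://'])
except TypeError:
    list_rejected = True
else:
    list_rejected = False
assert list_rejected

# ...while the helper accepts a list, a set, or a generator expression.
assert startswith_any(url, ['http://', 'https://'])
assert startswith_any(url, {'http://', 'https://'})
assert startswith_any(url, (p + '://' for p in ('http', 'https')))

# A single string still behaves like plain startswith.
assert not startswith_any(url, 'ftp://')
```

The isinstance guard matters because tuple('ftp://') would otherwise produce one-character prefixes and match far more than intended.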