Query string authentication in Requests

Posted on Fri 11 September 2015 in Code • Tagged with Python, Requests, HTTP, authenticationLeave a comment

Requests is a widely used Python library that provides a nicer API for writing HTTP clients than the standard urllib2 module does. It deals with authentication in an especially concise way: through a simple auth= argument, rather than a separate password manager & authentication handler or other such nonsense.

There are several possible ways to authenticate an HTTP call with Requests, and it’s pretty easy to implement our own approach if the server requires it. All the built-in ways, however, as well as the examples of custom implementations, are heavily biased towards using HTTP headers to transmit the necessary credentials (such as username/password or some kind of opaque token).

Non-standard auth

This is actually quite reasonable: the most popular authentication methods, including OAuth 1.0 & 2.0, use HTTP headers either primarily or exclusively.

Not every HTTP API follows this convention, though. Sometimes, credentials are put in other parts of the request, commonly the URL itself. It may seem like a bad idea at first but it can also be perfectly acceptable: credentials don’t have to expose secrets of any particular user of the remote system.

Steam API is a good example here. Calling any of its endpoints requires providing an API key but it grants no special rights to access data of any particular Steam user. All the information it returns is already visible on their public profile1.

For those special authentication schemes, Requests necessitate rolling out our own implementation. Thankfully, doing so is mostly pretty straightforward.

Simple example

All Requests’ authenticators are callable objects inheriting from requests.auth.AuthBase class. Writing your own is hence a matter of defining a subclass of AuthBase with at least a __call__ method:

class SillyAuth(AuthBase):
    def __call__(self, request):
        request.headers['X-ID'] = 'im valid so auth is yes'
        return request

# usage
requests.get('http://example.com', auth=SillyAuth())

The job of an authenticator is to modify the request so that it includes appropriate credentials in whatever form necessary to have them accepted by the remote server. Like I’ve mentioned before, HTTP headers are the most common option, but the request can be modified in other ways as well.

Query string parameters

One problem with modifying a query string, though, is that it’s a part of request URL. By the time it reaches authenticators, the Requests library has already merged any additional query params into it2. Including more params will thus require modifying the URL.

Though it may sound like a risky endeavour involving string manipulations that are fraught with security issues, it’s not really that bad at all. In fact, the Requests library provides an API to do exactly this:

class QueryStringAuth(AuthBase):
    """Authenticator that attaches a set of query string parameters
    (e.g. an API key) to the request.
    """
    def __init__(self, **params):
        self.params = params

    def __call__(self, request):
        if self.params:
            request.prepare_url(request.url, self.params)
        return request

Albeit scantly documented, the prepare_url method will take an existing URL and a dictionary of query string params, and outfit the request with a brand new URL that contains those params neatly encoded.

Full implementation of QueryStringAuth is a little more involved than the snippet above, because we should like to replicate all the idiosyncracies of how regular Requests API handles query string params. Some of them — like allowing both strings and lists as param values — are taken care of by prepare_url itself, but the rest should be dealt with on our own.

Usage

To finish up, let’s use this authenticator to call Steam API and return a list of games that a given user owns but hasn’t played yet:

STEAM_API_KEY = 'a1b2c3d4e5f6g7h8i9j'  # not a real one, of course


def get_steam_backlog(steam_id):
    url = 'http://api.steampowered.com/IPlayerService/GetOwnedGames/v0001/'
    params = {
        'steamid': steam_id,
        'include_appinfo': 1,
    }

    response = requests.get(
        url, params=params, auth=QueryStringAuth(key=STEAM_API_KEY))
    games = response.json().get('response', {}).get('games', ())

    for game in games:
        if game.get('playtime_forever', 0):
            continue
        yield game['name']

We could’ve put STEAM_API_KEY directly in params, of course. Singling it out explicitly as an authentication detail, however, makes the code clearer and plays nicely with more advanced features of Requests, such as sessions.


  1. It can be said that only in this case we’re dealing with exclusively authentication, whereas the others also perform authorization. I wouldn’t quibble too much about those details. The fact that both terms are often shortened to “auth” doesn’t exactly help with distinguishing them anyway. 

  2. In fact, what AuthBase.__call__ receives is a special PreparedRequest object which contains the exact bytes that’ll be sent to the server. Most of the higher level abstractions offered by the Requests library (like form data or JSON request body) has been compiled to raw octets at this point. This is done to allow some authenticators (like OAuth) to analyze the full request payload and sign it cryptographically as a part of their flow. 

Continue reading

Automatic error pages in Flask

Posted on Tue 08 September 2015 in Code • Tagged with Flask, Python, HTTP, errorsLeave a comment

In Flask, a popular web framework for Python, it’s pretty easy to implement custom handlers for specific HTTP errors. All you have to do is to write a function and annotate it with the @errorhandler decorator:

@app.errorhandler(404)
def not_found(error):
    return render_template('errors/404.html')

Albeit simple, the above example is actually very realistic. There’s rarely anything else to do in response to serious request error than to send back some HTML with an appropriate message. If you have more handlers like that, though, you’ll notice they get pretty repetitive: for each erroneous HTTP status code (404, 403, 400, etc.), pick a corresponding template and simply render it.

Which is, of course, why we’d want to deal with all in a little smarter way and with less boilerplate.

Just add a template

Ideally, we would like to avoid writing any Python code at all for each individual error handler. Since all we’re doing revolves around predefined templates, let’s just define the handlers automatically based on the HTML files themselves.

Assuming we store them in the same template directory — say, errors/ — and name their files after numeric status codes (e.g. 404.html), getting all those codes is quite simple1:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
from pathlib import Path, PurePath

from mywebapp import app


#: Path to the directory where HTML templates for error pages are stored.
ERRORS_DIR = PurePath('errors')


def get_supported_error_codes():
    """Returns an iterable of HTTP status codes for which we have
    a custom error page templates defined.
    """
    error_templates_dir = Path(app.root_path, app.template_folder, ERRORS_DIR)

    potential_error_templates = (
        entry for entry in error_templates_dir.glob('*.html')
        if entry.is_file())
    for template in potential_error_templates:
        try:
            code = int(template.stem)  # e.g. 404.html
        except ValueError:
            pass  # could be some base.html template, or similar
        else:
            if code < 400:
                app.logger.warning(
                    "Found error template for non-error HTTP status %s", code)
                continue
            yield code

Once we have them, we can try wiring up the handlers programmatically and making them do the right thing.

One function to handle them all

Although I’ve used a plural here, in reality we only need a single handler, as it’ll be perfectly capable of dealing with any HTTP error we bind it to. To help distinguishing between the different status codes, Flask by default invokes our error handlers with an HTTPException argument that has the code as an attribute:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
from flask import render_template
from jinja2 import TemplateNotFound


def error_handler(error):
    """Universal handler for all HTTP errors.

    :param error: :class:`~werkzeug.exceptions.HTTPException`
                  representing the HTTP error.

    :return: HTTP response to be returned
    """
    code = getattr(error, 'code', None)
    if not code:
        app.logger.warning(
            "Got spurious argument to HTTP error handler: %r", error)
        return FATAL_ERROR_RESPONSE

    app.logger.debug("HTTP error %s, rendering error page", code)
    template = ERRORS_DIR / ('%s.html' % code)
    try:
        return render_template(str(template)), code
    except TemplateNotFound:
        # shouldn't happen if the handler has been wired up properly
        app.logger.fatal("Missing template for HTTP error page: %s", template)
        return FATAL_ERROR_RESPONSE

#: Response emitted when an error occurs in the error handler itself.
FATAL_ERROR_RESPONSE = ("Internal Server Error", 500)

Catching TemplateNotFound exception is admittedly a little paranoid here, as it can occur pretty much exclusively due to a programming error elsewhere. Feel free to treat it as a failed assertion about application’s internal state and e.g. convert to AssertionError if desirable.

The setup

The final step is to actually set up the handler(s):

for code in get_supported_error_codes():
    app.errorhandler(code)(error_handler)

It may look a bit wonky, but it’s just a simple desugaring of the standard Python decorator syntax from the first example. There exists a more direct approach of putting the handler inside app.error_handler_spec dictionary, but it is discouraged by Flask documentation.

Where to put the above code, though? I posit it’s perfectly fine to place in the module’s global scope, because the error handlers (and other request handlers) are traditionally defined at import time anyway.

Also notice that the default error handling we’ve defined here doesn’t preclude more specialized variants for select HTTP codes. All you have to do is ensure that your custom @app.errorhandler definition occurs after the above loop.


  1. Note that I’m using the pathlib module here — and you should, too, since it’s all around awesome. However, it is only a part of the standard library since Python 3.4, so you will most likely need to get the backport first. 

Continue reading