Turn SQLAlchemy queries into literal SQL
Posted on Thu 12 November 2015 in Code
Like I mentioned in the post about adding regular expression support, the nice thing about SQLAlchemy is the clean, layered structure of various abstractions it utilizes.
Regretably, though, we know that software abstractions tend to leak. Inn particular, any ORM seem to be a poster child of this rule, especially in the popular opinion among developer crowds. By design, they’re intended to hide at least some details of the operations on a database, but those very details can be quite critical at times. There are situations when we simply want to know what exactly is going on, and how all those model classes and mappers and relationships translate to the actual SQL code.
To make it a little more concrete, let’s focus on the SQLAlchemy Query
class. Given such a query, we’d like to get
the final SQL representation of it, the one that’s ultimately sent to the database, It could be useful
for any number of things, from logging1 to profiling, or just displaying in the web page’s footer,
or even solely for prototyping in the Python REPL.
In other words, we want to turn this:
db_session.query(User.id).filter(User.email == some_email)
into something like this:
SELECT users.id FROM users WHERE users.email = :1
regardless of the complexity of the query, the number of model classes it spans, or the number of relationship-related
JOIN
s it involves.
It’s a dialect
There is one aspect we cannot really universalize, though. It’s the specific database backend that SQLAlchemy should compile our query for. Shrugging off syntactical and semantical differences between database engines is one thing that using an ORM can potentially buy us, but if we want to get down to the SQL level, we need to be specific about it.
In SQLAlchemy’s parlance, any specific variant of the SQL language is called a dialect. Most of the time,
you’ll be interested in the particular dialect your database of choice is using. This is easily obtainable from
the database Session
:
dialect = db_session.bind.dialect
The resulting
Dialect
object
is little more than a container for small tidbits of information, used by SQLAlchemy to handle various quirks of
the database backends it supports. For our purposes, though, it can be treated as a completely opaque token.
Compile & unwrap
With the Dialect
in hand, we can invoke the query compiler to get the textual representation of our Query
.
Or, to be more precise, the compiled version of the query’s
Select
statement
>>> query = db_session.query(User.id).filter(User.email == 'foo@example.com')
>>> print(query.statement.compile(dialect=db_session.bind.dialect))
SELECT user.id
FROM users
WHERE users.email = %(email_1)s
Perhaps unsurprisingly, even after compilation the result is still just another object:
the Compiled
one.
As you can see, however, the actual SQL text is just its __str__
‘ing representation, which we can print
directly
or obtain with str()
or unicode()
.
Query.to_sql
But obviously, we don’t want to type the above incantations every time we need to take a peek at the generated SQL
for an ORM query. Probably the best solution is to extend the Query
class so that it offers this additional
functionality under a new method:
from sqlalchemy.orm.query import Query as _Query
class Query(_Query):
"""Custom, enhanced SQLALchemy Query class."""
def to_sql(self):
"""Return a literal SQL representation of the query."""
dialect = self.session.bind.dialect
return str(self.statement.compile(dialect=dialect))
This new Query
class needs then be passed as
query_cls
argument
to the constructor of Session
. Details may vary a little bit depending on how exactly your application is set up,
but in most cases, it should be easy enough to figure out.
-
If you’re only interested in indiscriminate logging of all queries, setting the
echo
parameter increate_engine
may be sufficient. Another alternative is to look directly at the logging configuration options for various parts of SQLAlchemy ↩