Skip to content

Instantly share code, notes, and snippets.

@dvarrazzo
Last active September 27, 2018 14:43
Show Gist options
  • Save dvarrazzo/b7c8f050bbd39dd2c104 to your computer and use it in GitHub Desktop.
Save dvarrazzo/b7c8f050bbd39dd2c104 to your computer and use it in GitHub Desktop.
Accessing EscapeIdentifier from Python
From an email about ukulele:
[...], but I have a trick for you. We always refused to
"bless" it into psycopg because it may break in the future (well, it
will only in psycopg3 which is not really on anybody's roadmap, only
on mine if I manage to clone myself at least 4 times). But...
In [5]: class Identifier(str): pass
In [7]: psycopg2.extensions.register_adapter(Identifier, psycopg2.extensions.AsIs)
In [9]: cnn = psycopg2.connect("")
In [10]: cur = cnn.cursor()
In [11]: print cur.mogrify("select * from blah where %s = %s", (Identifier('field'), 'value'))
select * from blah where field = 'value'
this is super-dangerous if it comes from the web, if the fields are
not checked against a whitelist of course, but it allows to build
dynamically queries with fields and tables names passed as arguments.
To be more secure the identifiers should be filtered through kinda
http://www.postgresql.org/docs/9.4/static/libpq-exec.html#LIBPQ-PQESCAPEIDENTIFIER
[...] PQescapeIdentifier can be
accessed using ctypes. Getting the address of a libpq connection
structure is tricky as it's in the middle of the psycopg connection
structure, but luckily (see connection.h) it's right after a pretty
recognizable integer, the server_version, which can be also queried
from Python. So one can write this, can't he?
def getpqconn(conn):
"""
Return the address of the libpq connection string from a psycopg connection
"""
from ctypes import string_at
from sys import getsizeof
from socket import ntohl, htonl
from binascii import hexlify
hver = "%08x" % ntohl(conn.server_version)
mem = hexlify(string_at(id(conn), getsizeof(conn))).lower()
ver_off = mem.find(hver)
assert ver_off > 0
assert mem.find(hver, ver_off + 8) == -1, "there should be only one"
pqconn = htonl(int(mem[ver_off + 8:ver_off + 16], 16))
return pqconn
and that can be used to write a "robust" escape identifier...
libpq = ctypes.cdll.LoadLibrary("libpq.so")
def escape_identifier(conn, s):
from ctypes import string_at
cstr = libpq.PQescapeIdentifier(getpqconn(conn), s, len(s))
assert cstr
rv = string_at(cstr)
libpq.PQfreemem(cstr)
return rv
In [114]: print escape_identifier(cnn, 'ciao')
"ciao"
In [115]: print escape_identifier(cnn, 'quo"tes')
"quo""tes"
In [116]: print escape_identifier(cnn, "quo'te")
"quo'te"
which can be called upon Identifier adaptation using an adapter:
class IdentifierAdapter:
def __init__(self, s):
self.s = s
self.conn = None
def prepare(self, conn):
self.conn = conn
def getquoted(self):
return escape_identifier(self.conn, self.s)
psycopg2.extensions.register_adapter(Identifier, IdentifierAdapter)
In [122]: print cur.mogrify("select * from blah where %s = %s", (Identifier('field'), 'value'))
select * from blah where "field" = 'value'
@ewenmcneill
Copy link

For the benefit of anyone trying to make this work on Ubuntu 16.04 LTS, it seems like binascii.hexlify() now returns lower case hex digits for the memory, which means the upper case hex conversion of hver (with "%08X") will fail to match in mem.find(hver) due to find() being case sensitive. A reasonable work around is to force both strings to lower case (with STRING.lower()) before trying to use find(). After doing that, this hack seems to work again.

For the record, there's a proper solution in psycopg2 2.7 in the form of quote_ident; but Ubuntu 16.04 LTS shipped with psycopg 2.6 because psycopg 2.7 did not get released until earlier this year (2017).

Ewen

@dvarrazzo
Copy link
Author

Updated to address @ewenmcneill observation and fixed a typo (was hex instead of mem, resulting in TypeError: 'builtin_function_or_method' object has no attribute '__getitem__'.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment