@kkirsanov
Created November 23, 2017 15:26
Disable logical reader in fastavro
import datetime
from io import BytesIO
from pprint import pprint
from uuid import uuid4

import fastavro

fastavro._reader.LOGICAL_READERS['long-timestamp-millis'] = lambda d, w, r: d
# The assignment above swaps the timestamp-millis logical reader for a
# pass-through, so those fields come back as the raw long value instead of a
# datetime. It patches a private registry shared by the whole process.
# Depending on the fastavro version, the private module may be named
# `_read` rather than `_reader`.

# Record schema with one field per logical type exercised here.
schema = {
    "fields": [
        {
            "name": "date",
            "type": {'type': 'int', 'logicalType': 'date'}
        },
        {
            "name": "timestamp-millis",
            "type": {'type': 'long', 'logicalType': 'timestamp-millis'}
        },
        {
            "name": "timestamp-micros",
            "type": {'type': 'long', 'logicalType': 'timestamp-micros'}
        },
        {
            "name": "uuid",
            "type": {'type': 'string', 'logicalType': 'uuid'}
        },
        {
            "name": "time-millis",
            "type": {'type': 'int', 'logicalType': 'time-millis'}
        },
        {
            "name": "time-micros",
            "type": {'type': 'long', 'logicalType': 'time-micros'}
        }
    ],
    "namespace": "namespace",
    "name": "name",
    "type": "record"
}


def serialize(schema, data):
    """Encode one record to Avro bytes without a container header."""
    bytes_writer = BytesIO()
    fastavro.schemaless_writer(bytes_writer, schema, data)
    return bytes_writer.getvalue()


def deserialize(schema, binary):
    """Decode one schemaless Avro record from bytes."""
    bytes_reader = BytesIO(binary)
    return fastavro.schemaless_reader(bytes_reader, schema)


data1 = {
    'date': datetime.date.today(),
    'timestamp-millis': datetime.datetime.now(),
    'timestamp-micros': datetime.datetime.now(),
    'uuid': uuid4(),
    'time-millis': datetime.datetime.now().time(),
    'time-micros': datetime.datetime.now().time(),
}

# Round-trip the record; only timestamp-millis is left undecoded.
binary = serialize(schema, data1)
data2 = deserialize(schema, binary)
pprint(data2)
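
The `LOGICAL_READERS` assignment in the snippet stays in effect for the rest of the process. If raw values are only needed around a single read, one option is to restore the previous entry afterwards. The following is a minimal sketch of that idea, not part of the original gist: `scoped_override` is a hypothetical helper, and it is demonstrated on a stand-in dict on the assumption that fastavro's private registry behaves like an ordinary `dict`. It is not thread-safe, since other threads would still see the override while the block is active.

```python
from contextlib import contextmanager

_MISSING = object()  # sentinel: the key was absent before the override


@contextmanager
def scoped_override(registry, key, replacement):
    """Temporarily set registry[key], restoring the old state on exit."""
    previous = registry.get(key, _MISSING)
    registry[key] = replacement
    try:
        yield registry
    finally:
        if previous is _MISSING:
            registry.pop(key, None)
        else:
            registry[key] = previous


# Stand-in for the private LOGICAL_READERS dict (assumption: a plain dict).
demo_registry = {'long-timestamp-millis': lambda d, w, r: 'decoded'}

with scoped_override(demo_registry, 'long-timestamp-millis',
                     lambda d, w, r: d):
    # Inside the block the pass-through reader is active.
    inside = demo_registry['long-timestamp-millis'](1511450760000, None, None)

# Outside the block the original reader is back.
after = demo_registry['long-timestamp-millis'](1511450760000, None, None)
```

With a real fastavro install, the stand-in dict would be replaced by the library's registry and the body of the `with` block by the `deserialize` call.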
@NoAnyLove

I think line 8 should be,

fastavro._read.LOGICAL_READERS['long-timestamp-millis'] = lambda d, w, r: d

@forsberg

Doing this is usually a terrible idea, because it changes how fastavro decodes timestamps for the whole process. If you add the fastavro._read.LOGICAL_READERS line, you change how timestamp-millis is handled not only in the module where you add that line, but in the entire Python program.

This is a recipe for nasty, lurking, hard-to-find bugs, and it makes the code unmaintainable. I speak from experience: I found this piece of code, added by a junior programmer, in a project where it was causing problems in a completely different part of the codebase.

Just don't.
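
One way to get raw values without touching the shared registry is to leave fastavro's decoding alone and convert the affected fields back after reading. The following is a sketch under stated assumptions, not a fastavro API: `to_epoch_millis` and `with_raw_timestamps` are hypothetical helper names, and naive datetimes are assumed to be UTC, which may not match what your fastavro version returns.

```python
import datetime

EPOCH = datetime.datetime(1970, 1, 1, tzinfo=datetime.timezone.utc)


def to_epoch_millis(value):
    """Convert a datetime to integer milliseconds since the Unix epoch."""
    if value.tzinfo is None:
        # Assumption: naive datetimes represent UTC.
        value = value.replace(tzinfo=datetime.timezone.utc)
    delta = value - EPOCH
    # Integer arithmetic avoids float rounding in total_seconds().
    return (delta.days * 86_400_000
            + delta.seconds * 1000
            + delta.microseconds // 1000)


def with_raw_timestamps(record, fields):
    """Return a copy of record with the named datetime fields as epoch millis."""
    out = dict(record)
    for name in fields:
        if isinstance(out.get(name), datetime.datetime):
            out[name] = to_epoch_millis(out[name])
    return out


# Example record shaped like a decoded row; the values are illustrative.
decoded = {
    'timestamp-millis': datetime.datetime(
        2017, 11, 23, 15, 26, tzinfo=datetime.timezone.utc),
    'uuid': 'abc',
}
raw = with_raw_timestamps(decoded, ['timestamp-millis'])
```

The conversion only touches the records passed to it, so other code in the same process keeps getting datetimes from fastavro as usual.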
