from collections import namedtuple

def convert(dictionary):
    return namedtuple('GenericDict', dictionary.keys())(**dictionary)

"""
>>> d = dict(a=1, b='b', c=[3])
>>> named = convert(d)
>>> named.a == d['a']
True
>>> named.b == d['b']
True
>>> named.c == d['c']
True
"""
This is nice. I have to convert a JSON object to a namedtuple via a dict. Could you explain the syntax in the statement return namedtuple('GenericDict', dictionary.keys())(**dictionary), particularly the (**dictionary) part at the end?
I can.
The function namedtuple() returns a class. Passing **somedict as an argument when calling a class or function expands the dict into key=value keyword arguments. So if I had some function, and some dict like...
def func(kwarg=None):
    return kwarg

derp = {'kwarg': 39}
...then I can call func(**derp) and it will return 39. If there are any other key-value pairs in derp, these will expand too, and func will raise a TypeError because it does not accept those keyword arguments.
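A minimal sketch of both cases described above. The second dict, extra, is a made-up name used only for illustration:

```python
def func(kwarg=None):
    return kwarg

derp = {'kwarg': 39}
result = func(**derp)  # equivalent to func(kwarg=39)

# A dict with a key that func() does not accept fails at call time.
extra = {'kwarg': 39, 'other': 1}  # hypothetical dict with one extra key
try:
    func(**extra)
except TypeError as exc:
    error_message = str(exc)  # the message names the unexpected keyword
```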
If you look at namedtuple(), it takes two arguments: a string with the name of the class (which is used by repr, as in pihentagy's example), and a list of strings naming the fields. So in the code block above, namedtuple('GenericDict', dictionary.keys()) first returns a class that is not bound to any name. Calling it with (**dictionary) right afterwards instantiates the class and returns an instance. If instead we did...
def convert(dictionary):
    NT = namedtuple('GenericDict', dictionary.keys())
    gen_dict = NT(**dictionary)
    return gen_dict
...the same thing would happen. But this way, it might be a little clearer.
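To see the two steps separately, here is a small sketch that inspects what each one produces. The sample dict d is an assumption for illustration:

```python
from collections import namedtuple

d = {'a': 1, 'b': 'b'}  # sample input, made up for this sketch

# Step 1: namedtuple() builds and returns a new class.
NT = namedtuple('GenericDict', d.keys())
is_class = isinstance(NT, type)
field_names = NT._fields  # tuple of field names, in the order given

# Step 2: calling the class with **d creates an instance.
inst = NT(**d)
```

The instance is still a regular tuple, so inst[0] and inst.a refer to the same value.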
A neat idea!
Here is my variation on this theme:
from collections import namedtuple

def MakeNamedTuple(name='NamedTuple', **kwargs):
    """
    Returns a namedtuple instance.

    >>> nt1 = MakeNamedTuple(reclen=1000, numrec=5, samprate=1e6)
    >>> nt1
    NamedTuple(numrec=5, reclen=1000, samprate=1000000.0)
    >>> nt1.reclen
    1000
    >>> sp1 = MakeNamedTuple(name='Setup', reclen=2000, numrec=55, samprate=1.6e6)
    >>> sp1
    Setup(numrec=55, reclen=2000, samprate=1600000.0)
    >>> sp1.samprate
    1600000.0
    """
    dt = dict(**kwargs)
    return namedtuple(name, sorted(dt))(**dt)
Thanks. Added code to convert it recursively.
from collections import namedtuple

def convert(dictionary):
    # Nested dicts are replaced in place, so the input dict is modified.
    for key, value in dictionary.items():  # use .iteritems() on Python 2
        if isinstance(value, dict):
            dictionary[key] = convert(value)
    return namedtuple('GenericDict', dictionary.keys())(**dictionary)
I have a variation to propose that keeps working if new members are added to the dict (for example, when a JSON API evolves), because the field names are fixed up front:
from collections import namedtuple

args = 'a1 a2 a3'
d = {"a1": 1, "a2": 2, "a3": 3}
X = namedtuple("Blah", args)
X._make(d[i] for i in args.split(" "))
# Blah(a1=1, a2=2, a3=3)
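A short sketch of why this is more tolerant than expanding the whole dict. The payload dict and its extra key a4 are hypothetical, standing in for a newer API response:

```python
from collections import namedtuple

Blah = namedtuple('Blah', 'a1 a2 a3')

# Hypothetical payload with an extra key that a newer API version might add.
payload = {'a1': 1, 'a2': 2, 'a3': 3, 'a4': 4}

# Pick only the declared fields; unknown keys are simply ignored.
blah = Blah._make(payload[name] for name in Blah._fields)

# By contrast, expanding the whole dict fails on the unknown key.
try:
    Blah(**payload)
    strict_ok = True
except TypeError:
    strict_ok = False
```

Note that a key missing from the payload would still raise a KeyError in the generator expression.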
I'd like to point out the performance of this.
For a dictionary of 20 keys, the conversion takes 500 µs on my machine. Member access is about 3 times slower than on the original dict. That's on CPython. On PyPy, surprisingly, the conversion takes about 1 ms (2 times slower than CPython), and access is a little faster on the namedtuple. Creating the namedtuple takes most of the conversion time.
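These figures are specific to one machine. A rough way to measure on your own setup, using the standard timeit module, could look like this. The 20-key dict and the iteration count are arbitrary choices for the sketch:

```python
import timeit
from collections import namedtuple

def convert(dictionary):
    return namedtuple('GenericDict', dictionary.keys())(**dictionary)

# Arbitrary 20-key test dict: key0 .. key19.
d = {'key%d' % i: i for i in range(20)}
named = convert(d)

n = 200  # number of repetitions; increase for steadier numbers
convert_time = timeit.timeit(lambda: convert(d), number=n) / n
dict_access = timeit.timeit(lambda: d['key7'], number=n) / n
attr_access = timeit.timeit(lambda: named.key7, number=n) / n

print('convert:     %.2f us per call' % (convert_time * 1e6))
print('dict access: %.3f us per call' % (dict_access * 1e6))
print('attr access: %.3f us per call' % (attr_access * 1e6))
```

The lambda wrappers add their own call overhead, so treat the access timings as a comparison rather than absolute values.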
For a quick and dirty conversion of dict keys to object attributes, I use mock:
import mock
d = dict(a=1, b='b')
o = mock.Mock(**d)
assert d['a'] == o.a
assert d['b'] == o.b
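One thing to be aware of with this trick: a Mock creates attributes on demand, so a misspelled name does not fail. The sketch below contrasts that with types.SimpleNamespace from the standard library, which is a different technique, not part of the suggestion above. It uses unittest.mock, the standard-library version of mock on Python 3:

```python
from types import SimpleNamespace
from unittest import mock

d = dict(a=1, b='b')

m = mock.Mock(**d)
ns = SimpleNamespace(**d)

# A Mock invents attributes on demand, so a typo goes unnoticed.
typo_on_mock = m.aa  # returns a new Mock instead of raising

# SimpleNamespace only has the attributes it was given.
try:
    ns.aa
    typo_detected = False
except AttributeError:
    typo_detected = True
```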
This approach is likely to be error-prone because Python dictionaries aren't ordered (insertion order is only guaranteed from Python 3.7 onward), but a namedtuple is ordered. You could get different positions in the tuple for different fields from run to run.
For example, this program (when using Python 3.4.x) occasionally throws an error. Run it a few times to reproduce it.
import collections
Point = collections.namedtuple('Point', {'x':0, 'y':0})
p = Point(11, y=22) # instantiate with positional or keyword arguments
p[0] + p[1] # indexable like the plain tuple (11, 22)
The error looks like this:
Traceback (most recent call last):
  File "tupletest.py", line 5, in <module>
    p = Point(11, y=22) # instantiate with positional or keyword arguments
TypeError: __new__() got multiple values for argument 'y'
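One way to sidestep the ordering problem, sketched here under the assumption that alphabetical field order is acceptable, is to sort the keys when creating the class and to instantiate with keyword arguments only:

```python
from collections import namedtuple

fields = {'x': 0, 'y': 0}

# Sorting the keys makes the field order deterministic on any Python version.
Point = namedtuple('Point', sorted(fields))

# Keyword arguments do not depend on field positions at all.
p = Point(x=11, y=22)
total = p[0] + p[1]
```

This is also why the convert() function at the top, which instantiates with **dictionary, always puts each value in the right field, even though the position of a field may vary between runs on older Python versions.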
I added support for list values to suganthsundar's solution:
def convert(dictionary):
    for key, value in dictionary.items():
        if isinstance(value, dict):
            dictionary[key] = convert(value)
        if isinstance(value, list):
            dictionary[key] = [convert(i) for i in value]
    return namedtuple('GenericDict', dictionary.keys())(**dictionary)
The code above breaks if an item of the list is not a dictionary. The following modification should work for most cases:
from collections import namedtuple

def convert(obj):
    if isinstance(obj, dict):
        # Values are replaced in place, so the input dict is modified.
        for key, value in obj.items():  # use .iteritems() on Python 2
            obj[key] = convert(value)
        return namedtuple('GenericDict', obj.keys())(**obj)
    elif isinstance(obj, list):
        return [convert(item) for item in obj]
    else:
        return obj
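If the data starts out as JSON text, the same recursive effect can be had at parse time with the object_hook parameter of json.loads, which is called for every decoded JSON object, innermost first. This is a sketch, not a drop-in replacement: the function name as_namedtuple and the sample text are made up, and it assumes every key is a valid Python identifier.

```python
import json
from collections import namedtuple

def as_namedtuple(d):
    # Called by json.loads for each decoded JSON object (a dict).
    return namedtuple('GenericDict', d.keys())(**d)

# Sample JSON with a nested object and a list mixing objects and scalars.
text = '{"id": 1, "owner": {"name": "ann"}, "tags": [{"label": "x"}, 2]}'
obj = json.loads(text, object_hook=as_namedtuple)
```

Because the hook only ever sees JSON objects, list items that are not objects (such as the 2 above) pass through unchanged.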
Thanks for this!
@JinghongHuang The snippet you posted above performs badly because it creates a new GenericDict class for every nested dict. Creating classes is expensive, and if these classes are not reclaimed, memory use keeps growing, which amounts to a memory leak!
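One possible way to limit the number of classes created, sketched here as an assumption rather than a fix proposed in this thread, is to cache one class per distinct tuple of field names. The helper name _class_for is made up:

```python
from collections import namedtuple
from functools import lru_cache

@lru_cache(maxsize=None)
def _class_for(fields):
    # One GenericDict class per distinct tuple of field names.
    return namedtuple('GenericDict', fields)

def convert(obj):
    if isinstance(obj, dict):
        cls = _class_for(tuple(obj))
        # Build a new mapping instead of modifying the input in place.
        return cls(**{key: convert(value) for key, value in obj.items()})
    if isinstance(obj, list):
        return [convert(item) for item in obj]
    return obj

a = convert({'id': 1, 'tags': [{'name': 'x'}, {'name': 'y'}]})
same_class = type(a.tags[0]) is type(a.tags[1])
```

The cache is unbounded here, so data with many distinct key sets would still accumulate classes; passing a finite maxsize to lru_cache is one way to cap that.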
Anyone interested in this should also check out the dotmap package and the alternatives to it.
I wrote up this implementation [1] a little while ago and then stumbled upon this thread. In the namedtuple_fmt module, the deserialize function accepts a dictionary (e.g. from a json.load call) and converts it to an instance of the given namedtuple-deriving type. Likewise, serialize converts an instance of any namedtuple-deriving type into a dictionary-like object. These functions also work on any list-like type: List and Tuple are both handled.
E.g.
import json
from typing import NamedTuple, Sequence

from namedtuple_fmt import serialize, deserialize

X = NamedTuple('X', [('msg', str)])

json_str = """{"msg": "This is the first message"}"""
first_msg = deserialize(json.loads(json_str), X)
print(first_msg.msg)
print(deserialize(serialize(first_msg)) == X("This is the first message"))
print(deserialize(json.loads(json.dumps(serialize(first_msg)))) == X("This is the first message"))

json_str = """[{"msg": "This is the first message"},{"msg": "This is the 2nd message"}]"""
messages = deserialize(json.loads(json_str), Sequence[X])
print(f"{len(messages)} messages")
print('\n'.join(map(lambda x: x.msg, messages)))
Implementation note: There are explicit type checks for the Sequence types. It's important not to mess up the handling of str and tuple. A first draft of this idea incorrectly ran for _ in X to "test" whether something was list-like. Unfortunately, that also iterates over the characters in a str and the elements in a tuple. We want the tuple-iterating part, but not when it comes to a NamedTuple (or namedtuple)!
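The distinction described in this note can be sketched as follows. This is not the namedtuple_fmt implementation; the helper names is_namedtuple and is_list_like are hypothetical, and the _fields check is a common heuristic rather than an official API:

```python
from collections import namedtuple
from typing import NamedTuple

def is_namedtuple(obj):
    # Heuristic: namedtuple instances are tuples that carry a _fields attribute.
    return isinstance(obj, tuple) and hasattr(obj, '_fields')

def is_list_like(obj):
    # Lists and plain tuples count; str and namedtuple instances do not.
    return isinstance(obj, (list, tuple)) and not is_namedtuple(obj)

class X(NamedTuple):
    msg: str

P = namedtuple('P', 'x y')

checks = {
    'list': is_list_like([1, 2]),
    'tuple': is_list_like((1, 2)),
    'str': is_list_like('abc'),
    'NamedTuple': is_list_like(X('hi')),
    'namedtuple': is_list_like(P(1, 2)),
}
```

An explicit isinstance check avoids the trap of the iteration test, since str, tuple, and namedtuple instances are all iterable.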
[1] https://gist.github.com/malcolmgreaves/d71ae1f09075812e54d8ec54a5613616
GenericDict will have different fields: