#!/usr/local/bin/python3
'''
Convert a pkl file into a json file
'''
import sys
import os
import pickle
import json


def convert_dict_to_json(file_path):
    # The output is written next to the input file, as <file_path>.json.
    # UTF-8 is set explicitly because ensure_ascii=False may emit non-ASCII text.
    with open(file_path, 'rb') as fpkl, open('%s.json' % file_path, 'w', encoding='utf-8') as fjson:
        data = pickle.load(fpkl)
        json.dump(data, fjson, ensure_ascii=False, sort_keys=True, indent=4)


def main():
    # Check the argument count first so a missing argument prints the usage text
    # instead of raising IndexError.
    if len(sys.argv) > 1 and os.path.isfile(sys.argv[1]):
        file_path = sys.argv[1]
        print("Processing %s ..." % file_path)
        convert_dict_to_json(file_path)
    else:
        print("Usage: %s abs_file_path" % (__file__))


if __name__ == '__main__':
    main()
This will not work if the dict has tuples.
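To narrow down which cases actually fail, here is a minimal sketch that uses only the standard `json` module. The helper name `try_dumps` is made up for illustration. It suggests that tuples used as values are written as JSON arrays, while tuple keys and sets raise `TypeError`.

```python
import json


def try_dumps(obj):
    """Hypothetical helper: return the JSON text, or None if json.dumps raises TypeError."""
    try:
        return json.dumps(obj)
    except TypeError:
        return None


as_value = try_dumps({"point": (1, 2)})  # tuple as a dict value
as_key = try_dumps({(1, 2): "point"})    # tuple as a dict key
as_set = try_dumps({(1, 2), (3, 4)})     # set of tuples

print(as_value)  # {"point": [1, 2]}
print(as_key)    # None
print(as_set)    # None
```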
@jcopps, if the dict or set contains tuples, the script needs some extra code that traverses each container, finds the tuples, and converts them to lists before calling json.dump.
For example, assume that one of the records in the pickle file is the following set:
record = {(1,2,3,3), (1,2,3,4)}
type(record) # set
Trying to convert it to JSON with json.dump throws an error:
import json
from io import StringIO
io = StringIO()
json.dump(record, io, ensure_ascii=False, sort_keys=True, indent=0)
TypeError: {(1, 2, 3, 3), (1, 2, 3, 4)} is not JSON serializable
To fix that, first convert the set and each tuple in it to a list:
record = list(record) # [(1, 2, 3, 3), (1, 2, 3, 4)]
record_index = 0
while record_index < len(record):
    record[record_index] = list(record[record_index])
    record_index += 1
print(record) # [[1, 2, 3, 3], [1, 2, 3, 4]]
Then call json.dump again (it writes to io and returns None, so there is nothing to assign):
io = StringIO()
json.dump(record, io, ensure_ascii=False, sort_keys=True, indent=0)
print(io.getvalue())
"""
[
[
1,
2,
3,
3
],
[
1,
2,
3,
4
]
]
"""
It works now :).
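The loop above handles one flat set of tuples. For nested data, the traversal idea could be sketched as a recursive function. This is only an illustration: the name `to_jsonable` is hypothetical, turning tuple keys into strings with `str()` is an arbitrary choice, and sorting the converted set assumes its items can be compared with each other.

```python
import json


def to_jsonable(obj):
    """Hypothetical helper: recursively replace sets and tuples with lists."""
    if isinstance(obj, dict):
        # Tuple keys have no JSON equivalent; str() is one arbitrary choice.
        return {
            (str(key) if isinstance(key, tuple) else key): to_jsonable(value)
            for key, value in obj.items()
        }
    if isinstance(obj, (list, tuple)):
        return [to_jsonable(item) for item in obj]
    if isinstance(obj, (set, frozenset)):
        # Sets are unordered, so sort for a stable result.
        # Assumption: the converted items are mutually comparable.
        return sorted(to_jsonable(item) for item in obj)
    return obj


record = {(1, 2, 3, 3), (1, 2, 3, 4)}
converted = to_jsonable({"records": record, (0, 1): "tuple key"})
text = json.dumps(converted, sort_keys=True)
print(text)
# {"(0, 1)": "tuple key", "records": [[1, 2, 3, 3], [1, 2, 3, 4]]}
```

In the gist's script, a function like this would be applied to `data` between `pickle.load` and `json.dump`.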
Yes, I agree with that. But the JSON can then no longer be converted back into the original dictionary, because the tuple and set types are lost.
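If reversibility matters, one possible workaround is to tag the converted values so a decoder can rebuild them. The sketch below is an assumption-laden illustration, not part of the gist: the tag names `__tuple__` and `__set__` and the functions `encode` and `decode_hook` are invented here, dict keys are assumed to be strings already, and a frozenset would come back as a plain set. The data is pre-processed with `encode` rather than through the `default=` argument of `json.dumps`, because `json` serializes tuples as arrays on its own and never passes them to `default`.

```python
import json


def encode(obj):
    """Hypothetical helper: wrap tuples and sets in tagged dicts so the type survives."""
    if isinstance(obj, tuple):
        return {"__tuple__": [encode(item) for item in obj]}
    if isinstance(obj, (set, frozenset)):
        return {"__set__": [encode(item) for item in obj]}
    if isinstance(obj, list):
        return [encode(item) for item in obj]
    if isinstance(obj, dict):
        # Assumption: keys are already strings; tuple keys would need their own scheme.
        return {key: encode(value) for key, value in obj.items()}
    return obj


def decode_hook(dct):
    """object_hook for json.loads: turn tagged dicts back into tuples and sets."""
    if set(dct) == {"__tuple__"}:
        return tuple(dct["__tuple__"])
    if set(dct) == {"__set__"}:
        return set(dct["__set__"])
    return dct


original = {"records": {(1, 2, 3, 3), (1, 2, 3, 4)}, "pair": (1, "a")}
text = json.dumps(encode(original))
restored = json.loads(text, object_hook=decode_hook)
print(restored == original)  # True
```

The trade-off is that the resulting JSON is only meaningful to a reader that knows the tagging scheme.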
Just wanted to say thanks for this. Helped a lot with ensuring my pickled data was as intended.