@wesslen
Created February 8, 2018 17:58
Load raw JSON files to MongoDB
import glob
import json
import os

from pymongo import MongoClient

# Fill in hostname and port
HOST = "hostname"
PORT = 27017
client = MongoClient(HOST, PORT)

# Fill in dbname and colname
db = client.dbname
col = db["colname"]

# Directory containing newline-delimited JSON files (one document per line)
path = "/path/to/files"

# os.path.join supplies the separator that path + "*.json" would omit
for file in glob.glob(os.path.join(path, "*.json")):
    print(file)  # show which file is being loaded
    with open(file, "r") as f:
        for line in f:
            line = line.strip()
            if line:  # skip blank lines, which json.loads would reject
                col.insert_one(json.loads(line))
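
Inserting one document per call means one round trip to the server per line, which can be slow for large files. The following is a minimal sketch, assuming each file holds one JSON document per line as in the script above, of grouping parsed documents into batches that could be passed to pymongo's `insert_many`. The helper name `iter_batches` and the default batch size of 1000 are illustrative assumptions, not part of the original gist.

```python
import json
from typing import Iterable, Iterator, List


def iter_batches(lines: Iterable[str], batch_size: int = 1000) -> Iterator[List[dict]]:
    """Yield lists of parsed JSON documents, at most batch_size per list."""
    batch = []
    for line in lines:
        line = line.strip()
        if not line:  # skip blank lines
            continue
        batch.append(json.loads(line))
        if len(batch) >= batch_size:
            yield batch
            batch = []
    if batch:  # emit the final, possibly smaller, batch
        yield batch


# Hypothetical usage with the `col` collection defined in the script:
# with open(file, "r") as f:
#     for batch in iter_batches(f):
#         col.insert_many(batch)
```

Because the helper only reads lines and parses JSON, it can be tried on a plain list of strings before pointing it at a live collection.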