Last active
December 9, 2020 15:50
Sample code for python rtree bulk loading of 10,000,000 points and a nearest neighbor query.
from rtree import index
from random import random
from datetime import datetime

timer = datetime.now()

# Create 10,000,000 random numbers between 0 and 1
rands = [random() for i in range(10000000)]

# Function required to bulk load the random points into the index
# Looping over and calling insert is orders of magnitude slower than this method
def generator_function():
    for i, coord in enumerate(rands):
        # With rtree's default interleaved=True the tuple is (minx, miny, maxx, maxy),
        # so this is a zero-area box at the point (coord, coord+1)
        yield (i, (coord, coord+1, coord, coord+1), coord)

# Add points
tree = index.Index(generator_function())
print((datetime.now() - timer).seconds)  # Seconds taken to create and add the points
print(list(tree.nearest((rands[50], rands[50], rands[50], rands[50]), 3)))
print((datetime.now() - timer).seconds)  # Total seconds elapsed, including the query for the nearest 3 points
@Tasneem-gh That would depend on whether your rtree library supports a generator as input. If it does, you could read a text file of coordinates line by line with one generator, wrapped in a second generator that processes each line and yields the result. Here is some info that may be helpful: https://realpython.com/introduction-to-python-generators/
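A rough sketch of that generator-inside-a-generator idea, in case it helps. The file format (one `x,y` pair per line), the `#` comment convention, and the names `read_lines` and `index_entries` are all assumptions made up for illustration; an in-memory `io.StringIO` stands in for a real file so the snippet runs on its own.

```python
import io

def read_lines(handle):
    """Yield non-empty, non-comment lines from an open text file, one at a time."""
    for line in handle:
        line = line.strip()
        if line and not line.startswith("#"):
            yield line

def index_entries(lines):
    """Turn 'x,y' lines into (id, (minx, miny, maxx, maxy), obj) tuples."""
    for i, line in enumerate(lines):
        x, y = (float(v) for v in line.split(","))
        # A point is a zero-area box: the min and max corners coincide.
        yield (i, (x, y, x, y), None)

# Stand-in for open("points.txt"); the file name and format are hypothetical.
sample = io.StringIO("0.1,0.2\n\n0.5,0.9\n")
entries = list(index_entries(read_lines(sample)))
print(entries)
```

With the python rtree package, the outer generator could be passed straight to the constructor the same way the gist does, e.g. `index.Index(index_entries(read_lines(open("points.txt"))))`, so the whole file is never held in memory at once. Whether a different rtree library accepts a generator like this is something you would need to check in its documentation.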
@acrosby Thanks for the hint. Will try that
@acrosby
Thank you for replying
Actually, I am using a different rtree library to build the index, but I thought of using the generator function to speed up the build, because it takes around an hour to index 1 million data records.
So serializing the data lets me read it from a file, and I can use the generator function along with that?