-
-
Save duhaime/3e781194ebaccc28351a5d53989caa70 to your computer and use it in GitHub Desktop.
import numpy as np | |
import umap | |
# generate some points in 4D | |
points = np.random.random((2000, 4)) | |
# project the points to 2D to get points proximate to one another | |
points = umap.UMAP().fit_transform(points) | |
from lloyd.lloyd import Field | |
from scipy.spatial import voronoi_plot_2d | |
import matplotlib.pyplot as plt | |
field = Field(points=points) | |
# plot the points with no relaxation relaxation | |
plt = voronoi_plot_2d(field.voronoi, show_vertices=False, line_colors='orange', line_width=2, line_alpha=0.6, point_size=2) | |
plt.show() | |
# relax the points several times, and show the result of each relaxation | |
for i in range(6): | |
field.relax_points() | |
plt = voronoi_plot_2d(field.voronoi, show_vertices=False, line_colors='orange', line_width=2, line_alpha=0.6, point_size=2) | |
plt.show() |
from scipy.spatial import Voronoi | |
from scipy.spatial import voronoi_plot_2d | |
import numpy as np | |
import sys | |
class Field(): | |
''' | |
Create a Voronoi map that can be used to run Lloyd relaxation on an array of 2D points. | |
For background, see: https://en.wikipedia.org/wiki/Lloyd%27s_algorithm | |
''' | |
def __init__(self, points=None): | |
''' | |
Store the points and bounding box of the points to which Lloyd relaxation will be applied | |
@param [arr] points: a numpy array with shape n, 2, where n is number of points | |
''' | |
if not len(points): | |
raise Exception('points should be a numpy array with shape n,2') | |
x = points[:, 0] | |
y = points[:, 0] | |
self.bounding_box = [min(x), max(x), min(y), max(y)] | |
self.points = points | |
self.build_voronoi() | |
def build_voronoi(self): | |
''' | |
Build a Voronoi map from self.points. For background on self.voronoi attrs, see: | |
https://docs.scipy.org/doc/scipy-0.18.1/reference/generated/scipy.spatial.Voronoi.html | |
''' | |
eps = sys.float_info.epsilon | |
self.voronoi = Voronoi(self.points) | |
self.filtered_regions = [] # list of regions with vertices inside Voronoi map | |
for region in self.voronoi.regions: | |
inside_map = True # is this region inside the Voronoi map? | |
for index in region: # index = the idx of a vertex in the current region | |
# check if index is inside Voronoi map (indices == -1 are outside map) | |
if index == -1: | |
inside_map = False | |
break | |
# check if the current coordinate is in the Voronoi map's bounding box | |
else: | |
coords = self.voronoi.vertices[index] | |
if not (self.bounding_box[0] - eps <= coords[0] and | |
self.bounding_box[1] + eps >= coords[0] and | |
self.bounding_box[2] - eps <= coords[1] and | |
self.bounding_box[3] + eps >= coords[1]): | |
inside_map = False | |
break | |
# store hte region if it has vertices and is inside Voronoi map | |
if region != [] and inside_map: | |
self.filtered_regions.append(region) | |
def find_centroid(self, vertices): | |
''' | |
Find the centroid of a Voroni region described by `vertices`, and return a | |
np array with the x and y coords of that centroid. | |
The equation for the method used here to find the centroid of a 2D polygon | |
is given here: https://en.wikipedia.org/wiki/Centroid#Of_a_polygon | |
@params: np.array `vertices` a numpy array with shape n,2 | |
@returns np.array a numpy array that defines the x, y coords | |
of the centroid described by `vertices` | |
''' | |
area = 0 | |
centroid_x = 0 | |
centroid_y = 0 | |
for i in range(len(vertices)-1): | |
step = (vertices[i, 0] * vertices[i+1, 1]) - (vertices[i+1, 0] * vertices[i, 1]) | |
area += step | |
centroid_x += (vertices[i, 0] + vertices[i+1, 0]) * step | |
centroid_y += (vertices[i, 1] + vertices[i+1, 1]) * step | |
area /= 2 | |
centroid_x = (1.0/(6.0*area)) * centroid_x | |
centroid_y = (1.0/(6.0*area)) * centroid_y | |
return np.array([[centroid_x, centroid_y]]) | |
def relax_points(self): | |
''' | |
Moves each point to the centroid of its cell in the Voronoi map to "relax" | |
the points (i.e. jitter them so as to spread them out within the space). | |
''' | |
centroids = [] | |
for region in self.filtered_regions: | |
vertices = self.voronoi.vertices[region + [region[0]], :] | |
centroid = self.find_centroid(vertices) # get the centroid of these verts | |
centroids.append(list(centroid[0])) | |
self.points = centroids # store the centroids as the new point positions | |
self.build_voronoi() # rebuild the voronoi map given new point positions |
Thanks @LunarLite! I think the number of points decreasing is due to the fact that the input dataset has duplicate points.
One way around that is to add a little random noise to points to "deduplicate" the inputs.
I rolled this into a package https://github.com/duhaime/lloyd that will handle the epsilon jitter for you and ensure the number of outputs is identical to the number of inputs. Strictly if you're interested!
Hmm, I'm pretty sure it's not due to duplicate points in the original input dataset.
What I did was this:
- Generated 500 random points using numpy
- Instantiated a new Field object, entered the 500 generated points
- Applied 10 relaxation loops
- Logged amount of points on line 92, simply by using
print(len(centroids))
With every successive relaxation applied, the number of points goes down. I can see how the initial dataset might have duplicate points, but successive relaxations should not be able to create more duplicate points...
From what I could tell, through debugging, is that the section between line 34 and line 55 excludes regions based on certain conditions.
Centroids are afterwards created for each region---- but if there are less regions... there are also less centroids (ergo, less points).
This is, as far as I can tell, the only way/reason the amount of points keeps diminishing.
Please correct me if I'm wrong/missed anything though!
(Taking a look at the link you shared as well now, hope to find whatever it is I'm missing!)
Ah, duplicate points are one potential source of point reduction. IIRC, certain qhull options can also cause a reduction in the number of points. In the lloyd
package the Voronoi
constructor is called like this:
Voronoi(points, qhull_options='Qbb Qc Qx')
Perhaps the same qhull options help here? It's been a little bit since I thought about this problem space, but I hope this helps!
Yeah, went through the lloyd package, noticed it applied qhull differently there also, so that was some good insight for me, thanks!
Using your lib now as it's rather nice to use.
Few things regarding the lloyd package;
distribution version goes up to 0.0.9 (this still includes print(' * jittering inputs')
which was removed in 0.0.10)
When drawing a voronoi based on the points using scipy.spacial (as documented), some vertices appear to be off-screen (ergo, they have an index of -1)...
I'm still looking into fixing these vertices, just sharing what I'm finding.
TL,DR;
Thanks man, great help.
Ah, interesting! If you have a sample dataset where the vertices go offscreen, I'm happy to take a look.
I seem to remember there were "points at infinity" introduced into the vertex collection that were just to create either the Delaunay triangulation or the Voronoi tesselation (I believe the latter). Those points at infinity were not part of the user vertex set, but were instead meant to be removed.
I'm pretty foggy on this though to be honest! In any event, I'm happy to take a look if you see strange / inexplicable behavior with a dataset...
Hi again @duhaime!
So, I didn't find a good sample dataset, instead I ran the code from the Jupyter Notebook in your gist, and my final plot ended up being vastly different.
(Same numpy seed, using your lib, exact copy paste of the example code) My final plot is here..
Regardless, you can see that there's a rather large amount of dashed lines moving outside of the plot.
The vertices are presumable outside of plot range, from what I could find at an infinite range?
I'm guessing the scipy voronoi lib might have been updated or something, but I assume these would be points at infinity
that you mentioned.
Ah thanks very much for the follow up and sorry I'm just getting back to you now @LunarLite. I got a little distracted by some other things.
You should find that the notebook example produces a set of vertices that's of the same shape as your input vertices. Is that not the case? You're quite right--those dashed lines use points at infinity to create a polygon that surrounds a point. (I don't see any way other than points at infinity that this goal could be achieved in the general case.)
Those points at infinity will not be returned when you call field.get_points()
though--that call will only return your input points in their new positions.
Does that help clarify the behavior?
Also possibly of interest is https://github.com/YaleDHLab/pointgrid which uses a simple grid quantization approach to avoid overlapping points...
Unless I am mistaken about scipy.spatial's Voronoi representation, every point on the convex hull of the point cloud will be associated with a region that has a "-1" entry, because those Voronoi regions span to infinity.
Which means that every iteration shaves off all points on the convex hull, which isn't desired behaviour. There needs to be more qualified handling of the infinite voronoi cells than just throwing them all out.
@AlienAtSystem this setup should add 4 points outside the user-provided data distribution. Those 4 points should form the convex hull around the distribution, so only they should be culled in the end.
Are you finding that points in your data are disappearing? If so it would be great if you could share a small dataset that illustrates the problem!
Re-reading the code above, I can't see where extra points are added. It happens neither in the constructor nor in the Voronoi call.
Oh, it should be https://github.com/duhaime/lloyd/blob/master/lloyd/lloyd.py#L28. I think this gist was just really early research in this topic...
Found it very helpful and useful to look at and learn from.
Ran into an issue, which is that after several times of having executed
build_voronoi
, the amount of points goes down. Which makes sense, as the amount of regions is also culled...Might want to look into that, if you ever feel like it.