@isentropic
Last active January 7, 2022 13:50
Tensorflow histogram2d (Simple implementation)
import tensorflow as tf


@tf.function
def get2dHistogram(x, y,
                   value_range,
                   nbins=100,
                   dtype=tf.dtypes.int32):
    """
    Bins x, y coordinates of points onto a simple square 2d histogram.

    Given the tensors x and y:
      x: x coordinates of points
      y: y coordinates of points
    this operation returns a rank 2 `Tensor` of counts: entry [i, j] is the
    number of points whose y falls in bin i and whose x falls in bin j.
    The bins are equal width and determined by the arguments
    `value_range` and `nbins`.

    Args:
      x: Numeric `Tensor`.
      y: Numeric `Tensor`.
      value_range[0]: limits for x
      value_range[1]: limits for y
      nbins: Scalar `int32 Tensor`. Number of histogram bins per axis.
      dtype: dtype of the intermediate y-bin indices. The returned
        histogram uses the default dtype of `tf.histogram_fixed_width`.

    Example:
      N = 1000
      xs = tf.random.normal([N])
      ys = tf.random.normal([N])
      get2dHistogram(xs, ys, ([-5.0, 5.0], [-5.0, 5.0]), 50)
    """
    x_range = value_range[0]
    y_range = value_range[1]

    # Assign every point to a y bin, then histogram the x values of each y bin.
    histy_bins = tf.histogram_fixed_width_bins(y, y_range, nbins=nbins, dtype=dtype)
    H = tf.map_fn(lambda i: tf.histogram_fixed_width(x[histy_bins == i], x_range, nbins=nbins),
                  tf.range(nbins))
    return H  # Matrix: rows are y bins, columns are x bins
@CatherineTaelman

CatherineTaelman commented May 4, 2020

Hello,
I am trying to use your method in the following way:

x = tf.random.uniform((28,28))
y = tf.random.uniform((28,28))
H = get2dHistogram(x, y, value_range=[[0.0,1.0], [0.0,1.0]], nbins=100, dtype=tf.dtypes.float32)

However, when I run this, I get the following error:
"
ValueError: in converted code:

C:\Users\s143239\code\nmi_tensorflow.py:93 get2dHistogram
    tf.range(nbins))
C:\Users\s143239\Anaconda\lib\site-packages\tensorflow_core\python\ops\map_fn.py:268 map_fn
    maximum_iterations=n)
C:\Users\s143239\Anaconda\lib\site-packages\tensorflow_core\python\ops\control_flow_ops.py:2675 while_loop
    back_prop=back_prop)
C:\Users\s143239\Anaconda\lib\site-packages\tensorflow_core\python\ops\while_v2.py:194 while_loop
    add_control_dependencies=add_control_dependencies)
C:\Users\s143239\Anaconda\lib\site-packages\tensorflow_core\python\framework\func_graph.py:978 func_graph_from_py_func
    func_outputs = python_func(*func_args, **func_kwargs)
C:\Users\s143239\Anaconda\lib\site-packages\tensorflow_core\python\ops\while_v2.py:172 wrapped_body
    outputs = body(*_pack_sequence_as(orig_loop_vars, args))
C:\Users\s143239\Anaconda\lib\site-packages\tensorflow_core\python\ops\map_fn.py:257 compute
    packed_fn_values = fn(packed_values)
C:\Users\s143239\OneDrive - TU Eindhoven\Docs\JAAR 6\stage\code\GKMI\GKMI\nmi_tensorflow.py:92 <lambda>
    H = tf.map_fn(lambda i: tf.histogram_fixed_width(x[histy_bins == i], x_range, nbins=nbins),
C:\Users\s143239\Anaconda\lib\site-packages\tensorflow_core\python\ops\array_ops.py:821 _slice_helper
    return boolean_mask(tensor=tensor, mask=slice_spec)
C:\Users\s143239\Anaconda\lib\site-packages\tensorflow_core\python\ops\array_ops.py:1583 boolean_mask
    raise ValueError("mask cannot be scalar.")

ValueError: mask cannot be scalar. "

I'm using Tensorflow 2.1 and Python 3.7.

Could you please provide a working example of your method? Thank you very much in advance!
Kind regards, Catherine
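
The traceback ends in `boolean_mask` rejecting a scalar mask, which suggests that `histy_bins == i` did not evaluate to an elementwise boolean tensor in that environment. The thread does not confirm the cause, so the following is only a sketch under that assumption: it calls `tf.equal` and `tf.boolean_mask` explicitly instead of relying on `==` and bracket indexing, and it keeps the bin indices as `int32` so they match the dtype of `tf.range`. The name `hist2d_explicit_mask` is hypothetical and not part of the gist.

```python
import tensorflow as tf


def hist2d_explicit_mask(x, y, value_range, nbins=100):
    """Count points per (y bin, x bin) cell using an explicit boolean mask."""
    x_range, y_range = value_range[0], value_range[1]

    # int32 bin index of every y value (int32 is the default dtype).
    y_bins = tf.histogram_fixed_width_bins(y, y_range, nbins=nbins)

    def row(i):
        # Elementwise comparison, independent of how `==` is overloaded.
        mask = tf.equal(y_bins, i)
        x_in_bin = tf.boolean_mask(x, mask)
        return tf.histogram_fixed_width(x_in_bin, x_range, nbins=nbins)

    # One row of x counts per y bin.
    return tf.map_fn(row, tf.range(nbins))


x = tf.random.uniform((28, 28))
y = tf.random.uniform((28, 28))
H = hist2d_explicit_mask(x, y, value_range=[[0.0, 1.0], [0.0, 1.0]], nbins=100)
print(H.shape)  # (100, 100)
```

Because `tf.boolean_mask` accepts a mask of the same shape as its input and returns a flat result, this sketch takes the 28x28 tensors directly.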

@isentropic
Author

isentropic commented May 4, 2020

Bins x, y coordinates of points onto simple square 2d histogram

Given the tensor x and y:
**x: x coordinates of points
y: y coordinates of points**
this operation returns a rank 2 `Tensor` of counts, one per (y bin, x bin)
cell, giving the number of points that fall into that cell. The bins are equal width and
determined by the arguments `value_range` and `nbins`.

That is, your inputs x and y need to be the coordinates of your points, and both need to be rank 1 tensors.
If you have N points, x and y are each shape (N,) tensors representing the coordinates of those points.
Let me know if it works out for you.
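
As a minimal sketch of what "rank 1" means here: a 28x28 tensor like the one in the question can be flattened with `tf.reshape(t, [-1])`, which gives one coordinate per element. The random tensors below are placeholders for real data, and the variable names are illustrative only.

```python
import tensorflow as tf

# Placeholder 28x28 inputs (random values, for illustration only).
img_a = tf.random.uniform((28, 28))
img_b = tf.random.uniform((28, 28))

# Flatten each to a rank 1 tensor: 28 * 28 = 784 values per tensor.
x = tf.reshape(img_a, [-1])
y = tf.reshape(img_b, [-1])

print(x.shape, y.shape)  # (784,) (784,)
```

Element k of `x` and element k of `y` then form the k-th (x, y) point passed to the histogram.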

@CatherineTaelman

CatherineTaelman commented May 5, 2020

Thank you for your answer! I now flattened my input to get both x and y in (1,N) shape (I want to use 2 MNIST images as input, that's why I used the 28x28 input first). So now I have x and y both of size (1,784), however, it's still not working. I get the same error as above: "mask cannot be scalar".

My code:
x = tf.random.uniform((28,28))
x = tf.reshape(x, [784, -1])
y = tf.random.uniform((28,28))
y = tf.reshape(y, [784, -1])
H = get2dHistogram(x, y, value_range=[[0.0,1.0], [0.0,1.0]], nbins=100, dtype=tf.dtypes.float32)

Do you know what I am doing wrong?

@isentropic
Author

isentropic commented May 5, 2020

N = 1000
xs =  tf.random.normal([N])
ys =  tf.random.normal([N])
hist = get2dHistogram(xs, ys, ([-5.0, 5.0], [-5.0, 5.0]),  50)

returns

<tf.Tensor: shape=(50, 50), dtype=int32, numpy=
array([[0, 0, 0, ..., 0, 0, 0],
       [0, 0, 0, ..., 0, 0, 0],
       [0, 0, 0, ..., 0, 0, 0],
       ...,
       [0, 0, 0, ..., 0, 0, 0],
       [0, 0, 0, ..., 0, 0, 0],
       [0, 0, 0, ..., 0, 0, 0]], dtype=int32)>

We can plot it: plt.imshow(hist.numpy())

@isentropic
Author

x = tf.random.uniform((28,28))
x = tf.reshape(x, [784, -1])
y = tf.random.uniform((28,28))
y = tf.reshape(y, [784, -1])
H = get2dHistogram(x, y, value_range=[[0.0,1.0], [0.0,1.0]], nbins=100, dtype=tf.dtypes.float32)

It works for me, with output:

<tf.Tensor: shape=(100, 100), dtype=int32, numpy=
array([[0, 0, 0, ..., 0, 1, 0],
       [0, 0, 0, ..., 0, 0, 0],
       [0, 0, 0, ..., 0, 0, 0],
       ...,
       [0, 0, 0, ..., 0, 0, 0],
       [0, 0, 0, ..., 0, 0, 0],
       [0, 0, 0, ..., 0, 0, 0]], dtype=int32)>

@isentropic
Author

isentropic commented May 5, 2020

My TF version is '2.2.0-rc4' with Python 3.8, but I don't think the version is the cause: this snippet also worked with TF 2.0.

@CatherineTaelman

Thank you for the example and extra information! I downgraded TF from 2.1 to 2.0 and now it works.

@isentropic
Author

I think the TF version is irrelevant here; it works with 2.2 as well.

@LucasKirsten

How could I change the code to add a weights argument, as in the NumPy implementation?

So far I have tried the following, but it does not return the correct values for the histogram:
`def get2dHistogram(x, y, weights,
                   value_range,
                   nbins=100,
                   dtype=tf.dtypes.int32):

    x_range = value_range[0]
    y_range = value_range[1]

    x = tf.histogram_fixed_width_bins(x, y_range, nbins=tf.size(x), dtype=dtype)
    x = tf.math.bincount(x, weights=weights, minlength=tf.size(y))
    y = tf.histogram_fixed_width_bins(y, y_range, nbins=tf.size(y), dtype=dtype)
    y = tf.math.bincount(y, weights=weights, minlength=tf.size(y))

    histy_bins = tf.histogram_fixed_width_bins(y, y_range, nbins=nbins, dtype=dtype)

    H = tf.map_fn(lambda i: tf.histogram_fixed_width(x[histy_bins == i], x_range, nbins=nbins), tf.range(nbins))
    return H # Matrix!`

@isentropic
Author

isentropic commented Jan 7, 2022

Maybe you could multiply the resulting histogram by the weights?
...
Edit: this won't work, I see that now.
If the weights aren't that important, perhaps you could manually duplicate some of the points (add multiplicity)?

@LucasKirsten

LucasKirsten commented Jan 7, 2022

I managed to make it work using the tensorflow-probability package for the histogram implementation. The final version is below:

`import tensorflow as tf
import tensorflow_probability as tfp

def get2dHistogram(x, y, weights,
                   value_range,
                   nbins=100,
                   dtype=tf.dtypes.int32):

    # nbins + 1 equally spaced edges along x; clip x so every point lands inside them
    x_range = tf.linspace(value_range[0][0], value_range[0][1], num=nbins+1)
    x = tf.clip_by_value(x, value_range[0][0], value_range[0][1])

    # y-bin index of every point
    histy_bins = tf.histogram_fixed_width_bins(y, value_range[1], nbins=nbins, dtype=dtype)

    # One weighted x histogram per y bin
    hists = []
    for i in range(nbins):
        _x = x[histy_bins == i]
        _w = weights[histy_bins == i]
        hist = tfp.stats.histogram(_x, edges=x_range, weights=_w)
        hists.append(hist)

    return tf.stack(hists, axis=0)`
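
For comparison, a weighted 2D histogram can also be sketched without tensorflow-probability by combining the per-axis bin indices into one flat index and summing the weights with `tf.math.bincount`. This is a different technique from the loop above, not a restatement of it. The name `weighted_hist2d` is hypothetical, and the sketch assumes `x`, `y`, and `weights` are float tensors of the same shape. Points outside `value_range` end up in the edge bins, because that is how `tf.histogram_fixed_width_bins` treats them.

```python
import tensorflow as tf


def weighted_hist2d(x, y, weights, value_range, nbins=100):
    """Sum of weights per (y bin, x bin) cell, computed with one bincount."""
    # Flatten so that element k of each tensor describes the same point.
    x = tf.reshape(x, [-1])
    y = tf.reshape(y, [-1])
    weights = tf.reshape(weights, [-1])

    # int32 bin index of every point along each axis.
    x_bins = tf.histogram_fixed_width_bins(x, value_range[0], nbins=nbins)
    y_bins = tf.histogram_fixed_width_bins(y, value_range[1], nbins=nbins)

    # Row-major flat cell index: rows are y bins, columns are x bins.
    flat = y_bins * nbins + x_bins

    # Sum the weights that land in each of the nbins * nbins cells.
    sums = tf.math.bincount(flat, weights=weights,
                            minlength=nbins * nbins,
                            maxlength=nbins * nbins)
    return tf.reshape(sums, [nbins, nbins])


xs = tf.constant([0.1, 0.1, 0.9])
ys = tf.constant([0.1, 0.1, 0.9])
ws = tf.constant([1.0, 2.0, 0.5])
H = weighted_hist2d(xs, ys, ws, ([0.0, 1.0], [0.0, 1.0]), nbins=2)
print(H.numpy().tolist())  # [[3.0, 0.0], [0.0, 0.5]]
```

With all weights equal to 1.0 the result holds plain counts, as floats rather than integers.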
