Last active
July 26, 2020 15:28
How to sample from a log-uniform distribution.
"""
How we might sample from a log-uniform distribution.
https://stats.stackexchange.com/questions/155552/what-does-log-uniformly-distribution-mean

Only run one of these three cases at a time; otherwise the histograms are drawn
onto the same figure. Run with these versions:

matplotlib 3.2.1
numpy 1.18.3
"""
import numpy as np
import matplotlib.pyplot as plt
np.set_printoptions(suppress=True, linewidth=200, edgeitems=10)

low = 0.01
high = 0.7
size = 10000
nb_bins = 50

# Case 1: plain uniform samples on [low, high).
if False:
    data = np.random.uniform(low=low, high=high, size=size)
    count, bins, ignored = plt.hist(data, bins=nb_bins, align='mid')
    plt.title('Uniform({}, {})'.format(low, high))
    plt.xlabel('Epsilon')
    plt.savefig('distr_uniform.png')

# Case 2: uniform samples in log space, left in log space (not exponentiated).
if False:
    data = np.random.uniform(low=np.log(low), high=np.log(high), size=size)
    count, bins, ignored = plt.hist(data, bins=nb_bins, align='mid')
    plt.title('Uniform(log({}), log({}))'.format(low, high))
    plt.xlabel('Epsilon')
    plt.savefig('distr_uniform_log.png')

# nb_classes is the number of bin edges (the vertical lines in the plot),
# i.e. the number of intervals plus one.
nb_classes = 5 + 1

# Case 3: the actual log-uniform distribution, exp(Uniform(log(low), log(high))).
if True:
    data = np.random.uniform(low=np.log(low), high=np.log(high), size=size)
    # Bin edges are evenly spaced in log space.
    discretized = np.linspace(np.log(low), np.log(high), num=nb_classes)
    data = np.exp(data)
    count, bins, ignored = plt.hist(data, bins=nb_bins, align='mid')
    plt.title('exp( Uniform(log({}), log({})) )'.format(low, high))
    plt.xlabel('Epsilon')
    plt.savefig('distr_uniform_log_true.png')

    # Now let's add discretized ranges.
    print('Discretized bounds (len {}) for epsilons:\nLog: {}\nNormal: {}'.format(
            len(discretized), discretized, np.exp(discretized)))
    for idx, item in enumerate(discretized):
        plt.axvline(x=np.exp(item), color='black')
        if idx < len(discretized) - 1:
            # Count the samples that fall in the interval [start, end).
            start = np.exp(discretized[idx])
            end = np.exp(discretized[idx+1])
            count = np.sum( (start <= data) & (data < end) )
            print('{:.3f} <= x < {:.3f} count: {}'.format(start, end, count))
    plt.savefig('distr_uniform_log_true_bounds.png')
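The third case in the script boils down to exponentiating uniform draws taken in log space. A minimal sketch of that step as a reusable function is given here; the name `sample_log_uniform`, the seeded `np.random.default_rng` generator, and the median check are assumptions added for illustration and are not part of the original script.

```python
import numpy as np


def sample_log_uniform(low, high, size, rng=None):
    """Draw samples whose logarithm is uniform on [log(low), log(high)).

    Hypothetical helper for illustration; not part of the original script.
    """
    if not (0 < low < high):
        raise ValueError("need 0 < low < high")
    rng = np.random.default_rng() if rng is None else rng
    # Sample uniformly in log space, then map back with exp.
    return np.exp(rng.uniform(np.log(low), np.log(high), size=size))


rng = np.random.default_rng(0)  # fixed seed so repeated runs give the same samples
samples = sample_log_uniform(0.01, 0.7, size=10000, rng=rng)

# All samples stay inside the requested range.
print('min, max:', samples.min(), samples.max())

# The median of a log-uniform variable is the geometric mean of the bounds,
# so the two numbers printed here should be close.
print('median, sqrt(low * high):', np.median(samples), np.sqrt(0.01 * 0.7))
```

Passing a generator explicitly keeps the sampling reproducible without touching NumPy's global random state, which the script's `np.random.uniform` calls rely on.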
(July 26) Now log(0.01) to log(0.7) with discretized bins.
Here is a plot which also has nb_classes=5+1 (because nb_classes is really the number of vertical ticks).
For classes, I get:
If it's nb_classes=10+1, then we get:
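The printed counts from the original runs are not reproduced here. As a rough guide to what to expect, the sketch below recomputes the per-interval counts for 5 and 10 intervals. Because the bin edges are evenly spaced in log space, each interval should hold roughly `size / nb_intervals` samples. The fixed seed, the `counts_by_n` dictionary, and the use of `np.geomspace` and `np.histogram` in place of the script's explicit loop are assumptions made for this illustration.

```python
import numpy as np

low, high, size = 0.01, 0.7, 10000
rng = np.random.default_rng(0)  # fixed seed, chosen only for reproducibility
data = np.exp(rng.uniform(np.log(low), np.log(high), size=size))

counts_by_n = {}
for nb_intervals in (5, 10):
    # nb_intervals + 1 edges, evenly spaced in log space; this matches
    # np.exp(np.linspace(np.log(low), np.log(high), num=nb_intervals + 1)).
    edges = np.geomspace(low, high, num=nb_intervals + 1)
    counts, _ = np.histogram(data, bins=edges)
    counts_by_n[nb_intervals] = counts
    print('{} intervals, about {:.0f} samples expected in each'.format(
        nb_intervals, size / nb_intervals))
    for start, end, c in zip(edges[:-1], edges[1:], counts):
        print('  {:.3f} <= x < {:.3f} count: {}'.format(start, end, c))
```

Note that `np.histogram` treats the last bin as closed on the right, whereas the script's loop uses a half-open interval everywhere; for continuous samples this makes no practical difference.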