@erykml
Last active April 14, 2019 21:05
import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns


def pp_plot(x, dist, line=True, ax=None):
    '''
    Compare empirical data to a theoretical distribution using a P-P plot.

    Params:
    x - empirical data (1-D array-like)
    dist - frozen distribution object from scipy.stats; for example scipy.stats.norm(0, 1)
    line - boolean; whether to draw the reference line (y=x) on the plot
    ax - matplotlib Axes to draw on (for subplots); None creates a standalone figure

    Returns the Axes containing the plot.
    '''
    if ax is None:
        ax = plt.figure().add_subplot(1, 1, 1)
    n = len(x)
    # Plotting positions (i - 0.5) / n for i = 1..n
    p = np.arange(1, n + 1) / n - 0.5 / n
    # Theoretical CDF evaluated at the data, sorted ascending
    pp = np.sort(dist.cdf(x))
    sns.scatterplot(x=p, y=pp, color='blue', edgecolor='blue', ax=ax)
    ax.set_title('PP-plot')
    ax.set_xlabel('Theoretical Probabilities')
    ax.set_ylabel('Sample Probabilities')
    ax.margins(x=0, y=0)
    if line:
        # Draw on the given Axes; plt.plot would target whichever Axes is current
        ax.plot(np.linspace(0, 1), np.linspace(0, 1), 'r', lw=2)
    return ax
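
The points that pp_plot scatters can also be checked numerically. The sketch below is not part of the gist: `pp_points` and `max_pp_deviation` are hypothetical helper names, and the sample and the two candidate distributions are made up for illustration. It rebuilds the same coordinates and reports how far they stray from the y=x reference line, which should be small for a well-fitting distribution.

```python
import numpy as np
from scipy import stats


def pp_points(x, dist):
    """Return the (theoretical, sample) probabilities that pp_plot scatters."""
    n = len(x)
    # Plotting positions (i - 0.5) / n for i = 1..n
    p = (np.arange(1, n + 1) - 0.5) / n
    # Candidate distribution's CDF evaluated at the data, sorted ascending
    pp = np.sort(dist.cdf(x))
    return p, pp


def max_pp_deviation(x, dist):
    """Largest vertical distance between the P-P points and the line y = x."""
    p, pp = pp_points(x, dist)
    return float(np.max(np.abs(pp - p)))


# Illustrative data: 500 draws from a standard normal
rng = np.random.default_rng(0)
sample = rng.normal(loc=0.0, scale=1.0, size=500)

good_fit = max_pp_deviation(sample, stats.norm(0, 1))  # matching distribution
bad_fit = max_pp_deviation(sample, stats.norm(2, 1))   # mean shifted by 2
print(f"N(0, 1): {good_fit:.3f}")
print(f"N(2, 1): {bad_fit:.3f}")
```

Passing the same `sample` and distributions to pp_plot should show the first set of points hugging the red line and the second bowing well away from it.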