
Implemented credibility_interval() #188

Open · wants to merge 89 commits into base: master
Conversation

Stefan-Heimersheim
Collaborator

Description

Added a credibility_interval method to MCMCSamples/NestedSamples
which computes the credibility / confidence interval.

Fixes #178 and replaces #179

Notes:

  • Discretization error: using np.cumsum, the value of the CDF at the
    first data point equals its weight (> 0), while at the last data
    point it equals exactly 1. This should be negligible for
    weights << 1, but for significant weights, i.e. small sample sizes,
    it can have an effect.
  • Should we add an alias function confidence_interval?
  • I have also thrown in upper and lower limits, which are not
    iso-probability intervals but rely on the same code.
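The endpoint issue in the first note can be seen with a minimal sketch (illustrative only, equal weights assumed):

```python
import numpy as np

# four equally weighted samples: a plain cumsum-based CDF
w = np.full(4, 0.25)
cdf = np.cumsum(w)

# the CDF at the first sample equals its weight (0.25) rather than 0,
# while the last value is exactly 1; for large weights (few samples)
# this offset is no longer negligible
print(cdf)  # [0.25 0.5  0.75 1.  ]
```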

Checklist:

  • I have performed a self-review of my own code
  • My code is PEP8 compliant (flake8 anesthetic tests)
  • My code contains compliant docstrings
    (pydocstyle --convention=numpy anesthetic)
  • New and existing unit tests pass locally with my changes
    (python -m pytest)
  • I have added tests that prove my fix is effective or that my
    feature works

@codecov

codecov bot commented Jun 8, 2022

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 100.00%. Comparing base (8726a90) to head (3e6bd4b).

Additional details and impacted files
@@            Coverage Diff            @@
##            master      #188   +/-   ##
=========================================
  Coverage   100.00%   100.00%           
=========================================
  Files           36        36           
  Lines         3058      3127   +69     
=========================================
+ Hits          3058      3127   +69     


Collaborator

@williamjameshandley williamjameshandley left a comment


Excellent work @Stefan-Heimersheim. The changes are only cosmetic/refactoring, so this should be good to go soon. Very thought-provoking!

(resolved review comments on anesthetic/samples.py and tests/test_samples.py)
@Stefan-Heimersheim
Collaborator Author

Stefan-Heimersheim commented Jun 10, 2022

The sorting

        order = np.argsort(samples)
        samples = samples[order]

breaks with merged data sets, as in #189

Edit: This does not seem to be an issue anymore,

samples = anesthetic.MCMCSamples(np.random.rand(1000,2),columns=["x","y"],index=np.random.randint(0,100,1000))
anesthetic.utils.credibility_interval(samples.x)

works.

@Stefan-Heimersheim
Collaborator Author

Re CI/lint flake8:

anesthetic/samples.py:11:1: F401 'scipy.interpolate.interp1d' imported but unused
anesthetic/samples.py:12:1: F401 'scipy.optimize.minimize_scalar' imported but unused

Not really sure what's causing flake8 to report an error here. Otherwise this should be good to go; only the quantile treatment isn't yet unified between credibility_interval and quantile.

(resolved review comment on anesthetic/samples.py)
Collaborator

@williamjameshandley williamjameshandley left a comment


Hi @Stefan-Heimersheim.

I've unified the CDF functions between quantile and credibility_interval. My weighted ECDF was slightly different from yours (defined such that the min sample has CDF 0, the max sample has CDF 1, and intermediate points take the symmetric average of the weights on either side). This coincides with the unweighted form and is symmetric, but I also have some ideas on how to extend it so that the bounds are appropriately slightly larger than the min and max.
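As a quick sanity check (a sketch, not the PR code itself), the symmetric weighted ECDF reduces exactly to the unweighted form np.linspace(0, 1, n) when all weights are equal:

```python
import numpy as np

n = 5
weights = np.ones(n) / n  # equal weights

# symmetric ECDF: 0 at the min sample, 1 at the max, and the
# cumulative symmetric average of neighbouring weights in between
w_ = np.concatenate([[0.], weights[1:] + weights[:-1]]).cumsum()
w_ /= w_[-1]

print(w_)  # [0.   0.25 0.5  0.75 1.  ] == np.linspace(0, 1, n)
```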

I also added some tests so that the coverage is 100%.

If you're happy with these changes please squash and merge.

Best,
Will

@Stefan-Heimersheim
Collaborator Author

My weighted ECDF was slightly different from yours (defined such that the min sample has CDF 0, the max sample has CDF 1, and intermediate points take the symmetric average of the weights on either side).

I am a bit unsure about the new ECDF function:
[image: CDF_illustration]
Looking at, say, CDF(10)-CDF(3) in this example, there are 7 out of 10 samples in the interval [3,10], but the ECDF (blue line) gives a larger value -- is that a more correct approximation than the version I initially implemented (green/red lines)?
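The discrepancy can be reproduced numerically (a sketch assuming 10 equally weighted samples at 1..10, using the three ECDF constructions discussed in this thread):

```python
import numpy as np
from scipy.interpolate import interp1d

data = np.arange(1., 11.)   # 10 equally weighted samples at 1..10
weights = np.full(10, 0.1)

# left- and right-hand ECDFs
lh = interp1d(data, np.cumsum(weights))
rh = interp1d(data, 1 - np.cumsum(weights[::-1])[::-1])

# symmetric ECDF
w_ = np.concatenate([[0.], weights[1:] + weights[:-1]]).cumsum()
sym = interp1d(data, w_ / w_[-1])

# LH and RH give 0.7 for CDF(10)-CDF(3), matching the 7-of-10 count,
# while the symmetric one gives 7/9, a slightly larger value
print(lh(10) - lh(3), rh(10) - rh(3), sym(10) - sym(3))
```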

@williamjameshandley
Collaborator

Excellent plot!

I've done some investigating, and these really are second-order effects. What we should be most concerned about is whether cdf(b)-cdf(a) approximates the true probability mass between a and b.

import numpy as np
from scipy.interpolate import interp1d
from scipy.stats import norm
import matplotlib.pyplot as plt

# Generate a dataset
np.random.seed(3)
data = np.random.randn(30)
weights = np.random.rand(30)
i = np.argsort(data)
data = data[i]
weights = weights[i]/weights.sum()

# Symmetric empirical CDF
w_ = np.concatenate([[0.],weights[1:]+weights[:-1]]).cumsum()
w_/=w_[-1]
sym = interp1d(data, w_)

# Left hand empirical CDF
lh = interp1d(data, np.cumsum(weights))

# Right hand empirical CDF
rh = interp1d(data, 1-np.cumsum(weights[::-1])[::-1])

plt.plot(data,sym(data), label='Symmetric')
plt.plot(data,lh(data), label='Left Hand')
plt.plot(data,rh(data),  label='Right Hand')
plt.plot(data,norm.cdf(data),  label='True CDF')
plt.xlabel('x')
plt.ylabel('CDF')
plt.tight_layout()
plt.savefig('CDF.png')
plt.close()

[image: CDF]

So we see that all perform pretty equivalently relative to the correct answer.

Increasing the number of samples improves performance: the three empirical curves lie on top of one another, so the differences between them are of lower order than their common difference from the true answer.

[image: CDF]

Examining the histograms of the performance relative to the true answer shows them to all perform in an unbiased way

W, SYM, LH, RH, T = [], [], [], [], []
for _ in range(10000):
    a, b = np.sort(np.random.uniform(data.min(),data.max(),size=(2)))
    w = weights[(data < b) & (data > a)].sum()
    W.append(w)
    T.append(norm.cdf(b)-norm.cdf(a))
    SYM.append(sym(b)-sym(a))
    LH.append(lh(b)-lh(a))
    RH.append(rh(b)-rh(a))

W, SYM, LH, RH, T = np.array([W, SYM, LH, RH, T])
        
plt.hist(W-T,bins=100)
plt.hist(SYM-T,bins=100)
plt.hist(LH-T,bins=100,alpha=0.8)
plt.hist(RH-T,bins=100,alpha=0.8)
plt.tight_layout()
plt.savefig('hist.png')

[image: hist]

Increasing to 1000 points just reduces the spread, but not the difference in the methods.

[image: hist]

Unless you can find a stress test which shows any of these to be better or worse than the others, I'd vote for the symmetric one.

@Stefan-Heimersheim
Collaborator Author

Stefan-Heimersheim commented Aug 3, 2022

I am confused: clearly in my CDF plot the symmetric method always gives a steeper gradient, and thus should always give a larger CDF(b)-CDF(a) difference.
[image: previous CDF plot]
And here is your plot again without subtracting the True value:

plt.figure(constrained_layout=True)
plt.hist(SYM,bins=100, histtype="step", label="SYM")
plt.hist(LH,bins=100, histtype="step", label="LH")
plt.hist(RH,bins=100, histtype="step", label="RH")
plt.hist(T,bins=100, histtype="step", label="T")
plt.xlabel("CDF(b)-CDF(a)")
plt.legend()
plt.savefig('hist.png')
plt.show()

[image: hist]
(I did use 10 data points with weights = np.ones(10); I didn't get to double-check whether that causes any difference in any of the codes.) Here is the same plot including T:
[image: hist]

Does the symmetric one do systematically worse? Do both methods have a systematic bias? And what would we expect in theory? Intuitively the LH/RH versions feel like implementations of the basic CDF definition, but I am not sure how to reason about the symmetric formula. I'll try to generalize this benchmark and figure out why we don't see that in your histogram.

Edit: Full code

import numpy as np
from scipy.interpolate import interp1d
from scipy.stats import norm
import matplotlib.pyplot as plt

# Generate a dataset
np.random.seed(3)
data = np.random.randn(10)
weights = np.ones(10)
i = np.argsort(data)
data = data[i]
weights = weights[i]/weights.sum()

# Symmetric empirical CDF
w_ = np.concatenate([[0.],weights[1:]+weights[:-1]]).cumsum()
w_/=w_[-1]
sym = interp1d(data, w_)

# Left hand empirical CDF
lh = interp1d(data, np.cumsum(weights))

# Right hand empirical CDF
rh = interp1d(data, 1-np.cumsum(weights[::-1])[::-1])


W, SYM, LH, RH, T = [], [], [], [], []
for _ in range(10000):
    a, b = np.sort(np.random.uniform(data.min(),data.max(),size=(2)))
    w = weights[(data < b) & (data > a)].sum()
    W.append(w)
    T.append(norm.cdf(b)-norm.cdf(a))
    SYM.append(sym(b)-sym(a))
    LH.append(lh(b)-lh(a))
    RH.append(rh(b)-rh(a))

W, SYM, LH, RH, T = np.array([W, SYM, LH, RH, T])


plt.figure(constrained_layout=True)
plt.hist(SYM,bins=100, histtype="step", label="SYM")
plt.hist(LH,bins=100, histtype="step", label="LH")
plt.hist(RH,bins=100, histtype="step", label="RH")
plt.hist(T,bins=100, histtype="step", label="T")
plt.xlabel("CDF(b)-CDF(a)")
plt.legend()
plt.savefig('hist.png')
plt.show()

@Stefan-Heimersheim
Collaborator Author

Thanks @lukashergt for implementing the MultiIndex! That's everything done; as you suggested I'll take a look at the code tonight and we can get this merged.

@Stefan-Heimersheim
Collaborator Author

I think this is fantastic, thanks for implementing the extension!

The only unintuitive case is calling it for a single variable, you have to do

samples[["noise"]].credibility_interval()

rather than

samples["noise"].credibility_interval()

whereas our other functions like mean() and std() work in either case. Not a big deal, though; happy to leave it this way if the extra functionality is a hassle.

@lukashergt
Collaborator

I think this is fantastic, thanks for implementing the extension!

The only unintuitive case is calling it for a single variable, you have to do

samples[["noise"]].credibility_interval()

rather than

samples["noise"].credibility_interval()

whereas our other functions like mean() and std() work in either case. Not a big deal, though; happy to leave it this way if the extra functionality is a hassle.

samples.x0.credibility_interval(...) now returns the same as credibility_interval(samples.x0, weights=samples.get_weights(), ...).

@lukashergt
Collaborator

@williamjameshandley, since @Stefan-Heimersheim and I have both added functionality here, would you like to review?

Comment on lines 102 to 104
# Set the last element (tested to be approx 1)
# to exactly 1 to avoid interpolation errors
cdf[-1] = 1
Collaborator


I would strongly prefer for there not to be an assert statement (which the normalisation would fix). This kind of thing would be infuriating as part of a large automated pipeline where floating point errors derail a larger workflow.

@Stefan-Heimersheim
Copy link
Collaborator Author

I would strongly prefer for there not to be an assert statement

I agree, and we do not need to check cdf[-1], because it is literally a sum over np.random.dirichlet weights, which by definition sum to 1.
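For illustration (a sketch, not the PR code), Dirichlet-distributed weights are normalised by construction, so their cumulative sum ends at 1 up to floating point:

```python
import numpy as np

rng = np.random.default_rng(0)
weights = rng.dirichlet(np.ones(1000))  # draw from the probability simplex

cdf = np.cumsum(weights)
# the last CDF value is 1 by construction (up to floating-point error),
# so no assert on cdf[-1] is needed
print(cdf[-1])
```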

lukashergt previously approved these changes Sep 30, 2023
Collaborator

@lukashergt lukashergt left a comment


Anything else for this PR, @williamjameshandley?

@Stefan-Heimersheim
Collaborator Author

Were there any other changes needed to merge this?

Successfully merging this pull request may close these issues.

Function to compute iso_probability_contours for given MCMCSamples or NestedSamples object
3 participants