API

bootstraphistogram

A multi-dimensional histogram. The distribution of the histogram's bin values is computed with the Poisson bootstrap re-sampling method.
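The idea behind Poisson bootstrap resampling can be sketched with plain numpy. This is only an illustration under stated assumptions, not the library's implementation: the toy dataset, the bin edges and the variable names are all made up here. Each entry receives an independent Poisson(1) weight in every bootstrap replica, and the spread of the replicas estimates the statistical uncertainty of each bin.

```python
import numpy as np

rng = np.random.default_rng(1234)
numsamples = 100                      # number of bootstrap replicas
data = rng.normal(size=1000)          # toy dataset (assumption)
edges = np.linspace(-3.0, 3.0, 13)    # 12 bins (assumption)

# Nominal histogram: every entry is counted exactly once.
nominal, _ = np.histogram(data, bins=edges)

# Poisson bootstrap: one independent Poisson(1) weight per entry and replica.
weights = rng.poisson(1.0, size=(numsamples, data.size))

# One weighted histogram per replica; the last axis is the replica index.
samples = np.stack(
    [np.histogram(data, bins=edges, weights=w)[0] for w in weights],
    axis=-1,
)

# Per-bin spread over the replicas approximates the bin uncertainty.
spread = samples.std(axis=-1)
```

The last axis of `samples` plays the role of the bootstrap sample index described for the classes below.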

bootstraphistogram.BootstrapHistogram

class bootstraphistogram.BootstrapHistogram(*axes: Axis, numsamples: int = 100, rng: Optional[Union[int, Generator]] = None, **kwargs: Any)

A histogram with automatic Poisson bootstrap resampling.

The implementation is backed by boost-histogram (<https://github.com/scikit-hep/boost-histogram>) and thus BootstrapHistogram mimics the boost_histogram.Histogram interface.

Parameters
*axes: boost_histogram.axis.Axis

Any number of boost_histogram.axis.Axis objects that define the histogram binning. See <https://boost-histogram.readthedocs.io/en/latest/usage/axes.html>.

numsamples: int

The number of bootstrap samples. Increasing this number improves the accuracy of estimators derived from the bootstrap samples, at the cost of increased memory and CPU usage.

rng: Union[int, np.random.Generator, None]

A numpy random Generator, or an integer seed used to create one. If not provided, the numpy default from numpy.random.default_rng() will be used.

**kwargs: Any

Passed on to the boost_histogram.Histogram constructor.

property axes: Tuple[Axis, ...]

boost_histogram.axis.Axis representing the histogram binning. The last dimension corresponds to the bootstrap sample index.

fill(*args: ArrayLike, weight: Optional[ArrayLike] = None, seed: Optional[ArrayLike] = None, **kwargs: Any) -> BootstrapHistogram

Fill the histogram with some values.

Parameters
*args: “ArrayLike”

One 1D array of coordinates for each dimension of the histogram.

weight: Optional[“ArrayLike”]

Entry weights used to fill the histogram.

seed: Optional[“ArrayLike”]

Per-element seed. Overrides the Generator given in the constructor and uses a pseudo-random number generator seeded by the given value. This may be useful when filling multiple histograms with data that is not statistically independent (where it may be desirable to seed the generator with a record ID).

**kwargs: Any

Passed on to boost_histogram.Histogram.fill.

Returns
self: BootstrapHistogram

Reference to this object. This is done to maintain consistency with boost_histogram.Histogram.
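The effect of a per-element seed can be illustrated with numpy alone. The sketch below is an assumption about the general technique, not the library's code, and `poisson_weights` is a hypothetical helper: seeding a generator with each record's ID makes the Poisson weights of a given record identical wherever that record is filled, which keeps the bootstrap replicas of different histograms correlated.

```python
import numpy as np

def poisson_weights(record_ids, numsamples):
    """Hypothetical helper: one row of Poisson(1) weights per record, seeded by its ID."""
    return np.array(
        [
            np.random.default_rng(int(rid)).poisson(1.0, size=numsamples)
            for rid in record_ids
        ]
    )

record_ids = np.array([101, 102, 103, 104])

# Weights used when filling a first histogram with all records.
w_all = poisson_weights(record_ids, numsamples=5)

# Weights used when filling a second histogram with a subset of the records.
w_subset = poisson_weights(record_ids[[0, 2]], numsamples=5)

# The shared records receive exactly the same weights in both histograms.
same = np.array_equal(w_all[[0, 2]], w_subset)
```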

mean(flow: bool = False, ignore_nan: bool = True) -> NDArray[Any]

Binned sample mean.

Parameters
flow: bool

If True, underflow and overflow bins are included.

ignore_nan: bool

If True, numpy.nanmean() is used.

Returns
numpy.ndarray

an array containing the mean value of all bootstrap samples for each bin in the histogram.

property nominal: Histogram

A histogram of the filled values, with no bootstrap samples.

property numsamples: int

Number of bootstrap re-samplings.

percentile(q: float, flow: bool = False, interpolation: str = 'linear', ignore_nan: bool = True) -> NDArray[Any]

Binned q-th percentile.

Parameters
q: float

The percentile, a number between 0 and 100 (inclusive).

flow: bool

If True, underflow and overflow bins are included.

interpolation: str

As in numpy.percentile().

ignore_nan: bool

If True, numpy.nanpercentile() is used.

Returns
numpy.ndarray

an array containing the q-th percentile of all bootstrap samples for each bin in the histogram.
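As a rough numpy analogue of a binned percentile (a sketch, assuming the bootstrap replicas sit on the last axis of an array; the stand-in values are random), the difference between the NaN-ignoring and the plain percentile looks like this:

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for bootstrap sample values: 4 bins x 200 replicas (assumption).
samples = rng.poisson(50.0, size=(4, 200)).astype(float)
samples[0, 0] = np.nan  # one undefined replica in the first bin

# Analogue of ignore_nan=True: NaN replicas are skipped.
median = np.nanpercentile(samples, 50.0, axis=-1)

# Analogue of ignore_nan=False: a single NaN makes the whole bin NaN.
strict = np.percentile(samples, 50.0, axis=-1)
```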

project(*args: int) -> BootstrapHistogram

Reduce histogram dimensionality by summing over some dimensions. The bootstrap sample axis is always kept by this operation.

Parameters
*args: int

The dimensions to be kept.

Returns
hist: BootstrapHistogram

a copy of the histogram with only the axes in args and the bootstrap sample axis.
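In array terms, a projection that keeps the bootstrap sample axis amounts to summing over the dropped dimensions only. A minimal numpy sketch, with a made-up array standing in for the histogram contents:

```python
import numpy as np

rng = np.random.default_rng(1)

# Stand-in for a 2D histogram with 50 replicas: (x bins, y bins, replica).
view = rng.poisson(10.0, size=(5, 3, 50))

# Analogue of project(0): keep axis 0, sum over axis 1, keep the replica axis.
projected = view.sum(axis=1)
```

The total content is unchanged by the projection; only its binning is coarser.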

property samples: Histogram

A histogram of the bootstrap samples. The last dimension corresponds to the bootstrap sample index and is of size BootstrapHistogram.numsamples.

std(flow: bool = False, ignore_nan: bool = True) -> NDArray[Any]

Binned sample standard deviation.

Parameters
flow: bool

If True, underflow and overflow bins are included.

ignore_nan: bool

If True, numpy.nanstd() is used.

Returns
numpy.ndarray

an array containing the standard deviation of all bootstrap sample values for each bin in the histogram.

view(flow: bool = False) -> Any

Return a view of the underlying histogram bootstrap sample values.

bootstraphistogram.plot

Functions to plot BootstrapHistogram objects with matplotlib.

exception bootstraphistogram.plot.HistogramRankError

Error raised when trying to plot a histogram with the wrong number of dimensions.

bootstraphistogram.plot.errorbar(hist: BootstrapHistogram, percentiles: Optional[Tuple[float, float, float]] = None, ax: Optional[Axes] = None, **kwargs: Any) -> Any

Plot the bootstrap sample mean and standard deviation.

Parameters
hist: bootstraphistogram.BootstrapHistogram

the bootstraphistogram.BootstrapHistogram to plot.

percentiles: Optional[Tuple[float, float, float]]

lower, central, and upper percentiles to use for the error bars. If None, the mean ± 1 standard deviation is plotted.

ax: Optional[matplotlib.axes.Axes]

matplotlib.axes.Axes to plot on.

**kwargs: Any

passed on to matplotlib.axes.Axes.errorbar()

Returns
mplerrorbarresult: Any

the result of the call to matplotlib.axes.Axes.errorbar().
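One way to turn a (lower, central, upper) percentile triple into asymmetric error bars is sketched below with numpy only. It is an assumption about how such bars can be built, not a description of what this function does internally; the replica values and the percentile choice are made up. The resulting `yerr` has shape (2, N), which is the layout matplotlib's errorbar accepts for asymmetric errors.

```python
import numpy as np

rng = np.random.default_rng(3)

# Stand-in for bootstrap sample values: 6 bins x 300 replicas (assumption).
samples = rng.poisson(40.0, size=(6, 300)).astype(float)

# Lower, central and upper percentiles per bin.
lower, central, upper = (
    np.percentile(samples, q, axis=-1) for q in (16.0, 50.0, 84.0)
)

# Row 0: distance down to the lower bound; row 1: distance up to the upper bound.
yerr = np.stack([central - lower, upper - central])
```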

bootstraphistogram.plot.fill_between(hist: BootstrapHistogram, percentiles: Optional[Tuple[float, float]] = (15.865000000000002, 84.13499999999999), ax: Optional[Axes] = None, **kwargs: Any) -> Any

Fill the area between two percentiles.

Parameters
hist: bootstraphistogram.BootstrapHistogram

the bootstraphistogram.BootstrapHistogram to plot.

percentiles: Optional[Tuple[float, float]]

lower and upper percentile bounds to fill. A pair of numbers between 0 and 100. Defaults to an equal-tailed 68.27% interval. If None, the mean ± 1 standard deviation is plotted.

ax: Optional[matplotlib.axes.Axes]

matplotlib.axes.Axes to plot on.

**kwargs: Any

passed on to matplotlib.axes.Axes.fill_between()

Returns
mplfillbetweenresult: Any

the result of the call to matplotlib.axes.Axes.fill_between().
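The default percentiles correspond to the ±1 standard deviation points of a normal distribution, which is where the 68.27% coverage comes from. A short check using only the standard library (`normal_cdf` is a helper defined here for illustration):

```python
import math

def normal_cdf(z):
    """Cumulative distribution function of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

lower = 100.0 * normal_cdf(-1.0)   # about 15.865
upper = 100.0 * normal_cdf(1.0)    # about 84.135
coverage = upper - lower           # about 68.27
```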

bootstraphistogram.plot.scatter(hist: BootstrapHistogram, ax: Optional[Axes] = None, **kwargs: Any) -> Any

Scatter plot of the bootstrap samples.

The x-coordinate of each scatter point within a bin is drawn from a uniform random distribution.

Parameters
hist: bootstraphistogram.BootstrapHistogram

the bootstraphistogram.BootstrapHistogram to plot.

ax: Optional[matplotlib.axes.Axes]

matplotlib.axes.Axes to plot on.

**kwargs: Any

passed on to matplotlib.axes.Axes.scatter()

Returns
mplscatterresult: Any

the result of the call to matplotlib.axes.Axes.scatter().

bootstraphistogram.plot.step(hist: BootstrapHistogram, percentile: Optional[float] = None, ax: Optional[Axes] = None, **kwargs: Any) -> Any

Plot a curve corresponding to the histogram bootstrap sample mean (or the given percentile).

Parameters
hist: bootstraphistogram.BootstrapHistogram

the bootstraphistogram.BootstrapHistogram to plot.

percentile: Optional[float]

the sample percentile to plot. A number between 0 and 100. 50 corresponds to the median. If None, the mean is plotted.

ax: Optional[matplotlib.axes.Axes]

matplotlib.axes.Axes to plot on.

**kwargs: Any

passed on to matplotlib.axes.Axes.step()

Returns
mplstepresult: Any

the result of the call to matplotlib.axes.Axes.step().

bootstraphistogram.BootstrapMoment

class bootstraphistogram.BootstrapMoment(numsamples: int = 100, rng: Optional[Union[int, Generator]] = None, **kwargs: Any)

Computes the mean, variance and skewness of an (optionally weighted) dataset with bootstrap resampling.

Parameters
numsamples: int

The number of bootstrap samples. Increasing this number improves the accuracy of estimators derived from the bootstrap samples, at the cost of increased memory and CPU usage.

rng: Union[int, np.random.Generator, None]

A numpy random Generator, or an integer seed used to create one. If not provided, the numpy default from numpy.random.default_rng() will be used.

**kwargs: Any

Passed on to the underlying bootstraphistogram.BootstrapHistogram.

Examples

To create and fill an instance:

>>> from bootstraphistogram import BootstrapMoment
>>> import numpy as np
>>> moments = BootstrapMoment(3, rng=1234)
>>> moments.fill(np.arange(101))

Compute the mean:

>>> moments.mean().nominal
50.0
>>> moments.mean().samples
array([51.4, 50.3, 46.1 ])

Compute the standard deviation:

>>> moments.std().nominal
29.15
>>> moments.std().samples
array([29.5, 29.5, 28.6])

Compute the skewness:

>>> moments.skewness().nominal
0.0
>>> moments.skewness().samples
array([-0.00,  0.10,  0.06])

fill(values: ArrayLike, weight: Optional[ArrayLike] = None, seed: Optional[ArrayLike] = None, **kwargs: Any) -> None

Fill the object with some values.

Parameters
values: “ArrayLike”

A 1D array containing the values from which moments will be calculated.

weight: Optional[“ArrayLike”]

Weights associated with the values.

seed: Optional[“ArrayLike”]

Per-element seed. Overrides the Generator given in the constructor and uses a pseudo-random number generator seeded by the given value. In some cases it is desirable to seed the generator with a record ID to allow bootstrap samples to be statistically correlated between objects.

**kwargs: Any

Passed on to the underlying boost_histogram.Histogram.fill.

mean() -> ValueWithSamples[float]

Compute the mean.

Returns
ValueWithSamples

the (weighted) mean of the (weighted) fill values and bootstrap resamples.

property numsamples: int

Number of bootstrap re-samplings.

skewness() -> ValueWithSamples[float]

Compute the skewness.

Returns
ValueWithSamples

the (weighted) skewness of the (weighted) fill values and bootstrap resamples.

std() -> ValueWithSamples[float]

Compute the standard deviation.

Returns
ValueWithSamples

the (weighted) standard deviation of the (weighted) fill values and bootstrap resamples.

variance() -> ValueWithSamples[float]

Compute the variance.

Returns
ValueWithSamples

the (weighted) variance of the (weighted) fill values and bootstrap resamples.
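The weighted moments and their bootstrap replicas can be approximated with a few lines of numpy. This is a sketch, not the class's implementation: `weighted_moments` is a hypothetical helper, and the use of population (not sample-corrected) estimators is an assumption, chosen because it matches the nominal standard deviation of about 29.15 quoted in the example above for np.arange(101).

```python
import numpy as np

def weighted_moments(values, weights):
    """Hypothetical helper: weighted mean, population std and skewness."""
    values = np.asarray(values, dtype=float)
    weights = np.asarray(weights, dtype=float)
    total = weights.sum()
    mean = (weights * values).sum() / total
    variance = (weights * (values - mean) ** 2).sum() / total
    skewness = (weights * (values - mean) ** 3).sum() / total / variance**1.5
    return mean, np.sqrt(variance), skewness

values = np.arange(101)

# Nominal result: every value has unit weight.
nominal = weighted_moments(values, np.ones(values.size))

# Three bootstrap replicas: each value gets a Poisson(1) weight per replica.
rng = np.random.default_rng(1234)
samples = np.array(
    [weighted_moments(values, rng.poisson(1.0, size=values.size)) for _ in range(3)]
)
```

Each row of `samples` holds the (mean, std, skewness) of one replica; the replica values will not reproduce the numbers in the example above, since the random streams differ.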

bootstraphistogram.ValueWithSamples

class bootstraphistogram.ValueWithSamples(nominal: T, samples: NDArray[Any])

Container class storing a calculated value along with its bootstrap resamples.

Parameters
nominal: T

the value without any resampling.

samples: “NDArray[Any]”

the same value with Poisson bootstrap resampling applied.

property nominal: T

the value without any resampling.

property samples: NDArray[Any]

the value with Poisson bootstrap resampling applied.

bootstraphistogram.BootstrapEfficiency

class bootstraphistogram.BootstrapEfficiency(*axes: Axis, numsamples: int = 100, rng: Optional[Union[int, Generator]] = None, nanto: Optional[float] = None, **kwargs: Any)

Calculates binned efficiencies, with uncertainties estimated by Poisson bootstrap resampling.

Parameters
*axes: boost_histogram.axis.Axis

Any number of boost_histogram.axis.Axis objects that define the efficiency binning. See <https://boost-histogram.readthedocs.io/en/latest/usage/axes.html>.

numsamples: int

The number of bootstrap samples. Increasing this number improves the accuracy of estimators derived from the bootstrap samples, at the cost of increased memory and CPU usage.

rng: Union[int, np.random.Generator, None]

A numpy random Generator, or an integer seed used to create one. If not provided, the numpy default from numpy.random.default_rng() will be used.

nanto: Optional[float]

When calculating efficiencies, empty bins will result in NaN. If not None, these values will be set to nanto.

**kwargs: Any

Passed on to the bootstraphistogram.BootstrapHistogram constructor.
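The role of nanto can be illustrated with plain arrays. In this sketch (made-up counts, not the library's code) the middle bin is empty, so the ratio 0/0 gives NaN, which is then replaced by the chosen value:

```python
import numpy as np

numerator = np.array([3.0, 0.0, 5.0])
denominator = np.array([4.0, 0.0, 10.0])   # the middle bin is empty
nanto = 0.0                                # replacement value (assumption)

# 0/0 in the empty bin yields NaN; silence the floating-point warning.
with np.errstate(divide="ignore", invalid="ignore"):
    efficiency = numerator / denominator

# Replace NaN entries by nanto, leaving the other bins untouched.
filled = np.where(np.isnan(efficiency), nanto, efficiency)
```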

class Array(numerator: NDArray[Any], denominator: NDArray[Any], efficiency: NDArray[Any])

A result type to store arrays returned by bootstraphistogram.BootstrapEfficiency

denominator: NDArray[Any]

Alias for field number 1

efficiency: NDArray[Any]

Alias for field number 2

numerator: NDArray[Any]

Alias for field number 0

class Histogram(numerator: Histogram, denominator: Histogram, efficiency: Histogram)

A result type to store histograms returned by bootstraphistogram.BootstrapEfficiency

denominator: Histogram

Alias for field number 1

efficiency: Histogram

Alias for field number 2

numerator: Histogram

Alias for field number 0

property axes: Tuple[Axis, ...]

boost_histogram.axis.Axis representing the histogram binning. The first dimension corresponds to whether an entry is included in the numerator or not (index 0 = not in numerator, index 1 = included in numerator). The last dimension corresponds to the bootstrap sample index.
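The axis layout described here can be mimicked with a numpy array of shape (2, bins, replicas). The sketch below rests on assumptions: the toy data are random, and taking the denominator as the sum over both entries of the first axis is an interpretation of the description above, not something confirmed by the library's source.

```python
import numpy as np

rng = np.random.default_rng(7)
numsamples = 20
x = rng.uniform(0.0, 1.0, size=500)          # toy coordinate (assumption)
selected = rng.uniform(size=500) < 0.6       # toy selection flag (assumption)
edges = np.linspace(0.0, 1.0, 5)             # 4 bins

# Bin index of each entry and its Poisson(1) weight in each replica.
bin_index = np.clip(np.digitize(x, edges) - 1, 0, 3)
weights = rng.poisson(1.0, size=(x.size, numsamples))

# Axis 0: selected flag (0 or 1); axis 1: bin; axis 2: bootstrap replica.
counts = np.zeros((2, 4, numsamples))
np.add.at(counts, (selected.astype(int), bin_index), weights)

numerator = counts[1]                 # entries with selected == True
denominator = counts.sum(axis=0)      # all entries (assumed definition)
efficiency = numerator / denominator  # shape (bins, replicas)
```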

property denominator: BootstrapHistogram

The denominator as a BootstrapHistogram.

property efficiency: BootstrapHistogram

The efficiency as a BootstrapHistogram.

fill(selected: ArrayLike, *args: ArrayLike, weight: Optional[ArrayLike] = None, seed: Optional[ArrayLike] = None, **kwargs: Any) -> BootstrapEfficiency

Fill the histogram with some values.

Parameters
selected: “ArrayLike”

A 1D boolean array indicating whether each event enters the numerator (True) or not (False).

*args: “ArrayLike”

One 1D array of coordinates for each dimension of the histogram.

weight: Optional[“ArrayLike”]

Entry weights used to fill the histogram.

seed: Optional[“ArrayLike”]

Per-element seed. Overrides the Generator given in the constructor and uses a pseudo-random number generator seeded by the given value. This may be useful when filling multiple histograms with data that is not statistically independent (where it may be desirable to seed the generator with a record ID).

**kwargs: Any

Passed on to bootstraphistogram.BootstrapHistogram.fill.

Returns
self: BootstrapEfficiency

Reference to this object. This is done to maintain consistency with boost_histogram.Histogram.

mean(flow: bool = False, ignore_nan: bool = True) -> Array

Binned mean of the bootstrap samples.

Parameters
flow: bool

If True, underflow and overflow bins are included.

ignore_nan: bool

If True, numpy.nanmean() is used.

property nominal: Histogram

A histogram of the filled values, with no bootstrap re-sampling applied.

property numerator: BootstrapHistogram

The numerator as a BootstrapHistogram.

property numsamples: int

Number of bootstrap re-samplings.

percentile(q: float, flow: bool = False, interpolation: str = 'linear', ignore_nan: bool = True) -> Array

Binned q-th percentile of the bootstrap samples.

Parameters
q: float

The percentile, a number between 0 and 100 (inclusive).

flow: bool

If True, underflow and overflow bins are included.

interpolation: str

As in numpy.percentile().

ignore_nan: bool

If True, numpy.nanpercentile() is used.

Returns
numpy.ndarray

an array containing the q-th percentile of all bootstrap samples for each bin in the histogram.

project(*args: int) -> BootstrapEfficiency

Reduce histogram dimensionality by summing over some dimensions. The efficiency “selected” axis (first axis) and the bootstrap sample axis (final axis) are always kept by this operation.

Parameters
*args: int

The dimensions to be kept.

Returns
hist: BootstrapEfficiency

a copy of the histogram with only the axes in args, the “selected” axis and the bootstrap sample axis.

property samples: Histogram

A histogram of the bootstrap samples. The last dimension corresponds to the bootstrap sample index and is of size BootstrapEfficiency.numsamples.

std(flow: bool = False, ignore_nan: bool = True) -> Array

Binned standard deviation of the bootstrap samples.

Parameters
flow: bool

If True, underflow and overflow bins are included.

ignore_nan: bool

If True, numpy.nanstd() is used.

view(flow: bool = False) -> Any

Return a view of the underlying histogram bootstrap sample values.