API
bootstraphistogram
A multi-dimensional histogram. The distribution of the histogram's bin values is computed with the Poisson bootstrap re-sampling method.
bootstraphistogram.BootstrapHistogram is the main histogram class. Some basic plotting functions are provided in bootstraphistogram.plot.
bootstraphistogram.BootstrapMoment calculates the first three moments of a dataset.
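As a rough illustration of the Poisson bootstrap idea named above, the following standard-library sketch fills a one-dimensional histogram while giving every entry an independent Poisson-distributed weight (mean 1) in each bootstrap sample. It is not the library's implementation: the helper names `poisson_one` and `bootstrap_histogram`, the bin edges, and the toy data are all assumptions made for this example.

```python
import math
import random


def poisson_one(rng: random.Random) -> int:
    """Draw one integer from a Poisson distribution with mean 1 (Knuth's method)."""
    limit = math.exp(-1.0)
    k = 0
    p = rng.random()
    while p > limit:
        k += 1
        p *= rng.random()
    return k


def bootstrap_histogram(values, edges, numsamples, rng):
    """Return (nominal, samples).

    nominal[b] is the plain count in bin b; samples[b][s] is the count in
    bin b for bootstrap sample s.
    """
    nbins = len(edges) - 1
    nominal = [0] * nbins
    samples = [[0] * numsamples for _ in range(nbins)]
    for x in values:
        # Find the bin with edges[b] <= x < edges[b + 1]; skip out-of-range values.
        b = next((i for i in range(nbins) if edges[i] <= x < edges[i + 1]), None)
        if b is None:
            continue
        nominal[b] += 1
        for s in range(numsamples):
            # Each entry enters each bootstrap sample a Poisson(1) number of times.
            samples[b][s] += poisson_one(rng)
    return nominal, samples


rng = random.Random(1234)
data = [rng.gauss(0.0, 1.0) for _ in range(1000)]
nominal, samples = bootstrap_histogram(data, [-3, -1, 0, 1, 3], numsamples=50, rng=rng)
```

The spread of `samples[b]` across the 50 columns then serves as an estimate of the statistical uncertainty on `nominal[b]`.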
bootstraphistogram.BootstrapHistogram
- class bootstraphistogram.BootstrapHistogram(*axes: Axis, numsamples: int = 100, rng: Optional[Union[int, Generator]] = None, **kwargs: Any)
A histogram with automatic Poisson bootstrap resampling.
The implementation is backed by boost-histogram (<https://github.com/scikit-hep/boost-histogram>), and thus BootstrapHistogram mimics the boost_histogram.Histogram interface.
- Parameters
- *axes: boost_histogram.axis.Axis
Any number of boost_histogram.axis.Axis objects that define the histogram binning. See <https://boost-histogram.readthedocs.io/en/latest/usage/axes.html>.
- numsamples: int
The number of bootstrap samples. Increasing this number improves the accuracy of estimators derived from the bootstrap samples, at the cost of increased memory and CPU usage.
- rng: Union[int, np.random.Generator, None]
A numpy generator. If not provided, the numpy default from numpy.random.default_rng() will be used.
- **kwargs: Any
Passed on to the boost_histogram.Histogram constructor.
- property axes: Tuple[Axis, ...]
The boost_histogram.axis.Axis objects representing the histogram binning. The last dimension corresponds to the bootstrap sample index.
- fill(*args: ArrayLike, weight: Optional[ArrayLike] = None, seed: Optional[ArrayLike] = None, **kwargs: Any) BootstrapHistogram
Fill the histogram with some values.
- Parameters
- *args: ArrayLike
A 1D array containing coordinates for each dimension of the histogram.
- weight: Optional[ArrayLike]
Entry weights used to fill the histogram.
- seed: Optional[ArrayLike]
Per-element seed. Overrides the Generator given in the constructor and uses a pseudo-random number generator seeded by the given value. This may be useful when filling multiple histograms with data that is not statistically independent (where it may be desirable to seed the generator with a record ID).
- **kwargs: Any
Passed on to boost_histogram.Histogram.fill.
- Returns
- self: BootstrapHistogram
Reference to this object. This is done to maintain consistency with boost_histogram.Histogram.
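The motivation for the per-element seed can be sketched with the standard library alone: if the bootstrap weights of a record are derived from its record ID, two histograms filled from overlapping records see the same weights for the same record, so their bootstrap samples stay correlated. The helper `poisson_weights` and the toy records below are hypothetical, and how the library actually maps a seed to a generator is not specified here.

```python
import math
import random


def poisson_weights(seed: int, numsamples: int) -> list:
    """Poisson(1) bootstrap weights for one record, reproducible from its seed."""
    rng = random.Random(seed)
    limit = math.exp(-1.0)
    weights = []
    for _ in range(numsamples):
        k, p = 0, rng.random()
        while p > limit:  # Knuth's method for a Poisson draw with mean 1
            k += 1
            p *= rng.random()
        weights.append(k)
    return weights


# Hypothetical records: (record ID, whether the record passes some selection).
records = [(101, True), (102, False), (103, True), (104, True)]
numsamples = 5

all_counts = [0] * numsamples       # bootstrap counts of every record
selected_counts = [0] * numsamples  # bootstrap counts of selected records only

for record_id, selected in records:
    weights = poisson_weights(record_id, numsamples)  # same ID gives same weights
    for s, w in enumerate(weights):
        all_counts[s] += w
        if selected:
            selected_counts[s] += w
```

Because both tallies reuse the same weights per record, `selected_counts[s]` can never exceed `all_counts[s]` in any bootstrap sample, which would not be guaranteed with independent random weights.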
- mean(flow: bool = False, ignore_nan: bool = True) NDArray[Any]
Binned sample mean.
- Parameters
- flow: bool
If True, underflow and overflow bins are included.
- ignore_nan: bool
If True, numpy.nanmean() is used.
- Returns
- numpy.ndarray
An array containing the mean value of all bootstrap samples for each bin in the histogram.
- property nominal: Histogram
A histogram of the filled values, with no bootstrap samples.
- property numsamples: int
Number of bootstrap re-samplings.
- percentile(q: float, flow: bool = False, interpolation: str = 'linear', ignore_nan: bool = True) NDArray[Any]
Binned q-th percentile.
- Parameters
- q: float
The percentile, a number between 0 and 100 (inclusive).
- flow: bool
If True, underflow and overflow bins are included.
- interpolation: str
- ignore_nan: bool
If True, numpy.nanpercentile() is used.
- Returns
- numpy.ndarray
An array containing the q-th percentile of all bootstrap samples for each bin in the histogram.
- project(*args: int) BootstrapHistogram
Reduce histogram dimensionality by summing over some dimensions. The bootstrap sample axis is always kept by this operation.
- Parameters
- *args: int
The dimensions to be kept.
- Returns
- hist: BootstrapHistogram
A copy of the histogram with only the axes in args and the bootstrap sample axis.
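The effect of a projection can be pictured with plain nested lists: summing over one axis while leaving the bootstrap sample axis untouched. This is only a sketch of the arithmetic, with hypothetical counts and a hypothetical helper `project_keep_x`; it does not use the library.

```python
# Hypothetical counts indexed as counts[x][y][sample]:
# 2 x-bins, 3 y-bins, 2 bootstrap samples.
counts = [
    [[1, 2], [3, 4], [5, 6]],
    [[7, 8], [9, 10], [11, 12]],
]


def project_keep_x(counts):
    """Sum over the y axis, keeping the x axis and the bootstrap sample axis."""
    return [
        [sum(y_bin[s] for y_bin in x_bin) for s in range(len(x_bin[0]))]
        for x_bin in counts
    ]


projected = project_keep_x(counts)
```

Each bootstrap sample is summed separately, so the projected histogram still carries one value per sample in every remaining bin.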
- property samples: Histogram
A histogram of the bootstrap samples. The last dimension corresponds to the bootstrap sample index and is of size BootstrapHistogram.numsamples.
- std(flow: bool = False, ignore_nan: bool = True) NDArray[Any]
Binned sample standard deviation.
- Parameters
- flow: bool
If True, underflow and overflow bins are included.
- ignore_nan: bool
If True, numpy.nanstd() is used.
- Returns
- numpy.ndarray
An array containing the standard deviation of all bootstrap sample values for each bin in the histogram.
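The mean, percentile, and std methods all summarise the bootstrap samples bin by bin. The standard-library sketch below shows that reduction on a small hypothetical table of bootstrap counts. The helper `percentile_linear` is written for this example, and the use of the population standard deviation (matching numpy's default of ddof=0) is an assumption about how the library calls numpy.

```python
import math
import statistics


def percentile_linear(data, q):
    """q-th percentile (0 <= q <= 100), interpolating linearly between order statistics."""
    ordered = sorted(data)
    pos = (len(ordered) - 1) * q / 100.0
    lo = math.floor(pos)
    hi = math.ceil(pos)
    frac = pos - lo
    return ordered[lo] + (ordered[hi] - ordered[lo]) * frac


# Hypothetical bootstrap counts: one row per bin, one column per bootstrap sample.
samples = [
    [10, 12, 9, 11, 8],
    [20, 22, 19, 21, 18],
]

bin_mean = [statistics.fmean(row) for row in samples]
bin_std = [statistics.pstdev(row) for row in samples]  # population standard deviation
bin_median = [percentile_linear(row, 50) for row in samples]
```

Each result has one entry per bin, which corresponds to the per-bin arrays described in the Returns entries above.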
- view(flow: bool = False) Any
Return a view of the underlying histogram bootstrap sample values.
bootstraphistogram.plot
Functions to plot BootstrapHistogram objects with matplotlib.
- exception bootstraphistogram.plot.HistogramRankError
Error raised when trying to plot a histogram with the wrong number of dimensions.
- bootstraphistogram.plot.errorbar(hist: BootstrapHistogram, percentiles: Optional[Tuple[float, float, float]] = None, ax: Optional[Axes] = None, **kwargs: Any) Any
Plot the bootstrap sample mean and standard deviation.
- Parameters
- hist: bootstraphistogram.BootstrapHistogram
The bootstraphistogram.BootstrapHistogram to plot.
- percentiles: Optional[Tuple[float, float, float]]
Lower, central, and upper percentiles to use for the error bars. If None, the mean ± 1 standard deviation is plotted.
- ax: Optional[matplotlib.axes.Axes]
The matplotlib.axes.Axes to plot on.
- **kwargs: Any
Passed on to matplotlib.axes.Axes.errorbar().
- Returns
- mplerrorbarresult: Any
The result of the call to matplotlib.axes.Axes.errorbar().
- bootstraphistogram.plot.fill_between(hist: BootstrapHistogram, percentiles: Optional[Tuple[float, float]] = (15.865000000000002, 84.13499999999999), ax: Optional[Axes] = None, **kwargs: Any) Any
Fill the area between two percentiles.
- Parameters
- hist: bootstraphistogram.BootstrapHistogram
The bootstraphistogram.BootstrapHistogram to plot.
- percentiles: Optional[Tuple[float, float]]
Lower and upper percentile bounds to fill. A pair of numbers between 0 and 100. Defaults to an equal-tailed 68.27% interval. If None, the mean ± 1 standard deviation is plotted.
- ax: Optional[matplotlib.axes.Axes]
The matplotlib.axes.Axes to plot on.
- **kwargs: Any
Passed on to matplotlib.axes.Axes.fill_between().
- Returns
- mplfillbetweenresult: Any
The result of the call to matplotlib.axes.Axes.fill_between().
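The default percentiles in the fill_between signature can be cross-checked against the standard normal distribution: an equal-tailed 68.27% interval is the familiar one-standard-deviation band. The short check below uses only the Python standard library; the variable names are chosen for this example.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0.0, sigma=1.0)

# Percentiles one standard deviation either side of the mean of a normal distribution.
lower = 100.0 * standard_normal.cdf(-1.0)
upper = 100.0 * standard_normal.cdf(1.0)
coverage = upper - lower

# The documented defaults, written as an equal-tailed 68.27% interval.
default_lower = (100.0 - 68.27) / 2.0
default_upper = 100.0 - default_lower
```

Both pairs agree to about three decimal places, so for approximately normal bootstrap samples the default band is close to the mean ± 1 standard deviation.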
- bootstraphistogram.plot.scatter(hist: BootstrapHistogram, ax: Optional[Axes] = None, **kwargs: Any) Any
Scatter plot of the bootstrap samples.
The scatter-point x-coordinate within a bin is drawn from a uniform random distribution.
- Parameters
- hist: bootstraphistogram.BootstrapHistogram
The bootstraphistogram.BootstrapHistogram to plot.
- ax: Optional[matplotlib.axes.Axes]
The matplotlib.axes.Axes to plot on.
- **kwargs: Any
Passed on to matplotlib.axes.Axes.scatter().
- Returns
- mplfillbetweenresult: Any
The result of the call to matplotlib.axes.Axes.scatter().
- bootstraphistogram.plot.step(hist: BootstrapHistogram, percentile: Optional[float] = None, ax: Optional[Axes] = None, **kwargs: Any) Any
Plot a curve corresponding to the histogram bootstrap sample mean (or the given percentile).
- Parameters
- hist: bootstraphistogram.BootstrapHistogram
The bootstraphistogram.BootstrapHistogram to plot.
- percentile: Optional[float]
The sample percentile to plot. A number between 0 and 100. 50 corresponds to the median. If None, the mean is plotted.
- ax: Optional[matplotlib.axes.Axes]
The matplotlib.axes.Axes to plot on.
- **kwargs: Any
Passed on to matplotlib.axes.Axes.step().
- Returns
- mplstepresult: Any
The result of the call to matplotlib.axes.Axes.step().
bootstraphistogram.BootstrapMoment
- class bootstraphistogram.BootstrapMoment(numsamples: int = 100, rng: Optional[Union[int, Generator]] = None, **kwargs: Any)
Computes the mean, variance and skewness of an (optionally weighted) dataset with bootstrap resampling.
- Parameters
- numsamples: int
The number of bootstrap samples. Increasing this number improves the accuracy of estimators derived from the bootstrap samples, at the cost of increased memory and CPU usage.
- rng: Union[int, np.random.Generator, None]
A numpy generator. If not provided, the numpy default from numpy.random.default_rng() will be used.
- **kwargs: Any
Passed on to the underlying bootstraphistogram.BootstrapHistogram.
Examples
To create and fill an instance:
>>> from bootstraphistogram import BootstrapMoment
>>> import numpy as np
>>> moments = BootstrapMoment(3, rng=1234)
>>> moments.fill(np.arange(101))
Compute the mean:
>>> moments.mean().nominal
50.0
>>> moments.mean().samples
array([51.4, 50.3, 46.1])
Compute the standard deviation:
>>> moments.std().nominal
29.15
>>> moments.std().samples
array([29.5, 29.5, 28.6])
Compute the skewness:
>>> moments.skewness().nominal
0.0
>>> moments.skewness().samples
array([-0.00, 0.10, 0.06])
- fill(values: ArrayLike, weight: Optional[ArrayLike] = None, seed: Optional[ArrayLike] = None, **kwargs: Any) None
Fill the object with some values.
- Parameters
- values: ArrayLike
A 1D array containing the values from which moments will be calculated.
- weight: Optional[ArrayLike]
Weights associated with the values.
- seed: Optional[ArrayLike]
Per-element seed. Overrides the Generator given in the constructor and uses a pseudo-random number generator seeded by the given value. In some cases it is desirable to seed the generator with a record ID to allow bootstrap samples to be statistically correlated between objects.
- **kwargs: Any
Passed on to the underlying boost_histogram.Histogram.fill.
- mean() ValueWithSamples[float]
Compute the mean.
- Returns
- ValueWithSamples
the (weighted) mean of the (weighted) fill values and bootstrap resamples.
- property numsamples: int
Number of bootstrap re-samplings.
- skewness() ValueWithSamples[float]
Compute the skewness.
- Returns
- ValueWithSamples
the (weighted) skewness of the (weighted) fill values and bootstrap resamples.
- std() ValueWithSamples[float]
Compute the standard deviation.
- Returns
- ValueWithSamples
the (weighted) standard deviation of the (weighted) fill values and bootstrap resamples.
- variance() ValueWithSamples[float]
Compute the variance.
- Returns
- ValueWithSamples
the (weighted) variance of the (weighted) fill values and bootstrap resamples.
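The quantities returned by mean, variance, std, and skewness can be illustrated with a plain weighted-moment calculation. The sketch below is a standard-library approximation, not the library's code: the helper `weighted_moments` is hypothetical, and the population (divide-by-total-weight) convention is an assumption, chosen because it reproduces the nominal standard deviation of about 29.15 quoted for the np.arange(101) example in this section.

```python
import math


def weighted_moments(values, weights):
    """Return (mean, variance, skewness) of values with the given weights."""
    total = sum(weights)
    mean = sum(w * x for x, w in zip(values, weights)) / total
    variance = sum(w * (x - mean) ** 2 for x, w in zip(values, weights)) / total
    third = sum(w * (x - mean) ** 3 for x, w in zip(values, weights)) / total
    skewness = third / variance ** 1.5
    return mean, variance, skewness


values = list(range(101))  # the same data as np.arange(101)
mean, variance, skewness = weighted_moments(values, [1] * len(values))
std = math.sqrt(variance)
```

A bootstrap resample would call the same function again with Poisson-distributed weights in place of the unit weights, giving one entry of the samples array per resample.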
bootstraphistogram.ValueWithSamples
- class bootstraphistogram.ValueWithSamples(nominal: T, samples: NDArray[Any])
Container class storing a calculated value along with its bootstrap resamples.
- Parameters
- nominal: T
the value without any resampling.
- samples: NDArray[Any]
the same value with Poisson bootstrap resampling applied.
- property nominal: T
the value without any resampling.
- property samples: NDArray[Any]
the value with Poisson bootstrap resampling applied.
bootstraphistogram.BootstrapEfficiency
- class bootstraphistogram.BootstrapEfficiency(*axes: Axis, numsamples: int = 100, rng: Optional[Union[int, Generator]] = None, nanto: Optional[float] = None, **kwargs: Any)
Calculates binned efficiencies with uncertainties calculated with Poisson bootstrap resampling.
- Parameters
- *axes: boost_histogram.axis.Axis
Any number of boost_histogram.axis.Axis objects that define the efficiency binning. See <https://boost-histogram.readthedocs.io/en/latest/usage/axes.html>.
- numsamples: int
The number of bootstrap samples. Increasing this number improves the accuracy of estimators derived from the bootstrap samples, at the cost of increased memory and CPU usage.
- rng: Union[int, np.random.Generator, None]
A numpy generator. If not provided, the numpy default from numpy.random.default_rng() will be used.
- nanto: Optional[float]
When calculating efficiencies, empty bins will result in NaN. If not None, these values will be set to nanto.
- **kwargs: Any
Passed on to the bootstraphistogram.BootstrapHistogram constructor.
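The role of nanto can be sketched with a per-bin ratio: a bin with no entries has an undefined efficiency (NaN), which is optionally replaced by a fixed value. The function below is a hypothetical standard-library illustration, and it assumes the denominator counts every entry in a bin, selected or not.

```python
import math


def efficiency(numerator, denominator, nanto=None):
    """Per-bin numerator / denominator; empty bins give NaN, or nanto if it is set."""
    result = []
    for num, den in zip(numerator, denominator):
        value = num / den if den != 0 else math.nan
        if nanto is not None and math.isnan(value):
            value = nanto
        result.append(value)
    return result


# Hypothetical per-bin counts; the last bin is empty.
passed = [8, 3, 0]
total = [10, 4, 0]

with_nan = efficiency(passed, total)
with_zero = efficiency(passed, total, nanto=0.0)
```

Leaving the NaN in place keeps empty bins distinguishable from bins with a genuine efficiency of zero, which is one reason to leave nanto unset.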
- class Array(numerator: NDArray[Any], denominator: NDArray[Any], efficiency: NDArray[Any])
A result type to store arrays returned by bootstraphistogram.BootstrapEfficiency.
- denominator: NDArray[Any]
Alias for field number 1
- efficiency: NDArray[Any]
Alias for field number 2
- numerator: NDArray[Any]
Alias for field number 0
- class Histogram(numerator: Histogram, denominator: Histogram, efficiency: Histogram)
A result type to store histograms returned by bootstraphistogram.BootstrapEfficiency.
- denominator: Histogram
Alias for field number 1
- efficiency: Histogram
Alias for field number 2
- numerator: Histogram
Alias for field number 0
- property axes: Tuple[Axis, ...]
The boost_histogram.axis.Axis objects representing the histogram binning. The first dimension corresponds to whether an entry is included in the numerator or not (index 0 = not in numerator, index 1 = included in numerator). The last dimension corresponds to the bootstrap sample index.
- property denominator: BootstrapHistogram
The denominator as a BootstrapHistogram.
- property efficiency: BootstrapHistogram
The efficiency as a BootstrapHistogram.
- fill(selected: ArrayLike, *args: ArrayLike, weight: Optional[ArrayLike] = None, seed: Optional[ArrayLike] = None, **kwargs: Any) BootstrapEfficiency
Fill the histogram with some values.
- Parameters
- selected: ArrayLike
A 1D boolean array determining whether an event enters the numerator or denominator.
- *args: ArrayLike
A 1D array containing coordinates for each dimension of the histogram.
- weight: Optional[ArrayLike]
Entry weights used to fill the histogram.
- seed: Optional[ArrayLike]
Per-element seed. Overrides the Generator given in the constructor and uses a pseudo-random number generator seeded by the given value. This may be useful when filling multiple histograms with data that is not statistically independent (where it may be desirable to seed the generator with a record ID).
- **kwargs: Any
Passed on to bootstraphistogram.BootstrapHistogram.fill.
- Returns
- self: BootstrapEfficiency
Reference to this object. This is done to maintain consistency with boost_histogram.Histogram.
- mean(flow: bool = False, ignore_nan: bool = True) Array
Binned mean of the bootstrap samples.
- Parameters
- flow: bool
If True, underflow and overflow bins are included.
- ignore_nan: bool
If True, numpy.nanmean() is used.
- property nominal: Histogram
A histogram of the filled values, with no bootstrap re-sampling applied.
- property numerator: BootstrapHistogram
The numerator as a BootstrapHistogram.
- property numsamples: int
Number of bootstrap re-samplings.
- percentile(q: float, flow: bool = False, interpolation: str = 'linear', ignore_nan: bool = True) Array
Binned q-th percentile of the bootstrap samples.
- Parameters
- q: float
The percentile, a number between 0 and 100 (inclusive).
- flow: bool
If True, underflow and overflow bins are included.
- interpolation: str
- ignore_nan: bool
If True, numpy.nanpercentile() is used.
- Returns
- numpy.ndarray
An array containing the q-th percentile of all bootstrap samples for each bin in the histogram.
- project(*args: int) BootstrapEfficiency
Reduce histogram dimensionality by summing over some dimensions. The efficiency “selected” axis (first axis) and the bootstrap sample axis (final axis) are always kept by this operation.
- Parameters
- *args: int
The dimensions to be kept.
- Returns
- hist: BootstrapEfficiency
A copy of the histogram with only the axes in args, the “selected” axis, and the bootstrap sample axis.
- property samples: Histogram
A histogram of the bootstrap samples. The last dimension corresponds to the bootstrap sample index and is of size BootstrapEfficiency.numsamples.
- std(flow: bool = False, ignore_nan: bool = True) Array
Binned standard deviation of the bootstrap samples.
- Parameters
- flow: bool
If True, underflow and overflow bins are included.
- ignore_nan: bool
If True, numpy.nanstd() is used.
- view(flow: bool = False) Any
Return a view of the underlying histogram bootstrap sample values.