pyppin.math.histogram
A flexible class for managing, computing with, and plotting histograms.
This is an example ASCII plot, from the histogram used in the pyppin.iterators.sample
unittest:
|
|
|
|
0.0 +
| #
| ##
| ##
| ##
0.0 + ###
| # ###
| # ###
| # ### #
| # #####
0.0 + ## ######
| ## ######
| ## ######
| ## ######
| ## ######
0.0 + ###########
| ############
| #############
| ###############
| ###############
0.0 + ###############
| ################
| ################
| ################
| ################
0.0 + #################
| ###################
| ####################
| # ####################
| # ####################
0.0 + # # ##################### #
| # ####################### #
| # # #########################
++------+------+------+------+------+------+------+------+------+------+----
0.0 12.1 24.2 36.3 48.4 60.5 72.6 84.8 96.9 109.0 121.1
Classes
Bucketing – Define the shapes of the buckets that we will use for the histogram.
Histogram
- class pyppin.math.histogram.Histogram(bucketing: Optional[Bucketing] = None)
Bases: object
- add(value: float, count: int = 1) -> None
Add a value to the histogram.
- Parameters
value – The value to add.
count – The number of times to add this value.
- percentile(n: float) -> float
Return the value at the Nth percentile of this histogram, with n in [0, 100].
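As an illustration of how a percentile can be estimated from bucketed data, here is a minimal sketch that walks the cumulative counts and interpolates linearly inside the bucket that contains the target rank. The function name `percentile_from_buckets` and its `edges`/`counts` arguments are hypothetical, and this is not necessarily how `Histogram.percentile` is implemented.

```python
from typing import Sequence


def percentile_from_buckets(
    edges: Sequence[float], counts: Sequence[int], n: float
) -> float:
    """Estimate the n-th percentile from bucket edges and per-bucket counts.

    Hypothetical helper: bucket i covers [edges[i], edges[i + 1]), so
    len(edges) must equal len(counts) + 1.
    """
    if not 0 <= n <= 100:
        raise ValueError("n must be in [0, 100]")
    total = sum(counts)
    if total == 0:
        raise ValueError("cannot take a percentile of an empty histogram")

    target = total * n / 100.0  # Rank we are looking for, in units of counts.
    cumulative = 0
    for lo, hi, count in zip(edges, edges[1:], counts):
        if count and cumulative + count >= target:
            # Assume values are spread evenly within the bucket.
            fraction = (target - cumulative) / count
            return lo + fraction * (hi - lo)
        cumulative += count
    return float(edges[-1])
```

Because only bucket counts are kept, the result is an estimate whose precision is limited by the width of the bucket it falls in.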
- property standard_deviation: float
The standard deviation of the data.
Reminder: If your data doesn’t follow a Gaussian distribution, this is not going to give you a very meaningful number.
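To make the reminder concrete, the following sketch computes the mean and population standard deviation of (value, count) pairs, the same shape of input that `add` accepts. The helper `mean_and_std` is hypothetical and independent of the library; the bimodal sample is made up for illustration.

```python
import math
from typing import Iterable, Tuple


def mean_and_std(pairs: Iterable[Tuple[float, int]]) -> Tuple[float, float]:
    """Return (mean, population standard deviation) of weighted values.

    Hypothetical helper: each pair is (value, count).
    """
    pairs = list(pairs)
    total = sum(count for _, count in pairs)
    if total == 0:
        raise ValueError("no data")
    mean = sum(value * count for value, count in pairs) / total
    variance = sum(count * (value - mean) ** 2 for value, count in pairs) / total
    return mean, math.sqrt(variance)


# A bimodal sample: half the values are 0, the other half are 100.
mean, std = mean_and_std([(0, 5), (100, 5)])
```

Here the mean is 50 and the standard deviation is 50, yet no observation lies anywhere near 50, so "mean plus or minus one standard deviation" describes the data poorly.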
- plot_ascii(width: int = 100, height: int = 0, min_percentile: float = 0, max_percentile: float = 100, raw_counts: bool = False) -> str
Generate an ASCII plot of the histogram.
- Parameters
width – The width of the plot, in characters.
height – The height of the plot, in characters. Defaults to half the width.
min_percentile – The lower end of the subrange of the histogram to include, as a percentile.
max_percentile – The upper end of the subrange of the histogram to include, as a percentile.
raw_counts – If True, plot the raw bucket counts. If False (the default), plot the probability distribution function.
- histogram_values() -> Callable[[float], float]
Return a function that looks like the histogram itself: For each value X, it returns the count in the bucket containing X.
Note that this is not the same as the PDF, because buckets do not all have the same width!
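The difference between raw counts and the PDF can be sketched as follows: to turn counts into a density, each count is divided by the total count and by the width of its bucket. The function `counts_to_density` and the sample edges are hypothetical, not part of the library.

```python
from typing import List, Sequence


def counts_to_density(edges: Sequence[float], counts: Sequence[int]) -> List[float]:
    """Convert per-bucket counts into a probability density per bucket.

    Hypothetical helper: bucket i covers [edges[i], edges[i + 1]).
    """
    total = sum(counts)
    if total == 0:
        raise ValueError("no data")
    return [
        count / (total * (hi - lo))
        for count, lo, hi in zip(counts, edges, edges[1:])
    ]


# Equal counts in buckets of width 1, 1, 2 and 4.
edges = [0, 1, 2, 4, 8]
counts = [2, 2, 2, 2]
density = counts_to_density(edges, counts)
```

With these inputs the counts are flat, but the density halves each time the bucket width doubles, and the density integrated over all buckets is 1.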
- class pyppin.math.histogram.Bucketing(max_linear_value: Optional[float] = None, linear_steps: float = 1, exponential_multiplier: float = 2)
Bases: object
Define the shapes of the buckets that we will use for the histogram.
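We use linear/exponential bucketing: linear buckets of width n, i.e. [0, n), [n, 2n), [2n, 3n), up to some initial limit m, and beyond that exponential buckets that grow by a factor k, i.e. [m, km), [km, k²m), [k²m, k³m). Here n is linear_steps, m is max_linear_value, and k is exponential_multiplier.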
- Parameters
max_linear_value – The value at which we switch from linear to exponential buckets, or None to use linear values for everything.
linear_steps – The interval size for linear values.
exponential_multiplier – The multiplication factor for exponential values.
- bucket(value: float) -> int
Given a value to be added to the histogram, figure out which bucket it goes in.
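The linear/exponential scheme described above can be sketched as a standalone function. This is a minimal illustration under stated assumptions, not the library's implementation: `bucket_index` is a hypothetical name, values are assumed non-negative, linear buckets are numbered from 0, and the exponential buckets start at `max_linear_value` and are numbered after the linear ones.

```python
import math
from typing import Optional


def bucket_index(
    value: float,
    max_linear_value: Optional[float] = None,
    linear_steps: float = 1.0,
    exponential_multiplier: float = 2.0,
) -> int:
    """Hypothetical helper: index of the bucket containing a non-negative value."""
    if value < 0 or linear_steps <= 0:
        raise ValueError("value must be >= 0 and linear_steps must be > 0")

    # Linear region: bucket i covers [i * n, (i + 1) * n).
    if max_linear_value is None or value < max_linear_value:
        return int(value // linear_steps)

    if max_linear_value <= 0 or exponential_multiplier <= 1:
        raise ValueError("need max_linear_value > 0 and exponential_multiplier > 1")

    # Number of linear buckets below the switch point m.
    num_linear = math.ceil(max_linear_value / linear_steps)

    # Exponential region: bucket j covers [m * k**j, m * k**(j + 1)).
    # Growing the upper bound in a loop avoids floating-point log edge cases.
    j = 0
    upper = max_linear_value * exponential_multiplier
    while value >= upper:
        upper *= exponential_multiplier
        j += 1
    return num_linear + j
```

For example, with `max_linear_value=10`, `linear_steps=2` and `exponential_multiplier=2`, the buckets are [0, 2), [2, 4), ..., [8, 10), then [10, 20), [20, 40), [40, 80), and so on. Fine resolution is kept for small values while large values still land in a bounded number of buckets.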