Problem 3
Analyze the height data of a population.
The data is available at: http://jse.amstat.org/v11n2/datasets.heinz.html.
Given a series of measurements in centimeters, compute:
* The minimum, the maximum and average height of the population,
* A histogram of the heights given the number of bins
Classes:
| Measurement | Represent a single measurement of a human height. |
| Range | Represent a range of measurements. |
| BinRanges | Represent the ranges of the histogram bins. |
| Histogram | Represent a mutable histogram. |
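Functions:
| compute_stats | Compute the statistics of the given measurements. |
| bin_index | Find the index of the bin range among ranges corresponding to value. |
| compute_histogram | Compute the histogram over measurements. |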
- class Measurement(value: float)
Represent a single measurement of a human height.
Methods:
| __new__(cls, value) | Enforce the valid range on the measurement. |
- static __new__(cls, value: float) → Measurement
Enforce the valid range on the measurement.
- Requires
0 < value < 400 (Only valid values; the tallest man ever measured was 272 cm tall.)
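A minimal sketch of how such a range check could be enforced at construction time. It assumes that Measurement subclasses float and that a violation raises ValueError; the documented class states the condition as a contract, so the actual mechanism may differ.

```python
class Measurement(float):
    """Sketch: a height in centimeters restricted to the open interval (0, 400)."""

    def __new__(cls, value: float) -> "Measurement":
        # Assumption: a violation raises ValueError. The chained comparison
        # is also False for NaN, so NaN is rejected as well.
        if not 0 < value < 400:
            raise ValueError(f"height out of range: {value!r}")
        return super().__new__(cls, value)


height = Measurement(172.5)
print(height + 1.0)  # behaves like an ordinary float in arithmetic
```

Checking in __new__ rather than __init__ matters for an immutable base type such as float, because the value is fixed by the time __init__ runs.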
- compute_stats(measurements: List[Measurement]) → Tuple[float, float, float]
Compute the statistics of the given measurements.
- Returns
Minimum, mean, maximum
- Requires
len(measurements) > 0
- Ensures
not (len(set(measurements)) != 1) or result[0] < result[1] < result[2]
not (len(set(measurements)) == 1) or result[0] == result[1] == result[2]
(Identical measurements all give the same min, average, max)
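The contracts above can be illustrated with a short sketch. It works on plain floats instead of Measurement, and the clamping step is an assumption added here so that identical inputs yield identical minimum, mean and maximum despite floating-point rounding; it is not taken from the documented implementation.

```python
from typing import List, Tuple


def compute_stats(measurements: List[float]) -> Tuple[float, float, float]:
    """Return (minimum, mean, maximum) of a non-empty list of heights."""
    if len(measurements) == 0:
        raise ValueError("at least one measurement is required")
    minimum = min(measurements)
    maximum = max(measurements)
    mean = sum(measurements) / len(measurements)
    # Assumption: clamp the mean into [minimum, maximum], because summing
    # floats can round the mean slightly outside that interval.
    mean = min(max(mean, minimum), maximum)
    return minimum, mean, maximum


print(compute_stats([150.0, 160.0, 170.0]))
```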
- class Range(start: float, end: float)
Represent a range of measurements.
Methods:
| __init__(start, end) | Initialize with the given values. |
| __repr__() | Represent as mathematical range for easier debugging. |
- class BinRanges(bin_count: int, lower_bound: float, upper_bound: float, include_minus_inf: bool, include_inf: bool)
Represent the ranges of the histogram bins.
Methods:
| __new__(cls, bin_count, lower_bound, ...) | Construct bin_count number of histogram bins between lower_bound and upper_bound. |
| __getitem__(...) | Get the bin range at the given index. |
| __len__() | Return the number of the bin ranges. |
| __iter__() | Iterate over the bin ranges. |
- static __new__(cls, bin_count: int, lower_bound: float, upper_bound: float, include_minus_inf: bool, include_inf: bool) → BinRanges
Construct bin_count number of histogram bins between lower_bound and upper_bound.
If include_minus_inf or include_inf is set, include -∞ or +∞, respectively, in the total range spanned by the histogram.
- Requires
( bin_width := (upper_bound - lower_bound) / bin_count, bin_width != 0 )[1]
(Bin width not numerically zero)
not math.isnan(lower_bound) and not math.isinf(lower_bound)
not math.isnan(upper_bound) and not math.isinf(upper_bound)
lower_bound < upper_bound
- Ensures
include_inf and include_minus_inf ⇒ len(result) == bin_count + 2 (bin_count does not refer to +/- inf bins)
not include_inf and include_minus_inf ⇒ len(result) == bin_count + 1 (bin_count does not refer to +/- inf bins)
include_inf and not include_minus_inf ⇒ len(result) == bin_count + 1 (bin_count does not refer to +/- inf bins)
not include_inf and not include_minus_inf ⇒ len(result) == bin_count (bin_count does not refer to +/- inf bins)
not (include_inf ^ math.isinf(result[-1].end)) (include_inf <=> upper bound of the last bin is +inf)
not (include_minus_inf ^ math.isinf(result[0].start)) (include_minus_inf <=> lower bound of the first bin is -inf)
all( previous.end == current.start for previous, current in common.pairwise(result) )
(Bin ranges without a hole)
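As an illustration of the postconditions above, the following sketch builds contiguous bins as plain (start, end) tuples. The helper name make_bin_ranges and the tuple representation are hypothetical stand-ins for BinRanges and Range, and the explicit ValueError checks stand in for the documented preconditions.

```python
import math
from typing import List, Tuple


def make_bin_ranges(
    bin_count: int,
    lower_bound: float,
    upper_bound: float,
    include_minus_inf: bool,
    include_inf: bool,
) -> List[Tuple[float, float]]:
    """Return contiguous (start, end) pairs (hypothetical helper)."""
    if not (math.isfinite(lower_bound) and math.isfinite(upper_bound)):
        raise ValueError("bounds must be finite numbers")
    if bin_count <= 0 or not lower_bound < upper_bound:
        raise ValueError("need bin_count > 0 and lower_bound < upper_bound")
    bin_width = (upper_bound - lower_bound) / bin_count
    if bin_width == 0 or not math.isfinite(bin_width):
        raise ValueError("bin width is numerically zero or not finite")
    # Build one shared list of edges so that each bin ends on exactly the
    # float at which the next one starts.
    edges = [lower_bound + i * bin_width for i in range(bin_count)]
    edges.append(upper_bound)
    ranges = list(zip(edges[:-1], edges[1:]))
    if include_minus_inf:
        ranges.insert(0, (-math.inf, lower_bound))
    if include_inf:
        ranges.append((upper_bound, math.inf))
    return ranges


for start, end in make_bin_ranges(4, 140.0, 180.0, True, True):
    print(f"[{start}, {end})")
```

Deriving every bin from a single list of edges is what keeps the ranges free of holes: neighbouring bins reuse the same floating-point value instead of recomputing it.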
- bin_index(ranges: BinRanges, value: float) → int
Find the index of the bin range among ranges corresponding to value.
- Requires
not math.isnan(value)
- Ensures
value < ranges[0].start ⇒ result == -1 (Value not covered in ranges => bin not found)
value > ranges[-1].end ⇒ result == -1 (Value not covered in ranges => bin not found)
ranges[0].start <= value <= ranges[-1].end ⇒ 0 <= result < len(ranges) (Value in the ranges => bin found)
result != -1 ⇒ ranges[result].start <= value < ranges[result].end (Index not found or it corresponds to the correct bin range)
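One way to look up a bin is a binary search over the range starts, sketched below on plain (start, end) tuples. The sketch treats every range as half-open, [start, end), so a value equal to the final end is reported as not found; the contracts above handle that boundary differently, and the documented implementation may as well.

```python
import bisect
import math
from typing import List, Tuple


def bin_index(ranges: List[Tuple[float, float]], value: float) -> int:
    """Return the index of the half-open range holding value, or -1."""
    if math.isnan(value):
        raise ValueError("value must not be NaN")
    if not ranges or value < ranges[0][0] or value >= ranges[-1][1]:
        return -1
    starts = [start for start, _ in ranges]
    # bisect_right counts the starts that are <= value; the last of them
    # belongs to the bin containing value (ranges are assumed contiguous).
    return bisect.bisect_right(starts, value) - 1


bins = [(140.0, 150.0), (150.0, 160.0), (160.0, 170.0)]
print(bin_index(bins, 150.0), bin_index(bins, 139.9))
```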
- class Histogram(ranges: BinRanges)
Represent a mutable histogram.
- Establishes
all(count >= 0 for count in self.counts)
Methods:
| __init__(ranges) | Initialize the histogram with zero counts for ranges. |
| add(value) | Count the value in the corresponding bin. |
| items() | Iterate over (bin range, count of observations). |
Attributes:
| ranges | Bin ranges |
| counts | Count of observations for the given bin |
- __init__(ranges: BinRanges) → None
Initialize the histogram with zero counts for ranges.
- Requires
len(ranges) > 0
- ranges
Bin ranges
- counts
Count of observations for the given bin
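A rough sketch of a mutable histogram with the same methods and attributes might look as follows. The Bounds tuple alias is a hypothetical stand-in for Range, the bins are assumed to be half-open, and raising ValueError for an uncovered value is an assumption, since the documentation above does not say how add treats such values.

```python
from typing import Iterator, List, Tuple

Bounds = Tuple[float, float]  # hypothetical stand-in for Range


class Histogram:
    """Sketch of a mutable histogram over half-open [start, end) bins."""

    def __init__(self, ranges: List[Bounds]) -> None:
        if len(ranges) == 0:
            raise ValueError("at least one bin range is required")
        self.ranges = list(ranges)
        # Counts start at zero and are only ever incremented, so they
        # stay non-negative.
        self.counts = [0] * len(self.ranges)

    def add(self, value: float) -> None:
        """Count value in the first bin that contains it."""
        for i, (start, end) in enumerate(self.ranges):
            if start <= value < end:
                self.counts[i] += 1
                return
        # Assumption: uncovered values are rejected rather than ignored.
        raise ValueError(f"value not covered by any bin: {value!r}")

    def items(self) -> Iterator[Tuple[Bounds, int]]:
        """Iterate over (bin range, count of observations)."""
        return zip(self.ranges, self.counts)


histogram = Histogram([(140.0, 160.0), (160.0, 180.0)])
for height in (150.0, 160.0, 175.5):
    histogram.add(height)
print(list(histogram.items()))
```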
- compute_histogram(measurements: Sequence[Measurement]) → List[Tuple[Range, int]]
Compute the histogram over measurements.
- Returns
List of (bin range, count of observations for that bin)
- Requires
len(measurements) > 0
- Ensures
len(measurements) == sum(item[1] for item in result)
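The postcondition above says that every measurement is counted exactly once. The sketch below shows one way to guarantee this with equal-width bins between the smallest and largest value. The bin_count parameter and its default are assumptions (the documented signature takes only measurements), and plain tuples replace Range.

```python
from typing import List, Sequence, Tuple


def compute_histogram(
    measurements: Sequence[float], bin_count: int = 10
) -> List[Tuple[Tuple[float, float], int]]:
    """Return [((start, end), count), ...] covering every measurement."""
    if len(measurements) == 0 or bin_count <= 0:
        raise ValueError("need at least one measurement and one bin")
    lower, upper = min(measurements), max(measurements)
    if lower == upper:
        # Assumption: widen a degenerate span so the bin width is non-zero.
        upper = lower + 1.0
    width = (upper - lower) / bin_count
    counts = [0] * bin_count
    for value in measurements:
        # Clamp the index so that the maximum lands in the last bin.
        index = min(int((value - lower) / width), bin_count - 1)
        counts[index] += 1
    edges = [lower + i * width for i in range(bin_count)] + [upper]
    return list(zip(zip(edges[:-1], edges[1:]), counts))


result = compute_histogram([150.0, 155.0, 160.0, 170.0], bin_count=2)
assert sum(count for _, count in result) == 4  # nothing is dropped
print(result)
```

Clamping the index makes the last bin effectively closed on the right, which is what keeps the largest measurement from falling outside the histogram.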