Problem 3
Analyze the height data of a population.
The data is available at: http://jse.amstat.org/v11n2/datasets.heinz.html.
Given a series of measurements in centimeters, compute:
* The minimum, the maximum and the average height of the population,
* A histogram of the heights for a given number of bins.
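Classes:
| Measurement(value) | Represent a single measurement of a human height. |
| Range(start, end) | Represent a range of measurements. |
| BinRanges(bin_count, lower_bound, ...) | Represent the ranges of the histogram bins. |
| Histogram(ranges) | Represent a mutable histogram. |
Functions:
| compute_stats(measurements) | Compute the statistics of the given measurements. |
| bin_index(ranges, value) | Find the index of the bin range among ranges corresponding to value. |
| compute_histogram(measurements) | Compute the histogram over measurements. |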
- class Measurement(value: float)
Represent a single measurement of a human height.
Methods:
__new__(cls, value): Enforce the valid range on the measurement.
- static __new__(cls, value: float) → Measurement
Enforce the valid range on the measurement.
- Requires
0 < value < 400
(Only valid values; the tallest man ever measured, Robert Wadlow, was 272 cm tall.)
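The precondition above can be illustrated with a minimal sketch. It assumes that Measurement is a thin subclass of float, which the signature of __new__ suggests but this page does not confirm, and it uses a plain ValueError where the source presumably relies on a contract check.

```python
class Measurement(float):
    """Hypothetical sketch: a height in centimeters stored as a float."""

    def __new__(cls, value: float) -> "Measurement":
        # Precondition from the contract: 0 < value < 400.
        # NaN fails the comparison and is therefore rejected as well.
        if not 0 < value < 400:
            raise ValueError(f"Height outside of (0, 400): {value!r}")
        return super().__new__(cls, value)


height = Measurement(172.5)
```

Since the check runs at construction time, any code that receives a Measurement can rely on the value being a plausible height.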
- compute_stats(measurements: List[Measurement]) → Tuple[float, float, float]
Compute the statistics of the given measurements.
- Returns
Minimum, mean, maximum
- Requires
len(measurements) > 0
- Ensures
not (len(set(measurements)) != 1) or result[0] < result[1] < result[2]
not (len(set(measurements)) == 1) or result[0] == result[1] == result[2]
(Identical measurements all give the same min, average, max)
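A minimal sketch of how the contracts of compute_stats could be met is given below. It works on plain floats instead of Measurement, and clamping the mean into [minimum, maximum] is an assumption of this sketch, not something the page states about the source.

```python
from typing import List, Tuple


def compute_stats(measurements: List[float]) -> Tuple[float, float, float]:
    """Return (minimum, mean, maximum) of non-empty ``measurements``."""
    # Precondition from the contract: len(measurements) > 0.
    if len(measurements) == 0:
        raise ValueError("At least one measurement is required")

    minimum = min(measurements)
    maximum = max(measurements)
    mean = sum(measurements) / len(measurements)

    # Assumption of this sketch: clamp the mean so that floating-point
    # rounding can never push it outside of [minimum, maximum].
    mean = min(max(mean, minimum), maximum)

    return minimum, mean, maximum


stats = compute_stats([150.0, 160.0, 170.0])
identical = compute_stats([165.0, 165.0])
```

With identical inputs the clamp forces all three results to coincide, which matches the second postcondition.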
- class Range(start: float, end: float)
Represent a range of measurements.
Methods:
__init__(start, end): Initialize with the given values.
__repr__(): Represent as mathematical range for easier debugging.
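The following sketch shows one way such a class might look. The half-open notation [start, end) in __repr__ is an assumption chosen to agree with the bin_index postcondition start <= value < end; the actual output format of the source is not shown on this page.

```python
class Range:
    """Hypothetical sketch of a range of measurements."""

    def __init__(self, start: float, end: float) -> None:
        self.start = start
        self.end = end

    def __repr__(self) -> str:
        # Assumed format: closed at the start, open at the end.
        return f"[{self.start}, {self.end})"


text = repr(Range(150.0, 160.0))
```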
- class BinRanges(bin_count: int, lower_bound: float, upper_bound: float, include_minus_inf: bool, include_inf: bool)
Represent the ranges of the histogram bins.
Methods:
__new__(cls, bin_count, lower_bound, ...): Construct bin_count number of histogram bins between lower_bound and upper_bound.
__getitem__(index): Get the bin range at the given index.
__len__(): Return the number of the bin ranges.
__iter__(): Iterate over the bin ranges.
- static __new__(cls, bin_count: int, lower_bound: float, upper_bound: float, include_minus_inf: bool, include_inf: bool) → BinRanges
Construct bin_count number of histogram bins between lower_bound and upper_bound.
If include_minus_inf or include_inf is set, also include -∞ or +∞, respectively, in the total range spanned by the histogram.
- Requires
( bin_width := (upper_bound - lower_bound) / bin_count, bin_width != 0 )[1]
(Bin width not numerically zero)
not math.isnan(lower_bound) and not math.isinf(lower_bound)
not math.isnan(upper_bound) and not math.isinf(upper_bound)
lower_bound < upper_bound
- Ensures
include_inf and include_minus_inf ⇒ len(result) == bin_count + 2
(bin_count does not refer to +/- inf bins)
not include_inf and include_minus_inf ⇒ len(result) == bin_count + 1
(bin_count does not refer to +/- inf bins)
include_inf and not include_minus_inf ⇒ len(result) == bin_count + 1
(bin_count does not refer to +/- inf bins)
not include_inf and not include_minus_inf ⇒ len(result) == bin_count
(bin_count does not refer to +/- inf bins)
not (include_inf ^ math.isinf(result[-1].end))
(include_inf <=> upper bound of the last bin is +inf)
not (include_minus_inf ^ math.isinf(result[0].start))
(include_minus_inf <=> lower bound of the first bin is -inf)
all( previous.end == current.start for previous, current in common.pairwise(result) )
(Bin ranges without a hole)
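The contracts above can be satisfied by computing the bin edges once and pairing consecutive edges, which rules out holes by construction. The sketch below is hypothetical: make_bin_ranges is an illustrative helper that returns plain (start, end) tuples instead of a BinRanges instance, and the bin_count > 0 check is an assumption that is not listed among the preconditions on this page.

```python
import math
from typing import List, Tuple


def make_bin_ranges(
    bin_count: int,
    lower_bound: float,
    upper_bound: float,
    include_minus_inf: bool,
    include_inf: bool,
) -> List[Tuple[float, float]]:
    """Return contiguous (start, end) bins; hypothetical helper."""
    if bin_count <= 0:  # assumption of this sketch
        raise ValueError("bin_count must be positive")
    if not (math.isfinite(lower_bound) and math.isfinite(upper_bound)):
        raise ValueError("Bounds must be finite numbers")
    if not lower_bound < upper_bound:
        raise ValueError("lower_bound must be less than upper_bound")

    bin_width = (upper_bound - lower_bound) / bin_count
    if bin_width == 0:
        raise ValueError("Bin width is numerically zero")

    # Shared edges: the end of one bin is the start of the next.
    edges = [lower_bound + i * bin_width for i in range(bin_count)]
    edges.append(upper_bound)

    if include_minus_inf:
        edges.insert(0, -math.inf)
    if include_inf:
        edges.append(math.inf)

    return list(zip(edges, edges[1:]))


ranges = make_bin_ranges(2, 150.0, 170.0, True, True)
```

Here ranges holds four bins: the two requested ones plus one open-ended bin on each side, in line with len(result) == bin_count + 2.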
- bin_index(ranges: BinRanges, value: float) → int
Find the index of the bin range among ranges corresponding to value.
- Requires
not math.isnan(value)
- Ensures
value < ranges[0].start ⇒ result == -1
(Value not covered in ranges => bin not found)
value > ranges[-1].end ⇒ result == -1
(Value not covered in ranges => bin not found)
ranges[0].start <= value <= ranges[-1].end ⇒ 0 <= result < len(ranges)
(Value in the ranges => bin found)
result != -1 ⇒ ranges[result].start <= value < ranges[result].end
(Index not found or it corresponds to the correct bin range)
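A lookup of this kind can be sketched with a binary search over the bin starts. The sketch assumes that the bins are contiguous, sorted (start, end) tuples and that every bin is half-open. Under that assumption a value exactly equal to the end of the last bin is reported as not found; the listed postconditions appear to pull in different directions for that single edge case, so the behavior of the source there is not known.

```python
import bisect
import math
from typing import List, Tuple


def bin_index(ranges: List[Tuple[float, float]], value: float) -> int:
    """Return the index of the bin covering ``value``, or -1 if none."""
    # Precondition from the contract: the value must not be NaN.
    if math.isnan(value):
        raise ValueError("value must not be NaN")

    # Outside of the total span (half-open at the upper end).
    if value < ranges[0][0] or value >= ranges[-1][1]:
        return -1

    # The covering bin is the last one whose start is <= value.
    starts = [start for start, _ in ranges]
    return bisect.bisect_right(starts, value) - 1


bins = [(150.0, 160.0), (160.0, 170.0)]
found = [bin_index(bins, v) for v in (149.0, 150.0, 159.9, 160.0, 170.0)]
```

A value on a shared edge, such as 160.0, lands in the bin that starts there.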
- class Histogram(ranges: BinRanges)
Represent a mutable histogram.
- Establishes
all(count >= 0 for count in self.counts)
Methods:
__init__(ranges): Initialize the histogram with zero counts for ranges.
add(value): Count the value in the corresponding bin.
items(): Iterate over (bin range, count of observations).
Attributes:
ranges: Bin ranges
counts: Count of observations for the given bin
- __init__(ranges: BinRanges) → None
Initialize the histogram with zero counts for ranges.
- Requires
len(ranges) > 0
- ranges
Bin ranges
- counts
Count of observations for the given bin
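A mutable histogram with these methods and attributes could look roughly like the sketch below. It represents bins as plain (start, end) tuples instead of BinRanges, uses a linear scan for the lookup, and raises ValueError for a value that no bin covers; all three are assumptions of this sketch, since the page does not describe how the source handles them.

```python
from typing import Iterator, List, Tuple

Bin = Tuple[float, float]  # hypothetical stand-in for Range


class Histogram:
    """Hypothetical sketch of a mutable histogram."""

    def __init__(self, ranges: List[Bin]) -> None:
        # Precondition from the contract: len(ranges) > 0.
        if len(ranges) == 0:
            raise ValueError("At least one bin range is required")
        self.ranges = ranges
        # Counts start at zero and only ever grow, so the invariant
        # all(count >= 0 for count in self.counts) always holds.
        self.counts = [0] * len(ranges)

    def add(self, value: float) -> None:
        """Count ``value`` in the half-open bin that covers it."""
        for index, (start, end) in enumerate(self.ranges):
            if start <= value < end:
                self.counts[index] += 1
                return
        raise ValueError(f"No bin covers the value: {value!r}")

    def items(self) -> Iterator[Tuple[Bin, int]]:
        """Iterate over (bin range, count of observations)."""
        return zip(self.ranges, self.counts)


histogram = Histogram([(150.0, 160.0), (160.0, 170.0)])
for observed in (151.0, 158.5, 160.0):
    histogram.add(observed)
result = list(histogram.items())
```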
- compute_histogram(measurements: Sequence[Measurement]) → List[Tuple[Range, int]]
Compute the histogram over measurements.
- Returns
List of (bin range, count of observations for that bin)
- Requires
len(measurements) > 0
- Ensures
len(measurements) == sum(item[1] for item in result)
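The postcondition holds whenever every measurement falls into exactly one bin. The self-contained sketch below achieves that with open-ended outer bins. The bin_count parameter, the choice of the observed minimum and maximum as bounds, and the widening of a degenerate range are all assumptions made for illustration; the documented function takes only measurements and this page does not say how it picks its bins.

```python
import math
from typing import List, Sequence, Tuple

Bin = Tuple[float, float]  # hypothetical stand-in for Range


def compute_histogram(
    measurements: Sequence[float], bin_count: int = 4
) -> List[Tuple[Bin, int]]:
    """Return (bin range, count) pairs covering all measurements."""
    if len(measurements) == 0:
        raise ValueError("At least one measurement is required")
    if bin_count <= 0:
        raise ValueError("bin_count must be positive")

    lower, upper = min(measurements), max(measurements)
    if lower == upper:
        # Assumption: widen a degenerate range to keep the width non-zero.
        upper = lower + 1.0
    width = (upper - lower) / bin_count

    # Shared edges with -inf and +inf on the outside, so that every
    # non-NaN value is covered by exactly one half-open bin.
    edges = [-math.inf]
    edges.extend(lower + i * width for i in range(bin_count))
    edges.extend([upper, math.inf])
    bins = list(zip(edges, edges[1:]))

    counts = [0] * len(bins)
    for value in measurements:
        for index, (start, end) in enumerate(bins):
            if start <= value < end:
                counts[index] += 1
                break

    return list(zip(bins, counts))


result = compute_histogram([150.0, 155.0, 160.0, 170.0], bin_count=2)
total = sum(count for _, count in result)
```

In this example the maximum, 170.0, is counted in the open-ended upper bin because the inner bins are half-open.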