dupin.data.reduce
Overview
CustomReducer | Wrap a custom reducing callable.
NthGreatest | Reduce a distribution to the Nth greatest values.
Percentile | Reduce a distribution into percentile values.
Tee | Enable multiple reducers to act on the same generator-like object.
Details
Classes for transforming array quantities into scalar features.
Reduction in dupin takes an array and _reduces_ it to a set number of scalar values. A computer science reduction goes from an array to a single value. Our usage of the term is similar; we just allow for multiple reductions to happen within the same reducer. Examples of common reducers in the dupin sense are the max, min, mean, mode, and standard deviation functions.
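As a rough illustration of the distinction drawn above, the following sketch contrasts a classic single-value reduction with a reduction in the dupin sense, where one array yields a fixed set of named scalars. The function name `summary_reduction` and the chosen keys are hypothetical and are not part of dupin.

```python
from typing import Dict

import numpy as np


def summary_reduction(distribution: np.ndarray) -> Dict[str, float]:
    """Reduce one array to several named scalar features (hypothetical helper)."""
    return {
        "max": float(np.max(distribution)),
        "min": float(np.min(distribution)),
        "mean": float(np.mean(distribution)),
        "std": float(np.std(distribution)),
    }


values = np.array([1.0, 2.0, 3.0, 4.0])

# Classic reduction: one array in, one scalar out.
single = float(np.max(values))

# Reduction in the sense used here: one array in, a fixed set of scalars out.
features = summary_reduction(values)
```

The number of returned values depends only on the reducer, not on the length of the input array.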
- class dupin.data.reduce.CustomReducer(custom_function)
Wrap a custom reducing callable.
- Parameters:
generator (GeneratorLike) – A generator-like object to reduce.
custom_function (callable[numpy.ndarray, dict[str, float]]) – A custom callable that takes in a NumPy array and returns a dictionary whose keys name each reduction and whose values are the corresponding reduced values of the distribution.
- function
The provided callable.
- Type:
callable[[numpy.ndarray], dict[str, numpy.ndarray]]
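A callable with the shape described above might look like the following minimal sketch. The function `spread_features` and its keys are hypothetical; only the input type (a NumPy array) and the output type (a dictionary of named values) come from the description. The sketch does not import dupin, so it only demonstrates the callable itself.

```python
from typing import Dict

import numpy as np


def spread_features(distribution: np.ndarray) -> Dict[str, float]:
    """Hypothetical custom reduction: median and range of a distribution."""
    return {
        "median": float(np.median(distribution)),
        "range": float(np.max(distribution) - np.min(distribution)),
    }


result = spread_features(np.array([3.0, 1.0, 4.0, 1.0, 5.0]))

# With dupin installed, the documented signature suggests wrapping it as
# dupin.data.reduce.CustomReducer(spread_features).
```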
- class dupin.data.reduce.NthGreatest(indices)
Reduce a distribution to the Nth greatest values.
This reducer returns the greatest and least values from a distribution as specified by the provided indices. Greatest values are specified by positive integers and least values by negative integers, e.g. -1 is the minimum value in the array. The feature keys are modified with the index's ordinal number and whether it is greatest or least: -1 becomes "1st_least" and 10 becomes "10th_greatest".
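The index convention and key naming described above can be sketched in plain NumPy as follows. This is not dupin's implementation: the helper names `ordinal` and `nth_greatest_sketch` are hypothetical, the ordinal suffix rules beyond the two documented examples are assumed to follow ordinary English usage, and the sketch returns only the key suffix rather than a full feature key.

```python
from typing import Dict, Sequence

import numpy as np


def ordinal(n: int) -> str:
    """Return an English ordinal such as 1st, 2nd, 3rd, 10th, 11th."""
    if 10 <= n % 100 <= 20:
        suffix = "th"
    else:
        suffix = {1: "st", 2: "nd", 3: "rd"}.get(n % 10, "th")
    return f"{n}{suffix}"


def nth_greatest_sketch(
    distribution: np.ndarray, indices: Sequence[int]
) -> Dict[str, float]:
    """Pick the Nth greatest (positive) or Nth least (negative) values."""
    sorted_desc = np.sort(distribution)[::-1]  # largest value first
    features = {}
    for index in indices:
        if index == 0:
            raise ValueError("indices must be nonzero in this sketch")
        if index > 0:
            # 1 -> greatest value, 2 -> second greatest, ...
            features[f"{ordinal(index)}_greatest"] = float(sorted_desc[index - 1])
        else:
            # -1 -> minimum, -2 -> second least, ...
            features[f"{ordinal(-index)}_least"] = float(sorted_desc[index])
    return features


picked = nth_greatest_sketch(np.array([5.0, 1.0, 3.0, 2.0, 4.0]), [1, 2, -1])
```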
- class dupin.data.reduce.Percentile(percentiles=None)
Reduce a distribution into percentile values.
The reducer sorts the input array to get the provided percentiles. The reducer then uses the key format f"{percentile}%" to identify its reductions.
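A sort-based percentile lookup with the key format above could be sketched as follows. The function `percentile_sketch` is hypothetical, and the rule for mapping a percentile to a position in the sorted array (rounding to the nearest index) is an assumption; the library may select or interpolate values differently. The sketch also requires explicit percentiles, since the default used when `percentiles=None` is not stated here.

```python
from typing import Dict, Sequence

import numpy as np


def percentile_sketch(
    distribution: np.ndarray, percentiles: Sequence[float]
) -> Dict[str, float]:
    """Look up percentiles by indexing into the sorted array."""
    sorted_values = np.sort(distribution)
    last = len(sorted_values) - 1
    features = {}
    for percentile in percentiles:
        # Assumed rule: map the percentile onto the nearest array index.
        index = int(round(percentile / 100 * last))
        features[f"{percentile}%"] = float(sorted_values[index])
    return features


quantiles = percentile_sketch(np.arange(11.0), [0, 50, 100])
```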
- class dupin.data.reduce.Tee(reducers)
Enable multiple reducers to act on the same generator-like object.
Each reducer is run on the original distribution and their reductions are concatenated. This reducer does not create its own reductions or corresponding keys.
- Parameters:
reducers (list[dupin.data.base.DataReducer]) – A sequence of data reducers.
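The behavior described for Tee, where each reducer sees the original distribution and the results are combined, can be sketched with plain callables. This sketch does not use dupin's classes: `tee_sketch` and the two small reducers are hypothetical, and modeling "concatenated" as merging dictionaries of named values is an assumption.

```python
from typing import Callable, Dict, Sequence

import numpy as np

Reducer = Callable[[np.ndarray], Dict[str, float]]


def extremes(distribution: np.ndarray) -> Dict[str, float]:
    """Hypothetical reducer returning the maximum and minimum."""
    return {"max": float(np.max(distribution)), "min": float(np.min(distribution))}


def center(distribution: np.ndarray) -> Dict[str, float]:
    """Hypothetical reducer returning the mean."""
    return {"mean": float(np.mean(distribution))}


def tee_sketch(reducers: Sequence[Reducer], distribution: np.ndarray) -> Dict[str, float]:
    """Run every reducer on the same input and merge their outputs."""
    combined: Dict[str, float] = {}
    for reducer in reducers:
        # Each reducer receives the original, unmodified distribution.
        combined.update(reducer(distribution))
    return combined


merged = tee_sketch([extremes, center], np.array([1.0, 2.0, 6.0]))
```

Note that the combining step adds no keys of its own; every key in the result comes from one of the wrapped reducers.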
- attach_logger(logger)
Add a logger to this step in the data pipeline.
- Parameters:
logger (dupin.data.logging.Logger) – A logger object to store data from the data pipeline for individual elements of the composed maps.