dupin.data.aggregate

Overview

SignalAggregator

Computes signals across a trajectory using generators.

XarrayGenerator

Generator that converts a frame from xarray to a dupin compatible form.

Details

Helper module for generating and storing features across an entire trajectory.

This module provides the SignalAggregator class, which takes a pipeline and provides methods for storing its output across a trajectory.

class dupin.data.aggregate.SignalAggregator(generator, logger=None)

Computes signals across a trajectory using generators.

This class builds the data structures needed to analyze a whole trajectory with offline methods, or to analyze frames iteratively for online use. See the compute and accumulate methods for usage.

Parameters:
  • generator (dupin.data.base.GeneratorLike) – A sequence of signal generators to use for generating the multivariate signal of a trajectory.

  • logger (dupin.data.logging.Logger) – A logger object to store information about the data processing of the given pipeline. Defaults to None.

generator

The generator which generates data given a trajectory frame.

Type:

dupin.data.base.GeneratorLike

signals

The current list of analyzed frames.

Type:

list[dict]

__init__(generator, logger=None)
__weakref__

list of weak references to the object

accumulate(*args, **kwargs)

Add features from a simulation snapshot to the aggregator.

Allows individual snapshots to be added to the aggregator. This is useful for online detection, for use inside a for loop, or whenever computing over the entire trajectory at once is not possible or not desired.

Parameters:
  • *args – Positional arguments to feed to the generator-like object.

  • **kwargs – Keyword arguments to feed to the generator-like object.
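The snapshot-by-snapshot pattern described above can be sketched as follows. This is a minimal stand-in, not dupin's implementation: ToyAggregator and mean_energy are hypothetical names introduced for illustration. The only behavior assumed from the text is that the arguments are forwarded to the generator and the resulting dictionary is appended to signals.

```python
class ToyAggregator:
    """Hypothetical stand-in mimicking the documented accumulate behavior."""

    def __init__(self, generator):
        self.generator = generator
        self.signals = []  # one dict of features per analyzed frame

    def accumulate(self, *args, **kwargs):
        # Forward the arguments to the generator and store the result.
        self.signals.append(self.generator(*args, **kwargs))


def mean_energy(energies, scale=1.0):
    """Hypothetical generator: reduce one frame to a dict of features."""
    return {"mean_energy": scale * sum(energies) / len(energies)}


aggregator = ToyAggregator(mean_energy)

# Feed frames one at a time, as an online detector or a plain loop would.
for frame in ([1.0, 3.0], [2.0, 4.0], [6.0, 10.0]):
    aggregator.accumulate(frame, scale=0.5)
```

After the loop, the stand-in holds one feature dictionary per frame, in the order the frames were passed.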

compute(iterator)

Compute signals from generator across the iterator.

These signals are stored internally in signals until requested through to_dataframe or to_xarray. This method can be called multiple times; each call appends to the stored signals.

Parameters:

iterator (Iterator[Tuple[Tuple[Any,...], Dict[str, Any]]]) – An object that, when iterated over, yields args and kwargs compatible with the generator-like object’s call signature.

Note

Use the from_base_iterator staticmethod to convert a standard iterator into one compatible with this method.
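The shape of the iterator that compute expects can be illustrated with a small sketch. It is an assumption-laden stand-in rather than the library's code: toy_compute and extent are hypothetical names. It assumes only what the text states, namely that each item is an (args, kwargs) pair passed to the generator and that repeated calls append to the stored signals.

```python
def toy_compute(generator, iterator, signals):
    """Hypothetical stand-in for the documented compute loop."""
    for args, kwargs in iterator:
        # Each item unpacks into positional and keyword arguments.
        signals.append(generator(*args, **kwargs))
    return signals


def extent(positions, axis=0):
    """Hypothetical generator: spread of the positions along one axis."""
    values = [point[axis] for point in positions]
    return {"extent": max(values) - min(values)}


frames = [
    [(0.0, 0.0), (1.0, 2.0)],
    [(0.0, 1.0), (3.0, 5.0)],
]

# One (args, kwargs) pair per frame.
call_arguments = [((frame,), {"axis": 1}) for frame in frames]

signals = []
toy_compute(extent, iter(call_arguments), signals)
toy_compute(extent, iter(call_arguments), signals)  # a second call appends
```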

static from_base_iterator(iterator, is_args=False, is_kwargs=False)

Convert a base iterator into one that works with compute.

The default behavior is to treat the items of the iterator as a single positional argument. Read the argument list for alternative options.

Parameters:
  • iterator (Iterator[Any]) – The iterator to convert.

  • is_args (bool, optional) – Whether to treat the iterator objects as positional arguments (i.e. yields tuples). Defaults to False.

  • is_kwargs (bool, optional) – Whether to treat the iterator objects as keyword arguments (i.e. yields dicts). Defaults to False.

Returns:

new_iterator – The modified iterator.

Return type:

Iterator[Tuple[Tuple[Any,...], Dict[str, Any]]]
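The three conversion modes described above can be sketched in plain Python. This is an illustrative re-implementation under stated assumptions, not dupin's source: toy_from_base_iterator is a hypothetical name. Letting is_args take precedence when both flags are set is an assumption, since the documentation does not cover that case.

```python
def toy_from_base_iterator(iterator, is_args=False, is_kwargs=False):
    """Hypothetical sketch: wrap items into (args, kwargs) pairs."""
    if is_args:
        # Assumption: is_args wins if both flags are set.
        # Each item is already a collection of positional arguments.
        return ((tuple(item), {}) for item in iterator)
    if is_kwargs:
        # Each item is already a mapping of keyword arguments.
        return (((), dict(item)) for item in iterator)
    # Default: each item is a single positional argument.
    return (((item,), {}) for item in iterator)


default_calls = list(toy_from_base_iterator(["frame0", "frame1"]))
args_calls = list(toy_from_base_iterator([("frame0", 3)], is_args=True))
kwargs_calls = list(toy_from_base_iterator([{"snapshot": "frame0"}], is_kwargs=True))
```

In each mode the result has the (args, kwargs) shape that compute documents as its input.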

property logger

Logger for the aggregator.

Type:

dupin.data.logging.Logger

to_dataframe()

Return the aggregated signals as a pandas DataFrame.

Note

This method requires pandas to be available.

Returns:

signals – The aggregated signals. The columns are features, and the indices correspond to system frames in the order passed to accumulate or compute.

Return type:

pandas.DataFrame
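The row and column layout described above can be illustrated without pandas. The sketch below is not the library's conversion: the feature names and values are made up for illustration. It only shows how a list of per-frame dictionaries maps to one column per feature and one row per frame.

```python
# Hypothetical stored signals: one dict of scalar features per frame.
signals = [
    {"density": 0.8, "order": 0.1},
    {"density": 0.9, "order": 0.4},
    {"density": 1.0, "order": 0.7},
]

# Columns are features; position i in each column is frame i.
columns = list(signals[0])
table = {name: [frame[name] for frame in signals] for name in columns}
```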

to_xarray(third_dim_name='third_dim')

Return the aggregated signal as an xarray.DataArray.

This method is designed primarily for non-reduced data (e.g. per-particle features). Combined with XarrayGenerator, it lets the mapping/reduction be done later, lets multiple reductions be attempted, or lets the data be used for purposes outside detection, such as visualization or plotting.

Note

This method requires xarray to be available.

Warning

This method only works when all arrays have the same first dimension size.

Returns:

signal – The aggregated signal. The first dimension is frames, the second is features, and the third is the first dimension of the aggregated features (e.g. number of particles).

Return type:

xarray.DataArray
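The dimension ordering and the equal-size requirement can be sketched with nested lists instead of xarray. This is a hypothetical illustration, not the library's conversion: stack_frames and the feature names are assumptions. Raising ValueError on mismatched sizes is a choice made for this sketch, not documented dupin behavior.

```python
def stack_frames(signals):
    """Hypothetical helper: nest signals as [frame][feature][element]."""
    feature_names = list(signals[0])
    lengths = {len(values) for frame in signals for values in frame.values()}
    if len(lengths) != 1:
        # Mirrors the warning: every array needs the same first dimension size.
        raise ValueError("all feature arrays must share the same first dimension size")
    stacked = [[list(frame[name]) for name in feature_names] for frame in signals]
    return feature_names, stacked


# Two frames, two per-particle features, three particles.
signals = [
    {"q6": [0.1, 0.2, 0.3], "volume": [1.0, 1.1, 1.2]},
    {"q6": [0.4, 0.5, 0.6], "volume": [0.9, 1.0, 1.1]},
]

names, stacked = stack_frames(signals)
shape = (len(stacked), len(stacked[0]), len(stacked[0][0]))  # frames, features, particles
```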

class dupin.data.aggregate.XarrayGenerator(feature_dim='feature')

Generator that converts a frame from xarray to a dupin compatible form.

This class is useful with SignalAggregator.to_xarray for separating data generation, and optionally the mapping step, from the reduction step.

Parameters:

feature_dim (str, optional) – The name of the feature dimension in the xarray frames. Defaults to “feature” (the default of SignalAggregator.to_xarray).

__call__(xarray_frame)

Convert the xarray object to a dupin pipeline representation.

Parameters:

xarray_frame (xarray.DataArray) – The data array for the current frame of the signal.

Returns:

frame – The data represented as a dictionary of arrays.

Return type:

dict[str, numpy.ndarray]

__init__(feature_dim='feature')