dupin.data.logging#

Overview

Logger

Class for logging extra information from data pipeline.

Details

Functions and classes for allowing logging extra data from pipelines.

Logging serves as the only directly supported means of introspecting into the state of a pipeline. This allows pipeline components to report to a logger information that otherwise would be discarded as it is not part of the feature generation. An example would be the index for the specified least and greatest values in dupin.data.reduce.NthGreatest. This particular example allows a user after feature generation to see which particles in a trajectory are chosen each step for a feature.

Similar to the pipeline proper, the logging infrastructure expects pipeline components to return dictionaries. The data itself is stored in a list where each entry is data from one frame of the trajectory parsed. The list elements are nested dictionaries where the top level is feature names (at the various stages of modification) and the second level is the class/modifier specific identifier, and the third and final level is the logging data for that object for that frame.

Note

Logging data is not accessible to other parts of the pipeline.

class dupin.data.logging.Logger[source]#

Class for logging extra information from data pipeline.

Stores available metadata from pipeline components. Not all components offer metadata, and those that do document them.

__init__()[source]#

Construct a Logger instance.

__setitem__(key, value)[source]#

Internally store information from data pipeline.

This is used to store pipeline component feature specific metadata.

__weakref__#

list of weak references to the object

end_frame()[source]#

End the current frame of data. Allows separate by time of data.

property frames#

Assess a particular frame of data.

The data is a list of dict where keys are features and values are dicts with the metadata gathered from the pipeline components.

Type:

list [dict]

to_dataframe()[source]#

Return a pandas.DataFrame object consisting of stored data.

This uses pandas.MultiIndex to map the nested dictionaries to a dataframe.

Warning

This assumes the pipeline produces homogenous data along a trajectory.

Warning

This only works for floating point logged values.