Package reflectometry :: Package reduction :: Module selection :: Class Datasets

Class Datasets

A dataset is a set of files with an embedded sequence number. The Datasets class gathers filenames one at a time and composes them into datasets.

The sequence number is assumed to be at most three digits long and to be the last run of digits (the final three, if there are more) before the extension. The dataset includes both the sequence number and the extension.

For each dataset there is a count of the files in the dataset (at most 1000, since the sequence number has at most three digits) and the starting and ending sequence numbers. The filenames associated with the smallest and largest sequence numbers are retained so that the starting and ending times can be read from the directory or the file (using method latest()). The entire list of files in the dataset is also available.

Only files whose extension is in the set of valid extensions are considered part of a dataset. All others are classed as 'other'.
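The bookkeeping described above can be sketched as follows. This is a minimal illustration, not the module's actual implementation: the names `group_files` and `summarize`, the default extension set, the choice of `(name, ext)` as the dataset key, and the decision to class files without a sequence number as 'other' are all assumptions made for the example. Only the regular expression is taken from the class variable documented on this page.

```python
import re

# Regular expression copied from the documented class variable `pattern`.
PATTERN = re.compile(
    r'^(?P<name>[^\.]*?)(?P<seq>[0-9]{1,3})?(?P<junk>[^\.0-9]*?)(?P<ext>\..*)?$'
)


def group_files(filenames, extensions=('.dat',)):
    """Group bare filenames into datasets; return (datasets, other).

    datasets maps a hypothetical key (name, ext) to {sequence number: filename}.
    """
    datasets = {}
    other = []
    for filename in filenames:
        match = PATTERN.match(filename)
        valid = (
            match is not None
            and match.group('seq') is not None      # assumption: no seq -> 'other'
            and match.group('ext') in extensions
        )
        if not valid:
            other.append(filename)
            continue
        key = (match.group('name'), match.group('ext'))
        datasets.setdefault(key, {})[int(match.group('seq'))] = filename
    return datasets, other


def summarize(files_by_seq):
    """Return the count, the first/last sequence numbers, and their filenames."""
    start, end = min(files_by_seq), max(files_by_seq)
    return {
        'count': len(files_by_seq),
        'start': start,
        'end': end,
        'first_file': files_by_seq[start],
        'last_file': files_by_seq[end],
    }


datasets, other = group_files(
    ['scan0001.dat', 'scan0002.dat', 'scan0010.dat', 'readme.txt']
)
summary = summarize(datasets[('scan0', '.dat')])
```

Note that the key is `('scan0', '.dat')` rather than `('scan', '.dat')`: only the last three digits are treated as the sequence number, so the leading `0` stays in the name.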

Nested Classes
  Dataset
Instance Methods
 
__init__(self, extensions=None)
 
walk(self, pattern='', recurse=False, revisit=False)
Walk a file pattern adding all new files into the list of available datasets.
 
add(self, filename)
Add a single file to a dataset.
 
__add__(self, other)
Merge two datasets (e.g., from different subtrees)
Class Variables
  pattern = re.compile(r'^(?P<name>[^\.]*?)(?P<seq>[0-9]{1,3})?(...
Method Details

walk(self, pattern='', recurse=False, revisit=False)

Walk a file pattern, adding all new files to the list of available datasets. The pattern has an implicit '*' at the end, so it matches every path that begins with the given pattern. If recurse is true, subdirectories are entered. If revisit is true, subdirectories that have already been visited are visited again.

Note: the goal of revisit is to support refreshing the list of available files; however, revisiting a whole subtree can be expensive. Ideally this would run in a separate thread that yields to the GUI, which could update the list of available datasets and/or a waitbar while this is happening. Yet again, good interface trumps clean separation?
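The behaviour of walk() described above (implicit trailing '*', optional recursion, and skipping already-visited subdirectories unless revisit is set) might look roughly like the sketch below. It is a guess at the behaviour built on the standard glob and os modules, not the module's own code: the function `walk_pattern`, its `known_files` and `visited_dirs` arguments, and the choice to skip symlinked directories are assumptions introduced for the example.

```python
import glob
import os


def walk_pattern(pattern='', recurse=False, revisit=False,
                 known_files=None, visited_dirs=None):
    """Return the files matching pattern + '*' that were not already known.

    known_files and visited_dirs are sets the caller keeps between calls so
    that a later walk only reports new files.
    """
    known_files = set() if known_files is None else known_files
    visited_dirs = set() if visited_dirs is None else visited_dirs
    new_files = []
    for path in sorted(glob.glob(pattern + '*')):   # implicit trailing '*'
        if os.path.isdir(path):
            if not recurse or os.path.islink(path):
                continue                            # assumption: skip symlinked dirs
            real = os.path.realpath(path)
            if real in visited_dirs and not revisit:
                continue                            # already seen, no refresh asked
            visited_dirs.add(real)
            # os.path.join(path, '') appends the separator, so the next
            # glob matches everything inside the subdirectory.
            new_files.extend(walk_pattern(os.path.join(path, ''), recurse,
                                          revisit, known_files, visited_dirs))
        elif path not in known_files:
            known_files.add(path)
            new_files.append(path)
    return new_files
```

Calling it twice with the same sets returns an empty list the second time; a file added later to an already-visited subdirectory is only picked up when revisit is true, which is the refresh cost the note above refers to.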

__add__(self, other)
(Addition operator)

Merge two datasets (e.g., from different subtrees)

Not sure yet if this is necessary; maybe we want to enter the subtrees and see what is available?
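One way such a merge could work is sketched below. `FileCollection` is a hypothetical stand-in, not the Datasets class itself, and its `files` attribute (a mapping from a dataset key to `{sequence number: filename}`) is an assumption about the internal layout. The sketch returns a new object and leaves both operands unchanged.

```python
class FileCollection:
    """Hypothetical stand-in for Datasets: dataset key -> {seq: filename}."""

    def __init__(self, files=None):
        # Copy the inner dicts so merged results never share state.
        self.files = {key: dict(seqs) for key, seqs in (files or {}).items()}

    def __add__(self, other):
        if not isinstance(other, FileCollection):
            return NotImplemented
        merged = FileCollection(self.files)
        for key, seqs in other.files.items():
            # Files from `other` win if both sides hold the same sequence number.
            merged.files.setdefault(key, {}).update(seqs)
        return merged


left = FileCollection({('scan', '.dat'): {1: 'x/scan001.dat'}})
right = FileCollection({('scan', '.dat'): {2: 'y/scan002.dat'},
                        ('bkg', '.dat'): {1: 'y/bkg001.dat'}})
both = left + right
```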


Class Variable Details

pattern

Value:
re.compile(r'^(?P<name>[^\.]*?)(?P<seq>[0-9]{1,3})?(?P<junk>[^\.0-9]*?)(?P<ext>\..*)?$')
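To see how this pattern splits a filename, the short example below applies it directly. The helper `split_filename` and the sample filenames are made up for illustration; the regular expression is the value shown above.

```python
import re

# Value of the `pattern` class variable, joined onto one logical line.
pattern = re.compile(
    r'^(?P<name>[^\.]*?)(?P<seq>[0-9]{1,3})?(?P<junk>[^\.0-9]*?)(?P<ext>\..*)?$'
)


def split_filename(filename):
    """Return the (name, seq, junk, ext) groups, or None if nothing matches."""
    match = pattern.match(filename)
    if match is None:
        return None
    return match.group('name', 'seq', 'junk', 'ext')


print(split_filename('scan0012.dat'))   # ('scan0', '012', '', '.dat')
print(split_filename('a12b.dat'))       # ('a', '12', 'b', '.dat')
```

In the first case only the last three digits form the sequence number, so the leading `0` is left in the name. In the second, the non-digit characters between the sequence number and the extension are captured by the junk group.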