nexusLIMS.schemas package
Schema tools for NexusLIMS.
Submodules
nexusLIMS.schemas.activity module
The “Acquisition Activity” module.
Provides a class to represent and operate on an Acquisition Activity (as defined by the NexusLIMS Experiment schema), as well as a helper method to cluster a list of filenames by the files’ modification times.
- class nexusLIMS.schemas.activity.AcquisitionActivity(start=None, end=None, mode='', unique_params=None, setup_params=None, unique_meta=None, files=None, previews=None, meta=None, warnings=None)
Bases: object
A collection of files/metadata attributed to a physical acquisition activity.
Instances of this class correspond to AcquisitionActivity nodes in the NexusLIMS schema.
- Parameters
start (datetime) – The start point of this AcquisitionActivity
end (datetime) – The end point of this AcquisitionActivity
mode (str) – The microscope mode for this AcquisitionActivity (e.g. ‘IMAGING’, ‘DIFFRACTION’, ‘SCANNING’)
unique_params (set) – A set of dictionary keys that comprises all unique metadata keys contained within the files of this AcquisitionActivity
setup_params (dict) – A dictionary containing metadata about the data that is shared amongst all data files in this AcquisitionActivity
unique_meta (list) – A list of dictionaries (one for each file in this AcquisitionActivity) containing metadata key-value pairs that are unique to each file in files (i.e. those that could not be moved into setup_params)
files (list) – A list of filenames belonging to this AcquisitionActivity
previews (list) – A list of filenames pointing to the previews for each file in files
meta (list) – A list of dictionaries containing the “important” metadata for each file in files
warnings (list) – A list of metadata values that may be untrustworthy because of the software
- add_file(fname: Path, *, generate_preview=True)
Add file to AcquisitionActivity.
Add a file to this activity’s file list, parse its metadata (storing a flattened copy of it in this activity), generate a preview thumbnail, and get the file’s type and a lazy HyperSpy signal.
- as_xml(seqno, sample_id)
Translate AcquisitionActivity to an XML representation.
Build an XML (lxml) representation of this AcquisitionActivity (for use in instances of the NexusLIMS schema).
- Parameters
- Returns
activity_xml – A string representing this AcquisitionActivity (note: this is not a well-formed, complete XML document, since it has no header or namespace definitions)
- Return type
- store_setup_params(values_to_search=None)
Store common metadata keys as “setup parameters”.
Search the metadata of the files in this AcquisitionActivity for entries whose values are identical across all files. These entries are then treated as parameters of the experimental setup, rather than of individual datasets.
Stores a dictionary containing the metadata keys and values that are consistent across all files in this AcquisitionActivity as an attribute (self.setup_params).
- Parameters
values_to_search (list) – A list (or tuple, set, or other iterable type) containing values to search for in the metadata dictionary list. If None (default), all values contained in any file will be searched.
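The idea of separating shared "setup" metadata from per-file metadata can be illustrated with a small standalone sketch. This is not the NexusLIMS implementation: the function name split_setup_params and its keys_to_search argument are hypothetical, and the sketch assumes that the values being searched are metadata keys and that each file's metadata is a flat dictionary.

```python
def split_setup_params(file_metadata, keys_to_search=None):
    """Split per-file metadata into shared and file-specific parts.

    Hypothetical helper for illustration only (not part of NexusLIMS).
    `file_metadata` is a list of flat dicts, one per file.
    """
    if not file_metadata:
        return {}, []

    # Default: consider every key that appears in any file.
    if keys_to_search is None:
        keys = set().union(*file_metadata)
    else:
        keys = set(keys_to_search)

    first = file_metadata[0]
    # A key is a "setup parameter" only if every file has it with the same value.
    setup = {
        key: first[key]
        for key in keys
        if all(key in meta and meta[key] == first[key] for meta in file_metadata)
    }
    # Whatever is left over stays attached to the individual file.
    unique = [
        {key: value for key, value in meta.items() if key not in setup}
        for meta in file_metadata
    ]
    return setup, unique


example_meta = [
    {"Voltage": 200, "Mode": "IMAGING", "Exposure": 0.5},
    {"Voltage": 200, "Mode": "IMAGING", "Exposure": 1.0},
]
setup_params, unique_meta = split_setup_params(example_meta)
```

In this example, Voltage and Mode end up in setup_params because both files agree on them, while Exposure remains in each file's entry in unique_meta.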
- nexusLIMS.schemas.activity.cluster_filelist_mtimes(filelist: List[str]) → List[float]
Cluster a list of files by modification time.
Perform a statistical clustering of the timestamps (mtime values) of a list of files to find “relatively” large gaps in acquisition time. The definition of relatively depends on the context of the entire list of files. For example, if many files are simultaneously acquired, the “inter-file” time spacing between these will be very small (near zero), meaning even fairly short gaps between files may be important. Conversely, if files are saved every 30 seconds or so, the tolerance for a “large gap” will need to be correspondingly larger.
The approach this method uses is to detect minima in the Kernel Density Estimation (KDE) of the file modification times. To determine the optimal bandwidth parameter to use in KDE, a grid search over possible appropriate bandwidths is performed, using Leave One Out cross-validation. This approach allows the method to determine the important gaps in file acquisition times with sensitivity controlled by the distribution of the data itself, rather than a pre-supposed optimum. The KDE minima approach was suggested here.
- Parameters
filelist (List[str]) – The files (as a list) whose timestamps will be interrogated to find “relatively” large gaps in acquisition time (as a means to find the breaks between discrete Acquisition Activities)
- Returns
aa_boundaries – A list of the mtime values that represent boundaries between discrete Acquisition Activities
- Return type
List[float]
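The KDE-minima approach described above can be sketched with the standard library alone. This is a minimal illustration under stated assumptions, not the NexusLIMS code: the function names, the candidate bandwidths, and the evaluation grid size are hypothetical, a Gaussian kernel is assumed, and the sketch takes timestamps directly rather than reading mtime values from files.

```python
import math


def _log_kde(x, points, bandwidth, skip=None):
    """Log of a Gaussian KDE at x, optionally leaving out the point at index `skip`."""
    exponents = [
        -0.5 * ((x - p) / bandwidth) ** 2
        for i, p in enumerate(points)
        if i != skip
    ]
    peak = max(exponents)  # log-sum-exp avoids underflow far from the data
    log_sum = peak + math.log(sum(math.exp(e - peak) for e in exponents))
    return log_sum - math.log(len(exponents) * bandwidth * math.sqrt(2 * math.pi))


def find_time_boundaries(mtimes, bandwidths, n_grid=500):
    """Return grid positions where the KDE of `mtimes` has a local minimum.

    Hypothetical sketch: the bandwidth is picked from `bandwidths` by
    maximizing the leave-one-out log-likelihood. Needs at least two timestamps.
    """
    points = sorted(mtimes)

    def loo_score(bandwidth):
        # Score each point under a KDE built from all the other points.
        return sum(
            _log_kde(p, points, bandwidth, skip=i) for i, p in enumerate(points)
        )

    best_bandwidth = max(bandwidths, key=loo_score)

    # Evaluate the log-density on an evenly spaced grid spanning the data.
    lo, hi = points[0], points[-1]
    xs = [lo + (hi - lo) * k / (n_grid - 1) for k in range(n_grid)]
    ys = [_log_kde(x, points, best_bandwidth) for x in xs]

    # Interior grid points lower than the left neighbor and no higher than the right one.
    return [
        xs[k]
        for k in range(1, n_grid - 1)
        if ys[k] < ys[k - 1] and ys[k] <= ys[k + 1]
    ]


# Two bursts of files separated by a long pause (times in seconds).
example_mtimes = [0.0, 1.0, 2.0, 3.0, 4.0, 100.0, 101.0, 102.0]
boundaries = find_time_boundaries(example_mtimes, bandwidths=[1.0, 2.0, 5.0, 10.0, 30.0])
```

For this input the sketch reports a single boundary inside the long pause between the two bursts. Working with the log-density rather than the density itself keeps the wide, nearly empty gap from underflowing to zero, which would otherwise hide the minimum.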