nexusLIMS package

The NexusLIMS back-end software.

This module contains the software required to monitor a database for sessions logged by users on instruments that are part of the NIST Electron Microscopy Nexus Facility. Based on this information, records representing individual experiments are automatically generated and uploaded to the front-end NexusLIMS CDCS instance for users to browse, query, and edit.

Example

In most cases, the only code that needs to be run directly is the record builder, which looks for new sessions. Start it by running the record_builder module directly:

$ python -m nexusLIMS.builder.record_builder

Refer to Record building workflow for more details.

Configuration variables

The following variables should be defined as environment variables in your session, or in the .env file in the root of this package’s repository. See the .env.example file for more documentation and examples.

NexusLIMS_file_strategy

Defines the strategy used to find files associated with experimental records. A value of exclusive will only add files for which NexusLIMS knows how to generate preview images and extract metadata. A value of inclusive will include all files found, even if preview generation/detailed metadata extraction is not possible.

NexusLIMS_ignore_patterns

The patterns defined in this variable (which should be provided as a JSON-formatted string) will be ignored when finding files. A default value is provided in the .env.example file that should work for most users, but this setting allows for further customization of the file-finding routine.
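
To illustrate what a JSON-formatted value looks like, the sketch below parses a list of patterns and uses it to filter file names. The pattern values and the shell-style matching via fnmatch are assumptions made for this example; the .env.example file holds the real default, and NexusLIMS may apply the patterns differently.

```python
import json
from fnmatch import fnmatch

# Hypothetical value, formatted as a JSON array of strings
raw_value = '["*.mib", "*.db", "*.emi"]'
patterns = json.loads(raw_value)


def is_ignored(filename, ignore_patterns):
    """Return True if filename matches any of the ignore patterns."""
    return any(fnmatch(filename, pattern) for pattern in ignore_patterns)


files = ["image_001.dm3", "scan.mib", "Thumbs.db"]
kept = [f for f in files if not is_ignored(f, patterns)]
print(kept)
```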

nexusLIMS_user

The username used to authenticate to SharePoint calendar resources and CDCS.

nexusLIMS_pass

The password used to authenticate to SharePoint calendar resources and CDCS.

mmfnexus_path

The path (which should already be mounted) to the root folder containing data from the Electron Microscopy Nexus. This folder is accessible read-only, and it is where the instruments in the Electron Microscopy Nexus write their data. The file paths for specific instruments (specified in the NexusLIMS database) are relative to this root.

nexusLIMS_path

The root path used by NexusLIMS for various needs. This folder is used to store the NexusLIMS database, generated records, individual file metadata dumps and preview images, and anything else that is needed by the back-end system.

nexusLIMS_db_path

The direct path to the NexusLIMS SQLite database file that contains information about the instruments in the Nexus Facility, as well as logs for the sessions created by users using the Session Logger Application.

NEMO_address_X

The path to a NEMO instance’s API endpoint. Should be something like https://www.nemo.com/api/ (make sure to include the trailing slash). The value _X can be replaced with any value (such as NEMO_address_1). NexusLIMS supports having multiple NEMO reservation systems enabled at once (useful if your instruments are split over a few different management systems). To enable this behavior, create multiple pairs of environment variables for each instance, where the suffix _X changes for each pair (e.g. you could have NEMO_address_1 paired with NEMO_token_1, NEMO_address_2 paired with NEMO_token_2, etc.).

NEMO_token_X

An API authentication token from the corresponding NEMO installation (specified in NEMO_address_X) that will be used to authorize requests to the NEMO API. This token can be obtained by visiting the “Detailed Administration” page in the NEMO instance, and then creating a new token under the “Tokens” menu. Note that this token will authenticate as a particular user, so you may wish to set up a “dummy” or “functional” user account in the NEMO instance for these operations.
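
The pairing of NEMO_address_X and NEMO_token_X variables by a shared suffix can be pictured with the following sketch. The function name collect_nemo_instances, the example URLs, and the token values are all hypothetical; this is not how NexusLIMS itself reads the variables, only one way the convention could be interpreted.

```python
import re


def collect_nemo_instances(environ):
    """Group NEMO_address_X / NEMO_token_X variables by their shared suffix."""
    instances = {}
    for name, value in environ.items():
        match = re.fullmatch(r"NEMO_address_(.+)", name)
        if match is None:
            continue
        suffix = match.group(1)
        token = environ.get(f"NEMO_token_{suffix}")
        if token is not None:  # skip addresses that have no matching token
            instances[suffix] = {"address": value, "token": token}
    return instances


# Hypothetical environment describing two complete NEMO servers
example_env = {
    "NEMO_address_1": "https://nemo1.example.com/api/",
    "NEMO_token_1": "token-one",
    "NEMO_address_2": "https://nemo2.example.com/api/",
    "NEMO_token_2": "token-two",
    "NEMO_address_3": "https://nemo3.example.com/api/",  # no token: ignored
}
instances = collect_nemo_instances(example_env)
print(sorted(instances))
```

In practice the mapping passed in would be os.environ rather than a hand-built dictionary.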

NEMO_strftime_fmt_X and NEMO_strptime_fmt_X

These optional settings control how dates/times are sent to (strftime) and interpreted from (strptime) the API. If “strftime_fmt” and/or “strptime_fmt” are not provided, the standard ISO 8601 format for datetime representation will be used (which should work with the default NEMO settings). These options are configurable to allow for support of non-default date format settings on a NEMO server. The formats should be provided using the standard datetime library syntax for encoding date and time information (see strftime() and strptime() Behavior for details).
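
As a small illustration of the format syntax, the sketch below formats and parses a timestamp with a custom format string and compares it with the ISO 8601 default. The format string shown is a hypothetical example of a non-default server setting, not a value taken from NEMO or NexusLIMS.

```python
from datetime import datetime

# Hypothetical non-default format configured on a NEMO server
custom_fmt = "%m-%d-%Y %H:%M:%S"

moment = datetime(2022, 3, 14, 9, 30, 0)

# Default behaviour: ISO 8601 text
iso_text = moment.isoformat()

# Custom behaviour: format outgoing values and parse incoming ones
sent = moment.strftime(custom_fmt)
received = datetime.strptime("03-14-2022 09:30:00", custom_fmt)

print(iso_text)
print(sent)
print(received == moment)
```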

NEMO_tz_X

Also optional. If the “tz” option is provided, the datetime strings received from the NEMO API will be coerced into the given timezone. The timezone should be specified using the IANA “tz database” name (see https://en.wikipedia.org/wiki/List_of_tz_database_time_zones). This option should not be supplied for NEMO servers that return time zone information in their API response, since it will override the timezone of the returned data. It is mostly useful for servers that return reservation/usage event times without any timezone information. Providing it helps properly map file creation times to usage event times.

Subpackages

Submodules

nexusLIMS.cdcs module

A module to handle the uploading of previously-built XML records to a CDCS instance.

See https://github.com/usnistgov/NexusLIMS-CDCS for more details on the NexusLIMS customizations to the CDCS application.

This module can also be run directly to upload records to a CDCS instance without invoking the record builder.

nexusLIMS.cdcs.cdcs_url() -> str

Return the url to the NexusLIMS CDCS instance by fetching it from the environment.

Returns

url – The URL of the NexusLIMS CDCS instance to use

Return type

str

Raises

ValueError – If the cdcs_url environment variable is not defined

nexusLIMS.cdcs.delete_record(record_id)

Delete a Data record from the NexusLIMS CDCS instance via REST API.

Parameters

record_id (str) – The id value (on the CDCS server) of the record to be deleted

Returns

response – The REST response returned from the CDCS instance after attempting the delete operation

Return type

Response

nexusLIMS.cdcs.get_template_id()

Get the template ID for the schema (so the record can be associated with it).

Returns

template_id – The template ID

Return type

str

nexusLIMS.cdcs.get_workspace_id()

Get the workspace ID that the user has access to.

This should be the Global Public Workspace in the current NexusLIMS CDCS implementation.

Returns

workspace_id – The workspace ID

Return type

str

nexusLIMS.cdcs.upload_record_content(xml_content, title)

Upload a single XML record to the NexusLIMS CDCS instance.

Parameters
  • xml_content (str) – The actual content of an XML record (rather than a file)

  • title (str) – The title to give to the record in CDCS

Returns

  • post_r (Response) – The REST response returned from the CDCS instance after attempting the upload

  • record_id (str) – The id (on the server) of the record that was uploaded

nexusLIMS.cdcs.upload_record_files(files_to_upload: Optional[List[Path]], *, progress: bool = False) -> List[Path]

Upload record files to CDCS.

Upload a list of .xml files (or all .xml files in the current directory) to the NexusLIMS CDCS instance using upload_record_content().

Parameters
  • files_to_upload (Optional[List[Path]]) – The list of .xml files to upload. If None, all .xml files in the current directory will be used instead.

  • progress (bool) – Whether to show a progress bar for uploading

Returns

  • files_uploaded (list of pathlib.Path) – A list of the files that were successfully uploaded

  • record_ids (list of str) – A list of the record id values (on the server) that were uploaded

nexusLIMS.instruments module

Methods and representations for instruments in a NexusLIMS system.

nexusLIMS.instruments.instrument_db

A dictionary of Instrument objects.

Each object in this dictionary represents an instrument detected in the NexusLIMS remote database.

Type

dict

class nexusLIMS.instruments.Instrument(api_url=None, calendar_name=None, calendar_url=None, location=None, name=None, schema_name=None, property_tag=None, filestore_path=None, computer_ip=None, computer_name=None, computer_mount=None, harvester=None, timezone=None)

Bases: object

Representation of a NexusLIMS instrument.

A simple object to hold information about an instrument in the Microscopy Nexus facility, fetched from the external NexusLIMS database.

Parameters
  • api_url (str or None) – The calendar API endpoint url for this instrument’s scheduler

  • calendar_name (str or None) – The “user-friendly” name of the calendar for this instrument as displayed on the reservation system resource (e.g. “FEI Titan TEM”)

  • calendar_url (str or None) – The URL to this instrument’s web-accessible calendar on the SharePoint resource (if using)

  • location (str or None) – The physical location of this instrument (building and room number)

  • name (str or None) – The unique identifier for an instrument in the facility, currently (but not required to be) built from the make, model, and type of instrument, plus a unique numeric code (e.g. FEI-Titan-TEM-635816)

  • schema_name (str or None) – The human-readable name of the instrument as defined in the Nexus Microscopy schema and displayed in the records

  • property_tag (str or None) – A unique numeric identifier for this instrument (not used by NexusLIMS, but for reference and potential future use)

  • filestore_path (str or None) – The path (relative to central storage location specified in mmfnexus_path) where this instrument stores its data (e.g. ./Titan)

  • computer_name (str or None) – The hostname of the support PC connected to this instrument that runs the Session Logger App. If this is incorrect (or not included), the logger application will fail when attempting to start a session from the microscope (only relevant if using the Session Logger App)

  • computer_ip (str or None) – The IP address of the support PC connected to this instrument (not currently utilized)

  • computer_mount (str or None) – The full path where the central file storage is mounted and files are saved on the ‘support PC’ for the instrument (e.g. ‘M:/’; only relevant if using the Session Logger App)

  • harvester (str or None) – The specific submodule within nexusLIMS.harvesters that should be used to harvest reservation information for this instrument. At the time of writing, the only possible values are nemo or sharepoint_calendar.

  • timezone (timezone, str, or None) – The timezone in which this instrument is located, in the format of the IANA timezone database (e.g. America/New_York). This is used to properly localize dates and times when communicating with the harvester APIs.

localize_datetime(_dt: datetime) -> datetime

Localize a datetime to an Instrument’s timezone.

Convert a date and time to the timezone of this instrument. If the supplied datetime is naive (i.e. does not have a timezone), it will be assumed to already be in the timezone of the instrument, and the displayed time will not change. If the timezone of the supplied datetime is different than the instrument’s, the time will be adjusted to compensate for the timezone offset.

Parameters

_dt – The datetime object to localize

Returns

A datetime object with the same timezone as the instrument

Return type

datetime
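
The two cases described above (naive versus timezone-aware input) can be sketched with the standard library alone. This is an illustration of the described behaviour, not the method's actual implementation: the helper name localize is hypothetical, and a fixed UTC-5 offset stands in for an IANA timezone such as America/New_York to keep the example free of timezone-database dependencies.

```python
from datetime import datetime, timedelta, timezone

# A fixed UTC-5 offset stands in for the instrument's IANA timezone
instrument_tz = timezone(timedelta(hours=-5))


def localize(dt, tz):
    """Return dt expressed in tz, treating naive datetimes as already in tz."""
    if dt.tzinfo is None:
        return dt.replace(tzinfo=tz)  # displayed time is unchanged
    return dt.astimezone(tz)  # displayed time shifts by the offset difference


naive = datetime(2021, 11, 26, 12, 0)
aware_utc = datetime(2021, 11, 26, 12, 0, tzinfo=timezone.utc)

print(localize(naive, instrument_tz))      # still 12:00, now tagged UTC-5
print(localize(aware_utc, instrument_tz))  # 07:00 at UTC-5
```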

localize_datetime_str(_dt: datetime, fmt: str = '%Y-%m-%d %H:%M:%S %Z') -> str

Localize a datetime to an Instrument’s timezone and return as string.

Convert a date and time to the timezone of this instrument, returning a textual representation of the object, rather than the datetime itself. Uses localize_datetime() for the actual conversion.

Parameters
  • _dt – The datetime object to localize

  • fmt – The strftime format string to use to format the output

Returns

The formatted textual representation of the localized datetime

Return type

str

nexusLIMS.instruments.get_instr_from_api_url(api_url: str) -> Optional[Instrument]

Get an instrument object from the NexusLIMS database by its api_url.

Parameters

api_url – An api_url (e.g. “FEITitanTEMEvents”) that will be used to search for a matching instrument in the api_url values

Returns

An Instrument instance matching the api_url, or None if no match was found

Return type

Instrument

Examples

>>> inst = get_instr_from_api_url('https://nemo.url.com/api/tools/?id=1')
>>> str(inst)
'FEI-Titan-STEM-630901_n in xxx/xxxx'

nexusLIMS.instruments.get_instr_from_calendar_name(cal_name)

Get an instrument object from the NexusLIMS database by its calendar name.

Parameters

cal_name (str) – A calendar name (e.g. “FEITitanTEMEvents”) that will be used to search for a matching instrument in the api_url values

Returns

instrument – An Instrument instance matching the calendar name, or None if no match was found

Return type

Instrument or None

Examples

>>> inst = get_instr_from_calendar_name('FEITitanTEMEvents')
>>> str(inst)
'FEI-Titan-TEM-635816 in ***REMOVED***'

nexusLIMS.instruments.get_instr_from_filepath(path: Path)

Get an instrument object by a given path using the NexusLIMS database.

Parameters

path – A path (relative or absolute) to a file saved in the central filestore that will be used to search for a matching instrument

Returns

instrument – An Instrument instance matching the path, or None if no match was found

Return type

Instrument or None

Examples

>>> inst = get_instr_from_filepath('/path/to/file.dm3')
>>> str(inst)
'FEI-Titan-TEM-635816 in xxx/xxxx'

nexusLIMS.utils module

Utility functions used in potentially multiple places by NexusLIMS.

exception nexusLIMS.utils.AuthenticationError(message)

Bases: Exception

Class for showing an exception having to do with authentication.

nexusLIMS.utils.current_system_tz()

Get the current system timezone information.

nexusLIMS.utils.find_dirs_by_mtime(path: str, dt_from: datetime, dt_to: datetime, *, followlinks: bool = True) -> List[str]

Find directories modified between two times.

Given two timestamps, find the directories under a path that were last modified between the two.

Deprecated since version 0.0.9: find_dirs_by_mtime is not recommended for finding files for record inclusion. A later modification to a directory (e.g. the user wrote a text file or did some analysis afterwards) changes its modification time, so the directory falls outside the search window, is not searched, and none of its files are returned.

Parameters
  • path – The root path from which to start the search

  • dt_from – The “starting” point of the search timeframe

  • dt_to – The “ending” point of the search timeframe

  • followlinks – Argument passed on to os.walk() to control whether symbolic links are followed

Returns

dirs – A list of the directories that have modification times within the time range provided

Return type

list

nexusLIMS.utils.find_files_by_mtime(path: Path, dt_from, dt_to) -> List[Path]

Find files modified between two times.

Given two timestamps, find files under a path that were last modified between the two.

Parameters
  • path – The root path from which to start the search

  • dt_from (datetime) – The “starting” point of the search timeframe

  • dt_to (datetime) – The “ending” point of the search timeframe

Returns

files – A list of the files that have modification times within the time range provided (sorted by modification time)

Return type

list
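
A pure-Python search of this kind can be sketched as follows. The helper name files_modified_between and the inclusive bounds are assumptions made for illustration; the actual function may treat the endpoints or symbolic links differently.

```python
import os
import tempfile
from datetime import datetime, timedelta, timezone
from pathlib import Path


def files_modified_between(root, dt_from, dt_to):
    """List files under root with dt_from <= mtime <= dt_to, oldest first."""
    t_from, t_to = dt_from.timestamp(), dt_to.timestamp()
    matches = [
        p for p in Path(root).rglob("*")
        if p.is_file() and t_from <= p.stat().st_mtime <= t_to
    ]
    return sorted(matches, key=lambda p: p.stat().st_mtime)


with tempfile.TemporaryDirectory() as tmp:
    now = datetime.now(timezone.utc)
    for name, age_hours in [("old.dm3", 48), ("recent.dm3", 2), ("newer.tif", 1)]:
        path = Path(tmp) / name
        path.write_text("data")
        mtime = (now - timedelta(hours=age_hours)).timestamp()
        os.utime(path, (mtime, mtime))  # backdate the modification time
    found = files_modified_between(tmp, now - timedelta(hours=3), now)
    found_names = [p.name for p in found]

print(found_names)
```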

nexusLIMS.utils.get_auth(filename: Optional[Path] = None, *, basic: bool = False)

Get an authentication scheme for NexusLIMS requests.

Set up NTLM authentication for the Microscopy Nexus using an account as specified from a file that lives in the package root named .credentials (or some other value provided as a parameter). Alternatively, the stored credentials can be overridden by supplying two environment variables: nexusLIMS_user and nexusLIMS_pass. These variables will be queried first, and if not found, the method will attempt to use the credential file.

Parameters
  • filename (str) – Name relative to this file (or absolute path) of file from which to read the parameters

  • basic (bool) – If True, return only username and password rather than NTLM authentication

Returns

auth – NTLM authentication handler for requests, or a (username, password) tuple if basic is True

Return type

requests_ntlm.HttpNtlmAuth or tuple

Notes

The credentials file is expected to have a section named [nexus_credentials] and two values: username and password. See the credentials.ini.example file included in the repository as an example.

nexusLIMS.utils.get_nested_dict_key(nested_dict, key_to_find, prepath=())

Get a key from nested dictionaries.

Use a recursive method to find a key in a dictionary of dictionaries (such as the metadata dictionaries we receive from the file parsers). Cribbed from: https://stackoverflow.com/a/22171182/1435788.

Parameters
  • nested_dict (dict) – Dictionary to search

  • key_to_find (object) – Key to search for

  • prepath (tuple) – “path” to prepend to the search to limit the search to only part of the dictionary

Returns

path – The “path” through the dictionary (expressed as a tuple of keys) where the key was found. If None, the key was not found in the dictionary.

Return type

tuple or None

nexusLIMS.utils.get_nested_dict_value(nested_dict, value, prepath=())

Get a value from nested dictionaries.

Use a recursive method to find a value in a dictionary of dictionaries (such as the metadata dictionaries we receive from the file parsers). Cribbed from: https://stackoverflow.com/a/22171182/1435788.

Parameters
  • nested_dict (dict) – Dictionary to search

  • value (object) – Value to search for

  • prepath (tuple) – “path” to prepend to the search to limit the search to only part of the dictionary

Returns

path – The “path” through the dictionary (expressed as a tuple of keys) where value was found. If None, the value was not found in the dictionary.

Return type

tuple or None
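
The recursive search described above can be sketched like this. The helper name find_value_path and the sample metadata are hypothetical, and here prepath simply accumulates the keys visited so far; the real function may use it differently.

```python
def find_value_path(nested, target, prepath=()):
    """Return the tuple of keys leading to target, or None if it is absent."""
    for key, value in nested.items():
        path = (*prepath, key)
        if value == target:
            return path
        if isinstance(value, dict):
            found = find_value_path(value, target, path)
            if found is not None:
                return found
    return None


# Hypothetical metadata dictionary, as might come from a file parser
metadata = {"ImageList": {"1": {"Name": "survey", "Dose": 12.5}}, "Mode": "TEM"}
print(find_value_path(metadata, 12.5))
print(find_value_path(metadata, "missing"))
```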

nexusLIMS.utils.get_nested_dict_value_by_path(nest_dict, path)

Get a nested dictionary value by path.

Get the value from within a nested dictionary structure by traversing into the dictionary along the given path and returning the value found there.

Parameters
  • nest_dict (dict) – A dictionary of dictionaries that is to be queried

  • path (tuple) – A tuple (or other iterable type) that specifies the subsequent keys needed to get to a value within nest_dict

Returns

value – The value at the path within the nested dictionary; if there’s no value there, return the string “not found”

Return type

object or str
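
A lookup with the documented “not found” fallback could look like the sketch below. The helper name value_at_path is hypothetical and the choice of functools.reduce is an assumption; only the fallback string comes from the description above.

```python
from functools import reduce


def value_at_path(nested, path):
    """Follow the keys in path through nested; return "not found" on a miss."""
    try:
        return reduce(lambda current, key: current[key], path, nested)
    except (KeyError, TypeError):
        # KeyError: a key is missing; TypeError: the path runs past a leaf value
        return "not found"


metadata = {"Microscope": {"Voltage": 300000, "Mode": "TEM"}}
print(value_at_path(metadata, ("Microscope", "Voltage")))
print(value_at_path(metadata, ("Microscope", "Magnification")))
```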

nexusLIMS.utils.get_timespan_overlap(range_1: Tuple[datetime, datetime], range_2: Tuple[datetime, datetime]) -> timedelta

Find the amount of overlap between two time spans.

Adapted from https://stackoverflow.com/a/9044111.

Parameters
  • range_1 – Tuple of length 2 of datetime objects: first is the start of the time range and the second is the end of the time range

  • range_2 – Tuple of length 2 of datetime objects: first is the start of the time range and the second is the end of the time range

Returns

The amount of overlap between the time ranges

Return type

timedelta
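
The overlap of two time ranges is the span between the later of the two starts and the earlier of the two ends. The sketch below shows that calculation; the helper name overlap is hypothetical, and returning a zero timedelta for ranges that do not intersect is an assumption about the behaviour.

```python
from datetime import datetime, timedelta


def overlap(range_1, range_2):
    """Return the overlap of two (start, end) ranges as a non-negative timedelta."""
    latest_start = max(range_1[0], range_2[0])
    earliest_end = min(range_1[1], range_2[1])
    return max(earliest_end - latest_start, timedelta(0))


session = (datetime(2022, 1, 5, 9, 0), datetime(2022, 1, 5, 12, 0))
reservation = (datetime(2022, 1, 5, 11, 0), datetime(2022, 1, 5, 14, 0))
next_day = (datetime(2022, 1, 6, 9, 0), datetime(2022, 1, 6, 10, 0))

print(overlap(session, reservation))  # one hour in common
print(overlap(session, next_day))     # no time in common
```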

nexusLIMS.utils.gnu_find_files_by_mtime(path: Path, dt_from: datetime, dt_to: datetime, extensions: Optional[List[str]] = None, *, followlinks: bool = True) -> List[Path]

Find files modified between two times.

Given two timestamps, find files under a path that were last modified between the two. Uses the system-provided GNU find command. In basic testing, this method was found to be approximately 3 times faster than using find_files_by_mtime() (which is implemented in pure Python).

Parameters
  • path – The root path from which to start the search, relative to the mmfnexus_path environment setting.

  • dt_from – The “starting” point of the search timeframe

  • dt_to – The “ending” point of the search timeframe

  • extensions – A list of strings representing the extensions to find. If None, all files modified between the two times are found.

  • followlinks – Whether to follow symlinks using the find command via the -H command line flag. This is useful when the mmfnexus_path is actually a directory of symlinks. If this is the case and followlinks is False, no files will ever be found because the find command will not “dereference” the symbolic links it finds. See the comments in the code for more details on the implementation of this feature.

Returns

A list of the files that have modification times within the time range provided (sorted by modification time)

Return type

List[str]

Raises

RuntimeError – If the find command cannot be found, or running it results in output to stderr

nexusLIMS.utils.has_delay_passed(date: datetime) -> bool

Check if the current time is greater than the configured delay.

Check whether the current time is later than the given date plus the record-building delay, which is set by the nexusLIMS_file_delay_days environment variable (or a default if that variable is not set). If the date given is timezone-aware, the current time in that timezone will be compared.

Parameters

date – The datetime to check; can be either timezone aware or naive

Returns

Whether the current time is greater than the given date plus the configurable delay.

Return type

bool
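
The comparison can be sketched as below. The helper name delay_has_passed and its delay_days and now parameters are hypothetical: delay_days stands in for the value read from nexusLIMS_file_delay_days, and now exists only so the example is reproducible.

```python
from datetime import datetime, timedelta, timezone


def delay_has_passed(date, delay_days, now=None):
    """Return True if now is later than date plus delay_days."""
    if now is None:
        # datetime.now(None) is naive; passing a tzinfo gives an aware "now"
        now = datetime.now(date.tzinfo)
    return now > date + timedelta(days=delay_days)


session_end = datetime(2022, 1, 1, 17, 0)
print(delay_has_passed(session_end, 2, now=datetime(2022, 1, 2, 17, 0)))
print(delay_has_passed(session_end, 2, now=datetime(2022, 1, 4, 17, 0)))
```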

nexusLIMS.utils.is_subpath(path: Path, of_paths: Union[Path, List[Path]])

Return if this path is a subpath of other paths.

Helper function to determine if a given path is a “subpath” of a set of paths. Useful to help determine which instrument a given file comes from, given the instrument’s filestore_path and the path of the file to test.

Parameters
  • path – The path of the file (or directory) to test. This will usually be the absolute path to a file on the local filesystem (to be compared using the host-specific mmf_nexus_root_path).

  • of_paths – The “higher-level” path to test against (or list thereof). In typical use, this will be an instrument’s filestore_path joined to the root-level mmf_nexus_root_path.

Returns

result – Whether or not path is a subpath of one of the directories in of_paths

Return type

bool

Examples

>>> is_subpath(Path('/path/to/file.dm3'),
...            Path(os.environ['mmfnexus_path']) /
...            titan.filestore_path)
True

nexusLIMS.utils.nexus_req(url: str, function: str, *, basic_auth: bool = False, token_auth: Optional[str] = None, **kwargs: Optional[dict])

Make a request from NexusLIMS.

A helper method that wraps a function from requests, but adds a local certificate authority chain to validate any custom certificates and allows authentication using NTLM. Will automatically retry on 500 errors using a strategy suggested here: https://stackoverflow.com/a/35636367.

Parameters
  • url – The URL to fetch

  • function – The function from the requests library to use (e.g. 'GET', 'POST', 'PATCH', etc.)

  • basic_auth – If True, use only username and password for authentication rather than NTLM

  • token_auth – If a value is provided, it will be used as a token for authentication. Only one of token_auth or basic_auth should be provided; the method will raise an error if both are provided.

  • **kwargs – Other keyword arguments are passed along to the requests function

Returns

r – A requests response object

Return type

requests.Response

Raises

ValueError – If multiple methods of authentication are provided to the function

nexusLIMS.utils.remove_dict_nones(dictionary: Dict[Any, Any]) -> Dict[Any, Any]

Delete keys with a value of None in a dictionary, recursively.

Taken from https://stackoverflow.com/a/4256027.

Parameters

dictionary – The dictionary from which keys with None values will be removed

Returns

The same dictionary, but with “Nones” removed

Return type

dict
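
Recursive removal of None values can be sketched as follows. The helper name drop_nones is hypothetical, and unlike the function documented above, which describes returning the same dictionary, this sketch builds a new dictionary and leaves its input untouched.

```python
def drop_nones(data):
    """Return a copy of data without keys whose value is None, at any depth."""
    return {
        key: drop_nones(value) if isinstance(value, dict) else value
        for key, value in data.items()
        if value is not None
    }


metadata = {"Voltage": 300000, "Operator": None, "Stage": {"X": 1.5, "Tilt": None}}
cleaned = drop_nones(metadata)
print(cleaned)
```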

nexusLIMS.utils.remove_dtb_element(tree, path)

Remove an element from a DictionaryTreeBrowser by setting it to None.

Helper method that sets a specific leaf of a DictionaryTreeBrowser to None. Use with remove_dict_nones() to fully remove the desired DTB element after setting it to None (after converting DTB to dictionary).

Parameters
  • tree (DictionaryTreeBrowser) – the DictionaryTreeBrowser object to remove the object from

  • path (str) – period-delimited path to a DTB element

Returns

tree

Return type

DictionaryTreeBrowser

nexusLIMS.utils.replace_mmf_path(path: Path, suffix: str) -> Path

Given an input “mmfnexus_path” path, generate equivalent “nexusLIMS_path” path.

If the given path is not a subpath of “mmfnexus_path”, a warning will be logged and the suffix will just be added at the end.

Parameters
  • path – The input path, which is expected to be a subpath of the mmfnexus_path directory

  • suffix – Any added suffix to add to the path (useful for appending with a new extension, such as .json)

Returns

A resolved pathlib.Path object pointing to the new path

Return type

Path
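
The path mapping described above can be sketched with pathlib. The helper name map_to_output_root and the two root directories are hypothetical stand-ins for the mmfnexus_path and nexusLIMS_path settings, and PurePosixPath is used only so the example behaves the same on every platform; the actual function returns a resolved Path.

```python
import logging
from pathlib import PurePosixPath

logger = logging.getLogger(__name__)


def map_to_output_root(path, data_root, output_root, suffix=""):
    """Re-root path from data_root onto output_root and append suffix."""
    path = PurePosixPath(path)
    try:
        target = PurePosixPath(output_root) / path.relative_to(data_root)
    except ValueError:
        # Not a subpath of data_root: warn and only append the suffix
        logger.warning("%s is not under %s", path, data_root)
        target = path
    return target.with_name(target.name + suffix)


new_path = map_to_output_root(
    "/mnt/mmfnexus/Titan/user/image.dm3",
    data_root="/mnt/mmfnexus",
    output_root="/mnt/nexuslims",
    suffix=".json",
)
print(new_path)
```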

nexusLIMS.utils.set_nested_dict_value(nest_dict, path, value)

Set a nested dictionary value by path.

Set a value within a nested dictionary structure by traversing into the dictionary along the given path and changing the entry found there to value. Cribbed from https://stackoverflow.com/a/13688108/1435788.

Parameters
  • nest_dict (dict) – A dictionary of dictionaries that is to be queried

  • path (tuple) – A tuple (or other iterable type) that specifies the subsequent keys needed to get to a value within nest_dict

  • value (object) – The value which will be given to the path in the nested dictionary

Returns

value – The value at the path within the nested dictionary

Return type

object
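
Setting a value by path can be sketched as below. The helper name set_at_path is hypothetical, and creating missing intermediate dictionaries with setdefault is an assumption; the actual function may require the path to exist already.

```python
def set_at_path(nested, path, value):
    """Set the entry reached by following path through nested to value."""
    *parents, last = path
    current = nested
    for key in parents:
        # Descend one level, creating an empty dictionary if the key is missing
        current = current.setdefault(key, {})
    current[last] = value


metadata = {"Microscope": {"Voltage": 200000}}
set_at_path(metadata, ("Microscope", "Voltage"), 300000)
set_at_path(metadata, ("Stage", "Tilt"), 15.0)
print(metadata)
```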

nexusLIMS.utils.setup_loggers(log_level)

Set logging level of all NexusLIMS loggers.

Parameters

log_level (int) – The level of logging, such as logging.DEBUG

nexusLIMS.utils.sort_dict(item)

Recursively sort a dictionary by keys.
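
A recursive key sort might look like the following sketch. The helper name sorted_by_key is hypothetical, and it relies on Python dictionaries preserving insertion order (guaranteed since Python 3.7); non-dictionary values are returned unchanged.

```python
def sorted_by_key(item):
    """Return item with dictionary keys sorted at every nesting level."""
    if isinstance(item, dict):
        return {key: sorted_by_key(item[key]) for key in sorted(item)}
    return item


unsorted = {"b": 1, "a": {"d": 2, "c": 3}}
result = sorted_by_key(unsorted)
print(result)
```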

nexusLIMS.utils.try_getting_dict_value(dict_, key)

Try to get a nested dictionary value.

This method will try to get a value from a dictionary (potentially nested) and fail silently if the value is not found, returning the string “not found”.

Parameters
  • dict_ (dict) – The dictionary from which to get a value

  • key (str or tuple) – The key to query, or if an iterable container type (tuple, list, etc.) is given, the path into a nested dictionary to follow

Returns

val – The value of the dictionary specified by key. If the dictionary does not have a key, returns the string “not found” without raising an error

Return type

object or str

nexusLIMS.version module

Keeps track of the current software version.