rmellipse.arrschema

Submodules

Exceptions

ValidationError

Common base class for all non-exit exceptions.

Classes

ArrSchemaRegistry

Dict subclass that stores schema information.

Functions

arrschema(name, shape, dims, dtype[, units, coords, ...])

Generate a dictionary that describes an array structure.

load(→ object)

Load a dataset using an array schema registry.

validate(arr, *, schema[, attach_schema])

Check if a DataArray conforms to a particular schema.

save(→ object)

Save a dataset using an array schema registry.

convert(→ Any)

Convert an input data to a new schema.

annotate(arr)

Generate an array annotation and attach it as metadata.

zeros(schema[, like, drop_mismatched_dims, attrs])

as_schema(array[, registry, schema, schema_uid, ...])

Cast an array into a schema.

Package Contents

class rmellipse.arrschema.ArrSchemaRegistry

Bases: dict

Dict subclass that stores schema information.

Initialize self. See help(type(self)) for accurate signature.

property schema
property loaders
property savers
show_schema()
show_loaders()
find_schema(schema_name: str = None, schema_uid: str = None) → dict

Find a schema.

Either the name or the uid must be provided. Raises an error if only the name is provided and multiple schemas in the registry share that name.

Parameters:
schema_name: str

Name of the schema to look for.

schema_uid: str

uid of the schema to find.

Returns:
dict

Requested schema.

Raises:
ValueError

If multiple schemas in the registry have the same name.
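
The lookup rule described above can be illustrated with a minimal sketch. It assumes schemas are plain dicts with "name" and "uid" keys held in a list, and it uses a hypothetical helper, find_schema_sketch. It is not the registry's actual implementation.

```python
def find_schema_sketch(schemas, schema_name=None, schema_uid=None):
    """Hypothetical stand-in: look up a schema dict by uid or by name."""
    if schema_name is None and schema_uid is None:
        raise ValueError("Provide a schema name or a uid.")
    if schema_uid is not None:
        # A uid is expected to identify exactly one schema.
        matches = [s for s in schemas if s["uid"] == schema_uid]
    else:
        # A name alone may be shared by several schemas.
        matches = [s for s in schemas if s["name"] == schema_name]
    if not matches:
        raise KeyError("No matching schema found.")
    if len(matches) > 1:
        raise ValueError("Multiple schemas match; use the uid instead.")
    return matches[0]


# Assumed example data: two schemas share the name "trace".
schemas = [
    {"name": "trace", "uid": "a1"},
    {"name": "trace", "uid": "b2"},
    {"name": "matrix", "uid": "c3"},
]
unique_by_name = find_schema_sketch(schemas, schema_name="matrix")
unique_by_uid = find_schema_sketch(schemas, schema_uid="b2")
```

Looking up "trace" by name alone raises ValueError in this sketch, which mirrors the documented behavior for ambiguous names.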

import_saver(schema_name=None, schema_uid: str = None, extension: str = None, saver_type: str = None, verbose: bool = False)

Get a saver function associated with a schema.

Parameters:
schema_name: str, optional

Name of the schema being saved, by default None

schema_uid: str, optional

UID of the schema being saved, by default None

extension: str, optional

File extension, by default None

saver_type: str, optional

Type of saver to use, by default None

verbose: bool, optional

Print info, by default False

Returns:
callable

Saver function imported from the registry

import_loader(schema_name=None, schema_uid: str = None, extension: str = None, loader_type: str = None, verbose: bool = False)

Get a loader function associated with a schema.

Parameters:
schema_name: str, optional

Name of the schema being loaded, by default None

schema_uid: str, optional

UID of the schema being loaded, by default None

extension: str, optional

File extension, by default None

loader_type: str, optional

Type of loader to use, by default None

verbose: bool, optional

Print info, by default False

Returns:
callable

Loader function imported from the registry

add_converter(funspec: str, input_schema: dict, output_schema: dict)

Add a converting function between two schemas.

Converting functions take exactly one argument and return exactly one output.

Parameters:
funspec: str

Path spec of function in dot-notation (module.submodule:function)

input_schema: dict

Schema that the converter accepts as input.

output_schema: dict

Schema that the converter produces as output.
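
To make the funspec format concrete, here is a minimal sketch of how a "module.submodule:function" string could be resolved to a callable with the standard library. The helper name resolve_funspec is hypothetical, and the registry's own import logic may differ.

```python
import importlib


def resolve_funspec(funspec):
    """Hypothetical helper: import the callable named by 'module.submodule:function'."""
    module_path, sep, func_name = funspec.partition(":")
    if not sep or not module_path or not func_name:
        raise ValueError(f"Expected 'module.submodule:function', got {funspec!r}")
    # Import the module part, then fetch the named attribute from it.
    module = importlib.import_module(module_path)
    func = getattr(module, func_name)
    if not callable(func):
        raise TypeError(f"{funspec!r} does not name a callable")
    return func


# Standard-library functions are used here purely for illustration.
to_text = resolve_funspec("json:dumps")
basename = resolve_funspec("os.path:basename")
```

Storing a path spec rather than the function object keeps the registry serializable, and the import only happens when the function is requested.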

import_converter(input_schema_name: str = None, input_schema_uid: str = None, output_schema_name: str = None, output_schema_uid: str = None) → callable

Import a converting function between two schemas.

Converting functions take exactly one argument and return exactly one output.

Parameters:
input_schema_name: str, optional

Name of the input schema (or provide the uid), by default None

input_schema_uid: str, optional

uid of the input schema (or provide the name), by default None

output_schema_name: str, optional

Name of the output schema (or provide the uid), by default None

output_schema_uid: str, optional

uid of the output schema (or provide the name), by default None

add_saver(funspec: str, extension: str, saver_type: str, schema: dict)

Add a saving function to the registry.

Parameters:
funspec: str

Path spec of the saver function in dot-notation (module.submodule:function)

extension: list[str], optional

File extension(s) handled by the saver, by default None

saver_type: str, optional

Specify the type of saver (e.g. csv like, HDF5, group_saveable). If not provided, ‘’ is used.

schema: dict, optional

Schema, by default None

Raises:
ValueError

_description_

add_loader(funspec: str, extension: str, loader_type: str, schema: dict | Mapping)
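
Add a loading function to the registry.

Parameters:
funspec: str

Path spec of the loader function in dot-notation (module.submodule:function)

extension: list[str], optional

File extension(s) handled by the loader, by default None

loader_type: str, optional

Specify the type of loader (e.g. csv like, HDF5, group_saveable). If not provided, ‘’ is used.

schema: dict | Mapping

Schema to use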

Raises:
ValueError

_description_

add_schema(schema: dict | Mapping | pathlib.Path)
exception rmellipse.arrschema.ValidationError(*args, **kwargs)

Bases: Exception

Common base class for all non-exit exceptions.

Initialize self. See help(type(self)) for accurate signature.

rmellipse.arrschema.arrschema(name: str, shape: tuple[str | int], dims: tuple[str], dtype: str, units: Mapping | str = None, coords: Mapping = None, uid: str = None, attrs_schema: Mapping = None)

Generate a dictionary that describes an array structure.

Parameters:
name: str

Name of the array structure.

shape: tuple[str | int]

Shape of the structure. Ellipses indicate arbitrary dimensions, letters indicate a required dimension of unknown length, and integers indicate a required dimension of a required length.

dims: tuple[str]

Names assigned to the dimensions specified by shape. Every required dimension must be named, and arbitrary dimensions must also be given as ellipses.

dtype: str

Type string, corresponding to numpy’s dtype (e.g. f8, c8, u8, etc.)

units: Mapping | str, optional

Mapping of units to the array structure. If the whole structure has a single unit, then a string can be passed. Optionally, a single required dimension can be mapped to a 1-d array of units. For example, if a dimension called "col" corresponds to columns in spreadsheet-like data and each column has its own unit, you could specify that as {"col": ["unit 1", "unit 2"]}.

coords: Mapping, optional

Mapping of required dimensions to a coordinate space. Must provide at least a dtype and a single unit as a string. Optionally, if the coordinates are fixed (e.g. the row and column indices of stacks of 2-d matrices), then you may specify those coordinates here.

uid: str, optional

The uid of the schema. If it is not provided, one is created using uuid4. By default None.

attrs_schema: Mapping, optional

JSON Schema for validating metadata attributes.

Returns:
dict

Dictionary conforming to an arrschema specification.

Raises:
Exception

If a logical inconsistency is found, or the provided schema doesn’t follow the specification for an array schema.
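
The shape convention can be illustrated with a small, self-contained check. This sketch assumes the ellipsis is written as the string "..." and that at most one ellipsis appears in a spec. The helper shape_matches is hypothetical and is not part of the package.

```python
def shape_matches(spec, shape):
    """Hypothetical check of a concrete shape against a shape spec.

    Spec entries: "..." (any number of extra axes), a letter (required
    axis of any length), or an int (required axis of that exact length).
    """
    spec = tuple(spec)
    n_ellipsis = spec.count("...")
    if n_ellipsis > 1:
        raise ValueError("This sketch supports at most one ellipsis.")
    if n_ellipsis == 1:
        i = spec.index("...")
        head, tail = spec[:i], spec[i + 1:]
        if len(shape) < len(head) + len(tail):
            return False
        # Pair leading entries with leading axes, trailing with trailing axes.
        tail_axes = shape[len(shape) - len(tail):]
        pairs = list(zip(head, shape)) + list(zip(tail, tail_axes))
    else:
        if len(shape) != len(spec):
            return False
        pairs = list(zip(spec, shape))
    # Integers must match exactly; letters accept any length.
    return all(not isinstance(req, int) or req == size for req, size in pairs)


# Assumed example spec: arbitrary leading axes, then "N" points by 2 columns.
spec = ("...", "N", 2)
stacked_ok = shape_matches(spec, (5, 3, 2))
plain_ok = shape_matches(spec, (3, 2))
wrong_width = shape_matches(spec, (3, 4))
```

Under these assumptions a (5, 3, 2) array and a (3, 2) array both satisfy the spec, while a (3, 4) array does not because its last axis must have length 2.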

rmellipse.arrschema.load(path: pathlib.Path | str, *load_args, registry: ArrSchemaRegistry, schema: dict, loader_type: str = None, validate_schema: bool = True, verbose=False, **load_kwargs) → object

Load a dataset using an array schema registry.

Parameters:
path: Path | str

Path of the file to load.

registry: ArrSchemaRegistry

The registry containing the loaders.

schema: dict

Schema of the dataset being loaded.

loader_type: str, optional

Type of loader to use, by default None

validate_schema: bool, optional

Whether to validate the loaded data against the schema, by default True

verbose: bool, optional

Print info, by default False

Returns:
object

The loaded dataset.

rmellipse.arrschema.validate(arr: AnnotatedArrayLike, *, schema: Mapping, attach_schema: bool = True)

Check if a DataArray conforms to a particular schema.

Parameters:
arr: xarray.DataArray

DataArray object to validate

schema: Mapping

A schema dictionary to validate against

attach_schema: bool, optional

If True, the schema is dumped into a string and attached to the attrs of the input data array. The default is True.
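
As a rough illustration of what attaching a schema as a string can look like, the sketch below serializes a schema dict with json and stores it in a plain attrs dictionary. The helper name, the "schema" key, and the use of JSON are all assumptions made for this example. The package may use a different key or serialization.

```python
import json


def attach_schema_sketch(attrs, schema, key="schema"):
    """Hypothetical helper: return a copy of attrs with the schema stored as a string."""
    updated = dict(attrs)
    # Dump to a string so the metadata stays a flat, serializable value.
    updated[key] = json.dumps(schema, sort_keys=True)
    return updated


# Assumed example schema and metadata.
schema = {"name": "trace", "dims": ["frequency"], "dtype": "f8"}
attrs = attach_schema_sketch({"operator": "lab"}, schema)

# The schema can be recovered later from the stored string.
recovered = json.loads(attrs["schema"])
```

Keeping the schema as a string in the metadata lets it travel with the array through file formats that only accept simple attribute values.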

rmellipse.arrschema.save(path: str | pathlib.Path, arr: AnnotatedArrayLike, *saver_args, registry: ArrSchemaRegistry, schema: dict = None, saver_type: str = None, validate_schema: bool = True, verbose=False, **saver_kwargs) → object

Save a dataset using an array schema registry.

Parameters:
path: str | Path

Path of the file to save to.

arr: object

The array to save.

registry: ArrSchemaRegistry

The registry containing the savers.

schema: dict, optional

Schema of the array being saved, by default None

saver_type: str, optional

Type of saver to use, by default None

validate_schema: bool, optional

Whether to validate the array against the schema, by default True

verbose: bool, optional

Print info, by default False

Returns:
object

_description_

rmellipse.arrschema.convert(input: AnnotatedArrayLike, registry: ArrSchemaRegistry, output_schema: dict, input_schema: dict = None) → Any

Convert input data to a new schema.

Convert functions take in a single input and have a single output (SISO).

Parameters:
input: Any

The input array with a known schema.

registry: ArrSchemaRegistry

The registry containing the converters.

output_schema: dict

Schema of the output.

input_schema: dict, optional

Schema of the input (if it can’t be inferred), by default None

Returns:
Any

The input converted to the output schema.

Raises:
AttributeError

If the input schema can’t be inferred.
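
The single-input, single-output dispatch described for convert can be sketched as follows. Everything here is hypothetical: the Tagged class, the "schema_uid" metadata key, and the dictionary of converters keyed by (input uid, output uid) are assumptions chosen for illustration, not the package's internals.

```python
class Tagged:
    """Tiny stand-in for an annotated array: values plus an attrs dict."""

    def __init__(self, values, attrs=None):
        self.values = list(values)
        self.attrs = dict(attrs or {})


def convert_sketch(data, converters, output_uid, input_uid=None):
    """Hypothetical dispatch: pick a converter by (input uid, output uid)."""
    if input_uid is None:
        # Try to infer the input schema from metadata attached to the data.
        attrs = getattr(data, "attrs", None)
        if not attrs or "schema_uid" not in attrs:
            raise AttributeError("Input schema could not be inferred.")
        input_uid = attrs["schema_uid"]
    func = converters[(input_uid, output_uid)]
    # SISO: exactly one argument in, exactly one result out.
    return func(data)


def volts_to_millivolts(arr):
    """Example converter: scale values and tag the result with the new uid."""
    return Tagged([v * 1000 for v in arr.values], {"schema_uid": "mv"})


converters = {("v", "mv"): volts_to_millivolts}
result = convert_sketch(Tagged([1, 2], {"schema_uid": "v"}), converters, "mv")
```

Passing data without schema metadata and without an explicit input uid raises AttributeError in this sketch, matching the documented failure mode.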

rmellipse.arrschema.annotate(arr: AnnotatedArrayLike)

Generate an array annotation and attach it as metadata.

rmellipse.arrschema.zeros(schema: dict | Mapping, like: xarray.DataArray = None, drop_mismatched_dims: bool = True, attrs: dict = None, **with_coords)
rmellipse.arrschema.as_schema(array: xarray.DataArray, registry: ArrSchemaRegistry = None, schema: dict | Mapping = None, schema_uid: str = None, schema_name: str = None)

Cast an array into a schema.

Casts the data and the coordinates to the dtypes given by the schema. If the coordinates are fixed, applies those as the values.

Parameters:
array: xr.DataArray

Array to cast.

registry: ArrSchemaRegistry, optional

Registry used to look up the schema by uid or name, by default None

schema: dict | Mapping, optional

Schema to cast the array into, by default None

schema_uid: str, optional

uid of the schema to cast the array into, by default None

schema_name: str, optional

Name of the schema to cast the array into, by default None

Returns:
_type_

_description_

Raises:
ValueError

_description_