rmellipse.arrschema
Submodules
Exceptions
ValidationError: Common base class for all non-exit exceptions.
Classes
ArrSchemaRegistry: Dict subclass that stores schema information.
Functions
arrschema: Generate a dictionary that describes an array structure.
load: Load a dataset using an array schema registry.
validate: Check if a DataArray conforms to a particular schema.
save: Save a dataset using an array schema registry.
convert: Convert input data to a new schema.
annotate: Generate an array annotation and attach it as metadata.
zeros
as_schema: Cast an array into a schema.
Package Contents
- class rmellipse.arrschema.ArrSchemaRegistry
Bases: dict
Dict subclass that stores schema information.
- property schema
- property loaders
- property savers
- show_schema()
- show_loaders()
- find_schema(schema_name: str = None, schema_uid: str = None) -> dict
Find a schema.
Either the name or the uid must be provided. Raises an error if only the name is provided and multiple schemas in the registry share that name.
- Parameters:
- schema_name : str, optional
Name of the schema to look for, by default None.
- schema_uid : str, optional
uid of the schema to find, by default None.
- Returns:
- dict
Requested schema.
- Raises:
- ValueError
If multiple schemas in the registry have the same name.
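The lookup rule described above can be illustrated with a minimal, standard-library sketch. It is not the library's implementation: the function name `find_schema_sketch`, the list-of-dicts storage, the `"name"` and `"uid"` keys, and the `KeyError` raised for a missing schema are all assumptions made for illustration. Only the `ValueError` on an ambiguous name comes from the documentation.

```python
def find_schema_sketch(schemas, schema_name=None, schema_uid=None):
    """Return one schema dict selected by uid or by name (illustrative only)."""
    if schema_name is None and schema_uid is None:
        # Assumption: the exception type for this case is not documented.
        raise ValueError("Provide schema_name or schema_uid.")

    if schema_uid is not None:
        # Assumption: uids are unique, so the first match is the only match.
        for schema in schemas:
            if schema["uid"] == schema_uid:
                return schema
        raise KeyError(f"No schema with uid {schema_uid!r}.")

    matches = [s for s in schemas if s["name"] == schema_name]
    if len(matches) > 1:
        # Documented behavior: a name shared by several schemas is an error.
        raise ValueError(
            f"{len(matches)} schemas are named {schema_name!r}; pass schema_uid."
        )
    if not matches:
        raise KeyError(f"No schema named {schema_name!r}.")
    return matches[0]


# Hypothetical registry contents: two schemas share the name "trace".
schemas = [
    {"name": "trace", "uid": "a1"},
    {"name": "trace", "uid": "b2"},
    {"name": "matrix", "uid": "c3"},
]

unique = find_schema_sketch(schemas, schema_name="matrix")
by_uid = find_schema_sketch(schemas, schema_uid="b2")
```

Looking up `"trace"` by name alone raises `ValueError` in this sketch, which is why passing the uid is the safer choice when names may collide.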
- import_saver(schema_name=None, schema_uid: str = None, extension: str = None, saver_type: str = None, verbose: bool = False)
Get a saver function associated with a schema.
- Parameters:
- schema_name : str, optional
Name of the schema being saved, by default None.
- schema_uid : str, optional
UID of the schema being saved, by default None.
- extension : str, optional
File extension, by default None.
- saver_type : str, optional
Type of saver to use, by default None.
- verbose : bool, optional
Print info, by default False.
- Returns:
- callable
Saver function imported from the registry.
- import_loader(schema_name=None, schema_uid: str = None, extension: str = None, loader_type: str = None, verbose: bool = False)
Get a loader function associated with a schema.
- Parameters:
- schema_name : str, optional
Name of the schema being loaded, by default None.
- schema_uid : str, optional
UID of the schema being loaded, by default None.
- extension : str, optional
File extension, by default None.
- loader_type : str, optional
Type of loader to use, by default None.
- verbose : bool, optional
Print info, by default False.
- Returns:
- callable
Loader function imported from the registry.
- add_converter(funspec: str, input_schema: dict, output_schema: dict)
Add a converting function between two schemas.
Converting functions take exactly one argument and return exactly one output.
- Parameters:
- funspec : str
Path spec of the function in dot-notation (module.submodule:function).
- input_schema : dict
Schema that the converter accepts as input.
- output_schema : dict
Schema that the converter produces as output.
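A path spec of the form `module.submodule:function` can be turned into a callable with the standard library alone. The sketch below shows one common way to do that with `importlib`; the helper name `resolve_funspec` is hypothetical, and how rmellipse itself resolves a `funspec` is not described here.

```python
import importlib


def resolve_funspec(funspec):
    """Import the callable named by a 'module.submodule:function' spec."""
    module_path, separator, function_name = funspec.partition(":")
    if not separator or not module_path or not function_name:
        raise ValueError(
            f"Expected 'module.submodule:function', got {funspec!r}."
        )
    module = importlib.import_module(module_path)
    return getattr(module, function_name)


# A standard-library function stands in for a real converter here.
# Like a converter, math.sqrt takes one argument and returns one output.
square_root = resolve_funspec("math:sqrt")
print(square_root(9.0))
```

Storing the spec as a string rather than the function object keeps a registry serializable, with the import deferred until the function is actually needed.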
- import_converter(input_schema_name: str = None, input_schema_uid: str = None, output_schema_name: str = None, output_schema_uid: str = None) -> callable
Import a converting function between two schemas.
Converting functions take exactly one argument and return exactly one output.
- Parameters:
- input_schema_name : str, optional
Name of the input schema (or provide the uid), by default None.
- input_schema_uid : str, optional
uid of the input schema (or provide the name), by default None.
- output_schema_name : str, optional
Name of the output schema (or provide the uid), by default None.
- output_schema_uid : str, optional
uid of the output schema (or provide the name), by default None.
- add_saver(funspec: str, extension: str, saver_type: str, schema: dict)
Add a saving function to the registry.
- Parameters:
- funspec : str
Path spec of the saver function in dot-notation (module.submodule:function).
- extension : str
File extension handled by the saver.
- saver_type : str
Type of saver (e.g. csv-like, HDF5, group_saveable). If not provided, '' is used.
- schema : dict
Schema associated with the saver.
- Raises:
- ValueError
_description_
- add_loader(funspec: str, extension: str, loader_type: str, schema: dict | Mapping)
Add a loading function to the registry.
- Parameters:
- funspec : str
Path spec of the loader function in dot-notation (module.submodule:function).
- extension : str
File extension handled by the loader.
- loader_type : str
Type of loader (e.g. csv-like, HDF5, group_saveable). If not provided, '' is used.
- schema : dict | Mapping
Schema to use.
- Raises:
- ValueError
_description_
- add_schema(schema: dict | Mapping | pathlib.Path)
- exception rmellipse.arrschema.ValidationError(*args, **kwargs)
Bases: Exception
Common base class for all non-exit exceptions.
- rmellipse.arrschema.arrschema(name: str, shape: tuple[str | int], dims: tuple[str], dtype: str, units: Mapping | str = None, coords: Mapping = None, uid: str = None, attrs_schema: Mapping = None)
Generate a dictionary that describes an array structure.
- Parameters:
- name : str
Name of the array structure.
- shape : tuple[str | int]
Shape of the structure. Ellipses indicate arbitrary dimensions, letters indicate a required dimension of unknown length, and integers indicate a required dimension of a required length.
- dims : tuple[str]
Names assigned to the dimensions specified by shape. Every required dimension must be named, and arbitrary dimensions must also be given as ellipses.
- dtype : str
Type string corresponding to a numpy dtype (e.g. f8, c8, u8).
- units : Mapping | str, optional
Mapping of units to the array structure. If the whole structure has a single unit, a string can be passed. Optionally, a single required dimension can be mapped to a 1-d array of units. For example, if a dimension called "col" corresponds to columns in spreadsheet-like data and each column has its own unit, you could specify that as {"col": ["unit 1", "unit 2"]}.
- coords : Mapping, optional
Mapping of required dimensions to a coordinate space. Must provide at least a dtype and a single unit as a string. Optionally, if the coordinates are fixed (e.g. the row and column indices of stacks of 2-d matrices), you may specify those coordinates here.
- uid : str, optional
uid of the schema. If not provided, it is created using uuid4. By default None.
- attrs_schema : Mapping, optional
JSON Schema for validating metadata attributes.
- Returns:
- dict
Dictionary conforming to an arrschema specification.
- Raises:
- Exception
If a logical inconsistency is found, or the provided schema doesn't follow the specification for an array schema.
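The three kinds of `shape` entries described above (ellipsis, letter, integer) can be illustrated with a small stand-alone checker. This is a sketch under stated assumptions, not the library's validation logic: the function name `shape_conforms` is hypothetical, the ellipsis is assumed to be written as the string `"..."`, and at most one ellipsis entry is supported.

```python
def shape_conforms(spec, actual):
    """Check an actual shape tuple against a shape spec (illustrative only)."""
    spec = tuple(spec)
    actual = tuple(actual)
    if spec.count("...") > 1:
        raise ValueError("This sketch supports at most one '...' entry.")

    if "..." in spec:
        split = spec.index("...")
        head, tail = spec[:split], spec[split + 1:]
        # The ellipsis absorbs zero or more dimensions between head and tail.
        if len(actual) < len(head) + len(tail):
            return False
        actual_head = actual[:len(head)]
        actual_tail = actual[len(actual) - len(tail):]
        pairs = list(zip(head, actual_head)) + list(zip(tail, actual_tail))
    else:
        if len(spec) != len(actual):
            return False
        pairs = list(zip(spec, actual))

    # Letters accept any length; integers must match the length exactly.
    return all(isinstance(want, str) or want == got for want, got in pairs)


# Any number of leading dimensions, then a dimension "N" of unknown length,
# then a dimension of length exactly 2.
spec = ("...", "N", 2)
print(shape_conforms(spec, (5, 3, 2)))
print(shape_conforms(spec, (3, 2)))
print(shape_conforms(spec, (5, 3, 4)))
```

The sketch does not check that a letter repeated in the spec maps to the same length each time; whether the library enforces that is not stated here.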
- rmellipse.arrschema.load(path: pathlib.Path | str, *load_args, registry: ArrSchemaRegistry, schema: dict, loader_type: str = None, validate_schema: bool = True, verbose=False, **load_kwargs) -> object
Load a dataset using an array schema registry.
- Parameters:
- path : Path | str
Path of the file to load.
- registry : ArrSchemaRegistry
Registry containing the loaders.
- schema : dict
Schema of the dataset being loaded.
- loader_type : str, optional
Type of loader to use, by default None.
- validate_schema : bool, optional
Whether to validate the data against the schema, by default True.
- verbose : bool, optional
Print info, by default False.
- Returns:
- object
The loaded dataset.
- rmellipse.arrschema.validate(arr: AnnotatedArrayLike, *, schema: Mapping, attach_schema: bool = True)
Check if a DataArray conforms to a particular schema.
- Parameters:
- arr : xarray.DataArray
DataArray object to validate.
- schema : Mapping
Schema dictionary to validate against.
- attach_schema : bool, optional
If True, the schema is dumped into a string and attached to the attrs of the input data array. The default is True.
- rmellipse.arrschema.save(path: str | pathlib.Path, arr: AnnotatedArrayLike, *saver_args, registry: ArrSchemaRegistry, schema: dict = None, saver_type: str = None, validate_schema: bool = True, verbose=False, **saver_kwargs) -> object
Save a dataset using an array schema registry.
- Parameters:
- path : str | Path
Path of the file to save to.
- arr : AnnotatedArrayLike
Array to save.
- registry : ArrSchemaRegistry
Registry containing the savers.
- schema : dict, optional
Schema of the dataset being saved, by default None.
- saver_type : str, optional
Type of saver to use, by default None.
- validate_schema : bool, optional
Whether to validate the data against the schema, by default True.
- verbose : bool, optional
Print info, by default False.
- Returns:
- object
_description_
- rmellipse.arrschema.convert(input: AnnotatedArrayLike, registry: ArrSchemaRegistry, output_schema: dict, input_schema: dict = None) -> Any
Convert input data to a new schema.
Convert functions take in a single input and have a single output (SISO).
- Parameters:
- input : AnnotatedArrayLike
The input array with a known schema.
- registry : ArrSchemaRegistry
The registry containing the converters.
- output_schema : dict
Schema of the output.
- input_schema : dict, optional
Schema of the input (if it can't be inferred), by default None.
- Returns:
- Any
The input converted to the output schema.
- Raises:
- AttributeError
If the input schema can't be inferred.
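The single-input, single-output dispatch described above can be sketched without the library. Everything named here is an assumption for illustration: the helpers `register_converter` and `convert_sketch`, the table keyed by a pair of uids, and the `"schema"` key under which a JSON-dumped schema is stored in the metadata. The only documented behavior mirrored is the `AttributeError` raised when the input schema can't be inferred.

```python
import json
import math

# (input uid, output uid) -> function taking one input, returning one output
converters = {}


def register_converter(input_schema, output_schema, function):
    """Store a SISO function for one (input, output) schema pair."""
    converters[(input_schema["uid"], output_schema["uid"])] = function


def convert_sketch(data, attrs, output_schema, input_schema=None):
    """Convert data to output_schema, inferring the input schema if needed."""
    if input_schema is None:
        # Assumption: the schema was attached to the metadata as a JSON string.
        if "schema" not in attrs:
            raise AttributeError(
                "Input schema can't be inferred; pass input_schema."
            )
        input_schema = json.loads(attrs["schema"])
    function = converters[(input_schema["uid"], output_schema["uid"])]
    return function(data)


def degrees_to_radians(values):
    """Example converter: one input (a list of degrees), one output."""
    return [math.radians(value) for value in values]


# Hypothetical schemas, reduced to the two keys this sketch uses.
degrees = {"name": "angle_deg", "uid": "deg-uid"}
radians = {"name": "angle_rad", "uid": "rad-uid"}
register_converter(degrees, radians, degrees_to_radians)

attrs = {"schema": json.dumps(degrees)}
converted = convert_sketch([0.0, 180.0], attrs, radians)
```

Keying the table on uids rather than names avoids the ambiguity that arises when two schemas share a name.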
- rmellipse.arrschema.annotate(arr: AnnotatedArrayLike)
Generate an array annotation and attach it as metadata.
- rmellipse.arrschema.zeros(schema: dict | Mapping, like: xarray.DataArray = None, drop_mismatched_dims: bool = True, attrs: dict = None, **with_coords)
- rmellipse.arrschema.as_schema(array: xarray.DataArray, registry: ArrSchemaRegistry = None, schema: dict | Mapping = None, schema_uid: str = None, schema_name: str = None)
Cast an array into a schema.
Casts the data and the coordinates to the data types given by the schema. If the coordinates are fixed, those values are applied.
- Parameters:
- array : xr.DataArray
DataArray to cast.
- registry : ArrSchemaRegistry, optional
Registry used to look up the schema by name or uid, by default None.
- schema : dict | Mapping, optional
Schema to cast the array into, by default None.
- schema_uid : str, optional
uid of the schema to cast the array into, by default None.
- schema_name : str, optional
Name of the schema to cast the array into, by default None.
- Returns:
- _type_
_description_
- Raises:
- ValueError
_description_