.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "auto_examples\grp2_ArrSchema\plot_e00_making_schema.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        :ref:`Go to the end <sphx_glr_download_auto_examples_grp2_ArrSchema_plot_e00_making_schema.py>`
        to download the full example code.

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_auto_examples_grp2_ArrSchema_plot_e00_making_schema.py:


Defining Array Schema
=====================

Array schemas are JSON documents that act as metadata describing the
structure of array-like data. The ``rmellipse.arrschema`` submodule provides
Python bindings for generating these schemas. See

.. GENERATED FROM PYTHON SOURCE LINES 14-21

Make a Registry
---------------

First, make a registry to store the collection of schemas you want to use.
We also import some example functions included in the rmellipse package,
as well as the json package to format the JSON documents.

.. GENERATED FROM PYTHON SOURCE LINES 21-30

.. code-block:: Python


    import rmellipse.arrschema as arrschema
    import rmellipse.arrschema.examples as examples
    import xarray as xr
    import numpy as np
    import json

    registry = arrschema.ArrSchemaRegistry()


.. GENERATED FROM PYTHON SOURCE LINES 31-35

We define a basic schema called "float_zeros" for an array of floats with
arbitrary shape. When the schema is created with the ``arrschema`` function,
a uid is added automatically. We then add the schema to the registry.

.. GENERATED FROM PYTHON SOURCE LINES 35-44

.. code-block:: Python


    float_zeros = arrschema.arrschema(
        name='float_zeros', shape=(...,), dims=(...,), dtype=float
    )
    print(json.dumps(float_zeros, indent=True))
    registry.add_schema(float_zeros)


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    {
     "name": "float_zeros",
     "shape": [
      "..."
     ],
     "dims": [
      "..."
     ],
     "dtype": "float64",
     "coords": {},
     "attrs_schema": {},
     "uid": "5999d617ca48f2f56c06aade60c386ec28ccc345a6b87695ffd9ce2a7c32053c"
    }

..
.. GENERATED FROM PYTHON SOURCE LINES 45-50

Validation
----------

Currently, validation is implemented only for ``xarray.DataArray`` objects
and RMEmeas objects.

.. GENERATED FROM PYTHON SOURCE LINES 50-75

.. code-block:: Python


    # the schema can be provided directly
    my_data = xr.DataArray(np.zeros((4, 4), dtype=float))
    arrschema.validate(my_data, schema=float_zeros)

    # referred to by a registry and uid
    my_data = xr.DataArray(np.zeros((4, 4), dtype=float))
    arrschema.validate(my_data, schema=float_zeros)

    # referred to by a registry and name
    # Names are not unique, so this may fail if multiple
    # schemas share a name, and it is not recommended.
    my_data = xr.DataArray(np.zeros((4, 4), dtype=float))
    arrschema.validate(my_data, schema=float_zeros)

    # this will fail because the dtype isn't correct
    my_data_fails = xr.DataArray(np.zeros((4, 4), dtype=complex))
    try:
        arrschema.validate(my_data_fails, schema=float_zeros)
    except arrschema.ValidationError as e:
        print(e)

    # validated datasets store the associated schema in the metadata
    print(my_data.attrs)


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    dtype complex128 doesnt match expected float64 for : Size: 256B
    array([[0.+0.j, 0.+0.j, 0.+0.j, 0.+0.j],
           [0.+0.j, 0.+0.j, 0.+0.j, 0.+0.j],
           [0.+0.j, 0.+0.j, 0.+0.j, 0.+0.j],
           [0.+0.j, 0.+0.j, 0.+0.j, 0.+0.j]])
    Dimensions without coordinates: dim_0, dim_1
    {'ARRSCHEMA': '{"name": "float_zeros", "shape": ["..."], "dims": ["..."], "dtype": "float64", "coords": {}, "attrs_schema": {}, "uid": "5999d617ca48f2f56c06aade60c386ec28ccc345a6b87695ffd9ce2a7c32053c"}'}


.. GENERATED FROM PYTHON SOURCE LINES 76-86

Loaders and Savers
------------------

Array schemas provide a system for organizing and invoking different
encoding and decoding functions for various schemas. We often work with a
single in-memory representation of a particular object but need to read and
write it in several different on-disk storage formats.
We do this by associating a loader or saver with a function, given as a
module spec, and with the related file extensions.

.. GENERATED FROM PYTHON SOURCE LINES 86-101

.. code-block:: Python


    registry.add_loader(
        'rmellipse.arrschema.examples:load_group_saveable',
        ['.h5', '.hdf5'],
        loader_type='group_saveable',
        schema=float_zeros,
    )

    registry.add_saver(
        'rmellipse.arrschema.examples:save_group_saveable',
        ['.h5', '.hdf5'],
        saver_type='group_saveable',
        schema=float_zeros,
    )


.. GENERATED FROM PYTHON SOURCE LINES 102-112

.. code-block:: Python


    # Encoding and decoding functions are expected to have function signatures
    # that look like ``fun(path, data, *args, **kwargs)``
    arrschema.save('example.h5', my_data, 'my-name', registry=registry)

    my_data_read = arrschema.load(
        'example.h5', group='my-name', registry=registry, schema=float_zeros
    )
    print(my_data_read)


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

     Size: 128B
    array([[0., 0., 0., 0.],
           [0., 0., 0., 0.],
           [0., 0., 0., 0.],
           [0., 0., 0., 0.]])
    Dimensions without coordinates: dim_0, dim_1
    Attributes:
        ARRSCHEMA:             {"uid": "5999d617ca48f2f56c06aade60c386ec28ccc345a...
        __class__.__module__:  xarray.core.dataarray
        __class__.__name__:    DataArray
        is_big_object:         True
        save_type:             DATAARRAY_SAVEABLE


.. GENERATED FROM PYTHON SOURCE LINES 113-115

The ``save`` call above does not pass a schema; ``my_data`` was validated
earlier and carries its schema in its metadata. Otherwise, you have to
declare explicitly which schema your data corresponds to.

.. GENERATED FROM PYTHON SOURCE LINES 116-119

.. code-block:: Python


    arrschema.save('example.h5', my_data, 'my-name', registry=registry, schema=float_zeros)


.. GENERATED FROM PYTHON SOURCE LINES 120-126

Conversion
----------

Converters can be registered as well. A converter is a function that takes
a single dataset with an associated schema and returns a new dataset with
an associated schema.

.. GENERATED FROM PYTHON SOURCE LINES 126-142

..
.. code-block:: Python


    # define a new format we care about
    int_zeros = arrschema.arrschema(name='int_zeros', shape=(...,), dims=(...,), dtype=int)
    registry.add_schema(int_zeros)

    registry.add_converter(
        'rmellipse.arrschema.examples:convert_float_to_int',
        input_schema=float_zeros,
        output_schema=int_zeros,
    )

    converted = arrschema.convert(my_data, registry=registry, output_schema=int_zeros)
    arrschema.validate(converted, schema=int_zeros)
    print(converted)


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

     Size: 128B
    array([[0, 0, 0, 0],
           [0, 0, 0, 0],
           [0, 0, 0, 0],
           [0, 0, 0, 0]])
    Dimensions without coordinates: dim_0, dim_1
    Attributes:
        ARRSCHEMA:  {"name": "int_zeros", "shape": ["..."], "dims": ["..."], "dty...


.. rst-class:: sphx-glr-timing

   **Total running time of the script:** (0 minutes 0.320 seconds)


.. _sphx_glr_download_auto_examples_grp2_ArrSchema_plot_e00_making_schema.py:

.. only:: html

  .. container:: sphx-glr-footer sphx-glr-footer-example

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: plot_e00_making_schema.ipynb `

    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: plot_e00_making_schema.py `

    .. container:: sphx-glr-download sphx-glr-download-zip

      :download:`Download zipped: plot_e00_making_schema.zip `


.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery `_
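
The loader and saver registrations in the Loaders and Savers section pass
strings such as ``'rmellipse.arrschema.examples:load_group_saveable'``
together with a list of file extensions. The sketch below shows one way a
``module:function`` string could be resolved and selected by file extension
using only the standard library. It is an illustration under assumptions,
not rmellipse's implementation: ``resolve_spec`` and ``ExtensionDispatch``
are hypothetical names, and ``json:loads`` merely stands in for a real
loader function.

.. code-block:: Python

    import importlib
    import os


    def resolve_spec(spec):
        """Import 'package.module:attribute' and return the named attribute."""
        module_name, _, attr_name = spec.partition(':')
        module = importlib.import_module(module_name)
        return getattr(module, attr_name)


    class ExtensionDispatch:
        """Hypothetical helper mapping file extensions to spec strings."""

        def __init__(self):
            self._specs = {}

        def add(self, spec, extensions):
            # Store only the string; the module is imported on lookup.
            for ext in extensions:
                self._specs[ext.lower()] = spec

        def lookup(self, path):
            # Extensions are compared case-insensitively.
            ext = os.path.splitext(path)[1].lower()
            if ext not in self._specs:
                raise KeyError(f'no function registered for {ext!r}')
            return resolve_spec(self._specs[ext])


    dispatch = ExtensionDispatch()
    # 'json:loads' is a stand-in; it parses a string rather than reading a path.
    dispatch.add('json:loads', ['.json'])
    loader = dispatch.lookup('schema.JSON')
    print(loader('{"name": "float_zeros"}'))

Storing the spec string instead of the function object keeps registration
cheap, because a module is imported only when a matching file is handled.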