Groups (zarr.hierarchy)
- zarr.hierarchy.group(store=None, overwrite=False, chunk_store=None, cache_attrs=True, synchronizer=None, path=None, *, zarr_version=None)
Create a group.
- Parameters
- store : MutableMapping or string, optional
Store or path to directory in file system.
- overwrite : bool, optional
If True, delete any pre-existing data in store at path before creating the group.
- chunk_store : MutableMapping, optional
Separate storage for chunks. If not provided, store will be used for storage of both chunks and metadata.
- cache_attrs : bool, optional
If True (default), user attributes will be cached for attribute read operations. If False, user attributes are reloaded from the store prior to all attribute read operations.
- synchronizer : object, optional
Array synchronizer.
- path : string, optional
Group path within store.
- Returns
- g : zarr.hierarchy.Group
Examples
Create a group in memory:
>>> import zarr
>>> g = zarr.group()
>>> g
<zarr.hierarchy.Group '/'>
Create a group with a different store:
>>> store = zarr.DirectoryStore('data/example.zarr')
>>> g = zarr.group(store=store, overwrite=True)
>>> g
<zarr.hierarchy.Group '/'>
- zarr.hierarchy.open_group(store=None, mode='a', cache_attrs=True, synchronizer=None, path=None, chunk_store=None, storage_options=None, *, zarr_version=None, meta_array=None)
Open a group using file-mode-like semantics.
- Parameters
- store : MutableMapping or string, optional
Store or path to directory in file system or name of zip file.
- mode : {'r', 'r+', 'a', 'w', 'w-'}, optional
Persistence mode: 'r' means read only (must exist); 'r+' means read/write (must exist); 'a' means read/write (create if doesn’t exist); 'w' means create (overwrite if exists); 'w-' means create (fail if exists).
- cache_attrs : bool, optional
If True (default), user attributes will be cached for attribute read operations. If False, user attributes are reloaded from the store prior to all attribute read operations.
- synchronizer : object, optional
Array synchronizer.
- path : string, optional
Group path within store.
- chunk_store : MutableMapping or string, optional
Store or path to directory in file system or name of zip file.
- storage_options : dict
If using an fsspec URL to create the store, these will be passed to the backend implementation. Ignored otherwise.
- meta_array : array-like, optional
An array instance to use for determining arrays to create and return to users. Use numpy.empty(()) by default.
New in version 2.13.
- Returns
- g : zarr.hierarchy.Group
Examples
>>> import zarr
>>> root = zarr.open_group('data/example.zarr', mode='w')
>>> foo = root.create_group('foo')
>>> bar = root.create_group('bar')
>>> root
<zarr.hierarchy.Group '/'>
>>> root2 = zarr.open_group('data/example.zarr', mode='a')
>>> root2
<zarr.hierarchy.Group '/'>
>>> root == root2
True
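The persistence modes listed for open_group can be read as a small decision table over "does the group already exist?". The following is a minimal sketch of that table only. The helper name resolve_open_mode and the use of built-in exceptions are assumptions made for illustration; zarr raises its own error types and this function is not part of the library.

```python
def resolve_open_mode(mode: str, exists: bool) -> str:
    """Map a persistence mode and the group's existence to an outcome.

    Hypothetical helper: it models the documented mode table and nothing
    else. Built-in exceptions stand in for the library's own errors.
    """
    if mode not in {"r", "r+", "a", "w", "w-"}:
        raise ValueError(f"unknown mode: {mode!r}")
    if mode in {"r", "r+"}:
        # Both read modes require the group to exist already.
        if not exists:
            raise FileNotFoundError("group must already exist")
        return "read-only" if mode == "r" else "read/write"
    if mode == "a":
        # Append: open if present, otherwise create.
        return "read/write" if exists else "create"
    if mode == "w":
        # Write: always ends with a fresh group.
        return "overwrite" if exists else "create"
    # Remaining case is "w-": create, but refuse to clobber.
    if exists:
        raise FileExistsError("group already exists")
    return "create"


for mode in ("r", "r+", "a", "w"):
    print(mode, "->", resolve_open_mode(mode, exists=True))
```

In the example above, mode='w' followed by mode='a' on the same path corresponds to "create" and then "read/write", which is why root and root2 compare equal.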
- class zarr.hierarchy.Group(store, path=None, read_only=False, chunk_store=None, cache_attrs=True, synchronizer=None, zarr_version=None, *, meta_array=None)
Instantiate a group from an initialized store.
- Parameters
- store : MutableMapping
Group store, already initialized. If the Group is used in a context manager, and the store has a close method, it will be called on exit.
- path : string, optional
Group path.
- read_only : bool, optional
True if group should be protected against modification.
- chunk_store : MutableMapping, optional
Separate storage for chunks. If not provided, store will be used for storage of both chunks and metadata.
- cache_attrs : bool, optional
If True (default), user attributes will be cached for attribute read operations. If False, user attributes are reloaded from the store prior to all attribute read operations.
- synchronizer : object, optional
Array synchronizer.
- meta_array : array-like, optional
An array instance to use for determining arrays to create and return to users. Use numpy.empty(()) by default.
New in version 2.13.
- Attributes
store
A MutableMapping providing the underlying storage for the group.
path
Storage path.
name
Group name following h5py convention.
read_only
A boolean, True if modification operations are not permitted.
chunk_store
A MutableMapping providing the underlying storage for array chunks.
synchronizer
Object used to synchronize write access to groups and arrays.
attrs
A MutableMapping containing user-defined attributes.
info
Return diagnostic information about the group.
meta_array
An array-like instance to use for determining arrays to create and return to users.
Methods
__len__()
Number of members.
__iter__()
Return an iterator over group member names.
__contains__(item)
Test for group membership.
__getitem__(item)
Obtain a group member.
__enter__()
Return the Group for use as a context manager.
__exit__(exc_type, exc_val, exc_tb)
Call the close method of the underlying Store.
group_keys()
Return an iterator over member names for groups only.
groups()
Return an iterator over (name, value) pairs for groups only.
array_keys([recurse])
Return an iterator over member names for arrays only.
arrays([recurse])
Return an iterator over (name, value) pairs for arrays only.
visit(func)
Run func on each object's path.
visitkeys(func)
An alias for visit().
visitvalues(func)
Run func on each object.
visititems(func)
Run func on each object's path and the object itself.
tree([expand, level])
Provide a print-able display of the hierarchy.
create_group(name[, overwrite])
Create a sub-group.
require_group(name[, overwrite])
Obtain a sub-group, creating one if it doesn't exist.
create_groups(*names, **kwargs)
Convenience method to create multiple groups in a single call.
require_groups(*names)
Convenience method to require multiple groups in a single call.
create_dataset(name, **kwargs)
Create an array.
require_dataset(name, shape[, dtype, exact])
Obtain an array, creating if it doesn't exist.
create(name, **kwargs)
Create an array.
empty(name, **kwargs)
Create an array.
zeros(name, **kwargs)
Create an array.
ones(name, **kwargs)
Create an array.
full(name, fill_value, **kwargs)
Create an array.
array(name, data, **kwargs)
Create an array.
empty_like(name, data, **kwargs)
Create an array.
zeros_like(name, data, **kwargs)
Create an array.
ones_like(name, data, **kwargs)
Create an array.
full_like(name, data, **kwargs)
Create an array.
info
Return diagnostic information about the group.
move(source, dest)
Move contents from one path to another relative to the Group.
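The context-manager behaviour described for the store parameter and for __enter__ and __exit__ can be modelled in a few lines. This is a toy sketch, not zarr's implementation: ToyGroup and ClosableStore are hypothetical classes, and the only behaviour they mirror is "return the group on entry, call the store's close method on exit if it has one".

```python
class ClosableStore(dict):
    """Hypothetical dict-backed store that records whether close() ran."""

    closed = False

    def close(self):
        self.closed = True


class ToyGroup:
    """Hypothetical stand-in for a group that wraps a store."""

    def __init__(self, store):
        self.store = store

    def __enter__(self):
        # Entering the block hands back the group itself.
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        # Only call close() when the store actually provides it.
        close = getattr(self.store, "close", None)
        if callable(close):
            close()


store = ClosableStore()
with ToyGroup(store) as group:
    inside = store.closed   # still open inside the block
after = store.closed        # closed once the block exits

with ToyGroup({}) as group:  # a plain dict has no close(); exit is a no-op
    pass

print(inside, after)
```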
- __iter__()
Return an iterator over group member names.
Examples
>>> import zarr
>>> g1 = zarr.group()
>>> g2 = g1.create_group('foo')
>>> g3 = g1.create_group('bar')
>>> d1 = g1.create_dataset('baz', shape=100, chunks=10)
>>> d2 = g1.create_dataset('quux', shape=200, chunks=20)
>>> for name in g1:
...     print(name)
bar
baz
foo
quux
- __contains__(item)
Test for group membership.
Examples
>>> import zarr
>>> g1 = zarr.group()
>>> g2 = g1.create_group('foo')
>>> d1 = g1.create_dataset('bar', shape=100, chunks=10)
>>> 'foo' in g1
True
>>> 'bar' in g1
True
>>> 'baz' in g1
False
- __getitem__(item)
Obtain a group member.
- Parameters
- item : string
Member name or path.
Examples
>>> import zarr
>>> g1 = zarr.group()
>>> d1 = g1.create_dataset('foo/bar/baz', shape=100, chunks=10)
>>> g1['foo']
<zarr.hierarchy.Group '/foo'>
>>> g1['foo/bar']
<zarr.hierarchy.Group '/foo/bar'>
>>> g1['foo/bar/baz']
<zarr.core.Array '/foo/bar/baz' (100,) float64>
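The example above shows that creating 'foo/bar/baz' also makes 'foo' and 'foo/bar' reachable as groups. One way to picture path lookup is as a check against a flat key/value store. The sketch below is a simplified model under stated assumptions: a plain dict stands in for the store, the ".zgroup" and ".zarray" marker keys follow the Zarr v2 layout, and member_kind and has_member are hypothetical helpers rather than zarr functions.

```python
# Toy store: a plain dict standing in for a MutableMapping store.
# Marker keys follow the Zarr v2 layout (an assumption of this sketch;
# other format versions use different metadata keys).
store = {
    ".zgroup": b"{}",
    "foo/.zgroup": b"{}",
    "foo/bar/.zgroup": b"{}",
    "foo/bar/baz/.zarray": b"{}",
}


def member_kind(store, path):
    """Return 'array' or 'group' for the member at path, else raise KeyError."""
    prefix = path.strip("/")
    prefix = prefix + "/" if prefix else ""
    if prefix + ".zarray" in store:
        return "array"
    if prefix + ".zgroup" in store:
        return "group"
    raise KeyError(path)


def has_member(store, path):
    """Membership test in the spirit of __contains__."""
    try:
        member_kind(store, path)
    except KeyError:
        return False
    return True


print(member_kind(store, "foo"))
print(member_kind(store, "foo/bar/baz"))
print(has_member(store, "quux"))
```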
- group_keys()
Return an iterator over member names for groups only.
Examples
>>> import zarr
>>> g1 = zarr.group()
>>> g2 = g1.create_group('foo')
>>> g3 = g1.create_group('bar')
>>> d1 = g1.create_dataset('baz', shape=100, chunks=10)
>>> d2 = g1.create_dataset('quux', shape=200, chunks=20)
>>> sorted(g1.group_keys())
['bar', 'foo']
- groups()
Return an iterator over (name, value) pairs for groups only.
Examples
>>> import zarr
>>> g1 = zarr.group()
>>> g2 = g1.create_group('foo')
>>> g3 = g1.create_group('bar')
>>> d1 = g1.create_dataset('baz', shape=100, chunks=10)
>>> d2 = g1.create_dataset('quux', shape=200, chunks=20)
>>> for n, v in g1.groups():
...     print(n, type(v))
bar <class 'zarr.hierarchy.Group'>
foo <class 'zarr.hierarchy.Group'>
- array_keys(recurse=False)
Return an iterator over member names for arrays only.
- Parameters
- recurse : bool, optional
Option to return member names for all arrays, even from groups below the current one. If False, only member names for arrays in the current group will be returned. Default value is False.
Examples
>>> import zarr
>>> g1 = zarr.group()
>>> g2 = g1.create_group('foo')
>>> g3 = g1.create_group('bar')
>>> d1 = g1.create_dataset('baz', shape=100, chunks=10)
>>> d2 = g1.create_dataset('quux', shape=200, chunks=20)
>>> sorted(g1.array_keys())
['baz', 'quux']
- arrays(recurse=False)
Return an iterator over (name, value) pairs for arrays only.
- Parameters
- recurse : bool, optional
Option to return (name, value) pairs for all arrays, even from groups below the current one. If False, only (name, value) pairs for arrays in the current group will be returned. Default value is False.
Examples
>>> import zarr
>>> g1 = zarr.group()
>>> g2 = g1.create_group('foo')
>>> g3 = g1.create_group('bar')
>>> d1 = g1.create_dataset('baz', shape=100, chunks=10)
>>> d2 = g1.create_dataset('quux', shape=200, chunks=20)
>>> for n, v in g1.arrays():
...     print(n, type(v))
baz <class 'zarr.core.Array'>
quux <class 'zarr.core.Array'>
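The effect of the recurse option on array_keys and arrays can be sketched with a nested dict, where a dict value plays the role of a sub-group and any other value plays the role of an array. This is an illustration only: toy_array_keys is a hypothetical function, and yielding bare member names (rather than full paths) for nested arrays is an assumption of the sketch that this section does not spell out.

```python
def toy_array_keys(group, recurse=False):
    """Yield names of array-like members, optionally descending into sub-groups."""
    for name in sorted(group):
        member = group[name]
        if isinstance(member, dict):
            # Sub-group: only descend when recursion was requested.
            if recurse:
                yield from toy_array_keys(member, recurse=True)
        else:
            yield name


hierarchy = {
    "foo": {"nested": "array"},  # sub-group holding one array
    "bar": {},                   # empty sub-group
    "baz": "array",
    "quux": "array",
}

print(list(toy_array_keys(hierarchy)))
print(list(toy_array_keys(hierarchy, recurse=True)))
```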
- visit(func)
Run func on each object’s path.
Note: If func returns None (or doesn’t return), iteration continues. However, if func returns anything else, it ceases and returns that value.
Examples
>>> import zarr
>>> g1 = zarr.group()
>>> g2 = g1.create_group('foo')
>>> g3 = g1.create_group('bar')
>>> g4 = g3.create_group('baz')
>>> g5 = g3.create_group('quux')
>>> def print_visitor(name):
...     print(name)
>>> g1.visit(print_visitor)
bar
bar/baz
bar/quux
foo
>>> g3.visit(print_visitor)
baz
quux
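The stopping rule in the note ("anything other than None ends the traversal") is easy to get wrong, because falsy values such as False or 0 also stop it. A minimal stand-alone model of that rule follows. visit_paths is a hypothetical function that walks a plain list of paths in sorted order; the ordering is an assumption of the sketch, not a statement about how zarr traverses a store.

```python
from typing import Any, Callable, Iterable, List, Optional


def visit_paths(paths: Iterable[str], func: Callable[[str], Any]) -> Optional[Any]:
    """Call func on each path; stop at the first result that is not None."""
    for path in sorted(paths):
        result = func(path)
        if result is not None:
            return result
    return None


paths = ["foo", "bar", "bar/baz", "bar/quux"]
seen: List[str] = []


def stop_at_baz(path: str) -> Optional[str]:
    seen.append(path)
    if path.endswith("baz"):
        return path  # non-None: traversal ends here
    return None      # None: keep going


result = visit_paths(paths, stop_at_baz)
print(result, seen)

# A falsy but non-None return value also stops after the first path.
first_only = visit_paths(paths, lambda path: False)
print(first_only)
```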
Searches for members matching some name query, such as find and findall, can be implemented using visit. Consider the following tree:
/
├── aaa
│   └── bbb
│       └── ccc
│           └── aaa
├── bar
└── foo
It is created as follows:
>>> root = zarr.group()
>>> foo = root.create_group("foo")
>>> bar = root.create_group("bar")
>>> root.create_group("aaa").create_group("bbb").create_group("ccc").create_group("aaa")
<zarr.hierarchy.Group '/aaa/bbb/ccc/aaa'>
For find, the first path that matches a given pattern (for example “aaa”) is returned. Note that a non-None value is returned in the visit function to stop further iteration.
>>> import re
>>> pattern = re.compile("aaa")
>>> found = None
>>> def find(path):
...     global found
...     if pattern.search(path) is not None:
...         found = path
...         return True
...
>>> root.visit(find)
True
>>> print(found)
aaa
For findall, all the results are gathered into a list:
>>> pattern = re.compile("aaa")
>>> found = []
>>> def findall(path):
...     if pattern.search(path) is not None:
...         found.append(path)
...
>>> root.visit(findall)
>>> print(found)
['aaa', 'aaa/bbb', 'aaa/bbb/ccc', 'aaa/bbb/ccc/aaa']
To match only on the last part of the path, use a greedy regex to filter out the prefix:
>>> prefix_pattern = re.compile(r".*/")
>>> pattern = re.compile("aaa")
>>> found = []
>>> def findall(path):
...     match = prefix_pattern.match(path)
...     if match is None:
...         name = path
...     else:
...         _, end = match.span()
...         name = path[end:]
...     if pattern.search(name) is not None:
...         found.append(path)
...     return None
...
>>> root.visit(findall)
>>> print(found)
['aaa', 'aaa/bbb/ccc/aaa']
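As an alternative to the greedy prefix regex, the last path component can be taken with str.rsplit. The sketch below applies such a callback to a plain list of the paths from the tree above, with a for loop standing in for root.visit so that it runs without a store. The names last_component and findall_last are illustrative, not part of zarr.

```python
import re

# Paths of the example tree, in the order a visitor would see them.
paths = ["aaa", "aaa/bbb", "aaa/bbb/ccc", "aaa/bbb/ccc/aaa", "bar", "foo"]

pattern = re.compile("aaa")
found = []


def last_component(path):
    """Return the text after the final '/', or the whole path if there is none."""
    return path.rsplit("/", 1)[-1]


def findall_last(path):
    # Returns None implicitly, so a visit-style traversal would continue.
    if pattern.search(last_component(path)) is not None:
        found.append(path)


for path in paths:  # stand-in for root.visit(findall_last)
    findall_last(path)

print(found)
```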
- visitvalues(func)
Run func on each object.
Note: If func returns None (or doesn’t return), iteration continues. However, if func returns anything else, it ceases and returns that value.
Examples
>>> import zarr
>>> g1 = zarr.group()
>>> g2 = g1.create_group('foo')
>>> g3 = g1.create_group('bar')
>>> g4 = g3.create_group('baz')
>>> g5 = g3.create_group('quux')
>>> def print_visitor(obj):
...     print(obj)
>>> g1.visitvalues(print_visitor)
<zarr.hierarchy.Group '/bar'>
<zarr.hierarchy.Group '/bar/baz'>
<zarr.hierarchy.Group '/bar/quux'>
<zarr.hierarchy.Group '/foo'>
>>> g3.visitvalues(print_visitor)
<zarr.hierarchy.Group '/bar/baz'>
<zarr.hierarchy.Group '/bar/quux'>
- visititems(func)
Run func on each object’s path and the object itself.
Note: If func returns None (or doesn’t return), iteration continues. However, if func returns anything else, it ceases and returns that value.
Examples
>>> import zarr
>>> g1 = zarr.group()
>>> g2 = g1.create_group('foo')
>>> g3 = g1.create_group('bar')
>>> g4 = g3.create_group('baz')
>>> g5 = g3.create_group('quux')
>>> def print_visitor(name, obj):
...     print((name, obj))
>>> g1.visititems(print_visitor)
('bar', <zarr.hierarchy.Group '/bar'>)
('bar/baz', <zarr.hierarchy.Group '/bar/baz'>)
('bar/quux', <zarr.hierarchy.Group '/bar/quux'>)
('foo', <zarr.hierarchy.Group '/foo'>)
>>> g3.visititems(print_visitor)
('baz', <zarr.hierarchy.Group '/bar/baz'>)
('quux', <zarr.hierarchy.Group '/bar/quux'>)
- tree(expand=False, level=None)
Provide a print-able display of the hierarchy.
- Parameters
- expand : bool, optional
Only relevant for HTML representation. If True, tree will be fully expanded.
- level : int, optional
Maximum depth to descend into hierarchy.
Notes
Please note that this is an experimental feature. The behaviour of this function is still evolving and the default output and/or parameters may change in future versions.
Examples
>>> import zarr
>>> g1 = zarr.group()
>>> g2 = g1.create_group('foo')
>>> g3 = g1.create_group('bar')
>>> g4 = g3.create_group('baz')
>>> g5 = g3.create_group('quux')
>>> d1 = g5.create_dataset('baz', shape=100, chunks=10)
>>> g1.tree()
/
 ├── bar
 │   ├── baz
 │   └── quux
 │       └── baz (100,) float64
 └── foo
>>> g1.tree(level=2)
/
 ├── bar
 │   ├── baz
 │   └── quux
 └── foo
>>> g3.tree()
bar
 ├── baz
 └── quux
     └── baz (100,) float64
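To make the meaning of level concrete, here is a small stand-alone renderer that draws a nested dict with box-drawing characters and stops descending after a given depth. It is a sketch under stated assumptions: render_tree is a hypothetical function, the hierarchy is hard-coded to mirror the example above, and the exact spacing of zarr's own output may differ.

```python
def render_tree(name, children, level=None, _depth=0, _prefix=""):
    """Return display lines for a nested dict, descending at most `level` levels."""
    lines = [name] if _depth == 0 else []
    if level is not None and _depth >= level:
        return lines  # depth limit reached: do not list children
    items = sorted(children.items())
    for i, (child, grandchildren) in enumerate(items):
        last = i == len(items) - 1
        lines.append(_prefix + ("└── " if last else "├── ") + child)
        # Children of the last entry are indented with spaces, others with a bar.
        child_prefix = _prefix + ("    " if last else "│   ")
        lines.extend(render_tree(child, grandchildren, level, _depth + 1, child_prefix))
    return lines


hierarchy = {
    "bar": {"baz": {}, "quux": {"baz (100,) float64": {}}},
    "foo": {},
}

print("\n".join(render_tree("/", hierarchy)))
print("\n".join(render_tree("/", hierarchy, level=2)))
```

With level=2 the array under bar/quux sits at depth 3 and is therefore omitted, matching the g1.tree(level=2) output shown above.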
- create_group(name, overwrite=False)
Create a sub-group.
- Parameters
- name : string
Group name.
- overwrite : bool, optional
If True, overwrite any existing array with the given name.
- Returns
- g : zarr.hierarchy.Group
Examples
>>> import zarr
>>> g1 = zarr.group()
>>> g2 = g1.create_group('foo')
>>> g3 = g1.create_group('bar')
>>> g4 = g1.create_group('baz/quux')
- require_group(name, overwrite=False)
Obtain a sub-group, creating one if it doesn’t exist.
- Parameters
- name : string
Group name.
- overwrite : bool, optional
Overwrite any existing array with given name if present.
- Returns
- g : zarr.hierarchy.Group
Examples
>>> import zarr
>>> g1 = zarr.group()
>>> g2 = g1.require_group('foo')
>>> g3 = g1.require_group('foo')
>>> g2 == g3
True
- create_groups(*names, **kwargs)
Convenience method to create multiple groups in a single call.
- create_dataset(name, **kwargs)
Create an array.
Arrays are known as “datasets” in HDF5 terminology. For compatibility with h5py, Zarr groups also implement the require_dataset() method.
- Parameters
- name : string
Array name.
- data : array-like, optional
Initial data.
- shape : int or tuple of ints
Array shape.
- chunks : int or tuple of ints, optional
Chunk shape. If not provided, will be guessed from shape and dtype.
- dtype : string or dtype, optional
NumPy dtype.
- compressor : Codec, optional
Primary compressor.
- fill_value : object
Default value to use for uninitialized portions of the array.
- order : {'C', 'F'}, optional
Memory layout to be used within each chunk.
- synchronizer : zarr.sync.ArraySynchronizer, optional
Array synchronizer.
- filters : sequence of Codecs, optional
Sequence of filters to use to encode chunk data prior to compression.
- overwrite : bool, optional
If True, replace any existing array or group with the given name.
- cache_metadata : bool, optional
If True, array configuration metadata will be cached for the lifetime of the object. If False, array metadata will be reloaded prior to all data access and modification operations (may incur overhead depending on storage and data access pattern).
- dimension_separator : {'.', '/'}, optional
Separator placed between the dimensions of a chunk.
- Returns
- a : zarr.core.Array
Examples
>>> import zarr
>>> g1 = zarr.group()
>>> d1 = g1.create_dataset('foo', shape=(10000, 10000),
...                        chunks=(1000, 1000))
>>> d1
<zarr.core.Array '/foo' (10000, 10000) float64>
>>> d2 = g1.create_dataset('bar/baz/qux', shape=(100, 100, 100),
...                        chunks=(100, 10, 10))
>>> d2
<zarr.core.Array '/bar/baz/qux' (100, 100, 100) float64>
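The shape, chunks and dimension_separator parameters together determine how many chunks an array has and how each chunk can be identified. The arithmetic can be sketched without zarr as follows. chunk_grid and chunk_keys are hypothetical helpers, and joining chunk indices with the separator is an assumption about key naming based on the dimension_separator description above.

```python
import math
from itertools import product


def chunk_grid(shape, chunks):
    """Number of chunks along each dimension (the last chunk may be partial)."""
    return tuple(-(-s // c) for s, c in zip(shape, chunks))  # integer ceiling


def chunk_keys(shape, chunks, dimension_separator="."):
    """Index-based identifiers for every chunk, joined by the separator."""
    grid = chunk_grid(shape, chunks)
    return [
        dimension_separator.join(str(i) for i in index)
        for index in product(*(range(n) for n in grid))
    ]


# The two arrays from the example above.
grid_foo = chunk_grid((10000, 10000), (1000, 1000))
grid_qux = chunk_grid((100, 100, 100), (100, 10, 10))
print(grid_foo, math.prod(grid_foo))
print(grid_qux, math.prod(grid_qux))

# A tiny array makes the key naming visible.
print(chunk_keys((4, 4), (2, 2)))
print(chunk_keys((4, 4), (2, 2), dimension_separator="/"))
```

Both example arrays end up with 100 chunks, although the chunk shapes differ: a 10 by 10 grid for 'foo' and a 1 by 10 by 10 grid for 'bar/baz/qux'.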
- require_dataset(name, shape, dtype=None, exact=False, **kwargs)
Obtain an array, creating it if it doesn’t exist.
Arrays are known as “datasets” in HDF5 terminology. For compatibility with h5py, Zarr groups also implement the create_dataset() method.
Other kwargs are as per zarr.hierarchy.Group.create_dataset().
- Parameters
- name : string
Array name.
- shape : int or tuple of ints
Array shape.
- dtype : string or dtype, optional
NumPy dtype.
- exact : bool, optional
If True, require dtype to match exactly. If False, require only that dtype can be cast from the array dtype.
- create(name, **kwargs)
Create an array. Keyword arguments as per zarr.creation.create().
- empty(name, **kwargs)
Create an array. Keyword arguments as per zarr.creation.empty().
- zeros(name, **kwargs)
Create an array. Keyword arguments as per zarr.creation.zeros().
- ones(name, **kwargs)
Create an array. Keyword arguments as per zarr.creation.ones().
- full(name, fill_value, **kwargs)
Create an array. Keyword arguments as per zarr.creation.full().
- array(name, data, **kwargs)
Create an array. Keyword arguments as per zarr.creation.array().
- empty_like(name, data, **kwargs)
Create an array. Keyword arguments as per zarr.creation.empty_like().
- zeros_like(name, data, **kwargs)
Create an array. Keyword arguments as per zarr.creation.zeros_like().
- ones_like(name, data, **kwargs)
Create an array. Keyword arguments as per zarr.creation.ones_like().
- full_like(name, data, **kwargs)
Create an array. Keyword arguments as per zarr.creation.full_like().