Storage (zarr.storage)

This module contains storage classes for use with Zarr arrays and groups. Note that any object implementing the MutableMapping interface (collections.abc.MutableMapping in the Python standard library) can also be used as a Zarr array store.
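
As an illustration of the MutableMapping requirement, the following is a minimal sketch of a custom store built only on the standard library. The class name CountingStore and its writes attribute are hypothetical and not part of zarr; the sketch assumes that string keys and bytes-like values are all a store needs to handle.

```python
from collections.abc import MutableMapping


class CountingStore(MutableMapping):
    """Hypothetical store: wraps a dict and counts write operations."""

    def __init__(self):
        self._data = {}
        self.writes = 0

    def __getitem__(self, key):
        return self._data[key]

    def __setitem__(self, key, value):
        self.writes += 1
        # Keep an immutable copy of the bytes-like value.
        self._data[key] = bytes(value)

    def __delitem__(self, key):
        del self._data[key]

    def __iter__(self):
        return iter(self._data)

    def __len__(self):
        return len(self._data)


store = CountingStore()
store['foo'] = b'bar'
store['a/b/c'] = b'xxx'
print(sorted(store), store.writes)
```

Because MutableMapping supplies keys(), items(), update() and membership tests from the five methods above, an object like this should be usable wherever a plain dict store is accepted.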

zarr.storage.init_array(store, shape, chunks=None, dtype=None, compressor='default', fill_value=None, order='C', overwrite=False, path=None, chunk_store=None, filters=None)

Initialize an array store with the given configuration.

Parameters:

store : MutableMapping

A mapping that supports string keys and bytes-like values.

shape : int or tuple of ints

Array shape.

chunks : int or tuple of ints, optional

Chunk shape. If not provided, a chunk shape will be guessed from shape and dtype.

dtype : string or dtype, optional

NumPy dtype.

compressor : Codec, optional

Primary compressor.

fill_value : object

Default value to use for uninitialized portions of the array.

order : {‘C’, ‘F’}, optional

Memory layout to be used within each chunk.

overwrite : bool, optional

If True, erase all data in store prior to initialisation.

path : string, optional

Path under which array is stored.

chunk_store : MutableMapping, optional

Separate storage for chunks. If not provided, store will be used for storage of both chunks and metadata.

filters : sequence, optional

Sequence of filters to use to encode chunk data prior to compression.

Notes

The initialisation process involves normalising all array metadata, encoding it as JSON and storing it under the ‘.zarray’ key. User attributes are also initialized and stored as JSON under the ‘.zattrs’ key.

Examples

Initialize an array store:

>>> from zarr.storage import init_array
>>> store = dict()
>>> init_array(store, shape=(10000, 10000), chunks=(1000, 1000))
>>> sorted(store.keys())
['.zarray', '.zattrs']

Array metadata is stored as JSON:

>>> print(str(store['.zarray'], 'ascii'))
{
    "chunks": [
        1000,
        1000
    ],
    "compressor": {
        "clevel": 5,
        "cname": "lz4",
        "id": "blosc",
        "shuffle": 1
    },
    "dtype": "<f8",
    "fill_value": null,
    "filters": null,
    "order": "C",
    "shape": [
        10000,
        10000
    ],
    "zarr_format": 2
}

User-defined attributes are also stored as JSON, initially empty:

>>> print(str(store['.zattrs'], 'ascii'))
{}

Initialize an array using a storage path:

>>> store = dict()
>>> init_array(store, shape=100000000, chunks=1000000, dtype='i1',
...            path='foo')
>>> sorted(store.keys())
['.zattrs', '.zgroup', 'foo/.zarray', 'foo/.zattrs']
>>> print(str(store['foo/.zarray'], 'ascii'))
{
    "chunks": [
        1000000
    ],
    "compressor": {
        "clevel": 5,
        "cname": "lz4",
        "id": "blosc",
        "shuffle": 1
    },
    "dtype": "|i1",
    "fill_value": null,
    "filters": null,
    "order": "C",
    "shape": [
        100000000
    ],
    "zarr_format": 2
}
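
Because the metadata is plain JSON, it can be read back with the standard json module. The sketch below does not call zarr; it assumes a hand-built store holding a reduced copy of the ‘.zarray’ document from the first example, and works out how many chunks that shape and chunk shape imply.

```python
import json
import math

# Hand-built, reduced copy of the '.zarray' document from the first example.
store = {
    '.zarray': json.dumps({
        "chunks": [1000, 1000],
        "dtype": "<f8",
        "shape": [10000, 10000],
        "zarr_format": 2,
    }).encode('ascii'),
}

meta = json.loads(store['.zarray'].decode('ascii'))

# Chunks along each dimension, rounding up to account for partial chunks.
chunks_per_dim = [
    math.ceil(s / c) for s, c in zip(meta['shape'], meta['chunks'])
]
n_chunks = math.prod(chunks_per_dim)
print(chunks_per_dim, n_chunks)
```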

zarr.storage.init_group(store, overwrite=False, path=None, chunk_store=None)

Initialize a group store.

Parameters:

store : MutableMapping

A mapping that supports string keys and byte sequence values.

overwrite : bool, optional

If True, erase all data in store prior to initialisation.

path : string, optional

Path under which group is stored.

chunk_store : MutableMapping, optional

Separate storage for chunks. If not provided, store will be used for storage of both chunks and metadata.

class zarr.storage.DictStore(cls=dict)

Extended mutable mapping interface to a hierarchy of dicts.

Examples

>>> import zarr
>>> store = zarr.DictStore()
>>> store['foo'] = b'bar'
>>> store['foo']
b'bar'
>>> store['a/b/c'] = b'xxx'
>>> store['a/b/c']
b'xxx'
>>> sorted(store.keys())
['a/b/c', 'foo']
>>> store.listdir()
['a', 'foo']
>>> store.listdir('a/b')
['c']
>>> store.rmdir('a')
>>> sorted(store.keys())
['foo']
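
The idea of a hierarchy of dicts addressed by ‘/’-separated keys can be sketched as follows. This is an illustration only, not zarr's implementation; the helper names set_item, get_node and listdir are hypothetical.

```python
def set_item(root, key, value):
    """Store value under a '/'-separated key, creating nested dicts."""
    *parents, name = key.split('/')
    node = root
    for segment in parents:
        node = node.setdefault(segment, {})
    node[name] = value


def get_node(root, path=''):
    """Walk the nested dicts along path; an empty path returns the root."""
    node = root
    for segment in filter(None, path.split('/')):
        node = node[segment]
    return node


def listdir(root, path=''):
    """Return the sorted names of the immediate children of path."""
    return sorted(get_node(root, path))


root = {}
set_item(root, 'foo', b'bar')
set_item(root, 'a/b/c', b'xxx')
print(listdir(root))
print(listdir(root, 'a/b'))
print(get_node(root, 'a/b/c'))
```

With this layout, listing a directory or removing a whole subtree touches a single nested dict rather than scanning every key.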

class zarr.storage.DirectoryStore(path)

Mutable mapping interface to a directory. Keys must be strings, values must be bytes-like objects.

Parameters:

path : string

Location of directory.

Examples

>>> import zarr
>>> store = zarr.DirectoryStore('example_store')
>>> store['foo'] = b'bar'
>>> store['foo']
b'bar'
>>> open('example_store/foo', 'rb').read()
b'bar'
>>> store['a/b/c'] = b'xxx'
>>> store['a/b/c']
b'xxx'
>>> open('example_store/a/b/c', 'rb').read()
b'xxx'
>>> sorted(store.keys())
['a/b/c', 'foo']
>>> store.listdir()
['a', 'foo']
>>> store.listdir('a/b')
['c']
>>> store.rmdir('a')
>>> sorted(store.keys())
['foo']
>>> import os
>>> os.path.exists('example_store/a')
False
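
The examples above show each key mapping to a file beneath the store directory. A minimal standard-library sketch of that mapping is given below; the functions put and get are hypothetical helpers, not zarr functions, and the sketch assumes keys use ‘/’ as the separator.

```python
import os
import tempfile


def put(root, key, value):
    """Write value to the file that key maps to, creating parent directories."""
    path = os.path.join(root, *key.split('/'))
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, 'wb') as f:
        f.write(value)


def get(root, key):
    """Read the file that key maps to; a missing file becomes a KeyError."""
    path = os.path.join(root, *key.split('/'))
    try:
        with open(path, 'rb') as f:
            return f.read()
    except FileNotFoundError:
        raise KeyError(key)


with tempfile.TemporaryDirectory() as root:
    put(root, 'a/b/c', b'xxx')
    value = get(root, 'a/b/c')
    on_disk = os.path.isfile(os.path.join(root, 'a', 'b', 'c'))
print(value, on_disk)
```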

class zarr.storage.TempStore(suffix='', prefix='zarr', dir=None)

Directory store using a temporary directory for storage.

class zarr.storage.ZipStore(path, compression=0, allowZip64=True, mode='a')

Mutable mapping interface to a zip file. Keys must be strings, values must be bytes-like objects.

Parameters:

path : string

Location of file.

compression : integer, optional

Compression method to use when writing to the archive. The default, 0, corresponds to zipfile.ZIP_STORED (no compression).

allowZip64 : bool, optional

If True (the default), ZIP files that use the ZIP64 extensions are created when the zip file is larger than 2 GiB. If False, an exception is raised when the ZIP file would require ZIP64 extensions.

mode : string, optional

One of ‘r’ to read an existing file, ‘w’ to truncate and write a new file, ‘a’ to append to an existing file, or ‘x’ to exclusively create and write a new file.

Notes

When modifying a ZipStore, the close() method must be called; otherwise essential data will not be written to the underlying zip file. The ZipStore class also supports the context manager protocol, which ensures the close() method is called on leaving the with statement.
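
The reason close() matters can be seen with the standard zipfile module alone. The sketch below does not use ZipStore; it assumes only that a zip archive is not recognisable as such until its central directory has been written on close, and shows the context manager form doing that automatically.

```python
import os
import tempfile
import zipfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'example.zip')

    zf = zipfile.ZipFile(path, mode='w')
    zf.writestr('foo', b'bar')
    # The central directory has not been written yet.
    valid_before_close = zipfile.is_zipfile(path)
    zf.close()
    valid_after_close = zipfile.is_zipfile(path)

    # The context manager form closes the file on leaving the with statement.
    with zipfile.ZipFile(path, mode='a') as zf:
        zf.writestr('a/b/c', b'xxx')

    with zipfile.ZipFile(path, mode='r') as zf:
        names = sorted(zf.namelist())

print(valid_before_close, valid_after_close, names)
```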

Examples

>>> import zarr
>>> store = zarr.ZipStore('example.zip', mode='w')
>>> store['foo'] = b'bar'
>>> store['foo']
b'bar'
>>> store['a/b/c'] = b'xxx'
>>> store['a/b/c']
b'xxx'
>>> sorted(store.keys())
['a/b/c', 'foo']
>>> store.close()
>>> import zipfile
>>> zf = zipfile.ZipFile('example.zip', mode='r')
>>> sorted(zf.namelist())
['a/b/c', 'foo']

close()

Closes the underlying zip file, ensuring all records are written.

flush()

Closes the underlying zip file, ensuring all records are written, then re-opens the file for further modifications.
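
The close-then-reopen behaviour described for flush() can be sketched with the standard zipfile module. The FlushableZip class below is a hypothetical wrapper written for illustration, not zarr's ZipStore, and it assumes reopening in append mode is an acceptable way to continue writing.

```python
import os
import tempfile
import zipfile


class FlushableZip:
    """Hypothetical wrapper around zipfile.ZipFile with a flush() method."""

    def __init__(self, path):
        self.path = path
        self.zf = zipfile.ZipFile(path, mode='w')

    def flush(self):
        # Close to write the central directory, then reopen in append mode.
        self.zf.close()
        self.zf = zipfile.ZipFile(self.path, mode='a')

    def close(self):
        self.zf.close()


with tempfile.TemporaryDirectory() as tmp:
    archive = FlushableZip(os.path.join(tmp, 'example.zip'))
    archive.zf.writestr('foo', b'bar')
    archive.flush()
    # After flush the file on disk is a complete archive.
    readable_after_flush = zipfile.is_zipfile(archive.path)
    archive.zf.writestr('a/b/c', b'xxx')  # still writable after flush
    archive.close()
    with zipfile.ZipFile(archive.path) as zf:
        names = sorted(zf.namelist())

print(readable_after_flush, names)
```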

zarr.storage.migrate_1to2(store)

Migrate array metadata in store from Zarr format version 1 to version 2.

Parameters:

store : MutableMapping

Store to be migrated.

Notes

Version 1 did not support hierarchies, so this migration function will look for a single array in store and migrate the array metadata to version 2.