The Array class (zarr.core)

class zarr.core.Array(store, readonly=False)

Instantiate an array from an initialised store.

Parameters:

store : MutableMapping

Array store, already initialised.

readonly : bool, optional

True if the array should be protected against modification.

Examples

>>> import zarr
>>> store = dict()
>>> zarr.init_store(store, shape=(10000, 10000), chunks=(1000, 1000))
>>> z = zarr.Array(store)
>>> z
zarr.core.Array((10000, 10000), float64, chunks=(1000, 1000), order=C)
  compression: blosc; compression_opts: {'clevel': 5, 'cname': 'lz4', 'shuffle': 1}
  nbytes: 762.9M; nbytes_stored: 316; ratio: 2531645.6; initialized: 0/100
  store: builtins.dict
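
The store parameter is described above only as a MutableMapping, and the example uses a plain dict. As a minimal sketch of what another object satisfying that interface could look like, the class below wraps a dict and counts writes. The name CountingStore and the keys written to it are hypothetical and chosen for illustration; they are not part of zarr and do not reflect the keys zarr itself uses.

```python
from collections.abc import MutableMapping


class CountingStore(MutableMapping):
    """Hypothetical dict-backed store that counts how many writes it receives."""

    def __init__(self):
        self._data = {}
        self.writes = 0

    def __getitem__(self, key):
        return self._data[key]

    def __setitem__(self, key, value):
        # Every assignment is counted, including overwrites of an existing key.
        self.writes += 1
        self._data[key] = value

    def __delitem__(self, key):
        del self._data[key]

    def __iter__(self):
        return iter(self._data)

    def __len__(self):
        return len(self._data)


store = CountingStore()
store["example-metadata"] = b"{}"        # made-up key, for illustration only
store["example-chunk"] = b"\x00" * 8     # made-up key, for illustration only
```

Implementing the five abstract methods is enough for collections.abc.MutableMapping to supply the rest of the mapping interface (keys, items, update, and so on).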

Attributes

store A MutableMapping providing the underlying storage for the array.
readonly A boolean, True if write operations are not permitted.
shape A tuple of integers describing the length of each dimension of the array.
chunks A tuple of integers describing the length of each dimension of a chunk of the array.
dtype The NumPy data type.
compression A string naming the primary compression algorithm used to compress chunks of the array.
compression_opts Parameters controlling the behaviour of the primary compression algorithm.
fill_value A value used for uninitialized portions of the array.
order A string indicating the order in which bytes are arranged within chunks of the array.
attrs A MutableMapping containing user-defined attributes.
size The total number of elements in the array.
itemsize The size in bytes of each item in the array.
nbytes The total number of bytes that would be required to store the array without compression.
nbytes_stored The total number of stored bytes of data for the array.
initialized The number of chunks that have been initialized with some data.
cdata_shape A tuple of integers describing the number of chunks along each dimension of the array.
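
Several of these attributes follow arithmetically from shape, chunks and itemsize. The sketch below reproduces that arithmetic in plain Python for the array shown in the class example. The function name describe_layout is hypothetical, and the code is an illustration of the relationships, not zarr's implementation. Rounding the chunk count up for a partially filled edge chunk is an assumption, and the ratio formula is inferred from the numbers printed in the examples on this page.

```python
import math


def describe_layout(shape, chunks, itemsize):
    """Derive size, nbytes and the chunk grid shape (hypothetical helper)."""
    size = math.prod(shape)                      # total number of elements
    nbytes = size * itemsize                     # uncompressed size in bytes
    # Ceiling division: a partly filled edge chunk still counts as a chunk.
    cdata_shape = tuple(-(-s // c) for s, c in zip(shape, chunks))
    return size, nbytes, cdata_shape


size, nbytes, cdata_shape = describe_layout((10000, 10000), (1000, 1000), 8)
n_chunks = math.prod(cdata_shape)                # denominator of "initialized"
nbytes_mib = round(nbytes / 2**20, 1)            # printed as 762.9M in the repr
ratio = round(nbytes / 316, 1)                   # 316 is nbytes_stored in that repr
```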

Methods

__getitem__(item) Retrieve data for some portion of the array.
__setitem__(key, value) Modify data for some portion of the array.
resize(*args) Change the shape of the array by growing or shrinking one or more dimensions.
append(data[, axis]) Append data along an axis, growing the array.

__getitem__(item)

Retrieve data for some portion of the array. Most NumPy-style slicing operations are supported.

Returns:

out : ndarray

A NumPy array containing the data for the requested region.
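
Because the data are stored in chunks, a requested region maps onto some subset of the chunk grid. The following sketch shows that mapping for a contiguous slice of a 1-dimensional array. The function chunks_touched is hypothetical and is not how zarr resolves selections internally; it only illustrates the index arithmetic, assuming chunks lie on a regular grid as described by the chunks attribute.

```python
def chunks_touched(selection, length, chunk_length):
    """Return the range of chunk indices overlapped by a 1-D slice (sketch)."""
    # slice.indices() resolves None and negative bounds against the array length.
    start, stop, step = selection.indices(length)
    if step != 1:
        raise ValueError("this sketch only handles contiguous slices")
    if stop <= start:
        return range(0)                          # empty selection
    first = start // chunk_length
    last = (stop - 1) // chunk_length            # stop is exclusive
    return range(first, last + 1)


length, chunk_length = 100000000, 1000000
head = chunks_touched(slice(5, 10), length, chunk_length)
tail = chunks_touched(slice(-5, None), length, chunk_length)
everything = chunks_touched(slice(None), length, chunk_length)
```

Under these assumptions a small slice touches a single chunk, while a full slice spans every chunk in the grid.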

Examples

Set up a 1-dimensional array:

>>> import zarr
>>> import numpy as np
>>> z = zarr.array(np.arange(100000000), chunks=1000000, dtype='i4')
>>> z
zarr.core.Array((100000000,), int32, chunks=(1000000,), order=C)
  compression: blosc; compression_opts: {'clevel': 5, 'cname': 'lz4', 'shuffle': 1}
  nbytes: 381.5M; nbytes_stored: 6.7M; ratio: 56.8; initialized: 100/100
  store: builtins.dict

Take some slices:

>>> z[5]
5
>>> z[:5]
array([0, 1, 2, 3, 4], dtype=int32)
>>> z[-5:]
array([99999995, 99999996, 99999997, 99999998, 99999999], dtype=int32)
>>> z[5:10]
array([5, 6, 7, 8, 9], dtype=int32)
>>> z[:]
array([       0,        1,        2, ..., 99999997, 99999998, 99999999], dtype=int32)

Set up a 2-dimensional array:

>>> import zarr
>>> import numpy as np
>>> z = zarr.array(np.arange(100000000).reshape(10000, 10000),
...                chunks=(1000, 1000), dtype='i4')
>>> z
zarr.core.Array((10000, 10000), int32, chunks=(1000, 1000), order=C)
  compression: blosc; compression_opts: {'clevel': 5, 'cname': 'lz4', 'shuffle': 1}
  nbytes: 381.5M; nbytes_stored: 9.5M; ratio: 40.1; initialized: 100/100
  store: builtins.dict

Take some slices:

>>> z[2, 2]
20002
>>> z[:2, :2]
array([[    0,     1],
       [10000, 10001]], dtype=int32)
>>> z[:2]
array([[    0,     1,     2, ...,  9997,  9998,  9999],
       [10000, 10001, 10002, ..., 19997, 19998, 19999]], dtype=int32)
>>> z[:, :2]
array([[       0,        1],
       [   10000,    10001],
       [   20000,    20001],
       ...,
       [99970000, 99970001],
       [99980000, 99980001],
       [99990000, 99990001]], dtype=int32)
>>> z[:]
array([[       0,        1,        2, ...,     9997,     9998,     9999],
       [   10000,    10001,    10002, ...,    19997,    19998,    19999],
       [   20000,    20001,    20002, ...,    29997,    29998,    29999],
       ...,
       [99970000, 99970001, 99970002, ..., 99979997, 99979998, 99979999],
       [99980000, 99980001, 99980002, ..., 99989997, 99989998, 99989999],
       [99990000, 99990001, 99990002, ..., 99999997, 99999998, 99999999]], dtype=int32)

__setitem__(key, value)

Modify data for some portion of the array.

Examples

Set up a 1-dimensional array:

>>> import zarr
>>> import numpy as np
>>> z = zarr.zeros(100000000, chunks=1000000, dtype='i4')
>>> z
zarr.core.Array((100000000,), int32, chunks=(1000000,), order=C)
  compression: blosc; compression_opts: {'clevel': 5, 'cname': 'lz4', 'shuffle': 1}
  nbytes: 381.5M; nbytes_stored: 291; ratio: 1374570.4; initialized: 0/100
  store: builtins.dict

Set all array elements to the same scalar value:

>>> z[:] = 42
>>> z[:]
array([42, 42, 42, ..., 42, 42, 42], dtype=int32)

Set a portion of the array:

>>> z[:100] = np.arange(100)
>>> z[-100:] = np.arange(100)[::-1]
>>> z[:]
array([0, 1, 2, ..., 2, 1, 0], dtype=int32)

Set up a 2-dimensional array:

>>> z = zarr.zeros((10000, 10000), chunks=(1000, 1000), dtype='i4')
>>> z
zarr.core.Array((10000, 10000), int32, chunks=(1000, 1000), order=C)
  compression: blosc; compression_opts: {'clevel': 5, 'cname': 'lz4', 'shuffle': 1}
  nbytes: 381.5M; nbytes_stored: 313; ratio: 1277955.3; initialized: 0/100
  store: builtins.dict

Set all array elements to the same scalar value:

>>> z[:] = 42
>>> z[:]
array([[42, 42, 42, ..., 42, 42, 42],
       [42, 42, 42, ..., 42, 42, 42],
       [42, 42, 42, ..., 42, 42, 42],
       ...,
       [42, 42, 42, ..., 42, 42, 42],
       [42, 42, 42, ..., 42, 42, 42],
       [42, 42, 42, ..., 42, 42, 42]], dtype=int32)

Set a portion of the array:

>>> z[0, :] = np.arange(z.shape[1])
>>> z[:, 0] = np.arange(z.shape[0])
>>> z[:]
array([[   0,    1,    2, ..., 9997, 9998, 9999],
       [   1,   42,   42, ...,   42,   42,   42],
       [   2,   42,   42, ...,   42,   42,   42],
       ...,
       [9997,   42,   42, ...,   42,   42,   42],
       [9998,   42,   42, ...,   42,   42,   42],
       [9999,   42,   42, ...,   42,   42,   42]], dtype=int32)

resize(*args)

Change the shape of the array by growing or shrinking one or more dimensions.

Notes

When resizing an array, the data are not rearranged in any way.

If one or more dimensions are shrunk, any chunks falling outside the new array shape will be deleted from the underlying store.
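
To make the note about shrinking concrete, the sketch below lists which positions in the chunk grid fall outside a new shape. The function chunks_outside is hypothetical and works on grid indices only; it does not touch a store, and whether a given position actually holds a stored chunk depends on what has been written.

```python
from itertools import product


def chunks_outside(old_shape, new_shape, chunks):
    """List chunk grid indices valid for old_shape but outside new_shape (sketch)."""

    def grid(shape):
        # Number of chunks along each dimension, rounding up for edge chunks.
        return tuple(-(-s // c) for s, c in zip(shape, chunks))

    old_grid, new_grid = grid(old_shape), grid(new_shape)
    return [
        idx
        for idx in product(*(range(n) for n in old_grid))
        if any(i >= n for i, n in zip(idx, new_grid))
    ]


# Shrinking the second dimension from 10000 to 1000 with (1000, 1000) chunks.
dropped = chunks_outside((20000, 10000), (30000, 1000), (1000, 1000))
# Growing only: no existing grid position falls outside the new shape.
grown = chunks_outside((10000, 10000), (20000, 10000), (1000, 1000))
```

Chunks that remain inside the new shape keep their grid positions, which is consistent with the statement that the data are not rearranged.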

Examples

>>> import zarr
>>> z = zarr.zeros(shape=(10000, 10000), chunks=(1000, 1000))
>>> z
zarr.core.Array((10000, 10000), float64, chunks=(1000, 1000), order=C)
  compression: blosc; compression_opts: {'clevel': 5, 'cname': 'lz4', 'shuffle': 1}
  nbytes: 762.9M; nbytes_stored: 313; ratio: 2555910.5; initialized: 0/100
  store: builtins.dict
>>> z.resize(20000, 10000)
>>> z
zarr.core.Array((20000, 10000), float64, chunks=(1000, 1000), order=C)
  compression: blosc; compression_opts: {'clevel': 5, 'cname': 'lz4', 'shuffle': 1}
  nbytes: 1.5G; nbytes_stored: 313; ratio: 5111821.1; initialized: 0/200
  store: builtins.dict
>>> z.resize(30000, 1000)
>>> z
zarr.core.Array((30000, 1000), float64, chunks=(1000, 1000), order=C)
  compression: blosc; compression_opts: {'clevel': 5, 'cname': 'lz4', 'shuffle': 1}
  nbytes: 228.9M; nbytes_stored: 312; ratio: 769230.8; initialized: 0/30
  store: builtins.dict

append(data, axis=0)

Append data along the given axis, growing the array along that axis.

Parameters:

data : array_like

Data to be appended.

axis : int

Axis along which to append.

Notes

The size of all dimensions other than axis must match between this array and data.
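
The shape rule in the note can be sketched as a small check that also computes the resulting shape. The function appended_shape is hypothetical, and raising ValueError on a mismatch is this sketch's choice; the exception zarr raises in that case is not stated here.

```python
def appended_shape(shape, data_shape, axis=0):
    """Return the shape after appending data_shape to shape along axis (sketch)."""
    if len(shape) != len(data_shape):
        raise ValueError("array and data must have the same number of dimensions")
    if not 0 <= axis < len(shape):
        raise ValueError("axis out of range")
    for dim, (n, m) in enumerate(zip(shape, data_shape)):
        # Every dimension except the append axis must already agree.
        if dim != axis and n != m:
            raise ValueError(f"dimension {dim} differs: {n} != {m}")
    return tuple(
        n + m if dim == axis else n
        for dim, (n, m) in enumerate(zip(shape, data_shape))
    )


rows_added = appended_shape((10000, 1000), (10000, 1000))           # axis=0
cols_added = appended_shape((20000, 1000), (20000, 1000), axis=1)
```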

Examples

>>> import numpy as np
>>> import zarr
>>> a = np.arange(10000000, dtype='i4').reshape(10000, 1000)
>>> z = zarr.array(a, chunks=(1000, 100))
>>> z
zarr.core.Array((10000, 1000), int32, chunks=(1000, 100), order=C)
  compression: blosc; compression_opts: {'clevel': 5, 'cname': 'lz4', 'shuffle': 1}
  nbytes: 38.1M; nbytes_stored: 1.9M; ratio: 20.0; initialized: 100/100
  store: builtins.dict
>>> z.append(a)
>>> z
zarr.core.Array((20000, 1000), int32, chunks=(1000, 100), order=C)
  compression: blosc; compression_opts: {'clevel': 5, 'cname': 'lz4', 'shuffle': 1}
  nbytes: 76.3M; nbytes_stored: 3.8M; ratio: 20.0; initialized: 200/200
  store: builtins.dict
>>> z.append(np.vstack([a, a]), axis=1)
>>> z
zarr.core.Array((20000, 2000), int32, chunks=(1000, 100), order=C)
  compression: blosc; compression_opts: {'clevel': 5, 'cname': 'lz4', 'shuffle': 1}
  nbytes: 152.6M; nbytes_stored: 7.6M; ratio: 20.0; initialized: 400/400
  store: builtins.dict