Release notes#

Note

Zarr-Python 2.18.* is expected to be the final release in the 2.* series. Work on Zarr-Python 3.0 is underway. See GH1777 for more details on the upcoming 3.0 release.

3.0.0-alpha#

Warning

Zarr-Python 3.0.0-alpha is a pre-release of the upcoming 3.0 release. This release is not feature complete or expected to be ready for production applications.

Note

The complete release notes for 3.0 have not been added to this document yet. See the 3.0.0-alpha release on GitHub for a record of changes included in this release.

Enhancements#

Typing#

Maintenance#

Documentation#

2.18.3#

Enhancements#

Maintenance#

Deprecations#

  • Deprecate zarr.n5.N5Store and zarr.n5.N5FSStore. These stores are slated to be removed in Zarr Python 3.0. By Joe Hamman #2085.

2.18.2#

Enhancements#

2.18.1#

Maintenance#

  • Fix a regression when getting or setting a single value from arrays with size-1 chunks. By Deepak Cherian #1874

2.18.0#

Enhancements#

  • Performance improvement for reading and writing chunks if any of the dimensions is size 1. By Deepak Cherian #1730.

Maintenance#

Deprecations#

  • Deprecate experimental v3 support by issuing a FutureWarning. Also updated docs to warn about using the experimental v3 version. By Joe Hamman #1802 and #1807.

  • Deprecate the following stores: zarr.storage.DBMStore, zarr.storage.LMDBStore, zarr.storage.SQLiteStore, zarr.storage.MongoDBStore, zarr.storage.RedisStore, and zarr.storage.ABSStore. These stores are slated to be removed from Zarr-Python in version 3.0. By Joe Hamman #1801.

2.17.2#

Enhancements#

Docs#

Maintenance#

2.17.1#

Enhancements#

Docs#

Maintenance#

2.17.0#

Enhancements#

Docs#

Maintenance#

2.16.1#

Maintenance#

2.16.0#

Enhancements#

Maintenance#

2.15.0#

Enhancements#

Maintenance#

Documentation#

Bug fixes#

2.14.2#

Bug fixes#

2.14.1#

Documentation#

2.14.0#

Major changes#

  • Improve Zarr V3 support, adding partial store read/write and storage transformers. Add new features from the v3 spec:

    • storage transformers

    • get_partial_values and set_partial_values

    • efficient get_partial_values implementation for FSStoreV3

    • sharding storage transformer

    By Jonathan Striebel; #1096, #1111.

  • N5 now supports Blosc. Remove warnings emitted when using N5Store or N5FSStore with a blosc-compressed array. By Davis Bennett; #1331.

Bug fixes#

2.13.6#

Maintenance#

2.13.5#

Bug fixes#

2.13.4#

Appreciation#

Special thanks to Outreachy participants for contributing to most of the maintenance PRs. Please read the blog post summarising the contribution phase and welcoming new Outreachy interns: https://zarr.dev/blog/welcoming-outreachy-2022-interns/

Enhancements#

Bug fixes#

  • Fix bug that caused double counting of groups in groups() and group_keys() methods with V3 stores. By Ryan Abernathey #1228.

  • Remove unnecessary calling of contains_array for key that ended in .array.json. By Joe Hamman #1149.

Documentation#

Maintenance#

2.13.3#

  • Improve performance of slice selections with steps by omitting chunks with no relevant data. By Richard Shaw #843.
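
The idea behind skipping chunks for stepped slices can be illustrated in one dimension. The following is a minimal sketch, not Zarr's implementation: `chunks_touched` is a hypothetical helper that assumes a positive step and already-normalised bounds.

```python
def chunks_touched(start, stop, step, chunk_len):
    """Return the sorted indices of chunks holding at least one selected item.

    Hypothetical helper for a 1-D array split into chunks of ``chunk_len``.
    Assumes a positive ``step`` and non-negative, already-normalised bounds.
    """
    return sorted({i // chunk_len for i in range(start, stop, step)})


# A 100-element array in chunks of 10, sliced as [0:100:25], selects
# items 0, 25, 50 and 75, so only four of the ten chunks need loading.
needed = chunks_touched(0, 100, 25, 10)
print(needed)  # [0, 2, 5, 7]
```

Chunks 1, 3, 4, 6, 8 and 9 hold no selected item in this example, so a reader can skip fetching and decompressing them.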

2.13.2#

2.13.1#

2.13.0#

Major changes#

  • Support for alternative array classes by introducing a new argument, meta_array, that specifies the type/class of the underlying array. The meta_array argument can be any class instance that can be used as the like argument in NumPy (see NEP 35), enabling support for CuPy through, for example, the creation of a CuPy CPU compressor. By Mads R. B. Kristensen #934.

  • Remove support for Python 3.7 in concert with NumPy dependency. By Davis Bennett #1067.

  • Zarr v3: add support for the default root path rather than requiring that all API users pass an explicit path. By Gregory R. Lee #1085, #1142.

Bug fixes#

  • Remove/relax erroneous “meta” path check (regression). By Gregory R. Lee #1123.

  • Cast all attribute keys to strings (and issue deprecation warning). By Mattia Almansi #1066.

  • Fix bug in N5 storage that prevented arrays located in the root of the hierarchy from bearing the n5 keyword. Along with fixing this bug, new tests were added for N5 routines that had previously been excluded from testing, and type annotations were added to the N5 codebase. By Davis Bennett #1092.

  • Fix bug in LRUStoreCache in which the current size wasn’t reset on invalidation. By BGCMHou and Josh Moore #1076, #1077.

  • Remove erroneous check that disallowed array keys starting with “meta”. By Gregory R. Lee #1105.

Documentation#

Maintenance#

2.12.0#

Enhancements#

  • Add support for reading and writing Zarr V3. The new zarr._store.v3 package has the necessary classes and functions for evaluating Zarr V3. Since the format is not yet finalized, the classes and functions are not automatically imported into the regular zarr name space. Setting the ZARR_V3_EXPERIMENTAL_API environment variable will activate them. By Gregory Lee; #898, #1006, and #1007 as well as by Josh Moore #1032.

  • Create FSStore from an existing fsspec filesystem. If you have created an fsspec filesystem outside of Zarr, you can now pass it as a keyword argument to FSStore. By Ryan Abernathey; #911.

  • Add numpy encoder class for json.dumps. By Eric Prestat; #933.

  • Performance improvement when appending to Zarr arrays, e.g., when writing to S3. By hailiangzhang; #1014.

  • Add number encoder for json.dumps to support numpy integers in chunks arguments. By Eric Prestat #697.

Bug fixes#

  • Fix bug that made it impossible to create an FSStore on unlistable filesystems (e.g. some HTTP servers). By Ryan Abernathey; #993.

Documentation#

Maintenance#

2.11.3#

Bug fixes#

  • Fix missing case to fully revert change to default write_empty_chunks. By Tom White; #1005.

2.11.2#

Bug fixes#

  • Changes the default value of write_empty_chunks to True to prevent unanticipated data losses when the data types do not have a proper default value when empty chunks are read back in. By Vyas Ramasubramani; #965, #1001.

2.11.1#

Bug fixes#

  • Fix bug where indexing with a scalar numpy value returned a single-value array. By Ben Jeffery #967.

  • Removed clobber argument from normalize_store_arg. This makes it possible to change data within an opened consolidated group using mode “r+” (i.e. region write). By Tobias Kölling #975.

2.11.0#

Enhancements#

  • Sparse changes with performance impact! One of the advantages of the Zarr format is that it is sparse, which means that chunks with no data (more precisely, with data equal to the fill value, which is usually 0) don’t need to be written to disk at all. They will simply be assumed to be empty at read time. However, until this release, the Zarr library would write these empty chunks to disk anyway. This changes in this version: a small performance penalty at write time leads to significant speedups at read time and in filesystem operations in the case of sparse arrays. To revert to the old behavior, pass the argument write_empty_chunks=True to the array creation function. By Juan Nunez-Iglesias; #853 and Davis Bennett; #738.

  • Fancy indexing. Zarr arrays now support NumPy-style fancy indexing with arrays of integer coordinates. This is equivalent to using zarr.Array.vindex. Mixing slices and integer arrays is not supported. By Juan Nunez-Iglesias; #725.

  • New base class. This release of Zarr Python introduces a new BaseStore class that all provided store classes implemented in Zarr Python now inherit from. This is done as part of refactoring to enable future support of the Zarr version 3 spec. Existing third-party stores that are a MutableMapping (e.g. dict) can be converted to a new-style key/value store inheriting from BaseStore by passing them as the argument to the new zarr.storage.KVStore class. For backwards compatibility, various higher-level array creation and convenience functions still accept plain Python dicts or other mutable mappings for the store argument, but will internally convert these to a KVStore. By Gregory Lee; #839, #789, and #950.

  • Allow assigning array fill_values and update metadata accordingly. By Ryan Abernathey, #662.

  • Allow updating array fill_values. By Matthias Bussonnier #665.
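
The sparse-write behaviour described in the first item above can be sketched with a plain dict standing in for a store. This is an illustration only: `write_chunk` and `read_chunk` are hypothetical names, chunks are modelled as lists of numbers, and only the `write_empty_chunks` flag name comes from the notes.

```python
def write_chunk(store, key, chunk, fill_value=0, write_empty_chunks=False):
    """Store ``chunk`` under ``key`` unless it holds only the fill value.

    ``store`` is any dict-like object; ``chunk`` is a list of numbers.
    All names here are illustrative, not Zarr's internal API.
    """
    is_empty = all(value == fill_value for value in chunk)
    if is_empty and not write_empty_chunks:
        # Drop any stale copy so a later read falls back to the fill value.
        store.pop(key, None)
        return False
    store[key] = list(chunk)
    return True


def read_chunk(store, key, chunk_len, fill_value=0):
    """Return the stored chunk, or a chunk of fill values if it is missing."""
    return store.get(key, [fill_value] * chunk_len)


store = {}
write_chunk(store, "0", [0, 0, 0, 0])   # skipped: only fill values
write_chunk(store, "1", [0, 7, 0, 0])   # written: holds real data
print(sorted(store))                    # ['1']
print(read_chunk(store, "0", 4))        # [0, 0, 0, 0]
```

The emptiness check is the small write-time cost mentioned above; the benefit is fewer keys to list, fetch and decode when the array is sparse.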

Bug fixes#

Documentation#

Maintenance#

  • Upgrade MongoDB in test env. By Joe Hamman #939.

  • Pass dimension_separator on fixture generation. By Josh Moore #858.

  • Activate Python 3.9 in GitHub Actions. By Josh Moore #859.

  • Drop shortcut fsspec[s3] for dependency. By Josh Moore #920.

  • A swath of code-linting improvements by Dimitri Papadopoulos Orfanos:

    • Unnecessary comprehension (#899)

    • Unnecessary None provided as default (#900)

    • use an if expression instead of and/or (#888)

    • Remove unnecessary literal (#891)

    • Decorate a few methods with @staticmethod (#885)

    • Drop unneeded return (#884)

    • Drop explicit object inheritance from classes (#886)

    • Unnecessary comprehension (#883)

    • Codespell configuration (#882)

    • Fix typos found by codespell (#880)

    • Proper C-style formatting for integer (#913)

    • Add LGTM.com / DeepSource.io configuration files (#909)

2.10.3#

Bug fixes#

  • N5 keywords now emit UserWarning instead of raising a ValueError. By Boaz Mohar; #860.

  • blocks_to_decompress not used in read_part function. By Boaz Mohar; #861.

  • Define blocksize for array and update hexdigest values. By Andrew Fulton; #867.

  • Fix test failure on Debian and conda-forge builds. By Josh Moore; #871.

2.10.2#

Bug fixes#

  • Fix NestedDirectoryStore datasets without dimension_separator metadata. By Josh Moore; #850.

2.10.1#

Bug fixes#

  • Fix regression by setting normalize_keys=False in fsstore constructor. By Davis Bennett; #842.

2.10.0#

Enhancements#

Bug fixes#

2.9.5#

Bug fixes#

  • Fix FSStore.listdir behavior for nested directories. By Gregory Lee; #802.

2.9.4#

Bug fixes#

  • Fix structured arrays that contain objects. By Attila Bergou; #806.

2.9.3#

Maintenance#

  • Mark the fact that some tests require fsspec, without compromising the code coverage score. By Ben Williams; #823.

  • Only inspect alternate node type if desired isn’t present. By Trevor Manz; #696.

2.9.2#

Maintenance#

  • Correct conda-forge deployment of Zarr by fixing some Zarr tests. By Ben Williams; #821.

2.9.1#

Maintenance#

2.9.0#

This release of Zarr Python is the first release of Zarr to not support Python 3.6.

Enhancements#

Documentation#

  • Clarify that arbitrary key/value pairs are OK for attributes. By Stephan Hoyer; #751.

  • Clarify how to manually convert a DirectoryStore to a ZipStore. By pmav99; #763.

Bug fixes#

Maintenance#

2.8.3#

Bug fixes#

2.8.2#

Documentation#

Bug fixes#

Maintenance#

  • Updated ipytree warning for jlab3. By Ian Hunt-Isaak; #721.

  • Activate dependabot. By Josh Moore; #734.

  • Update Python classifiers (Zarr is stable!). By Josh Moore; #731.

2.8.1#

Bug fixes#

  • Raise an error if create_dataset’s dimension_separator is inconsistent. By Gregory R. Lee; #724.

2.8.0#

V2 Specification Update#

  • Introduce optional dimension_separator .zarray key for nested chunks. By Josh Moore; #715, #716.
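
As a rough illustration of what the separator changes, the sketch below builds a chunk key from chunk grid coordinates. `chunk_key` is a hypothetical helper; only the dimension_separator name and its role in nesting chunks come from the note above.

```python
def chunk_key(chunk_coords, dimension_separator="."):
    """Join chunk grid coordinates into a storage key."""
    return dimension_separator.join(str(c) for c in chunk_coords)


print(chunk_key((2, 0, 5)))        # 2.0.5
print(chunk_key((2, 0, 5), "/"))   # 2/0/5
```

With "/" as the separator, each coordinate maps naturally onto a directory level in a file-system store, which is what makes the chunks nested.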

2.7.1#

Bug fixes#

2.7.0#

Enhancements#

Bug fixes#

2.6.1#

2.6.0#

This release of Zarr Python is the first release of Zarr to not support Python 3.5.

  • End Python 3.5 support. By Chris Barnes; #602.

  • Fix open_group/open_array to allow opening of a read-only store with mode='r'; #269.

  • Add Array tests for FSStore. By Andrew Fulton; #644.

  • Fix a bug in which attrs would not be copied on the root when using copy_all; #613.

  • Fix FileNotFoundError with dask/s3fs #649

  • Fix flaky fixture in test_storage.py #652

  • Fix FSStore getitems failing with arrays that have a zero-length shape dimension; #644.

  • Use async to fetch/write results concurrently when possible; #536. See this comment for some performance analysis showing an order-of-magnitude faster response in some benchmarks.

See this link for the full list of closed and merged PRs tagged with the 2.6 milestone.

  • Add ability to partially read and decompress arrays, see #667. It is only available to chunks stored using fsspec and using Blosc as a compressor.

    For certain analysis cases where only a small portion of each chunk is needed, it can be advantageous to access and decompress only part of the chunks. Partial read and decompression adds high latency to many operations, so it should be used only when the subset of the data is small compared to the full chunks and is stored contiguously (that is, along the last dimensions for C layout or the first dimensions for F layout). Pass partial_decompress=True as an argument when creating an Array or when using open_array. No option exists yet to apply partial read and decompression on a per-operation basis.

2.5.0#

This release will be the last to support Python 3.5; the next version of Zarr will require Python 3.6+.

  • DirectoryStore now uses os.scandir, which should make listing large stores faster; #563.

  • Remove a few remaining Python 2-isms. By Poruri Sai Rahul; #393.

  • Fix minor bug in N5Store. By @gsakkis, #550.

  • Improve error message in Jupyter when trying to use the ipytree widget without ipytree installed. By Zain Patel; #537

  • Add typing information to many of the core functions #589

  • Explicitly close stores during testing. By Elliott Sales de Andrade; #442

  • Many of the convenience functions for emitting errors (err_* from zarr.errors) have been replaced by ValueError subclasses. The corresponding err_* functions have been removed; #590, #614.

  • Improve consistency of terminology regarding arrays and datasets in the documentation. By Josh Moore; #571.

  • Added support for generic URL opening by fsspec, where the URLs have the form “protocol://[server]/path” or can be chained URLs with “::” separators. The additional argument storage_options is passed to the backend, see the fsspec docs. By Martin Durant; #546.

  • Added support for fetching multiple items via getitems method of a store, if it exists. This allows for concurrent fetching of data blocks from stores that implement this; presently HTTP, S3, GCS. Currently only applies to reading. By Martin Durant; #606

  • Efficient iteration expanded with option to pass start and stop index via array.islice. By Sebastian Grill, #615.
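
The "use getitems if the store has it" pattern from the list above can be sketched as follows. This is an assumption-laden illustration: `fetch_many` and `ThreadedStore` are hypothetical, the `getitems(keys)` signature is simplified, and a thread pool stands in for whatever concurrency a real remote store would use.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_many(store, keys):
    """Fetch several keys, using the store's bulk method when it has one."""
    getitems = getattr(store, "getitems", None)
    if getitems is not None:
        return getitems(keys)
    # Fallback: one lookup per key, skipping keys that are absent.
    return {key: store[key] for key in keys if key in store}


class ThreadedStore(dict):
    """Hypothetical dict-backed store that resolves lookups in a thread pool."""

    def getitems(self, keys):
        keys = list(keys)
        with ThreadPoolExecutor(max_workers=4) as pool:
            values = list(pool.map(self.get, keys))
        # Missing keys come back as None and are left out of the result.
        return {k: v for k, v in zip(keys, values) if v is not None}


plain = {"0.0": b"a", "0.1": b"b"}
threaded = ThreadedStore(plain)
print(fetch_many(plain, ["0.0", "0.1", "9.9"]))
print(fetch_many(threaded, ["0.0", "0.1", "9.9"]))
```

Both calls return the same mapping; the difference is only that the second store gets a chance to issue its lookups concurrently.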

2.4.0#

Enhancements#

  • Add key normalization option for DirectoryStore, NestedDirectoryStore, TempStore, and N5Store. By James Bourbeau; #459.

  • Add recurse keyword to Group.array_keys and Group.arrays methods. By James Bourbeau; #458.

  • Use uniform chunking for all dimensions when specifying chunks as an integer. Also adds support for specifying -1 to chunk across an entire dimension. By James Bourbeau; #456.

  • Rename DictStore to MemoryStore. By James Bourbeau; #455.

  • Rewrite .tree() pretty representation to use ipytree. Allows it to work in both the Jupyter Notebook and JupyterLab. By John Kirkham; #450.

  • Do not rename Blosc parameters in n5 backend and add blocksize parameter, compatible with n5-blosc. By @axtimwalde, #485.

  • Update DirectoryStore to create files with more permissive permissions. By Eduardo Gonzalez and James Bourbeau; #493

  • Use math.ceil for scalars. By John Kirkham; #500.

  • Ensure contiguous data using astype. By John Kirkham; #513.

  • Refactor out _tofile/_fromfile from DirectoryStore. By John Kirkham; #503.

  • Add __enter__/__exit__ methods to Group for h5py.File compatibility. By Chris Barnes; #509.
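
The integer and -1 chunking rules from the list above can be expressed in a few lines. This is a minimal sketch under stated assumptions: `expand_chunks` is a hypothetical helper, not the function Zarr uses internally.

```python
def expand_chunks(chunks, shape):
    """Expand a chunks argument into one chunk length per dimension."""
    if isinstance(chunks, int):
        # A single integer applies uniformly to every dimension.
        chunks = (chunks,) * len(shape)
    if len(chunks) != len(shape):
        raise ValueError("chunks must have one entry per dimension")
    # -1 means "a single chunk spanning the whole dimension".
    return tuple(dim if c == -1 else c for c, dim in zip(chunks, shape))


print(expand_chunks(100, (1000, 500)))        # (100, 100)
print(expand_chunks((100, -1), (1000, 500)))  # (100, 500)
```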

Bug fixes#

  • Fix Sqlite Store Wrong Modification. By Tommy Tran; #440.

  • Add intermediate step (using zipfile.ZipInfo object) to write inside ZipStore to solve too restrictive permission issue. By Raphael Dussin; #505.

  • Fix ‘/’ prepend bug in ABSStore. By Shikhar Goenka; #525.

Documentation#

Maintenance#

2.3.2#

Enhancements#

Bug fixes#

2.3.1#

Bug fixes#

2.3.0#

Enhancements#

  • New storage backend, backed by Azure Blob Storage, class zarr.storage.ABSStore. All data is stored as block blobs. By Shikhar Goenka, Tim Crone and Zain Patel; #345.

  • Add “consolidated” metadata as an experimental feature: use zarr.convenience.consolidate_metadata() to copy all metadata from the various metadata keys within a dataset hierarchy under a single key, and zarr.convenience.open_consolidated() to use this single key. This can greatly cut down the number of calls to the storage backend, and so remove a lot of overhead for reading remote data. By Martin Durant, Alistair Miles, Ryan Abernathey, #268, #332, #338.

  • Support has been added for structured arrays with sub-array shape and/or nested fields. By Tarik Onalan, #111, #296.

  • Adds the SQLite-backed zarr.storage.SQLiteStore class enabling an SQLite database to be used as the backing store for an array or group. By John Kirkham, #368, #365.

  • Efficient iteration over arrays by decompressing chunkwise. By Jerome Kelleher, #398, #399.

  • Adds the Redis-backed zarr.storage.RedisStore class enabling a Redis database to be used as the backing store for an array or group. By Joe Hamman, #299, #372.

  • Adds the MongoDB-backed zarr.storage.MongoDBStore class enabling a MongoDB database to be used as the backing store for an array or group. By Noah D Brenowitz, Joe Hamman, #299, #372, #401.

  • New storage class for N5 containers. The zarr.n5.N5Store has been added, which uses zarr.storage.NestedDirectoryStore to support reading and writing from and to N5 containers. By Jan Funke and John Kirkham.
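
To make the consolidated-metadata idea above concrete, the sketch below gathers every metadata document in a dict-backed store under one key. It is not the zarr.convenience implementation: `consolidate`, the ".zmetadata" key name, and the {"metadata": ...} layout are assumptions made for illustration.

```python
import json

# Suffixes treated as metadata documents in this sketch.
METADATA_SUFFIXES = (".zarray", ".zgroup", ".zattrs")


def consolidate(store, out_key=".zmetadata"):
    """Copy every metadata document in ``store`` under ``out_key``."""
    merged = {
        key: json.loads(value)
        for key, value in store.items()
        if key.endswith(METADATA_SUFFIXES)
    }
    store[out_key] = json.dumps({"metadata": merged}).encode()
    return merged


store = {
    ".zgroup": b'{"zarr_format": 2}',
    "temps/.zarray": b'{"shape": [4], "chunks": [2]}',
    "temps/.zattrs": b'{"units": "K"}',
    "temps/0": b"\x00\x01",
}
consolidate(store)

# A reader now needs one lookup instead of one per metadata key.
meta = json.loads(store[".zmetadata"])["metadata"]
print(sorted(meta))                    # ['.zgroup', 'temps/.zarray', 'temps/.zattrs']
print(meta["temps/.zattrs"]["units"])  # K
```

On a remote backend each avoided lookup is a network round trip, which is where the reduction in overhead comes from.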

Bug fixes#

  • The implementation of the zarr.storage.DirectoryStore class has been modified to ensure that writes are atomic and there are no race conditions where a chunk might appear transiently missing during a write operation. By sbalmer, #327, #263.

  • Avoid raising in zarr.storage.DirectoryStore’s __setitem__ when file already exists. By Justin Swaney, #272, #318.

  • The required version of the Numcodecs package has been upgraded to 0.6.2, which has enabled some code simplification and fixes a failing test involving msgpack encoding. By John Kirkham, #361, #360, #352, #355, #324.

  • Failing tests related to pickling/unpickling have been fixed. By Ryan Williams, #273, #308.

  • Corrects handling of NaT in datetime64 and timedelta64 in various compressors (by John Kirkham; #344).

  • Ensure DictStore contains only bytes to facilitate comparisons and protect against writes. By John Kirkham, #350.

  • Test and fix an issue (w.r.t. fill values) when storing complex data to Array. By John Kirkham, #363.

  • Always use a tuple when indexing a NumPy ndarray. By John Kirkham, #376.

  • Ensure when Array uses a dict-based chunk store that it only contains bytes to facilitate comparisons and protect against writes. Drop the copy for the no filter/compressor case as this handles that case. By John Kirkham, #359.
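
One common way to get the atomic-write property described for DirectoryStore in the first item above is to write to a temporary file and then rename it into place. The sketch below shows that general technique with the standard library; `atomic_write` is a hypothetical helper and is not claimed to match Zarr's own code.

```python
import os
import tempfile


def atomic_write(path, data):
    """Write ``data`` to ``path`` so readers never see a partial file."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".partial")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
        # os.replace swaps the file in one step and overwrites any
        # existing file, so no "already exists" error is raised.
        os.replace(tmp_path, path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


with tempfile.TemporaryDirectory() as root:
    chunk_path = os.path.join(root, "0.0")
    atomic_write(chunk_path, b"first")
    atomic_write(chunk_path, b"second")   # overwrites without raising
    with open(chunk_path, "rb") as f:
        print(f.read())                   # b'second'
    print(os.listdir(root))               # ['0.0']
```

Creating the temporary file in the target directory keeps both paths on the same file system, which the rename step relies on.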

Maintenance#

2.2.0#

Enhancements#

  • Advanced indexing. The Array class has several new methods and properties that enable a selection of items in an array to be retrieved or updated. See the Advanced indexing tutorial section for more information. There is also a notebook with extended examples and performance benchmarks. #78, #89, #112, #172.

  • New package for compressor and filter codecs. The classes previously defined in the zarr.codecs module have been factored out into a separate package called Numcodecs. The Numcodecs package also includes several new codec classes not previously available in Zarr, including compressor codecs for Zstd and LZ4. This change is backwards-compatible with existing code, as all codec classes defined by Numcodecs are imported into the zarr.codecs namespace. However, it is recommended to import codecs from the new package, see the tutorial sections on Compressors and Filters for examples. With contributions by John Kirkham; #74, #102, #120, #123, #139.

  • New storage class for DBM-style databases. The zarr.storage.DBMStore class enables any DBM-style database such as gdbm, ndbm or Berkeley DB, to be used as the backing store for an array or group. See the tutorial section on Storage alternatives for some examples. #133, #186.

  • New storage class for LMDB databases. The zarr.storage.LMDBStore class enables an LMDB “Lightning” database to be used as the backing store for an array or group. #192.

  • New storage class using a nested directory structure for chunk files. The zarr.storage.NestedDirectoryStore has been added, which is similar to the existing zarr.storage.DirectoryStore class but nests chunk files for multidimensional arrays into sub-directories. #155, #177.

  • New tree() method for printing hierarchies. The Group class has a new zarr.hierarchy.Group.tree() method which enables a tree representation of a group hierarchy to be printed. Also provides an interactive tree representation when used within a Jupyter notebook. See the Array and group diagnostics tutorial section for examples. By John Kirkham; #82, #140, #184.

  • Visitor API. The Group class now implements the h5py visitor API, see docs for the zarr.hierarchy.Group.visit(), zarr.hierarchy.Group.visititems() and zarr.hierarchy.Group.visitvalues() methods. By John Kirkham, #92, #122.

  • Viewing an array as a different dtype. The Array class has a new zarr.Array.astype() method, which is a convenience that enables an array to be viewed as a different dtype. By John Kirkham, #94, #96.

  • New open(), save(), load() convenience functions. The function zarr.convenience.open() provides a convenient way to open a persistent array or group, using either a DirectoryStore or ZipStore as the backing store. The functions zarr.convenience.save() and zarr.convenience.load() are also available and provide a convenient way to save an entire NumPy array to disk and load back into memory later. See the tutorial section Persistent arrays for examples. #104, #105, #141, #181.

  • IPython completions. The Group class now implements __dir__() and _ipython_key_completions_() which enables tab-completion for group members to be used in any IPython interactive environment. #170.

  • New info property; changes to __repr__. The Group and Array classes have a new info property which can be used to print diagnostic information, including compression ratio where available. See the tutorial section on Array and group diagnostics for examples. The string representation (__repr__) of these classes has been simplified to ensure it is cheap and quick to compute in all circumstances. #83, #115, #132, #148.

  • Chunk options. When creating an array, chunks=False can be specified, which will result in an array with a single chunk only. Alternatively, chunks=True will trigger an automatic chunk shape guess. See Chunk optimizations for more on the chunks parameter. #106, #107, #183.

  • Zero-dimensional arrays are now supported; by Prakhar Goel, #154, #161.

  • Arrays with one or more zero-length dimensions are now fully supported; by Prakhar Goel, #150, #154, #160.

  • The .zattrs key is now optional and will now only be created when the first custom attribute is set; #121, #200.

  • New Group.move() method supports moving a sub-group or array to a different location within the same hierarchy. By John Kirkham, #191, #193, #196.

  • ZipStore is now thread-safe; #194, #192.

  • New Array.hexdigest() method computes an Array’s hash with hashlib. By John Kirkham, #98, #203.

  • Improved support for object arrays. In previous versions of Zarr, creating an array with dtype=object was possible but could under certain circumstances lead to unexpected errors and/or segmentation faults. To make it easier to properly configure an object array, a new object_codec parameter has been added to array creation functions. See the tutorial section on Object arrays for more information and examples. Also, runtime checks have been added in both Zarr and Numcodecs so that segmentation faults are no longer possible, even with a badly configured array. This API change is backwards compatible and previous code that created an object array and provided an object codec via the filters parameter will continue to work, however a warning will be raised to encourage use of the object_codec parameter. #208, #212.

  • Added support for datetime64 and timedelta64 data types; #85, #215.

  • Array and group attributes are now cached by default to improve performance with slow stores, e.g., stores accessing data via the network; #220, #218, #204.

  • New LRUStoreCache class. The class zarr.storage.LRUStoreCache has been added and provides a means to locally cache data in memory from a store that may be slow, e.g., a store that retrieves data from a remote server via the network; #223.

  • New copy functions. The new functions zarr.convenience.copy() and zarr.convenience.copy_all() provide a way to copy groups and/or arrays between HDF5 and Zarr, or between two Zarr groups. The zarr.convenience.copy_store() provides a more efficient way to copy data directly between two Zarr stores. #87, #113, #137, #217.
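
The caching idea behind LRUStoreCache in the list above can be sketched as a read-through cache bounded by the number of bytes it holds. `LRUReadCache` is a hypothetical class written for illustration; it does not reproduce the zarr.storage.LRUStoreCache interface.

```python
from collections import OrderedDict


class LRUReadCache:
    """Hypothetical read-through cache bounded by total bytes held."""

    def __init__(self, store, max_size):
        self._store = store
        self._max_size = max_size
        self._cache = OrderedDict()
        self._size = 0
        self.hits = 0
        self.misses = 0

    def __getitem__(self, key):
        if key in self._cache:
            self.hits += 1
            self._cache.move_to_end(key)  # mark as most recently used
            return self._cache[key]
        self.misses += 1
        value = self._store[key]          # the slow lookup
        self._remember(key, value)
        return value

    def _remember(self, key, value):
        if len(value) > self._max_size:
            return                        # too large to cache at all
        self._cache[key] = value
        self._size += len(value)
        while self._size > self._max_size:
            _, evicted = self._cache.popitem(last=False)
            self._size -= len(evicted)

    def invalidate(self):
        """Forget everything, resetting the tracked size as well."""
        self._cache.clear()
        self._size = 0


remote = {"a": b"1234", "b": b"5678", "c": b"90"}
cache = LRUReadCache(remote, max_size=8)
for key in ("a", "b", "a", "c"):      # the third read is a hit
    cache[key]                        # reading "c" evicts "b"
print(cache.hits, cache.misses)       # 1 3
print(list(cache._cache))             # ['a', 'c']
```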

Bug fixes#

  • Fixed bug where read_only keyword argument was ignored when creating an array; #151, #179.

  • Fixed bugs when using a ZipStore opened in ‘w’ mode; #158, #182.

  • Fill values can now be provided for fixed-length string arrays; #165, #176.

  • Fixed a bug where the number of chunks initialized could be counted incorrectly; #97, #174.

  • Fixed a bug related to the use of an ellipsis (…) in indexing statements; #93, #168, #172.

  • Fixed a bug preventing use of other integer types for indexing; #143, #147.

Documentation#

Maintenance#

  • A data fixture has been included in the test suite to ensure data format compatibility is maintained; #83, #146.

  • The test suite has been migrated from nosetests to pytest; #189, #225.

  • Various continuous integration updates and improvements; #118, #124, #125, #126, #109, #114, #171.

  • Bump numcodecs dependency to 0.5.3, completely remove nose dependency, #237.

  • Fix compatibility issues with NumPy 1.14 regarding fill values for structured arrays, #222, #238, #239.

Acknowledgments#

Code was contributed to this release by Alistair Miles, John Kirkham and Prakhar Goel.

Documentation was contributed to this release by Mamy Ratsimbazafy and Charles Noyes.

Thank you to John Kirkham, Stephan Hoyer, Francesc Alted, and Matthew Rocklin for code reviews and/or comments on pull requests.

2.1.4#

  • Resolved an issue where calling hasattr on a Group object erroneously raised a KeyError. By Vincent Schut; #88, #95.

2.1.3#

2.1.2#

  • Resolved an issue when no compression is used and chunks are stored in memory (#79).

2.1.1#

Various minor improvements, including: Group objects support member access via dot notation (__getattr__); fixed metadata caching for Array.shape property and derivatives; added Array.ndim property; fixed Array.__array__ method arguments; fixed bug in pickling Array state; fixed bug in pickling ThreadSynchronizer.

2.1.0#

  • Group objects now support member deletion via del statement (#65).

  • Added zarr.storage.TempStore class for convenience to provide storage via a temporary directory (#59).

  • Fixed performance issues with zarr.storage.ZipStore class (#66).

  • The Blosc extension has been modified to return bytes instead of array objects from compress and decompress function calls. This should improve compatibility and also provides a small performance increase for compressing high compression ratio data (#55).

  • Added overwrite keyword argument to array and group creation methods on the zarr.hierarchy.Group class (#71).

  • Added cache_metadata keyword argument to array creation methods.

  • The functions zarr.creation.open_array() and zarr.hierarchy.open_group() now accept any store as first argument (#56).

2.0.1#

The bundled Blosc library has been upgraded to version 1.11.1.

2.0.0#

Hierarchies#

Support has been added for organizing arrays into hierarchies via groups. See the tutorial section on Groups and the zarr.hierarchy API docs for more information.

Filters#

Support has been added for configuring filters to preprocess chunk data prior to compression. See the tutorial section on Filters and the zarr.codecs API docs for more information.

Other changes#

To accommodate support for hierarchies and filters, the Zarr metadata format has been modified. See the Zarr Storage Specification Version 2 for more information. To migrate an array stored using Zarr version 1.x, use the zarr.storage.migrate_1to2() function.

The bundled Blosc library has been upgraded to version 1.11.0.

Acknowledgments#

Thanks to Matthew Rocklin, Stephan Hoyer and Francesc Alted for contributions and comments.

1.1.0#

  • The bundled Blosc library has been upgraded to version 1.10.0. The ‘zstd’ internal compression library is now available within Blosc. See the tutorial section on Compressors for an example.

  • When using the Blosc compressor, the default internal compression library is now ‘lz4’.

  • The default number of internal threads for the Blosc compressor has been increased to a maximum of 8 (previously 4).

  • Added convenience functions zarr.blosc.list_compressors() and zarr.blosc.get_nthreads().

1.0.0#

This release includes a complete re-organization of the code base. The major version number has been bumped to indicate that there have been backwards-incompatible changes to the API and the on-disk storage format. However, Zarr is still in an early stage of development, so please do not take the version number as an indicator of maturity.

Storage#

The main motivation for re-organizing the code was to create an abstraction layer between the core array logic and data storage (#21). In this release, any object that implements the MutableMapping interface can be used as an array store. See the tutorial sections on Persistent arrays and Storage alternatives, the Zarr Storage Specification Version 1, and the zarr.storage module documentation for more information.

Please note also that the file organization and file name conventions used when storing a Zarr array in a directory on the file system have changed. Persistent Zarr arrays created using previous versions of the software will not be compatible with this version. See the zarr.storage API docs and the Zarr Storage Specification Version 1 for more information.
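
As an illustration of the MutableMapping store interface described above, the sketch below defines a small in-memory store that records its writes. `LoggingStore` is a hypothetical class; it only demonstrates the five methods the MutableMapping abstract base class requires and is not passed to any Zarr function here.

```python
from collections.abc import MutableMapping


class LoggingStore(MutableMapping):
    """Hypothetical key/value store that records every write."""

    def __init__(self):
        self._data = {}
        self.writes = []

    def __getitem__(self, key):
        return self._data[key]

    def __setitem__(self, key, value):
        self.writes.append(key)
        self._data[key] = bytes(value)  # keep an immutable copy

    def __delitem__(self, key):
        del self._data[key]

    def __iter__(self):
        return iter(self._data)

    def __len__(self):
        return len(self._data)


store = LoggingStore()
store["meta"] = b"{}"
store["0.0"] = bytearray(b"\x01\x02")
print(sorted(store), store.writes)  # ['0.0', 'meta'] ['meta', '0.0']
```

Methods such as `get`, `pop`, `update` and `in` checks come for free from the base class once these five are defined.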

Compression#

An abstraction layer has also been created between the core array logic and the code for compressing and decompressing array chunks. This release still bundles the c-blosc library and uses Blosc as the default compressor, however other compressors including zlib, BZ2 and LZMA are also now supported via the Python standard library. New compressors can also be dynamically registered for use with Zarr. See the tutorial sections on Compressors and Configuring Blosc, the Zarr Storage Specification Version 1, and the zarr.compressors module documentation for more information.
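
A registry of named compressors, in the spirit of the paragraph above, might look like the following. This is a sketch built on the standard-library zlib, bz2 and lzma modules; `COMPRESSORS`, `register_compressor` and `roundtrip` are hypothetical names and do not reflect the zarr.compressors API.

```python
import bz2
import lzma
import zlib

# Hypothetical registry mapping a name to (compress, decompress) callables.
COMPRESSORS = {
    "zlib": (zlib.compress, zlib.decompress),
    "bz2": (bz2.compress, bz2.decompress),
    "lzma": (lzma.compress, lzma.decompress),
}


def register_compressor(name, compress, decompress):
    """Make a new compressor available under ``name`` at run time."""
    COMPRESSORS[name] = (compress, decompress)


def roundtrip(name, data):
    """Compress then decompress ``data``; return (packed size, restored data)."""
    compress, decompress = COMPRESSORS[name]
    packed = compress(data)
    return len(packed), decompress(packed)


# Register a pass-through "compressor" dynamically.
register_compressor("none", bytes, bytes)

chunk = bytes(1000)  # 1000 zero bytes, highly compressible
for name in COMPRESSORS:
    size, restored = roundtrip(name, chunk)
    assert restored == chunk
    print(name, size)
```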

Synchronization#

The synchronization code has also been refactored to create a layer of abstraction, enabling Zarr arrays to be used in parallel computations with a number of alternative synchronization methods. For more information see the tutorial section on Parallel computing and synchronization and the zarr.sync module documentation.

Changes to the Blosc extension#

NumPy is no longer a build dependency for the zarr.blosc Cython extension, so setup.py will run even if NumPy is not already installed, and should automatically install NumPy as a runtime dependency. Manual installation of NumPy prior to installing Zarr is still recommended, however, as the automatic installation of NumPy may fail or be sub-optimal on some platforms.

Some optimizations have been made within the zarr.blosc extension to avoid unnecessary memory copies, giving a ~10-20% performance improvement for multi-threaded compression operations.

The zarr.blosc extension now automatically detects whether it is running within a single-threaded or multi-threaded program and adapts its internal behaviour accordingly (#27). There is no need for the user to make any API calls to switch Blosc between contextual and non-contextual (global lock) mode. See also the tutorial section on Configuring Blosc.

Other changes#

The internal code for managing chunks has been rewritten to be more efficient. Now no state is maintained for chunks outside of the array store, meaning that chunks do not carry any extra memory overhead not accounted for by the store. This negates the need for the “lazy” option present in the previous release, and this has been removed.

The memory layout within chunks can now be set as either “C” (row-major) or “F” (column-major), which can help to provide better compression for some data (#7). See the tutorial section on Chunk memory layout for more information.
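
The difference between the two layouts can be seen by flattening a small two-dimensional chunk both ways. The sketch below uses plain lists; `flatten` is a hypothetical helper written for illustration.

```python
def flatten(rows, order="C"):
    """Flatten a 2-D list of lists in row-major or column-major order."""
    if order == "C":
        return [value for row in rows for value in row]
    if order == "F":
        return [value for column in zip(*rows) for value in column]
    raise ValueError("order must be 'C' or 'F'")


chunk = [[1, 2, 3],
         [1, 2, 3]]
print(flatten(chunk, "C"))  # [1, 2, 3, 1, 2, 3]
print(flatten(chunk, "F"))  # [1, 1, 2, 2, 3, 3]
```

In this example each column is constant, so the "F" ordering places equal values next to each other, which is the kind of pattern a compressor can exploit.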

A bug has been fixed within the __getitem__ and __setitem__ machinery for slicing arrays, to properly handle getting and setting partial slices.

Acknowledgments#

Thanks to Matthew Rocklin, Stephan Hoyer, Francesc Alted, Anthony Scopatz and Martin Durant for contributions and comments.

0.4.0#

See v0.4.0 release notes on GitHub.

0.3.0#

See v0.3.0 release notes on GitHub.