@jrbourbeau
Contributor

This PR updates how we construct the HighLevelGraph in subset_dataset_to_block. Currently we create a HighLevelGraph and then manually add a few new layers, which are dicts, directly to hlg.layers.

hlg = HighLevelGraph.from_collections(
    gname,
    graph,
    dependencies=[arg for arg in npargs if dask.is_dask_collection(arg)],
)
for gname_l, layer in new_layers.items():
    # This adds in the getitems for each variable in the dataset.
    hlg.dependencies[gname_l] = {gname}
    hlg.layers[gname_l] = layer

However, since HighLevelGraph layers are expected to be Layer class instances in more recent versions of Dask, this can result in unexpected errors (xref #5077 (comment)). Instead of manually adding new layers to hlg.layers, this PR proposes that we create a new HighLevelGraph altogether, which ensures that hlg.layers won't contain any raw dict layers.
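As a rough illustration of the approach described above (not necessarily the exact diff in this PR), the sketch below builds a second HighLevelGraph from the merged layers and dependencies instead of mutating hlg.layers in place. The values of gname, graph, and new_layers are small hypothetical stand-ins, and the sketch assumes a Dask version in which the HighLevelGraph constructor wraps plain dict layers in Layer instances.

```python
import operator

from dask.highlevelgraph import HighLevelGraph, Layer

# Hypothetical stand-ins for the names used in the PR description.
gname = "base"
graph = {(gname, 0): (sum, [1, 2, 3])}
new_layers = {
    # One raw dict layer per extra output, each depending on the base layer.
    "getitem-a": {("getitem-a", 0): (operator.add, (gname, 0), 1)},
}

# Start from the base layer only (no upstream collections in this toy case).
hlg = HighLevelGraph.from_collections(gname, graph, dependencies=[])

# Build a fresh graph rather than assigning raw dicts into hlg.layers.
# Assumption: the constructor converts non-Layer mappings into Layer objects.
hlg = HighLevelGraph(
    layers={**hlg.layers, **new_layers},
    dependencies={
        **hlg.dependencies,
        **{name: {gname} for name in new_layers},
    },
)

# Check that no raw dict layers remain in the resulting graph.
all_are_layers = all(isinstance(layer, Layer) for layer in hlg.layers.values())
```

Constructing a new graph leaves layer normalization to HighLevelGraph itself, so the calling code does not need to know which concrete Layer subclass to use.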

Member

@andersy005 andersy005 left a comment

Thank you for addressing this issue, @jrbourbeau!

It appears that the failing tests are unrelated to your changes.

@dcherian
Contributor

dcherian commented May 6, 2021

Great, thanks @jrbourbeau!

@dcherian dcherian merged commit 020345d into pydata:master May 6, 2021
@jrbourbeau jrbourbeau deleted the hlg-fixup branch May 6, 2021 20:37
@dcherian dcherian mentioned this pull request May 13, 2021
9 tasks
dcherian added a commit to TomNicholas/xarray that referenced this pull request May 13, 2021
* upstream/master:
  combine keep_attrs and combine_attrs in apply_ufunc (pydata#5041)
  Explained what a deprecation cycle is (pydata#5289)
  Code cleanup (pydata#5234)
  FacetGrid docstrings (pydata#5293)
  Add whats new for dataset interpolation with non-numerics (pydata#5297)
  Allow dataset interpolation with different datatypes (pydata#5008)
  Flexible indexes: add Index base class and xindexes properties (pydata#5102)
  pre-commit: autoupdate hook versions (pydata#5280)
  convert the examples for apply_ufunc to doctest (pydata#5279)
  fix the new whatsnew section
  Ensure `HighLevelGraph` layers are `Layer` instances (pydata#5271)
dcherian added a commit to matzegoebel/xarray that referenced this pull request May 13, 2021
* upstream/master: (23 commits)
  combine keep_attrs and combine_attrs in apply_ufunc (pydata#5041)
  Explained what a deprecation cycle is (pydata#5289)
  Code cleanup (pydata#5234)
  FacetGrid docstrings (pydata#5293)
  Add whats new for dataset interpolation with non-numerics (pydata#5297)
  Allow dataset interpolation with different datatypes (pydata#5008)
  Flexible indexes: add Index base class and xindexes properties (pydata#5102)
  pre-commit: autoupdate hook versions (pydata#5280)
  convert the examples for apply_ufunc to doctest (pydata#5279)
  fix the new whatsnew section
  Ensure `HighLevelGraph` layers are `Layer` instances (pydata#5271)
  New whatsnew section
  Release-workflow: Bug fix (pydata#5273)
  more maintenance on whats-new.rst (pydata#5272)
  v0.18.0 release highlights (pydata#5266)
  Fix exception when display_expand_data=False for file-backed array. (pydata#5235)
  Warn ignored keep attrs (pydata#5265)
  Disable workflows on forks (pydata#5267)
  fix the built wheel test (pydata#5270)
  pypi upload workflow maintenance (pydata#5269)
  ...
3 participants