Skip to content

Conversation

@max-sixty
Copy link
Collaborator

  • Tests added
  • Passes isort . && black . && mypy . && flake8
  • User visible changes (including notable bug fixes) are documented in whats-new.rst
  • New functions/methods are listed in api.rst

I took an initial stab at allowing for consistently ordered coords, which came up recently, by using a list rather than a set. It's fairly involved, and the currently proposed code fails quite a few tests.

Another option would be to use OrderedSet. Or to leave it as a set.

@max-sixty max-sixty changed the title Initial attempt at using list rather than set for coord_names Allow for consistenty ordered coords Jan 3, 2021
@dcherian
Copy link
Contributor

dcherian commented Jan 4, 2021

Another option would be to use OrderedSet.

Would this be a smaller change overall (aside from typing)?

@max-sixty
Copy link
Collaborator Author

Would this be a smaller change overall (aside from typing)?

I think slightly smaller, yes.

@max-sixty
Copy link
Collaborator Author

As discussed in dev meeting:

  • one case to look at is that of pulling a variable out of a dataset
  • consider using an OrderedSet, which will perform better in worst case

@dcherian
Copy link
Contributor

dcherian commented Jan 6, 2021

one case to look at is that of pulling a variable out of a dataset

@max-sixty see #4744

Copy link
Collaborator

@keewis keewis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I forgot to submit these suggestions before the meeting, so they might be out-of-date (but possibly still helpful).

def _names(self) -> Set[Hashable]:
return set(self._data._coords)
def _names(self) -> List[Hashable]:
return list(sorted(self._data._coords))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorted always returns a list:

Suggested change
return list(sorted(self._data._coords))
return sorted(self._data._coords)

Comment on lines +293 to +294
new_coord_names = coord_names + [x for x in vars_to_replace if x not in coord_names]
new_coord_names = [x for x in new_coord_names if x not in vars_to_remove]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not sure if we can come up with something more efficient (or if being efficient is really necessary), but this seems a bit wasteful (we iterate over set(vars_to_replace) - set(coord_names) twice). Maybe something like this:

Suggested change
new_coord_names = coord_names + [x for x in vars_to_replace if x not in coord_names]
new_coord_names = [x for x in new_coord_names if x not in vars_to_remove]
new_coord_names = [x for x in coord_names if x not in vars_to_remove]
new_coord_names += [
x
for x in vars_to_replace
if x not in coord_names and x not in vars_to_remove
]

?

Comment on lines +351 to +352
new_coord_names = coord_names + [x for x in vars_to_create if x not in coord_names]
new_coord_names = [x for x in new_coord_names if x not in vars_to_remove]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same here:

Suggested change
new_coord_names = coord_names + [x for x in vars_to_create if x not in coord_names]
new_coord_names = [x for x in new_coord_names if x not in vars_to_remove]
new_coord_names = [x for x in coord_names if x not in vars_to_remove]
new_coord_names += [
x
for x in vars_to_create
if x not in coord_names and x not in vars_to_remove
]

self._variables = variables
self._coord_names = coord_names
# TODO: can we remove `sorted` and let it be user-defined?
self._coord_names = sorted(coord_names)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if user-defined is a good idea for Dataset, see #4649. sorted would have the advantage of always being predictable.

@max-sixty
Copy link
Collaborator Author

max-sixty commented Jun 2, 2021

Going to close this in favor of using an OrderedSet — while I won't do this now, I can in the future, or someone is welcome to take this on. I should have realized that would be a better design in the first place!

@max-sixty max-sixty closed this Jun 2, 2021
@max-sixty max-sixty deleted the ordered-coords branch June 28, 2021 22:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants