Skip to content

Conversation

neutrinoceros
Copy link
Member

@neutrinoceros neutrinoceros commented Jul 7, 2024

Close #4749

This is a mostly automated style change, but I'm using the test suite to detect places where strict=False is either the intended behavior or a bug.

I'll also carefully review this myself to improve coverage.

kwargs["deposition"] = deposition
if override_bins is not None:
for o_bin, ax in zip(o_bins, ["x", "y", "z"]):
for o_bin, ax in zip(o_bins, ["x", "y", "z"], strict=False):
Copy link
Member Author

@neutrinoceros neutrinoceros Jul 7, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is one very rare case where it's apparent that strict=False is the intended behavior (found thanks to a test).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great catch. Yes, I think this is assuming that it'll run out of axes based on the dimensionality.

@neutrinoceros neutrinoceros added the code style Related to linting tools label Jul 7, 2024
if len(args) + len(kwargs) != len(self._params):
raise RuntimeError
self.__dict__.update(zip(self._params, args))
self.__dict__.update(zip(self._params, args, strict=False))
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

from the context, it seems that strict=False is clearly intended here

@neutrinoceros neutrinoceros added the enhancement Making something better label Jul 7, 2024
return [
self.data_source.ds.arr([mis.min(), mas.max()])
for mis, mas in zip(values[::2], values[1::2])
for mis, mas in zip(values[::2], values[1::2], strict=True)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is only safe if len(values)%2==0 is guaranteed. If not, strict=False matches the existing behavior but is possibly buggy (one value is silently dropped)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think using strict here is the right move, as a check on argument inputs.

@neutrinoceros neutrinoceros changed the title STY: enforce B905 and ignore UP038 RFC: enforce B905 and ignore UP038 Jul 7, 2024
@neutrinoceros neutrinoceros added the refactor improve readability, maintainability, modularity label Jul 7, 2024
@neutrinoceros
Copy link
Member Author

On second though this isn't a pure stylistic choice, as it may (however unlikely) have undesirable effects. Re-triaging as a refactor.

)
cgs_units = ("cm", "g", "s", "cm/s", "gauss")
for unit, attr, cgs_unit in zip(base_units, attrs, cgs_units):
for unit, attr, cgs_unit in zip(base_units, attrs, cgs_units, strict=True):
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is a good example of why enforcing strict=True helps finding bugs

@neutrinoceros neutrinoceros force-pushed the sty/strict_zip branch 2 times, most recently from 4798699 to e9ad856 Compare July 7, 2024 12:53
@neutrinoceros neutrinoceros changed the title RFC: enforce B905 and ignore UP038 RFC: explicitly set strict flag in zip calls (enforce B905), and ignore UP038 Jul 7, 2024
Comment on lines +1323 to +1322
if wf is not None:
is_local.append(wf.sampling_type == "local")
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

highlighting this bit as a bugfix

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you. I have an inkling somewhere in my head that there's an extant issue this could be related to but searching does not prove me correct.

keys = fhandle["field_types"].keys()
val = fhandle["field_types"].keys()
return dict(zip(keys, val))
val = fhandle["field_types"].values()
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is a case where the original code just looks wrong, but it's been there for 12 years so either it's not really a bug (but it should have a comment), or it's actually not used by anyone...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see any references to it -- this may be vestigial.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you're right. And it's private anyway so let me just remove it now.

Copy link
Member Author

@neutrinoceros neutrinoceros left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Finishing my self-review with a couple notes for reviewers:
First of, if you're wondering why I write

if sys.version_info >= (3, 10):
    pass
else:
    from yt._maintenance.backports import zip

instead of the simpler

if sys.version_info < (3, 10):
    from yt._maintenance.backports import zip

It is because ruff (rule UP036) can automatically cleanup the former but not the latter when we switch python-requires to >=3.10. So it's more verbose now but easier to cleanup in the future.
edit July 16th 2024: this is not true anymore, ruff 0.5.2 (and supposedly some older versions too) know how to cleanup both versions (but require --unsafe-fixes in both cases), so I'll simplify this PR.

  1. in a couple places, I actively changed the code around a zip to fix apparent bugs. These should be highlighted by my own comments.

  2. Please pay special attention in cases where I set strict=False: in most cases, I saw no actual test failing with strict=True but the context around zip made me think strict=False was the intended behavior. I may be wrong.

  3. Some zip calls were zipping some small literal lists, and I refactored them to as lists of tuples to avoid zip entirely, which I found was more readable. Maybe I made a mistake in a case or two, so please pay extra attention there too.

@neutrinoceros neutrinoceros marked this pull request as ready for review July 8, 2024 08:01
@neutrinoceros
Copy link
Member Author

I know this isn't exactly an exciting read, but can I ask @matthewturk, @chrishavlin or @cphyc for a review ?

@chrishavlin
Copy link
Contributor

Yes! I actually started it a few days ago but didn't finish up. Can definitely get to it in the next few days.

matthewturk
matthewturk previously approved these changes Jul 16, 2024
Copy link
Member

@matthewturk matthewturk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left some comments where I think there might be issues, but I want to note that I think there are no problems with any of this, and any bugs it finds would be an improvement. Thanks for taking the time and leaving such helpful comments on it.

unambiguous_fields_to_include = []
unambiguous_fields_units = []
for field, field_unit in zip(fields_to_include, fields_units):
for field, field_unit in zip(fields_to_include, fields_units, strict=True):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If an error were to occur, I think this is a candidate location.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

~30 lines prior to this, an error is raised `if len(fields_units) != len(fields_to_include):, so I think this should be OK


## explicitly go after the fields we want
for field, units in zip(fields_to_include, fields_units):
for field, units in zip(fields_to_include, fields_units, strict=True):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If an error were to occur, I think this is a candidate location.

return [
self.data_source.ds.arr([mis.min(), mas.max()])
for mis, mas in zip(values[::2], values[1::2])
for mis, mas in zip(values[::2], values[1::2], strict=True)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think using strict here is the right move, as a check on argument inputs.

Comment on lines +1323 to +1322
if wf is not None:
is_local.append(wf.sampling_type == "local")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you. I have an inkling somewhere in my head that there's an extant issue this could be related to but searching does not prove me correct.

kwargs["deposition"] = deposition
if override_bins is not None:
for o_bin, ax in zip(o_bins, ["x", "y", "z"]):
for o_bin, ax in zip(o_bins, ["x", "y", "z"], strict=False):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great catch. Yes, I think this is assuming that it'll run out of axes based on the dimensionality.

else:
p = self.parameters
for d, u in zip(("length", "time"), ("cm", "s")):
for d, u in [("length", "cm"), ("time", "s")]:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't help but feel like I probably initiated this idiom that you're undoing in several places ... and I think undoing it is the right thing!

def get_child_index(anc_id, desc_id):
cid = ""
for aind, dind in zip(anc_id.split("_"), desc_id.split("_")):
for aind, dind in zip(anc_id.split("_"), desc_id.split("_"), strict=True):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if this errors it will be a very valuable error check

a: c
for (a, b), c in zip(
self._header_spec, struct.unpack(hh, f.read(struct.calcsize(hh)))
self._header_spec,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one I'm less certain of, in that it may be the case that we don't want to force the specification of all fields and just let some go. But, that's not necessarily what we should build in to doing this check, so 👍 .


def compare(self, new_result, old_result):
for newp, oldp in zip(new_result["parents"], old_result["parents"]):
for newp, oldp in zip(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is a good additional check

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

although, it may result in slightly less informative errors if the lengths are different

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not sure what you mean here: if lengths are different, there might be no error at all (without strict=True)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I just mean, that the old way wouldn't pick that up as-is but the new way will show a weird error. It's NBD.

)

for k, val in zip(a, v):
for k, val in zip(a, v, strict=True):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do think the other version was somewhat better, as otherwise we're left deducing what caused the iterators to be of different lengths.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FTR I'm waiting on @chrishavlin to weigh in but happy to change it if he agrees

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think i prefer keeping the OSError, but don't feel too strongly about it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

reverted !

chrishavlin
chrishavlin previously approved these changes Jul 22, 2024
Copy link
Contributor

@chrishavlin chrishavlin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally looks good! some minor questions/remarks so i'll hit approve now (assume approval if it gets dismissed!)

I'll also say I had a hard time evaluating the changes in the code for some of the frontends (particularly where the existing zips related to file structures....), but I think we have to trust (hope?) that the tests would catch errors there.

unambiguous_fields_to_include = []
unambiguous_fields_units = []
for field, field_unit in zip(fields_to_include, fields_units):
for field, field_unit in zip(fields_to_include, fields_units, strict=True):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

~30 lines prior to this, an error is raised `if len(fields_units) != len(fields_to_include):, so I think this should be OK

@neutrinoceros
Copy link
Member Author

Can I get another approval ? Seems that I dismissed the ones I got when I pushed another clean up commit

@neutrinoceros
Copy link
Member Author

Let's get this in before it starts rotting

@neutrinoceros neutrinoceros merged commit bc63757 into yt-project:main Jul 27, 2024
@neutrinoceros neutrinoceros deleted the sty/strict_zip branch July 27, 2024 14:20
@neutrinoceros neutrinoceros added this to the 4.4.0 milestone Jul 27, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
code style Related to linting tools enhancement Making something better refactor improve readability, maintainability, modularity
Projects
None yet
Development

Successfully merging this pull request may close these issues.

MNT: how should we handle Python 3.10 becoming our oldest supported version ?
3 participants