Update conformance tests #2095

rchen152 · 2025-10-09T22:36:16Z

Apologies for the monster PR; I started out just wanting to update the pyrefly results and kept finding more related things to tweak. The individual commits are pretty self-contained. This PR:

Updates unexpected_fails.py to allow the "conformant" field to be omitted for passing tests - this is a recent change to the tests that hadn't propagated to this script.
Adds an empty pyrefly.toml file in the tests subdirectory. Previously, if you happened to have a pyrefly.toml in a parent directory while running the conformance script, pyrefly would try to type-check the parent directory, to often unfortunate results. This file stops that from happening.
Makes fixes and updates to the tests:
- In cases in which a subset of type checkers reported a type error that was unrelated to the thing being tested, I fixed the type error so it wouldn't muddy the results.
- In cases in which type checkers had differing, reasonable assert_type results, I added tagged errors to allow multiple results.
- In generics_self_advanced.py, I added E? for errors reported only by zuban, which are labeled as a "true positive" in the notes in the corresponding results file.
- In overloads_definitions.py, I added another "E[override-final]" comment for another line on which the "override-final" error may appear.
- In conformance/tests/protocols_explicit.py, I removed an incorrect error. I believe what happened is that Concrete3 was changed from abstract to concrete at some point without updating the corresponding # E comment.
Removes some conformant = "Partial" entries that contradicted automated "Pass" results.
Regenerates the results for all type checkers. I also updated to the latest version for all checkers except zuban. The latest version of zuban fails a lot of tests it was previously passing due to a new "incompatible return value type" error; I'm not sure if this is a bug in zuban or if we need a custom config or something, so I left it alone for now (cc @davidhalter).

With this PR, unexpected_fails.py no longer reports any discrepancies between the automated and manual conformance results, which is nice :)

…or passing tests.

JelleZijlstra · 2025-10-09T23:42:31Z

conformance/results/mypy/generics_type_erasure.toml

 """
 output = """
 generics_type_erasure.py:19: error: Expression is of type "Node[Never]", not "Node[Any]"  [assert-type]
-generics_type_erasure.py:22: error: Expression is of type "Never", not "Any"  [assert-type]


This should be marked as conformant now, right? (Can't comment on line above)

On reflection, we can't allow Never here because it directly contradicts what the spec says (https://typing.python.org/en/latest/spec/generics.html#instantiating-generic-classes-and-type-erasure) ^^; I reverted this test change.

JelleZijlstra · 2025-10-09T23:48:35Z

conformance/tests/generics_self_advanced.py


    @classmethod
    def method3(cls) -> None:
+        a = cls.a  # E?: zuban errors on accessing `Self` in a classmethod


That seems like a false positive, why should we allow that error?

Ah, I was going by this comment:

typing/conformance/results/zuban/generics_self_advanced.toml

Line 3 in b3f46cf

True positive: Writing to a variable that contains Self (with a class) should error

. I've changed the test back to what it was and updated the zuban result instead.

JelleZijlstra · 2025-10-09T23:49:21Z

conformance/tests/generics_syntax_infer_variance.py

+    def __getitem__(self, index: int) -> T:
+        ...
+    @overload
+    def __getitem__(self, index: slice) -> Sequence[T]:


Do we really need the overloads? In general I'd prefer to keep things as simple as possible.

Without the overloads, pyright was complaining about overriding Sequence.getitem inconsistently.

Oh right, the thing inherits from Sequence.

JelleZijlstra

Thank you!

davidhalter · 2025-10-10T14:55:00Z

@rchen152 Thanks for the heads up! I'm going to look into that, it's definitely a bug.

davidhalter

Thanks for your work here and sorry again for the new Zuban errors, I must have changed something that makes stub-like functions require actual returns and not just a ... body, which is probably not an issue for anybody using Zuban (since in stubs it's fine), but really annoying for the conformance suite.

I have some comments about Any. I think I would like to understand your reasoning and if you really think that Any is valid in these cases.

davidhalter · 2025-10-10T15:00:22Z

conformance/tests/generics_defaults.py


-assert_type(func1(0), int)
+assert_type(func1(0), int)  # E[optional-default-use]
+assert_type(func1(0), Any)  # E[optional-default-use]


Why would this ever be Any? I think it should either be int, because that's the default, or Never because it's not reachable.

I think something like the function below feels very natural:

def func1(x: int | set[T4]) -> T4: if isinstance(x, int): return 1 # Or raise Exception() return next(iter(x), 2)

I think all type checkers do currently not recognize that the return is theoretically ok.

Using the default here is optional: https://typing.python.org/en/latest/spec/generics.html#function-defaults. It's not clear what should happen if a type checker doesn't use the default. I think there are a number of reasonable choices (falling back to Any, reporting an error, etc.); I updated the test to accept the Any choice because that's what pyright would fall back to if the default weren't there.

@rchen152, I think you're conflating argument defaults (i.e. default values for parameters in a function) with type argument defaults (i.e. default type argument values for type parameters). This test is for the latter, but the section you linked to above refers to the former. The behavior covered by this test is not optional.

@erictraut The section I linked is quoted directly above this test. The test is:

T4 = TypeVar("T4", default=int) def func1(x: int | set[T4]) -> T4: ... assert_type(func1(0), int)

I don't see where argument defaults come in? T4 goes unsolved, so a type checker may use the default.

Sorry, my bad. Your comment is correct.

davidhalter · 2025-10-10T18:56:09Z

conformance/results/zuban/generics_type_erasure.toml

+conformant = "Partial"
 notes = """
-True positive: Undefined type vars should be inferred as Never not Any (avoiding to introduce Any)
+Infers Node[Never] instead of Node[Any] when argument is not provided.


You wrote:

In generics_self_advanced.py, I added E? for errors reported only by zuban, which are labeled as a "true positive" in the notes in the corresponding results file.

I don't think the E? was added. I think it's fine to change the conformance for now, but I probably need to open an issue to discuss this. I actually think Mypy is right in this case and Pyright is wrong. Pyright assumes Any when this variable should really be Never, like pretty much all static type systems would assume and they would force the user to specify a type.

I personally think 99% of the conformance tests are correct, but this is one of the cases where I'm sure that Any sneaking in is not what we want.

Ah, sorry, I changed the PR during the review process but didn't update the description. Instead of adding # E?, I changed the zuban results file to mark conformance as "partial."

The specification says pretty explicitly that the default should be Any: https://typing.python.org/en/latest/spec/generics.html#instantiating-generic-classes-and-type-erasure. I'm sympathetic to the argument that Never should be allowed as well, but a change like that would have to go through the process described here for proposing a change to the spec. The conformance tests should follow what the spec says as closely as possible.

I'm not convinced that Node[Never] is the right answer here. There's not enough information for a constraint solver to solve the type in this case, so Never doesn't seem right to me. I think it's reasonable for a type checker to generate an error in this case if it wants to be strict. I also think it's reasonable for a type checker to solve the type as Any (or Unknown, to use pyright's term for an implied Any).

@rchen152 Yeah, I guess that was always my fear, hence the "true positive" label :) I wanted to discuss this in the future and didn't have time back then. I'm going to bring this back up soon.

@erictraut

I think it's reasonable for a type checker to generate an error in this case if it wants to be strict.

But isn't that what we should specify? Isn't strictness what we want to achieve as much as possible? We could also allow Mypy's --no-strict-optional and allow for a lot of uncertainty there, but I think it's good that we don't and we should probably strive more towards something like Mypy's --strict and try to get rid of any Any that is introduced anywhere. Any is the reason why these checkers exist in the first place and removing all of them should be our goal.

Summary: Pulls in updates from python/typing#2095: some test changes that improve our conformance a bit, plus a pyrefly.toml file so that pyrefly can't pick up a stray config file and do interesting things. Reviewed By: yangdanny97 Differential Revision: D84370023 fbshipit-source-id: 0e8d70a00db73469b6e13812f8cf5920090b0e73

rchen152 added 6 commits October 9, 2025 15:25

Update unexpected_fails.py to allow omitting the conformant field f…

f3991f9

…or passing tests.

Add pyrefly.toml.

48ea301

Update results for pyrefly 0.36.2.

4932db2

Update manual conformance.

f124c33

Fixes to tests for type errors and incorrect expectations.

4c1bd69

Regenerate results.

a01b0a4

JelleZijlstra reviewed Oct 9, 2025

View reviewed changes

Address review comments.

f6b4eae

JelleZijlstra approved these changes Oct 10, 2025

View reviewed changes

Remove no-longer-needed import.

baf8e33

srittau added the topic: conformance tests Issues with the conformance test suite label Oct 10, 2025

rchen152 merged commit ecc2129 into python:main Oct 10, 2025
4 checks passed

rchen152 deleted the update branch October 10, 2025 17:48

davidhalter suggested changes Oct 10, 2025

View reviewed changes

Update conformance tests #2095

Update conformance tests #2095

Uh oh!

Conversation

rchen152 commented Oct 9, 2025

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

JelleZijlstra left a comment

Choose a reason for hiding this comment

Uh oh!

davidhalter commented Oct 10, 2025

Uh oh!

Uh oh!

davidhalter left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

davidhalter Oct 11, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

davidhalter Oct 11, 2025 •

edited

Loading