-
Notifications
You must be signed in to change notification settings - Fork 1.6k
[ty] Introduce TypeRelation::Redundancy
#20602
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Diagnostic diff on typing conformance testsNo changes detected when running ty on typing conformance tests ✅ |
|
|
Lint rule | Added | Removed | Changed |
---|---|---|---|
invalid-argument-type |
12 | 0 | 32 |
unsupported-operator |
10 | 0 | 7 |
possibly-missing-attribute |
0 | 0 | 14 |
invalid-assignment |
0 | 0 | 12 |
invalid-return-type |
0 | 1 | 5 |
no-matching-overload |
0 | 3 | 0 |
possibly-missing-implicit-call |
0 | 0 | 1 |
Total | 22 | 4 | 71 |
CodSpeed WallTime Performance ReportMerging #20602 will improve performances by 18.36%Comparing Summary
Benchmarks breakdown
|
CodSpeed Instrumentation Performance ReportMerging #20602 will improve performances by 25.82%Comparing Summary
Benchmarks breakdown
Footnotes
|
f7b458a
to
32c55bc
Compare
Ecosystem analysisThis all looks good to me. Here's a slightly more minimal example of what's going on with the new schemathesis diagnostics. On from typing import Any
def f(x: dict[Any, Any] | dict[Any | str, Any | dict[Any, Any]]):
reveal_type(x) # dict[Any, Any] | dict[Any | str, Any | dict[Any, Any]]
reveal_type(x.items()) # dict_items[Any, Any] | dict_items[Any | str, Any | dict[Any, Any]]
for a, b in x.items():
reveal_type(a) # Any
reveal_type(b) # Any And on this PR we have this behaviour: from typing import Any
def f(x: dict[Any, Any] | dict[Any | str, Any | dict[Any, Any]]):
reveal_type(x) # dict[Any, Any] | dict[Any | str, Any | dict[Any, Any]]
reveal_type(x.items()) # dict_items[Any | str, Any | dict[Any, Any]]
for a, b in x.items():
reveal_type(a) # Any | str
reveal_type(b) # Any | dict[Any, Any]
The new diagnostics on |
63ecee5
to
fd4d504
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The code simplification and performance wins are huge. My main holdup is from a type theoretic viewpoint. Right now we can define the two existing relations succinctly, both in English, and in terms of materializations:
-
subtyping: Every materialization of
S
is a subtype of every materialization ofT
. To prove that, we only have to show thatS+ ≤ T-
, because of transitivity. -
assignability: There is some materialization of
T
that every materialization ofS
is a subtype of. To prove that, we show thatS+ ≤ T+
.
Is there a similar definition that we can make for union simplification?
// Union simplification sits in between subtyping and assignability. `Any <: T` only | ||
// holds true if `T` is also a dynamic type or a union that contains a dynamic type. | ||
// Similarly, `T <: Any` only holds true if `T` is a dynamic type or an intersection | ||
// that contains a dynamic type. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe this is the key bit? Can this be expressed in terms of the top/bottom materializations of the two types?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think so, yes: although Any & T
is not a subtype of Any
, for any type T
it is redundant to add Any & T
to a union that already contains Any
as the bottom materialization of (Any & T) | Any
will always be equivalent to the bottom materialization of Any
, and the top materialization of (Any & T) | Any
will always be equivalent to the top materialization of Any
.
I wrote this up in full as the doc-comment to TypeRelation::UnionSimplification
.
I suspect that the relation implemented here isn't theoretically consistent, but it's nonetheless practically useful, and so we should probably document the theoretical inconsistency and do it anyway. I think this goes back to the discussion on #18799 (and specifically this comment). There is a more generalized definition of subtyping for gradual types (proposed by @JelleZijlstra in that PR discussion) where the rule is that Unfortunately we can't fully use that definition, because it breaks down (and introduces non-transitivity) in the interaction between nominal and non-fully-static structural types. It introduces cases where a nominal type X is a (nominal) subtype of a nominal type Y, but overrides a gradual attribute of Y with a more precise type, in such a way that now Y is a subtype of some structural type P, but X is not. Which gives that If we could use that definition of subtyping, it would remove the need to check for both equivalence and subtyping in union simplification, because under that definition of gradual subtyping, any two equivalent types (even gradual ones) are subtypes of each other (that is, it generalizes reflexivity of subtyping.) I believe that the addition of the |
59fe101
to
57981a9
Compare
Okay -- I've had a go at thinking through and writing up some theoretical foundations for this! LMK what you think. |
/// The union simplification relation dictates whether the union `A | B` can be | ||
/// pragmatically simplified to the type `A` without downstream consequences on ty's | ||
/// inference of types elsewhere. | ||
/// |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the name is a bit unfortunate because it's hard to formulate other terms from it. I'll say that D
is union-simplifiable with regards to C
if the relation holds between D and C.
Then, I think the formal formulation can be that D union-simplifiable C
if C
is equivalent to C | D
. Equivalence, as you probably discuss somewhere, for fully static types means that the sets of values described by both types are the same; for dynamic types it means that the top and bottom materializations of both types are equivalent.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I looked at the code a bit more and maybe "redundant with" is a good name. C
is redundant with D
if the union C | D
can be safely simplified to D
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the name is a bit unfortunate because it's hard to formulate other terms from it
Agreed... but I'm struggling to think of anything better!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I looked at the code a bit more and maybe "redundant with" is a good name.
C
is redundant withD
if the unionC | D
can be safely simplified toD
.
Hmm, maybe TypeRelation::UnionRedundancy
? And we can say C
is union-redundant with D
if the union C | D
can be safely simplified to D
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doesn't it also apply to intersections? That's why I wanted to avoid using the term "union".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it does, yes, fair point
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Though I think the relation is reversed for intersection, right? If we use ≤R
for the new redundancy relation, and =G
for gradual equivalence, then its
C ≤U D <=> (C | D) =G D
C ≤U D <=> (C & D) =G C
Mnemonics: the smaller thing is on the left of the relation; union makes things bigger; intersection makes things smaller
(That doesn't change Jelle's point that we don't need to include "union" in the name of the relation since it's relevant to intersections too)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@dcreager, are you able to think of any test cases I could add for intersections (that don't already pass with the logic we have on main
)? I'm struggling to come up with any, and I can't tell if this is because my head is going fuzzy with all the set theory or because our DNF representation of unions and intersections actually makes the question redundant (🥁) when it comes to intersections.
If it's the latter, then UnionRedundancy
might not be such a bad name after all, and it might make the concept easier to reason about if we stated that it only applied to union simplification (not intersection simplification)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
but... just because we can't think of any new test cases doesn't mean that it's wrong to replace the existing if x.is_subtype_of(y) || x.is_equivalent_to(y)
branches in our intersection simplification logic with the new x.is_redundant_with(y)
method. So it does seem like it's useful (purely from a performance perspective, if nothing else) to apply it to both unions and intersections, even if (unlike with unions) this doesn't lead to any semantic improvement over what we have on main
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
re you able to think of any test cases I could add for intersections (that don't already pass with the logic we have on
main
)?
I can't think of any either. We can add those as a follow-on if we encounter any.
So it does seem like it's useful (purely from a performance perspective, if nothing else) to apply it to both unions and intersections, even if (unlike with unions) this doesn't lead to any semantic improvement over what we have on
main
.
Agreed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks correct and useful to me, besides my suggestions about the name and phrasing.
/// The union simplification relation dictates whether the union `A | B` can be | ||
/// pragmatically simplified to the type `A` without downstream consequences on ty's | ||
/// inference of types elsewhere. | ||
/// |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Though I think the relation is reversed for intersection, right? If we use ≤R
for the new redundancy relation, and =G
for gradual equivalence, then its
C ≤U D <=> (C | D) =G D
C ≤U D <=> (C & D) =G C
Mnemonics: the smaller thing is on the left of the relation; union makes things bigger; intersection makes things smaller
(That doesn't change Jelle's point that we don't need to include "union" in the name of the relation since it's relevant to intersections too)
0d0444c
to
a40bd9a
Compare
## Summary Currently we do not emit an error on this code: ```py from ty_extensions import Not def f[T](x: T, y: Not[T]) -> T: x = y return x ``` But we should do! `~T` should never be assignable to `T`. This fixes a small regression introduced in 14fe122#diff-8049ab5af787dba29daa389bbe2b691560c15461ef536f122b1beab112a4b48aR1443-R1446, where a branch that previously returned `false` was replaced with a branch that returns `C::always_satisfiable` -- the opposite of what it used to be! The regression occurred because we didn't have any tests for this -- so I added some tests in this PR that fail on `main`. I only spotted the problem because I was going through the code of `has_relation_to_impl` with a fine toothcomb for #20602 😄
e8ec64e
to
3ef117e
Compare
1d7d9d1
to
e6bd09f
Compare
3a769ea
to
3f54ca7
Compare
Seems like #20651 wasn't so uncontroversial 😆 and my head is going fuzzy with all the type theory there. On further reflection, I think it should be okay to land this without #20651, so I've rebased this on top of |
TypeRelation::UnionSimplification
TypeRelation::Redundancy
I think I addressed all review comments. This should be ready for another round of review. |
g: Invariant[Any] | Invariant[Any | str], | ||
h: Invariant[Any | str] | Invariant[Any], | ||
): | ||
reveal_type(a) # revealed: Bivariant[Any] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Incidentally, it would be nice to have a version of reveal_type
that takes in a type-form parameter, so that we don't have to jump through these hoops to get something we can pass in to typing_extensions.reveal_type
/// The union simplification relation dictates whether the union `A | B` can be | ||
/// pragmatically simplified to the type `A` without downstream consequences on ty's | ||
/// inference of types elsewhere. | ||
/// |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
re you able to think of any test cases I could add for intersections (that don't already pass with the logic we have on
main
)?
I can't think of any either. We can add those as a follow-on if we encounter any.
So it does seem like it's useful (purely from a performance perspective, if nothing else) to apply it to both unions and intersections, even if (unlike with unions) this doesn't lead to any semantic improvement over what we have on
main
.
Agreed
/// `D` can be said to be redundant in a union with `C` if the top materialization of the type | ||
/// `C | D` is equivalent to the top materialization of `C`, *and* the bottom materialization | ||
/// of `C | D` is equivalent to the bottom materialization of `C`. | ||
/// More concisely: `D <: C` iff `Top[C | D] == Top[C]` AND `Bottom[C | D] == Bottom[C]`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another way of saying this would be: D
can be said to be redundant in a union with C
if C | D
is gradually equivalent to C
.
The top and bottom materializations of two gradual types are the same if and only if they are gradually equivalent.
And the two conditions in the end could also be reformulated as: Top[D] <: Top[C] AND Bottom[D] <: Bottom[C]
, which would show that "redundancy" is equivalent to the "other subtyping" relation that @carljm experimented with in #18799 as well:
Relation | Conditions |
---|---|
Subtyping | Top[D] <: Bottom[C] |
Redundancy | Top[D] <: Top[C] AND Bottom[D] <: Bottom[C] |
Assignability | Bottom[D] <: Top[C] |
This also shows that subtyping always implies redundancy, and redundancy implies assignability.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another way of saying this would be:
D
can be said to be redundant in a union withC
ifC | D
is gradually equivalent toC
.
Correct (though gradual equivalence could, I think, be defined in terms of "mutual redundancy", and if we defined redundancy using gradual equivalence, that would be a little cyclic 😆).
This also shows that subtyping always implies redundancy, and redundancy implies assignability.
Yes, I was plannning on adding property tests to this effect
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reversing this logic also implies that gradual equivalence between C
and D
is equivalent to mutual redundancy between C
and D
. Which could hint at another strategy to re-implement gradual equivalence in ty.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, I just now saw your comment. Glad we both came up with the term "mutual redundancy" 😄
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A quick summary of my post-merge review and discussion with Alex: Given that the redundancy operation described in this PR seems to be equivalent to the "S+ <: T+ && S- <: T-" subtyping from #18799, we should probably try to document why the same problem mentioned in that PR with transitivity of subtyping is not relevant here (probably because what has been implemented here does not fully match the defined redundancy relation).
Summary
The union
T | U
can be validly simplified toU
iff:T
is a subtype ofU
ORT
is equivalent toU
ORU
is a union and contains a type that is equivalent toT
ORT
is an intersection and contains a type that is equivalent toU
(In practice, the only situation in which 2, 3 or 4 would be true when (1) was not true would be if
T
orU
is a dynamic type.)Currently we achieve these simplifications in the union builder by doing something along the lines of
t.is_subtype_of(db, u) || t.is_equivalent_to_(db, u) || t.into_intersection().is_some_and(|intersection| intersection.positive(db).contains(&u)) || u.into_union().is_some_and(|union| union.elements(db).contains(&t))
. But this is both slow and misses some cases (it doesn't simplify the unionAny | (Unknown & ~None)
toAny
, for example). We can improve the consistency and performance of our union simplifications by adding a third type relation that sits in betweenTypeRelation::Subtyping
andTypeRelation::Assignability
:TypeRelation::UnionSimplification
.This change leads to simpler, more user-friendly types due to the more consistent simplification. It also lead to a pretty huge performance improvement!
Test Plan
Existing tests, plus some new ones.