-
Notifications
You must be signed in to change notification settings - Fork 1.6k
[ty] eliminate is_fully_static #18799
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
c767070
to
533344f
Compare
CodSpeed Instrumentation Performance ReportMerging #18799 will improve performances by 61.94%Comparing Summary
Benchmarks breakdown
|
ad90749
to
5c4e2bd
Compare
This seems to imply that Any is not a subtype of itself, because there are materializations of Any that are not subtypes of other materializations of Any. I think the right formulation is the following: given two gradual types |
Yes, it didn't seem correct to me to consider We do (in this PR, and before it) simplify
Yes, I can see the rationale here. If every materialization of And I think dual arguments apply for simplifying I will try out this definition tomorrow and see if there are any difficulties implementing it, and how it impacts this PR. It definitely has some nice properties, since it restores both reflexivity of subtyping, and also that equivalence implies subtyping. And it should make union/intersection simplification simpler, since it can be based only on subtyping, instead of requiring that we check both subtyping and equivalence. |
I think this is a key point. I was thinking of subtyping in terms of "can we say for sure that whatever type B materializes to is a subtype of whatever type A materializes to?" But if subtyping is really only used for simplifying unions and intersections, then we can instead define it in terms of "does this type contribute any information to the union/intersection that this other type doesn't?" or "can we simplify this union/intersection and still have an equivalent type?" -- and I think that leads us to your definition. |
I feel a version of that argument would imply that we can also simplify Some other thoughts that I need to develop further but that may be of interest to you:
And a general note: As you may have noticed, I recently nerdsniped myself into thinking a lot about the theory of the type system. It sounds like my ruminations have been useful for ty, but feel free to ignore my ideas if they distract you from building a practically useful type checker. :) |
I'm excited about the performance improvement to |
Yes, I think my argument on the lower bound was incomplete; we need that additional condition to ensure the lower bound of the union type does not grow by the removal of
Yes, with your definition of gradual subtyping, I think this is true.
I am curious to hear more about how you use it. Is it just for step 5 (where we use top materialization)?
Definitely! |
My approach predates the spec and doesn't follow it exactly, but it seems mostly compliant, though I need to test more. I haven't had time yet to verify whether it's actually equivalent to the spec. When I initially implemented this I wrote up an explanation at https://pyanalyze.readthedocs.io/en/latest/reference/signature.html#pyanalyze.signature.OverloadedSignature.check_call . Since then, I've changed the concept of a "match due to Any" to "a match that matches under assignability but not subtyping". This behaves similarly to step 5 in the spec but not sure if it's equivalent. I can write more about it when I get around to looking into this further. |
Recording a few observations here that arose in discussion with @sharkdp: Let's write "top materialization of S" as The definition of subtyping I implemented here so far corresponds to saying that It's interesting to note that neither definition is able to simplify (But this is all setting aside the fact that it's not clear how we can even define a bottom materialization for some Python gradual types, particularly invariant generics such as |
2780c68
to
d8f7a50
Compare
I've implemented the expanded form of "gradual subtyping" discussed above. It mostly seems to work out well -- all tests pass, union/intersection builder can now be simplified to use only subtyping and not a separate equivalence check, reflexivity of subtyping is restored for all types. But the property tests do reveal one problem: transitivity of subtyping is now broken by We could maybe work around this problem specifically for I'm not sure what can be done about this. The stricter form of gradual subtyping that I originally implemented here (requiring that Losing transitivity of subtyping is a serious problem for consistent simplification of unions and intersections, because it means the order in which types are added to a union/intersection can change the result of simplification. |
My plan here is to revert this PR back to the |
Yes I think that unfortunately makes sense. |
e0adf94
to
4f1fcf2
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is great! Thank you for the detailed write-up and the doc comments. And for trying out the S- <: T- && S+ <: T+
approach, even if it didn't work out. Very nice to see how this leads to a lot of simplification, too.
crates/ty_python_semantic/resources/mdtest/type_properties/is_equivalent_to.md
Outdated
Show resolved
Hide resolved
crates/ty_python_semantic/resources/mdtest/type_properties/is_gradual_equivalent_to.md
Outdated
Show resolved
Hide resolved
crates/ty_python_semantic/resources/mdtest/type_properties/is_subtype_of.md
Show resolved
Hide resolved
crates/ty_python_semantic/resources/mdtest/type_properties/is_subtype_of.md
Outdated
Show resolved
Hide resolved
crates/ty_python_semantic/resources/mdtest/type_properties/is_subtype_of.md
Outdated
Show resolved
Hide resolved
I won't look in depth since @sharkdp already has, but lmk if there's anything specific you'd like my feedback on! |
4f1fcf2
to
42f18ed
Compare
CodSpeed WallTime Performance ReportMerging #18799 will not alter performanceComparing Summary
|
68daa8a
to
b069742
Compare
b069742
to
bcd273c
Compare
Had to make some significant additional changes to UnionBuilder to fix issues found by property tests, but this now passes all tests and property tests. I'm going to go ahead and merge, but open to post-merge review comments! |
Summary
Having a recursive type method to check whether a type is fully static is inefficient, unnecessary, and makes us overly strict about subtyping relations.
It's inefficient because we end up re-walking the same types many times to check for fully-static-ness.
It's unnecessary because we can check relations involving the dynamic type appropriately, depending whether the relation is subtyping or assignability.
We use the subtyping relation to simplify unions and intersections. We can usefully consider that
S <: T
for gradual types also, as long as it remains true thatS | T
is equivalent toT
andS & T
is equivalent toS
.One conservative definition (implemented here) that satisfies this requirement is that we consider
S <: T
if, for every possible pair of materializationsS'
andT'
,S' <: T'
. Or put differently the top materialization ofS
(S+
-- the union of all possible materializations ofS
) is a subtype of the bottom materialization ofT
(T-
-- the intersection of all possible materializations ofT
). In the most basic cases we can usefully say thatAny <: object
and thatNever <: Any
, and we can handle more complex cases inductively from there.This definition of subtyping for gradual subtypes is not reflexive (
Any
is not a subtype ofAny
).As a corollary, we also remove
is_gradual_equivalent_to
--is_equivalent_to
now has the meaning thatis_gradual_equivalent_to
used to have. If necessary, we could restore anis_fully_static_equivalent_to
or similar (which would not do anis_fully_static
pre-check of the types, but would instead pass a relation-kind enum down through a recursive equivalence check, similar tohas_relation_to
), but so far this doesn't appear to be necessary.Credit to @JelleZijlstra for the observation that
is_fully_static
is unnecessary and overly restrictive on subtyping.There is another possible definition of gradual subtyping: instead of requiring that
S+ <: T-
, we could instead require thatS+ <: T+
andS- <: T-
. In other words, instead of requiring all materializations ofS
to be a subtype of every materialization ofT
, we just require that every materialization ofS
be a subtype of some materialization ofT
, and that every materialization ofT
be a supertype of some materialization ofS
. This definition also preserves the core invariant thatS <: T
implies thatS | T = T
andS & T = S
, and it restores reflexivity: under this definition,Any
is a subtype ofAny
, and for any equivalent typesS
andT
,S <: T
andT <: S
. But unfortunately, this definition breaks transitivity of subtyping, because nominal subclasses in Python use assignability ("consistent subtyping") to define acceptable overrides. This means that we may have a classA
withdef method(self) -> Any
and a subtypeB(A)
withdef method(self) -> int
, sinceint
is assignable toAny
. This means that if we have a protocolP
withdef method(self) -> Any
, we would haveB <: A
(from nominal subtyping) andA <: P
(Any
is a subtype ofAny
), but notB <: P
(int
is not a subtype ofAny
). Breaking transitivity of subtyping is not tenable, so we don't use this definition of subtyping.Test Plan
Existing tests (modified in some cases to account for updated semantics.)
Stable property tests pass at a million iterations:
QUICKCHECK_TESTS=1000000 cargo test -p ty_python_semantic -- --ignored types::property_tests::stable
Changes to property test type generation
Since we no longer have a method of categorizing built types as fully-static or not-fully-static, I had to add a previously-discussed feature to the property tests so that some tests can build types that are known by construction to be fully static, because there are still properties that only apply to fully-static types (for example, reflexiveness of subtyping.)
Changes to handling of
*args, **kwargs
signaturesThis PR "discovered" that, once we allow non-fully-static types to participate in subtyping under the above definitions,
(*args: Any, **kwargs: Any) -> Any
is now a subtype of() -> object
. This is true, if we take a literal interpretation of the former signature: all materializations of the parameters*args: Any, **kwargs: Any
can accept zero arguments, making the former signature a subtype of the latter. But the spec actually says that*args: Any, **kwargs: Any
should be interpreted as equivalent to...
, and that makes a difference here:(...) -> Any
is not a subtype of() -> object
, because (unlike a literal reading of(*args: Any, **kwargs: Any)
),...
can materialize to any signature, including a signature with required positional arguments.This matters for this PR because it makes the "any two types are both assignable to their union" property test fail if we don't implement the equivalence to
...
. BecauseFunctionType.__call__
has the signature(*args: Any, **kwargs: Any) -> Any
, and if we take that at face value it's a subtype of() -> object
, makingFunctionType
a subtype of() -> object)
-- but then a function with a required argument is also a subtype ofFunctionType
, but not a subtype of() -> object
. So I went ahead and implemented the equivalence to...
in this PR.Ecosystem analysis
bool | (bool & Unknown) | Unknown
to simplybool | Unknown
, because we can now observe that every possible materialization ofbool & Unknown
is still a subtype ofbool
(whereas before we would set asidebool & Unknown
as a not-fully-static type.) This is clearly an improvement.possibly-unresolved-reference
errors in sockeye, pymongo, ignite, scrapy and others are true positives for conditional imports that were formerly silenced by bogus conflicting-declarations (which we currently don't issue a diagnostic for), because we considered two different declarations ofUnknown
to be conflicting (we usedis_equivalent_to
notis_gradual_equivalent_to
). In this PR that distinction disappears and all equivalence is gradual, so a declaration ofUnknown
no longer conflicts with a declaration ofUnknown
, which then results in us surfacing the possibly-unbound error.enum.StrEnum
, which doesn't exist on 3.10. That makes theclass SimpleDigest(StrEnum)
a class that inherits fromUnknown
(and bypasses our current TODO handling for accessing attributes on enum classes, since we don't recognize it as an enum class at all). This PR improves our understanding of assignability to classes that inherit fromAny
/Unknown
, and we now recognize that a string literal is not assignable to a class inheritingAny
orUnknown
.