[ty] Add new `hover` mdtest assertion #20786

dcreager · 2025-10-09T12:24:30Z

This PR adds a new hover mdtest assertion. This exercises the same logic as the hover LSP action. It's useful for when reveal_type would display a different infered type. (reveal_type is not transparent to bidirectional typing, so inserting a reveal_type call can sometimes change the type that we infer for an expression!)

I used Claude to help write large parts of this, but I have reviewed each step and the resulting diff in detail.

- Add Hover variant to UnparsedAssertion and ParsedAssertion enums - Create HoverAssertion struct with column and expected_type fields - Add HoverAssertionParseError enum for validation errors - Update from_comment() to recognize '# hover:' and '# ↓ hover:' patterns - Simplified design: down arrow must appear immediately before 'hover' keyword - Add placeholder matching logic in matcher.rs (to be completed) - Add PLAN.md to track implementation progress The hover assertion syntax uses a down arrow to indicate column position: # ↓ hover: expected_type expression_to_hover This will enable testing hover functionality in mdtest files, similar to how ty_ide tests work with <CURSOR> markers. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>

- Create CheckOutput enum with Diagnostic and Hover variants - Replace SortedDiagnostics with SortedCheckOutputs in matcher - Update match_file to accept &[CheckOutput] instead of &[Diagnostic] - Update matching logic to handle both diagnostics and hover results - Implement Unmatched trait for CheckOutput - Convert diagnostics to CheckOutput in lib.rs before matching This infrastructure allows hover assertions to be matched against hover results without polluting the DiagnosticId enum with test-specific IDs. The hover matching logic will be implemented in the next commit. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>

- Add find_covering_node() to locate AST nodes at specific positions - Add infer_type_at_position() to get type information using ty_python_semantic - Add generate_hover_outputs() to scan for hover assertions and generate results - Integrate hover outputs into check flow in run_test() - Implement hover matching logic in matcher.rs to compare types - Avoid adding ty_ide dependency; instead use ty_python_semantic directly The implementation extracts hover assertions from comments, computes the target position from the down arrow location, infers the type at that position using the AST and SemanticModel, and matches it against the expected type in the assertion. Hover assertions now work end-to-end in mdtest files! 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>

All core hover assertion functionality is now implemented and compiling! 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>

Move hover-related functions from lib.rs into a new hover.rs module to improve code organization: - find_covering_node() - locate AST nodes at specific offsets - infer_type_at_position() - get inferred types using SemanticModel - generate_hover_outputs() - scan for hover assertions and generate results This keeps lib.rs focused on the main test execution flow. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>

Replace the large match statement over all AnyNodeRef expression variants with a simple call to as_expr_ref(). This helper method handles all expression-related variants automatically, reducing the function from ~60 lines to just 6 lines. Also removes statement-related variants (StmtFunctionDef, StmtClassDef, StmtExpr) to focus only on expression nodes, which is the primary use case for hover assertions. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>

The previous implementation had a bug: it would overwrite `found` for every matching node in source order, which could incorrectly select a sibling node instead of the minimal covering node. Now use the same approach as ty_ide's covering_node: - Use leave_node() to detect when we've finished traversing a subtree - Set a `found` flag when leaving the minimal node to prevent further updates - This ensures we return the deepest (most specific) node that covers the offset, not just the last one visited in source order 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>

Replace the leave_node() approach with a simpler strategy: just compare the range lengths and keep the smallest node that covers the offset. This is more direct and easier to understand than tracking when we leave nodes. The minimal covering node is simply the one with the shortest range that contains the offset. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>

The previous implementation incorrectly assumed the target line was always line_number + 1, which breaks when multiple assertion comments are stacked on consecutive lines. Now generate_hover_outputs accepts the parsed InlineFileAssertions, which already correctly associates each assertion with its target line number (accounting for stacked comments). For the column position, we extract it directly from the down arrow position in the UnparsedAssertion::Hover text, avoiding the need to parse the assertion ourselves (parsing happens later in the matcher). 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>

Replace nested if-let blocks with let-else + continue to reduce indentation and improve readability. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>

SortedDiagnostics is now replaced by SortedCheckOutputs, which handles both diagnostics and hover results. This refactoring: - Renames diagnostic.rs to check_output.rs to better reflect its purpose - Moves CheckOutput and SortedCheckOutputs definitions from matcher.rs to check_output.rs where they belong - Removes the now-unused SortedDiagnostics infrastructure - Ports the test to use SortedCheckOutputs instead of SortedDiagnostics - Updates all imports throughout the codebase The check_output module now serves as the central location for sorting and grouping all types of check outputs (diagnostics and hover results) by line number. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>

Create a dedicated HoverOutput struct to hold hover result data, replacing the inline fields in CheckOutput::Hover variant. This allows implementing Unmatched and UnmatchedWithColumn traits directly on HoverOutput, simplifying the CheckOutput implementations to simple delegation. Benefits: - Better separation of concerns - Cleaner trait implementations - More consistent with Diagnostic handling - Easier to extend HoverOutput in the future 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>

Move the HoverOutput type from check_output.rs to hover.rs where it logically belongs. The check_output module should only contain the CheckOutput enum and sorting infrastructure, while hover-specific types belong in the hover module. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>

Implement UnmatchedWithColumn for CheckOutput and update the column() method to work with CheckOutput instead of just Diagnostic. This allows eliminating the match statement when formatting unmatched outputs - we can now just call unmatched_with_column() on all outputs uniformly. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>

Move the column calculation from generate_hover_outputs to the HoverAssertion parsing logic. This makes better use of the existing column field in HoverAssertion. Changes: - UnparsedAssertion::Hover now stores both the expected type and the full comment text - HoverAssertion::from_str() now takes both parameters and calculates the column from the down arrow position in the full comment - generate_hover_outputs() now reads the column from the parsed assertion instead of recalculating it This eliminates redundant calculations and makes the column field actually useful. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>

Import InlineFileAssertions, ParsedAssertion, and UnparsedAssertion at the module level instead of using fully qualified crate::assertion names throughout the code. This makes the code cleaner and more idiomatic. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>

Change the column field in HoverAssertion from OneIndexed to usize, storing it as a zero-based index. This matches how it's used and eliminates unnecessary conversions between zero-based and one-based indexing. The arrow position from find() is already zero-based, and TextSize uses zero-based offsets, so this is more natural. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>

The previous implementation calculated the down arrow position within the comment text, but then incorrectly used that as an offset into the target line. This only worked if the comment started at column 0. Now we properly calculate the column position by: 1. Finding the arrow offset within the comment text 2. Getting the comment's column position in its line 3. Adding these together to get the arrow's column in the line 4. Using that column to index into the target line This fixes hover assertions when comments are indented or don't start at the beginning of the line. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>

Move the column calculation from generate_hover_outputs into HoverAssertion::from_str() by passing LineIndex and SourceText to the parse() method. This simplifies the code and centralizes the column calculation logic in one place during parsing, rather than spreading it across multiple locations. The HoverAssertion now directly stores the final column position in the line, ready to use. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>

Convert the Hover variant from tuple-style to named fields for better clarity and self-documentation. Before: Hover(&'a str, &'a str, TextRange) After: Hover { expected_type, full_comment, range } This makes the code more readable and easier to maintain. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>

The column field was being treated inconsistently - calculated as a character offset but used as a byte offset. This would break on any line with multi-byte UTF-8 characters before the hover position. Fixes: 1. Use chars().position() instead of find() to get character offset 2. Use LineIndex::offset() with PositionEncoding::Utf32 to properly convert character offset to byte offset (TextSize) 3. Document that column is a UTF-32 character offset This ensures hover assertions work correctly with Unicode text. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>

Instead of manually calculating character offsets, use line_column() which does all the work for us: 1. Find the byte offset of the arrow in the comment text 2. Add that to the comment's TextRange to get the arrow's absolute position 3. Call line_column() on that position to get the character column This is much simpler and lets line_column handle all the UTF-8/UTF-32 conversion complexity. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>

Store the hover assertion column as OneIndexed since that's the type returned by line_column() and required by SourceLocation. This eliminates unnecessary conversions between zero-based and one-based indexing. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>

Update match_line signature to accept &[CheckOutput] instead of &LineCheckOutputs for consistency with the previous API that took &[Diagnostic]. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>

- Elide unnecessary explicit lifetimes in find_covering_node - Add backticks around CheckOutputs in doc comment 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>

Update SortedCheckOutputs to store references to CheckOutput instead of owned values, matching the design of the previous SortedDiagnostics implementation. This avoids unnecessary cloning and makes the API more consistent. Changes: - SortedCheckOutputs now stores Vec<&CheckOutput> - new() takes IntoIterator<Item = &CheckOutput> - LineCheckOutputs.outputs is now &[&CheckOutput] - Implement Unmatched and UnmatchedWithColumn for &CheckOutput - Update match_line to take &[&CheckOutput] 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>

When hovering, if we find a statement node (like StmtExpr), extract the expression from within it. This allows hover assertions to work on standalone expressions like variable references. Also add a simple working mdtest to demonstrate hover assertions. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>

Add hover.md with examples of hover assertions across different expression types. Some tests still need arrow alignment fixes, but many sections are passing including basic literals, function definitions, comprehensions, and the simple hover test file. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>

When hovering on an attribute name like 'value' in 'instance.value', the minimal covering node is the Identifier, but we need the Attribute expression to get the type. Update find_covering_node to track both the minimal node and minimal expression, preferring the expression. This fixes hover on attribute accesses. All hover tests now pass. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>

github-actions · 2025-10-09T12:26:32Z

Diagnostic diff on typing conformance tests

No changes detected when running ty on typing conformance tests ✅

github-actions · 2025-10-09T12:27:28Z

`mypy_primer` results

No ecosystem changes detected ✅
No memory usage changes detected ✅

github-actions · 2025-10-09T12:38:35Z

`ruff-ecosystem` results

Linter (stable)

✅ ecosystem check detected no linter changes.

Linter (preview)

✅ ecosystem check detected no linter changes.

Formatter (stable)

✅ ecosystem check detected no format changes.

Formatter (preview)

✅ ecosystem check detected no format changes.

AlexWaygood

Interesting! I haven't reviewed the new ty_test code thoroughly yet, but would you be able to give some more examples of where the current behaviour of reveal_type is problematic?

AlexWaygood · 2025-10-09T12:39:16Z

crates/ty_python_semantic/resources/mdtest/hover.md

+## Different results for `reveal_type` and `hover`
+
+```py
+from typing import overload
+
+def f(x: dict[str, int]) -> None: ...
+
+# revealed: dict[Unknown, Unknown]
+f(reveal_type({}))
+
+# ↓ hover: dict[str, int]
+f({})
+```


This example confuses me somewhat, because the introduction to this mdtest suite states that

You can use the hover assertion to test the infered type of an expression

But dict[str, int] is not the inferred type of {} here. The inferred type of {} here is dict[Unknown, Unknown] (the answer reveal_type gave), but we allow {} to be passed into the x parameter of f because dict[Unknown, Unknown] is assignable to dict[str, int], the annotation for the x parameter.

As of #20635, we're doing bidirectional type checking for call arguments, and using the parameter annotation as the type context. And as of #20523, we're using the type context to infer a more precise type for dictionary literals. So dict[str, int] really is the infered type of {} here, when it's passed directly in as the argument. But when passing it to reveal_type, the parameter annotation is just _T, and so we don't have the more precise type context.

That might suggest that we want to propagate the type context through these kinds of inner calls (not just for reveal_type), but even then, with bidi checking it's now the case that we can have different infered types for the same expression depending on how its used, and it's useful to have an assertion that cannot influence that type context in any way.

That might suggest that we want to propagate the type context through these kinds of inner calls (not just for reveal_type)

That sounds like it might make sense to me... It seems quite counterintuitive to me that hovering over the first [""] in the playground for this example gives a different type than hovering over the second [""]:

from typing import reveal_type def needs_list_of_strings(x: list[str | None]): ... def identity[T](x: T) -> T: return x needs_list_of_strings([""]) needs_list_of_strings(identity([""]))

https://play.ty.dev/53299738-b406-4e46-ba40-c0240225f0ae

with bidi checking it's now the case that we can have different infered types for the same expression depending on how its used, and it's useful to have an assertion that cannot influence that type context in any way

Fair enough! The additional code in ty_test to make it work also isn't free, though -- there's a maintenance burden there, and a small cognitive burden (I now have to think about whether a reveal_type assertion or a hover assertion will be more appropriate when I'm writing my tests!). I still feel like I'd be interested in a few more examples of where the current reveal_type behaviour is causing issues for your generics work.

As of #20635, we're doing bidirectional type checking for call arguments, and using the parameter annotation as the type context. And as of #20523, we're using the type context to infer a more precise type for dictionary literals. So dict[str, int] really is the infered type of {} here, when it's passed directly in as the argument. But when passing it to reveal_type, the parameter annotation is just _T, and so we don't have the more precise type context.

Thanks BTW -- I hadn't fully internalised that this was the consequence of those changes! Makes sense.

crates/ty_python_semantic/resources/mdtest/hover.md

sharkdp · 2025-10-09T14:44:52Z

That might suggest that we want to propagate the type context through these kinds of inner calls (not just for reveal_type)

Yes, this sounds like something that we're going to want/need anyway

but even then, with bidi checking it's now the case that we can have different infered types for the same expression depending on how its used, and it's useful to have an assertion that cannot influence that type context in any way.

… but assuming that we would already propagate the type context through a reveal_type call (which acts like an identity function), we could also solve this use case with reveal_type, right? Then there would be no need for hover-assertions anymore? You could then do:

x: list[Literal[1]] = reveal_type([1, 1])  # revealed: list[Literal[1]]
y: list[int] = reveal_type([1, 1])  # revealed: list[int]

dcreager · 2025-10-09T17:07:46Z

assuming that we would already propagate the type context through a reveal_type call (which acts like an identity function), we could also solve this use case with reveal_type, right? Then there would be no need for hover-assertions anymore? You could then do:

Yes, but... 😂

The particular example I want to write a (hover) test for is this:

def f[T](x: dict[str, T]) -> None: ...

# TODO: should be dict[str, Unknown]
# ↓ hover: dict[str, T@f]
f({})

In a perfect world, I should be able to do this:

# TODO: should be dict[str, Unknown]
# revealed: dict[str, T@f]
f(reveal_type({}))

But this example is more complex than the annotated assignments you mention. We don't just need to propagate the type context through from the outer call to the inner call; we also need to solve the specialization inference for both calls simultaneously (T@f and _T@reveal_type). (Or maybe allow that the inner one can refer to typevars of the outer one?)

I agree that we need to make that work — but it also seemed worth having an escape hatch that lets us test these expressions without relying on all of that machinery working correctly.

Maybe the escape hatch is to write that test over in ty_ide, where we already have hover tests that use <CURSOR> to indicate where the hover is triggered? But that also seemed less than ideal, since the test wouldn't really be exercising the LSP action; it's a test of our type inference behavior, which seems to properly belong in a ty_python_semantic mdtest.

dcreager · 2025-10-09T17:09:05Z

Maybe the escape hatch is to write that test over in ty_ide

Or alternatively, maybe the escape hatch is just to hover in the playground and write in the PR description that I did that

sharkdp · 2025-10-09T17:21:58Z

I am certainly not opposed to this approach here, but it does feel like a rather large addition (judging from the line diff alone) for a seemingly small use case.

over in ty_ide, where we already have hover tests that use <CURSOR> to indicate where the hover is triggered

Maybe there is actually interest in porting those tests to your framework here? That would certainly be a strong argument in favor.

AlexWaygood · 2025-10-09T17:32:19Z

I am certainly not opposed to this approach here, but it does feel like a rather large addition (judging from the line diff alone) for a seemingly small use case.

(I'm also a little surprised that mypy, for example, has never needed this feature in its test suite. Mypy has a test framework that's pretty darn similar to mdtest, and a test suite that's much bigger than ours, but it still just uses reveal_type nearly everywhere)

MichaReiser

Having the ability to write hover tests in mdtests is nice. However, it adds a bit of confusion: Am I supposed to write a mdtest-hover test or a ty_ide hover test when making changes to hover and why? Do we need both?

Maybe there is actually interest in porting those tests to your framework here? That would certainly be a strong argument in favor.

Having the ability to write them in mdtests would certainly be nice. However, they often test for more than just the type (e.g. doc comments). This makes it difficult to replace them entirely unless we support snapshotting the entire hover but that doesn't seem that urgent right now.

MichaReiser · 2025-10-10T06:31:05Z

crates/ty_python_semantic/resources/mdtest/hover.md

Should these tests be in ty_ide? I'd find it confusing that a change in ty_ide can break a test in ty_python_semantic. It would probably take me a while to realize that a test in ty_python_semantic isn't failing because of a change in ty_python_semantic (which would be the first place where I go look), but because of a change in ty_ide.

MichaReiser · 2025-10-10T06:36:45Z

crates/ty_python_semantic/resources/mdtest/hover.md

+    # ↓ hover: int
+    parameter
+
+    #        ↓ hover: Literal[10]


I don't have a good suggestion as an alternative, but a downside of the arrow is that I'd always have to copy paste it from somewhere because I don't know how to type it on a keyboard.

see #20786 (comment) :-)

dcreager and others added 30 commits October 8, 2025 12:19

Update PLAN.md - mark steps 4 & 5 complete

6e73b85

All core hover assertion functionality is now implemented and compiling! 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>

[ty_test] Use let-else pattern in generate_hover_outputs

ab26136

Replace nested if-let blocks with let-else + continue to reduce indentation and improve readability. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>

[ty_test] Fix clippy warnings in hover module

180d9de

- Elide unnecessary explicit lifetimes in find_covering_node - Add backticks around CheckOutputs in doc comment 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>

[ty] Update PLAN.md with testing progress

b18d213

clean up mdtest

6ddf729

dcreager and others added 7 commits October 8, 2025 15:54

use ty_ide

8426cc6

update tests

7b88440

remove finished plan

39353da

precommit

1810b7c

update tests

fbc211f

clean up the diff

b438631

dcreager requested review from AlexWaygood and carljm as code owners October 9, 2025 12:24

dcreager added the internal An internal refactor or improvement label Oct 9, 2025

dcreager requested review from MichaReiser and sharkdp as code owners October 9, 2025 12:24

dcreager added the ty Multi-file analysis & type inference label Oct 9, 2025

AlexWaygood reviewed Oct 9, 2025

View reviewed changes

dcreager added 3 commits October 9, 2025 09:55

reword some docs

34b463d

spelling

0817a36

clean up some claude-isms

eb95e49

MichaReiser reviewed Oct 10, 2025

View reviewed changes

[ty] Add new hover mdtest assertion #20786

Are you sure you want to change the base?

[ty] Add new hover mdtest assertion #20786

Uh oh!

Conversation

dcreager commented Oct 9, 2025

Uh oh!

github-actions bot commented Oct 9, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Diagnostic diff on typing conformance tests

Uh oh!

github-actions bot commented Oct 9, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

mypy_primer results

Uh oh!

github-actions bot commented Oct 9, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

ruff-ecosystem results

Linter (stable)

Linter (preview)

Formatter (stable)

Formatter (preview)

Uh oh!

AlexWaygood left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

AlexWaygood Oct 9, 2025

Choose a reason for hiding this comment

Uh oh!

dcreager Oct 9, 2025

Choose a reason for hiding this comment

Uh oh!

AlexWaygood Oct 9, 2025

Choose a reason for hiding this comment

Uh oh!

AlexWaygood Oct 9, 2025

Choose a reason for hiding this comment

Uh oh!

Uh oh!

sharkdp commented Oct 9, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

dcreager commented Oct 9, 2025

Uh oh!

dcreager commented Oct 9, 2025

Uh oh!

sharkdp commented Oct 9, 2025

Uh oh!

AlexWaygood commented Oct 9, 2025

Uh oh!

MichaReiser left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

MichaReiser Oct 10, 2025

Choose a reason for hiding this comment

Uh oh!

MichaReiser Oct 10, 2025

Choose a reason for hiding this comment

Uh oh!

sharkdp Oct 10, 2025

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

[ty] Add new `hover` mdtest assertion #20786

[ty] Add new `hover` mdtest assertion #20786

github-actions bot commented Oct 9, 2025 •

edited

Loading

github-actions bot commented Oct 9, 2025 •

edited

Loading

`mypy_primer` results

github-actions bot commented Oct 9, 2025 •

edited

Loading

`ruff-ecosystem` results

AlexWaygood left a comment •

edited

Loading

sharkdp commented Oct 9, 2025 •

edited

Loading

MichaReiser left a comment •

edited

Loading