Skip to content

Conversation

kripken
Copy link
Member

@kripken kripken commented Aug 29, 2025

This adds descriptor support to struct-utils.h as an extra field. It is
reported as index "-1" when an index is used in the API, allowing
passes to skip it where irrelevant.

That new support is then used in a single pass, GTO, where we just
see if a descriptor can be removed, and remove it if so.

Diff without whitespace is smaller.

@kripken
Copy link
Member Author

kripken commented Aug 29, 2025

Oh, the fuzzer found a subtyping issue here. Looking into it...

@@ -320,6 +362,10 @@ template<typename T> class TypeHierarchyPropagator {
work.push(*superType);
}
}
// Propagate the descriptor.
if (superInfos.desc.combine(infos.desc)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We probably need to check whether the supertype actually has a descriptor first.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is safe either way: if there is no descriptor, there is nothing to propagate.

Given the propagation we do is typically copying a few booleans, I'm not sure if adding a check would make things faster.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not so worried about speed, but rather modeling things accurately. It would be confusing to find during debugging that a type with no descriptor has descriptor information. One option would be to put the desc info in a std::optional, but I suppose that users of this utility will have their own default "bottom" information that works just as well.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I see. Yeah, maybe a std::optional would make debugging clearer. But that would add overhead and code clutter. I added a comment, maybe that's good enough?

@@ -333,6 +379,10 @@ template<typename T> class TypeHierarchyPropagator {
work.push(subType);
}
}
// Propagate the descriptor.
if (subInfos.desc.combine(infos.desc)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And for both supertypes and subtypes we can probably skip combining descriptor values if the original type does not have a descriptor.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(as above)

Comment on lines 403 to 404
// To remove a descriptor, it must not be used in either subtypes or
// supertypes, to not break validation. It must also have no write (see
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should still be able to remove a descriptor even if subtypes need to keep their descriptors, as long as we can remove the descriptors from all the supertypes.

It would also be great if we could take reference exactness into account and avoid propagating info to supertypes and subtypes when we don't need to.

Copy link
Member Author

@kripken kripken Aug 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe I'm misreading you, but are you suggesting this should work?

(module
  (rec
    (type $A (sub (descriptor $A.desc (struct))))
    (type $A.desc (sub (describes $A (struct ))))

    (type $B (sub $A (descriptor $B.desc (struct))))
    (type $B.desc (sub (describes $B (struct))))
  )

  (func $test
    (drop
      (struct.new $A
        (struct.new $A.desc)
      )
    )
    (drop
      (ref.get_desc $B ;; force $B's descriptor to remain
        (struct.new $B
          (struct.new $B.desc)
        )
      )
    )
  )
)

Here there is a subtype of $A that needs a descriptor, but $A itself does not. But this does not validate: $B.desc must be a subtype of $A.desc. And, once that is added, we cannot optimize, as the descriptor subtyping blocks us.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

$B.desc only needs to be a subtype of $A's descriptor if $A has a descriptor. This is perfectly valid:

  (rec
    (type $A (sub (struct)))

    (type $B (sub $A (descriptor $B.desc (struct))))
    (type $B.desc (sub (describes $B (struct))))
  )

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like you removed $A.desc and $B.desc's subtyping of it. Yeah, that should work, but only if we unsubtype here at the same time? I'm not sure if we want this pass to do that.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh right. But then Unsubtyping won't be able to do this either because it doesn't modify descriptor/describes relationships.

One option would be to handle unneeded descriptors that need to remain descriptors for validation purposes by giving them a trivial new type to describe. That would let the original described type be optimized and allow future optimization passes to clean up the descriptor type and the trivial new described type.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the program requires $B <: $A and requires that $B.desc remain the descriptor of $B, then unsubtyping won't be able to optimize anything because $B <: $A implies $B.desc <: $A.desc. Then GTO won't be able to do anything either because keeping $B.desc as $B's descriptor forces keeping $A.desc as $A's descriptor with the current code. If we added the trivial described types, then --gto --unsubtyping would be able to fully optimize this case as well.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, thanks. Makes sense.

This would need the new trivial described type to still be a supertype of the sub-descriptor, correct? It seems tricky to get that right, especially if there are other uses (say, casts) on the descriptor types...

Let's leave all this for a TODO perhaps?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, the new trivial described types shouldn't need to have any supertypes or subtypes. Leaving this as a TODO sounds good, although some of the new propagation infrastructure wouldn't be necessary if we did this.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, please see the commit before last entitled tlively's idea? - does that do what you were thinking?

It doesn't quite work - some bug with temp types deep in the TypeBuilder - but I posted it because I don't see how

some of the new propagation infrastructure wouldn't be necessary

What code should have been deleted? This seems to just add code.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess you still need to propagate needed up and down the descriptor hierarchy to determine which types need trivial described types, so I was wrong about being able to delete code :(

See separate comments on the implementation of the idea.

)
)

;; As above, but use $B's. This also stops everything.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should be able to optimize $A here!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(same issue as above with my wat example, I think)

@kripken
Copy link
Member Author

kripken commented Aug 29, 2025

(bug fixed where descriptors subtyped but not the main types)

}

// Propagate.
descPropagator.propagateToSuperAndSubTypes(map);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, propagating descriptor info up to types without descriptors actually produces worse results here. If two different types have descriptors and they share a supertype without a descriptor, we should not propagate constraints on one descriptor to the other.

(type $super (sub (struct)))

(type $A (sub $super (descriptor $A.desc (struct))))
(type $A.desc (describes $A (struct)))

(type $B (sub $super (descriptor $B.desc (struct))))
(type $B.desc (describes $B (struct)))

Here propagateToSuperAndSubTypes will currently propagate descriptor information from $A to $B and vice versa. But that's overly conservative.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Actually it doesn't matter in this particular case because the property being propagated is not tied to the descriptor field, but I guess this reminded me of the issue.)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, this is descPropagator: it propagates along descriptors. In your example, the describees are where there is subtyping.

And, for the descriptors, I think this problem can't happen? Doesn't a supertype of a descriptor need to be a descriptor?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like our last comments here raced. Are you saying this might happen elsewhere?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, I realized after I wrote the opening comment that there's no problem with this particular code. But e.g. using $A.desc should not force $B.desc to be kept around.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, yes, now I see. Fixed in the last commit: now that we have proper descriptor subtype propagation (in the code after it in the source), we can propagate describees only to subtypes, allowing that optimization.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should still fix the propagator not to propagate descriptor information through types that don't have descriptors, though.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in d0ec85a

I don't quite see how to test that, though, but it is at least good for efficiency.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might become more important in CFP?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, yeah, other passes might notice, good point.

}

// Propagate.
descPropagator.propagateToSuperAndSubTypes(map);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should still fix the propagator not to propagate descriptor information through types that don't have descriptors, though.

Copy link
Member

@tlively tlively left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comments on tlively's idea, let's see how GitHub renders them...

// that, we add a new type A2 for A.desc to describe, which keeps
// the property that A.desc and B.desc are a parent/child pair of
// descriptors, which is necessary for validation.
haveUnneededDescriptors[*described] = HeapType(described->getStruct());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need to copy $A's contents. We can always just use trivial empty structs (which means that haveUnneededDescriptors can remain a set).

Comment on lines 575 to 578
if (value) {
value = getTempHeapType(*value);
}
typeBuilder.setDescribed(i, value);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't quite right. You need to grow the type builder to make space for the new described type, put an empty struct in that new last entry of the type builder, set it to be described by the current type, and set the current type to describe the new type.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense - this is definitely not as simple as I'd hoped. We need to add some kind of new hook in TypeRewriter that lets the user grow the TypeBuilder, and also to somehow set up the right order (we can't append the described type after the descriptor).

I added a TODO for this.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh right, that ordering makes it extra tricky...

Comment on lines +53 to +55
// combine() a non-existent descriptor, we are doing unneeded work, but the
// data here is typically just a few bools, so it is simpler and likely
// faster to just copy those rather than check if the type has a descriptor.)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We no longer combine when the descriptor does not exist.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TypeHierarchyPropagator doesn't do so below, that's correct, but StructValuesMap still does.

Co-authored-by: Thomas Lively <[email protected]>
@kripken kripken merged commit 7ac88c4 into WebAssembly:main Sep 2, 2025
16 checks passed
@kripken kripken deleted the gto.desc branch September 2, 2025 20:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants