
geetu040
Contributor

@geetu040 geetu040 commented Feb 26, 2025

What does this PR do?

This PR

  • fixes SamVisionSdpaAttention: when output_attentions=True, it now falls back to the eager implementation
  • refactors the position-embedding logic of both attention layers into a single function
  • fixes TensorFlow-related tests: runs/13544413658
  • reflects these changes in TFSamVisionSdpaAttention and GotOcr2VisionAttention
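
A toy, framework-free sketch of the fallback described in the first bullet. It assumes only that a fused SDPA kernel returns the attention output and never the weights (as is the case for PyTorch's `scaled_dot_product_attention`). The function names are hypothetical and the real modules operate on tensors; this only illustrates the dispatch.

```python
import math


def eager_attention(q, k, v):
    """Explicit attention: builds the weight matrix, so it can return it."""
    scale = 1.0 / math.sqrt(len(q[0]))
    weights = []
    for q_row in q:
        scores = [scale * sum(a * b for a, b in zip(q_row, k_row)) for k_row in k]
        peak = max(scores)  # subtract the max for a numerically stable softmax
        exps = [math.exp(s - peak) for s in scores]
        total = sum(exps)
        weights.append([e / total for e in exps])
    dim_v = len(v[0])
    output = [
        [sum(w * v_row[d] for w, v_row in zip(w_row, v)) for d in range(dim_v)]
        for w_row in weights
    ]
    return output, weights


def fused_attention(q, k, v):
    """Hypothetical stand-in for a fused kernel: same output, no weights exposed."""
    output, _ = eager_attention(q, k, v)
    return output, None


def attention(q, k, v, output_attentions=False):
    # The fused path cannot hand back weights, so use the eager path when asked.
    if output_attentions:
        return eager_attention(q, k, v)
    return fused_attention(q, k, v)


q = [[1.0, 0.0], [0.0, 1.0]]
k = [[1.0, 0.0], [0.0, 1.0]]
v = [[1.0, 2.0], [3.0, 4.0]]
out_fused, w_fused = attention(q, k, v)
out_eager, w_eager = attention(q, k, v, output_attentions=True)
```

Both calls produce the same output; only the second one also returns the weights.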

Previously discussed here: #36248 (comment)

🚨 Breaking changes

The public add_decomposed_rel_pos method of the *Attention modules was replaced with get_decomposed_rel_pos, which returns the positional embedding instead of adding it to the attention weights.
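
A simplified sketch of what "return instead of add" means for callers. Only the name `get_decomposed_rel_pos` comes from this PR; the signature, the nested-list shapes, the missing batch dimension, and the `apply_bias` helper are assumptions made for illustration, and the per-axis terms are taken as already computed.

```python
def get_decomposed_rel_pos(rel_h, rel_w):
    """Combine per-axis terms into one bias indexed [q_h][q_w][k_h][k_w].

    rel_h[i][j][p] is the height term and rel_w[i][j][q] the width term for
    the query at position (i, j); both are assumed to be precomputed.
    """
    return [
        [
            [[h_term + w_term for w_term in rel_w[i][j]] for h_term in rel_h[i][j]]
            for j in range(len(rel_h[i]))
        ]
        for i in range(len(rel_h))
    ]


def apply_bias(scores, bias):
    """Hypothetical caller-side step: add the returned bias to the raw scores."""
    return [
        [
            [[s + b for s, b in zip(s_row, b_row)] for s_row, b_row in zip(s_q, b_q)]
            for s_q, b_q in zip(s_i, b_i)
        ]
        for s_i, b_i in zip(scores, bias)
    ]


# One query row, two query columns, a 2x2 key grid.
rel_h = [[[1.0, 2.0], [3.0, 4.0]]]
rel_w = [[[10.0, 20.0], [30.0, 40.0]]]
bias = get_decomposed_rel_pos(rel_h, rel_w)
scores = [[[[0.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]]]]
biased = apply_bias(scores, bias)
```

Returning the bias leaves the caller free to decide how to use it, for example adding it to explicit scores in an eager path or passing it on as an additive term elsewhere.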

Who can review?

@amyeroberts, @qubvel, @zucchini-nlp

@qubvel
Contributor

qubvel commented Feb 26, 2025

Hi @geetu040, thanks for opening the PR!

@qubvel
Contributor

qubvel commented Feb 26, 2025

run-slow: sam

@qubvel qubvel added the Vision label Feb 26, 2025
Contributor

This comment contains run-slow, running the specified jobs:

models: ['models/sam']
quantizations: [] ...

@geetu040
Contributor Author

geetu040 commented Mar 3, 2025

run-slow: sam

@geetu040
Contributor Author

geetu040 commented Mar 3, 2025

Hi @qubvel, I have fixed the failing tests at runs/13544413658. This PR is now ready for review. Could you please run the slow tests again?

Contributor

@qubvel qubvel left a comment

Thanks for fixing!

run-slow: sam

A question:

Comment on lines +1415 to +1417
counts += (
    [cur_idxs[0].numpy().item()] + btw_idxs.numpy().tolist() + [height * width - cur_idxs[-1].numpy().item()]
)
Contributor

Why do we need to add .numpy()?

Contributor Author

Because TensorFlow tensors don't have a direct item() method; to access the scalar value, we need to call numpy().item().
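
The arithmetic in the snippet under review can be sketched without any tensor library, which shows why each term ends up as a plain Python integer. The names `cur_idxs` and `btw_idxs` are borrowed from the snippet; the function `rle_counts`, the flat 0/1 list input, and the handling of masks with no value change are assumptions for illustration, not the library's code.

```python
def rle_counts(flat_mask):
    """Run lengths of a non-empty flat 0/1 mask, starting with the run of zeros."""
    total = len(flat_mask)
    # Positions where the value changes, shifted by one so they mark run starts.
    cur_idxs = [i + 1 for i in range(total - 1) if flat_mask[i] != flat_mask[i + 1]]
    if not cur_idxs:
        # A constant mask has a single run; prepend an empty zero-run if it is all ones.
        return [total] if flat_mask[0] == 0 else [0, total]
    # Distances between consecutive change points are the lengths of the inner runs.
    btw_idxs = [b - a for a, b in zip(cur_idxs, cur_idxs[1:])]
    counts = [] if flat_mask[0] == 0 else [0]
    counts += [cur_idxs[0]] + btw_idxs + [total - cur_idxs[-1]]
    return counts


example = rle_counts([0, 0, 1, 1, 1, 0])
```

With tensors instead of lists, `cur_idxs[0]` and `cur_idxs[-1]` are scalar tensors rather than integers, which is where the conversion discussed in this thread comes in.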

Contributor

ok! just wondering if it was broken before

Contributor Author

There are a bunch of other TensorFlow bugs, and I don't know why they did not appear in the workflows before, except in #36493. I have checked different versions of TensorFlow, and they seem to be genuine bugs across all versions.

Contributor

@qubvel qubvel Mar 3, 2025

OK, I'm not sure whether TF Sam is actually used anywhere, so there might indeed be some bugs!

@qubvel
Contributor

qubvel commented Mar 3, 2025

run-slow: sam

Contributor

github-actions bot commented Mar 3, 2025

This comment contains run-slow, running the specified jobs:

models: ['models/sam']
quantizations: [] ...

@geetu040
Contributor Author

geetu040 commented Mar 3, 2025

@qubvel the custom tests pass now; the failing ones seem unrelated.

@qubvel
Contributor

qubvel commented Mar 3, 2025

Nice! Thanks for fixing

@qubvel
Contributor

qubvel commented Mar 3, 2025

Let's wait for those tests to be fixed on main, and then merge it.

@qubvel
Contributor

qubvel commented Mar 4, 2025

cc @ArthurZucker for approval: changing add_decomposed_rel_pos to get_decomposed_rel_pos, which is a public method of a module

@geetu040
Contributor Author

geetu040 commented Mar 6, 2025

@ArthurZucker a soft ping, since this blocks the failing tests in other PRs: #36248, #36493

@geetu040
Contributor Author

Hi @qubvel, would it be possible to ping someone else for this? It seems that @ArthurZucker is quite busy and might not have had a chance to review it.

Since this is blocking other PRs (#36248 and #36493), it would be great if we could get it merged. Otherwise, if this will take some time, I can merge this branch into the other PRs and continue working from there.

@qubvel
Contributor

qubvel commented Mar 13, 2025

cc @molbap or @zucchini-nlp if you have bandwidth

Member

@zucchini-nlp zucchini-nlp left a comment

This is breaking, no? We had add_decomposed_rel_pos as a public method and now we're removing it completely.

I am OK with breaking it; I don't think anyone was actually calling it separately. But we can add a 🔴 to the PR title.

  else:
      kwarg_value = "__empty__"
- if kwarg_value != "__empty__":
+ if not isinstance(kwarg_value, str) or kwarg_value != "__empty__":
Member

I wonder why this was needed? Even if kwarg_value is a float, checking kwarg_value != "__empty__" is enough, isn't it?

Contributor Author

This change was necessary because kwarg_value cannot be directly compared to "__empty__" if it is a TensorFlow tensor. Attempting such a comparison results in a TypeError:

import tensorflow as tf

tf.Variable([1, 2, 3, 4]) == "__empty__"
# TypeError: Cannot convert '__empty__' to EagerTensor of dtype int32

This issue occurred in the test suite test_modeling_tf_sam.py::TFSamModelIntegrationTest (slow tests for sam) within this workflow
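
A self-contained sketch of why the `isinstance` guard avoids the error, using no TensorFlow at all. `FakeTensor` is a hypothetical stand-in whose `==` refuses strings, mimicking the behaviour reported above; `is_set_naive` and `is_set_guarded` are illustrative names, not functions from the library.

```python
class FakeTensor:
    """Hypothetical stand-in for a tensor whose == refuses to compare with a str."""

    def __init__(self, values):
        self.values = values

    def __eq__(self, other):
        if isinstance(other, str):
            raise TypeError(f"Cannot convert {other!r} to FakeTensor")
        return self.values == getattr(other, "values", other)

    def __ne__(self, other):
        return not self.__eq__(other)


EMPTY = "__empty__"


def is_set_naive(kwarg_value):
    # Compares every value to the sentinel, including tensor-like objects.
    return kwarg_value != EMPTY


def is_set_guarded(kwarg_value):
    # `or` short-circuits: non-str values are never compared to the sentinel.
    return not isinstance(kwarg_value, str) or kwarg_value != EMPTY


try:
    is_set_naive(FakeTensor([1, 2, 3, 4]))
    naive_raised = False
except TypeError:
    naive_raised = True

guarded_result = is_set_guarded(FakeTensor([1, 2, 3, 4]))
```

Because only strings can equal the sentinel, skipping the comparison for every other type does not change the result for floats, ints, or ordinary strings.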

Member

Interesting, thanks for the explanation.

@geetu040 geetu040 changed the title from "Fix sdpa in sam and refactor relative position embeddings" to "🔴 Fix sdpa in sam and refactor relative position embeddings" Mar 14, 2025
@geetu040 geetu040 requested a review from zucchini-nlp March 17, 2025 00:33
Member

@zucchini-nlp zucchini-nlp left a comment

Oh yeah, sorry, meant to hit approve! LGTM as long as qubvel is okay with the changes.

@qubvel qubvel changed the title from "🔴 Fix sdpa in sam and refactor relative position embeddings" to "🚨🚨🚨 Fix sdpa in sam and refactor relative position embeddings" Mar 17, 2025
@qubvel
Contributor

qubvel commented Mar 17, 2025

Added a section to the PR description describing the breaking change; merging.

@qubvel qubvel merged commit c53d53d into huggingface:main Mar 17, 2025
21 checks passed
kmehant pushed a commit to kmehant/transformers that referenced this pull request Mar 17, 2025
…gface#36422)

* fall back to eager if output_attentions

* improve relative position embeddings

* run modular on got_ocr2

* run-slow: sam

* fix run-length encoding

* fix tf processor errors

* update tf_sam

* fix compile error

* re-run tests

Signed-off-by: Mehant Kammakomati <[email protected]>