Simple modification #2385
Conversation
Summary of Changes
Hello @pass-lin, I'm Gemini Code Assist[1]! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
This pull request enhances the model loading capabilities by documenting ModelScope as a valid preset source and improves the computational efficiency of the SAM model by enabling bfloat16 inference. Additionally, it refines the core tensor padding utility and related preprocessing layers, ensuring more consistent and predictable data handling across the library.
Highlights
- ModelScope Integration: Added ModelScope as a supported download source for presets in the `Backbone.from_preset` method documentation, expanding the range of accessible models.
- BF16 Inference for SAM Model: Enabled bfloat16 (bf16) inference support for the SAM (Segment Anything Model) by passing the `dtype` to the `RandomFrequencyPositionalEmbeddings` layer in the prompt encoder.
- Improved Tensor Padding Logic: Refactored the `pad` utility function to provide more robust and explicit handling of left-side padding, ensuring correct tensor shapes and values. This includes changes to how padding is applied and the introduction of an `axis` parameter.
- Preprocessing Layer Refinements: Made minor adjustments to `MultiSegmentPacker` by simplifying the shape definition, and to `StartEndPacker` by ensuring `pad_value` defaults to 0 if not provided.
- Test Case Updates: Updated various test cases for `MultiSegmentPacker` and `StartEndPacker` to align with the refined padding logic, ensuring the expected outputs match the new behavior.
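As a sketch of the padding behavior these changes touch (a minimal illustration assuming TensorFlow; the tensor values are made up for demonstration, not taken from the PR), right padding and left padding of a ragged batch can be produced like this:

```python
import tensorflow as tf

# A ragged batch of token IDs with different row lengths.
x = tf.ragged.constant([[1, 2, 3], [4]])

# Right padding: to_tensor fills trailing positions with the default value.
right = x.to_tensor(default_value=0, shape=[2, 5])

# Left padding via the reverse/pad/reverse trick: reverse each row,
# right-pad, then reverse back so the fill values end up on the left.
left = x[..., ::-1].to_tensor(default_value=0, shape=[2, 5])[..., ::-1]

print(right.numpy().tolist())  # [[1, 2, 3, 0, 0], [4, 0, 0, 0, 0]]
print(left.numpy().tolist())   # [[0, 0, 1, 2, 3], [0, 0, 0, 0, 4]]
```

The reverse/pad/reverse trick is the same strategy the review below recommends restoring, since `to_tensor` only supports trailing fill.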
Using Gemini Code Assist
The full guide for Gemini Code Assist can be found on our documentation page; here are some quick tips.
Invoking Gemini
You can request assistance from Gemini at any point in your pull request by creating an issue comment (i.e. a comment on the pull request page) using either `/gemini <command>` or `@gemini-code-assist <command>`. Below is a summary of the supported commands.
| Feature | Command | Description |
|---|---|---|
| Code Review | `/gemini review` | Performs a code review for the current pull request in its current state. |
| Pull Request Summary | `/gemini summary` | Provides a summary of the current pull request in its current state. |
| Comment | `@gemini-code-assist` | Responds in comments when explicitly tagged, both in issue comments and review comments. |
| Help | `/gemini help` | Displays a list of available commands. |
Customization
To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by adding files to a `.gemini/` folder in the base of the repository. Detailed instructions can be found here.
Limitations & Feedback
Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.
You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.
Footnotes
1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double-check it and use code with caution.
Code Review
This pull request introduces two main changes: adding ModelScope as a download source and enabling bfloat16 inference for the SAM model. The changes for these features are well-implemented. However, there are significant issues with the modifications to the padding logic in `tensor_utils.py`. The new implementation for left-padding is incorrect and effectively performs right-padding, which is a critical bug. Several tests have been updated to match this incorrect behavior, and they should be reverted along with the fix. Additionally, a related change in `start_end_packer.py` introduces a bug for string data types.
```diff
 def pad(x, shape, padding_side, pad_value, axis=-1):
     if padding_side == "left" and pad_value is not None:
         x = x[..., ::-1]

-    outputs = x.to_tensor(
-        default_value=pad_value,
-        shape=shape,
-    )
+    if padding_side == "left":
+        outputs = x.to_tensor(
+            default_value=pad_value,
+        )
+        outputs = outputs[..., ::-1]
+        padding_shape = [tf.shape(outputs)[0]] + [1] * (len(outputs.shape) - 1)
+        padding_shape[axis] = shape[axis] - tf.shape(outputs)[axis]
+        padding_shape = tf.cast(padding_shape, "int64")
+        padding = tf.fill(padding_shape, pad_value)
+        padding = tf.cast(padding, outputs.dtype)
+        outputs = tf.concat([outputs, padding], axis=axis)
+    else:
+        outputs = x.to_tensor(
+            default_value=pad_value,
+            shape=tf.cast(shape, "int64"),
+        )
     return outputs
```
The updated implementation of `pad` for `padding_side="left"` is incorrect. It appears to perform right-padding instead of left-padding. The logic first left-pads the tensor to the maximum sequence length within the batch, and then concatenates additional padding to the right to match the target `shape`. This results in mixed padding, not pure left-padding.
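A minimal reproduction sketch of that mixed-padding outcome (hypothetical values, assuming TensorFlow and an integer `pad_value`; `pad_new` below mirrors the PR's left-padding branch for illustration and is not a function from the library):

```python
import tensorflow as tf

def pad_new(x, shape, pad_value, axis=-1):
    # Mirrors the PR's left-padding branch: reverse, pad to the batch
    # max length, reverse back, then concatenate extra padding on the right.
    x = x[..., ::-1]
    outputs = x.to_tensor(default_value=pad_value)
    outputs = outputs[..., ::-1]
    padding_shape = [tf.shape(outputs)[0]] + [1] * (len(outputs.shape) - 1)
    padding_shape[axis] = shape[axis] - tf.shape(outputs)[axis]
    padding = tf.fill(tf.cast(padding_shape, "int64"), pad_value)
    padding = tf.cast(padding, outputs.dtype)
    return tf.concat([outputs, padding], axis=axis)

x = tf.ragged.constant([[1, 2, 3], [4]])
out = pad_new(x, shape=[2, 5], pad_value=0)
# Row 2 comes out padded on both sides: [0, 0, 4, 0, 0]
# instead of the pure left-padded [0, 0, 0, 0, 4].
print(out.numpy().tolist())
```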
This bug causes tests in `multi_segment_packer_test.py` and `start_end_packer_test.py` to fail, and they have been modified to expect this incorrect right-padding behavior. The tests should be reverted to their original state once this padding logic is corrected.
To fix this, the previous implementation strategy of reversing the tensor, padding, and then reversing back should be restored. This is a standard and correct way to achieve left-padding.
Here is a suggested implementation that restores the correct behavior:
```python
def pad(x, shape, padding_side, pad_value, axis=-1):
    shape = tf.cast(shape, "int64")
    if padding_side == "left":
        # Reverse, pad, and reverse back is a standard way to do left padding.
        x = x[..., ::-1]
        outputs = x.to_tensor(
            default_value=pad_value,
            shape=shape,
        )
        outputs = outputs[..., ::-1]
    else:  # padding_side == "right"
        outputs = x.to_tensor(
            default_value=pad_value,
            shape=shape,
        )
    return outputs
```
```diff
@@ -141,7 +141,7 @@ def check_special_value_type(value, value_name):
         self.start_value = start_value
         self.end_value = end_value

-        self.pad_value = pad_value
+        self.pad_value = pad_value or 0
```
This change introduces a bug for string-based tensors. When `pad_value` is `None`, it will be set to `0`. If the layer is then used with string inputs, this will cause a `TypeError` during padding, as you cannot pad a string tensor with an integer `0`.
The previous implementation was correct. The default value for padding should be handled where the tensor's `dtype` is known, or be left as `None` for the underlying padding function to handle (which `tf.RaggedTensor.to_tensor` does correctly, using `0` for numeric types and `""` for strings).
```diff
-        self.pad_value = pad_value or 0
+        self.pad_value = pad_value
```
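To illustrate the default-fill behavior described above (a small sketch assuming TensorFlow, with made-up values): `RaggedTensor.to_tensor` with no `default_value` fills numeric tensors with `0` and string tensors with the empty string, so leaving `pad_value` as `None` keeps string inputs working.

```python
import tensorflow as tf

# With default_value unset, to_tensor picks a dtype-appropriate fill:
# 0 for numeric tensors, "" for string tensors.
ints = tf.ragged.constant([[1, 2], [3]]).to_tensor()
strs = tf.ragged.constant([["a"], ["b", "c"]]).to_tensor()

print(ints.numpy().tolist())  # [[1, 2], [3, 0]]
print(strs.numpy().tolist())  # [[b'a', b''], [b'b', b'c']]
```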
Mainly two simple modifications