[XPU] Add layernorm_relu pass and kernel #68451
Conversation
Sorry to inform you that 7695d5e's CIs have passed for more than 7 days. To prevent PR conflicts, you need to re-run all CIs manually.
Force-pushed 956209b to 6eb5f41
lgtm
Force-pushed a1fff89 to 42c442f
if __name__ == "__main__":
    np.random.seed(200)
    unittest.main()
from paddle.framework import core

if core.get_xpu_device_version(0) == core.XPUVersion.XPU2:
    unittest.main()
done
from paddle.framework import core
@unittest.skipIf(
    not core.get_xpu_device_version(0) == core.XPUVersion.XPU2,
    "XpuLayerNormReluFuse only support XPU2",
)
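For context, the suggested decorator would sit on the test class so the whole class is skipped on non-XPU2 devices; a minimal sketch, with the class name TestLayerNormReluFusePass assumed for illustration only:

import unittest

from paddle.framework import core

@unittest.skipIf(
    not core.get_xpu_device_version(0) == core.XPUVersion.XPU2,
    "XpuLayerNormReluFuse only support XPU2",
)
class TestLayerNormReluFusePass(unittest.TestCase):
    # Test body omitted; the guard skips every test in this class
    # when device 0 is not an XPU2.
    pass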
done
Force-pushed 36380fe to dff6f2d
PR Category
Performance Optimization

PR Types
New features

Description
Adds a fused operator that combines layer_norm and relu into a single op. Measured on the XPU platform, this shows a performance improvement.
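As a rough illustration of the pattern being fused, a minimal sketch of the unfused subgraph the pass is expected to match; shapes and values here are illustrative, and the fusion itself happens inside the pass and kernel added by this PR, not in user code:

import paddle
import paddle.nn.functional as F

# Unfused reference pattern: layer_norm immediately followed by relu.
x = paddle.randn([4, 64], dtype="float32")
weight = paddle.ones([64])
bias = paddle.zeros([64])

out = F.layer_norm(x, normalized_shape=[64], weight=weight, bias=bias)
out = F.relu(out)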