
Conversation

@lj970926 (Contributor)

PR Category

Custom Device

PR Types

Bug fixes

Description

  1. Fix bugs in the depthwise conv tests.
  2. Change the default quant type of all related Ops.

{"XPU_PADDLE_FC_INT32_WITH_LL", XPUFCCalcType::FC_INT32_WITH_LL},
};
#ifdef PADDLE_WITH_XPU_XRE5
auto default_calc_type = XPUFCCalcType::FC_FLOAT;

@lj970926 (Contributor Author)

Under TF32 a large number of XPU unit tests currently fail, so for the FP32 dtype we default to FP32 quantization for now.

Contributor

Please add a TODO comment here in the next PR, otherwise it will be easy to forget later.
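
For example (hypothetical wording, not part of this PR), the TODO could sit right above the default:

#ifdef PADDLE_WITH_XPU_XRE5
// TODO(lj970926): switch the FP32 default back to TF32 quantization once the
// XPU unit tests that currently fail under TF32 are fixed.
auto default_calc_type = XPUFCCalcType::FC_FLOAT;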

@HarperCy (Contributor) left a comment

LGTM

@cqulilujia (Contributor) left a comment

LGTM

};

template <typename T>
XPUFCCalcType FCCalcType() {

Contributor

The original version determined the quant type by walking through a long if-else-if-else chain in order. With the new approach based on an unordered_map, could the behavior change in some very special cases (for example, when several environment variables are set at the same time)?

@lj970926 (Contributor Author)

Using an unordered_map can indeed make the priority among different environment variables ambiguous. How about I replace the unordered_map with a vector in the next PR, so that the priority among the environment variables is explicit?

Besides, the previous env-var selection logic was actually also problematic: for example, the BF16 dtype could end up with an int32-style quantization that BF16 does not support at all, which could hit an unimplemented error. So even with a fixed priority, the new version may not behave exactly the same as before.
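
For illustration, a minimal sketch of what the proposed vector-based priority list could look like (hypothetical, not the actual follow-up patch; only the two environment variables visible in this diff are shown, the rest are placeholders, and a real version would also filter out calc types the dtype does not support, as noted above):

#include <cstdlib>
#include <string>
#include <utility>
#include <vector>

// Stand-in for Paddle's real enum, included only so this sketch is self-contained.
enum class XPUFCCalcType { FC_FLOAT, FC_INT32_WITH_LL /* , ... */ };

XPUFCCalcType FCCalcTypeFromEnv(XPUFCCalcType default_calc_type) {
  // Ordered from highest to lowest priority: the first env var found wins.
  static const std::vector<std::pair<std::string, XPUFCCalcType>> kPriority = {
      {"XPU_PADDLE_FC_FLOAT", XPUFCCalcType::FC_FLOAT},
      {"XPU_PADDLE_FC_INT32_WITH_LL", XPUFCCalcType::FC_INT32_WITH_LL},
      // ... remaining env vars in explicit priority order
  };
  for (const auto& entry : kPriority) {
    if (std::getenv(entry.first.c_str()) != nullptr) {
      return entry.second;
    }
  }
  return default_calc_type;
}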

Contributor

Got it. So taking this opportunity to clean up a clearer set of priorities and selection rules is a good thing; it does not have to match the old behavior exactly.

@paddle-ci-bot commented Feb 10, 2025

Sorry to inform you that 2fdd753's CIs have passed for more than 7 days. To prevent PR conflicts, you need to re-run all CIs manually.



@contextlib.contextmanager
def xpu_matmul_quant_type_guard(dtype):

Contributor

I don't quite understand what this does. Why does it only care about the XPU_PADDLE_FC_FLOAT environment variable? Also, regarding the other unit-test changes, when does a test need to be wrapped in this guard and when not?

@lj970926 (Contributor Author)

This is mainly for unit tests like the FA ones, where FC is only used to compute the baseline and has nothing to do with the precision of the op under test. Wrapping them in this guard makes the baseline computation use the highest-precision quantization throughout, so the test thresholds do not need to be changed.

For now it only handles FP32 quantization; later, depending on the need, the guard could also be extended to specify other quant types.
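
For reference, a rough sketch of how such a guard could be structured (an assumption for illustration; not necessarily how this PR implements xpu_matmul_quant_type_guard):

import contextlib
import os


@contextlib.contextmanager
def xpu_matmul_quant_type_guard(dtype):
    # Sketch only: force FP32 (highest-precision) FC quantization while the
    # baseline is computed, then restore the previous environment. Assumes the
    # XPU backend re-reads XPU_PADDLE_FC_FLOAT when the kernel is launched.
    if dtype != "float32":  # assumed: only FP32 baselines need the override
        yield
        return
    old = os.environ.get("XPU_PADDLE_FC_FLOAT")
    os.environ["XPU_PADDLE_FC_FLOAT"] = "1"
    try:
        yield
    finally:
        if old is None:
            os.environ.pop("XPU_PADDLE_FC_FLOAT", None)
        else:
            os.environ["XPU_PADDLE_FC_FLOAT"] = old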

@SigureMo (Member) left a comment

LGTMeow 🐾

@QingshuChen merged commit efd4815 into PaddlePaddle:develop Mar 7, 2025
32 of 33 checks passed
YqGe585 pushed a commit to YqGe585/Paddle that referenced this pull request May 7, 2025
…addlePaddle#70859)

* [XPU] fix bugs of depthwise conv test and change default quant type

* fix typo

* change default quant to float for fp32

* fp32 use tf32

* fix some ci bugs

* fix more ci bugs