[Triton-MLIR][Frontend] Interface fixes for libdevice (#829)
- Unifying several interfaces that differ only in type into a single one, e.g. `fsub_ru` and `dsub_ru` -> `sub_ru`;
- Minor bug fix: `fast_pow` was incorrectly classified under the `pow` interface, although its arguments are the same as those of `powf`;
- Explicit interfaces for casting functions, e.g. splitting `ll2float_ru` into `ll2float_ru` and `ull2float_ru`;
- Removing interfaces that are not in NVIDIA's official documents, e.g. `fmaf_ieee_rn`, which is easily confused with `fmaf_rn`.

Co-authored-by: Keren Zhou <kerenzhou@openai.com>
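The unification described above can be pictured as a lookup from argument types to a typed libdevice symbol. The following is only a minimal sketch under stated assumptions, not the code from this commit: the table `SUB_RU_SYMBOLS`, the function `resolve_sub_ru`, and the string dtype names are hypothetical, while `__nv_fsub_ru` and `__nv_dsub_ru` are the corresponding single- and double-precision libdevice functions.

```python
# Hypothetical dispatch table: argument dtypes -> libdevice symbol.
# The table layout and dtype strings are assumptions made for illustration.
SUB_RU_SYMBOLS = {
    ("fp32", "fp32"): "__nv_fsub_ru",  # formerly exposed as `fsub_ru`
    ("fp64", "fp64"): "__nv_dsub_ru",  # formerly exposed as `dsub_ru`
}


def resolve_sub_ru(lhs_dtype: str, rhs_dtype: str) -> str:
    """Return the libdevice symbol a unified `sub_ru` would call (sketch)."""
    try:
        return SUB_RU_SYMBOLS[(lhs_dtype, rhs_dtype)]
    except KeyError:
        # Reject type combinations that have no typed counterpart.
        raise TypeError(
            f"sub_ru: unsupported argument types ({lhs_dtype}, {rhs_dtype})"
        ) from None
```

With a table like this, callers write a single name and the element type picks the typed function, so the `f`/`d` prefix no longer needs to appear in user code.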
@@ -13,8 +13,8 @@ def math_kernel(x1_ptr, x2_ptr, x3_ptr, x4_ptr, n, BLOCK_SIZE: tl.constexpr):

     y1 = tl.sin(x1)
     y2 = tl.libdevice.sin(x2)
-    y3 = tl.libdevice.fdiv_rn(x3, x3)
-    y4 = tl.libdevice.fmaf_rd(x4, x4, x4)
+    y3 = tl.libdevice.div_rn(x3, x3)
+    y4 = tl.libdevice.fma_rd(x4, x4, x4)

     tl.store(x1_ptr + offsets, y1, mask=offsets < n)
     tl.store(x2_ptr + offsets, y2, mask=offsets < n)
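Kernels written against the old names need the same mechanical rename as the test above. A small helper along the following lines could apply it. This is a hypothetical sketch, not part of Triton: `RENAMES` and `migrate_libdevice_calls` are names invented here, and the table lists only the renames that appear in this commit.

```python
import re

# Old name -> unified name. Only renames named in this commit are listed;
# the full set would have to be taken from the libdevice frontend itself.
RENAMES = {
    "fsub_ru": "sub_ru",
    "dsub_ru": "sub_ru",
    "fdiv_rn": "div_rn",
    "fmaf_rd": "fma_rd",
}

# Match `libdevice.<old_name>` as a whole identifier only.
_PATTERN = re.compile(
    r"\blibdevice\.(" + "|".join(map(re.escape, RENAMES)) + r")\b"
)


def migrate_libdevice_calls(source: str) -> str:
    """Rewrite `libdevice.<old>` references in source text to the unified names."""
    return _PATTERN.sub(lambda m: "libdevice." + RENAMES[m.group(1)], source)
```

Calls that were not renamed, such as `tl.libdevice.sin(x2)`, pass through unchanged.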