Keren Zhou
|
baba98ad69
|
[Triton-MLIR] Fix threadsPerWarp derivation in BlockedEncodingAttr (#722)
Example:
```
auto encoding = triton::gpu::BlockedEncodingAttr::get(
    &getContext(), {32, 8}, {2, 2}, {1, 0}, 2);
// shape = [32 x 8], sizePerThread = [2, 2], order = [1, 0], numWarps = 2
```
Expected output:
```
//#triton_gpu.blocked_layout<{
// sizePerThread = {2, 2}
// threadsPerWarp = {8, 4}
// warpsPerCTA = {2, 1}
//}>
```
Incorrect output on the current branch:
```
//#triton_gpu.blocked_layout<{
// sizePerThread = {2, 2}
// threadsPerWarp = {16, 2}
// warpsPerCTA = {2, 1}
//}>
```
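For reference, the expected values can be reproduced with a standalone sketch of a greedy derivation that visits dimensions in `order` (fastest-varying first). This is an illustration only, not the Triton builder itself: `deriveBlockedLayout`, `BlockedLayout`, and the `warpSize = 32` default are hypothetical names and assumptions introduced here, and the 32 x 8 shape is taken from the comment in the example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical result type for this sketch (not part of Triton).
struct BlockedLayout {
  std::vector<unsigned> threadsPerWarp;
  std::vector<unsigned> warpsPerCTA;
};

// Hypothetical helper: greedily hand out lanes and warps, walking the
// dimensions in `order` so the fastest-varying dimension is served first.
BlockedLayout deriveBlockedLayout(const std::vector<int64_t> &shape,
                                  const std::vector<unsigned> &sizePerThread,
                                  const std::vector<unsigned> &order,
                                  unsigned numWarps,
                                  unsigned warpSize = 32) {
  assert(shape.size() == sizePerThread.size() && shape.size() == order.size());
  const size_t rank = shape.size();
  BlockedLayout layout{std::vector<unsigned>(rank, 1),
                       std::vector<unsigned>(rank, 1)};
  unsigned remainingLanes = warpSize;
  unsigned remainingWarps = numWarps;
  unsigned remainingThreads = numWarps * warpSize;
  for (unsigned dim : order) {
    // A dimension cannot use more threads than shape / sizePerThread.
    unsigned maxThreads = std::max<unsigned>(
        1u, static_cast<unsigned>(shape[dim] / sizePerThread[dim]));
    unsigned threadsPerCTA = std::min(remainingThreads, maxThreads);
    // Fill lanes within a warp first, then spill into extra warps.
    unsigned lanes = std::min(threadsPerCTA, remainingLanes);
    unsigned warps = std::min(threadsPerCTA / lanes, remainingWarps);
    layout.threadsPerWarp[dim] = lanes;
    layout.warpsPerCTA[dim] = warps;
    remainingLanes /= lanes;
    remainingWarps /= warps;
    remainingThreads /= threadsPerCTA;
  }
  return layout;
}
```

Under these assumptions, `deriveBlockedLayout({32, 8}, {2, 2}, {1, 0}, 2)` yields threadsPerWarp = {8, 4} and warpsPerCTA = {2, 1}: dimension 1 is visited first and can only use 8 / 2 = 4 lanes, which leaves 8 lanes and both warps for dimension 0.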
|
2022-09-27 16:41:30 -07:00 |
|