Philippe Tillet
a0bab9748e
[OPTIMIZER] Coalesce pass no longer takes a num-warps
argument ( #99 )
...
Improved design to avoid inconsistent `num-warps` value between the pass and the parent module of the operation it processes.
2022-09-05 18:09:02 -07:00
Philippe Tillet
d0b4c67b05
[OPTIMIZER] Improved layout conversion simplification algorithm ( #97 )
...
This PR both simplifies the layout conversion simplification algorithm, and also improves it to make it work with vectorized element-wise ops. The conversion optimizer still has a lot of room for improvements, and other PRs will address its limitations (ideally via some sort of explicit cost model)
2022-09-02 16:52:44 -07:00
Shintaro Iwasaki
0ebef11c77
[TritonIR] Make mask operand optional ( #74 )
2022-08-22 22:00:17 -07:00
Da Yan
92ef552a54
[OPTIMIZER] Fix Num in AsyncWaitOp generated by the pipeline pass ( #72 )
2022-08-22 15:58:10 -07:00
Shintaro Iwasaki
9aa00249a6
[TritonIR] make other optional and remove isOtherUnspecified ( #67 )
...
[Triton] make other optional and remove isOtherUnspecified
2022-08-18 18:19:55 -07:00
Philippe Tillet
192be76b3c
[OPTIMIZER] Rewrite patterns for layout conversions ( #64 )
2022-08-18 12:49:37 -07:00
Da Yan
8776ad1a0e
[OPTIMIZER] Let the pipeline pass insert async wait. ( #63 )
2022-08-18 10:31:57 -07:00
Shintaro Iwasaki
d69ce77b19
[FRONTEND] add an attr for masked load without explicit other ( #55 )
2022-08-18 09:51:37 -07:00
Philippe Tillet
78ebbe24c7
[FRONTEND] Added ExpandDimsOp
primitive ( #36 )
2022-08-04 18:41:06 -07:00
Philippe Tillet
3236642e8f
[OPTIMIZER] Added memory coalescing pass ( #31 )
2022-07-31 20:59:31 -07:00
Philippe Tillet
d1593e6ca8
[TritonGPU] Improved documentation and semantics of layout encodings ( #30 )
2022-07-31 13:59:44 -07:00
Phil Tillet
65237f6117
[PACKAGING] Added FileCheck
2022-07-07 16:53:19 -07:00
Yan Da
26fcc12afd
better unit tests
2022-06-07 19:35:38 +08:00
Yan Da
0e11435448
more tests
2022-06-06 21:10:28 +08:00
Yan Da
7807f64ef3
rename sharded_layout => blocked_layout
2022-06-05 16:14:59 +08:00
Yan Da
bbf75b492f
more tests
2022-06-05 15:10:09 +08:00
Yan Da
d5eca56cf3
more TritonGPU unit tests
2022-06-05 14:25:09 +08:00