Keren Zhou
02ebf24d35
Analyze shared memory alias ( #81 )
This PR analyzes shared memory aliases so that we can fix memory
allocation bugs and reduce memory allocations in Triton code
involving complex control flow.
Changes to the memory barrier and allocation are on the way.
Co-authored-by: Philippe Tillet <phil@openai.com>
2022-08-29 10:43:20 -07:00
Philippe Tillet
83287d7193
[CI] enable self-hosted runner ( #85 )
2022-08-25 19:12:16 -07:00
goostavz
bedbf221c0
[BACKEND] Support optional mask in TritonGPUToLLVM ( #80 )
Co-authored-by: gzhu <gzhu@nvidia.com>
2022-08-24 17:51:37 -07:00
Shintaro Iwasaki
84aa7d025a
[TritonIR] simplify Load/StoreOps when mask is true/false ( #79 )
* [TritonIR] fix Load/Store/CopyAsyncOp's parsers
* [TritonIR] simplify Load/StoreOps when mask is true/false
* [TEST] adds tests to check load/store simplification
2022-08-24 12:55:49 -07:00
Yan Chunwei
1b513c9866
[BACKEND] Refactoring codegen for LoadOp with PTXFormat ( #77 )
This PR does the following:
* Enhance PTXFormat by
  * introducing PTXBuilder to enable multiple instructions in a single asm program
  * overriding PTXInstr's operator() method to enable the instr(opr0, opr1) style of setting operands for an instruction
* Refactor the PTX code used in LoadOpConversion with PTXFormat
Authored-by: goostavz <gzhu@nvidia.com>
2022-08-23 15:51:13 -07:00
Shintaro Iwasaki
0ebef11c77
[TritonIR] Make mask operand optional ( #74 )
2022-08-22 22:00:17 -07:00
goostavz
de2dd04c8a
[BACKEND] two minor bugfix on StoreOpLowering and kernel launch & support optional other in LoadOpLowering ( #69 )
* [BACKEND] two minor bugfix on StoreOpLowering and kernel launch & support optional other in LoadOpLowering
* Clean code
Co-authored-by: goostavz <gzhu@nvidia.com>
Co-authored-by: Yan Chunwei <yanchunwei@outlook.com>
2022-08-22 21:47:09 -07:00
Da Yan
92ef552a54
[OPTIMIZER] Fix Num in AsyncWaitOp generated by the pipeline pass ( #72 )
2022-08-22 15:58:10 -07:00
Yan Chunwei
10ba51c3bb
[FRONTEND] add python e2e launch empty kernel test ( #68 )
2022-08-19 10:46:01 -07:00
Shintaro Iwasaki
9aa00249a6
[TritonIR] make other optional and remove isOtherUnspecified ( #67 )
[Triton] make other optional and remove isOtherUnspecified
2022-08-18 18:19:55 -07:00
Philippe Tillet
192be76b3c
[OPTIMIZER] Rewrite patterns for layout conversions ( #64 )
2022-08-18 12:49:37 -07:00
Keren Zhou
e0bedeb44c
[BACKEND] Keren/shared memory barrier ( #59 )
2022-08-18 12:32:57 -07:00
Da Yan
8776ad1a0e
[OPTIMIZER] Let the pipeline pass insert async wait. ( #63 )
2022-08-18 10:31:57 -07:00
Shintaro Iwasaki
d69ce77b19
[FRONTEND] add an attr for masked load without explicit other ( #55 )
2022-08-18 09:51:37 -07:00
goostavz
fc58250a06
[BACKEND] Add backend support of arith::AddIOp, arith::AddFOp, GetProgramIdOp & GEPOp and bugfix for SplatOp, StoreOp, FuncOp ( #60 )
Add backend support of arith::AddIOp, arith::AddFOp, GetProgramIdOp, GEPOp and bugfix for SplatOp, StoreOp, FuncOp
Co-authored-by: gzhu <gzhu@nvidia.com>
2022-08-18 20:46:45 +08:00
Yan Chunwei
b1673caaf6
[FRONTEND] Expose end-to-end compile to python frontend ( #58 )
2022-08-17 10:42:48 -07:00
Yan Chunwei
95bbac41e7
[BACKEND] Add LLVM-translation for store and splat ops ( #47 )
2022-08-15 00:46:37 -07:00
goostavz
993ba7035a
[BACKEND] Codegen bringup, index calculation of blocked_layout & support of LoadOp, BroadcastOp, ViewOp & MakeRangeOp ( #38 )
Co-authored-by: gzhu <gzhu@nvidia.com>
2022-08-14 19:58:59 -07:00
Da Yan
e5ec8e16ea
[BUILD] Fix setup.py ( #45 )
2022-08-13 16:38:31 -07:00
Shintaro Iwasaki
d5856435d7
[CI] explicitly run unit tests ( #54 )
2022-08-12 13:39:04 -07:00
Shintaro Iwasaki
2ba9a83465
[BUILD] fix minor issues with MLIR assert enabled ( #46 )
2022-08-11 21:20:47 -07:00
Philippe Tillet
3a48ca0d4d
[BUILD] Fix includes ( #49 )
2022-08-11 11:49:29 -07:00
Yan Chunwei
83ef74f248
[BACKEND] Extracting numWarps from tritonGPU module ( #39 )
2022-08-08 09:40:20 -07:00
Yan Chunwei
920723cf3d
[BACKEND] add triton-translate to translate mlir to llvmir or PTX code ( #37 )
2022-08-07 22:34:36 -07:00
Philippe Tillet
490d34e0d5
[FRONTEND] Fixed python bindings link options ( #40 )
2022-08-07 13:09:12 -07:00
Philippe Tillet
78ebbe24c7
[FRONTEND] Added ExpandDimsOp primitive ( #36 )
2022-08-04 18:41:06 -07:00
Keren Zhou
a7b49b3227
[BACKEND] Memory allocation ( #33 )
2022-08-04 11:22:49 -07:00
Yan Chunwei
b988bae813
Init TritonGPU to LLVM dialect conversion ( #32 )
* add toLLVM pass
* update num-warps setting in mlir
2022-08-04 10:15:45 +08:00
Philippe Tillet
3236642e8f
[OPTIMIZER] Added memory coalescing pass ( #31 )
2022-07-31 20:59:31 -07:00
Philippe Tillet
d1593e6ca8
[TritonGPU] Improved documentation and semantics of layout encodings ( #30 )
2022-07-31 13:59:44 -07:00
Yan Chunwei
e02c82c765
[TritonIR] Convert Triton dialect's Combine pass to MLIR DRR based ( #16 )
2022-07-27 12:50:08 -07:00
Philippe Tillet
432c3df265
[BUILD] MacOS can now build compiler and run MLIR tests ( #25 )
2022-07-27 01:32:10 -07:00
Philippe Tillet
6d62d88d4f
[CI] run clang-format ( #24 )
2022-07-26 17:25:03 -07:00
Philippe Tillet
25357083e6
[CI] Added basic CI skeletons ( #23 )
Includes minor fixes to make things compile and pass static checks properly
2022-07-26 14:16:30 -07:00
Philippe Tillet
3265e0df5a
[PYTHON] Cleaned up legacy code; added simple standalone compilation API ( #22 )
2022-07-26 11:06:45 -07:00
Keren Zhou
96cc6fb563
[TritonGPU] Pretty printer for layouts ( #21 )
2022-07-26 10:50:11 -07:00
Philippe Tillet
27c9f3d8cb
[FRONTEND] Added comment on TensorSizeTrait::maxElement ( #20 )
2022-07-25 01:18:45 -07:00
Keren Zhou
7eda373a12
Add lit dependency ( #9 )
2022-07-24 19:14:52 -07:00
Philippe Tillet
a633d2b403
[Analysis] Added Axis Info Analysis ( #8 )
2022-07-19 13:38:48 -07:00
Philippe Tillet
df940aaab0
Merge pull request #7 from openai/broadcastAxis-fix
Fix blocked layout parser
2022-07-15 08:39:49 -07:00
Yan Da
63e6a85901
Fix blocked layout parser
2022-07-15 15:19:11 +08:00
Phil Tillet
65237f6117
[PACKAGING] Added FileCheck
2022-07-07 16:53:19 -07:00
Yan Da
9d1b5e3f79
special encoding for broadcast
2022-06-18 21:16:45 +08:00
Yan Da
53cf93ce6a
Revert "Remove TypeConverter from TritonToTritonGPU conversion"
This reverts commit 64d0b87ef0.
2022-06-18 14:57:41 +08:00
Yan Da
64d0b87ef0
Remove TypeConverter from TritonToTritonGPU conversion
2022-06-18 14:34:59 +08:00
Yan Da
9feb256b71
op combine in Triton Dialect: broadcast(cst) -> cst
2022-06-17 16:19:47 +08:00
Yan Da
35736aa44e
more progress on the testing infrastructure
2022-06-12 15:14:45 +08:00
Yan Da
22c65a53d9
more progress on test/CMakeLists.txt
2022-06-10 21:37:56 +08:00
Yan Da
0ee6e486f8
add cse pass to the pipeline & pass num-warps as an argument
2022-06-10 17:31:48 +08:00
Yan Da
117a402c1b
more comments to TypeConverter & update warpTileSize
2022-06-08 16:20:07 +08:00