Phil Tillet
137e866bd2
more work
2023-01-09 16:20:10 -08:00
Phil Tillet
8ebb593bbb
more work
2023-01-09 15:45:06 -08:00
Phil Tillet
6c750b6856
Added verifier for trans
2023-01-08 14:29:17 -08:00
Phil Tillet
42421fabc5
.
2023-01-06 20:35:57 -08:00
Phil Tillet
600bcefb12
more optimizations
2023-01-06 20:27:49 -08:00
Philippe Tillet
18c7a72973
more pass template
2023-01-06 14:26:06 -08:00
Phil Tillet
b16aeb6541
added missing file
2023-01-06 13:39:54 -08:00
Phil Tillet
a81345f7c1
SinkConversionsFromShared template
2023-01-06 13:01:08 -08:00
Philippe Tillet
874ee11ab5
More optimizations
2023-01-06 11:04:20 -08:00
Philippe Tillet
e6f1a9ad34
commenting dq but not load/store
2023-01-05 23:25:41 -08:00
Philippe Tillet
6f997f4ecb
dq now mma
2023-01-05 21:14:55 -08:00
Phil Tillet
520b69fe70
more reassociation
2023-01-05 16:05:11 -08:00
Phil Tillet
764134ee34
trying to decrease register pressure
2023-01-05 13:02:38 -08:00
Phil Tillet
1bde80b1e8
Added ptx code
2023-01-04 17:23:16 -08:00
Phil Tillet
268d2cd18d
better convert + write-back
2023-01-04 17:12:35 -08:00
Phil Tillet
29a1e20b58
tweak convert + trans
2023-01-04 17:12:28 -08:00
Phil Tillet
36da342893
.
2023-01-04 11:25:03 -08:00
Phil Tillet
e70e1e76b4
swizzling
2023-01-04 11:21:19 -08:00
Phil Tillet
e3c3d9fc65
16 spills
2023-01-04 00:01:22 -08:00
Phil Tillet
ee86ea9c90
100 spills
2023-01-03 20:52:00 -08:00
Phil Tillet
645fa5c1cd
.
2023-01-03 18:34:05 -08:00
Phil Tillet
8df1fa5e5b
Merge remote-tracking branch 'origin/master' into phil/fused-attention-perf-fixup
2023-01-03 18:31:34 -08:00
Keren Zhou
8460ea3df1
[Frontend] Fix import for libdevice (#1028)
...
This is a hotfix for issue 1 in
https://github.com/openai/triton/issues/1017
2023-01-03 15:48:05 -08:00
Keren Zhou
678b9f53a2
[Backend] Use post-order traversal for liveness numbering (#1027)
...
Also add tests for `tt.trans`.
2023-01-03 15:11:54 -08:00
Phil Tillet
737e43a627
more tests
2023-01-03 09:48:08 -08:00
Phil Tillet
5c01c567b9
.
2023-01-02 23:13:12 -08:00
Phil Tillet
05920e0b8b
reduced some spilling
2023-01-02 19:28:54 -08:00
Phil Tillet
c11fe351e1
.
2023-01-02 19:16:06 -08:00
Phil Tillet
b246d85fad
trying to figure out spilling root cause
2022-12-30 15:21:00 -08:00
Phil Tillet
4dce8dd709
Merge remote-tracking branch 'origin/master' into phil/fused-attention-perf-fixup
2022-12-30 11:53:49 -08:00
goostavz
0e8590f1c9
[BACKEND] Add generic support of convert_layout from distributed to shared (#1025)
2022-12-30 11:29:58 -08:00
Phil Tillet
7388fb1de9
manual ttgir in bwd pass
2022-12-29 15:53:38 -08:00
fdrocha
194ba103b1
[BUILD] Fixed error when compiling in systems with multiple versions of python installed (#1019)
2022-12-29 15:10:34 -08:00
Phil Tillet
71e3143eaf
.
2022-12-29 14:40:27 -08:00
goostavz
1d3029faf8
[Backend] Add value cache in emitting indices calculation and some refinement (#1018)
...
1. Add an explicit value cache for the emitted index calculations.
2. Move the index-calculation emission logic into
ConvertTritonGPUOpToLLVMPatternBase to avoid the redundant build cost
caused by templates. See the discussion in this thread by @LyricZhao:
https://triton-lang.slack.com/archives/C042VBSQWNS/p1671336755922969
2022-12-29 11:19:59 -08:00
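The value cache described in #1018 can be pictured as plain memoization. The sketch below is an illustration only, not the Triton C++ code: `IndexEmitter`, `_emit`, and the `(layout, shape)` cache key are hypothetical names and assumptions chosen for the example.

```python
class IndexEmitter:
    """Hypothetical stand-in for a lowering pattern that emits index values."""

    def __init__(self):
        self._cache = {}      # (layout, shape) -> previously emitted indices
        self.emit_count = 0   # counts how often the expensive path runs

    def _emit(self, layout, shape):
        # Placeholder for the expensive step (emitting index arithmetic).
        self.emit_count += 1
        return [(layout, dim, extent) for dim, extent in enumerate(shape)]

    def indices(self, layout, shape):
        # Reuse the emitted values when the same layout/shape is requested again.
        key = (layout, tuple(shape))
        if key not in self._cache:
            self._cache[key] = self._emit(layout, shape)
        return self._cache[key]


emitter = IndexEmitter()
first = emitter.indices("blocked", [16, 32])
second = emitter.indices("blocked", [16, 32])  # served from the cache
```

Under these assumptions the second request returns the same object without re-running the emission step.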
Phil Tillet
263ad883a6
.
2022-12-28 14:23:59 -08:00
Phil Tillet
54ae3e8d6e
cleanup
2022-12-28 13:42:43 -08:00
Phil Tillet
7aba2a60d6
trying out another change
2022-12-27 21:51:51 -08:00
Phil Tillet
eefc9d1274
Added TTGIR kernel
2022-12-27 21:49:28 -08:00
Phil Tillet
0d6e6cf578
trying more things
2022-12-27 20:58:31 -08:00
Yan Chunwei
2ba74d2729
[OPTIMIZER] Update the versionMinor in MMA layout for volta (#1014)
...
Continues the work in https://github.com/openai/triton/pull/990
# Background
The `versionMinor` field in MmaEncodingAttr holds some state of the DotOp's
operands on Volta. Some patterns later modify those operands, which
leaves that state out of date.
This PR corrects the state.
# Implementation
It adds two new patterns:
1. `CollectMmaToUpdateForVolta` collects the MmaEncodingAttr instances
with wrong state into a map and creates new, correct instances for
them.
2. `UpdateMMAVersionMinorForVolta` replaces the Ops that generate
the wrong MmaEncodingAttr instances with new, correct ones. It currently
supports the following Ops:
a. `convert_layout[X -> mma]`
b. `arith.constant SplatAttr : !tensor<mma>`
c. `dot ... : !tensor<mma>`
# Limitation
This PR uses a map to avoid the IR-walk complexity caused by
the circular dependency between dot_operand[parent] and mma.
The MmaEncodingAttr instance is the map key, but several DotOps
holding different DotOperand(IsMMAv1Row) values might share the
same wrong MmaEncodingAttr instance.
To make each DotOp's (wrong) MmaEncodingAttr unique, we might need to add an ID
field to MmaEncodingAttr.
2022-12-28 12:24:01 +08:00
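The limitation described in #1014 can be sketched with a small model. This is not the MLIR implementation: `MmaAttr`, `uid`, and `collect` are hypothetical names, and a frozen dataclass only approximates how identical attributes compare equal and therefore collide as map keys.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MmaAttr:
    version_minor: int
    uid: int = 0  # hypothetical ID field, as suggested in the limitation note


def collect(dots):
    """dots: list of (stale_attr, correct_version_minor) pairs, one per dot op."""
    mapping = {}
    conflicts = []
    for stale, correct_minor in dots:
        fresh = MmaAttr(correct_minor, stale.uid)
        # Two dots sharing one stale attr but needing different fixes collide.
        if stale in mapping and mapping[stale] != fresh:
            conflicts.append(stale)
        mapping.setdefault(stale, fresh)
    return mapping, conflicts


# Same stale attr, different required corrections: the map cannot hold both.
shared = MmaAttr(version_minor=0)
_, clash = collect([(shared, 1), (shared, 2)])

# With a distinguishing ID, each dot gets its own key.
unique_map, no_clash = collect([(MmaAttr(0, uid=1), 1), (MmaAttr(0, uid=2), 2)])
```

In this model the first call reports a conflict, while the second, with distinct `uid` values, maps each stale attribute to its own corrected one.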
Philippe Tillet
4182e90862
less math
2022-12-24 00:31:05 -08:00
Keren Zhou
fd2da4aff6
[BACKEND] Support splat constant on the DotOperandLayout (#1008)
2022-12-22 00:48:46 -08:00
Sharad Vikram
925d3d7f98
[FRONTEND] Export broadcast and broadcast_to in triton.language (#1007)
2022-12-22 01:57:33 +00:00
Philippe Tillet
033e82060d
.
2022-12-21 14:02:10 -08:00
Phil Tillet
88e572e54d
.
2022-12-21 13:54:30 -08:00
Keren Zhou
b5aafb0dab
[FRONTEND] Fix 3d indexing (#1006)
2022-12-21 12:52:32 -08:00
Philippe Tillet
20100a7254
Merge triton-mlir branch - Complete rewrite of the backend from scratch (#1004)
...
This PR merges the `triton-mlir` branch, in which we have been quietly
rewriting the Triton backend from scratch to increase maintainability,
stability and ultimately performance. Changes to the runtime are
minimal, and this new version aims to remain backward-compatible with
the previous commit. The legacy backend is now officially deprecated,
but can still be accessed via the `legacy-backend` tag.
Co-authored-by: Keren Zhou <kerenzhou@openai.com>
Co-authored-by: Yan Chunwei <yanchunwei@outlook.com>
Co-authored-by: goostavz <109190422+goostavz@users.noreply.github.com>
Co-authored-by: Shintaro Iwasaki <siwasaki@fb.com>
Co-authored-by: Yan Da <dyanab@connect.ust.hk>
Co-authored-by: Jun Yang <yangjunpro@gmail.com>
Co-authored-by: Ian Bearman <ianb@microsoft.com>
Co-authored-by: Jason Ansel <jansel@jansel.net>
Co-authored-by: Qingyi Liu <qingyil@nvidia.com>
Co-authored-by: ben-zhang-609 <110140741+ben-zhang-609@users.noreply.github.com>
Co-authored-by: Chenggang Zhao <lyricz@yeah.net>
Co-authored-by: ben-zhang-609 <benzh609@gmail.com>
Co-authored-by: dongdongl <dongdongl@nvidia.com>
2022-12-21 01:30:50 -08:00
Yang Hau
8650b4d1cb
[DRIVER] Fix typos (#939)
legacy-backend
2022-12-02 11:13:46 -08:00
Crutcher Dunnavant
44f577984d
Fix format double substitution bug: {i} => {{i}} (#886)
...
The previous `{i}` was silently expanding to the `i` from the
enumeration loop on `regular_args` (when it wasn't empty).
2022-11-20 11:44:42 -08:00
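The class of bug fixed in #886 can be reproduced in a few lines. The sketch below is not the Triton source: `build_templates` and the `arg_` template text are hypothetical, and it assumes the template is a Python f-string evaluated while a loop variable named `i` is still in scope.

```python
def build_templates(regular_args):
    # The enumeration leaves `i` bound to the last index after the loop.
    for i, _arg in enumerate(regular_args):
        pass

    # Buggy: the f-string silently substitutes the leftover loop variable.
    buggy = f"arg_{i}"
    # Fixed: doubled braces emit a literal "{i}" for a later substitution pass.
    fixed = f"arg_{{i}}"
    return buggy, fixed


buggy, fixed = build_templates(["a", "b", "c"])
later = fixed.format(i=7)  # the placeholder survives until this second pass
```

With a non-empty `regular_args`, `buggy` quietly bakes in the last loop index, while `fixed` keeps the `{i}` placeholder intact.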