Philippe Tillet | ff04a5e9b6 | . | 2023-01-09 22:11:00 -08:00
Phil Tillet | d88353a5a4 | . | 2023-01-09 20:14:06 -08:00
Phil Tillet | bae4c40379 | reorder conversions to dot operand | 2023-01-09 20:13:30 -08:00
Phil Tillet | c98c889d7f | . | 2023-01-09 19:08:51 -08:00
Phil Tillet | fc1007278d | . | 2023-01-09 18:45:44 -08:00
Phil Tillet | 0c101e0c33 | . | 2023-01-09 16:30:28 -08:00
Phil Tillet | 3fefcd78d4 | . | 2023-01-09 16:29:45 -08:00
Phil Tillet | 137e866bd2 | more work | 2023-01-09 16:20:10 -08:00
Phil Tillet | 8ebb593bbb | more work | 2023-01-09 15:45:06 -08:00
Phil Tillet | 6c750b6856 | Added verifier for trans | 2023-01-08 14:29:17 -08:00
Phil Tillet | 42421fabc5 | . | 2023-01-06 20:35:57 -08:00
Phil Tillet | 600bcefb12 | more optimizations | 2023-01-06 20:27:49 -08:00
Philippe Tillet | 18c7a72973 | more pass template | 2023-01-06 14:26:06 -08:00
Phil Tillet | a81345f7c1 | SinkConversionsFromShared template | 2023-01-06 13:01:08 -08:00
Philippe Tillet | 874ee11ab5 | More optimizations | 2023-01-06 11:04:20 -08:00
Philippe Tillet | e6f1a9ad34 | commenting dq but not load/store | 2023-01-05 23:25:41 -08:00
Philippe Tillet | 6f997f4ecb | dq now mma | 2023-01-05 21:14:55 -08:00
Phil Tillet | 520b69fe70 | more reassociation | 2023-01-05 16:05:11 -08:00
Phil Tillet | 764134ee34 | trying to decrease register pressure | 2023-01-05 13:02:38 -08:00
Phil Tillet | 1bde80b1e8 | Added ptx code | 2023-01-04 17:23:16 -08:00
Phil Tillet | 268d2cd18d | better convert + write-back | 2023-01-04 17:12:35 -08:00
Phil Tillet | 29a1e20b58 | tweak convert + trans | 2023-01-04 17:12:28 -08:00
Phil Tillet | 36da342893 | . | 2023-01-04 11:25:03 -08:00
Phil Tillet | e70e1e76b4 | swizzling | 2023-01-04 11:21:19 -08:00
Phil Tillet | e3c3d9fc65 | 16 spills | 2023-01-04 00:01:22 -08:00
Phil Tillet | ee86ea9c90 | 100 spills | 2023-01-03 20:52:00 -08:00
Phil Tillet | 645fa5c1cd | . | 2023-01-03 18:34:05 -08:00
Phil Tillet | 8df1fa5e5b | Merge remote-tracking branch 'origin/master' into phil/fused-attention-perf-fixup | 2023-01-03 18:31:34 -08:00
Keren Zhou | 8460ea3df1 | [Frontend] Fix import for libdevice (#1028) | 2023-01-03 15:48:05 -08:00
    This is a hotfix for issue 1 in https://github.com/openai/triton/issues/1017
Phil Tillet | 737e43a627 | more tests | 2023-01-03 09:48:08 -08:00
Phil Tillet | 5c01c567b9 | . | 2023-01-02 23:13:12 -08:00
Phil Tillet | 05920e0b8b | reduced some spilling | 2023-01-02 19:28:54 -08:00
Phil Tillet | c11fe351e1 | . | 2023-01-02 19:16:06 -08:00
Phil Tillet | b246d85fad | trying to figure out spilling root cause | 2022-12-30 15:21:00 -08:00
Phil Tillet | 4dce8dd709 | Merge remote-tracking branch 'origin/master' into phil/fused-attention-perf-fixup | 2022-12-30 11:53:49 -08:00
Phil Tillet | 7388fb1de9 | manual ttgir in bwd pass | 2022-12-29 15:53:38 -08:00
fdrocha | 194ba103b1 | [BUILD] Fixed error when compiling in systems with multiple versions of python installed (#1019) | 2022-12-29 15:10:34 -08:00
Phil Tillet | 71e3143eaf | . | 2022-12-29 14:40:27 -08:00
Phil Tillet | 54ae3e8d6e | cleanup | 2022-12-28 13:42:43 -08:00
Phil Tillet | 7aba2a60d6 | trying out another change | 2022-12-27 21:51:51 -08:00
Phil Tillet | eefc9d1274 | Added TTGIR kernel | 2022-12-27 21:49:28 -08:00
Phil Tillet | 0d6e6cf578 | trying more things | 2022-12-27 20:58:31 -08:00
Philippe Tillet | 4182e90862 | less math | 2022-12-24 00:31:05 -08:00
Keren Zhou | fd2da4aff6 | [BACKEND] Support splat constant on the DotOperandLayout (#1008) | 2022-12-22 00:48:46 -08:00
Sharad Vikram | 925d3d7f98 | [FRONTEND] Export broadcast and broadcast_to in triton.language (#1007) | 2022-12-22 01:57:33 +00:00
Philippe Tillet | 033e82060d | . | 2022-12-21 14:02:10 -08:00
Phil Tillet | 88e572e54d | . | 2022-12-21 13:54:30 -08:00
Keren Zhou | b5aafb0dab | [FRONTEND] Fix 3d indexing (#1006) | 2022-12-21 12:52:32 -08:00
Philippe Tillet | 20100a7254 | Merge triton-mlir branch - Complete rewrite of the backend from scratch (#1004) | 2022-12-21 01:30:50 -08:00
    This PR merges the `triton-mlir` branch, in which we have been quietly
    rewriting the Triton backend from scratch to increase maintainability,
    stability and ultimately performance. Changes to the runtime are
    minimal, and this new version aims to remain backward-compatible with
    the previous commit. The legacy backend is now officially deprecated,
    but can still be accessed via the `legacy-backend` tag.
    Co-authored-by: Keren Zhou <kerenzhou@openai.com>
    Co-authored-by: Yan Chunwei <yanchunwei@outlook.com>
    Co-authored-by: goostavz <109190422+goostavz@users.noreply.github.com>
    Co-authored-by: Shintaro Iwasaki <siwasaki@fb.com>
    Co-authored-by: Yan Da <dyanab@connect.ust.hk>
    Co-authored-by: Jun Yang <yangjunpro@gmail.com>
    Co-authored-by: Ian Bearman <ianb@microsoft.com>
    Co-authored-by: Jason Ansel <jansel@jansel.net>
    Co-authored-by: Qingyi Liu <qingyil@nvidia.com>
    Co-authored-by: ben-zhang-609 <110140741+ben-zhang-609@users.noreply.github.com>
    Co-authored-by: Chenggang Zhao <lyricz@yeah.net>
    Co-authored-by: ben-zhang-609 <benzh609@gmail.com>
    Co-authored-by: dongdongl <dongdongl@nvidia.com>
Yang Hau | 8650b4d1cb | [DRIVER] Fix typos (#939) | 2022-12-02 11:13:46 -08:00