Philippe Tillet
|
1b5a742a88
|
[triton/codegen] added shared memory padding for HMMA arguments and vectorized loads
|
2019-06-11 19:51:08 -07:00 |
|
Philippe Tillet
|
7d50b87681
|
[selection/codegen] bugfix in distributed tile indices initialization
|
2019-06-11 10:45:19 -07:00 |
|
Philippe Tillet
|
06b5992509
|
[feature] added basic tensor core support
|
2019-06-11 10:24:49 -07:00 |
|
Philippe Tillet
|
d074a166e2
|
[feature] basic tensor core utilization works
|
2019-06-08 14:39:45 -07:00 |
|
Philippe Tillet
|
5f3d48c1d0
|
[tensor cores] added basic codegen template for using wmma
|
2019-06-07 21:19:47 -07:00 |
|
Philippe Tillet
|
ec4c6aaaaa
|
Added inline PTX for mma.sync
|
2019-06-07 19:39:33 -07:00 |
|
Philippe Tillet
|
6fce9f28ae
|
added fragmented axis
|
2019-06-07 10:32:56 -07:00 |
|
Philippe Tillet
|
cdf5a0d011
|
[codegen/tune]: added fragmentation types
|
2019-06-06 16:48:32 -07:00 |
|
Philippe Tillet
|
f6fe9492e4
|
[dnn/conv] added triton-c code for wgrad
|
2019-05-11 18:09:23 -04:00 |
|
Philippe Tillet
|
f80441017c
|
[codegen] added leading dimension padding for transposition in shared memory
|
2019-05-06 11:53:35 -04:00 |
|
Philippe Tillet
|
4813bb007c
|
[codegen] bugfix in builder insert point for predicated instructions
|
2019-05-04 12:09:27 -04:00 |
|
Philippe Tillet
|
af58b8bd81
|
[triton-c] predicate in assignment statement now propagates to rhs computations
|
2019-04-27 14:00:15 -04:00 |
|
Philippe Tillet
|
4b77b764ba
|
[triton-c] added support for while loops
|
2019-04-26 15:08:02 -04:00 |
|
Philippe Tillet
|
3413aad582
|
[general] major overhaul of triton-c/triton-ir/triton-jit:
- Added alloc const
- Added atomics
- Pruning tuning space
- Added example for dot/conv/shift
- Bugfixes
|
2019-04-25 16:18:15 -04:00 |
|
Philippe Tillet
|
bc2a257d5c
|
[code generation] more flexibility in backend selection
|
2019-03-27 11:29:42 -07:00 |
|
Philippe Tillet
|
e04253c0dd
|
[code generation] basic CPU backend
|
2019-03-27 11:13:36 -07:00 |
|
Philippe Tillet
|
9d6fc1c051
|
[code generation] bugfix in single buffering
|
2019-03-26 15:55:48 -07:00 |
|
Philippe Tillet
|
deb7a1cc5c
|
Hack to make OpenCL for AMD work
|
2019-03-23 18:58:25 -07:00 |
|
Philippe Tillet
|
9de9feff4a
|
[jit] added runtime for host but compilation still needs to be implemented
|
2019-03-23 13:40:42 -07:00 |
|
Philippe Tillet
|
49fd6ece99
|
some cleaning
|
2019-03-21 23:51:47 -07:00 |
|
Philippe Tillet
|
87c85ed50d
|
[code generation] reparameterization
|
2019-03-11 19:30:21 -04:00 |
|
Philippe Tillet
|
94e315ea8a
|
Reparameterized in terms of micro- and nano- tiles
|
2019-03-10 23:10:17 -04:00 |
|
Philippe Tillet
|
5f29263044
|
[code generation] now using ir::metaparameter* for all tunable metaparameters
|
2019-03-09 12:05:12 -05:00 |
|
Philippe Tillet
|
4189e130bf
|
[general] added support for constant memory declaration
|
2019-03-03 23:16:33 -05:00 |
|
Philippe Tillet
|
1f30e111ec
|
[code generation] more optimizations
|
2019-03-02 16:03:26 -05:00 |
|
Philippe Tillet
|
2467c5e504
|
[code generation] added ternary operator
|
2019-03-01 21:53:35 -05:00 |
|
Philippe Tillet
|
08fcfbca47
|
[code generation] better predication
|
2019-03-01 14:36:17 -05:00 |
|
Philippe Tillet
|
36acf22fd3
|
better masking
|
2019-02-28 23:46:11 -05:00 |
|
Philippe Tillet
|
017702590b
|
[intermediate representation] added ternary_inst
|
2019-02-26 14:20:58 -05:00 |
|
Philippe Tillet
|
68dea75aa0
|
[syntax tree] more fixes in lowering phi nodes
|
2019-02-26 12:36:37 -05:00 |
|
Philippe Tillet
|
338f291835
|
[code generation] now ordered iterations across distributed tiles
|
2019-02-25 11:41:45 -05:00 |
|
Philippe Tillet
|
6dc88878ac
|
[code generation] bugfix in double-buffering
|
2019-02-24 23:22:28 -05:00 |
|
Philippe Tillet
|
daa828ec18
|
[general] rename namespace tdl -> triton
|
2019-02-24 14:35:16 -05:00 |
|
Philippe Tillet
|
6b49818282
|
[filesystem] rename tdl -> triton
|
2019-02-24 14:20:40 -05:00 |
|
Philippe Tillet
|
1b5f7f2139
|
[code generation] basic metaparameter support
|
2019-02-23 22:24:12 -05:00 |
|
Philippe Tillet
|
8f4798b81a
|
[intermediate representation] transitioning towards more flexible tile shapes
|
2019-02-23 11:37:01 -05:00 |
|
Philippe Tillet
|
7cda55df16
|
[code generation] implements hidden operands in user (e.g., mask)
|
2019-02-21 18:00:27 -05:00 |
|
Philippe Tillet
|
5618a15dc1
|
[code generation] more bugfixes in control flow
|
2019-02-20 22:55:20 -05:00 |
|
Philippe Tillet
|
90ec0ae2c0
|
[code generation] some more bugfixing with nested control flow
|
2019-02-18 22:54:08 -05:00 |
|
Philippe Tillet
|
cf1a583dbf
|
bla
|
2019-02-15 22:03:09 -05:00 |
|
Philippe Tillet
|
5f5959dc6e
|
[code generation] added masked loads
|
2019-02-15 11:14:50 -05:00 |
|
Philippe Tillet
|
32562677e9
|
[code generation] added barriers placement
|
2019-02-12 19:36:16 -05:00 |
|
Philippe Tillet
|
41aad4800c
|
[code generation] added double-buffering
|
2019-02-12 11:47:52 -05:00 |
|
Philippe Tillet
|
e45d6bbb60
|
some cleaning
|
2019-02-12 11:00:24 -05:00 |
|
Philippe Tillet
|
f8e522ada8
|
blabla
|
2019-02-11 17:27:16 -05:00 |
|
Philippe Tillet
|
b2e487491f
|
[code generation] now vectorizing shared memory stores
|
2019-02-10 21:59:41 -05:00 |
|
Philippe Tillet
|
8ab5ca3de3
|
blabla
|
2019-02-10 20:41:07 -05:00 |
|
Philippe Tillet
|
3d07e909c6
|
attempting vectorization
|
2019-02-10 18:29:25 -05:00 |
|
Philippe Tillet
|
4a0736ce20
|
[code generation] in-place CSE in shared memory reads
|
2019-02-09 23:56:53 -05:00 |
|
Philippe Tillet
|
d39f97ef38
|
[code generation] simple matrix-multiplication working
|
2019-02-09 19:20:50 -05:00 |
|