Philippe Tillet
|
1b5a742a88
|
[triton/codegen] added shared memory padding for HMMA arguments and vectorized loads
|
2019-06-11 19:51:08 -07:00 |
|
Philippe Tillet
|
06b5992509
|
[feature] added basic tensor core support
|
2019-06-11 10:24:49 -07:00 |
|
Philippe Tillet
|
d074a166e2
|
[feature] basic tensor core utilization works
|
2019-06-08 14:39:45 -07:00 |
|
Philippe Tillet
|
ec4c6aaaaa
|
Added inline PTX for mma.sync
|
2019-06-07 19:39:33 -07:00 |
|
Philippe Tillet
|
781b6d377d
|
selection now segfaults (expected)
|
2019-06-06 20:34:56 -07:00 |
|
Philippe Tillet
|
6045209d5b
|
Now finds the correct tuning configuration
|
2019-06-06 20:13:26 -07:00 |
|
Philippe Tillet
|
cdf5a0d011
|
[codegen/tune]: added fragmentation types
|
2019-06-06 16:48:32 -07:00 |
|
Philippe Tillet
|
f58c9a4d2b
|
[general] hmma baseline setup
|
2019-06-05 14:43:38 -07:00 |
|
Philippe Tillet
|
49fcfd6fc7
|
[examples/tensorflow] fixed #include issue
|
2019-06-05 11:09:41 -07:00 |
|
Philippe Tillet
|
70f49a56c1
|
[examples/python/tensorflow] better skeleton for blocksparse
|
2019-05-01 17:09:01 -04:00 |
|