Philippe Tillet | 36e3667a9a | removed shared conflicts for 8x32x4 and 32x8x4 configurations | 2019-06-13 17:51:54 -07:00
Philippe Tillet | d487cf31ce | trying 128 bits loads | 2019-06-12 21:07:01 -07:00
Philippe Tillet | a6b580ec05 | interleaving fails with B | 2019-06-12 19:46:43 -07:00
Philippe Tillet | 1b5a742a88 | [triton/codegen] added shared memory padding for HMMA arguments and vectorized loads | 2019-06-11 19:51:08 -07:00
Philippe Tillet | cbd916994d | [example/tensorflow] no longer hardcoding library dir | 2019-06-11 11:06:02 -07:00
Philippe Tillet | 06b5992509 | [feature] added basic tensor core support | 2019-06-11 10:24:49 -07:00
Philippe Tillet | d074a166e2 | [feature] basic tensor core utilization works | 2019-06-08 14:39:45 -07:00
Philippe Tillet | 6fce9f28ae | added fragmented axis | 2019-06-07 10:32:56 -07:00
Philippe Tillet | cdf5a0d011 | [codegen/tune]: added fragmentation types | 2019-06-06 16:48:32 -07:00
Philippe Tillet | f58c9a4d2b | [general] hmma baseline setup | 2019-06-05 14:43:38 -07:00
Philippe Tillet | fd91368f98 | [general] creation of dnn module for gemm/conv triton routines | 2019-05-06 17:47:06 -04:00