Commit Graph

183 Commits

Author SHA1 Message Date
Philippe Tillet
f257884eb7 some cleaning 2019-06-24 09:31:34 -07:00
Philippe Tillet
67989e7d18 fixup 2019-06-13 20:03:28 -07:00
Philippe Tillet
f7dcea1187 Now doing double-buffering 2019-06-13 19:48:02 -07:00
Philippe Tillet
36e3667a9a removed shared conflicts for 8x32x4 and 32x8x4 configurations 2019-06-13 17:51:54 -07:00
Philippe Tillet
21a9b92c87 disabling interleaving 2019-06-13 17:16:00 -07:00
Philippe Tillet
d487cf31ce trying 128 bits loads 2019-06-12 21:07:01 -07:00
Philippe Tillet
1c6372711b added interleaving 2019-06-12 20:30:28 -07:00
Philippe Tillet
a6b580ec05 interleaving fails with B 2019-06-12 19:46:43 -07:00
Philippe Tillet
1b5a742a88 [triton/codegen] added shared memory padding for HMMA arguments and vectorized loads 2019-06-11 19:51:08 -07:00
Philippe Tillet
cbd916994d [example/tensorflow] no longer hardcoding library dir 2019-06-11 11:06:02 -07:00
Philippe Tillet
7d50b87681 [selection/codegen] bugfix in distributed tile indices initialization 2019-06-11 10:45:19 -07:00
Philippe Tillet
06b5992509 [feature] added basic tensor core support 2019-06-11 10:24:49 -07:00
Philippe Tillet
d074a166e2 [feature] basic tensor core utilization works 2019-06-08 14:39:45 -07:00
Philippe Tillet
5f3d48c1d0 [tensor cores] added basic codegen template for using wmma 2019-06-07 21:19:47 -07:00
Philippe Tillet
ec4c6aaaaa Added inline PTX for mma.sync 2019-06-07 19:39:33 -07:00
Philippe Tillet
6fce9f28ae added fragmented axis 2019-06-07 10:32:56 -07:00
Philippe Tillet
781b6d377d seleciton now segfault (expected 2019-06-06 20:34:56 -07:00
Philippe Tillet
6045209d5b Now find correct tuning configuration 2019-06-06 20:13:26 -07:00
Philippe Tillet
0a0b48e9a2 adding hmma tuning parameters 2019-06-06 19:51:02 -07:00
Philippe Tillet
81eba3e1ec ugh 2019-06-06 19:36:41 -07:00
Philippe Tillet
cdf5a0d011 [codegen/tune]: added fragmentation types 2019-06-06 16:48:32 -07:00
Philippe Tillet
f58c9a4d2b [general] hmma baseline setup 2019-06-05 14:43:38 -07:00
Philippe Tillet
49fcfd6fc7 [examples/tensorflow] fixed #include issue 2019-06-05 11:09:41 -07:00
Philippe Tillet
383b5b2a2a [triton/ast] renamed ast -> lang in namespace and file structure 2019-05-28 17:28:02 -04:00
Philippe Tillet
d2a46afe00 [triton/ast]: cleaned the ast module 2019-05-28 17:07:54 -04:00
Philippe Tillet
8102efc064 [triton/examples/cpp] removed common.hpp helper 2019-05-28 14:14:33 -04:00
Philippe Tillet
a9d078c06f [triton/dnn/conv] merged optimizations branch
- Added forward/backward support for strided convolution
- Added support for bias
- Added support for reduction splitting
2019-05-28 14:04:53 -04:00
Philippe Tillet
e526ffc62b [examples/pytorch] added a bunch of models for more thorough testing 2019-05-28 14:04:31 -04:00
Philippe Tillet
3f3eb1c2a4 [dnn/conv] Added the option to have look-up table for filters for all operations 2019-05-22 19:03:33 -04:00
Philippe Tillet
f8291af7ef [dnn/conv] removed divergent paths in LUT computations 2019-05-22 17:49:40 -04:00
Philippe Tillet
2672812ad0 [dnn/conv] No more divergent path in conv::set_arg 2019-05-22 15:25:43 -04:00
Philippe Tillet
e8f23bcade [dnn/conv] Added bias and forward stride 2019-05-22 13:27:08 -04:00
Philippe Tillet
f33a1f3fe3 [examples/pytorch] Fixed issues in backward pass of conv 2019-05-19 01:31:08 -04:00
Philippe Tillet
b2b55c52c9 [triton/python/conv]: Added cache for compiled kernels 2019-05-18 11:51:49 -04:00
Philippe Tillet
600aef72d5 [conv/dnn] now created a separate .h and .cpp file 2019-05-17 12:29:11 -04:00
Philippe Tillet
34f8617709 [dnn/conv] fixed formatting of generated Triton-C code 2019-05-16 15:48:02 -04:00
Philippe Tillet
ece7beea3c [dnn/conv]: now using look-up table for wgrad computation as well 2019-05-16 15:26:16 -04:00
Philippe Tillet
15a967c81e [dnn/conv] minor cleaning 2019-05-15 11:32:47 -04:00
Philippe Tillet
be2ba03382 [dnn/conv] optimizations of backpropagation with look-up tables 2019-05-14 19:10:59 -04:00
Philippe Tillet
cbfbe72e46 [general] added LICENSE file 2019-05-13 22:29:53 -04:00
Philippe Tillet
5941501f70 [dnn] added Triton-C derivative computations in conv 2019-05-13 18:04:11 -04:00
Philippe Tillet
f6fe9492e4 [dnn/conv] added triton-c code for wgrad 2019-05-11 18:09:23 -04:00
Philippe Tillet
fc4daf11dd [examples/conv] now deferring shape computations to conv configuration 2019-05-08 13:58:25 -04:00
Philippe Tillet
54f888a270 [dnn/conv] some minor fixes 2019-05-08 10:09:30 -04:00
Philippe Tillet
615569287e more cleaning of conv 2019-05-06 19:30:22 -04:00
Philippe Tillet
fd91368f98 [general] creation of dnn module for gemm/conv triton routines 2019-05-06 17:47:06 -04:00
Philippe Tillet
f80441017c [codegen] added leading dimension padding for transposition in shared memory 2019-05-06 11:53:35 -04:00
Philippe Tillet
4813bb007c [codegen] bugfix in builder insert point for predicated instructions 2019-05-04 12:09:27 -04:00
Philippe Tillet
30833c18f1 [codegen/tune] bugfix in heuristics for nano-tile sizes 2019-05-04 01:32:34 -04:00
Philippe Tillet
0d694445e6 [examples] added skeleton for pytorch wrapper 2019-05-03 14:30:06 -04:00