Philippe Tillet
|
d9945692a9
|
[dnn] better specification of recompilation key
|
2019-08-02 17:42:48 -07:00 |
|
Philippe Tillet
|
2f0817b2cd
|
[codegen/selection] tensor cores now used for transposed layouts
|
2019-07-17 17:20:38 -07:00 |
|
Philippe Tillet
|
bfa39b8992
|
preparing the field for tensor cores transposes
|
2019-07-17 13:20:33 -07:00 |
|
Philippe Tillet
|
d2e116d057
|
testing GEMM
|
2019-07-17 12:38:30 -07:00 |
|
Philippe Tillet
|
ec24e1e7df
|
trying to remove interior logic
|
2019-07-16 18:47:50 -07:00 |
|
Philippe Tillet
|
5f6dd23fc2
|
[dnn/dot] reverted back to peak tensorcores performance
|
2019-07-16 16:14:58 -07:00 |
|
Philippe Tillet
|
7d1797cd32
|
ugh
|
2019-07-16 12:59:27 -07:00 |
|
Philippe Tillet
|
aa8bcf6bde
|
[dnn/shift] added split-k for shift-conv
|
2019-07-15 21:03:58 -07:00 |
|
Philippe Tillet
|
3e7a3ed67a
|
[dnn/shift]: added support for fp16
|
2019-07-13 21:05:34 -07:00 |
|
Philippe Tillet
|
c1c7062914
|
blabla
|
2019-07-12 17:42:29 -07:00 |
|
Philippe Tillet
|
f36a646ffc
|
[dnn/shift-conv] added and tested NCHW layout
|
2019-07-11 21:00:33 -07:00 |
|
Philippe Tillet
|
207e021973
|
[codegen/shift] substantial cleaning of triton-c shift-conv code
|
2019-07-11 20:11:23 -07:00 |
|
Philippe Tillet
|
75cf2df110
|
[dnn/shift] many bugfixes in strided shift-conv
|
2019-07-10 19:49:31 -07:00 |
|
Philippe Tillet
|
4ca83f1935
|
ugh bug in shift-conv striding
|
2019-07-10 17:00:22 -07:00 |
|
Philippe Tillet
|
f665c742f9
|
testing a simple shiftnet
|
2019-07-10 13:33:08 -07:00 |
|
Philippe Tillet
|
b7986baffa
|
[dnn]: Now implementing all existing DNN routines using common base template and auto-tuner
|
2019-07-09 19:52:55 -07:00 |
|
Philippe Tillet
|
88675fa01a
|
[dnn] added base template class for mutualized auto-tuning
|
2019-07-09 16:09:34 -07:00 |
|
Philippe Tillet
|
066ae338f1
|
[dnn/shift]: added stride to shift
|
2019-07-09 14:08:51 -07:00 |
|
Philippe Tillet
|
cc41604784
|
[codegen/batchnorm] forward and backward now seemingly working
|
2019-07-09 13:03:16 -07:00 |
|
Philippe Tillet
|
f74dcb7e30
|
[dnn/batchnorm]: added some more code in Triton-C batchnorm implementations
|
2019-07-08 20:18:20 -07:00 |
|
Philippe Tillet
|
f9db0449b7
|
[dnn] Adding batchnorm
|
2019-07-08 18:44:37 -07:00 |
|
Philippe Tillet
|
b0cf3143c5
|
[dnn/shift] bugfix in wgrad
|
2019-07-06 11:27:49 -07:00 |
|
Philippe Tillet
|
3e49dbe6ab
|
[dnn/shift] fixed in leading dimensions for shift-conv operation
|
2019-07-05 17:17:22 -07:00 |
|
Philippe Tillet
|
c666f71fd6
|
fixed bug
|
2019-07-05 15:07:20 -07:00 |
|
Philippe Tillet
|
88ebdddf3d
|
makes more sense now
|
2019-07-03 20:45:03 -07:00 |
|
Philippe Tillet
|
bd1040510f
|
dx works but that makes no sense?
|
2019-07-03 20:24:52 -07:00 |
|
Philippe Tillet
|
1b2ceadf0d
|
weight gradient seems to work
|
2019-07-03 20:04:38 -07:00 |
|
Philippe Tillet
|
39aa22babb
|
more tinkering
|
2019-07-03 19:52:31 -07:00 |
|
Philippe Tillet
|
1d88f0a36b
|
stuff
|
2019-07-03 19:25:16 -07:00 |
|
Philippe Tillet
|
5144dc3a6c
|
[examples/python] added framework code for shift-conv
|
2019-07-02 20:45:10 -07:00 |
|
Philippe Tillet
|
6300ec5080
|
[examples] added conv2d op in tensorflow
|
2019-06-26 18:50:53 -07:00 |
|
Philippe Tillet
|
f1a8972267
|
[examples] added tensorflow dense convolution templates
|
2019-06-26 11:39:22 -07:00 |
|
Philippe Tillet
|
64513fb407
|
[codegen] added fallback when tensor cores cannot be used
|
2019-06-25 15:49:58 -07:00 |
|
Philippe Tillet
|
f7dcea1187
|
Now doing double-buffering
|
2019-06-13 19:48:02 -07:00 |
|
Philippe Tillet
|
36e3667a9a
|
removed shared conflicts for 8x32x4 and 32x8x4 configurations
|
2019-06-13 17:51:54 -07:00 |
|
Philippe Tillet
|
d487cf31ce
|
trying 128 bits loads
|
2019-06-12 21:07:01 -07:00 |
|
Philippe Tillet
|
a6b580ec05
|
interleaving fails with B
|
2019-06-12 19:46:43 -07:00 |
|
Philippe Tillet
|
1b5a742a88
|
[triton/codegen] added shared memory padding for HMMA arguments and vectorized loads
|
2019-06-11 19:51:08 -07:00 |
|
Philippe Tillet
|
cbd916994d
|
[example/tensorflow] no longer hardcoding library dir
|
2019-06-11 11:06:02 -07:00 |
|
Philippe Tillet
|
06b5992509
|
[feature] added basic tensor core support
|
2019-06-11 10:24:49 -07:00 |
|
Philippe Tillet
|
d074a166e2
|
[feature] basic tensor core utilization works
|
2019-06-08 14:39:45 -07:00 |
|
Philippe Tillet
|
6fce9f28ae
|
added fragmented axis
|
2019-06-07 10:32:56 -07:00 |
|
Philippe Tillet
|
cdf5a0d011
|
[codegen/tune]: added fragmentation types
|
2019-06-06 16:48:32 -07:00 |
|
Philippe Tillet
|
f58c9a4d2b
|
[general] hmma baseline setup
|
2019-06-05 14:43:38 -07:00 |
|
Philippe Tillet
|
fd91368f98
|
[general] creation of dnn module for gemm/conv triton routines
|
2019-05-06 17:47:06 -04:00 |
|