Philippe Tillet | 397d76156b | progress on re-association | 2019-07-23 17:21:24 -07:00 |
Philippe Tillet | 38b3771c26 | some reassociation | 2019-07-23 14:43:18 -07:00 |
Philippe Tillet | c448876178 | better benchmarking | 2019-07-22 19:26:12 -07:00 |
Philippe Tillet | b1d81a5802 | more work on heuristics | 2019-07-21 18:11:54 -07:00 |
Philippe Tillet | 484e3871cf | [dnn/shift] added base pointer for a, b | 2019-07-20 23:00:27 -07:00 |
Philippe Tillet | d159455f7b | [codegen/alignment_info] better alignment information | 2019-07-20 21:44:18 -07:00 |
Philippe Tillet | 28c250216c | [dnn/gemm] added some bounds checking | 2019-07-19 21:32:55 -07:00 |
Philippe Tillet | 5215fb0424 | [codegen] some more optimizations | 2019-07-19 20:29:03 -07:00 |
Philippe Tillet | 71594da66f | [dnn/gemm]: fixed leading dimension in transposed variants | 2019-07-18 16:35:48 -07:00 |
Philippe Tillet | f0d8306437 | [codegen/alignment_info] better handling of constants | 2019-07-18 16:12:06 -07:00 |
Philippe Tillet | 86f70f8224 | [codegen/selection] performance fix-up when A is transposed for hmma | 2019-07-17 21:46:23 -07:00 |
Philippe Tillet | 2f0817b2cd | [codegen/selection] tensor cores now used for transposed layouts | 2019-07-17 17:20:38 -07:00 |
Philippe Tillet | bfa39b8992 | preparing the field for tensor cores transposes | 2019-07-17 13:20:33 -07:00 |
Philippe Tillet | 791c91ee63 | [dnn/shift] bugfix in static shape division | 2019-07-17 11:39:17 -07:00 |
Philippe Tillet | a55b098e88 | [dnn/shift] now using constant divisions | 2019-07-16 21:05:21 -07:00 |
Philippe Tillet | 07c964919c | [dnn/shift] now strictly only shifting the interior | 2019-07-16 20:18:48 -07:00 |
Philippe Tillet | ec24e1e7df | trying to remove interior logic | 2019-07-16 18:47:50 -07:00 |
Philippe Tillet | 5f6dd23fc2 | [dnn/dot] reverted back to peak tensorcores performance | 2019-07-16 16:14:58 -07:00 |
Philippe Tillet | 28959fe165 | [runtime/jit] made auto-tuning silent | 2019-07-16 14:41:38 -07:00 |
Philippe Tillet | 7d1797cd32 | ugh | 2019-07-16 12:59:27 -07:00 |
Philippe Tillet | 3e7a3ed67a | [dnn/shift]: added support for fp16 | 2019-07-13 21:05:34 -07:00 |
Philippe Tillet | f74dcb7e30 | [dnn/batchnorm]: added some more code in Triton-C batchnorm implementations | 2019-07-08 20:18:20 -07:00 |
Philippe Tillet | fa3270dcf2 | [codegen/selection] bugfix in code generation for reduction instructions | 2019-07-08 18:53:37 -07:00 |
Philippe Tillet | f9db0449b7 | [dnn] Adding batchnorm | 2019-07-08 18:44:37 -07:00 |
Philippe Tillet | 8fc253946c | [codegen] shift: added sketch for shift-convolution backpropagation | 2019-07-02 16:39:07 -07:00 |
Philippe Tillet | 6cfb575d29 | [lang] fixup in cast type | 2019-06-30 17:43:18 -07:00 |
Philippe Tillet | c172bd518b | more stuff | 2019-06-30 16:55:02 -07:00 |
Philippe Tillet | 9a86bc51e1 | [language] added alignment metadata for variables | 2019-06-29 13:58:46 -07:00 |
Philippe Tillet | d8c3d58593 | more optimization | 2019-06-28 20:22:52 -07:00 |
Philippe Tillet | f4dedb522c | fixup | 2019-06-27 17:05:48 -07:00 |
Philippe Tillet | 6300ec5080 | [examples] added conv2d op in tensorflow | 2019-06-26 18:50:53 -07:00 |
Philippe Tillet | 25e9a10917 | changed auto-tuner parameter ranges | 2019-06-25 19:27:49 -07:00 |
Philippe Tillet | d945ce5e1b | Now showing valid parameter for NN | 2019-06-25 19:18:43 -07:00 |
Philippe Tillet | 64513fb407 | [codegen] added fallback when tensor cores cannot be used | 2019-06-25 15:49:58 -07:00 |
Philippe Tillet | 62000738f0 | [codegen] renamed axis_info -> alignment_info | 2019-06-25 15:10:47 -07:00 |
Philippe Tillet | d52abc9379 | [codegen] bugfix in alignment inference | 2019-06-25 15:06:15 -07:00 |
Philippe Tillet | edc31cabb0 | [codegen] rough template for axis_info pass | 2019-06-24 18:57:32 -07:00 |
Philippe Tillet | 72867d17d4 | more cleaning | 2019-06-24 12:37:13 -07:00 |
Philippe Tillet | f257884eb7 | some cleaning | 2019-06-24 09:31:34 -07:00 |
Philippe Tillet | 67989e7d18 | fixup | 2019-06-13 20:03:28 -07:00 |
Philippe Tillet | f7dcea1187 | Now doing double-buffering | 2019-06-13 19:48:02 -07:00 |
Philippe Tillet | 36e3667a9a | removed shared conflicts for 8x32x4 and 32x8x4 configurations | 2019-06-13 17:51:54 -07:00 |
Philippe Tillet | 21a9b92c87 | disabling interleaving | 2019-06-13 17:16:00 -07:00 |
Philippe Tillet | d487cf31ce | trying 128 bits loads | 2019-06-12 21:07:01 -07:00 |
Philippe Tillet | 1c6372711b | added interleaving | 2019-06-12 20:30:28 -07:00 |
Philippe Tillet | a6b580ec05 | interleaving fails with B | 2019-06-12 19:46:43 -07:00 |
Philippe Tillet | 1b5a742a88 | [triton/codegen] added shared memory padding for HMMA arguments and vectorized loads | 2019-06-11 19:51:08 -07:00 |
Philippe Tillet | 7d50b87681 | [selection/codegen] bugfix in distributed tile indices initialization | 2019-06-11 10:45:19 -07:00 |
Philippe Tillet | 06b5992509 | [feature] added basic tensor core support | 2019-06-11 10:24:49 -07:00 |
Philippe Tillet | d074a166e2 | [feature] basic tensor core utilization works | 2019-06-08 14:39:45 -07:00 |