Philippe Tillet | 0d8faa5b1e | fixup | 2019-07-02 21:38:10 -07:00
Philippe Tillet | 5144dc3a6c | [examples/python] added framework code for shift-conv | 2019-07-02 20:45:10 -07:00
Philippe Tillet | 8fc253946c | [codegen] shift: added sketch for shift-convolution backpropagation | 2019-07-02 16:39:07 -07:00
Philippe Tillet | 6cfb575d29 | [lang] fixup in cast type | 2019-06-30 17:43:18 -07:00
Philippe Tillet | c172bd518b | more stuff | 2019-06-30 16:55:02 -07:00
Philippe Tillet | 9a86bc51e1 | [language] added alignment metadata for variables | 2019-06-29 13:58:46 -07:00
Philippe Tillet | d8c3d58593 | more optimization | 2019-06-28 20:22:52 -07:00
Philippe Tillet | 83b753512c | prefetching with shift | 2019-06-28 17:17:50 -07:00
Philippe Tillet | ab1afbf082 | more performance optimizations | 2019-06-28 17:04:07 -07:00
Philippe Tillet | a567f3f8a8 | more cleaning | 2019-06-28 15:10:39 -07:00
Philippe Tillet | 21fd0fd65e | fixup | 2019-06-28 11:13:36 -07:00
Philippe Tillet | f4dedb522c | fixup | 2019-06-27 17:05:48 -07:00
Philippe Tillet | 12e6036e5f | trying interior shift | 2019-06-27 14:13:48 -07:00
Philippe Tillet | d8526669f5 | fixup | 2019-06-27 12:39:17 -07:00
Philippe Tillet | 9028e40f1d | [dnn] added shift in the DNN libs | 2019-06-27 11:37:19 -07:00
Philippe Tillet | 6300ec5080 | [examples] added conv2d op in tensorflow | 2019-06-26 18:50:53 -07:00
Philippe Tillet | f1a8972267 | [examples] added tensorflow dense convolution templates | 2019-06-26 11:39:22 -07:00
Philippe Tillet | 25e9a10917 | changed auto-tuner parameter ranges | 2019-06-25 19:27:49 -07:00
Philippe Tillet | d945ce5e1b | Now showing valid parameter for NN | 2019-06-25 19:18:43 -07:00
Philippe Tillet | 616f22c610 | confirmed this is the fastest bounds checking | 2019-06-25 16:35:43 -07:00
Philippe Tillet | 64513fb407 | [codegen] added fallback when tensor cores cannot be used | 2019-06-25 15:49:58 -07:00
Philippe Tillet | 62000738f0 | [codegen] renamed axis_info -> alignment_info | 2019-06-25 15:10:47 -07:00
Philippe Tillet | d52abc9379 | [codegen] bugfix in alignment inference | 2019-06-25 15:06:15 -07:00
Philippe Tillet | edc31cabb0 | [codegen] rough template for axis_info pass | 2019-06-24 18:57:32 -07:00
Philippe Tillet | 72867d17d4 | more cleaning | 2019-06-24 12:37:13 -07:00
Philippe Tillet | f257884eb7 | some cleaning | 2019-06-24 09:31:34 -07:00
Philippe Tillet | 67989e7d18 | fixup | 2019-06-13 20:03:28 -07:00
Philippe Tillet | f7dcea1187 | Now doing double-buffering | 2019-06-13 19:48:02 -07:00
Philippe Tillet | 36e3667a9a | removed shared conflicts for 8x32x4 and 32x8x4 configurations | 2019-06-13 17:51:54 -07:00
Philippe Tillet | 21a9b92c87 | disabling interleaving | 2019-06-13 17:16:00 -07:00
Philippe Tillet | d487cf31ce | trying 128 bits loads | 2019-06-12 21:07:01 -07:00
Philippe Tillet | 1c6372711b | added interleaving | 2019-06-12 20:30:28 -07:00
Philippe Tillet | a6b580ec05 | interleaving fails with B | 2019-06-12 19:46:43 -07:00
Philippe Tillet | 1b5a742a88 | [triton/codegen] added shared memory padding for HMMA arguments and vectorized loads | 2019-06-11 19:51:08 -07:00
Philippe Tillet | cbd916994d | [example/tensorflow] no longer hardcoding library dir | 2019-06-11 11:06:02 -07:00
Philippe Tillet | 7d50b87681 | [selection/codegen] bugfix in distributed tile indices initialization | 2019-06-11 10:45:19 -07:00
Philippe Tillet | 06b5992509 | [feature] added basic tensor core support | 2019-06-11 10:24:49 -07:00
Philippe Tillet | d074a166e2 | [feature] basic tensor core utilization works | 2019-06-08 14:39:45 -07:00
Philippe Tillet | 5f3d48c1d0 | [tensor cores] added basic codegen template for using wmma | 2019-06-07 21:19:47 -07:00
Philippe Tillet | ec4c6aaaaa | Added inline PTX for mma.sync | 2019-06-07 19:39:33 -07:00
Philippe Tillet | 6fce9f28ae | added fragmented axis | 2019-06-07 10:32:56 -07:00
Philippe Tillet | 781b6d377d | selection now segfault (expected) | 2019-06-06 20:34:56 -07:00
Philippe Tillet | 6045209d5b | Now find correct tuning configuration | 2019-06-06 20:13:26 -07:00
Philippe Tillet | 0a0b48e9a2 | adding hmma tuning parameters | 2019-06-06 19:51:02 -07:00
Philippe Tillet | 81eba3e1ec | ugh | 2019-06-06 19:36:41 -07:00
Philippe Tillet | cdf5a0d011 | [codegen/tune]: added fragmentation types | 2019-06-06 16:48:32 -07:00
Philippe Tillet | f58c9a4d2b | [general] hmma baseline setup | 2019-06-05 14:43:38 -07:00
Philippe Tillet | 49fcfd6fc7 | [examples/tensorflow] fixed #include issue | 2019-06-05 11:09:41 -07:00
Philippe Tillet | 383b5b2a2a | [triton/ast] renamed ast -> lang in namespace and file structure | 2019-05-28 17:28:02 -04:00
Philippe Tillet | d2a46afe00 | [triton/ast]: cleaned the ast module | 2019-05-28 17:07:54 -04:00