Commit Graph

208 Commits

Author | SHA1 | Message | Date
Philippe Tillet | 0d8faa5b1e | fixup | 2019-07-02 21:38:10 -07:00
Philippe Tillet | 5144dc3a6c | [examples/python] added framework code for shift-conv | 2019-07-02 20:45:10 -07:00
Philippe Tillet | 8fc253946c | [codegen] shift: added sketch for shift-convolution backpropagation | 2019-07-02 16:39:07 -07:00
Philippe Tillet | 6cfb575d29 | [lang] fixup in cast type | 2019-06-30 17:43:18 -07:00
Philippe Tillet | c172bd518b | more stuff | 2019-06-30 16:55:02 -07:00
Philippe Tillet | 9a86bc51e1 | [language] added alignment metadata for variables | 2019-06-29 13:58:46 -07:00
Philippe Tillet | d8c3d58593 | more optimization | 2019-06-28 20:22:52 -07:00
Philippe Tillet | 83b753512c | prefetching with shift | 2019-06-28 17:17:50 -07:00
Philippe Tillet | ab1afbf082 | more performance optimizations | 2019-06-28 17:04:07 -07:00
Philippe Tillet | a567f3f8a8 | more cleaning | 2019-06-28 15:10:39 -07:00
Philippe Tillet | 21fd0fd65e | fixup | 2019-06-28 11:13:36 -07:00
Philippe Tillet | f4dedb522c | fixup | 2019-06-27 17:05:48 -07:00
Philippe Tillet | 12e6036e5f | trying interior shift | 2019-06-27 14:13:48 -07:00
Philippe Tillet | d8526669f5 | fixup | 2019-06-27 12:39:17 -07:00
Philippe Tillet | 9028e40f1d | [dnn] added shift in the DNN libs | 2019-06-27 11:37:19 -07:00
Philippe Tillet | 6300ec5080 | [examples] added conv2d op in tensorflow | 2019-06-26 18:50:53 -07:00
Philippe Tillet | f1a8972267 | [examples] added tensorflow dense convolution templates | 2019-06-26 11:39:22 -07:00
Philippe Tillet | 25e9a10917 | changed auto-tuner parameter ranges | 2019-06-25 19:27:49 -07:00
Philippe Tillet | d945ce5e1b | Now showing valid parameter for NN | 2019-06-25 19:18:43 -07:00
Philippe Tillet | 616f22c610 | confirmed this is the fastest bounds checking | 2019-06-25 16:35:43 -07:00
Philippe Tillet | 64513fb407 | [codegen] added fallback when tensor cores cannot be used | 2019-06-25 15:49:58 -07:00
Philippe Tillet | 62000738f0 | [codegen] renamed axis_info -> alignment_info | 2019-06-25 15:10:47 -07:00
Philippe Tillet | d52abc9379 | [codegen] bugfix in alignment inference | 2019-06-25 15:06:15 -07:00
Philippe Tillet | edc31cabb0 | [codegen] rough template for axis_info pass | 2019-06-24 18:57:32 -07:00
Philippe Tillet | 72867d17d4 | more cleaning | 2019-06-24 12:37:13 -07:00
Philippe Tillet | f257884eb7 | some cleaning | 2019-06-24 09:31:34 -07:00
Philippe Tillet | 67989e7d18 | fixup | 2019-06-13 20:03:28 -07:00
Philippe Tillet | f7dcea1187 | Now doing double-buffering | 2019-06-13 19:48:02 -07:00
Philippe Tillet | 36e3667a9a | removed shared conflicts for 8x32x4 and 32x8x4 configurations | 2019-06-13 17:51:54 -07:00
Philippe Tillet | 21a9b92c87 | disabling interleaving | 2019-06-13 17:16:00 -07:00
Philippe Tillet | d487cf31ce | trying 128 bits loads | 2019-06-12 21:07:01 -07:00
Philippe Tillet | 1c6372711b | added interleaving | 2019-06-12 20:30:28 -07:00
Philippe Tillet | a6b580ec05 | interleaving fails with B | 2019-06-12 19:46:43 -07:00
Philippe Tillet | 1b5a742a88 | [triton/codegen] added shared memory padding for HMMA arguments and vectorized loads | 2019-06-11 19:51:08 -07:00
Philippe Tillet | cbd916994d | [example/tensorflow] no longer hardcoding library dir | 2019-06-11 11:06:02 -07:00
Philippe Tillet | 7d50b87681 | [selection/codegen] bugfix in distributed tile indices initialization | 2019-06-11 10:45:19 -07:00
Philippe Tillet | 06b5992509 | [feature] added basic tensor core support | 2019-06-11 10:24:49 -07:00
Philippe Tillet | d074a166e2 | [feature] basic tensor core utilization works | 2019-06-08 14:39:45 -07:00
Philippe Tillet | 5f3d48c1d0 | [tensor cores] added basic codegen template for using wmma | 2019-06-07 21:19:47 -07:00
Philippe Tillet | ec4c6aaaaa | Added inline PTX for mma.sync | 2019-06-07 19:39:33 -07:00
Philippe Tillet | 6fce9f28ae | added fragmented axis | 2019-06-07 10:32:56 -07:00
Philippe Tillet | 781b6d377d | seleciton now segfault (expected | 2019-06-06 20:34:56 -07:00
Philippe Tillet | 6045209d5b | Now find correct tuning configuration | 2019-06-06 20:13:26 -07:00
Philippe Tillet | 0a0b48e9a2 | adding hmma tuning parameters | 2019-06-06 19:51:02 -07:00
Philippe Tillet | 81eba3e1ec | ugh | 2019-06-06 19:36:41 -07:00
Philippe Tillet | cdf5a0d011 | [codegen/tune]: added fragmentation types | 2019-06-06 16:48:32 -07:00
Philippe Tillet | f58c9a4d2b | [general] hmma baseline setup | 2019-06-05 14:43:38 -07:00
Philippe Tillet | 49fcfd6fc7 | [examples/tensorflow] fixed #include issue | 2019-06-05 11:09:41 -07:00
Philippe Tillet | 383b5b2a2a | [triton/ast] renamed ast -> lang in namespace and file structure | 2019-05-28 17:28:02 -04:00
Philippe Tillet | d2a46afe00 | [triton/ast]: cleaned the ast module | 2019-05-28 17:07:54 -04:00