186 Commits

Author SHA1 Message Date
Philippe Tillet
66e32b3074 [codegen][grid] some cleaning 2019-09-14 13:05:53 -04:00
Philippe Tillet
8ae779206f more fixes 2019-09-14 02:36:11 -04:00
Philippe Tillet
eae02b99e5 [codegen][coalesce] fixed stale users in cloned instructions 2019-09-13 19:16:04 -04:00
Philippe Tillet
579a662e60 [codegen][coalesce] more bugfixes 2019-09-13 14:17:21 -04:00
Philippe Tillet
11ff27d638 [codegen][coalesce] some bugfix for phi-nodes 2019-09-12 22:44:07 -04:00
Philippe Tillet
0c41bade07 [codegen] basic recoalescing working 2019-09-10 23:25:47 -04:00
Philippe Tillet
c622619bcb more progress 2019-09-10 00:37:51 -04:00
Philippe Tillet
0cbbcce5c0 added missing file 2019-09-08 21:38:08 -04:00
Philippe Tillet
3daef1726d more progress 2019-09-08 21:36:54 -04:00
Philippe Tillet
3d78810d5e more progress 2019-09-08 21:29:40 -04:00
Philippe Tillet
32234c2612 ugh 2019-09-08 17:35:24 -04:00
Philippe Tillet
5e03f0a065 [codegen][align] reverted some changes 2019-09-03 15:28:07 -04:00
Philippe Tillet
97fdb5b6be [tests] added missing files 2019-09-03 12:44:35 -04:00
Philippe Tillet
a842d337c5 [general] various cleaning and bugfix:
* added copy1d and copy2d benchmark
* fixed issue in reassociation pass
2019-09-02 23:00:49 -04:00
Philippe Tillet
90d80c3b2e [codegen][selection] bugfix in scanline dot lowering 2019-09-01 16:30:53 -04:00
Philippe Tillet
7e0af2118c [codegen] worked around bug seemingly from nvptx/ptxas by simplifying multiplications by 1:
- Generated LLVM-IR looked correct
- Illegal addressing disappeared when running cuda-memcheck
- Illegal addressing disappeared when using nvptx-short-pointer
2019-08-30 16:45:14 -07:00
Philippe Tillet
d457482539 [codegen] fixed issue in double buffering pointer update 2019-08-28 17:50:45 -07:00
Philippe Tillet
37cbcfabd0 [examples] back to 96 TFLOPS on V100 2019-08-26 22:49:14 -07:00
Philippe Tillet
b4ae06a714 tracking down performance regression 2019-08-26 20:38:39 -07:00
Philippe Tillet
4075949f80 [python] basic tensorflow wrapper working 2019-08-26 16:53:49 -07:00
Philippe Tillet
321d268a4a more progress 2019-08-25 21:26:09 -07:00
Philippe Tillet
96b4d5e411 [examples] multiple transposition schemes now supported 2019-08-24 13:08:38 -07:00
Philippe Tillet
732156b942 [general] rename *.cpp -> *.cc 2019-08-23 19:06:39 -07:00
Philippe Tillet
a110a7e8cf [ir] changed type of tile shapes from constant_int* to int 2019-08-23 17:49:21 -07:00
Philippe Tillet
8798d240dc matmul test passes 2019-08-23 17:13:30 -07:00
Philippe Tillet
0970fe12dd [general] cleaned tensorflow source code generation 2019-08-18 15:39:36 -07:00
Philippe Tillet
457c330f15 more cleaning 2019-08-18 14:20:42 -07:00
Philippe Tillet
c787ebae68 more cleaning 2019-08-18 14:09:55 -07:00
Philippe Tillet
81571246cf [general] fixed some warnings 2019-08-18 14:08:57 -07:00
Philippe Tillet
b58b0d8b27 [general] removed unnecessary includes 2019-08-18 00:34:30 -07:00
Philippe Tillet
b4a9ed9663 [python] added basic tensorflow support 2019-08-17 18:18:26 -07:00
Philippe Tillet
c7cb5f82ad [general] removed LLVM #include's in all Triton headers 2019-08-16 15:56:58 -07:00
Philippe Tillet
38a8b0ab19 [runtime] overhaul of the run-time API 2019-08-14 20:26:11 -07:00
Philippe Tillet
b8cd63e0da [codegen] separated lower_dot_inst into lower_outer_dot ||
lower_hmma_dot || lower_scanline_dot
2019-08-12 21:48:30 -07:00
Philippe Tillet
4bc5758a22 [general] some cleaning:
* trans/dot -> peephole
* isel -> added function for tile-level lowering
2019-08-12 21:15:21 -07:00
Philippe Tillet
1400d960a6 [auto-tuning] much smaller parameters space 2019-08-12 21:15:21 -07:00
Philippe Tillet
fd49cdc92b [dnn][blocksparse] added dw code 2019-08-08 19:15:35 -07:00
Philippe Tillet
f93099bda1 [codegen][transform][trans] fixed incorrect replace_all_uses_with 2019-08-07 21:50:16 -07:00
Philippe Tillet
7578c27d3d [general][filesystem] added structure and namespace to code generation files 2019-08-07 21:17:17 -07:00
Philippe Tillet
392b55280d [codegen] some cleaning for batched matmul 2019-08-07 21:17:17 -07:00
Philippe Tillet
7b75b68edc dirty but working warp-splitting 2019-08-06 21:07:13 -07:00
Philippe Tillet
494bfa7671 didn't break correctness of existing HMMA 2019-08-06 17:34:00 -07:00
Philippe Tillet
cf256a636c fixup 2019-08-06 16:44:16 -07:00
Philippe Tillet
5efdb7978e more improvements and regressions 2019-08-06 16:21:20 -07:00
Philippe Tillet
26c9849462 [ir][instructions] added permutations option for trans 2019-08-05 21:19:13 -07:00
Philippe Tillet
d62e581ab3 basic split-k across warps working for GEMM 2019-08-05 19:33:28 -07:00
Philippe Tillet
d869d9a924 [codegen][selection] more flexible instruction selection for reduce_inst 2019-08-04 16:34:36 -07:00
Philippe Tillet
6be532c6a2 [codegen][selection] adding support for reduction along arbitrary axis 2019-08-02 21:29:36 -07:00
Philippe Tillet
d9945692a9 [dnn] better specification of recompilation key 2019-08-02 17:42:48 -07:00
Philippe Tillet
3b92ddf7e6 [codegen/reassociation] now recursively takes pointer arguments into account as well 2019-07-31 18:41:56 -07:00