Philippe Tillet
a9efb27fde
[CODEGEN][ANALYSIS] bugfix in alignment analysis
2020-05-01 17:38:23 -04:00
Philippe Tillet
f278d9741a
[GENERAL] Merged einsum feature branch. Various feature, performance
improvements and bugfixes:
* Added preliminary support for extended Einstein summation in PyTriton
* Significant performance improvement on FP32 kernels containing matrix
multiplication
* Added re-coalescing pass for FP16 kernels containing matrix
multiplication
* Various bugfixes
2020-01-20 12:42:48 -05:00
Philippe Tillet
de6fdd5625
[general] removed useless files and includes
2019-10-20 19:29:48 -04:00
Philippe Tillet
650c43ca07
[codegen] more cleaning
2019-10-07 18:06:54 -04:00
Philippe Tillet
ed1b2bc563
more work on padding
2019-09-27 22:15:30 -04:00
Philippe Tillet
c24d55db23
[codegen] more work on hmma coalescing
2019-09-23 20:38:27 -04:00
Philippe Tillet
43d88154bd
[codegen] cleaning-up / formalizing shared-memory passes
2019-09-20 16:01:12 -04:00
Philippe Tillet
e35be1ddcf
[ir][instruction] added identifier for each instruction
2019-09-19 16:25:36 -04:00
Philippe Tillet
e184bad9a1
[auto-coalesce] more bugfixes
2019-09-16 13:28:23 -04:00
Philippe Tillet
495163e0e8
some more cleaning
2019-09-14 16:53:13 -04:00
Philippe Tillet
0c41bade07
[codegen] basic recoalescing working
2019-09-10 23:25:47 -04:00
Philippe Tillet
32234c2612
ugh
2019-09-08 17:35:24 -04:00
Philippe Tillet
5e03f0a065
[codegen][align] reverted some changes
2019-09-03 15:28:07 -04:00
Philippe Tillet
97fdb5b6be
[tests] added missing files
2019-09-03 12:44:35 -04:00
Philippe Tillet
a842d337c5
[general] various cleaning and bugfix:
* added copy1d and copy2d benchmark
* fixed issue in reassociation pass
2019-09-02 23:00:49 -04:00