Philippe Tillet
78b98fb7cf
[GENERAL] Cleaned polymorphic structure of layouts analysis pass
2020-01-21 11:38:39 -05:00
Philippe Tillet
f278d9741a
[GENERAL] Merged einsum feature branch. Various feature, performance
improvements and bugfixes:
* Added preliminary support for extended Einstein summation in PyTriton
* Significant performance improvement on FP32 kernels containing matrix
multiplication
* Added re-coalescing pass for FP16 kernels containing matrix
multiplication
* Various bugfixes
2020-01-20 12:42:48 -05:00
Philippe Tillet
de6fdd5625
[general] removed useless files and includes
2019-10-20 19:29:48 -04:00
Philippe Tillet
4efd0a3c6b
[codegen] more cleaning
2019-10-10 15:52:03 -04:00
Philippe Tillet
9bc6df4fd1
[codegen] more cleaning
2019-10-09 15:05:44 -04:00
Philippe Tillet
650c43ca07
[codegen] more cleaning
2019-10-07 18:06:54 -04:00
Philippe Tillet
001973630e
[codegen] cleaned up shared memory and double-buffering logic
2019-09-21 22:21:40 -04:00
Philippe Tillet
43d88154bd
[codegen] cleaning-up / formalizing shared-memory passes
2019-09-20 16:01:12 -04:00
Philippe Tillet
e35be1ddcf
[ir][instruction] added identifier for each instruction
2019-09-19 16:25:36 -04:00
Philippe Tillet
1fd9be27ee
[tests][bench] now benchmarking all variants of copy
2019-09-17 22:17:58 -04:00
Philippe Tillet
307c1128d5
[codegen] removed vectorization pass (now part of selection)
2019-09-17 15:21:10 -04:00
Philippe Tillet
e01e623333
[codegen][auto-coalesce] more debugging
2019-09-16 20:34:08 -04:00
Philippe Tillet
e184bad9a1
[auto-coalesce] more bugfixes
2019-09-16 13:28:23 -04:00
Philippe Tillet
495163e0e8
some more cleaning
2019-09-14 16:53:13 -04:00
Philippe Tillet
8ae779206f
more fixes
2019-09-14 02:36:11 -04:00
Philippe Tillet
eae02b99e5
[codegen][coalesce] fixed stale users in cloned instructions
2019-09-13 19:16:04 -04:00
Philippe Tillet
579a662e60
[codegen][coalesce] more bugfixes
2019-09-13 14:17:21 -04:00
Philippe Tillet
11ff27d638
[codegen][coalesce] some bugfix for phi-nodes
2019-09-12 22:44:07 -04:00