Philippe Tillet
25055cbba7
[TESTS] Simplified testing of half-precision transposes
2020-04-09 01:10:11 -04:00
Philippe Tillet
f278d9741a
[GENERAL] Merged einsum feature branch. Various feature, performance
improvements and bugfixes:
* Added preliminary support for extended Einstein summation in PyTriton
* Significant performance improvement on FP32 kernels containing matrix
multiplication
* Added re-coalescing pass for FP16 kernels containing matrix
multiplication
* Various bugfixes
2020-01-20 12:42:48 -05:00
Philippe Tillet
96cba9036a
[tests] [unit] added 1D and 3D reduction test
2019-10-20 17:48:19 -04:00
Philippe Tillet
d76c6bc3c7
Merge branch 'master' into auto-coalesce
2019-10-18 16:21:28 -04:00
Philippe Tillet
3d5ab4bc0d
[codegen] [selection] created machine layouts
2019-10-15 12:29:58 -04:00
Philippe Tillet
856e7baa04
[test] added tests for copy
2019-09-23 12:07:24 -04:00
Philippe Tillet
7f2bc5bb66
[testing] re-arranged util.h
2019-09-12 16:20:29 -04:00
Philippe Tillet
f4beb713ab
[test] added support for max, min reduction and made it easy to add more
2019-09-12 16:11:57 -04:00
Philippe Tillet
178094b5f7
[codegen] exposed a bug in reductions
2019-09-11 20:47:17 -04:00
Philippe Tillet
04a0fbd8e3
[tests] basic test for reduction in python passes
2019-09-11 17:35:56 -04:00
Philippe Tillet
2781cdcf93
[lang] added templates for reductions
2019-09-10 15:54:16 -04:00
Philippe Tillet
a842d337c5
[general] various cleaning and bugfix:
* added copy1d and copy2d benchmark
* fixed issue in reassociation pass
2019-09-02 23:00:49 -04:00
Philippe Tillet
e3c953e79f
[test] added more re-usable code in common/util.h
2019-08-28 18:06:36 -07:00