[GENERAL] Merged einsum feature branch. Various features, performance
improvements, and bugfixes:

* Added preliminary support for extended Einstein summation in PyTriton
* Significant performance improvement on FP32 kernels containing matrix multiplication
* Added re-coalescing pass for FP16 kernels containing matrix multiplication
* Various bugfixes
@@ -77,7 +77,7 @@ class CMakeBuild(build_ext):
 pass

 cfg = 'Debug' if self.debug else 'Release'
-#cfg = 'Release'
+cfg = 'Release'
 build_args = ['--config', cfg]

 if platform.system() == "Windows":
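The net effect of this hunk appears to be that the build configuration is always `Release`, whatever `self.debug` says, because the second assignment overrides the first. A minimal standalone sketch of that behaviour follows; the function name `select_build_args` and the `force_release` parameter are hypothetical and do not appear in the build script.

```python
def select_build_args(debug, force_release=True):
    """Return the '--config' arguments for the build step.

    Hypothetical helper, not part of the original build script.
    It mirrors the two consecutive assignments to `cfg` in the hunk.
    """
    # First assignment: follow the debug flag.
    cfg = 'Debug' if debug else 'Release'
    # Second assignment (uncommented by this commit): override unconditionally.
    if force_release:
        cfg = 'Release'
    return ['--config', cfg]


if __name__ == "__main__":
    # With the override active, the debug flag no longer matters.
    print(select_build_args(debug=True))
    print(select_build_args(debug=True, force_release=False))
```

With the override in place, a debug build requested through the flag would still be configured as `Release`; passing `force_release=False` in the sketch reproduces the earlier behaviour.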