Commit Graph

4 Commits

Author SHA1 Message Date
Philippe Tillet
15f8e8c3b7 [CODEGEN] Major performance improvements on A100 (#70)
Improved handling of asynchronous copy, scheduling and synchronization for A100. Now achieving CUTLASS-like performance on large square dense matrix multiplication tasks
2021-02-21 18:19:39 -05:00
Philippe Tillet
069083e28a [DRIVER] Removed obsolete SetArg 2020-11-26 23:12:39 -05:00
Philippe Tillet
db7a72bfe3 [DRIVER] Removed OpenCL support
There is no plan to support OpenCL anytime soon (Vulkan would be preferred). Removing the adequate portion of the driver code
2020-10-13 20:57:32 -07:00
Philippe Tillet
732156b942 [general] rename *.cpp -> *.cc 2019-08-23 19:06:39 -07:00