Commit Graph

6 Commits

Author SHA1 Message Date
Philippe Tillet
2acaa4d0dd [LANG] Added support for constexpr (#361) 2021-10-30 00:32:58 -07:00
Philippe Tillet
d3e584d4ba Revert "[DRIVER] Fixed CUDA 10.1 bug (#357)" (#358)
This reverts commit d35014ba47.
2021-10-26 15:04:49 -07:00
Philippe Tillet
d35014ba47 [DRIVER] Fixed CUDA 10.1 bug (#357) 2021-10-26 11:17:06 -07:00
Philippe Tillet
5ce1b726dc [CODEGEN] Various bugfixes that make it possible to fuse RNG in a matmul epilogue (#356) 2021-10-24 02:30:46 -07:00
Philippe Tillet
6e5b0b4301 [FRONTEND] Added on-disk cache for compiled kernels (#287) 2021-09-18 22:48:26 -07:00
Philippe Tillet
94c83d30ce [GENERAL] Removed deprecated driver files and added basic compatibility with rocm (#268)
- Removed driver module -- accelerator runtime is handled by pytorch
- Added basic support for ROCM based on @micmelesse 's PR -- now can execute empty kernel on AMD devices without any compile-time changes
- Now only using PREFER_SHARED for kernels when the size of shared memory is greater than 49k. Otherwise there can be poor L1 performance for broadcast tensors
2021-09-09 00:04:28 -07:00