Commit Graph

6 Commits

Author SHA1 Message Date
Philippe Tillet
7e0af2118c [codegen] worked around a bug, seemingly in nvptx/ptxas, by simplifying multiplications by 1:
- Generated LLVM-IR looked correct
- Illegal addressing disappeared when running cuda-memcheck
- Illegal addressing disappeared when using nvptx-short-pointer
2019-08-30 16:45:14 -07:00
Philippe Tillet
d457482539 [codegen] fixed issue in double buffering pointer update 2019-08-28 17:50:45 -07:00
Philippe Tillet
37cbcfabd0 [examples] back to 96 TFLOPS on V100 2019-08-26 22:49:14 -07:00
Philippe Tillet
b4ae06a714 tracking down performance regression 2019-08-26 20:38:39 -07:00
Philippe Tillet
cb04ec0b3b some more cleaning 2019-08-23 19:22:38 -07:00
Philippe Tillet
732156b942 [general] rename *.cpp -> *.cc 2019-08-23 19:06:39 -07:00