Philippe Tillet
|
7e0af2118c
|
[codegen] worked around bug seemingly from nvptx/ptxas by simplifying multiplications by 1:
- Generated LLVM-IR looked correct
- Illegal addressing disappeared when running cuda-memcheck
- Illegal addressing disappeared when using nvptx-short-pointer
|
2019-08-30 16:45:14 -07:00 |
|
Philippe Tillet
|
37cbcfabd0
|
[examples] back to 96 TFLOPS on V100
|
2019-08-26 22:49:14 -07:00 |
|
Philippe Tillet
|
b4ae06a714
|
tracking down performance regression
|
2019-08-26 20:38:39 -07:00 |
|
Philippe Tillet
|
4075949f80
|
[python] basic tensorflow wrapper working
|
2019-08-26 16:53:49 -07:00 |
|
Philippe Tillet
|
321d268a4a
|
more progress
|
2019-08-25 21:26:09 -07:00 |
|
Philippe Tillet
|
732156b942
|
[general] rename *.cpp -> *.cc
|
2019-08-23 19:06:39 -07:00 |
|
Philippe Tillet
|
a110a7e8cf
|
[ir] changed type of tile shapes from constant_int* to int
|
2019-08-23 17:49:21 -07:00 |
|
Philippe Tillet
|
0970fe12dd
|
[general] cleaned tensorflow source code generation
|
2019-08-18 15:39:36 -07:00 |
|
Philippe Tillet
|
457c330f15
|
more cleaning
|
2019-08-18 14:20:42 -07:00 |
|
Philippe Tillet
|
c787ebae68
|
more cleaning
|
2019-08-18 14:09:55 -07:00 |
|
Philippe Tillet
|
81571246cf
|
[general] fixed some warnings
|
2019-08-18 14:08:57 -07:00 |
|
Philippe Tillet
|
b4a9ed9663
|
[python] added basic tensorflow support
|
2019-08-17 18:18:26 -07:00 |
|
Philippe Tillet
|
c7cb5f82ad
|
[general] removed LLVM #include's in all Triton headers
|
2019-08-16 15:56:58 -07:00 |
|
Philippe Tillet
|
38a8b0ab19
|
[runtime] overhaul of the run-time API
|
2019-08-14 20:26:11 -07:00 |
|
Philippe Tillet
|
1400d960a6
|
[auto-tuning] much smaller parameter space
|
2019-08-12 21:15:21 -07:00 |
|
Philippe Tillet
|
fd49cdc92b
|
[dnn][blocksparse] added dw code
|
2019-08-08 19:15:35 -07:00 |
|
Philippe Tillet
|
7578c27d3d
|
[general][filesystem] added structure and namespace to code generation files
|
2019-08-07 21:17:17 -07:00 |
|