Philippe Tillet
|
37cbcfabd0
|
[examples] back to 96 TFLOPS on V100
|
2019-08-26 22:49:14 -07:00 |
|
Philippe Tillet
|
b4ae06a714
|
tracking down performance regression
|
2019-08-26 20:38:39 -07:00 |
|
Philippe Tillet
|
4075949f80
|
[python] basic tensorflow wrapper working
|
2019-08-26 16:53:49 -07:00 |
|
Philippe Tillet
|
321d268a4a
|
more progress
|
2019-08-25 21:26:09 -07:00 |
|
Philippe Tillet
|
96b4d5e411
|
[examples] multiple transposition schemes now supported
|
2019-08-24 13:08:38 -07:00 |
|
Philippe Tillet
|
732156b942
|
[general] rename *.cpp -> *.cc
|
2019-08-23 19:06:39 -07:00 |
|
Philippe Tillet
|
f98b0b8e2a
|
[general] deleted the old compiler frontend
|
2019-08-23 17:28:02 -07:00 |
|
Philippe Tillet
|
8798d240dc
|
matmul test passes
|
2019-08-23 17:13:30 -07:00 |
|
Philippe Tillet
|
845c0e5b93
|
adding tunable parameters
|
2019-08-22 19:21:01 -07:00 |
|
Philippe Tillet
|
87072203c1
|
[codegen] triton-ir code generation does not crash
|
2019-08-22 17:27:10 -07:00 |
|
Philippe Tillet
|
a6ec807223
|
more debugging
|
2019-08-21 21:53:41 -07:00 |
|
Philippe Tillet
|
bc11e31419
|
[lang] more progress on parser
|
2019-08-19 20:56:39 -07:00 |
|
Philippe Tillet
|
0970fe12dd
|
[general] cleaned tensorflow source code generation
|
2019-08-18 15:39:36 -07:00 |
|
Philippe Tillet
|
457c330f15
|
more cleaning
|
2019-08-18 14:20:42 -07:00 |
|
Philippe Tillet
|
81571246cf
|
[general] fixed some warnings
|
2019-08-18 14:08:57 -07:00 |
|
Philippe Tillet
|
c05445d001
|
[general] removed dnn/ module and runtime/jit.cpp
|
2019-08-18 00:41:05 -07:00 |
|
Philippe Tillet
|
4de22df930
|
[python] added skeleton for python interface
|
2019-08-15 20:50:10 -07:00 |
|
Philippe Tillet
|
3ece461ce2
|
added tensorflow code generator
|
2019-08-15 15:59:53 -07:00 |
|
Philippe Tillet
|
38a8b0ab19
|
[runtime] overhaul of the run-time API
|
2019-08-14 20:26:11 -07:00 |
|
Philippe Tillet
|
1400d960a6
|
[auto-tuning] much smaller parameters space
|
2019-08-12 21:15:21 -07:00 |
|
Philippe Tillet
|
fd49cdc92b
|
[dnn][blocksparse] added dw code
|
2019-08-08 19:15:35 -07:00 |
|
Philippe Tillet
|
7578c27d3d
|
[general][filesystem] added structure and namespace to code generation files
|
2019-08-07 21:17:17 -07:00 |
|
Philippe Tillet
|
392b55280d
|
[codegen] some cleaning for batched matmul
|
2019-08-07 21:17:17 -07:00 |
|
Philippe Tillet
|
7b75b68edc
|
dirty but working warp-splitting
|
2019-08-06 21:07:13 -07:00 |
|
Philippe Tillet
|
cf256a636c
|
fixup
|
2019-08-06 16:44:16 -07:00 |
|
Philippe Tillet
|
5efdb7978e
|
more improvements and regressions
|
2019-08-06 16:21:20 -07:00 |
|
Philippe Tillet
|
d869d9a924
|
[codegen][selection] more flexible instruction selection for reduce_inst
|
2019-08-04 16:34:36 -07:00 |
|
Philippe Tillet
|
6be532c6a2
|
[codegen][selection] adding support for reduction along arbitrary axis
|
2019-08-02 21:29:36 -07:00 |
|
Philippe Tillet
|
d9945692a9
|
[dnn] better specification of recompilation key
|
2019-08-02 17:42:48 -07:00 |
|
Philippe Tillet
|
f7bd976fc7
|
[dnn/blocksparse] added heuristics for block-sparse dot
|
2019-07-31 17:12:36 -07:00 |
|
Philippe Tillet
|
5af7e5adac
|
Made sure it works for FP16
|
2019-07-30 20:02:16 -07:00 |
|
Philippe Tillet
|
17cb2db356
|
[dnn/blocksparse/dot] prototype version seems to pass basic test
|
2019-07-27 21:21:36 -07:00 |
|
Philippe Tillet
|
c448876178
|
better benchmarking
|
2019-07-22 19:26:12 -07:00 |
|
Philippe Tillet
|
ead368d1ed
|
[general] a bunch of fixes in anticipation of proper triton vs cudnn benchmarks
* DNN: Added partial auto-tuning mode and skeleton for heuristics
* Examples: Modularized benchmarking and now evaluating ResNet-18 shapes
|
2019-07-21 20:17:56 -07:00 |
|
Philippe Tillet
|
b1d81a5802
|
more work on heuristics
|
2019-07-21 18:11:54 -07:00 |
|
Philippe Tillet
|
28c250216c
|
[dnn/gemm] added some bounds checking
|
2019-07-19 21:32:55 -07:00 |
|
Philippe Tillet
|
5215fb0424
|
[codegen] some more optimizations
|
2019-07-19 20:29:03 -07:00 |
|
Philippe Tillet
|
f0d8306437
|
[codegen/alignment_info] better handling of constants
|
2019-07-18 16:12:06 -07:00 |
|
Philippe Tillet
|
bfa39b8992
|
preparing the field for tensor cores transposes
|
2019-07-17 13:20:33 -07:00 |
|
Philippe Tillet
|
a55b098e88
|
[dnn/shift] now using constant divisions
|
2019-07-16 21:05:21 -07:00 |
|
Philippe Tillet
|
07c964919c
|
[dnn/shift] now strictly only shifting the interior
|
2019-07-16 20:18:48 -07:00 |
|
Philippe Tillet
|
164d85077f
|
more stuff
|
2019-07-16 15:03:53 -07:00 |
|
Philippe Tillet
|
28959fe165
|
[runtime/jit] made auto-tuning silent
|
2019-07-16 14:41:38 -07:00 |
|
Philippe Tillet
|
7d1797cd32
|
ugh
|
2019-07-16 12:59:27 -07:00 |
|
Philippe Tillet
|
f50d7a420a
|
[runtime/jit] fixed bug in multi-threaded auto-tuning
|
2019-07-15 21:16:50 -07:00 |
|
Philippe Tillet
|
aa8bcf6bde
|
[dnn/shift] added split-k for shift-conv
|
2019-07-15 21:03:58 -07:00 |
|
Philippe Tillet
|
434f65737f
|
[runtime] put jit::launch_info in another file
|
2019-07-15 12:35:53 -07:00 |
|
Philippe Tillet
|
3c128fc2e2
|
[jit/autotune] added support for multi-threaded auto-tuning
|
2019-07-14 22:31:30 -07:00 |
|
Philippe Tillet
|
3e7a3ed67a
|
[dnn/shift]: added support for fp16
|
2019-07-13 21:05:34 -07:00 |
|
Philippe Tillet
|
1d88f0a36b
|
stuff
|
2019-07-03 19:25:16 -07:00 |
|