Philippe Tillet
|
5efdb7978e
|
more improvements and regressions
|
2019-08-06 16:21:20 -07:00 |
|
Philippe Tillet
|
26c9849462
|
[ir][instructions] added permutations option for trans
|
2019-08-05 21:19:13 -07:00 |
|
Philippe Tillet
|
d62e581ab3
|
basic split-k across warps working for GEMM
|
2019-08-05 19:33:28 -07:00 |
|
Philippe Tillet
|
d869d9a924
|
[codegen][selection] more flexible instruction selection for reduce_inst
|
2019-08-04 16:34:36 -07:00 |
|
Philippe Tillet
|
d9945692a9
|
[dnn] better specification of recompilation key
|
2019-08-02 17:42:48 -07:00 |
|
Philippe Tillet
|
3b92ddf7e6
|
[codegen/reassociation] now recursively takes pointer arguments into account as well
|
2019-07-31 18:41:56 -07:00 |
|
Philippe Tillet
|
f7bd976fc7
|
[dnn/blocksparse] added heuristics for block-sparse dot
|
2019-07-31 17:12:36 -07:00 |
|
Philippe Tillet
|
bb32ac56c9
|
[codegen/optimize_dce.cpp] fixed bugs whereby barriers were removed by DCE
|
2019-07-31 15:11:10 -07:00 |
|
Philippe Tillet
|
5af7e5adac
|
Made sure it works for FP16
|
2019-07-30 20:02:16 -07:00 |
|
Philippe Tillet
|
080bf1af88
|
[dnn/blocksparse/dot]: BlocksparseDx also working
|
2019-07-30 11:42:31 -07:00 |
|
Philippe Tillet
|
dc11f70fad
|
[dnn/blocksparse] FPROP test passes!
|
2019-07-29 17:06:20 -07:00 |
|
Philippe Tillet
|
17cb2db356
|
[dnn/blocksparse/dot] prototype version seems to pass basic test
|
2019-07-27 21:21:36 -07:00 |
|
Philippe Tillet
|
2a377bc8b1
|
[ir] deleted mask/merge instructions; will be replaced by masked_load/store and select
|
2019-07-25 15:06:15 -07:00 |
|
Philippe Tillet
|
38b3771c26
|
some reassociation
|
2019-07-23 14:43:18 -07:00 |
|
Philippe Tillet
|
c448876178
|
better benchmarking
|
2019-07-22 19:26:12 -07:00 |
|
Philippe Tillet
|
ead368d1ed
|
[general] a bunch of fixes in anticipation of proper triton vs cudnn benchmarks
* DNN: Added partial auto-tuning mode and skeleton for heuristics
* Examples: Modularized benchmarking and now evaluating ResNet-18 shapes
|
2019-07-21 20:17:56 -07:00 |
|
Philippe Tillet
|
b1d81a5802
|
more work on heuristics
|
2019-07-21 18:11:54 -07:00 |
|
Philippe Tillet
|
484e3871cf
|
[dnn/shift] added base pointer for a, b
|
2019-07-20 23:00:27 -07:00 |
|
Philippe Tillet
|
d159455f7b
|
[codegen/alignment_info] better alignment information
|
2019-07-20 21:44:18 -07:00 |
|
Philippe Tillet
|
28c250216c
|
[dnn/gemm] added some bounds checking
|
2019-07-19 21:32:55 -07:00 |
|
Philippe Tillet
|
5215fb0424
|
[codegen] some more optimizations
|
2019-07-19 20:29:03 -07:00 |
|
Philippe Tillet
|
71594da66f
|
[dnn/gemm]: fixed leading dimension in transposed variants
|
2019-07-18 16:35:48 -07:00 |
|
Philippe Tillet
|
f0d8306437
|
[codegen/alignment_info] better handling of constants
|
2019-07-18 16:12:06 -07:00 |
|
Philippe Tillet
|
791c91ee63
|
[dnn/shift] bugfix in static shape division
|
2019-07-17 11:39:17 -07:00 |
|
Philippe Tillet
|
a55b098e88
|
[dnn/shift] now using constant divisions
|
2019-07-16 21:05:21 -07:00 |
|
Philippe Tillet
|
07c964919c
|
[dnn/shift] now strictly only shifting the interior
|
2019-07-16 20:18:48 -07:00 |
|
Philippe Tillet
|
ec24e1e7df
|
trying to remove interior logic
|
2019-07-16 18:47:50 -07:00 |
|
Philippe Tillet
|
5f6dd23fc2
|
[dnn/dot] reverted back to peak tensorcores performance
|
2019-07-16 16:14:58 -07:00 |
|
Philippe Tillet
|
164d85077f
|
more stuff
|
2019-07-16 15:03:53 -07:00 |
|
Philippe Tillet
|
7d1797cd32
|
ugh
|
2019-07-16 12:59:27 -07:00 |
|
Philippe Tillet
|
f50d7a420a
|
[runtime/jit] fixed bug in multi-threaded auto-tuning
|
2019-07-15 21:16:50 -07:00 |
|
Philippe Tillet
|
aa8bcf6bde
|
[dnn/shift] added split-k for shift-conv
|
2019-07-15 21:03:58 -07:00 |
|
Philippe Tillet
|
434f65737f
|
[runtime] put jit::launch_info in another file
|
2019-07-15 12:35:53 -07:00 |
|
Philippe Tillet
|
3e7a3ed67a
|
[dnn/shift]: added support for fp16
|
2019-07-13 21:05:34 -07:00 |
|
Philippe Tillet
|
fe42cb7142
|
[dnn/shift] optimizations for NCHW layout
|
2019-07-12 20:22:32 -07:00 |
|
Philippe Tillet
|
54617b4e51
|
some cleaning
|
2019-07-12 20:10:15 -07:00 |
|
Philippe Tillet
|
7512c7ebed
|
some cleaning
|
2019-07-12 20:03:05 -07:00 |
|
Philippe Tillet
|
c1c7062914
|
blabla
|
2019-07-12 17:42:29 -07:00 |
|
Philippe Tillet
|
f36a646ffc
|
[dnn/shift-conv] added and tested NCHW layout
|
2019-07-11 21:00:33 -07:00 |
|
Philippe Tillet
|
fe8caf12f0
|
[dnn/conv]: skeleton for NCHW layout
|
2019-07-11 20:34:38 -07:00 |
|
Philippe Tillet
|
207e021973
|
[codegen/shift] substantial cleaning of triton-c shift-conv code
|
2019-07-11 20:11:23 -07:00 |
|
Philippe Tillet
|
75cf2df110
|
[dnn/shift] many bugfixes in strided shift-conv
|
2019-07-10 19:49:31 -07:00 |
|
Philippe Tillet
|
4ca83f1935
|
ugh bug in shift-conv striding
|
2019-07-10 17:00:22 -07:00 |
|
Philippe Tillet
|
f665c742f9
|
testing a simple shiftnet
|
2019-07-10 13:33:08 -07:00 |
|
Philippe Tillet
|
b7986baffa
|
[dnn]: Now implementing all existing DNN routines using common base template and auto-tuner
|
2019-07-09 19:52:55 -07:00 |
|
Philippe Tillet
|
88675fa01a
|
[dnn] added base template class for shared auto-tuning
|
2019-07-09 16:09:34 -07:00 |
|
Philippe Tillet
|
066ae338f1
|
[dnn/shift]: added stride to shift
|
2019-07-09 14:08:51 -07:00 |
|
Philippe Tillet
|
cc41604784
|
[codegen/batchnorm] forward and backward now seemingly working
|
2019-07-09 13:03:16 -07:00 |
|
Philippe Tillet
|
f74dcb7e30
|
[dnn/batchnorm]: added some more code in Triton-C batchnorm implementations
|
2019-07-08 20:18:20 -07:00 |
|
Philippe Tillet
|
fa3270dcf2
|
[codegen/selection] bugfix in code generation for reduction instructions
|
2019-07-08 18:53:37 -07:00 |
|