Philippe Tillet
|
b1d81a5802
|
more work on heuristics
|
2019-07-21 18:11:54 -07:00 |
|
Philippe Tillet
|
d159455f7b
|
[codegen/alignment_info] better alignment information
|
2019-07-20 21:44:18 -07:00 |
|
Philippe Tillet
|
28c250216c
|
[dnn/gemm] added some bounds checking
|
2019-07-19 21:32:55 -07:00 |
|
Philippe Tillet
|
5215fb0424
|
[codegen] some more optimizations
|
2019-07-19 20:29:03 -07:00 |
|
Philippe Tillet
|
f0d8306437
|
[codegen/alignment_info] better handling of constants
|
2019-07-18 16:12:06 -07:00 |
|
Philippe Tillet
|
86f70f8224
|
[codegen/selection] performance fix-up when A is transposed for hmma
|
2019-07-17 21:46:23 -07:00 |
|
Philippe Tillet
|
2f0817b2cd
|
[codegen/selection] tensor cores now used for transposed layouts
|
2019-07-17 17:20:38 -07:00 |
|
Philippe Tillet
|
bfa39b8992
|
preparing the field for tensor cores transposes
|
2019-07-17 13:20:33 -07:00 |
|
Philippe Tillet
|
791c91ee63
|
[dnn/shift] bugfix in static shape division
|
2019-07-17 11:39:17 -07:00 |
|
Philippe Tillet
|
a55b098e88
|
[dnn/shift] now using constant divisions
|
2019-07-16 21:05:21 -07:00 |
|
Philippe Tillet
|
07c964919c
|
[dnn/shift] now strictly only shifting the interior
|
2019-07-16 20:18:48 -07:00 |
|
Philippe Tillet
|
ec24e1e7df
|
trying to remove interior logic
|
2019-07-16 18:47:50 -07:00 |
|
Philippe Tillet
|
5f6dd23fc2
|
[dnn/dot] reverted back to peak tensorcores performance
|
2019-07-16 16:14:58 -07:00 |
|
Philippe Tillet
|
28959fe165
|
[runtime/jit] made auto-tuning silent
|
2019-07-16 14:41:38 -07:00 |
|
Philippe Tillet
|
f9db0449b7
|
[dnn] Adding batchnorm
|
2019-07-08 18:44:37 -07:00 |
|
Philippe Tillet
|
8fc253946c
|
[codegen] shift: added sketch for shift-convolution backpropagation
|
2019-07-02 16:39:07 -07:00 |
|
Philippe Tillet
|
6cfb575d29
|
[lang] fixup in cast type
|
2019-06-30 17:43:18 -07:00 |
|
Philippe Tillet
|
c172bd518b
|
more stuff
|
2019-06-30 16:55:02 -07:00 |
|
Philippe Tillet
|
d8c3d58593
|
more optimization
|
2019-06-28 20:22:52 -07:00 |
|
Philippe Tillet
|
f4dedb522c
|
fixup
|
2019-06-27 17:05:48 -07:00 |
|
Philippe Tillet
|
6300ec5080
|
[examples] added conv2d op in tensorflow
|
2019-06-26 18:50:53 -07:00 |
|
Philippe Tillet
|
25e9a10917
|
changed auto-tuner parameter ranges
|
2019-06-25 19:27:49 -07:00 |
|
Philippe Tillet
|
d945ce5e1b
|
Now showing valid parameter for NN
|
2019-06-25 19:18:43 -07:00 |
|
Philippe Tillet
|
64513fb407
|
[codegen] added fallback when tensor cores cannot be used
|
2019-06-25 15:49:58 -07:00 |
|
Philippe Tillet
|
06b5992509
|
[feature] added basic tensor core support
|
2019-06-11 10:24:49 -07:00 |
|
Philippe Tillet
|
6045209d5b
|
Now find correct tuning configuration
|
2019-06-06 20:13:26 -07:00 |
|
Philippe Tillet
|
0a0b48e9a2
|
adding hmma tuning parameters
|
2019-06-06 19:51:02 -07:00 |
|
Philippe Tillet
|
81eba3e1ec
|
ugh
|
2019-06-06 19:36:41 -07:00 |
|
Philippe Tillet
|
cdf5a0d011
|
[codegen/tune]: added fragmentation types
|
2019-06-06 16:48:32 -07:00 |
|
Philippe Tillet
|
30833c18f1
|
[codegen/tune] bugfix in heuristics for nano-tile sizes
|
2019-05-04 01:32:34 -04:00 |
|
Philippe Tillet
|
3413aad582
|
[general] major overhaul of triton-c/triton-ir/triton-jit:
- Added alloc const
- Added atomics
- Pruning tuning space
- Added example for dot/conv/shift
- Bugfixes
|
2019-04-25 16:18:15 -04:00 |
|
Philippe Tillet
|
0c607c9392
|
[examples] normalize benchmark by max_clock / current_clock
|
2019-03-28 07:58:37 -04:00 |
|
Philippe Tillet
|
2c3ae0675e
|
[JIT] re-added nvidia compatibility
|
2019-03-27 21:12:01 -04:00 |
|
Philippe Tillet
|
fdf8559806
|
[general] added missing files
|
2019-03-27 20:01:35 -04:00 |
|
Philippe Tillet
|
bc2a257d5c
|
[code generation] more flexibility in backend selection
|
2019-03-27 11:29:42 -07:00 |
|
Philippe Tillet
|
e04253c0dd
|
[code generation] basic CPU backend
|
2019-03-27 11:13:36 -07:00 |
|
Philippe Tillet
|
8d35c98920
|
[code generation] search space pruning
|
2019-03-25 14:10:24 -07:00 |
|
Philippe Tillet
|
b73c3bdd25
|
[examples] removed dependency on isaac for auto-tuning
|
2019-03-11 22:22:43 -04:00 |
|
Philippe Tillet
|
87c85ed50d
|
[code generation] reparameterization
|
2019-03-11 19:30:21 -04:00 |
|
Philippe Tillet
|
614f83baee
|
[jit] basic auto-tuning support
|
2019-03-11 12:00:50 -04:00 |
|
Philippe Tillet
|
94e315ea8a
|
Reparameterized in terms of micro- and nano- tiles
|
2019-03-10 23:10:17 -04:00 |
|
Philippe Tillet
|
c96a263896
|
[jit] changed default metaparameter ranges
|
2019-03-10 10:45:21 -04:00 |
|
Philippe Tillet
|
9a3537662d
|
[jit] can now infer launch parameters from triton module
|
2019-03-09 14:44:13 -05:00 |
|
Philippe Tillet
|
b721202812
|
[code generation] uniformized shape and layout metaparameters
|
2019-03-09 12:31:21 -05:00 |
|
Philippe Tillet
|
5f29263044
|
[code generation] now using ir::metaparameter* for all tunable metaparameters
|
2019-03-09 12:05:12 -05:00 |
|
Philippe Tillet
|
36acf22fd3
|
better masking
|
2019-02-28 23:46:11 -05:00 |
|
Philippe Tillet
|
daa828ec18
|
[general] rename namespace tdl -> triton
|
2019-02-24 14:35:16 -05:00 |
|
Philippe Tillet
|
6b49818282
|
[filesystem] rename tdl -> triton
|
2019-02-24 14:20:40 -05:00 |
|
Philippe Tillet
|
8f4798b81a
|
[intermediate representation] transitioning towards more flexible tile shapes
|
2019-02-23 11:37:01 -05:00 |
|
Philippe Tillet
|
7cda55df16
|
[code generation] implements hidden operands in user (e.g., mask)
|
2019-02-21 18:00:27 -05:00 |
|