Philippe Tillet
|
feeb1e9862
|
Feature: Merged kernel-fusion branch
* Fuses multiple AXPY kernels
* Possibility to add thread-wise for loops in AXPY-like kernels
|
2015-09-30 15:31:41 -04:00 |
|
Philippe Tillet
|
10524ebdee
|
CUDA: various improvements
|
2015-08-24 17:03:31 -04:00 |
|
Philippe Tillet
|
155554f5cf
|
Code quality: added clBLAS.def and some ISAACAPI
|
2015-07-21 23:48:50 -07:00 |
|
U-AMR\ptillet
|
8879a867d8
|
Code quality: fixed compilation errors on MSVC
|
2015-07-20 18:05:31 -07:00 |
|
Philippe Tillet
|
cbe930398e
|
Code quality: Cleaned up directories, variable names and MinGW compliance
|
2015-07-21 13:29:23 -04:00 |
|
Philippe Tillet
|
9d0d50ba05
|
Backend: Fixed alpha, beta in GEMM.
|
2015-06-29 21:52:50 -07:00 |
|
Philippe Tillet
|
f55e499ef5
|
C++: added support for [unsigned] long long
|
2015-05-04 23:54:43 -04:00 |
|
Philippe Tillet
|
278109eef8
|
C++: Now using standard C++ types instead of stdint
|
2015-05-04 21:23:05 -04:00 |
|
Philippe Tillet
|
5cdbef7b4e
|
C++: in value_scalar replaced cl types by stdint types
|
2015-05-04 19:05:32 -04:00 |
|
Philippe Tillet
|
cf5028d55b
|
Squashed feature branch:
* Added CUDA support
* Performance improvements
* API improvements
* Added "depth" parameter to GEMM
* Android cross-compilation
|
2015-04-29 15:52:21 -04:00 |
|