Philippe Tillet
|
0e207e7ca4
|
Backend: Now not creating a temporary upon C = alpha*dot(op(A), op(B)) + beta*C
|
2015-06-27 17:55:01 -07:00 |
|
Philippe
|
743a559f76
|
Backend: Bugfix in GEMM bound-checking
|
2015-06-27 13:14:46 -04:00 |
|
Philippe Tillet
|
80bcbd095f
|
C++: Some renaming; added possibility to pass buffers when constructing arrays
|
2015-06-23 09:38:34 -07:00 |
|
Philippe Tillet
|
05e730f06e
|
CUDA: Many fixes in the backend
|
2015-05-13 02:26:38 -04:00 |
|
Philippe Tillet
|
3b983cf32f
|
CMake: some cleaning
|
2015-05-03 17:41:23 -04:00 |
|
Philippe Tillet
|
cf5028d55b
|
Squashed feature branch:
* Added CUDA support
* Performance improvements
* API improvements
* Added "depth" parameter to GEMM
* Android cross-compilation
|
2015-04-29 15:52:21 -04:00 |
|
Philippe Tillet
|
e453031094
|
More efficient access pattern in the GEMV kernel
|
2015-02-11 02:06:16 -05:00 |
|
Philippe Tillet
|
85b7eb8b5e
|
Added another parameter to GEMV
|
2015-02-10 16:33:38 -05:00 |
|
Philippe Tillet
|
37fc98c532
|
Fixed bug on matrix-vector products with vectorization
|
2015-02-10 03:09:41 -05:00 |
|
Philippe Tillet
|
a6d7671831
|
removing C++11 interface
|
2015-02-08 23:19:38 -05:00 |
|
Philippe Tillet
|
e214927b16
|
Better control flow through options
|
2015-02-05 04:43:50 -05:00 |
|
Philippe Tillet
|
bbf2f0188e
|
Ported to C++11
|
2015-02-05 04:43:40 -05:00 |
|
Philippe Tillet
|
2afc574724
|
Implemented simple operation cache
|
2015-02-01 23:56:05 -05:00 |
|
Philippe Tillet
|
535706f35a
|
Some renaming; lower overhead in benchmark
|
2015-02-01 22:28:49 -05:00 |
|
Philippe Tillet
|
d29f1252ad
|
Clearer array_expression with hopefully lower overhead.
Also removed pyc's
|
2015-01-31 22:01:48 -05:00 |
|
Philippe Tillet
|
f488274269
|
Fixed relational operators tests
|
2015-01-29 16:01:46 -05:00 |
|
Philippe Tillet
|
d4629ba018
|
Bugfix in cast and relational operators
|
2015-01-29 02:50:51 -05:00 |
|
Philippe Tillet
|
c7665021d1
|
reducing overhead; reverted custom CL/ header because CL/cl.hpp was buggy
|
2015-01-28 23:04:19 -05:00 |
|
Philippe Tillet
|
0dcf4d3617
|
Better exception handling, lowered CMake requirement; blas-bench now benchmarks square matrices
|
2015-01-28 17:08:39 -05:00 |
|
Philippe Tillet
|
c37d8a2a81
|
Now using system CL include
|
2015-01-27 16:19:50 -05:00 |
|
Philippe Tillet
|
a96c897cb3
|
Various fixes
|
2015-01-27 02:41:27 -05:00 |
|
Philippe Tillet
|
4a9e16fefd
|
various bugfixes
|
2015-01-25 01:08:18 -05:00 |
|
Philippe Tillet
|
e74563070a
|
API enhancement
|
2015-01-20 11:17:42 -05:00 |
|
Philippe Tillet
|
4f73fb384f
|
More flexibility in scalars
|
2015-01-19 21:29:47 -05:00 |
|
Philippe Tillet
|
6dd3b20ace
|
Bugfix in bound-checking for fallback GEMM
|
2015-01-19 14:37:33 -05:00 |
|
Philippe Tillet
|
edaa821d93
|
low level representation of array
|
2015-01-18 16:53:34 -05:00 |
|
Philippe Tillet
|
16648f18e0
|
various changes
|
2015-01-17 15:47:52 -05:00 |
|
Philippe Tillet
|
0068560bc6
|
Some cleaning + outer product
|
2015-01-17 10:49:36 -05:00 |
|
Philippe Tillet
|
1d70396711
|
Adding diag
|
2015-01-16 20:06:08 -05:00 |
|
Philippe Tillet
|
faa3974f3c
|
Fixed some warnings
|
2015-01-16 07:38:26 -05:00 |
|
Philippe Tillet
|
69311b7982
|
Now ATIDLAS is standalone. Everything dynamic....
|
2015-01-12 13:24:06 -05:00 |
|