Philippe Tillet
|
e2cdb88338
|
Core: included bugfixes from the SVD branch
|
2015-11-19 12:37:18 -05:00 |
|
Philippe Tillet
|
feeb1e9862
|
Feature: Merged kernel-fusion branch
* Fuses multiple AXPY kernels
* Possibility to add thread-wise for loops in AXPY-like kernels
|
2015-09-30 15:31:41 -04:00 |
|
Philippe Tillet
|
ff4cf94df7
|
Code quality: significant cleaning of namespaces, etc.
|
2015-08-12 00:47:58 -07:00 |
|
Philippe Tillet
|
5a8cfede45
|
Code quality: renamed model/ to database/
|
2015-08-11 20:18:39 -07:00 |
|
Philippe Tillet
|
db090d7942
|
Code quality: Large clean-up of the codebase and especially of the include/ folder
|
2015-08-06 12:05:12 -07:00 |
|
Philippe Tillet
|
df2d5e7d00
|
Models: cleaning of the global caching mechanism
|
2015-08-04 10:06:52 -07:00 |
|
Philippe Tillet
|
9c15debf8b
|
Code quality: removed tools::shared_ptr<>
|
2015-07-28 15:26:10 -07:00 |
|
Philippe Tillet
|
89ee015f7f
|
General: Bugfixes here and there
|
2015-07-27 11:37:19 -07:00 |
|
Philippe Tillet
|
0ef6654c5f
|
Code quality: removed dependencies on the C++ OpenCL wrapper
|
2015-07-26 10:05:16 -07:00 |
|
Philippe Tillet
|
cfa6ea812d
|
Cleaning: Largely renamed templates to BLAS-like names
|
2015-07-11 11:21:15 -04:00 |
|
Philippe Tillet
|
48073dc710
|
C++: improved temporaries handling
|
2015-06-28 00:06:49 -07:00 |
|
Philippe Tillet
|
0e207e7ca4
|
Backend: Now not creating a temporary upon C = alpha*dot(op(A), op(B)) + beta*C
|
2015-06-27 17:55:01 -07:00 |
|
Philippe
|
4cce9d3efd
|
C: More clBLAS tests
|
2015-06-27 11:44:50 -04:00 |
|
Philippe Tillet
|
e1506097b2
|
Python: now removing the build directory while packaging
|
2015-05-04 21:26:27 -04:00 |
|
Philippe Tillet
|
cf5028d55b
|
Squashed feature branch:
* Added CUDA support
* Performance improvements
* API improvements
* Added "depth" parameter to GEMM
* Android cross-compilation
|
2015-04-29 15:52:21 -04:00 |
|
Philippe Tillet
|
a6d7671831
|
removing C++11 interface
|
2015-02-08 23:19:38 -05:00 |
|
Philippe Tillet
|
e214927b16
|
Better control flow through options
|
2015-02-05 04:43:50 -05:00 |
|
Philippe Tillet
|
bbf2f0188e
|
Ported to C++11
|
2015-02-05 04:43:40 -05:00 |
|
Philippe Tillet
|
2afc574724
|
Implemented simple operation cache
|
2015-02-01 23:56:05 -05:00 |
|
Philippe Tillet
|
d29f1252ad
|
Clearer array_expression with hopefully lower overhead.
Also removed .pyc files
|
2015-01-31 22:01:48 -05:00 |
|
Philippe Tillet
|
0dcf4d3617
|
Better exception handling, lowered CMake requirement; blas-bench now benchmarks square matrices
|
2015-01-28 17:08:39 -05:00 |
|
Philippe Tillet
|
c37d8a2a81
|
Now using system CL include
|
2015-01-27 16:19:50 -05:00 |
|
Philippe Tillet
|
4f73fb384f
|
More flexibility in scalars
|
2015-01-19 21:29:47 -05:00 |
|
Philippe Tillet
|
edaa821d93
|
low level representation of array
|
2015-01-18 16:53:34 -05:00 |
|
Philippe Tillet
|
faa3974f3c
|
Fixed some warnings
|
2015-01-16 07:38:26 -05:00 |
|
Philippe Tillet
|
69311b7982
|
Now ATIDLAS is standalone. Everything dynamic...
|
2015-01-12 13:24:06 -05:00 |
|