Philippe Tillet
|
006d0f13de
|
Packaging: polished
|
2015-04-30 00:46:42 -04:00 |
|
Philippe Tillet
|
5ef01f041a
|
Python: Refactored wrapper
|
2015-04-29 17:48:57 -04:00 |
|
Philippe Tillet
|
a67476671d
|
Bench: Removed warnings in bench-blas when no external blas is defined
|
2015-04-29 16:11:32 -04:00 |
|
Philippe Tillet
|
cf5028d55b
|
Squashed feature branch:
* Added CUDA support
* Performance improvements
* API improvements
* Added "depth" parameter to GEMM
* Android cross-compilation
|
2015-04-29 15:52:21 -04:00 |
|
Philippe Tillet
|
5ff16bfcb6
|
Added cublas sgemm
|
2015-02-13 04:31:42 -05:00 |
|
Philippe Tillet
|
e453031094
|
More efficient access pattern in the GEMV kernel
|
2015-02-11 02:06:16 -05:00 |
|
Philippe Tillet
|
85b7eb8b5e
|
Added another parameter to GEMV
|
2015-02-10 16:33:38 -05:00 |
|
Philippe Tillet
|
37fc98c532
|
Fixed bug on matrix-vector products with vectorization
|
2015-02-10 03:09:41 -05:00 |
|
Philippe Tillet
|
a89f6d88be
|
Fix bug in operation-specific tuning
|
2015-02-09 01:58:32 -05:00 |
|
Philippe Tillet
|
7e65601534
|
fixup
|
2015-02-08 23:22:48 -05:00 |
|
Philippe Tillet
|
a6d7671831
|
removing C++11 interface
|
2015-02-08 23:19:38 -05:00 |
|
Philippe Tillet
|
85fb438806
|
More convenient use of specific runtime tuning
|
2015-02-08 14:23:38 -05:00 |
|
Philippe Tillet
|
9c68704f09
|
Now using a list of events instead of a single one
|
2015-02-08 00:56:24 -05:00 |
|
Philippe Tillet
|
b768e913c9
|
Now using events to time autotuning
|
2015-02-06 22:11:03 -05:00 |
|
Philippe
|
385f007c0b
|
Fixed overhead-benchmark
|
2015-02-06 02:00:02 -05:00 |
|
Philippe
|
7fc2348924
|
Fixed CUDA benchmark
|
2015-02-05 23:42:31 -05:00 |
|
Philippe Tillet
|
58fdc5d18e
|
Added FindOpenBlas
|
2015-02-05 23:17:42 -05:00 |
|
Philippe Tillet
|
8f8b01938b
|
Rearranged benchmarking script
|
2015-02-05 23:11:16 -05:00 |
|
Philippe Tillet
|
e214927b16
|
Better control flow through options
|
2015-02-05 04:43:50 -05:00 |
|
Philippe Tillet
|
bbf2f0188e
|
Ported to C++11
|
2015-02-05 04:43:40 -05:00 |
|
Philippe Tillet
|
3a296ae3b7
|
Added a control flow API
|
2015-02-03 15:25:01 -05:00 |
|
Philippe Tillet
|
939ce15b45
|
Cleaner benchmarking code
|
2015-02-02 00:03:48 -05:00 |
|
Philippe Tillet
|
2afc574724
|
Implemented simple operation cache
|
2015-02-01 23:56:05 -05:00 |
|
Philippe Tillet
|
535706f35a
|
Some renaming; lower overhead in benchmark
|
2015-02-01 22:28:49 -05:00 |
|
Philippe Tillet
|
f0bb130416
|
Auto-tuner: Renamed "json_file" to "out"
|
2015-02-01 21:30:45 -05:00 |
|
Philippe Tillet
|
3b61842528
|
Lower overhead in the benchmarking source code
|
2015-02-01 18:59:27 -05:00 |
|
Philippe Tillet
|
b404b687ee
|
Incorporated low-level array representation to store array's parameters
|
2015-02-01 17:15:41 -05:00 |
|
Philippe Tillet
|
3f1fa822f8
|
save
|
2015-02-01 15:58:05 -05:00 |
|
Philippe Tillet
|
b0bf235cc2
|
Reverted strange change on model.cpp
|
2015-01-31 22:10:09 -05:00 |
|
Philippe Tillet
|
d29f1252ad
|
Clearer array_expression with hopefully lower overhead.
Also removed pyc's
|
2015-01-31 22:01:48 -05:00 |
|
Philippe Tillet
|
13ec84fbda
|
Bugfix in benchmark's cmakelists
|
2015-01-29 22:40:41 +01:00 |
|
Philippe Tillet
|
f488274269
|
Fixed relational operators tests
|
2015-01-29 16:01:46 -05:00 |
|
Philippe Tillet
|
d4629ba018
|
Bugfix in cast and relational operators
|
2015-01-29 02:50:51 -05:00 |
|
Philippe Tillet
|
c7665021d1
|
reducing overhead; reverted custom CL/ header because CL/cl.hpp was buggy
|
2015-01-28 23:04:19 -05:00 |
|
Philippe Tillet
|
1246fbe9a8
|
More portable synchronization in blas-bench
|
2015-01-28 20:06:41 -05:00 |
|
Philippe Tillet
|
a6c513014f
|
Silly bugfix in cublas saxpy
|
2015-01-28 19:54:36 -05:00 |
|
Philippe Tillet
|
04cec21752
|
Fixed warnings and compilation for pyatidlas
|
2015-01-28 19:50:47 -05:00 |
|
Philippe Tillet
|
e059178759
|
Bugfix
|
2015-01-28 19:25:51 -05:00 |
|
Philippe Tillet
|
41764e0429
|
Setting release type by default for the lib
|
2015-01-28 19:22:48 -05:00 |
|
Philippe Tillet
|
510d9293ca
|
Now shipping CL folder
|
2015-01-28 19:19:09 -05:00 |
|
Philippe Tillet
|
0dcf4d3617
|
Better exception handling; lowered CMake requirement; blas-bench now benchmarks square matrices
|
2015-01-28 17:08:39 -05:00 |
|
Philippe Tillet
|
736a441eb1
|
Using 1.1 APIs in cl.hpp
|
2015-01-27 16:24:48 -05:00 |
|
Philippe Tillet
|
c37d8a2a81
|
Now using system CL include
|
2015-01-27 16:19:50 -05:00 |
|
Philippe Tillet
|
53c9bef85d
|
Reverted cl.hpp update
|
2015-01-27 15:50:30 -05:00 |
|
Philippe Tillet
|
433a661d5e
|
Updated cl.hpp
|
2015-01-27 15:48:38 -05:00 |
|
Philippe Tillet
|
46836753aa
|
Bugfix in building cuda executable
|
2015-01-27 21:42:52 +01:00 |
|
Philippe Tillet
|
c12ec4cebd
|
tentative cuda benchmark integration
x
|
2015-01-27 15:32:59 -05:00 |
|
Philippe Tillet
|
be006268d7
|
More robust build system
|
2015-01-27 15:12:08 -05:00 |
|
Philippe Tillet
|
c13059d69c
|
Added missing file
|
2015-01-27 13:07:26 -05:00 |
|
Philippe Tillet
|
2a249d26c6
|
Added postinstall script
|
2015-01-27 13:06:25 -05:00 |
|