[IPOL discuss] use of Eigen for PCA/SVD computation in demo blm_color_dimensional_filtering

Mon May 23 18:49:44 CEST 2011

Nicolas Limare <nicolas.limare at cmla.ens-cachan.fr>:

> CPU optimizations will include vector instructions: for example
> the SSE instruction "mulss" will multiply 4 floats with 4 other floats
> in a single clock cycle, and the recent CPUs (SSE4 generation) have a
> dot product instruction. The CPU cache size and pipeline management
> can also be improved in vendor implementations.

We should run some experiments to compare performance before I'm
convinced.  But in my view, choosing SSE instructions should be the
task of the compiler, not of an optimized library.  There is no reason
to think that a trivial two-line C program that computes the dot
product cannot be as fast as a fancy library.  The compiler should be
fancy enough to select the appropriate instructions by itself.
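
For concreteness, here is a minimal sketch of the kind of trivial
dot product meant above.  The function name "dot" and its signature
are illustrative assumptions, not code from the demo, and nothing
here has been benchmarked against Eigen or a vendor library.

```c
#include <stddef.h>

/* Hypothetical helper (name and signature are illustrative only):
 * plain dot product of two float arrays of length n. */
float dot(const float *a, const float *b, size_t n)
{
    float sum = 0.0f;
    /* The "two-line" loop: no intrinsics, no hand-written SSE.
     * Instruction selection is left entirely to the compiler. */
    for (size_t i = 0; i < n; i++)
        sum += a[i] * b[i];
    return sum;
}
```

One way to check what the compiler does with it is to build with
something like "gcc -O3 -S" and look for packed instructions such as
mulps and addps in the assembly.  As far as I know, GCC only
vectorizes a floating-point sum like this one when it is allowed to
reorder the additions (for example with -ffast-math), so the flags
matter for any comparison.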