[IPOL discuss] hyperthreading and openmp

Jerome Darbon darbon at cmla.ens-cachan.fr
Mon Dec 5 18:02:49 CET 2011


Hi,
    indeed, hyperthreading does not give you more computing resources;
it essentially allows several pieces of code to run on the same
resources. This is interesting when the code often waits for some
event. A typical application is a server that needs to handle many
(hundreds or thousands of) socket connections over the network: most
of the time, the code is just waiting. Hyperthreading allows the
processor to keep the execution context of a process, so there is no
cost for switching between two processes. However, the available
computing units remain the same, and the processes compete for them.
In scientific computing you do not want this competition, which
worsens performance.

Maybe you have a race condition in your code. A short explanation is 
given here:
http://dsc.sun.com/solaris/articles/cpp_race.html
Hope this helps.

Regards,
jerome

On 12/05/2011 02:55 AM, Nicolas Limare wrote:
> Hi,
>
>> I send this mail to ask for advice or opinion on the default number of
>> threads that an OMP demo should use.  Either as a constant, or as a
>> way to compute this number from the system.
>>
>> I suspect that the best choice is one less than the real number of
>> processors (thus, 15 for the green server).
> There are 32 real CPUs on the green server. Hyperthreading has been
> disabled on the advice of Jérôme, because it is only useful for idle
> processes (such as web servers waiting for a connection), not for busy
> ones. This server has 4 CPUs with 8 cores each.
>
>> Above this point, there are diminishing returns in increasing the
>> number of threads while the total CPU time, which could be used for
>> other demos, is wasted.
> I think the "good number" of processes depends a lot on the algorithm
> and on the size of your data. Moreover, the parallel gain is limited
> by the part of your code which is effectively parallelisable: if 95%
> of your program is perfectly parallelisable, you will never get better
> than a 20x performance gain, even with an infinity of
> processors [1]. This, plus the overhead cost of every multithreaded
> branch, could explain your results. Did you try with substantially
> larger data? Do you get the same graph?
>
> [1] http://en.wikipedia.org/wiki/Amdahl%27s_law
>
> I think any attempt at fine tuning will eventually be ill-suited when
> running on another machine or with larger data. So in the absence of
> better wisdom, I would take no decision and keep the default settings,
> i.e. use as many threads as CPUs. According to your graph, 32 threads
> is not worse than other values. If you want to save on the total CPU
> time (and power use), you can decide to use only half of the
> resources. You can get the number of available CPUs from the OpenMP
> library [2], with:
>
>      #ifdef _OPENMP
>      #include <omp.h>
>      /* use only half of the available CPUs */
>      omp_set_num_threads(omp_get_max_threads() / 2);
>      #endif
>
> [2] https://computing.llnl.gov/tutorials/openMP/#OMP_GET_MAX_THREADS
>      https://computing.llnl.gov/tutorials/openMP/#OMP_SET_NUM_THREADS
>
>
>
>
> _______________________________________________
> discuss mailing list
> discuss at list.ipol.im
> http://tools.ipol.im/mailman/listinfo/discuss