* Comparing different kinds of initialization
* Getting real: smaller grid size and complex algorithms
* Increase dimensions up to a reasonable number for levels and variables

Some investigations have been started on OpenMP performance and data locality:

* Simple application which works on a single array:
  * With Data Locality: Without Data Locality: (see diff)
* Tests for 4, 8 and 16 threads and array sizes from 10000000 to 60000000

Comparing different kinds of initialization

In the above example, a computationally expensive initialization has been used:

array = tan((double) (ompthID+i+0.5)/exp(-(double)ompthID))

Replacing this by zero values should lead to the same scalability effects, but reality seems to be a bit different.

Apart from the fact that the runtime with the complex init is much larger than expected, it seems to scale better than with zero initialization.

run script: r1246
makefile:
test code for init with zero: r1244
test code for init with math function: r1245
If you are not logged in, please go to the very end of the page, where the source is displayed.

The above code (init with zero) has been run 4 times with each compiler to investigate the distribution of measurements. PGI (10.4), Suncc (SunStudio12) and gcc (4.4.3) scale very well; Intel (11.0.074) does not, but has a significantly lower runtime.

This shows different error ranges for each compiler/grid size. Looking more closely at the absolute error, there seems to be almost no trend. But comparing the relative error gives a lot more information.

Hence, for measuring scalability effects, the Intel compiler does not seem to be the best tool. Gcc and, even more so, suncc are better choices for this kind of test.