Hyper-Threading and Dual Core Performance Comparison for Computational Intensive Applications – Update
In a post (Hyper-Threading and Dual Core Performance Comparison for Computational Intensive Applications) I wrote at the end of last year, I compared the multi-threaded scientific application performance of a Pentium 4 processor with hyper-threading enabled against that of a Pentium D processor, and concluded that hyper-threading did little to improve the performance of multi-threaded scientific applications.
As new processors roll out, new benchmarks tend to capture only the performance of that specific processor family and rarely compare processors across generations. From many sources and the readily available data, we know that for multimedia and gaming applications, which rely heavily on SSE* instructions, the new Core family processors outperform the old Pentium processors by leaps and bounds (the Core architecture is a significant improvement over its Pentium predecessors). But there is very little information on how much scientific applications benefit from the enhanced architecture.
I recently got a new ThinkPad T60 laptop at work (it has a Core Duo T2400 1.83 GHz processor), so I decided to run the same benchmark program I created for my earlier post and see how the numbers turn out.
And here are the results for the total run time when running 1 to 4 threads:
Time (1 thread)  = 1548.6 ms
Time (2 threads) = 784.8 ms
Time (3 threads) = 785.3 ms
Time (4 threads) = 788.1 ms
In comparison, here are the results for the same benchmark program running on a Pentium 4 3 GHz processor:
Time (1 thread)  = 756.2 ms
Time (2 threads) = 754.7 ms
Time (3 threads) = 769.3 ms
Time (4 threads) = 772.5 ms
And on a Pentium D 2.8 GHz processor:
Time (1 thread)  = 931.6 ms
Time (2 threads) = 469.4 ms
Time (3 threads) = 466.9 ms
Time (4 threads) = 485.8 ms
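To make the scaling easier to see, here is a small Python sketch that turns the three result sets into speedups relative to the single-thread time. The numbers are copied from the results above; the dictionary layout and the `speedups` helper are just names I made up for this illustration and are not part of the original benchmark program.

```python
# Run times in milliseconds, copied from the three result listings above.
# Keys of the inner dictionaries are thread counts.
run_times_ms = {
    "Core Duo T2400 1.83 GHz": {1: 1548.6, 2: 784.8, 3: 785.3, 4: 788.1},
    "Pentium 4 3 GHz": {1: 756.2, 2: 754.7, 3: 769.3, 4: 772.5},
    "Pentium D 2.8 GHz": {1: 931.6, 2: 469.4, 3: 466.9, 4: 485.8},
}


def speedups(times_by_threads):
    """Return {threads: single-thread time / n-thread time} (hypothetical helper)."""
    baseline = times_by_threads[1]
    return {n: baseline / t for n, t in times_by_threads.items()}


for cpu, times in run_times_ms.items():
    row = ", ".join(f"{n} thr: {s:.2f}x" for n, s in speedups(times).items())
    print(f"{cpu:<24} {row}")
```

Both dual-core chips come out close to 2x at two threads and gain nothing beyond that, while the Pentium 4 stays at about 1x throughout.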
As you can see, the Core Duo's performance on scientific calculations is not that impressive, even after allowing for its lower clock speed. Clock for clock, the Core Duo T2400 mobile processor is probably a little slower than the Pentium D processor.
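The clock-for-clock comparison can be roughed out as follows. This is only a back-of-the-envelope sketch: it assumes run time is inversely proportional to clock speed, which ignores bus speed, cache and memory effects. The `scale_to_clock` function is a name invented for this example.

```python
def scale_to_clock(time_ms, measured_ghz, target_ghz):
    """Estimate the run time at target_ghz.

    Assumption: run time is inversely proportional to clock speed.
    """
    return time_ms * measured_ghz / target_ghz


CORE_DUO_GHZ = 1.83
PENTIUM_D_GHZ = 2.8

# Measured run times (ms) for 1 and 2 threads, taken from the results above.
core_duo_ms = {1: 1548.6, 2: 784.8}
pentium_d_ms = {1: 931.6, 2: 469.4}

for threads in (1, 2):
    estimate = scale_to_clock(core_duo_ms[threads], CORE_DUO_GHZ, PENTIUM_D_GHZ)
    ratio = estimate / pentium_d_ms[threads]
    print(
        f"{threads} thread(s): Core Duo scaled to 2.8 GHz ~ {estimate:.0f} ms, "
        f"Pentium D measured {pentium_d_ms[threads]} ms, ratio {ratio:.2f}"
    )
```

Under that assumption the scaled Core Duo times land at roughly 1012 ms and 513 ms, about 9% behind the Pentium D in both cases.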
According to the AnandTech article Mobile CPU Wars: Core 2 Duo vs. Core Duo, the new Core 2 Duo processor is roughly 10% faster than the Core Duo processor at the same clock speed. Of course, the mobile versions of the Core Duo and Core 2 Duo run at a lower bus speed (667 MHz) than their desktop counterparts (1066 MHz). Still, from what we’ve seen, the performance of Core 2 Duo processors on scientific calculations is probably only marginally better than that of the current Pentium D (in the multi-threaded scenario) or Pentium (in the single-threaded scenario) processors.
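As a rough sanity check against the Pentium D numbers, the sketch below projects a hypothetical Core 2 Duo result from my Core Duo measurements. It rests on two loud assumptions: run time scales inversely with clock speed, and "roughly 10% faster" means the run time is divided by 1.10. It does not model the bus speed difference at all, and `projected_core2_ms` is a made-up name for this illustration.

```python
CORE2_SPEEDUP = 1.10  # assumption: "roughly 10% faster" at the same clock speed


def projected_core2_ms(core_duo_ms, core_duo_ghz, target_ghz, speedup=CORE2_SPEEDUP):
    """Project a Core 2 Duo run time from a measured Core Duo run time.

    Assumptions: time is inversely proportional to clock speed, and the
    per-clock speedup simply divides the run time.
    """
    clock_scaled_ms = core_duo_ms * core_duo_ghz / target_ghz
    return clock_scaled_ms / speedup


# (threads, Core Duo at 1.83 GHz in ms, Pentium D at 2.8 GHz in ms), from the results above.
measurements = [(1, 1548.6, 931.6), (2, 784.8, 469.4)]

for threads, core_duo_ms, pentium_d_ms in measurements:
    projected = projected_core2_ms(core_duo_ms, 1.83, 2.8)
    print(
        f"{threads} thread(s): projected Core 2 Duo at 2.8 GHz ~ {projected:.0f} ms, "
        f"Pentium D measured {pentium_d_ms} ms"
    )
```

With these assumptions the projection comes out at about 920 ms and 466 ms, roughly 1% ahead of the Pentium D, which fits the "only marginally better" estimate.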
This is not all that surprising. The simple benchmark application I created is so small that its working set is likely to fit in the Pentium’s much smaller on-die cache, so the extra L1/L2 cache found in the Core series provides no significant advantage here. Also, the calculation in our simple program is fairly predictable, so the inefficiencies of the deeply pipelined Pentium processors (such as the high cost of a mispredicted branch) do not come into play. Further, we did not use the new SSE* instructions either.
Nevertheless, it is clear that unless we consciously use the new instruction sets found in Core series processors and have a large working set, the speed advantage of the new processors over Pentium processors is not significant.