<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Kerry D. Wong &#187; TBB</title>
	<atom:link href="http://www.kerrywong.com/tag/tbb/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.kerrywong.com</link>
	<description></description>
	<lastBuildDate>Fri, 03 Sep 2010 00:51:09 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0.1</generator>
		<item>
		<title>TBB Mandelbrot Set</title>
		<link>http://www.kerrywong.com/2008/09/13/tbb-mandelbrot-set/</link>
		<comments>http://www.kerrywong.com/2008/09/13/tbb-mandelbrot-set/#comments</comments>
		<pubDate>Sun, 14 Sep 2008 02:35:48 +0000</pubDate>
		<dc:creator>kwong</dc:creator>
				<category><![CDATA[Coding]]></category>
		<category><![CDATA[C++]]></category>
		<category><![CDATA[Mandelbrot]]></category>
		<category><![CDATA[Multi-threading]]></category>
		<category><![CDATA[TBB]]></category>

		<guid isPermaLink="false">http://www.kerrywong.com/?p=352</guid>
		<description><![CDATA[In an earlier post, I created a simple prime-finding program using Intel&#8217;s TBB (Threading Building Blocks). The main benefit of using TBB is that threading and thread synchronization mechanisms are abstracted away within the TBB library, so we do not need to deal with threads explicitly. Also, TBB is optimized for performance and scales [...]]]></description>
			<content:encoded><![CDATA[<p>In an <a href="/2008/06/22/a-simple-tbb-program-tbb-prime/">earlier post</a>, I created a simple prime-finding program using Intel&#8217;s TBB (<a href="http://www.threadingbuildingblocks.org/">Threading Building Blocks</a>). The main benefit of using TBB is that threading and thread synchronization mechanisms are abstracted away within the TBB library, so we do not need to deal with threads explicitly. Also, TBB is optimized for performance and scales nicely as the number of processing units increases.<span id="more-352"></span> In this post, I will show you how to create a <a href="http://en.wikipedia.org/wiki/Mandelbrot_set">Mandelbrot set</a> generator using TBB and how to optimize the algorithm using loop unrolling.</p>
<p>The standard algorithm for generating the Mandelbrot set is extremely easy to adapt to TBB. In fact, the loops look almost identical to those in the single-threaded approach, except that the iterations are calculated within a 2D range block (<strong>blocked_range2d</strong>) instead of over the entire two-dimensional space.</p>
<div style="background: white none repeat scroll 0% 0%; font-family: Courier New; font-size: 10pt; color: black; -moz-background-clip: -moz-initial; -moz-background-origin: -moz-initial; -moz-background-inline-policy: -moz-initial;">
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; <span style="color: blue; font-weight: bold;">void</span> <span style="color: blue; font-weight: bold;">operator</span><span style="color: rgb(128, 128, 192); font-weight: bold;">()</span> <span style="color: rgb(128, 128, 192); font-weight: bold;">(</span><span style="color: blue; font-weight: bold;">const</span> blocked_range2d<span style="color: rgb(128, 128, 192); font-weight: bold;">&lt;</span>size_t<span style="color: rgb(128, 128, 192); font-weight: bold;">&gt;</span> <span style="color: rgb(128, 128, 192); font-weight: bold;">&amp;</span>r<span style="color: rgb(128, 128, 192); font-weight: bold;">)</span> <span style="color: blue; font-weight: bold;">const</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; <span style="color: rgb(128, 128, 192); font-weight: bold;">{</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; drawing_area drawing<span style="color: rgb(128, 128, 192); font-weight: bold;">(</span><span style="color: teal;">0</span><span style="color: rgb(128, 128, 192); font-weight: bold;">,</span> <span style="color: teal;">0</span><span style="color: rgb(128, 128, 192); font-weight: bold;">,</span> screen_size<span style="color: rgb(128, 128, 192); font-weight: bold;">,</span>screen_size<span style="color: rgb(128, 128, 192); font-weight: bold;">);</span></div>
<div style="margin: 0px;">&nbsp;</div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <span style="color: blue; font-weight: bold;">for</span> <span style="color: rgb(128, 128, 192); font-weight: bold;">(</span>size_t x <span style="color: rgb(128, 128, 192); font-weight: bold;">=</span> r<span style="color: rgb(128, 128, 192); font-weight: bold;">.</span>rows<span style="color: rgb(128, 128, 192); font-weight: bold;">().</span>begin<span style="color: rgb(128, 128, 192); font-weight: bold;">();</span> x <span style="color: rgb(128, 128, 192); font-weight: bold;">&lt;</span> r<span style="color: rgb(128, 128, 192); font-weight: bold;">.</span>rows<span style="color: rgb(128, 128, 192); font-weight: bold;">().</span>end<span style="color: rgb(128, 128, 192); font-weight: bold;">();</span> x<span style="color: rgb(128, 128, 192); font-weight: bold;">++)</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <span style="color: rgb(128, 128, 192); font-weight: bold;">{</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <span style="color: blue; font-weight: bold;">for</span> <span style="color: rgb(128, 128, 192); font-weight: bold;">(</span>size_t y <span style="color: rgb(128, 128, 192); font-weight: bold;">=</span> r<span style="color: rgb(128, 128, 192); font-weight: bold;">.</span>cols<span style="color: rgb(128, 128, 192); font-weight: bold;">().</span>begin<span style="color: rgb(128, 128, 192); font-weight: bold;">();</span> y <span style="color: rgb(128, 128, 192); font-weight: bold;">&lt;</span> r<span style="color: rgb(128, 128, 192); font-weight: bold;">.</span>cols<span style="color: rgb(128, 128, 192); font-weight: bold;">().</span>end<span style="color: rgb(128, 128, 192); font-weight: bold;">();</span> y<span style="color: rgb(128, 128, 192); font-weight: bold;">++)</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <span style="color: rgb(128, 128, 192); font-weight: bold;">{</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <span style="color: blue; font-weight: bold;">float</span> zx <span style="color: rgb(128, 128, 192); font-weight: bold;">=</span> <span style="color: teal;">0</span><span style="color: rgb(128, 128, 192); font-weight: bold;">;</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <span style="color: blue; font-weight: bold;">float</span> zy <span style="color: rgb(128, 128, 192); font-weight: bold;">=</span> <span style="color: teal;">0</span><span style="color: rgb(128, 128, 192); font-weight: bold;">;</span></div>
<div style="margin: 0px;">&nbsp;</div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <span style="color: blue; font-weight: bold;">float</span> cx <span style="color: rgb(128, 128, 192); font-weight: bold;">=</span> <span style="color: rgb(128, 128, 192); font-weight: bold;">(</span><span style="color: blue; font-weight: bold;">float</span><span style="color: rgb(128, 128, 192); font-weight: bold;">)</span>x <span style="color: rgb(128, 128, 192); font-weight: bold;">/</span> <span style="color: rgb(128, 128, 192); font-weight: bold;">(</span><span style="color: blue; font-weight: bold;">float</span><span style="color: rgb(128, 128, 192); font-weight: bold;">)</span>screen_size <span style="color: rgb(128, 128, 192); font-weight: bold;">*</span> x_range <span style="color: rgb(128, 128, 192); font-weight: bold;">+</span> x_min<span style="color: rgb(128, 128, 192); font-weight: bold;">;</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <span style="color: blue; font-weight: bold;">float</span> cy <span style="color: rgb(128, 128, 192); font-weight: bold;">=</span> <span style="color: rgb(128, 128, 192); font-weight: bold;">(</span><span style="color: blue; font-weight: bold;">float</span><span style="color: rgb(128, 128, 192); font-weight: bold;">)</span>y <span style="color: rgb(128, 128, 192); font-weight: bold;">/</span> <span style="color: rgb(128, 128, 192); font-weight: bold;">(</span><span style="color: blue; font-weight: bold;">float</span><span style="color: rgb(128, 128, 192); font-weight: bold;">)</span>screen_size <span style="color: rgb(128, 128, 192); font-weight: bold;">*</span> y_range <span style="color: rgb(128, 128, 192); font-weight: bold;">+</span> y_min<span style="color: rgb(128, 128, 192); font-weight: bold;">;</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <span style="color: blue; font-weight: bold;">int</span> i <span style="color: rgb(128, 128, 192); font-weight: bold;">=</span> <span style="color: teal;">0</span><span style="color: rgb(128, 128, 192); font-weight: bold;">;</span></div>
<div style="margin: 0px;">&nbsp;</div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <span style="color: blue; font-weight: bold;">while</span> <span style="color: rgb(128, 128, 192); font-weight: bold;">(</span>zx <span style="color: rgb(128, 128, 192); font-weight: bold;">*</span> zx <span style="color: rgb(128, 128, 192); font-weight: bold;">+</span> zy <span style="color: rgb(128, 128, 192); font-weight: bold;">*</span> zy <span style="color: rgb(128, 128, 192); font-weight: bold;">&lt;=</span> <span style="color: teal;">4</span> <span style="color: rgb(128, 128, 192); font-weight: bold;">&amp;&amp;</span> i <span style="color: rgb(128, 128, 192); font-weight: bold;">&lt;</span> max_iteration<span style="color: rgb(128, 128, 192); font-weight: bold;">)</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <span style="color: rgb(128, 128, 192); font-weight: bold;">{</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <span style="color: blue; font-weight: bold;">float</span> xtemp <span style="color: rgb(128, 128, 192); font-weight: bold;">=</span> zx <span style="color: rgb(128, 128, 192); font-weight: bold;">*</span> zx <span style="color: rgb(128, 128, 192); font-weight: bold;">-</span> zy <span style="color: rgb(128, 128, 192); font-weight: bold;">*</span> zy <span style="color: rgb(128, 128, 192); font-weight: bold;">+</span> cx<span style="color: rgb(128, 128, 192); font-weight: bold;">;</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; zy <span style="color: rgb(128, 128, 192); font-weight: bold;">=</span> <span style="color: teal;">2</span> <span style="color: rgb(128, 128, 192); font-weight: bold;">*</span> zx <span style="color: rgb(128, 128, 192); font-weight: bold;">*</span> zy <span style="color: rgb(128, 128, 192); font-weight: bold;">+</span> cy<span style="color: rgb(128, 128, 192); font-weight: bold;">;</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; zx <span style="color: rgb(128, 128, 192); font-weight: bold;">=</span> xtemp<span style="color: rgb(128, 128, 192); font-weight: bold;">;</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; i<span style="color: rgb(128, 128, 192); font-weight: bold;">++;</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <span style="color: rgb(128, 128, 192); font-weight: bold;">}</span></div>
<div style="margin: 0px;">&nbsp;</div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <span style="color: blue; font-weight: bold;">int</span> itr <span style="color: rgb(128, 128, 192); font-weight: bold;">=</span> i <span style="color: rgb(128, 128, 192); font-weight: bold;">%</span> <span style="color: teal;">255</span><span style="color: rgb(128, 128, 192); font-weight: bold;">;</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; color_t c <span style="color: rgb(128, 128, 192); font-weight: bold;">=</span> itr <span style="color: rgb(128, 128, 192); font-weight: bold;">&lt;&lt;</span> <span style="color: teal;">16</span> <span style="color: rgb(128, 128, 192); font-weight: bold;">|</span> itr <span style="color: rgb(128, 128, 192); font-weight: bold;">&lt;&lt;</span> <span style="color: teal;">8</span> <span style="color: rgb(128, 128, 192); font-weight: bold;">|</span> itr<span style="color: rgb(128, 128, 192); font-weight: bold;">;</span></div>
<div style="margin: 0px;">&nbsp;</div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; drawing<span style="color: rgb(128, 128, 192); font-weight: bold;">.</span>set_pixel<span style="color: rgb(128, 128, 192); font-weight: bold;">(</span>x<span style="color: rgb(128, 128, 192); font-weight: bold;">,</span>y<span style="color: rgb(128, 128, 192); font-weight: bold;">,</span>c<span style="color: rgb(128, 128, 192); font-weight: bold;">);</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <span style="color: rgb(128, 128, 192); font-weight: bold;">}</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <span style="color: rgb(128, 128, 192); font-weight: bold;">}</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; <span style="color: rgb(128, 128, 192); font-weight: bold;">}</span></div>
</div>
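<p>To make the role of the 2D block concrete, here is a minimal, TBB-free sketch of the same per-pixel work. It is not the code from this post: <strong>View</strong>, <strong>escape_iterations</strong>, <strong>gray</strong>, and <strong>render_block</strong> are hypothetical names, the image is a plain <strong>std::vector</strong> rather than the <strong>drawing_area</strong> used above, and the explicit begin/end arguments stand in for what <strong>r.rows()</strong> and <strong>r.cols()</strong> provide.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the values the functor in the post reads
// (screen_size, x_min, x_range, y_min, y_range, max_iteration).
struct View {
    std::size_t screen_size;
    float x_min, x_range, y_min, y_range;
    int max_iteration;
};

// Escape-time count for one point c = (cx, cy); same loop as in the post.
int escape_iterations(float cx, float cy, int max_iteration) {
    float zx = 0.0f, zy = 0.0f;
    int i = 0;
    while (zx * zx + zy * zy <= 4.0f && i < max_iteration) {
        const float xtemp = zx * zx - zy * zy + cx;
        zy = 2.0f * zx * zy + cy;
        zx = xtemp;
        ++i;
    }
    return i;
}

// Pack an iteration count into a gray 0x00RRGGBB value, as the post does.
std::uint32_t gray(int i) {
    const std::uint32_t itr = static_cast<std::uint32_t>(i % 255);
    return itr << 16 | itr << 8 | itr;
}

// Fill one sub-rectangle [row_begin, row_end) x [col_begin, col_end).
// A parallel runtime would call this once per block it hands out.
void render_block(const View& v, std::vector<std::uint32_t>& image,
                  std::size_t row_begin, std::size_t row_end,
                  std::size_t col_begin, std::size_t col_end) {
    assert(row_end <= v.screen_size && col_end <= v.screen_size);
    assert(image.size() == v.screen_size * v.screen_size);
    for (std::size_t x = row_begin; x < row_end; ++x) {
        for (std::size_t y = col_begin; y < col_end; ++y) {
            const float cx = static_cast<float>(x) /
                             static_cast<float>(v.screen_size) * v.x_range + v.x_min;
            const float cy = static_cast<float>(y) /
                             static_cast<float>(v.screen_size) * v.y_range + v.y_min;
            image[y * v.screen_size + x] =
                gray(escape_iterations(cx, cy, v.max_iteration));
        }
    }
}
```

<p>Because each pixel depends only on its own coordinates, calling <strong>render_block</strong> once for the whole image or once per tile should produce the same buffer, which is the property that lets a scheduler split the range freely.</p>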
<p>Because Xlib by itself is not thread-safe, special care must be taken when updating the display concurrently. One way to address this issue is to employ a shared memory region (<strong>X11/extensions/XShm.h</strong> and <strong>sys/shm.h</strong>): the display is first built in memory, and then the shared memory is attached to the display. In the example above I used code (<strong>video.h</strong>, <strong>xvideo.cpp</strong>) from the samples that come with the TBB library, which uses this shared memory method to make the X11 calls thread-safe.</p>
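<p>The general idea of building the image in memory and touching the display only once can be sketched without any X11 code. The following is an illustration only, not the TBB sample code: <strong>FrameBuffer</strong> and <strong>present</strong> are hypothetical names, and the <strong>blit</strong> callback stands in for whatever single display call a real program would make (such as a shared-memory image upload).</p>

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical off-screen frame buffer: workers only ever write into
// plain memory, so the non-thread-safe display API is never called
// from the compute loops.
class FrameBuffer {
public:
    FrameBuffer(std::size_t width, std::size_t height)
        : width_(width), height_(height), pixels_(width * height, 0) {}

    // Concurrent calls are fine as long as no two callers write the
    // same pixel, because distinct vector elements are distinct objects.
    void set_pixel(std::size_t x, std::size_t y, std::uint32_t color) {
        assert(x < width_ && y < height_);
        pixels_[y * width_ + x] = color;
    }

    std::uint32_t get_pixel(std::size_t x, std::size_t y) const {
        assert(x < width_ && y < height_);
        return pixels_[y * width_ + x];
    }

    const std::uint32_t* data() const { return pixels_.data(); }
    std::size_t width() const { return width_; }
    std::size_t height() const { return height_; }

private:
    std::size_t width_, height_;
    std::vector<std::uint32_t> pixels_;
};

// Hand the finished image to the display in one step, from one thread.
// `blit` is an assumed callback taking (pixels, width, height).
template <typename Blit>
void present(const FrameBuffer& fb, Blit blit) {
    blit(fb.data(), fb.width(), fb.height());
}
```

<p>With this split, only <strong>present</strong> needs to run on the thread that owns the display connection; everything before it is ordinary memory writes.</p>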
<p>Many optimization methods can be used to further enhance the performance of the algorithm. One of the most effective is to use the SSE instructions found on all modern Intel processors (examples can be found here: <a href="http://softwarecommunity.intel.com/articles/eng/3426.htm">Using SSE3 Technology in Algorithms with Complex Arithmetic</a>). This approach, however, can be difficult to implement and debug, since the data must be arranged in parallel (packed) structures in order to benefit from SSE instructions. Also, explicit assembly-level coding makes porting the code to other machine architectures a daunting task. Modern compilers can already take considerable advantage of the underlying machine architecture. For example, GCC 4.2.3 already generates SSE instructions for the code snippet above. While hand-tuning with SSE instructions might further improve the performance, we would certainly sacrifice code simplicity and portability.</p>
<p>The approach I am going to take to further optimize the code is loop unrolling. Since the inner loop of the standard algorithm is quite short, unrolling it should reduce the loop overhead and lower the chance of stalling the pipeline (when a branch is mispredicted). So unrolling the loop at the source-code level should be able to improve the performance.</p>
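<p>As a generic illustration of the technique, separate from the Mandelbrot code, here is a small sketch of a loop unrolled by a factor of two. The function names are made up for this example, and integer addition is used so that both versions give exactly the same result. Note the extra step that handles a leftover element when the count is odd.</p>

```cpp
#include <cstddef>
#include <vector>

// Straightforward loop: one element, one loop test per iteration.
long long sum_rolled(const std::vector<int>& v) {
    long long s = 0;
    for (std::size_t i = 0; i < v.size(); ++i) {
        s += v[i];
    }
    return s;
}

// Unrolled by two: two elements per loop test, with two independent
// accumulators so the additions do not depend on each other.
long long sum_unrolled_by_2(const std::vector<int>& v) {
    long long s0 = 0, s1 = 0;
    const std::size_t n = v.size();
    std::size_t i = 0;
    for (; i + 1 < n; i += 2) {
        s0 += v[i];
        s1 += v[i + 1];
    }
    // Remainder: at most one element is left when n is odd.
    if (i < n) {
        s0 += v[i];
    }
    return s0 + s1;
}
```

<p>The unrolled version executes roughly half as many loop tests and branches. Whether that translates into a measurable speedup depends on the compiler and the processor, so it is worth timing both versions.</p>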
<p>Here is the code after the inner loop is unrolled:</p>
<div style="background: white none repeat scroll 0% 0%; font-family: Courier New; font-size: 10pt; color: black; -moz-background-clip: -moz-initial; -moz-background-origin: -moz-initial; -moz-background-inline-policy: -moz-initial;">
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; <span style="color: blue; font-weight: bold;">void</span> <span style="color: blue; font-weight: bold;">operator</span><span style="color: rgb(128, 128, 192); font-weight: bold;">()</span> <span style="color: rgb(128, 128, 192); font-weight: bold;">(</span><span style="color: blue; font-weight: bold;">const</span> blocked_range2d<span style="color: rgb(128, 128, 192); font-weight: bold;">&lt;</span>size_t<span style="color: rgb(128, 128, 192); font-weight: bold;">&gt;</span> <span style="color: rgb(128, 128, 192); font-weight: bold;">&amp;</span>r<span style="color: rgb(128, 128, 192); font-weight: bold;">)</span> <span style="color: blue; font-weight: bold;">const</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; <span style="color: rgb(128, 128, 192); font-weight: bold;">{</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; drawing_area drawing<span style="color: rgb(128, 128, 192); font-weight: bold;">(</span><span style="color: teal;">0</span><span style="color: rgb(128, 128, 192); font-weight: bold;">,</span> <span style="color: teal;">0</span><span style="color: rgb(128, 128, 192); font-weight: bold;">,</span> screen_size<span style="color: rgb(128, 128, 192); font-weight: bold;">,</span>screen_size<span style="color: rgb(128, 128, 192); font-weight: bold;">);</span></div>
<div style="margin: 0px;">&nbsp;</div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <span style="color: blue; font-weight: bold;">for</span> <span style="color: rgb(128, 128, 192); font-weight: bold;">(</span>size_t x <span style="color: rgb(128, 128, 192); font-weight: bold;">=</span> r<span style="color: rgb(128, 128, 192); font-weight: bold;">.</span>rows<span style="color: rgb(128, 128, 192); font-weight: bold;">().</span>begin<span style="color: rgb(128, 128, 192); font-weight: bold;">();</span> x <span style="color: rgb(128, 128, 192); font-weight: bold;">&lt;</span> r<span style="color: rgb(128, 128, 192); font-weight: bold;">.</span>rows<span style="color: rgb(128, 128, 192); font-weight: bold;">().</span>end<span style="color: rgb(128, 128, 192); font-weight: bold;">();</span> x<span style="color: rgb(128, 128, 192); font-weight: bold;">+=</span><span style="color: teal;">2</span><span style="color: rgb(128, 128, 192); font-weight: bold;">)</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <span style="color: rgb(128, 128, 192); font-weight: bold;">{</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <span style="color: blue; font-weight: bold;">for</span> <span style="color: rgb(128, 128, 192); font-weight: bold;">(</span>size_t y <span style="color: rgb(128, 128, 192); font-weight: bold;">=</span> r<span style="color: rgb(128, 128, 192); font-weight: bold;">.</span>cols<span style="color: rgb(128, 128, 192); font-weight: bold;">().</span>begin<span style="color: rgb(128, 128, 192); font-weight: bold;">();</span> y <span style="color: rgb(128, 128, 192); font-weight: bold;">&lt;</span> r<span style="color: rgb(128, 128, 192); font-weight: bold;">.</span>cols<span style="color: rgb(128, 128, 192); font-weight: bold;">().</span>end<span style="color: rgb(128, 128, 192); font-weight: bold;">();</span> y<span style="color: rgb(128, 128, 192); font-weight: bold;">++)</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <span style="color: rgb(128, 128, 192); font-weight: bold;">{</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <span style="color: blue; font-weight: bold;">float</span> zx1 <span style="color: rgb(128, 128, 192); font-weight: bold;">=</span> <span style="color: teal;">0</span><span style="color: rgb(128, 128, 192); font-weight: bold;">;</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <span style="color: blue; font-weight: bold;">float</span> zx2 <span style="color: rgb(128, 128, 192); font-weight: bold;">=</span> <span style="color: teal;">0</span><span style="color: rgb(128, 128, 192); font-weight: bold;">;</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <span style="color: blue; font-weight: bold;">float</span> zy1 <span style="color: rgb(128, 128, 192); font-weight: bold;">=</span> <span style="color: teal;">0</span><span style="color: rgb(128, 128, 192); font-weight: bold;">;</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <span style="color: blue; font-weight: bold;">float</span> zy2 <span style="color: rgb(128, 128, 192); font-weight: bold;">=</span> <span style="color: teal;">0</span><span style="color: rgb(128, 128, 192); font-weight: bold;">;</span></div>
<div style="margin: 0px;">&nbsp;</div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <span style="color: blue; font-weight: bold;">float</span> cx1 <span style="color: rgb(128, 128, 192); font-weight: bold;">=</span> <span style="color: rgb(128, 128, 192); font-weight: bold;">(</span><span style="color: blue; font-weight: bold;">float</span><span style="color: rgb(128, 128, 192); font-weight: bold;">)</span>x <span style="color: rgb(128, 128, 192); font-weight: bold;">/</span> <span style="color: rgb(128, 128, 192); font-weight: bold;">(</span><span style="color: blue; font-weight: bold;">float</span><span style="color: rgb(128, 128, 192); font-weight: bold;">)</span>screen_size <span style="color: rgb(128, 128, 192); font-weight: bold;">*</span> x_range <span style="color: rgb(128, 128, 192); font-weight: bold;">+</span> x_min<span style="color: rgb(128, 128, 192); font-weight: bold;">;</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <span style="color: blue; font-weight: bold;">float</span> cx2 <span style="color: rgb(128, 128, 192); font-weight: bold;">=</span> <span style="color: rgb(128, 128, 192); font-weight: bold;">(</span><span style="color: blue; font-weight: bold;">float</span><span style="color: rgb(128, 128, 192); font-weight: bold;">)(</span>x <span style="color: rgb(128, 128, 192); font-weight: bold;">+</span> <span style="color: teal;">1</span><span style="color: rgb(128, 128, 192); font-weight: bold;">)</span> <span style="color: rgb(128, 128, 192); font-weight: bold;">/</span> <span style="color: rgb(128, 128, 192); font-weight: bold;">(</span><span style="color: blue; font-weight: bold;">float</span><span style="color: rgb(128, 128, 192); font-weight: bold;">)</span>screen_size <span style="color: rgb(128, 128, 192); font-weight: bold;">*</span> x_range <span style="color: rgb(128, 128, 192); font-weight: bold;">+</span> x_min<span style="color: rgb(128, 128, 192); font-weight: bold;">;</span></div>
<div style="margin: 0px;">&nbsp;</div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <span style="color: blue; font-weight: bold;">float</span> cy <span style="color: rgb(128, 128, 192); font-weight: bold;">=</span> <span style="color: rgb(128, 128, 192); font-weight: bold;">(</span><span style="color: blue; font-weight: bold;">float</span><span style="color: rgb(128, 128, 192); font-weight: bold;">)</span>y <span style="color: rgb(128, 128, 192); font-weight: bold;">/</span> <span style="color: rgb(128, 128, 192); font-weight: bold;">(</span><span style="color: blue; font-weight: bold;">float</span><span style="color: rgb(128, 128, 192); font-weight: bold;">)</span>screen_size <span style="color: rgb(128, 128, 192); font-weight: bold;">*</span> y_range <span style="color: rgb(128, 128, 192); font-weight: bold;">+</span> y_min<span style="color: rgb(128, 128, 192); font-weight: bold;">;</span></div>
<div style="margin: 0px;">&nbsp;</div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <span style="color: blue; font-weight: bold;">int</span> i1 <span style="color: rgb(128, 128, 192); font-weight: bold;">=</span> <span style="color: teal;">0</span><span style="color: rgb(128, 128, 192); font-weight: bold;">;</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <span style="color: blue; font-weight: bold;">int</span> i2 <span style="color: rgb(128, 128, 192); font-weight: bold;">=</span> <span style="color: teal;">0</span><span style="color: rgb(128, 128, 192); font-weight: bold;">;</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <span style="color: blue; font-weight: bold;">bool</span> loop_stop1 <span style="color: rgb(128, 128, 192); font-weight: bold;">=</span> <span style="color: blue; font-weight: bold;">false</span><span style="color: rgb(128, 128, 192); font-weight: bold;">;</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <span style="color: blue; font-weight: bold;">bool</span> loop_stop2 <span style="color: rgb(128, 128, 192); font-weight: bold;">=</span> <span style="color: blue; font-weight: bold;">false</span><span style="color: rgb(128, 128, 192); font-weight: bold;">;</span></div>
<div style="margin: 0px;">&nbsp;</div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <span style="color: blue; font-weight: bold;">while</span> <span style="color: rgb(128, 128, 192); font-weight: bold;">(</span> <span style="color: rgb(128, 128, 192); font-weight: bold;">!(</span>loop_stop1 <span style="color: rgb(128, 128, 192); font-weight: bold;">&amp;&amp;</span> loop_stop2<span style="color: rgb(128, 128, 192); font-weight: bold;">))</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <span style="color: rgb(128, 128, 192); font-weight: bold;">{</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <span style="color: blue; font-weight: bold;">float</span> xtemp1<span style="color: rgb(128, 128, 192); font-weight: bold;">;</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <span style="color: blue; font-weight: bold;">float</span> xtemp2<span style="color: rgb(128, 128, 192); font-weight: bold;">;</span></div>
<div style="margin: 0px;">&nbsp;</div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <span style="color: blue; font-weight: bold;">if</span> <span style="color: rgb(128, 128, 192); font-weight: bold;">((</span>zx1 <span style="color: rgb(128, 128, 192); font-weight: bold;">*</span> zx1 <span style="color: rgb(128, 128, 192); font-weight: bold;">+</span> zy1 <span style="color: rgb(128, 128, 192); font-weight: bold;">*</span> zy1 <span style="color: rgb(128, 128, 192); font-weight: bold;">&lt;=</span> <span style="color: teal;">4</span><span style="color: rgb(128, 128, 192); font-weight: bold;">)</span> <span style="color: rgb(128, 128, 192); font-weight: bold;">&amp;&amp;</span> <span style="color: rgb(128, 128, 192); font-weight: bold;">(</span>i1 <span style="color: rgb(128, 128, 192); font-weight: bold;">&lt;</span> max_iteration<span style="color: rgb(128, 128, 192); font-weight: bold;">))</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <span style="color: rgb(128, 128, 192); font-weight: bold;">{</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; xtemp1 <span style="color: rgb(128, 128, 192); font-weight: bold;">=</span> zx1 <span style="color: rgb(128, 128, 192); font-weight: bold;">*</span> zx1 <span style="color: rgb(128, 128, 192); font-weight: bold;">-</span> zy1 <span style="color: rgb(128, 128, 192); font-weight: bold;">*</span> zy1 <span style="color: rgb(128, 128, 192); font-weight: bold;">+</span> cx1<span style="color: rgb(128, 128, 192); font-weight: bold;">;</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; zy1 <span style="color: rgb(128, 128, 192); font-weight: bold;">=</span> <span style="color: teal;">2</span> <span style="color: rgb(128, 128, 192); font-weight: bold;">*</span> zx1 <span style="color: rgb(128, 128, 192); font-weight: bold;">*</span> zy1 <span style="color: rgb(128, 128, 192); font-weight: bold;">+</span> cy<span style="color: rgb(128, 128, 192); font-weight: bold;">;</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; zx1 <span style="color: rgb(128, 128, 192); font-weight: bold;">=</span> xtemp1<span style="color: rgb(128, 128, 192); font-weight: bold;">;</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; i1<span style="color: rgb(128, 128, 192); font-weight: bold;">++;</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <span style="color: rgb(128, 128, 192); font-weight: bold;">}</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <span style="color: blue; font-weight: bold;">else</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <span style="color: rgb(128, 128, 192); font-weight: bold;">{</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; loop_stop1 <span style="color: rgb(128, 128, 192); font-weight: bold;">=</span> <span style="color: blue; font-weight: bold;">true</span><span style="color: rgb(128, 128, 192); font-weight: bold;">;</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <span style="color: rgb(128, 128, 192); font-weight: bold;">}</span></div>
<div style="margin: 0px;">&nbsp;</div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <span style="color: blue; font-weight: bold;">if</span> <span style="color: rgb(128, 128, 192); font-weight: bold;">((</span>zx2 <span style="color: rgb(128, 128, 192); font-weight: bold;">*</span> zx2 <span style="color: rgb(128, 128, 192); font-weight: bold;">+</span> zy2 <span style="color: rgb(128, 128, 192); font-weight: bold;">*</span> zy2 <span style="color: rgb(128, 128, 192); font-weight: bold;">&lt;=</span> <span style="color: teal;">4</span><span style="color: rgb(128, 128, 192); font-weight: bold;">)</span> <span style="color: rgb(128, 128, 192); font-weight: bold;">&amp;&amp;</span> <span style="color: rgb(128, 128, 192); font-weight: bold;">(</span>i2<span style="color: rgb(128, 128, 192); font-weight: bold;">&lt;</span> max_iteration<span style="color: rgb(128, 128, 192); font-weight: bold;">))</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <span style="color: rgb(128, 128, 192); font-weight: bold;">{</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; xtemp2 <span style="color: rgb(128, 128, 192); font-weight: bold;">=</span> zx2 <span style="color: rgb(128, 128, 192); font-weight: bold;">*</span> zx2 <span style="color: rgb(128, 128, 192); font-weight: bold;">-</span> zy2 <span style="color: rgb(128, 128, 192); font-weight: bold;">*</span> zy2 <span style="color: rgb(128, 128, 192); font-weight: bold;">+</span> cx2<span style="color: rgb(128, 128, 192); font-weight: bold;">;</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; zy2 <span style="color: rgb(128, 128, 192); font-weight: bold;">=</span> <span style="color: teal;">2</span> <span style="color: rgb(128, 128, 192); font-weight: bold;">*</span> zx2 <span style="color: rgb(128, 128, 192); font-weight: bold;">*</span> zy2 <span style="color: rgb(128, 128, 192); font-weight: bold;">+</span> cy<span style="color: rgb(128, 128, 192); font-weight: bold;">;</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; zx2 <span style="color: rgb(128, 128, 192); font-weight: bold;">=</span> xtemp2<span style="color: rgb(128, 128, 192); font-weight: bold;">;</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; i2<span style="color: rgb(128, 128, 192); font-weight: bold;">++;</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <span style="color: rgb(128, 128, 192); font-weight: bold;">}</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <span style="color: blue; font-weight: bold;">else</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <span style="color: rgb(128, 128, 192); font-weight: bold;">{</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; loop_stop2 <span style="color: rgb(128, 128, 192); font-weight: bold;">=</span> <span style="color: blue; font-weight: bold;">true</span><span style="color: rgb(128, 128, 192); font-weight: bold;">;</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <span style="color: rgb(128, 128, 192); font-weight: bold;">}</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <span style="color: rgb(128, 128, 192); font-weight: bold;">}</span></div>
<div style="margin: 0px;">&nbsp;</div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <span style="color: blue; font-weight: bold;">int</span> itr <span style="color: rgb(128, 128, 192); font-weight: bold;">=</span> <span style="color: teal;">0</span><span style="color: rgb(128, 128, 192); font-weight: bold;">;</span></div>
<div style="margin: 0px;">&nbsp;</div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; itr <span style="color: rgb(128, 128, 192); font-weight: bold;">=</span> <span style="color: rgb(128, 128, 192); font-weight: bold;">(</span>i1<span style="color: rgb(128, 128, 192); font-weight: bold;">)</span> <span style="color: rgb(128, 128, 192); font-weight: bold;">%</span> <span style="color: teal;">255</span><span style="color: rgb(128, 128, 192); font-weight: bold;">;</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; color_t c <span style="color: rgb(128, 128, 192); font-weight: bold;">=</span> itr <span style="color: rgb(128, 128, 192); font-weight: bold;">&lt;&lt;</span> <span style="color: teal;">16</span> <span style="color: rgb(128, 128, 192); font-weight: bold;">|</span> itr <span style="color: rgb(128, 128, 192); font-weight: bold;">&lt;&lt;</span> <span style="color: teal;">8</span> <span style="color: rgb(128, 128, 192); font-weight: bold;">|</span> itr<span style="color: rgb(128, 128, 192); font-weight: bold;">;</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; drawing<span style="color: rgb(128, 128, 192); font-weight: bold;">.</span>set_pixel<span style="color: rgb(128, 128, 192); font-weight: bold;">(</span>x<span style="color: rgb(128, 128, 192); font-weight: bold;">,</span>y<span style="color: rgb(128, 128, 192); font-weight: bold;">,</span>c<span style="color: rgb(128, 128, 192); font-weight: bold;">);</span></div>
<div style="margin: 0px;">&nbsp;</div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; itr <span style="color: rgb(128, 128, 192); font-weight: bold;">=</span> i2&nbsp; <span style="color: rgb(128, 128, 192); font-weight: bold;">%</span> <span style="color: teal;">255</span><span style="color: rgb(128, 128, 192); font-weight: bold;">;</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; c <span style="color: rgb(128, 128, 192); font-weight: bold;">=</span> itr <span style="color: rgb(128, 128, 192); font-weight: bold;">&lt;&lt;</span> <span style="color: teal;">16</span> <span style="color: rgb(128, 128, 192); font-weight: bold;">|</span> itr <span style="color: rgb(128, 128, 192); font-weight: bold;">&lt;&lt;</span> <span style="color: teal;">8</span> <span style="color: rgb(128, 128, 192); font-weight: bold;">|</span> itr<span style="color: rgb(128, 128, 192); font-weight: bold;">;</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; drawing<span style="color: rgb(128, 128, 192); font-weight: bold;">.</span>set_pixel<span style="color: rgb(128, 128, 192); font-weight: bold;">(</span>x<span style="color: rgb(128, 128, 192); font-weight: bold;">+</span><span style="color: teal;">1</span><span style="color: rgb(128, 128, 192); font-weight: bold;">,</span>y<span style="color: rgb(128, 128, 192); font-weight: bold;">,</span>c<span style="color: rgb(128, 128, 192); font-weight: bold;">);</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <span style="color: rgb(128, 128, 192); font-weight: bold;">}</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <span style="color: rgb(128, 128, 192); font-weight: bold;">}</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; <span style="color: rgb(128, 128, 192); font-weight: bold;">}</span></div>
</div>
<p>This code produces the same results as the code shown previously. As you can see, the inner loop is now unrolled to handle two data points at a time.</p>
<p>As it turned out, this version runs almost twice as fast as the earlier code (280ms versus 510ms on an Intel Q9450 @ 3.4GHz).</p>
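<p>To isolate the unrolling idea from the drawing and TBB code, here is a minimal sketch in plain C++. The function names escape_count and escape_count_pair are made up for illustration and are not part of the downloadable source. The pair version mirrors the structure of the loop above, with two horizontally adjacent points sharing cy.</p>

```cpp
#include <cassert>
#include <utility>

// Reference: iterate a single point c = (cx, cy) until it escapes
// (|z|^2 > 4) or the iteration cap is reached.
int escape_count(double cx, double cy, int max_iteration)
{
    double zx = 0.0, zy = 0.0;
    int i = 0;
    while (zx * zx + zy * zy <= 4.0 && i < max_iteration)
    {
        const double xtemp = zx * zx - zy * zy + cx;
        zy = 2.0 * zx * zy + cy;
        zx = xtemp;
        ++i;
    }
    return i;
}

// Unrolled by two: the points (cx1, cy) and (cx2, cy) sit on the same row,
// so they share cy and advance together in one loop body.
std::pair<int, int> escape_count_pair(double cx1, double cx2, double cy,
                                      int max_iteration)
{
    assert(max_iteration >= 0);
    double zx1 = 0.0, zy1 = 0.0, zx2 = 0.0, zy2 = 0.0;
    int i1 = 0, i2 = 0;
    bool stop1 = false, stop2 = false;

    // Keep going until both points have finished.
    while (!stop1 || !stop2)
    {
        if (zx1 * zx1 + zy1 * zy1 <= 4.0 && i1 < max_iteration)
        {
            const double xtemp1 = zx1 * zx1 - zy1 * zy1 + cx1;
            zy1 = 2.0 * zx1 * zy1 + cy;
            zx1 = xtemp1;
            ++i1;
        }
        else
        {
            stop1 = true;  // point 1 escaped or hit the cap; its state is frozen
        }

        if (zx2 * zx2 + zy2 * zy2 <= 4.0 && i2 < max_iteration)
        {
            const double xtemp2 = zx2 * zx2 - zy2 * zy2 + cx2;
            zy2 = 2.0 * zx2 * zy2 + cy;
            zx2 = xtemp2;
            ++i2;
        }
        else
        {
            stop2 = true;  // point 2 escaped or hit the cap; its state is frozen
        }
    }
    return std::make_pair(i1, i2);
}
```

<p>Each point goes through exactly the same arithmetic as in the single-point version, so the two counts returned by escape_count_pair should match two separate calls to escape_count. The speedup likely comes from the two independent chains of floating-point operations giving the CPU more work to overlap, although that is an assumption rather than something measured here.</p>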
<p align="center"><img alt="Mandelbrot Set" src="/blog/wp-content/uploads/2008/09/mandelbrot_tbb.jpg" /></p>
<p><strong>Source code</strong> for this article can be downloaded <a href="/blog/wp-content/uploads/2008/09/mandelbrot_tbb.zip">here</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.kerrywong.com/2008/09/13/tbb-mandelbrot-set/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>A Simple TBB Program: TBB Prime</title>
		<link>http://www.kerrywong.com/2008/06/22/a-simple-tbb-program-tbb-prime/</link>
		<comments>http://www.kerrywong.com/2008/06/22/a-simple-tbb-program-tbb-prime/#comments</comments>
		<pubDate>Sun, 22 Jun 2008 16:12:08 +0000</pubDate>
		<dc:creator>kwong</dc:creator>
				<category><![CDATA[Coding]]></category>
		<category><![CDATA[C++]]></category>
		<category><![CDATA[Multi-threading]]></category>
		<category><![CDATA[TBB]]></category>

		<guid isPermaLink="false">http://www.kerrywong.com/2008/06/22/a-simple-tbb-program-tbb-prime/</guid>
		<description><![CDATA[I have been playing around with Intel&#8217;s Threading Building Blocks for a while and have come to really appreciate its simplicity and elegance: instead of thinking in terms of threads and thread synchronization, one can simply concentrate on the problem at hand. Take finding prime numbers, for example. While the problem itself (using the most [...]]]></description>
			<content:encoded><![CDATA[<p>I have been playing around with Intel&#8217;s <a href="http://www.threadingbuildingblocks.org/">Threading Building Blocks</a> (TBB) for a while and have come to really appreciate its simplicity and elegance: instead of thinking in terms of threads and thread synchronization, one can simply concentrate on the problem at hand.<span id="more-310"></span>  Take finding prime numbers, for example. While the problem itself (using the most rudimentary algorithm) is quite simple, getting it to work in a multi-threaded fashion does take a little bit of work. In this particular example, the prime-finding algorithm is easy to parallelize with threads, and thread synchronization is almost a non-issue since the problem domain can be divided into totally disjoint regions. In general, though, dividing the problem domain into multiple sub-domains and balancing the load among them can take significant work.  In the following example, I created two C++ classes that both find prime numbers in a given interval (since all prime numbers except 2 are odd, 2 is omitted from the calculations below), one sequential and the other parallel. In the main function, both methods are timed and the results are printed. (Download <a title="tbbprime.cpp" href="/blog/wp-content/uploads/2008/06/tbbprime.zip">tbbprime.cpp</a>.)</p>
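<p>As an aside before the full listing, the &quot;disjoint regions&quot; idea can be sketched without TBB or threads. The helper names below (is_odd_prime, count_odd_primes, count_in_chunks) are hypothetical and do not appear in tbbprime.cpp. The sketch only shows that each chunk can be processed on its own and the per-chunk results summed. A chunk boundary can fall on an even number, so each chunk is aligned to an odd start before stepping by 2.</p>

```cpp
#include <cassert>
#include <cstddef>

// Trial division by odd numbers only; valid for odd n >= 3.
bool is_odd_prime(std::size_t n)
{
    for (std::size_t i = 3; i * i <= n; i += 2)
    {
        if (n % i == 0)
            return false;  // first divisor found, no need to keep testing
    }
    return true;
}

// Count odd primes in the half-open interval [begin, end).
// begin | 1 rounds an even begin up to the next odd number, so stepping
// by 2 never lands on even values wherever the chunk boundary fell.
std::size_t count_odd_primes(std::size_t begin, std::size_t end)
{
    assert(begin >= 3);
    std::size_t count = 0;
    for (std::size_t x = begin | 1; x < end; x += 2)
    {
        if (is_odd_prime(x))
            ++count;
    }
    return count;
}

// Split [begin, end) into disjoint chunks of at most `grain` values and
// sum the per-chunk counts. No chunk reads or writes another chunk's
// data, which is why a parallel version needs no synchronization here.
std::size_t count_in_chunks(std::size_t begin, std::size_t end,
                            std::size_t grain)
{
    assert(grain > 0);
    std::size_t total = 0;
    for (std::size_t lo = begin; lo < end; lo += grain)
    {
        const std::size_t hi = (end - lo < grain) ? end : lo + grain;
        total += count_odd_primes(lo, hi);
    }
    return total;
}
```

<p>With these definitions, count_in_chunks(3, 100, 10) and count_odd_primes(3, 100) both give 24, the number of odd primes below 100. In a parallel version, each chunk would simply be handed to a different worker.</p>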
<div style="background: white none repeat scroll 0% 0%; font-family: Courier New; font-size: 10pt; color: black; -moz-background-clip: -moz-initial; -moz-background-origin: -moz-initial; -moz-background-inline-policy: -moz-initial;">
<div style="margin: 0px;"><span style="color: rgb(0, 128, 192); font-weight: bold;">#include</span> <span style="color: rgb(163, 21, 21); font-weight: bold;">&lt;stdio.h&gt;</span></div>
<div style="margin: 0px;"><span style="color: rgb(0, 128, 192); font-weight: bold;">#include</span> <span style="color: rgb(163, 21, 21); font-weight: bold;">&lt;stdlib.h&gt;</span></div>
<div style="margin: 0px;">&nbsp;</div>
<div style="margin: 0px;"><span style="color: rgb(0, 128, 192); font-weight: bold;">#include</span> <span style="color: rgb(163, 21, 21); font-weight: bold;">&lt;iostream&gt;</span></div>
<div style="margin: 0px;"><span style="color: rgb(0, 128, 192); font-weight: bold;">#include</span> <span style="color: rgb(163, 21, 21); font-weight: bold;">&lt;iomanip&gt;</span></div>
<div style="margin: 0px;"><span style="color: rgb(0, 128, 192); font-weight: bold;">#include</span> <span style="color: rgb(163, 21, 21); font-weight: bold;">&lt;math.h&gt;</span></div>
<div style="margin: 0px;">&nbsp;</div>
<div style="margin: 0px;"><span style="color: rgb(0, 128, 192); font-weight: bold;">#include</span> <span style="color: rgb(163, 21, 21); font-weight: bold;">&quot;tbb/task_scheduler_init.h&quot;</span></div>
<div style="margin: 0px;"><span style="color: rgb(0, 128, 192); font-weight: bold;">#include</span> <span style="color: rgb(163, 21, 21); font-weight: bold;">&quot;tbb/tick_count.h&quot;</span></div>
<div style="margin: 0px;"><span style="color: rgb(0, 128, 192); font-weight: bold;">#include</span> <span style="color: rgb(163, 21, 21); font-weight: bold;">&quot;tbb/blocked_range.h&quot;</span></div>
<div style="margin: 0px;"><span style="color: rgb(0, 128, 192); font-weight: bold;">#include</span> <span style="color: rgb(163, 21, 21); font-weight: bold;">&quot;tbb/parallel_for.h&quot;</span></div>
<div style="margin: 0px;"><span style="color: rgb(0, 128, 192); font-weight: bold;">#include</span> <span style="color: rgb(163, 21, 21); font-weight: bold;">&quot;tbb/partitioner.h&quot;</span></div>
<div style="margin: 0px;">&nbsp;</div>
<div style="margin: 0px;"><span style="color: blue; font-weight: bold;">using</span> <span style="color: blue; font-weight: bold;">namespace</span> std<span style="color: rgb(128, 128, 192); font-weight: bold;">;</span></div>
<div style="margin: 0px;"><span style="color: blue; font-weight: bold;">using</span> <span style="color: blue; font-weight: bold;">namespace</span> tbb<span style="color: rgb(128, 128, 192); font-weight: bold;">;</span></div>
<div style="margin: 0px;">&nbsp;</div>
<div style="margin: 0px;"><span style="color: blue; font-weight: bold;">static</span> <span style="color: blue; font-weight: bold;">const</span> <span style="color: blue; font-weight: bold;">int</span> MAX_SIZE <span style="color: rgb(128, 128, 192); font-weight: bold;">=</span> <span style="color: teal;">1000000</span><span style="color: rgb(128, 128, 192); font-weight: bold;">;</span></div>
<div style="margin: 0px;">&nbsp;</div>
<div style="margin: 0px;"><span style="color: blue; font-weight: bold;">class</span> prime_single_thread</div>
<div style="margin: 0px;"><span style="color: rgb(128, 128, 192); font-weight: bold;">{</span></div>
<div style="margin: 0px;"><span style="color: blue; font-weight: bold;">public</span><span style="color: rgb(128, 128, 192); font-weight: bold;">:</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; <span style="color: blue; font-weight: bold;">void</span> run<span style="color: rgb(128, 128, 192); font-weight: bold;">(</span><span style="color: blue; font-weight: bold;">const</span> blocked_range<span style="color: rgb(128, 128, 192); font-weight: bold;">&lt;</span>size_t<span style="color: rgb(128, 128, 192); font-weight: bold;">&gt;</span> <span style="color: rgb(128, 128, 192); font-weight: bold;">&amp;</span>r<span style="color: rgb(128, 128, 192); font-weight: bold;">)</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; <span style="color: rgb(128, 128, 192); font-weight: bold;">{</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <span style="color: blue; font-weight: bold;">bool</span> is_prime <span style="color: rgb(128, 128, 192); font-weight: bold;">=</span> <span style="color: blue; font-weight: bold;">false</span><span style="color: rgb(128, 128, 192); font-weight: bold;">;</span></div>
<div style="margin: 0px;">&nbsp;</div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <span style="color: blue; font-weight: bold;">for</span> <span style="color: rgb(128, 128, 192); font-weight: bold;">(</span>size_t x <span style="color: rgb(128, 128, 192); font-weight: bold;">=</span> r<span style="color: rgb(128, 128, 192); font-weight: bold;">.</span>begin<span style="color: rgb(128, 128, 192); font-weight: bold;">();</span> x <span style="color: rgb(128, 128, 192); font-weight: bold;">&lt;</span> r<span style="color: rgb(128, 128, 192); font-weight: bold;">.</span>end<span style="color: rgb(128, 128, 192); font-weight: bold;">();</span> x<span style="color: rgb(128, 128, 192); font-weight: bold;">+=</span><span style="color: teal;">2</span><span style="color: rgb(128, 128, 192); font-weight: bold;">)</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <span style="color: rgb(128, 128, 192); font-weight: bold;">{</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; is_prime <span style="color: rgb(128, 128, 192); font-weight: bold;">=</span> <span style="color: blue; font-weight: bold;">true</span><span style="color: rgb(128, 128, 192); font-weight: bold;">;</span></div>
<div style="margin: 0px;">&nbsp;</div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <span style="color: blue; font-weight: bold;">for</span> <span style="color: rgb(128, 128, 192); font-weight: bold;">(</span><span style="color: blue; font-weight: bold;">int</span> i <span style="color: rgb(128, 128, 192); font-weight: bold;">=</span> <span style="color: teal;">3</span><span style="color: rgb(128, 128, 192); font-weight: bold;">;</span> i <span style="color: rgb(128, 128, 192); font-weight: bold;">&lt;=</span> sqrt<span style="color: rgb(128, 128, 192); font-weight: bold;">((</span><span style="color: blue; font-weight: bold;">double</span><span style="color: rgb(128, 128, 192); font-weight: bold;">)</span> x<span style="color: rgb(128, 128, 192); font-weight: bold;">);</span> i<span style="color: rgb(128, 128, 192); font-weight: bold;">+=</span><span style="color: teal;">2</span><span style="color: rgb(128, 128, 192); font-weight: bold;">)</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <span style="color: rgb(128, 128, 192); font-weight: bold;">{</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <span style="color: blue; font-weight: bold;">if</span> <span style="color: rgb(128, 128, 192); font-weight: bold;">(</span>x <span style="color: rgb(128, 128, 192); font-weight: bold;">%</span> i <span style="color: rgb(128, 128, 192); font-weight: bold;">==</span> <span style="color: teal;">0</span><span style="color: rgb(128, 128, 192); font-weight: bold;">)</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <span style="color: rgb(128, 128, 192); font-weight: bold;">{</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; is_prime <span style="color: rgb(128, 128, 192); font-weight: bold;">=</span> <span style="color: blue; font-weight: bold;">false</span><span style="color: rgb(128, 128, 192); font-weight: bold;">;</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <span style="color: blue; font-weight: bold;">break</span><span style="color: rgb(128, 128, 192); font-weight: bold;">;</span> <span style="color: green;">// stop at the first divisor found</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <span style="color: rgb(128, 128, 192); font-weight: bold;">}</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <span style="color: rgb(128, 128, 192); font-weight: bold;">}</span></div>
<div style="margin: 0px;">&nbsp;</div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <span style="color: green;">// Output prime numbers:</span></div>
<div style="margin: 0px;"><span style="color: green;">//&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; if (is_prime)</span></div>
<div style="margin: 0px;"><span style="color: green;">//&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; {</span></div>
<div style="margin: 0px;"><span style="color: green;">//&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; cout &lt;&lt; x &lt;&lt; endl;</span></div>
<div style="margin: 0px;"><span style="color: green;">//&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; }</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <span style="color: rgb(128, 128, 192); font-weight: bold;">}</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; <span style="color: rgb(128, 128, 192); font-weight: bold;">}</span></div>
<div style="margin: 0px;"><span style="color: rgb(128, 128, 192); font-weight: bold;">};</span></div>
<div style="margin: 0px;">&nbsp;</div>
<div style="margin: 0px;"><span style="color: blue; font-weight: bold;">class</span> prime_tbb</div>
<div style="margin: 0px;"><span style="color: rgb(128, 128, 192); font-weight: bold;">{</span></div>
<div style="margin: 0px;"><span style="color: blue; font-weight: bold;">public</span><span style="color: rgb(128, 128, 192); font-weight: bold;">:</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; <span style="color: blue; font-weight: bold;">void</span> test_prime<span style="color: rgb(128, 128, 192); font-weight: bold;">(</span><span style="color: blue; font-weight: bold;">int</span> num<span style="color: rgb(128, 128, 192); font-weight: bold;">)</span> <span style="color: blue; font-weight: bold;">const</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; <span style="color: rgb(128, 128, 192); font-weight: bold;">{</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <span style="color: blue; font-weight: bold;">bool</span> is_prime <span style="color: rgb(128, 128, 192); font-weight: bold;">=</span> <span style="color: blue; font-weight: bold;">true</span><span style="color: rgb(128, 128, 192); font-weight: bold;">;</span></div>
<div style="margin: 0px;">&nbsp;</div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <span style="color: blue; font-weight: bold;">for</span> <span style="color: rgb(128, 128, 192); font-weight: bold;">(</span><span style="color: blue; font-weight: bold;">int</span> i <span style="color: rgb(128, 128, 192); font-weight: bold;">=</span> <span style="color: teal;">3</span><span style="color: rgb(128, 128, 192); font-weight: bold;">;</span> i <span style="color: rgb(128, 128, 192); font-weight: bold;">&lt;=</span> sqrt<span style="color: rgb(128, 128, 192); font-weight: bold;">((</span><span style="color: blue; font-weight: bold;">double</span><span style="color: rgb(128, 128, 192); font-weight: bold;">)</span> num<span style="color: rgb(128, 128, 192); font-weight: bold;">);</span> i<span style="color: rgb(128, 128, 192); font-weight: bold;">+=</span><span style="color: teal;">2</span><span style="color: rgb(128, 128, 192); font-weight: bold;">)</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <span style="color: rgb(128, 128, 192); font-weight: bold;">{</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <span style="color: blue; font-weight: bold;">if</span> <span style="color: rgb(128, 128, 192); font-weight: bold;">(</span>num <span style="color: rgb(128, 128, 192); font-weight: bold;">%</span> i <span style="color: rgb(128, 128, 192); font-weight: bold;">==</span> <span style="color: teal;">0</span><span style="color: rgb(128, 128, 192); font-weight: bold;">)</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <span style="color: rgb(128, 128, 192); font-weight: bold;">{</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; is_prime <span style="color: rgb(128, 128, 192); font-weight: bold;">=</span> <span style="color: blue; font-weight: bold;">false</span><span style="color: rgb(128, 128, 192); font-weight: bold;">;</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <span style="color: blue; font-weight: bold;">break</span><span style="color: rgb(128, 128, 192); font-weight: bold;">;</span> <span style="color: green;">// stop at the first divisor found</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <span style="color: rgb(128, 128, 192); font-weight: bold;">}</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <span style="color: rgb(128, 128, 192); font-weight: bold;">}</span></div>
<div style="margin: 0px;">&nbsp;</div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <span style="color: green;">// Output prime numbers:</span></div>
<div style="margin: 0px;"><span style="color: green;">//&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; if (is_prime)</span></div>
<div style="margin: 0px;"><span style="color: green;">//&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; {</span></div>
<div style="margin: 0px;"><span style="color: green;">//&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; cout &lt;&lt; num &lt;&lt; endl;</span></div>
<div style="margin: 0px;"><span style="color: green;">//&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; }</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; <span style="color: rgb(128, 128, 192); font-weight: bold;">}</span></div>
<div style="margin: 0px;">&nbsp;</div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; <span style="color: blue; font-weight: bold;">void</span> <span style="color: blue; font-weight: bold;">operator</span><span style="color: rgb(128, 128, 192); font-weight: bold;">()</span> <span style="color: rgb(128, 128, 192); font-weight: bold;">(</span><span style="color: blue; font-weight: bold;">const</span> blocked_range<span style="color: rgb(128, 128, 192); font-weight: bold;">&lt;</span>size_t<span style="color: rgb(128, 128, 192); font-weight: bold;">&gt;</span> <span style="color: rgb(128, 128, 192); font-weight: bold;">&amp;</span>r<span style="color: rgb(128, 128, 192); font-weight: bold;">)</span> <span style="color: blue; font-weight: bold;">const</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; <span style="color: rgb(128, 128, 192); font-weight: bold;">{</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <span style="color: green;">// parallel_for may split the range at an even value, so start each chunk on an odd number.</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <span style="color: blue; font-weight: bold;">for</span> <span style="color: rgb(128, 128, 192); font-weight: bold;">(</span>size_t i <span style="color: rgb(128, 128, 192); font-weight: bold;">=</span> r<span style="color: rgb(128, 128, 192); font-weight: bold;">.</span>begin<span style="color: rgb(128, 128, 192); font-weight: bold;">()</span> <span style="color: rgb(128, 128, 192); font-weight: bold;">|</span> <span style="color: teal;">1</span><span style="color: rgb(128, 128, 192); font-weight: bold;">;</span> i <span style="color: rgb(128, 128, 192); font-weight: bold;">&lt;</span> r<span style="color: rgb(128, 128, 192); font-weight: bold;">.</span>end<span style="color: rgb(128, 128, 192); font-weight: bold;">();</span> i<span style="color: rgb(128, 128, 192); font-weight: bold;">+=</span><span style="color: teal;">2</span><span style="color: rgb(128, 128, 192); font-weight: bold;">)</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <span style="color: rgb(128, 128, 192); font-weight: bold;">{</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; test_prime<span style="color: rgb(128, 128, 192); font-weight: bold;">(</span>i<span style="color: rgb(128, 128, 192); font-weight: bold;">);</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <span style="color: rgb(128, 128, 192); font-weight: bold;">}</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; <span style="color: rgb(128, 128, 192); font-weight: bold;">}</span></div>
<div style="margin: 0px;">&nbsp;</div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; <span style="color: blue; font-weight: bold;">void</span> run<span style="color: rgb(128, 128, 192); font-weight: bold;">(</span>blocked_range<span style="color: rgb(128, 128, 192); font-weight: bold;">&lt;</span>size_t<span style="color: rgb(128, 128, 192); font-weight: bold;">&gt;</span> r<span style="color: rgb(128, 128, 192); font-weight: bold;">)</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; <span style="color: rgb(128, 128, 192); font-weight: bold;">{</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; prime_tbb body<span style="color: rgb(128, 128, 192); font-weight: bold;">;</span></div>
<div style="margin: 0px;">&nbsp;</div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; parallel_for<span style="color: rgb(128, 128, 192); font-weight: bold;">(</span>r<span style="color: rgb(128, 128, 192); font-weight: bold;">,</span> body<span style="color: rgb(128, 128, 192); font-weight: bold;">);</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; <span style="color: rgb(128, 128, 192); font-weight: bold;">}</span></div>
<div style="margin: 0px;">&nbsp;</div>
<div style="margin: 0px;"><span style="color: rgb(128, 128, 192); font-weight: bold;">};</span></div>
<div style="margin: 0px;">&nbsp;</div>
<div style="margin: 0px;"><span style="color: blue; font-weight: bold;">int</span> main<span style="color: rgb(128, 128, 192); font-weight: bold;">()</span></div>
<div style="margin: 0px;"><span style="color: rgb(128, 128, 192); font-weight: bold;">{</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; task_scheduler_init init<span style="color: rgb(128, 128, 192); font-weight: bold;">;</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; <span style="color: blue; font-weight: bold;">static</span> tick_count t_start<span style="color: rgb(128, 128, 192); font-weight: bold;">,</span> t_end<span style="color: rgb(128, 128, 192); font-weight: bold;">;</span></div>
<div style="margin: 0px;">&nbsp;</div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; prime_single_thread p1<span style="color: rgb(128, 128, 192); font-weight: bold;">;</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; prime_tbb p2<span style="color: rgb(128, 128, 192); font-weight: bold;">;</span></div>
<div style="margin: 0px;">&nbsp;</div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; cout<span style="color: rgb(128, 128, 192); font-weight: bold;">.</span>setf<span style="color: rgb(128, 128, 192); font-weight: bold;">(</span>ios<span style="color: rgb(128, 128, 192); font-weight: bold;">::</span>fixed<span style="color: rgb(128, 128, 192); font-weight: bold;">);</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; cout<span style="color: rgb(128, 128, 192); font-weight: bold;">.</span>setf<span style="color: rgb(128, 128, 192); font-weight: bold;">(</span>ios<span style="color: rgb(128, 128, 192); font-weight: bold;">::</span>showpoint<span style="color: rgb(128, 128, 192); font-weight: bold;">);</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; cout<span style="color: rgb(128, 128, 192); font-weight: bold;">.</span>precision<span style="color: rgb(128, 128, 192); font-weight: bold;">(</span><span style="color: teal;">2</span><span style="color: rgb(128, 128, 192); font-weight: bold;">);</span></div>
<div style="margin: 0px;">&nbsp;</div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; <span style="color: green;">// Starting from 3, with a grain size of 100.</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; blocked_range<span style="color: rgb(128, 128, 192); font-weight: bold;">&lt;</span>size_t<span style="color: rgb(128, 128, 192); font-weight: bold;">&gt;</span> r<span style="color: rgb(128, 128, 192); font-weight: bold;">(</span><span style="color: teal;">3</span><span style="color: rgb(128, 128, 192); font-weight: bold;">,</span> MAX_SIZE<span style="color: rgb(128, 128, 192); font-weight: bold;">,</span> <span style="color: teal;">100</span><span style="color: rgb(128, 128, 192); font-weight: bold;">);</span></div>
<div style="margin: 0px;">&nbsp;</div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; t_start <span style="color: rgb(128, 128, 192); font-weight: bold;">=</span> tick_count<span style="color: rgb(128, 128, 192); font-weight: bold;">::</span>now<span style="color: rgb(128, 128, 192); font-weight: bold;">();</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; p1<span style="color: rgb(128, 128, 192); font-weight: bold;">.</span>run<span style="color: rgb(128, 128, 192); font-weight: bold;">(</span>r<span style="color: rgb(128, 128, 192); font-weight: bold;">);</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; t_end <span style="color: rgb(128, 128, 192); font-weight: bold;">=</span> tick_count<span style="color: rgb(128, 128, 192); font-weight: bold;">::</span>now<span style="color: rgb(128, 128, 192); font-weight: bold;">();</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; cout <span style="color: rgb(128, 128, 192); font-weight: bold;">&lt;&lt;</span> <span style="color: rgb(128, 128, 192); font-weight: bold;">(</span>t_end <span style="color: rgb(128, 128, 192); font-weight: bold;">-</span> t_start<span style="color: rgb(128, 128, 192); font-weight: bold;">).</span>seconds<span style="color: rgb(128, 128, 192); font-weight: bold;">()</span> <span style="color: rgb(128, 128, 192); font-weight: bold;">*</span> <span style="color: teal;">1000</span> <span style="color: rgb(128, 128, 192); font-weight: bold;">&lt;&lt;</span> <span style="color: rgb(163, 21, 21); font-weight: bold;">&quot; ms&quot;</span> <span style="color: rgb(128, 128, 192); font-weight: bold;">&lt;&lt;</span> endl<span style="color: rgb(128, 128, 192); font-weight: bold;">;</span></div>
<div style="margin: 0px;">&nbsp;</div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; t_start <span style="color: rgb(128, 128, 192); font-weight: bold;">=</span> tick_count<span style="color: rgb(128, 128, 192); font-weight: bold;">::</span>now<span style="color: rgb(128, 128, 192); font-weight: bold;">();</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; p2<span style="color: rgb(128, 128, 192); font-weight: bold;">.</span>run<span style="color: rgb(128, 128, 192); font-weight: bold;">(</span>r<span style="color: rgb(128, 128, 192); font-weight: bold;">);</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; t_end <span style="color: rgb(128, 128, 192); font-weight: bold;">=</span> tick_count<span style="color: rgb(128, 128, 192); font-weight: bold;">::</span>now<span style="color: rgb(128, 128, 192); font-weight: bold;">();</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; cout <span style="color: rgb(128, 128, 192); font-weight: bold;">&lt;&lt;</span> <span style="color: rgb(128, 128, 192); font-weight: bold;">(</span>t_end <span style="color: rgb(128, 128, 192); font-weight: bold;">-</span> t_start<span style="color: rgb(128, 128, 192); font-weight: bold;">).</span>seconds<span style="color: rgb(128, 128, 192); font-weight: bold;">()</span> <span style="color: rgb(128, 128, 192); font-weight: bold;">*</span> <span style="color: teal;">1000</span> <span style="color: rgb(128, 128, 192); font-weight: bold;">&lt;&lt;</span> <span style="color: rgb(163, 21, 21); font-weight: bold;">&quot; ms&quot;</span> <span style="color: rgb(128, 128, 192); font-weight: bold;">&lt;&lt;</span> endl<span style="color: rgb(128, 128, 192); font-weight: bold;">;</span></div>
<div style="margin: 0px;">&nbsp;</div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; <span style="color: blue; font-weight: bold;">return</span> <span style="color: teal;">0</span><span style="color: rgb(128, 128, 192); font-weight: bold;">;</span></div>
<div style="margin: 0px;"><span style="color: rgb(128, 128, 192); font-weight: bold;">}</span></div>
</div>
<div style="background: white none repeat scroll 0% 0%; -moz-background-clip: -moz-initial; -moz-background-origin: -moz-initial; -moz-background-inline-policy: -moz-initial; font-family: Courier New; font-size: 10pt; color: black;">&nbsp;</div>
<p>For a very large interval (e.g. 3 to 1,000,000), the TBB version of the prime program achieved a nearly 4x speedup given a reasonably large grain size (e.g. 100). Smaller grain sizes resulted in slightly more overhead. On a quad-core machine (Q9450 @ 3.2GHz), the single-threaded routine took 217.83 ms to find all the prime numbers below 1,000,000, whereas the TBB version took only 58.32 ms, which is about 3.7 times as fast. The TBB framework took care of dividing the task according to the number of processors automatically.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.kerrywong.com/2008/06/22/a-simple-tbb-program-tbb-prime/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Develop TBB Using KDevelop and Code::Blocks</title>
		<link>http://www.kerrywong.com/2008/06/01/develop-tbb-using-kdevelop-and-codeblocks/</link>
		<comments>http://www.kerrywong.com/2008/06/01/develop-tbb-using-kdevelop-and-codeblocks/#comments</comments>
		<pubDate>Mon, 02 Jun 2008 01:20:23 +0000</pubDate>
		<dc:creator>kwong</dc:creator>
				<category><![CDATA[Coding]]></category>
		<category><![CDATA[Linux/BSD]]></category>
		<category><![CDATA[Code::Blocks]]></category>
		<category><![CDATA[KDevelop]]></category>
		<category><![CDATA[TBB]]></category>

		<guid isPermaLink="false">http://www.kerrywong.com/2008/06/01/develop-tbb-using-kdevelop-and-codeblocks/</guid>
		<description><![CDATA[I have been playing around with Intel&#8217;s open source TBB (Threading Building Blocks) recently. The default download includes all the sources that can run on Windows, Linux or Mac OS X. The Linux version, however, builds using the shell by default and does not support any IDEs (the Windows version can be built using Visual [...]]]></description>
			<content:encoded><![CDATA[<p>I have been playing around with Intel&#8217;s open source <a href="http://www.threadingbuildingblocks.org/">TBB</a> (Threading Building Blocks) recently.<span id="more-291"></span> The default download includes all the sources that can run on Windows, Linux or Mac OS X. The Linux version, however, builds using the shell by default and does not support any IDEs (the Windows version can be built using Visual Studio 2003 or 2005). Even though running TBB applications under a Linux IDE is not all that different from doing the same under Windows, I found few tutorials on the Internet on how to do this. So here I will show you how to set up <a href="http://www.kdevelop.org/">KDevelop</a> and <a href="http://www.codeblocks.org">Code::Blocks</a> for your TBB development and debugging.</p>
<p>KDevelop is a powerful multi-language development IDE that comes as a default package for the KDE environment (e.g. Kubuntu). For those who are using GNOME, it can be easily installed via apt-get install (Ubuntu).</p>
<p>The KDevelop IDE is one of the most feature-rich integrated programming environments on Linux, but unfortunately it is not the easiest to get started with due to its overwhelmingly large feature set. Here I will show you what I found to be the easiest way to set up a project that supports TBB, using the Automake project type.</p>
<p>The version of KDevelop I am using is 3.5.9. The commands might be slightly different in other versions, but the idea should be the same. To start a TBB application, use Automake project type (Projects -&gt; New Projects, under C++ Automake Project, choose Empty Autotools Template.)</p>
<div align="center">
<p><img alt="" src="/blog/wp-content/uploads/2008/06/tbb_kde_1.png" /></p>
</div>
<p>Don&#8217;t be surprised to see a few dozen files being created. Most of these files are used for program auto-configuration.</p>
<p>All Automake projects need an active target, so we will create one. By default, all the targets are configured within the right-hand-side split panels. The active target is built into the binary executable, which takes the target&#8217;s name. In my example below, the target is named tbbdev, and I will use one of the stock programs (sub_string_finder.cpp) that came with the TBB source code. It can be found in <em><strong>{TBB Source root}</strong></em>/examples/GettingStarted/sub_string_finder/</p>
<p align="center"><img src="/blog/wp-content/uploads/2008/06/tbb_kde_2.png" alt="" /></p>
<p align="left">In order for the target application to find the shared library at runtime, we need to set up the environment variable LD_LIBRARY_PATH. The setting can be found at Project -&gt; Project Options.</p>
<p align="center"><img src="/blog/wp-content/uploads/2008/06/tbb_kde_3.png" alt="" /></p>
<p align="left">In the example given above, the TBB library source was extracted into my home folder under tbb (<strong>/home/kwong/tbb</strong>/tbb20_020oss_src/), and the library I used is for 64-bit operating systems. Choose the library path according to whether you are doing a debug build or a release build.</p>
<p align="left">Next we need to configure the build target. The include paths are configured at the root project level (which is the parent to all configured targets).</p>
<p align="center"><img src="/blog/wp-content/uploads/2008/06/tbb_kde_4.png" alt="" /></p>
<p align="left">The added directory should take the following format: -I<em><strong>{TBB installation root}</strong></em>/include/ (that is an uppercase I, the compiler&#8217;s include path flag).</p>
<p align="left">Then we will configure the target library path (this takes place at the active target level):</p>
<p align="center"><img src="/blog/wp-content/uploads/2008/06/tbb_kde_5.png" alt="" /></p>
<p align="left">Note the syntax: the first part (-L) specifies the library path, and the second part (-l) specifies the library name. The actual library file is called libtbb_debug.so, so the name given to -l is tbb_debug.</p>
<p align="left">Those are all the settings you will need to compile and run the sample program. Before you start debugging though, make sure that the current target is set to be built in debug mode (for some reason, the default build config does not support debugging, and thus you cannot set breakpoints within the IDE):</p>
<p align="center"><img src="/blog/wp-content/uploads/2008/06/tbb_kde_6.png" longdesc="http://www.kerrywong.com/undefined" alt="" /></p>
<p align="left">Alternatively, you can also change the compiler flags (CXXFLAGS) to -O0 -g3 (in project options).</p>
<p align="left">After the above steps, you should be able to compile, debug and run the sample TBB program from within KDevelop.</p>
<p align="left">As I suggested earlier, KDevelop might not be the easiest to configure and run since it is mainly geared towards the more complex KDE applications. If you are used to the Microsoft Visual Studio IDEs, you might find <a href="http://www.codeblocks.org/">Code::Blocks</a> much easier to use.</p>
<p align="left">Start Code::Blocks, choose File -&gt; New -&gt; Project, and then choose Console Application. Again, we will use the sub_string_finder.cpp sample code. The library path and include path properties can be set either at the workspace level (one workspace can contain multiple projects, and the settings are inherited unless overridden) or at the project level. Assume that we are configuring at the workspace level. Click and highlight the workspace, choose Project -&gt; Build Options from the menu, and click on the Linker settings tab. Add the libraries to be linked here:</p>
<p align="center"><img src="/blog/wp-content/uploads/2008/06/tbb_cbk_1.png" alt="" /></p>
<p align="left">Note that only the first entry is necessary for this particular project. As you can see above, you can easily configure multiple library dependencies here. Now click on the Search directories tab, and set up the include path:</p>
<p align="center"><img src="/blog/wp-content/uploads/2008/06/tbb_cbk_2.png" alt="" /></p>
<p align="left">Now you can build and debug the application in Code::Blocks.&nbsp;</p>
]]></content:encoded>
			<wfw:commentRss>http://www.kerrywong.com/2008/06/01/develop-tbb-using-kdevelop-and-codeblocks/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
		<item>
		<title>TBB Benchmarks</title>
		<link>http://www.kerrywong.com/2008/05/11/tbb-benchmarks/</link>
		<comments>http://www.kerrywong.com/2008/05/11/tbb-benchmarks/#comments</comments>
		<pubDate>Mon, 12 May 2008 02:18:35 +0000</pubDate>
		<dc:creator>kwong</dc:creator>
				<category><![CDATA[Miscellaneous]]></category>
		<category><![CDATA[Benchmark]]></category>
		<category><![CDATA[TBB]]></category>

		<guid isPermaLink="false">http://www.kerrywong.com/2008/05/11/tbb-benchmarks/</guid>
		<description><![CDATA[Since I started using Ubuntu 8.04 as my main operating system, I have been trying to obtain some benchmark information for my month-old new build. Unlike a machine running Windows, benchmarking suites for Linux are few and far between, and are especially hard to find for 64bit systems. Phoronix does provide an excellent test suite [...]]]></description>
			<content:encoded><![CDATA[<p>Since I started using Ubuntu 8.04 as my main operating system, I have been trying to obtain some benchmark information for my <a href="/2008/04/12/some-pictures-of-my-new-rig/">month-old new build</a>.<span id="more-288"></span></p>
<p>Unlike on Windows, benchmarking suites for Linux are few and far between, and are especially hard to find for 64-bit systems. <a href="http://www.phoronix.com">Phoronix</a> does provide <a href="http://phoronix-test-suite.com/">an excellent test suite</a> that is designed to run under Linux. But I haven&#8217;t had any luck getting the latest version (0.6.0) to build and run properly under the 64-bit version of Linux yet. And since most of the applications within the test suite have Linux versions only, it would be very difficult to make cross-OS performance comparisons.</p>
<p>If your primary goal is to test your CPU and memory subsystem, then I would recommend using Intel&#8217;s open source <a href="http://www.threadingbuildingblocks.org/">Threading Building Blocks</a> (TBB). The source includes a few algorithms that are executed after compilation to test whether the build was successful. As a side benefit, these tests are timed and can be used as benchmarks as well.</p>
<p>As an example, the following list is the benchmark information obtained while building the latest stable version of TBB (tbb20_020oss_src). The library was built on my&nbsp;machine (Q9450 @ 3.2GHz, 8GB DDR2-800, Linux 2.6.24-16-generic SMP x86_64):</p>
<blockquote dir="ltr" style="margin-right: 0px">
<div>./count_strings 1<br />
threads = 1&nbsp; total = 1000000&nbsp; time = 0.336895<br />
./count_strings 2<br />
threads = 2&nbsp; total = 1000000&nbsp; time = 0.214048<br />
./count_strings 4<br />
threads = 4&nbsp; total = 1000000&nbsp; time = 0.181645</div>
<div>./seismic &#8211; 300<br />
101.5 frame per sec with serial version<br />
102.3 frame per sec with 1 way parallelism<br />
193.9 frame per sec with 2 way parallelism<br />
219.0 frame per sec with 3 way parallelism<br />
244.2 frame per sec with 4 way parallelism</div>
<div>./convex_hull_bench<br />
Starting TBB unbufferred push_back version of QUICK HULL algorithm<br />
&nbsp; Number of nodes:5000000&nbsp; Number of threads:1&nbsp; Initialization time:0.293048&nbsp; Calculation time:0.807145<br />
&nbsp; Number of nodes:5000000&nbsp; Number of threads:2&nbsp; Initialization time:0.822569&nbsp; Calculation time:1.02838<br />
&nbsp; Number of nodes:5000000&nbsp; Number of threads:3&nbsp; Initialization time:0.607247&nbsp; Calculation time:1.13264<br />
&nbsp; Number of nodes:5000000&nbsp; Number of threads:4&nbsp; Initialization time:0.5828&nbsp; Calculation time:1.08477<br />
&nbsp; Number of nodes:5000000&nbsp; Number of threads:5&nbsp; Initialization time:0.569491&nbsp; Calculation time:1.10567<br />
&nbsp; Number of nodes:5000000&nbsp; Number of threads:6&nbsp; Initialization time:0.585655&nbsp; Calculation time:1.09051<br />
&nbsp; Number of nodes:5000000&nbsp; Number of threads:7&nbsp; Initialization time:0.583944&nbsp; Calculation time:1.08213<br />
&nbsp; Number of nodes:5000000&nbsp; Number of threads:8&nbsp; Initialization time:0.561563&nbsp; Calculation time:1.09363<br />
Starting TBB bufferred version of QUICK HULL algorithm<br />
&nbsp; Number of nodes:5000000&nbsp; Number of threads:1&nbsp; Initialization time:0.180772&nbsp; Calculation time:0.713631<br />
&nbsp; Number of nodes:5000000&nbsp; Number of threads:2&nbsp; Initialization time:0.09458&nbsp; Calculation time:0.369742<br />
&nbsp; Number of nodes:5000000&nbsp; Number of threads:3&nbsp; Initialization time:0.0698851&nbsp; Calculation time:0.266026<br />
&nbsp; Number of nodes:5000000&nbsp; Number of threads:4&nbsp; Initialization time:0.0567744&nbsp; Calculation time:0.207367<br />
&nbsp; Number of nodes:5000000&nbsp; Number of threads:5&nbsp; Initialization time:0.0555128&nbsp; Calculation time:0.230236<br />
&nbsp; Number of nodes:5000000&nbsp; Number of threads:6&nbsp; Initialization time:0.0598358&nbsp; Calculation time:0.23095<br />
&nbsp; Number of nodes:5000000&nbsp; Number of threads:7&nbsp; Initialization time:0.0624336&nbsp; Calculation time:0.257518<br />
&nbsp; Number of nodes:5000000&nbsp; Number of threads:8&nbsp; Initialization time:0.0586483&nbsp; Calculation time:0.278667</div>
<div>./primes 100000000 0:4<br />
#primes from [2..100000000] = 5761455 (0.16 sec with serial code)<br />
#primes from [2..100000000] = 5761455 (0.18 sec with 1-way parallelism)<br />
#primes from [2..100000000] = 5761455 (0.09 sec with 2-way parallelism)<br />
#primes from [2..100000000] = 5761455 (0.06 sec with 3-way parallelism)<br />
#primes from [2..100000000] = 5761455 (0.05 sec with 4-way parallelism)</div>
<div>./parallel_preorder 1:4<br />
0.235308 seconds using 1 threads (average of 199.74 nodes in root_set)<br />
0.202356 seconds using 2 threads (average of 199.74 nodes in root_set)<br />
0.153144 seconds using 3 threads (average of 199.74 nodes in root_set)<br />
0.181067 seconds using 4 threads (average of 199.74 nodes in root_set)</div>
<div>./sum_tree<br />
Tree creation using TBB scalable allocator<br />
&nbsp;&nbsp; half created serially: time = 177.1 msec<br />
&nbsp;&nbsp; half done in parallel: time = 77.9 msec<br />
Calculations:<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; SerialSumTree: time = 77.9 msec, sum=7.01275e+08<br />
&nbsp;&nbsp; SimpleParallelSumTree: time = 44.5 msec, sum=7.01275e+08<br />
OptimizedParallelSumTree: time = 43.4 msec, sum=7.01275e+08<br />
./sum_tree -stdmalloc<br />
Tree creation using standard operator new<br />
&nbsp;&nbsp; half created serially: time = 369.2 msec<br />
&nbsp;&nbsp; half done in parallel: time = 548.7 msec<br />
Calculations:<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; SerialSumTree: time = 94.7 msec, sum=7.01275e+08<br />
&nbsp;&nbsp; SimpleParallelSumTree: time = 65.3 msec, sum=7.01275e+08<br />
OptimizedParallelSumTree: time = 65.4 msec, sum=7.01275e+08</div>
</blockquote>
<p>&nbsp;</p>
]]></content:encoded>
			<wfw:commentRss>http://www.kerrywong.com/2008/05/11/tbb-benchmarks/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
