<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Kerry D. Wong &#187; C++</title>
	<atom:link href="http://www.kerrywong.com/tag/c/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.kerrywong.com</link>
	<description></description>
	<lastBuildDate>Sat, 13 Mar 2010 01:47:11 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9.2</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>A Parallel Port Stepper Motor Driver With Discrete Components</title>
		<link>http://www.kerrywong.com/2010/02/20/a-parallel-port-stepper-motor-driver-with-discrete-components/</link>
		<comments>http://www.kerrywong.com/2010/02/20/a-parallel-port-stepper-motor-driver-with-discrete-components/#comments</comments>
		<pubDate>Sun, 21 Feb 2010 01:31:13 +0000</pubDate>
		<dc:creator>kwong</dc:creator>
				<category><![CDATA[Miscellaneous]]></category>
		<category><![CDATA[C++]]></category>
		<category><![CDATA[MOSFET]]></category>
		<category><![CDATA[Parallel Port]]></category>
		<category><![CDATA[Stepper Motor]]></category>

		<guid isPermaLink="false">http://www.kerrywong.com/?p=1636</guid>
		<description><![CDATA[Using a PC&#8217;s parallel port is a convenient way to control a stepper motor. For unipolar stepper motors, up to two motors can be controlled with the port&#8217;s eight data lines.
The standard way of connecting a unipolar stepper motor to the parallel port is to use a Darlington driver such as ULN2003 and there are already many [...]]]></description>
			<content:encoded><![CDATA[<p>Using a PC&#8217;s <a href="http://en.wikipedia.org/wiki/Parallel_port">parallel port</a> is a convenient way to control a stepper motor. Since a unipolar stepper motor needs four control lines (one per coil), up to two motors can be controlled with the port&#8217;s eight data lines.<span id="more-1636"></span></p>
<p>The standard way of connecting a <a href="http://en.wikipedia.org/wiki/Stepper_motor">unipolar stepper motor</a> to the parallel port is to use a <a href="http://en.wikipedia.org/wiki/Darlington_transistor">Darlington</a> driver such as <a href="http://www.st.com/stonline/books/pdf/docs/5279.pdf">ULN2003</a> and there are already <a href="http://www.google.com/#hl=en&#038;q=parallel+port+stepper+motor&#038;aq=f&#038;aqi=g1&#038;oq=&#038;fp=79a46ede2c2a175d">many examples</a> out there on how to do this. In this post, I will show you how to build a simple stepper motor driver using discrete <a href="http://en.wikipedia.org/wiki/MOSFET">MOSFET</a>s.</p>
<p>In the circuit diagram below, four power MOSFETs act as switches, one for each coil of the stepper motor. The stepper motor I used in this example is a <a href="http://www.mitsumi.co.jp/Catalog/pdf/motor_m35sp_9_e.pdf">MITSUMI M35SP-9</a>, but in theory any unipolar stepper motor should work with this circuit. A pull-down resistor is attached to the gate of each MOSFET. This is important because otherwise interference from the port would prevent the MOSFET from switching reliably.<br />
<div id="attachment_1648" class="wp-caption aligncenter" style="width: 310px"><a href="http://www.kerrywong.com/blog/wp-content/uploads/2010/02/controllercircuit.png"><img src="http://www.kerrywong.com/blog/wp-content/uploads/2010/02/controllercircuit-300x145.png" alt="Controller Circuit" title="Controller Circuit" width="300" height="145" class="size-medium wp-image-1648" /></a><p class="wp-caption-text">Controller Circuit</p></div></p>
<p>The benefit of using discrete MOSFETs is that they can handle high current loads. Using the <a href="http://pdf1.alldatasheet.com/datasheet-pdf/view/96663/IRF/IRFZ22.html">IRFZ22</a>, the coil current can be as high as 10 A.<br />
<div id="attachment_1657" class="wp-caption aligncenter" style="width: 310px"><a href="http://www.kerrywong.com/blog/wp-content/uploads/2010/02/steppermotorctrl.jpg"><img src="http://www.kerrywong.com/blog/wp-content/uploads/2010/02/steppermotorctrl-300x225.jpg" alt="Stepper Motor Controller" title="Stepper Motor Controller" width="300" height="225" class="size-medium wp-image-1657" /></a><p class="wp-caption-text">Stepper Motor Controller</p></div></p>
<p>The following C code drives data port pins 2, 3, 4 and 5 high one after another so that the stepper motor rotates clockwise. If you want the motor to rotate counter-clockwise, simply change the output order to 8, 4, 2, 1.</p>
<pre class="brush: cpp;">
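#include &lt;stdio.h&gt;
#include &lt;unistd.h&gt;
#include &lt;sys/io.h&gt;

#define PAR_PORT 0x378
#define STEP_DELAY_US 5000 /* pause between steps; an assumed value, tune for your motor */

int main()
{
	/* one coil at a time: data pins 2, 3, 4, 5 are bits 0-3 */
	const unsigned char steps[] = {1, 2, 4, 8};

	/* user-space port access must be granted first (requires root) */
	if (ioperm(PAR_PORT, 1, 1) != 0)
	{
		perror(&quot;ioperm&quot;);
		return 1;
	}

	while (1)
	{
		for (int i = 0; i &lt; 4; i++)
		{
			outb(steps[i], PAR_PORT);
			usleep(STEP_DELAY_US); /* give the rotor time to move */
		}
	}
}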
</pre>
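<p>As a rough sketch of how the output order determines the direction, the helper below builds the sequence of port values for either direction. The name <code>stepSequence</code> and the <code>motor</code> parameter are hypothetical, and placing a second motor on the upper four data bits is an assumption about the wiring, not something shown in the circuit above.</p>

```cpp
#include <vector>

// Returns the data-port values for one full cycle, energizing one coil at a time.
// motor 0 uses data bits 0-3; motor 1 is ASSUMED to be wired to data bits 4-7.
std::vector<unsigned char> stepSequence(bool clockwise, int motor) {
    std::vector<unsigned char> seq;
    for (int coil = 0; coil < 4; ++coil) {
        // Clockwise walks the coils 0,1,2,3; counter-clockwise walks them 3,2,1,0.
        int bit = clockwise ? coil : 3 - coil;
        seq.push_back(static_cast<unsigned char>(1 << (bit + 4 * motor)));
    }
    return seq;
}
```

<p>For the first motor this yields 1, 2, 4, 8 for clockwise rotation and 8, 4, 2, 1 for counter-clockwise rotation, which are the values written to the port in the loop above.</p>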
]]></content:encoded>
			<wfw:commentRss>http://www.kerrywong.com/2010/02/20/a-parallel-port-stepper-motor-driver-with-discrete-components/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>A Simple Program for Finding Palindromic Prime Numbers</title>
		<link>http://www.kerrywong.com/2009/11/15/a-simple-program-for-finding-palindromic-prime-numbers/</link>
		<comments>http://www.kerrywong.com/2009/11/15/a-simple-program-for-finding-palindromic-prime-numbers/#comments</comments>
		<pubDate>Mon, 16 Nov 2009 01:49:08 +0000</pubDate>
		<dc:creator>kwong</dc:creator>
				<category><![CDATA[Coding]]></category>
		<category><![CDATA[Miscellaneous]]></category>
		<category><![CDATA[C++]]></category>
		<category><![CDATA[Palindromic Prime Number]]></category>
		<category><![CDATA[Palprime]]></category>
		<category><![CDATA[Prime Number]]></category>

		<guid isPermaLink="false">http://www.kerrywong.com/?p=1527</guid>
		<description><![CDATA[A palindromic prime (palprime) is a prime number that is also palindromic. So out of curiosity I wrote a simple program a few days ago that can find the palindromic primes within a given range. Here is the code in C++:

#include &#60;stdio.h&#62;
#include &#60;stdlib.h&#62;
#include &#60;limits.h&#62;
#include &#60;math.h&#62;
#include &#60;iostream&#62;

using namespace std;

bool IsPrime(unsigned long long n) {
	bool r = [...]]]></description>
			<content:encoded><![CDATA[<p>A <a href="http://mathworld.wolfram.com/PalindromicPrime.html">palindromic prime</a> (palprime) is a prime number that is also palindromic. So out of curiosity I wrote a simple program a few days ago that can find the palindromic primes within a given range. Here is the code in C++:<span id="more-1527"></span></p>
<pre class="brush: cpp;">
#include &lt;stdio.h&gt;
#include &lt;stdlib.h&gt;
#include &lt;limits.h&gt;
#include &lt;math.h&gt;
#include &lt;iostream&gt;

using namespace std;

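bool IsPrime(unsigned long long n) {
	// Handle small and even inputs, which the odd-divisor loop below skips.
	if (n &lt; 2) return false;
	if (n % 2 == 0) return n == 2;

	// Trial division by odd numbers. The bound i &lt;= n / i avoids the
	// rounding error of sqrt((double) n) for large n.
	for (unsigned long long i = 3; i &lt;= n / i; i += 2) {
		if (n % i == 0) {
			return false;
		}
	}

	return true;
}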

bool IsPalindrome(unsigned long long n) {
	bool r = true;
	char s[30];
	int l = sprintf(s, &quot;%llu&quot;, n);

	if (l == 1 &amp;&amp; n != 1) {
		r = true;
	} else	{
		for (int i = 0; i &lt; l/2; i++) {
			if (s[i] != s[l-i-1]) {
				r = false;
				break;
			}
		}
	}

	return r;
}

/*
 * usage: palprime [lbound] [ubound]
 */
int main(int argc, char** argv) {
	unsigned long long beginNum = 3;
	unsigned long long endNum = 3;

	if (argc == 2) { // lbound default to 3
#ifdef _WIN32
		endNum = _strtoui64(argv[1], NULL, 10);
#else
		endNum = strtoull(argv[1], NULL, 10);
#endif

	} else if (argc == 3) {
#ifdef _WIN32
		beginNum = _strtoui64(argv[1], NULL, 10);
		endNum = _strtoui64(argv[2], NULL, 10);
#else
		beginNum = strtoull(argv[1], NULL, 10);
		endNum = strtoull(argv[2], NULL, 10);
#endif
	}

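	// 2 is the only even prime; every other candidate must be odd.
	if (beginNum &lt;= 2 &amp;&amp; 2 &lt; endNum) cout &lt;&lt; 2 &lt;&lt; endl;
	unsigned long long i = beginNum | 1;

	while (i &lt; endNum) {
		char s[30];
		int l = sprintf(s, &quot;%llu&quot;, i);

		// p = 10^(l-1), computed with integers because pow() returns a
		// double, which cannot represent every 64-bit value exactly.
		unsigned long long p = 1;
		for (int k = 1; k &lt; l; k++) p *= 10;

		// Even-length palindromes are divisible by 11, so 11 is the only
		// even-length palprime. Skip ahead to the next odd length.
		if (l % 2 == 0) {
			if (i == 11) cout &lt;&lt; i &lt;&lt; endl;
			if (l &gt;= 20) break; // 10^20 does not fit in 64 bits
			i = p * 10 + 1;
			continue;
		}

		// A multi-digit palindrome ends with its leading digit, so that digit
		// cannot be even or 5. Jump to the next usable leading digit.
		int d = s[0] - '0';
		if (l &gt; 1 &amp;&amp; (d % 2 == 0 || d == 5)) {
			d = (d == 2) ? 3 : (d == 8) ? 9 : 7;
			i = d * p + 1;
			continue;
		}

		if (IsPalindrome(i) &amp;&amp; IsPrime(i)) {
			cout &lt;&lt; i &lt;&lt; endl;
		}

		i += 2;
	}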
	return (EXIT_SUCCESS);
}
</pre>
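<p>The program skips even-length numbers because every even-length palindrome is divisible by 11: its alternating digit sum is zero. The small sketch below checks this by brute force for a given half length. The helper names <code>makeEvenPalindrome</code> and <code>allEvenPalindromesDivisibleBy11</code> are hypothetical and are not part of the program above.</p>

```cpp
// Builds the even-length palindrome whose first half is `half`, e.g. 12 -> 1221.
unsigned long long makeEvenPalindrome(unsigned long long half) {
    unsigned long long result = half;
    // Append the digits of `half` in reverse order.
    for (unsigned long long rest = half; rest > 0; rest /= 10) {
        result = result * 10 + rest % 10;
    }
    return result;
}

// Returns true if every palindrome with 2 * halfDigits digits is divisible by 11.
bool allEvenPalindromesDivisibleBy11(int halfDigits) {
    unsigned long long lo = 1;
    for (int k = 1; k < halfDigits; ++k) lo *= 10; // smallest half with that many digits
    for (unsigned long long half = lo; half < lo * 10; ++half) {
        if (makeEvenPalindrome(half) % 11 != 0) return false;
    }
    return true;
}
```

<p>Since 11 is itself a two-digit palindrome, it is the only even-length palindromic prime.</p>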
<p>At first, I was trying to find all the palprimes that can be represented by 64-bit integers. But I soon realized that it would take months to do so using the code above on a quad-core PC (using 4 processes with different ranges). Anyway, here are the last few palindromic primes less than 10,000,000,000,000:</p>
<blockquote><p>
9999899989999<br />
9999901099999<br />
9999907099999<br />
9999913199999<br />
9999919199999<br />
9999938399999<br />
9999961699999<br />
9999970799999<br />
9999980899999<br />
9999987899999
</p></blockquote>
<p>And here are a few interesting ones:</p>
<blockquote><p>
11357975311<br />
1112345432111<br />
1300000000031<br />
1700000000071<br />
1900000000091<br />
7900000000097<br />
9200000000029<br />
1357900097531
</p></blockquote>
]]></content:encoded>
			<wfw:commentRss>http://www.kerrywong.com/2009/11/15/a-simple-program-for-finding-palindromic-prime-numbers/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Image Blur Detection via Hough Transform &#8212; IV</title>
		<link>http://www.kerrywong.com/2009/07/03/image-blur-detection-via-hough-transform-iv/</link>
		<comments>http://www.kerrywong.com/2009/07/03/image-blur-detection-via-hough-transform-iv/#comments</comments>
		<pubDate>Sat, 04 Jul 2009 01:06:30 +0000</pubDate>
		<dc:creator>kwong</dc:creator>
				<category><![CDATA[Miscellaneous]]></category>
		<category><![CDATA[Algorithm]]></category>
		<category><![CDATA[Blur Detection]]></category>
		<category><![CDATA[C++]]></category>
		<category><![CDATA[Edge Detection]]></category>
		<category><![CDATA[Hough Transform]]></category>
		<category><![CDATA[Intel IPP]]></category>

		<guid isPermaLink="false">http://www.kerrywong.com/?p=1277</guid>
		<description><![CDATA[In my previous three articles (1,2,3) I discussed how to use Canny edge detection and the Hough transform to identify blurred images. Here I will show some results from the algorithm discussed before.
Results
When presented with images that are clear, the algorithm correctly identified most of them (see images below):
















The following images illustrate how the original image [...]]]></description>
			<content:encoded><![CDATA[<p>In my previous three articles (<a href="/2009/06/19/image-blur-detection-via-hough-transform-i/">1</a>,<a href="/2009/06/24/image-blur-detection-via-hough-transform-ii/">2</a>,<a href="/2009/06/27/image-blur-detection-via-hough-transform-iii/">3</a>) I discussed how to use Canny edge detection and the Hough transform to identify blurred images. Here I will show some results from the algorithm discussed before.<span id="more-1277"></span></p>
<h3>Results</h3>
<p>When presented with images that are clear, the algorithm correctly identified most of them (see images below):</p>
<table>
<tr>
<td>
<div id="attachment_1285" class="wp-caption aligncenter" style="width: 330px"><a href="http://www.kerrywong.com/blog/wp-content/uploads/2009/06/c1.jpg"><img src="http://www.kerrywong.com/blog/wp-content/uploads/2009/06/c1.jpg" alt="Building (Microsoft Research Digital Image)" title="Building (Microsoft Research Digital Image)" width="320" height="240" class="size-full wp-image-1285" /></a><p class="wp-caption-text">Building (Microsoft Research Digital Image)</p></div>
</td>
<td>
<div id="attachment_1286" class="wp-caption aligncenter" style="width: 330px"><a href="http://www.kerrywong.com/blog/wp-content/uploads/2009/06/c2.jpg"><img src="http://www.kerrywong.com/blog/wp-content/uploads/2009/06/c2.jpg" alt="Street (Microsoft Research Digital Image)" title="Street (Microsoft Research Digital Image)" width="320" height="240" class="size-full wp-image-1286" /></a><p class="wp-caption-text">Street (Microsoft Research Digital Image)</p></div>
</td>
</tr>
</table>
<table>
<tr>
<td>
<div id="attachment_1289" class="wp-caption aligncenter" style="width: 330px"><a href="http://www.kerrywong.com/blog/wp-content/uploads/2009/06/c5.jpg"><img src="http://www.kerrywong.com/blog/wp-content/uploads/2009/06/c5.jpg" alt="Sky" title="Sky" width="320" height="240" class="size-full wp-image-1289" /></a><p class="wp-caption-text">Sky</p></div>
</td>
<td>
<div id="attachment_1288" class="wp-caption aligncenter" style="width: 330px"><a href="http://www.kerrywong.com/blog/wp-content/uploads/2009/06/c4.jpg"><img src="http://www.kerrywong.com/blog/wp-content/uploads/2009/06/c4.jpg" alt="Flower" title="Flower" width="320" height="240" class="size-full wp-image-1288" /></a><p class="wp-caption-text">Flower</p></div>
</td>
</tr>
</table>
<p>The following images illustrate how the original image (top right) is divided into sub regions. Canny detection is performed on each of the sub images.</p>
<table>
<tr>
<td>
<a href="http://www.kerrywong.com/blog/wp-content/uploads/2009/07/1.jpg"><img src="http://www.kerrywong.com/blog/wp-content/uploads/2009/07/1.jpg" alt="1" title="1" width="214" height="160" class="aligncenter size-full wp-image-1300" /></a>
</td>
<td>
<a href="http://www.kerrywong.com/blog/wp-content/uploads/2009/07/4.jpg"><img src="http://www.kerrywong.com/blog/wp-content/uploads/2009/07/4.jpg" alt="4" title="4" width="214" height="160" class="aligncenter size-full wp-image-1303" /></a>
</td>
<td>
<a href="http://www.kerrywong.com/blog/wp-content/uploads/2009/07/7.jpg"><img src="http://www.kerrywong.com/blog/wp-content/uploads/2009/07/7.jpg" alt="7" title="7" width="214" height="160" class="aligncenter size-full wp-image-1306" /></a>
</td>
</tr>
<tr>
<td>
<a href="http://www.kerrywong.com/blog/wp-content/uploads/2009/07/2.jpg"><img src="http://www.kerrywong.com/blog/wp-content/uploads/2009/07/2.jpg" alt="2" title="2" width="214" height="160" class="aligncenter size-full wp-image-1301" /></a>
</td>
<td>
<a href="http://www.kerrywong.com/blog/wp-content/uploads/2009/07/5.jpg"><img src="http://www.kerrywong.com/blog/wp-content/uploads/2009/07/5.jpg" alt="5" title="5" width="214" height="160" class="aligncenter size-full wp-image-1304" /></a>
</td>
<td>
<a href="http://www.kerrywong.com/blog/wp-content/uploads/2009/07/8.jpg"><img src="http://www.kerrywong.com/blog/wp-content/uploads/2009/07/8.jpg" alt="8" title="8" width="214" height="160" class="aligncenter size-full wp-image-1307" /></a>
</td>
</tr>
<tr>
<td>
<a href="http://www.kerrywong.com/blog/wp-content/uploads/2009/07/3.jpg"><img src="http://www.kerrywong.com/blog/wp-content/uploads/2009/07/3.jpg" alt="3" title="3" width="214" height="160" class="aligncenter size-full wp-image-1302" /></a>
</td>
<td>
<a href="http://www.kerrywong.com/blog/wp-content/uploads/2009/07/6.jpg"><img src="http://www.kerrywong.com/blog/wp-content/uploads/2009/07/6.jpg" alt="6" title="6" width="214" height="160" class="aligncenter size-full wp-image-1305" /></a>
</td>
<td>
<a href="http://www.kerrywong.com/blog/wp-content/uploads/2009/07/9.jpg"><img src="http://www.kerrywong.com/blog/wp-content/uploads/2009/07/9.jpg" alt="9" title="9" width="214" height="160" class="aligncenter size-full wp-image-1331" /></a>
</td>
</tr>
</table>
<p>When performing the Hough transform, I chose to detect up to ten lines in each image, with the following step parameters. The detection results are very sensitive to these parameters, and the values below were chosen based on experimental results:<br />
\[\rho=1, \theta=0.01\]</p>
<p>Out of all the detected lines, a few sections are selected based on line continuity and the calculated average gradients around the detected lines. The following images show the Hough line segments chosen by the algorithm. </p>
<table>
<tr>
<td>
<a href="http://www.kerrywong.com/blog/wp-content/uploads/2009/07/1_o.jpg"><img src="http://www.kerrywong.com/blog/wp-content/uploads/2009/07/1_o.jpg" alt="1_o" title="1_o" width="214" height="160" class="aligncenter size-full wp-image-1309" /></a>
</td>
<td>
<a href="http://www.kerrywong.com/blog/wp-content/uploads/2009/07/4_o.jpg"><img src="http://www.kerrywong.com/blog/wp-content/uploads/2009/07/4_o.jpg" alt="4_o" title="4_o" width="214" height="160" class="aligncenter size-full wp-image-1312" /></a>
</td>
<td>
<a href="http://www.kerrywong.com/blog/wp-content/uploads/2009/07/7_o.jpg"><img src="http://www.kerrywong.com/blog/wp-content/uploads/2009/07/7_o.jpg" alt="7_o" title="7_o" width="214" height="160" class="aligncenter size-full wp-image-1315" /></a>
</td>
</tr>
<tr>
<td>
<a href="http://www.kerrywong.com/blog/wp-content/uploads/2009/07/2_o.jpg"><img src="http://www.kerrywong.com/blog/wp-content/uploads/2009/07/2_o.jpg" alt="2_o" title="2_o" width="214" height="160" class="aligncenter size-full wp-image-1310" /></a>
</td>
<td>
<a href="http://www.kerrywong.com/blog/wp-content/uploads/2009/07/5_o.jpg"><img src="http://www.kerrywong.com/blog/wp-content/uploads/2009/07/5_o.jpg" alt="5_o" title="5_o" width="214" height="160" class="aligncenter size-full wp-image-1313" /></a>
</td>
<td>
<a href="http://www.kerrywong.com/blog/wp-content/uploads/2009/07/8_o.jpg"><img src="http://www.kerrywong.com/blog/wp-content/uploads/2009/07/8_o.jpg" alt="8_o" title="8_o" width="214" height="160" class="aligncenter size-full wp-image-1316" /></a>
</td>
</tr>
<tr>
<td>
<a href="http://www.kerrywong.com/blog/wp-content/uploads/2009/07/3_o.jpg"><img src="http://www.kerrywong.com/blog/wp-content/uploads/2009/07/3_o.jpg" alt="3_o" title="3_o" width="214" height="160" class="aligncenter size-full wp-image-1311" /></a>
</td>
<td>
<a href="http://www.kerrywong.com/blog/wp-content/uploads/2009/07/6_o.jpg"><img src="http://www.kerrywong.com/blog/wp-content/uploads/2009/07/6_o.jpg" alt="6_o" title="6_o" width="214" height="160" class="aligncenter size-full wp-image-1314" /></a>
</td>
<td>
<a href="http://www.kerrywong.com/blog/wp-content/uploads/2009/07/9_o.jpg"><img src="http://www.kerrywong.com/blog/wp-content/uploads/2009/07/9_o.jpg" alt="9_o" title="9_o" width="214" height="160" class="aligncenter size-full wp-image-1317" /></a>
</td>
</tr>
</table>
<p>The table below shows the gradient index calculated within each area, that is, the average of the gradients along all the chosen line segments within a sub-image. The result is scaled by 1000. This scale factor is chosen such that the resulting index is greater than 1 for clear images and less than 1 for blurred images.</p>
<table>
<tr>
<td>1</td>
<td>1.709</td>
</tr>
<tr>
<td>2</td>
<td>2.383</td>
</tr>
<tr>
<td>3</td>
<td>1.012</td>
</tr>
<tr>
<td>4</td>
<td>2.842</td>
</tr>
<tr>
<td>5</td>
<td>3.389</td>
</tr>
<tr>
<td>6</td>
<td>2.419</td>
</tr>
<tr>
<td>7</td>
<td>2.933</td>
</tr>
<tr>
<td>8</td>
<td>2.168</td>
</tr>
<tr>
<td>9</td>
<td>2.534</td>
</tr>
</table>
<p>And the sub images are indexed as follows:</p>
<table>
<tr>
<td>1</td>
<td>4</td>
<td>7</td>
</tr>
<tr>
<td>2</td>
<td>5</td>
<td>8</td>
</tr>
<tr>
<td>3</td>
<td>6</td>
<td>9</td>
</tr>
</table>
<p>The algorithm can also detect images with deliberate blur regions (e.g. <a href="http://en.wikipedia.org/wiki/Bokeh">Bokeh</a>). The results are illustrated below:</p>
<div id="attachment_1287" class="wp-caption aligncenter" style="width: 330px"><a href="http://www.kerrywong.com/blog/wp-content/uploads/2009/06/c3.jpg"><img src="http://www.kerrywong.com/blog/wp-content/uploads/2009/06/c3.jpg" alt="Plant (Microsoft Research Digital Image)" title="Plant (Microsoft Research Digital Image)" width="320" height="240" class="size-full wp-image-1287" /></a><p class="wp-caption-text">Plant (Microsoft Research Digital Image)</p></div>
<table>
<tr>
<td>1</td>
<td><font color="red">0.000</font></td>
</tr>
<tr>
<td>2</td>
<td>1.750</td>
</tr>
<tr>
<td>3</td>
<td>1.973</td>
</tr>
<tr>
<td>4</td>
<td>1.595</td>
</tr>
<tr>
<td>5</td>
<td>2.815</td>
</tr>
<tr>
<td>6</td>
<td>3.188</td>
</tr>
<tr>
<td>7</td>
<td><font color="red">0.000</font></td>
</tr>
<tr>
<td>8</td>
<td>1.204</td>
</tr>
<tr>
<td>9</td>
<td>1.308</td>
</tr>
</table>
<p>Note that the 0&#8217;s in the detection results indicate that within those regions, no lines could be reliably detected and thus those regions are considered blurred.<br />
Generally speaking, when an image contains both blurred and clear regions, some of the indexes will be zero and others will be greater than one.</p>
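<p>The interpretation of the index values can be sketched as follows. This is only an illustration of the thresholds described above, not the code that produced the results. The names <code>RegionState</code>, <code>classifyRegion</code> and <code>countRegions</code> are hypothetical, and treating an index of exactly 1 as blurred is an assumption, since the text only covers values above and below 1.</p>

```cpp
#include <cstddef>
#include <vector>

enum RegionState { NoLinesDetected, Blurred, Clear };

// Interprets one gradient index (already scaled by 1000).
RegionState classifyRegion(double gradientIndex) {
    if (gradientIndex == 0.0) return NoLinesDetected; // no reliable lines: treated as blurred
    // ASSUMPTION: an index of exactly 1 counts as blurred.
    return gradientIndex > 1.0 ? Clear : Blurred;
}

// Counts how many sub-images fall into the given state.
int countRegions(const std::vector<double>& indexes, RegionState state) {
    int count = 0;
    for (std::size_t i = 0; i < indexes.size(); ++i) {
        if (classifyRegion(indexes[i]) == state) ++count;
    }
    return count;
}
```

<p>Applied to the nine values in the table above, this reports two regions without detected lines (sub-images 1 and 7) and seven clear regions.</p>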
<p>The following images are detected as blurred, with the detected indexes far less than one.</p>
<table>
<tr>
<td>
<div id="attachment_1294" class="wp-caption aligncenter" style="width: 330px"><a href="http://www.kerrywong.com/blog/wp-content/uploads/2009/06/b3.jpg"><img src="http://www.kerrywong.com/blog/wp-content/uploads/2009/06/b3.jpg" alt="Boat (Microsoft Research Digital Image)" title="Boat (Microsoft Research Digital Image)" width="320" height="240" class="size-full wp-image-1294" /></a><p class="wp-caption-text">Boat (Microsoft Research Digital Image)</p></div>
</td>
<td>
<div id="attachment_1295" class="wp-caption aligncenter" style="width: 330px"><a href="http://www.kerrywong.com/blog/wp-content/uploads/2009/06/b4.jpg"><img src="http://www.kerrywong.com/blog/wp-content/uploads/2009/06/b4.jpg" alt="Candle" title="Candle" width="320" height="427" class="size-full wp-image-1295" /></a><p class="wp-caption-text">Candle</p></div>
</td>
</tr>
</table>
<h3>Limitations</h3>
<p>When image contrast is low, or when object borders are not clearly defined, the algorithm may have difficulty determining whether an image is blurred. Take the following two cloud images, for instance. The image on the left was correctly classified as a clear image due to the relatively high contrast around the center, but the image on the right was classified as blurred due to its lack of contrast. </p>
<table>
<tr>
<td>
<div id="attachment_1291" class="wp-caption aligncenter" style="width: 330px"><a href="http://www.kerrywong.com/blog/wp-content/uploads/2009/06/b1.jpg"><img src="http://www.kerrywong.com/blog/wp-content/uploads/2009/06/b1.jpg" alt="Cloud" title="Cloud" width="320" height="240" class="size-full wp-image-1291" /></a><p class="wp-caption-text">Cloud</p></div>
</td>
<td>
<div id="attachment_1293" class="wp-caption aligncenter" style="width: 330px"><a href="http://www.kerrywong.com/blog/wp-content/uploads/2009/06/b2.jpg"><img src="http://www.kerrywong.com/blog/wp-content/uploads/2009/06/b2.jpg" alt="Cloud" title="Cloud" width="320" height="240" class="size-full wp-image-1293" /></a><p class="wp-caption-text">Cloud</p></div>
</td>
</tr>
</table>
]]></content:encoded>
			<wfw:commentRss>http://www.kerrywong.com/2009/07/03/image-blur-detection-via-hough-transform-iv/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Image Blur Detection via Hough Transform &#8212; III</title>
		<link>http://www.kerrywong.com/2009/06/27/image-blur-detection-via-hough-transform-iii/</link>
		<comments>http://www.kerrywong.com/2009/06/27/image-blur-detection-via-hough-transform-iii/#comments</comments>
		<pubDate>Sun, 28 Jun 2009 01:31:01 +0000</pubDate>
		<dc:creator>kwong</dc:creator>
				<category><![CDATA[Miscellaneous]]></category>
		<category><![CDATA[Algorithm]]></category>
		<category><![CDATA[Blur Detection]]></category>
		<category><![CDATA[C++]]></category>
		<category><![CDATA[Edge Detection]]></category>
		<category><![CDATA[Hough Transform]]></category>
		<category><![CDATA[Intel IPP]]></category>

		<guid isPermaLink="false">http://www.kerrywong.com/?p=1231</guid>
		<description><![CDATA[I will continue where I left off in my previous post. After performing the Hough transform and extracting the longest line section for each detected Hough line, we need to calculate the gradients of the pixel luminance around the line sections.
Gradient Calculation
If you remember how the Hough parameters were determined (in polar [...]]]></description>
			<content:encoded><![CDATA[<p>I will continue where I left off in my <a href="/2009/06/24/image-blur-detection-via-hough-transform-ii/">previous post</a>. After performing the Hough transform and extracting the longest line section for each detected Hough line, we need to calculate the gradients of the pixel luminance around the line sections.<span id="more-1231"></span></p>
<h3>Gradient Calculation</h3>
<p>If you remember how the Hough parameters were determined (in polar form, see figure below), it is not difficult to obtain the pixel coordinates centered around the points on the detected line. </p>
<div id="attachment_1214" class="wp-caption aligncenter" style="width: 554px"><a href="http://www.kerrywong.com/blog/wp-content/uploads/2009/06/polar.png"><img src="http://www.kerrywong.com/blog/wp-content/uploads/2009/06/polar.png" alt="Region selection for gradient calculation" title="Region selection for gradient calculation" width="544" height="410" class="size-full wp-image-1214" /></a><p class="wp-caption-text">Region selection for gradient calculation</p></div>
<p>In fact, we can formulate a line that is perpendicular to the line section detected during the Hough transform, and use the part of it that lies within a predefined region (e.g. between the dotted lines) to calculate the gradients of luminance. The following code snippet shows how the perpendicular line&#8217;s parameters are obtained.</p>
<pre class="brush: cpp;">
    /**
     * Get the equation parameters for the line that passes through (x,y) and is perpendicular
     * to the line specified by parameters (p0, theta0) in normal form.
     * The point (x,y) is expected to lie on the line (p0, theta0).
     *
     * @param p0 : distance to line from origin.
     * @param theta0 : the angle of the normal of length p0.
     * @param x : x coordinate of the point where the perpendicular line passes through
     * @param y : y coordinate of the point where the perpendicular line passes through
     * @param &amp;p : the perpendicular line's distance from origin.
     * @param &amp;theta : the angle of the normal of length p.
     **/
    void LineUtils::GetPerpendicularLineParameters(float p0, float theta0, float x, float y, float &amp;p, float &amp;theta) {
        float x0 = p0 * cos(theta0);
        float y0 = p0 * sin(theta0);

        p = sqrt((x0 - x)*(x0 - x) + (y0 - y)*(y0 - y));

        float a1 = theta0 - PI / 2.0;
        float a2 = theta0 + PI / 2.0;

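        //distance from (x, y) to the candidate line (p, a): d=|x * cos a + y * sin a - p|
        //fabs is used so that the float result is never truncated to an int
        float d1 = fabs(x * cos(a1) + y * sin(a1) - p);
        float d2 = fabs(x * cos(a2) + y * sin(a2) - p);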

        if (d1 &lt; d2)
            theta = a1;
        else
            theta = a2;
    }
</pre>
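<p>Once the perpendicular direction is known, the positions at which the luminance is read can be sketched as below. This is a minimal illustration, not the code used in the project. It assumes that <code>theta0</code> is the angle of the normal of the detected line, so that the unit vector (cos theta0, sin theta0) points across the line, and that (x, y) lies on the line. The names <code>Point</code> and <code>samplePerpendicular</code> are hypothetical.</p>

```cpp
#include <cmath>
#include <vector>

struct Point { float x; float y; };

// Returns 2 * halfWidth + 1 positions centred on (x, y), spaced one pixel
// apart along the direction perpendicular to the line (p0, theta0).
std::vector<Point> samplePerpendicular(float x, float y, float theta0, int halfWidth) {
    std::vector<Point> points;
    float dx = std::cos(theta0); // unit vector along the line's normal
    float dy = std::sin(theta0);
    for (int t = -halfWidth; t <= halfWidth; ++t) {
        Point pt = { x + t * dx, y + t * dy };
        points.push_back(pt);
    }
    return points;
}
```

<p>The returned coordinates are fractional in general, so the luminance would have to be read with rounding or interpolation, which is left out here.</p>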
<p>In an ideal situation, the gradient of the line points obtained via the method above looks like the figures below, where the edges are clearly identified. </p>
<div id="attachment_1254" class="wp-caption aligncenter" style="width: 570px"><a href="http://www.kerrywong.com/blog/wp-content/uploads/2009/06/edge1.png"><img src="http://www.kerrywong.com/blog/wp-content/uploads/2009/06/edge1.png" alt="Edge (dark to bright)" title="Edge (dark to bright)" width="560" height="420" class="size-full wp-image-1254" /></a><p class="wp-caption-text">Edge (dark to bright)</p></div>
<div id="attachment_1255" class="wp-caption aligncenter" style="width: 570px"><a href="http://www.kerrywong.com/blog/wp-content/uploads/2009/06/edge2.png"><img src="http://www.kerrywong.com/blog/wp-content/uploads/2009/06/edge2.png" alt="Edge (dark to bright to dark)" title="Edge (dark to bright to dark)" width="560" height="420" class="size-full wp-image-1255" /></a><p class="wp-caption-text">Edge (dark to bright to dark)</p></div>
<p>Sometimes, when the lighting condition is poor, the image appears &#8220;grainy&#8221;, which can lead to poor line detection. For instance, the following figure shows the gradient when the region around the detected edge is grainy:</p>
<div id="attachment_1258" class="wp-caption aligncenter" style="width: 570px"><a href="http://www.kerrywong.com/blog/wp-content/uploads/2009/06/noneedge.png"><img src="http://www.kerrywong.com/blog/wp-content/uploads/2009/06/noneedge.png" alt="Non-Edge" title="Non-Edge" width="560" height="420" class="size-full wp-image-1258" /></a><p class="wp-caption-text">Non-Edge</p></div>
<p>We use the number of times the luminance crosses its mean (rising above or falling below it) along the perpendicular line interval as a measure of whether to accept the detection result as a gradient. Typically, when the number of crossings is less than 3 (see the first two images above), the curve is either monotonic or has a single peak, and we assume that the gradient can be calculated correctly. If there are more than 3 crossings, we discard the result. The gradient is calculated as the slope of the curve. In the examples above, we used ten pixels on each side of the Hough line to calculate gradients.</p>
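<p>The crossing test can be sketched as follows. This is an illustration under stated assumptions rather than the original implementation: the names <code>countMeanCrossings</code> and <code>isUsableEdgeProfile</code> are hypothetical, a sample equal to the mean is counted as being below it, and a profile with exactly 3 crossings is rejected, which the text above leaves open.</p>

```cpp
#include <cstddef>
#include <numeric>
#include <vector>

// Counts how often the luminance profile moves from one side of its mean to the other.
int countMeanCrossings(const std::vector<double>& profile) {
    if (profile.size() < 2) return 0;
    double mean = std::accumulate(profile.begin(), profile.end(), 0.0) / profile.size();
    int crossings = 0;
    bool above = profile[0] > mean; // ASSUMPTION: a value equal to the mean counts as below
    for (std::size_t i = 1; i < profile.size(); ++i) {
        bool nowAbove = profile[i] > mean;
        if (nowAbove != above) {
            ++crossings;
            above = nowAbove;
        }
    }
    return crossings;
}

// Accepts monotonic and single-peak profiles (fewer than 3 crossings).
// ASSUMPTION: exactly 3 crossings is treated as a rejection.
bool isUsableEdgeProfile(const std::vector<double>& profile) {
    return countMeanCrossings(profile) < 3;
}
```

<p>A dark-to-bright edge crosses the mean once and a dark-to-bright-to-dark edge crosses it twice, so both are accepted, while a grainy profile that keeps jumping around its mean is rejected.</p>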
<h3>Image Regions</h3>
<p>For images with complex contents, it becomes difficult for the Hough transform to reliably identify line structures within the image. Furthermore, certain photography techniques (e.g. <a href="http://en.wikipedia.org/wiki/Bokeh">Bokeh</a>) leave portions of images deliberately blurred. Without dividing the image into sub-regions, the classification results would be compromised.</p>
<p>Thus, images are divided into 9 (3&#215;3) sub-images after Canny edge detection, and the Hough transform is performed on each sub-image. The figure below illustrates how an image is divided:<br />
<div id="attachment_1267" class="wp-caption aligncenter" style="width: 654px"><a href="http://www.kerrywong.com/blog/wp-content/uploads/2009/06/3x3.jpg"><img src="http://www.kerrywong.com/blog/wp-content/uploads/2009/06/3x3.jpg" alt="Image divided into 3x3 sub-images (Microsoft Research Digital Image)" title="Image divided into 3x3 sub-images (Microsoft Research Digital Image)" width="644" height="482" class="size-full wp-image-1267" /></a><p class="wp-caption-text">Image divided into 3x3 sub-images (Microsoft Research Digital Image)</p></div></p>
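<p>The division itself can be sketched as below. This is a minimal example and not the original code: the names <code>Region</code> and <code>splitIntoGrid</code> are hypothetical, and both the column-by-column ordering and the way leftover pixels are distributed are assumptions.</p>

```cpp
#include <vector>

struct Region { int x; int y; int width; int height; };

// Splits an image of the given size into rows x cols rectangles that cover
// every pixel exactly once. Regions are returned column by column (ASSUMED order).
std::vector<Region> splitIntoGrid(int imageWidth, int imageHeight, int rows, int cols) {
    std::vector<Region> regions;
    for (int c = 0; c < cols; ++c) {
        for (int r = 0; r < rows; ++r) {
            // Integer boundaries; any remainder is spread over the later cells.
            int x0 = c * imageWidth / cols;
            int x1 = (c + 1) * imageWidth / cols;
            int y0 = r * imageHeight / rows;
            int y1 = (r + 1) * imageHeight / rows;
            Region region = { x0, y0, x1 - x0, y1 - y0 };
            regions.push_back(region);
        }
    }
    return regions;
}
```

<p>For the 644&#215;482 image shown above, <code>splitIntoGrid(644, 482, 3, 3)</code> returns nine regions, the first of which is 214 pixels wide and 160 pixels high.</p>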
<p>This technique is especially useful when portions of images are deliberately blurred, like the image shown above. It also improves line detection accuracy when the image contains complex scenes. By dividing up the image, each sub-area&#8217;s complexity is greatly reduced. Other scene separation methods might achieve even better results, but they are outside the scope of our discussion here.</p>
<p>In my next post, I will show some results obtained from using the method mentioned in this and the previous articles and will also discuss its limitations.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.kerrywong.com/2009/06/27/image-blur-detection-via-hough-transform-iii/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Image Blur Detection via Hough Transform &#8212; II</title>
		<link>http://www.kerrywong.com/2009/06/24/image-blur-detection-via-hough-transform-ii/</link>
		<comments>http://www.kerrywong.com/2009/06/24/image-blur-detection-via-hough-transform-ii/#comments</comments>
		<pubDate>Thu, 25 Jun 2009 01:44:44 +0000</pubDate>
		<dc:creator>kwong</dc:creator>
				<category><![CDATA[Miscellaneous]]></category>
		<category><![CDATA[Algorithm]]></category>
		<category><![CDATA[Blur Detection]]></category>
		<category><![CDATA[C++]]></category>
		<category><![CDATA[Edge Detection]]></category>
		<category><![CDATA[Hough Transform]]></category>
		<category><![CDATA[Intel IPP]]></category>

		<guid isPermaLink="false">http://www.kerrywong.com/?p=1194</guid>
		<description><![CDATA[In my previous post, I briefly discussed the rationale behind automated blur detection in digital imagery and gave an overview of an algorithm that could be used to detect blurred images. Here I will show some implementation details along with some C++ code snippets.
Experience tells us that blurred images tend to contain fewer details than [...]]]></description>
			<content:encoded><![CDATA[<p>In <a href="/2009/06/19/image-blur-detection-via-hough-transform-i/">my previous post</a>, I briefly discussed the rationale behind automated blur detection in digital imagery and gave an overview of an algorithm that could be used to detect blurred images. Here I will show some implementation details along with some C++ code snippets.<span id="more-1194"></span></p>
<p>Experience tells us that blurred images tend to contain fewer details than their sharper counterparts, and the areas where <a href="http://en.wikipedia.org/wiki/Intensity">intensity</a> transitions occur (e.g. the border of an object) are better defined in clear images. Mathematically speaking, the <a href="http://en.wikipedia.org/wiki/Slope">slope</a> of the intensity transition is statistically steeper in clear images than in blurred ones. Since whether an image is blurred does not depend on the color space, it is sufficient to perform detection in the <a href="http://en.wikipedia.org/wiki/Luminance">luminance</a> space (i.e. on gray scale images):</p>
<p>\[I=0.299R+0.587G+0.114B\]</p>
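<p>The conversion is a weighted sum of the three color channels. A minimal sketch is shown below; the function name <em>Luminance</em> is hypothetical, and 8-bit channel values are assumed.</p>
<pre class="brush: cpp;">
// Converts 8-bit R, G and B components to a luminance value using the
// weights 0.299, 0.587 and 0.114, which add up to 1.
inline float Luminance(unsigned char r, unsigned char g, unsigned char b) {
    return 0.299f * r + 0.587f * g + 0.114f * b;
}
</pre>
<p>Because the weights add up to one, the result stays within the same 0 to 255 range as the input channels.</p>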
<h3>Canny Edge Detection</h3>
<p>Thus the very first step in deciding whether an image, or an area within an image, is blurred is to use an edge detection algorithm to obtain the edges in the image.</p>
<p><a href="http://en.wikipedia.org/wiki/Canny_edge_detector">Canny edge detection</a> is a good candidate since it is designed to be optimal with respect to detection and localization. It also uses hysteresis thresholding to reduce streaking, and thus achieves relatively continuous edge boundaries compared to other edge detection methods.</p>
<p>In <a href="/2009/05/07/canny-edge-detection-auto-thresholding/">one of my previous posts</a>, I discussed Canny edge detection with auto thresholding using Intel&#8217;s <a href="http://software.intel.com/en-us/intel-ipp/">Integrated Performance Primitives</a> (IPP). I used the same algorithm here for the image pre-processing step.</p>
<h3>Hough Transform</h3>
<p>In order to analyze the gradients along detected edges, it is necessary to first parameterize them. The <a href="http://en.wikipedia.org/wiki/Hough_transform">Hough transform</a> comes in handy for this task.</p>
<p>While the Hough transform is capable of identifying arbitrary shapes, simple line detection is more robust for the purpose of detecting image blur. Besides, IPP has an implementation of line detection using the Hough transform out of the box.</p>
<p>Since our goal is to determine the quality of the image within a region of interest, we do not need to analyze all the edges identified within that region. Rather, we can select a few based on some pre-determined criteria that maximize our ability to correctly determine the luminance gradients.</p>
<p>The following code snippet shows my implementation of the edge parameterization method using IPP:</p>
<pre class="brush: cpp;">
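    // Orders points by x first and then by y; used to sort the points collected for each line.
    bool CompareIppiPoint(const IppiPoint &amp;p1, const IppiPoint &amp;p2) {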
        if (p1.x == p2.x) {
            return p1.y &lt; p2.y;
        } else {
            return p1.x &lt; p2.x;
        }
    }

    void IPPGrayImage::HoughLine(IppPointPolar delta, int threshold, int maxLineCount, int* pLineCount, IppPointPolar pLines[], list&lt;IppiPoint&gt; pList[]) {
        IppStatus sts;
        IppiSize roiSize = {_width, _height};

        int bufSize;
        sts = ippiHoughLineGetSize_8u_C1R(roiSize, delta, maxLineCount, &amp;bufSize);
        assert(sts == ippStsNoErr);
        Ipp8u * pBuf = ippsMalloc_8u(bufSize);

        Ipp8u *imgBuf;
        int stepSize;
        imgBuf = ippiMalloc_8u_C1(_width, _height, &amp;stepSize);

        // Make sure the allocation succeeded before the buffer is written to.
        assert(imgBuf != NULL);

        sts = ippiConvert_32f8u_C1R(_imgBuffer, _width * PIXEL_SIZE, imgBuf, _width, roiSize, ippRndNear);
        assert(sts == ippStsNoErr);

        sts = ippiHoughLine_8u32f_C1R(imgBuf, _width, roiSize, delta, threshold, pLines, maxLineCount, pLineCount, pBuf);
        assert(sts == ippStsNoErr);

        //d=|x0 * cos a + y0 * sin a - p|
        int val = 0;
        float d;
        for (int x = 0; x &lt; _width; x++) {
            for (int y = 0; y &lt; _height; y++) {
                if (imgBuf[x + y * _width] &gt; 0) {
                    val = imgBuf[x + y * _width];

                    for (int i = 0; i &lt; *pLineCount; i++) {
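                        // fabs (from &lt;cmath&gt;) avoids the integer overload of abs(), which would truncate the distance.
                        d = fabs((float) cos(pLines[i].theta) * (float) x + (float) sin(pLines[i].theta) * (float) y - fabs(pLines[i].rho));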
                        if (d &lt; threshold) {
                            IppiPoint p;
                            p.x = x;
                            p.y = y;

                            pList[i].push_back(p);
                        }
                    }
                }
            }
        }

        for (int i = 0; i &lt; *pLineCount; i++) {
            pList[i].sort(CompareIppiPoint);
        }

        ippsFree(pBuf);
        ippiFree(imgBuf);
    }
</pre>
<p>Note that the choice of <em>delta</em> is crucial for the line detection quality. In the code above, up to <em>maxLineCount</em> lines are detected and each line&#8217;s parameters are stored in <em>pLines[]</em>. The edge pixels whose distance to a detected line is less than <em>threshold</em> are collected into that line&#8217;s list in <em>pList[]</em>. Because the Hough detection does not indicate the start and end points of a line, we need to iterate through the pixels in <em>pList[]</em> and find the sections that the detected lines pass through. The image below illustrates how the edge regions are chosen based on the detected line parameters. Image features within the region between the dotted lines are recorded based on their distance to the line, defined by:<br />
\[d=\vert x_0\cos\alpha+y_0\sin\alpha-p\vert\]</p>
<div id="attachment_1214" class="wp-caption aligncenter" style="width: 554px"><a href="http://www.kerrywong.com/blog/wp-content/uploads/2009/06/polar.png"><img src="http://www.kerrywong.com/blog/wp-content/uploads/2009/06/polar.png" alt="Region selection for gradient calculation" title="Region selection for gradient calculation" width="544" height="410" class="size-full wp-image-1214" /></a><p class="wp-caption-text">Region selection for gradient calculation</p></div>
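<p>The distance test can also be isolated into a small helper, which makes the region selection easier to reason about. The sketch below is only an illustration: the function names are hypothetical, the angle is assumed to be in radians, and the line parameter is used with its sign (the listing above takes its absolute value instead).</p>
<pre class="brush: cpp;">
#include &lt;cmath&gt;

// Perpendicular distance from pixel (x, y) to the line
// x * cos(theta) + y * sin(theta) = rho, with theta in radians.
inline float DistanceToLine(int x, int y, float rho, float theta) {
    return std::fabs(x * std::cos(theta) + y * std::sin(theta) - rho);
}

// True when the pixel lies inside the band of half-width maxDist
// centred on the line (the region between the dotted lines).
inline bool IsNearLine(int x, int y, float rho, float theta, float maxDist) {
    return DistanceToLine(x, y, rho, theta) &lt; maxDist;
}
</pre>
<p>For a vertical line at x = 5 (theta = 0, rho = 5), the pixel (3, 0) is at distance 2 and would be kept for any <em>maxDist</em> greater than 2.</p>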
<p>In practice, for each parameterized line I chose to select the longest section where the line approximates a section of the edge. Other line sections can be used as well, but the longest portion tends to offer better classification results. The code is shown below, and the thickened section in the figure above illustrates the selected section.</p>
<pre class="brush: cpp;">
    void LineUtils::GetLongestConnectedLinePoints(IppiPoint points[], int len, int threshold, IppiPoint lpoints[], int &amp;maxLen) {
        // points[] must be sorted; two neighbouring points are connected when they
        // differ by at most threshold in both x and y. Run lengths are counted in
        // steps, so a run of n points has length n - 1. On input maxLen caps the run
        // length (lpoints[] needs room for maxLen + 1 points); on output it holds the
        // length of the longest run found.
        if (len &lt;= 0) {
            maxLen = 0;
            return;
        }

        int runStart = 0, bestStart = 0, bestRunLength = 0;

        for (int i = 1; i &lt; len &amp;&amp; bestRunLength &lt; maxLen; i++) {
            if ((abs(points[i].x - points[i - 1].x) &gt; threshold) || (abs(points[i].y - points[i - 1].y) &gt; threshold)) {
                // Gap: start a new run at this point.
                runStart = i;
            } else if (i - runStart &gt; bestRunLength) {
                // The current run, including one that reaches the end of points[],
                // is now the longest seen so far.
                bestRunLength = i - runStart;
                bestStart = runStart;
            }
        }

        // Copy the longest run (bestRunLength + 1 points) into lpoints[].
        for (int i = 0; i &lt;= bestRunLength; i++) {
            lpoints[i] = points[bestStart + i];
        }

        maxLen = bestRunLength;
    }
</pre>
<p>In my next post, I will continue with the gradient calculation and some methods to enhance the detection accuracy.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.kerrywong.com/2009/06/24/image-blur-detection-via-hough-transform-ii/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Image Blur Detection via Hough Transform &#8212; I</title>
		<link>http://www.kerrywong.com/2009/06/19/image-blur-detection-via-hough-transform-i/</link>
		<comments>http://www.kerrywong.com/2009/06/19/image-blur-detection-via-hough-transform-i/#comments</comments>
		<pubDate>Sat, 20 Jun 2009 02:41:13 +0000</pubDate>
		<dc:creator>kwong</dc:creator>
				<category><![CDATA[Miscellaneous]]></category>
		<category><![CDATA[Algorithm]]></category>
		<category><![CDATA[Blur Detection]]></category>
		<category><![CDATA[C++]]></category>
		<category><![CDATA[Edge Detection]]></category>
		<category><![CDATA[Hough Transform]]></category>
		<category><![CDATA[Intel IPP]]></category>

		<guid isPermaLink="false">http://www.kerrywong.com/?p=1172</guid>
		<description><![CDATA[It is often necessary to identify and classify images based on their clarity. For instance, it is desirable for an automated process to locate blurred images within a large digitized image library and then automatically sharpen the blurred images via inverse filtering or blind deconvolution. In the following series of articles, I will discuss a [...]]]></description>
			<content:encoded><![CDATA[<p>It is often necessary to identify and classify images based on their clarity. For instance, it is desirable for an automated process to locate blurred images within a large digitized image library and then automatically sharpen the blurred images via inverse filtering or blind deconvolution. In the following series of articles, I will discuss a practical method for detecting blurred images using the <a href="http://en.wikipedia.org/wiki/Hough_transform">Hough transform</a>.<span id="more-1172"></span></p>
<h3>Background</h3>
<p>Image blur is typically caused by</p>
<ul>
<li>Motion (e.g. the relative movement between the camera and the object during exposure)</li>
<li>Out-of-focus</li>
<li>Low-light conditions</li>
</ul>
<p>Other types of image blur may also occur in photography (such as <a href="http://en.wikipedia.org/wiki/Bokeh">bokeh</a>), and it is desirable to be able to automatically distinguish intended image blur (e.g. bokeh produced by a lens with a shallow depth of field) from unintended blur (e.g. out of focus).</p>
<p>When reference images (e.g. a series of images with different focus settings) are present, detecting image blur is a relatively simple task. For instance, in passive auto-focus cameras, a series of images of the scene is captured with progressive focus settings, and the intensity differences are calculated within the same region across these images. The image with the highest intensity difference corresponds to the correct focus setting. In active auto-focus cameras, the blur detection problem is circumvented by measuring the distance between the lens and the object.</p>
<p>The problem becomes more difficult if we are only given a single image. Since many of the parameters used above in camera auto-focusing are not present in a single-image setting, we cannot reliably infer image sharpness by the method used in passive auto-focusing, as we do not have sufficient knowledge to reconstruct the series of images necessary for comparison.</p>
<p>We can, however, tell whether an image is in focus by calculating the intensity differences along the edges in the image. If the calculated intensity difference is higher than a predefined threshold, we deduce that the image is sharp. If it is lower than the threshold, we conclude that the image is blurred. So the problem now shifts to identifying the optimal regions in which the intensity differences are calculated.</p>
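<p>To make the thresholding idea concrete, here is a deliberately simplified sketch. It is not the gradient measure developed later in this series: it assumes that a one-dimensional intensity profile has already been sampled across an edge, and the names <em>MaxIntensityStep</em> and <em>LooksSharp</em> as well as the threshold value are hypothetical.</p>
<pre class="brush: cpp;">
#include &lt;cstdlib&gt;
#include &lt;vector&gt;

// Largest absolute difference between neighbouring samples of an
// intensity profile taken across an edge.
int MaxIntensityStep(const std::vector&lt;int&gt; &amp;profile) {
    int maxStep = 0;
    for (size_t i = 1; i &lt; profile.size(); i++) {
        int step = std::abs(profile[i] - profile[i - 1]);
        if (step &gt; maxStep) maxStep = step;
    }
    return maxStep;
}

// Hypothetical decision rule: the edge is considered sharp when its
// steepest intensity step exceeds the given threshold.
bool LooksSharp(const std::vector&lt;int&gt; &amp;profile, int threshold) {
    return MaxIntensityStep(profile) &gt; threshold;
}
</pre>
<p>A sharp edge such as 10, 10, 200, 200 has a steepest step of 190, while a blurred version of the same transition, 10, 60, 110, 160, 200, only reaches 50, so a threshold of 100 separates the two.</p>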
<h3>Algorithm Overview</h3>
<p>In this algorithm, we first use <a href="http://en.wikipedia.org/wiki/Canny_edge_detector">Canny edge detection</a> to obtain the edges in the image. The edges are then parameterized using Hough transform. We then calculate the pixel gradients along the parameterized lines detected and finally we use the gradients to decide whether the image is blurred.</p>
<p>I will discuss some technical and implementation details in the upcoming posts. Stay tuned.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.kerrywong.com/2009/06/19/image-blur-detection-via-hough-transform-i/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>C++ Recursive Directory Search Under Linux</title>
		<link>http://www.kerrywong.com/2009/06/12/c-recursive-directory-search-under-linux/</link>
		<comments>http://www.kerrywong.com/2009/06/12/c-recursive-directory-search-under-linux/#comments</comments>
		<pubDate>Fri, 12 Jun 2009 21:55:29 +0000</pubDate>
		<dc:creator>kwong</dc:creator>
				<category><![CDATA[Coding]]></category>
		<category><![CDATA[Linux/BSD]]></category>
		<category><![CDATA[boost]]></category>
		<category><![CDATA[C++]]></category>
		<category><![CDATA[Linux]]></category>

		<guid isPermaLink="false">http://www.kerrywong.com/?p=1154</guid>
		<description><![CDATA[I was trying to search for some code examples on how to do a recursive directory search under Linux using C++ the other day. But to my surprise, I could not find any place that offers a complete example. So I decided to post my code here after I created my own and hopefully you [...]]]></description>
			<content:encoded><![CDATA[<p>I was trying to search for some code examples on how to do a recursive directory search under Linux using C++ the other day. But to my surprise, I could not find any place that offers a complete example. So I decided to post my code here after I created my own and hopefully you will find it helpful.<span id="more-1154"></span></p>
<p>For those who are impatient, the function to perform recursive directory search is here:</p>
<pre class="brush: cpp;">
#include &lt;sys/types.h&gt;
#include &lt;sys/stat.h&gt;
#include &lt;dirent.h&gt;
#include &lt;errno.h&gt;
#include &lt;vector&gt;
#include &lt;string&gt;
#include &lt;iostream&gt;
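#include &lt;boost/regex.hpp&gt;

using namespace std;

// Defined as a member of the FileUtils helper class (declaration not shown) so that
// the name matches the recursive call and the usage example below.
void FileUtils::GetFiles(vector&lt;string&gt; &amp;files, string dir, string filter, bool ignoreCase) {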
    DIR *d;
    if ((d = opendir(dir.c_str())) == NULL) return;
    if (dir.at(dir.length() - 1) != '/') dir += &quot;/&quot;;

    struct dirent *dent;
    struct stat st;
    boost::regex exp;

    if (ignoreCase)
        exp.set_expression(filter, boost::regex_constants::icase);
    else
        exp.set_expression(filter);

    while ((dent = readdir(d)) != NULL) {
        string path = dir;

        if (string(dent-&gt;d_name) != &quot;.&quot; &amp;&amp; string(dent-&gt;d_name) != &quot;..&quot;) {
            path += string(dent-&gt;d_name);
            const char *p = path.c_str();
            if (lstat(p, &amp;st) != 0) continue; // skip entries that cannot be examined

            if (S_ISDIR(st.st_mode)) {
                GetFiles(files, (path + string(&quot;/&quot;)).c_str(), filter, ignoreCase);
            } else {
                if (filter == &quot;.*&quot;) {
                    files.push_back(path);
                } else {
                    if (boost::regex_match(string(dent-&gt;d_name), exp)) files.push_back(path);
                }
            }
        }
    }

    closedir(d);
}
</pre>
<p>I used the <a href="http://www.boost.org">Boost library</a> to perform regular expression matching on file names. If you just want to obtain a listing of all the files, you can do without Boost.</p>
<p>The following code snippet demonstrates how to use the function. The results are stored in the vector passed in. Note that the &#8220;filter&#8221; parameter must be a standard regular expression to work properly (so if you are looking for any file, the expression should be .* instead of just *).</p>
<p>The code should be pretty self-explanatory. If <em>ignoreCase</em> is set to true, then the match will be case-insensitive.</p>
<pre class="brush: cpp;">
vector&lt;string&gt; files;

FileUtils::GetFiles(files,&quot;/tmp&quot;, &quot;.*&quot;, true);
for (size_t i = 0; i &lt; files.size(); i++) {
    cout &lt;&lt; files[i] &lt;&lt; endl;
}
</pre>
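<p>On compilers that support C++17, a similar search can be written without the POSIX directory calls or Boost. The sketch below is an alternative, not the code used above: it relies on <em>std::filesystem</em> and <em>std::regex</em>, the function name <em>ListFiles</em> is hypothetical, and like the original it matches the regular expression against the file name only and does not descend into symbolic links to directories.</p>
<pre class="brush: cpp;">
#include &lt;filesystem&gt;
#include &lt;regex&gt;
#include &lt;string&gt;
#include &lt;vector&gt;

namespace fs = std::filesystem;

// Recursively collects the paths of all non-directory entries under dir
// whose file name matches the regular expression in filter.
std::vector&lt;std::string&gt; ListFiles(const std::string &amp;dir, const std::string &amp;filter, bool ignoreCase) {
    std::vector&lt;std::string&gt; files;

    std::regex::flag_type flags = std::regex::ECMAScript;
    if (ignoreCase) flags |= std::regex::icase;
    const std::regex exp(filter, flags);

    std::error_code ec;
    fs::recursive_directory_iterator it(dir, fs::directory_options::skip_permission_denied, ec);
    if (ec) return files; // directory missing or unreadable

    const fs::recursive_directory_iterator end;
    while (it != end) {
        const fs::path current = it-&gt;path();
        std::error_code statusEc;
        // symlink_status() does not follow links, so a link to a directory
        // is reported as a file rather than as a directory.
        const bool isDir = fs::is_directory(it-&gt;symlink_status(statusEc));
        if (!statusEc &amp;&amp; !isDir &amp;&amp; std::regex_match(current.filename().string(), exp)) {
            files.push_back(current.string());
        }
        it.increment(ec);
        if (ec) break; // stop on an iteration error
    }
    return files;
}
</pre>
<p>A call such as <em>ListFiles(&quot;/tmp&quot;, &quot;.*\\.txt&quot;, true)</em> would return every file below /tmp whose name ends in .txt, regardless of case. With older GCC releases the program may also need to be linked against the separate filesystem library.</p>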
]]></content:encoded>
			<wfw:commentRss>http://www.kerrywong.com/2009/06/12/c-recursive-directory-search-under-linux/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Timing Methods in C++ Under Linux</title>
		<link>http://www.kerrywong.com/2009/05/28/timing-methods-in-c-under-linux/</link>
		<comments>http://www.kerrywong.com/2009/05/28/timing-methods-in-c-under-linux/#comments</comments>
		<pubDate>Fri, 29 May 2009 01:23:38 +0000</pubDate>
		<dc:creator>kwong</dc:creator>
				<category><![CDATA[Miscellaneous]]></category>
		<category><![CDATA[C++]]></category>
		<category><![CDATA[Linux]]></category>
		<category><![CDATA[RDTSC]]></category>
		<category><![CDATA[Timing]]></category>

		<guid isPermaLink="false">http://www.kerrywong.com/?p=1092</guid>
		<description><![CDATA[Measuring the execution time for code sections can be done in multiple ways in C++. Except for the time resolution issue, different timing methods worked relatively the same in single processor environment. As multi-core processors become more prevalent however, we need to be careful at choosing the correct timing mechanism as not all such routines [...]]]></description>
			<content:encoded><![CDATA[<p>Measuring the execution time of code sections can be done in multiple ways in C++. Apart from differences in time resolution, the different timing methods work much the same in a single-processor environment. As multi-core processors become more prevalent, however, we need to be careful in choosing the correct timing mechanism, as not all such routines measure the elapsed wall time.<span id="more-1092"></span></p>
<p>Here I will examine a few commonly used methods for measuring time intervals under Linux. All of the following timing routines are timed against the same OpenMP parallel region (on a quad-core CPU, the region is executed by four concurrent threads, each running the full loop).</p>
<h3>time()</h3>
<p>The <strong>time()</strong> function returns the time with a resolution of one second, so it is generally only useful for measuring long-running processes.</p>
<h3>clock()</h3>
<p>In single-core systems, <strong>clock()</strong> is often used for time measurements. It reports the processor time consumed by the program in units of 1/<strong>CLOCKS_PER_SEC</strong>; on POSIX systems <strong>CLOCKS_PER_SEC</strong> is one million, although the actual granularity is usually coarser than a microsecond. Because it measures processor time rather than elapsed time, it is not particularly useful for measuring time on a multi-core system when threads execute concurrently: the result of <strong>clock()</strong> is the processor time accumulated across all active CPUs. On a quad-core system, if all cores are at full utilization, the reported time is roughly four times the wall time.</p>
<h3>gettimeofday()</h3>
<p>Similar to the <strong>clock()</strong> function, <strong>gettimeofday()</strong> has a resolution of up to one microsecond. As the function name suggests, <strong>gettimeofday()</strong> measures wall time and is thus suitable for time measurement on multi-core, multi-CPU systems.</p>
<h3>rdtsc</h3>
<p>The <a href="http://en.wikipedia.org/wiki/Time_Stamp_Counter">time stamp counter</a> is available on most modern x86 CPUs (since the Pentium). There are many implementations based on <strong>rdtsc</strong> (e.g. on Windows systems, the Win32 API call <strong>QueryPerformanceCounter</strong>). Implementations based on <strong>rdtsc</strong> generally offer a very fine resolution (up to one nanosecond) but, depending on the implementation, their accuracy might be susceptible to CPU clock throttling (common in mobile CPUs). In my implementation below, the rdtsc results are divided by the CPU frequency, which can be obtained with:</p>
<blockquote><p>
grep &#8220;cpu MHz&#8221; /proc/cpuinfo  | cut -d&#8217;:&#8217; -f2
</p></blockquote>
<p>This implementation assumes that the CPU frequency remains constant during the measurement; the accuracy suffers if the frequency changes. For desktop CPUs, though, this is less of a concern.</p>
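<p>The conversion from cycle counts to seconds can be kept in a small helper instead of hard-coding the frequency. The sketch below assumes a constant clock and takes the frequency in MHz, as reported by the command above; the function name <em>CyclesToSeconds</em> is hypothetical.</p>
<pre class="brush: cpp;">
// Converts a time stamp counter difference to seconds for a CPU running
// at cpuMHz (the value reported as "cpu MHz" in /proc/cpuinfo).
inline double CyclesToSeconds(unsigned long long cycles, double cpuMHz) {
    return static_cast&lt;double&gt;(cycles) / (cpuMHz * 1.0e6);
}
</pre>
<p>For instance, 3,199,987,000 cycles on a 3199.987 MHz CPU correspond to one second.</p>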
<h3>clock_gettime</h3>
<p>Like <strong>rdtsc</strong>, this function has nanosecond resolution. It is defined by POSIX; on older versions of glibc the program must be linked with -lrt.</p>
<p>Intel <a href="http://www.threadingbuildingblocks.org/">Threading Building Blocks</a> also provides a timer function <strong>tick_count::now()</strong>, and the elapsed time can easily be measured using the code snippet below:</p>
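<p>When two <em>timespec</em> values are subtracted, the nanosecond difference can be negative, and doing the arithmetic in double precision avoids losing digits. The helpers below are a sketch with hypothetical names; they also use CLOCK_MONOTONIC, which is my substitution here because it is not affected by adjustments to the system time.</p>
<pre class="brush: cpp;">
#include &lt;time.h&gt;

// Returns end - start in seconds. A negative nanosecond difference is
// handled correctly because the two parts are simply added.
inline double ElapsedSeconds(const timespec &amp;start, const timespec &amp;end) {
    return static_cast&lt;double&gt;(end.tv_sec - start.tv_sec)
         + static_cast&lt;double&gt;(end.tv_nsec - start.tv_nsec) / 1.0e9;
}

// Reads the monotonic clock, which is not affected by wall-clock adjustments.
inline timespec MonotonicNow() {
    timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &amp;ts);
    return ts;
}
</pre>
<p>A measurement then consists of calling <em>MonotonicNow()</em> before and after the code of interest and passing both values to <em>ElapsedSeconds()</em>.</p>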
<pre class="brush: cpp;">
    // Requires #include &quot;tbb/tick_count.h&quot; and using namespace tbb;
    tick_count t_start = tick_count::now();
    //statements
    tick_count t_end = tick_count::now();
    cout &lt;&lt; (t_end - t_start).seconds() * 1000 &lt;&lt; &quot; ms&quot; &lt;&lt; endl;
</pre>
<p>The following listing measures time with each of the methods mentioned above.</p>
<pre class="brush: cpp;">
#include &lt;iostream&gt;
#include &lt;stdio.h&gt;
#include &lt;stdlib.h&gt;
#include &lt;sys/time.h&gt;
#include &lt;time.h&gt;
#include &lt;ctime&gt;

using namespace std;

void Foo() {
#pragma omp parallel
    {
        for (long i = 0; i &lt; 50000; i++)
            for (long j = 0; j &lt; 50000; j++);
    }

}

unsigned long long rdtsc() {
    unsigned a, d;

    __asm__ volatile(&quot;rdtsc&quot; : &quot;=a&quot; (a), &quot;=d&quot; (d));

    return ((unsigned long long) a) | (((unsigned long long) d) &lt;&lt; 32);
}

void Time() {
    time_t t1, t2;

    time(&amp;t1);
    Foo();
    time(&amp;t2);

    cout &lt;&lt; &quot;time() : &quot; &lt;&lt; t2 - t1 &lt;&lt; &quot; s&quot; &lt;&lt; endl;
}

void Clock() {
    clock_t c1 = clock();
    Foo();
    clock_t c2 = clock();
    cout &lt;&lt; &quot;clock() : &quot; &lt;&lt; (float) (c2 - c1) / (float) CLOCKS_PER_SEC &lt;&lt; &quot; s&quot; &lt;&lt; endl;
}

void GetTimeOfDay() {
    timeval t1, t2, t;
    gettimeofday(&amp;t1, NULL);
    Foo();
    gettimeofday(&amp;t2, NULL);
    timersub(&amp;t2, &amp;t1, &amp;t);

    cout &lt;&lt; &quot;gettimeofday() : &quot; &lt;&lt; t.tv_sec + t.tv_usec / 1000000.0 &lt;&lt; &quot; s&quot; &lt;&lt; endl;
}

void RDTSC() {
    unsigned long long t1, t2;

    t1 = rdtsc();
    Foo();
    t2 = rdtsc();

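    // 3199987.0 is the CPU frequency in kHz (3199.987 MHz on my machine), i.e. cycles per
    // millisecond; dividing by 1000 converts milliseconds to seconds.
    cout &lt;&lt; &quot;rdtsc() : &quot; &lt;&lt; 1.0 * (t2 - t1) / 3199987.0 / 1000.0 &lt;&lt; &quot; s&quot; &lt;&lt; endl;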
}

void ClockGettime() {
    timespec res, t1, t2;
    clock_getres(CLOCK_REALTIME, &amp;res);

    clock_gettime(CLOCK_REALTIME, &amp;t1);
    Foo();
    clock_gettime(CLOCK_REALTIME, &amp;t2);

    cout &lt;&lt; &quot;clock_gettime() : &quot;
         &lt;&lt; (t2.tv_sec - t1.tv_sec)  + (double) (t2.tv_nsec - t1.tv_nsec) / 1000000000.0
         &lt;&lt; &quot; s&quot; &lt;&lt; endl;
}

int main() {
    cout.setf(ios::fixed);
    cout.setf(ios::showpoint);
    cout.precision(5);

    Time();
    Clock();
    GetTimeOfDay();

    RDTSC();
    ClockGettime();

    return (EXIT_SUCCESS);
}
</pre>
<p>And here is the output from my quad-core computer in debug mode.</p>
<blockquote><p>
time() :  6 s<br />
clock() : 21.85000 s<br />
gettimeofday() : 5.67368 s<br />
rdtsc() : 5.65957 s<br />
clock_gettime() : 5.65894 s
</p></blockquote>
]]></content:encoded>
			<wfw:commentRss>http://www.kerrywong.com/2009/05/28/timing-methods-in-c-under-linux/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Magick++ Missing Delegate Error</title>
		<link>http://www.kerrywong.com/2009/05/20/magick-missing-delegate-error/</link>
		<comments>http://www.kerrywong.com/2009/05/20/magick-missing-delegate-error/#comments</comments>
		<pubDate>Thu, 21 May 2009 00:24:02 +0000</pubDate>
		<dc:creator>kwong</dc:creator>
				<category><![CDATA[Coding]]></category>
		<category><![CDATA[Miscellaneous]]></category>
		<category><![CDATA[C++]]></category>
		<category><![CDATA[Magick++]]></category>

		<guid isPermaLink="false">http://www.kerrywong.com/?p=1084</guid>
		<description><![CDATA[As I wrote last time, I did a clean Ubuntu 9.04 install on my main PC. Everything worked pretty well. But after re-installing all the packages I needed for C++ development, I realized that I still was missing some libraries as I got &#8220;Magick::ErrorMissingDelegate&#8221; exception (ImageMagick: no decode delegate for this image format) when I [...]]]></description>
			<content:encoded><![CDATA[<p>As I wrote <a href="/2009/05/13/ubuntu-904-on-my-main-pc/">last time</a>, I did a clean Ubuntu 9.04 install on my main PC. <span id="more-1084"></span>Everything worked pretty well. But after re-installing all the packages I needed for C++ development, I realized that I was still missing some libraries, as I got a &#8220;Magick::ErrorMissingDelegate&#8221; exception (ImageMagick: no decode delegate for this image format) when I tried to open JPEG or PNG images from code utilizing the Magick++ image library.</p>
<p>After a quick <a href="http://www.imagemagick.org/discourse-server/viewtopic.php?f=1&#038;t=12366">search on ImageMagick&#8217;s site</a> I realized that I was missing the following libraries:</p>
<blockquote><p>
libpng12<br />
libjpeg62
</p></blockquote>
<p>Note that you might want to use the following commands to find out the exact package names:</p>
<blockquote><p>
sudo apt-cache search libpng<br />
sudo apt-cache search libjpeg
</p></blockquote>
<p>And then use the following commands to install the packages:</p>
<blockquote><p>
sudo apt-get install libpng12-0 libpng12-dev<br />
sudo apt-get install libjpeg62 libjpeg62-dev
</p></blockquote>
<p>Note that after that, you will need to re-configure and re-build the Magick++ library (e.g. ./configure, make, make install) so that it is linked correctly with the above libraries.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.kerrywong.com/2009/05/20/magick-missing-delegate-error/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Canny Edge Detection Auto Thresholding</title>
		<link>http://www.kerrywong.com/2009/05/07/canny-edge-detection-auto-thresholding/</link>
		<comments>http://www.kerrywong.com/2009/05/07/canny-edge-detection-auto-thresholding/#comments</comments>
		<pubDate>Fri, 08 May 2009 02:15:41 +0000</pubDate>
		<dc:creator>kwong</dc:creator>
				<category><![CDATA[Coding]]></category>
		<category><![CDATA[Linux/BSD]]></category>
		<category><![CDATA[Algorithm]]></category>
		<category><![CDATA[C++]]></category>
		<category><![CDATA[Image Processing]]></category>

		<guid isPermaLink="false">http://www.kerrywong.com/?p=999</guid>
		<description><![CDATA[In the example I gave in &#8220;Interfacing IPP with Magick++&#8221;, I illustrated how to use Intel’s Integrated Performance Primitives (IPP) to perform edge detection. One issue with the Canny edge detection algorithm is that we need to specify a high threshold and a low threshold. The choice of those threshold values greatly affects the quality of the [...]]]></description>
			<content:encoded><![CDATA[<p>In the example I gave in &#8220;<a href="/2009/03/17/interfacing-ipp-with-magick/">Interfacing IPP with Magick++</a>&#8221;, I illustrated how to use <a href="http://software.intel.com/en-us/intel-ipp/">Intel’s Integrated Performance Primitives (IPP)</a> to perform edge detection. One issue with the <a href="http://en.wikipedia.org/wiki/Canny_edge_detector">Canny edge detection</a> algorithm is that we need to specify a high threshold and a low threshold. The choice of those threshold values greatly affects the quality of the detected edges. In my previous example, the threshold values were chosen manually. In this blog post, I will examine a couple of simple methods that can be used to automatically determine the threshold values.<span id="more-999"></span></p>
<p>The simplest way is to use the mean of the gray scale image pixel values. As a rule of thumb, we set the low threshold to 0.66*[mean value] and the high threshold to 1.33*[mean value]. Another way is to use the median pixel value of the gray scale image, setting the thresholds to 0.66*[median value] and 1.33*[median value] accordingly. For typical images, these two methods achieve comparable results.</p>
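<p>Both rules are easy to express in code. The following sketch works on a plain vector of 8-bit gray scale pixel values rather than on IPP buffers; the function names are hypothetical, and for an even number of pixels the median is taken as the upper of the two middle values.</p>
<pre class="brush: cpp;">
#include &lt;algorithm&gt;
#include &lt;numeric&gt;
#include &lt;utility&gt;
#include &lt;vector&gt;

// Arithmetic mean of the pixel values (0 for an empty image).
double MeanValue(const std::vector&lt;unsigned char&gt; &amp;pixels) {
    if (pixels.empty()) return 0.0;
    return std::accumulate(pixels.begin(), pixels.end(), 0.0) / pixels.size();
}

// Median via nth_element; the vector is taken by value because
// nth_element reorders its input.
double MedianValue(std::vector&lt;unsigned char&gt; pixels) {
    if (pixels.empty()) return 0.0;
    std::vector&lt;unsigned char&gt;::iterator mid = pixels.begin() + pixels.size() / 2;
    std::nth_element(pixels.begin(), mid, pixels.end());
    return *mid;
}

// Returns (low, high) = (0.66 * center, 1.33 * center), where center is
// either the mean or the median of the image.
std::pair&lt;double, double&gt; CannyThresholds(double center) {
    return std::make_pair(0.66 * center, 1.33 * center);
}
</pre>
<p>For the pixel values 10, 20, 30, 40 and 200 the mean is 60 while the median is 30, which already hints at how differently the two rules behave when the histogram is skewed.</p>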
<p>For example, the following shows the picture of a building along with its histogram (original image from <a href="ftp://ftp.research.microsoft.com/pub/download/orid">Microsoft Research Digital Image</a>. Please see <a href="http://research.microsoft.com/en-us/um/people/antcrim/data_objrec/msr%20cambridge%20eula%20for%20digital%20images_download.rtf">Microsoft Research Digital Image License Agreement</a> for more information):</p>
<table>
<tr>
<td>
<div id="attachment_1011" class="wp-caption aligncenter" style="width: 330px"><a href="http://www.kerrywong.com/blog/wp-content/uploads/2009/05/building_gray.jpg"><img src="http://www.kerrywong.com/blog/wp-content/uploads/2009/05/building_gray.jpg" alt="Building" title="Building" width="320" height="240" class="size-full wp-image-1011" /></a><p class="wp-caption-text">Building</p></div>
</td>
<td>
<div id="attachment_1012" class="wp-caption aligncenter" style="width: 330px"><a href="http://www.kerrywong.com/blog/wp-content/uploads/2009/05/building_hist.png"><img src="http://www.kerrywong.com/blog/wp-content/uploads/2009/05/building_hist.png" alt="Building Histogram" title="Building Histogram" width="320" height="240" class="size-full wp-image-1012" /></a><p class="wp-caption-text">Building Histogram</p></div>
</td>
</tr>
</table>
<p>The following shows the edge detection results using the Canny algorithm (the left image uses mean value auto-thresholding, the right image uses median value auto-thresholding). The results exhibit very little visible difference.</p>
<table>
<tr>
<td>
<div id="attachment_1006" class="wp-caption aligncenter" style="width: 330px"><a href="http://www.kerrywong.com/blog/wp-content/uploads/2009/05/building_canny_mean.jpg"><img src="http://www.kerrywong.com/blog/wp-content/uploads/2009/05/building_canny_mean.jpg" alt="Building (Canny Mean)" title="Building (Canny Mean)" width="320" height="240" class="size-full wp-image-1006" /></a><p class="wp-caption-text">Building (Canny Mean)</p></div>
</td>
<td>
<div id="attachment_1007" class="wp-caption aligncenter" style="width: 330px"><a href="http://www.kerrywong.com/blog/wp-content/uploads/2009/05/building_canny_median.jpg"><img src="http://www.kerrywong.com/blog/wp-content/uploads/2009/05/building_canny_median.jpg" alt="Building (Canny Median)" title="Building (Canny Median)" width="320" height="240" class="size-full wp-image-1007" /></a><p class="wp-caption-text">Building (Canny Median)</p></div>
</td>
</tr>
</table>
<p>However, for images with a non-equalized histogram (see the picture of a cloud and its histogram below):</p>
<table>
<tr>
<td>
<div id="attachment_1018" class="wp-caption aligncenter" style="width: 250px"><a href="http://www.kerrywong.com/blog/wp-content/uploads/2009/05/cloud_gray.jpg"><img src="http://www.kerrywong.com/blog/wp-content/uploads/2009/05/cloud_gray.jpg" alt="Cloud" title="Cloud" width="240" height="320" class="size-full wp-image-1018" /></a><p class="wp-caption-text">Cloud</p></div>
</td>
<td>
<div id="attachment_1019" class="wp-caption aligncenter" style="width: 330px"><a href="http://www.kerrywong.com/blog/wp-content/uploads/2009/05/cloud_hist.png"><img src="http://www.kerrywong.com/blog/wp-content/uploads/2009/05/cloud_hist.png" alt="Cloud Histogram" title="Cloud Histogram" width="320" height="240" class="size-full wp-image-1019" /></a><p class="wp-caption-text">Cloud Histogram</p></div>
</td>
</tr>
</table>
<p>The Canny edge detection result based on mean value auto-thresholding is pretty poor (see the image on the left below), while edge detection based on median value auto-thresholding achieves a much better result (see the image on the right below).</p>
<table>
<tr>
<td>
<div id="attachment_1021" class="wp-caption aligncenter" style="width: 250px"><a href="http://www.kerrywong.com/blog/wp-content/uploads/2009/05/cloud_canny_mean.jpg"><img src="http://www.kerrywong.com/blog/wp-content/uploads/2009/05/cloud_canny_mean.jpg" alt="Cloud (Canny Mean)" title="Cloud (Canny Mean)" width="240" height="320" class="size-full wp-image-1021" /></a><p class="wp-caption-text">Cloud (Canny Mean)</p></div>
</td>
<td>
<div id="attachment_1024" class="wp-caption aligncenter" style="width: 250px"><a href="http://www.kerrywong.com/blog/wp-content/uploads/2009/05/cloud_eq_canny_median.jpg"><img src="http://www.kerrywong.com/blog/wp-content/uploads/2009/05/cloud_eq_canny_median.jpg" alt="Cloud (Canny Median)" title="Cloud (Canny Median)" width="240" height="320" class="size-full wp-image-1024" /></a><p class="wp-caption-text">Cloud (Canny Median)</p></div>
</td>
</tr>
</table>
<p>Alternatively, we could have performed histogram equalization on the image before applying Canny edge detection with mean auto-thresholding:</p>
<table>
<tr>
<td>
<div id="attachment_1026" class="wp-caption aligncenter" style="width: 250px"><a href="http://www.kerrywong.com/blog/wp-content/uploads/2009/05/cloud_eq.png"><img src="http://www.kerrywong.com/blog/wp-content/uploads/2009/05/cloud_eq.png" alt="Cloud Equalized" title="Cloud Equalized" width="240" height="320" class="size-full wp-image-1026" /></a><p class="wp-caption-text">Cloud Equalized</p></div>
</td>
<td>
<div id="attachment_1028" class="wp-caption aligncenter" style="width: 330px"><a href="http://www.kerrywong.com/blog/wp-content/uploads/2009/05/cloud_hist_eq.png"><img src="http://www.kerrywong.com/blog/wp-content/uploads/2009/05/cloud_hist_eq.png" alt="Cloud Equalized Histogram" title="Cloud Equalized Histogram" width="320" height="240" class="size-full wp-image-1028" /></a><p class="wp-caption-text">Cloud Equalized Histogram</p></div>
</td>
</tr>
</table>
<p>And after image equalization, both mean and median value auto-thresholding achieved similar results.</p>
<table>
<tr>
<td>
<div id="attachment_1030" class="wp-caption aligncenter" style="width: 250px"><a href="http://www.kerrywong.com/blog/wp-content/uploads/2009/05/cloud_eq_canny_mean.jpg"><img src="http://www.kerrywong.com/blog/wp-content/uploads/2009/05/cloud_eq_canny_mean.jpg" alt="Cloud Equalized (Canny Mean)" title="Cloud Equalized (Canny Mean)" width="240" height="320" class="size-full wp-image-1030" /></a><p class="wp-caption-text">Cloud Equalized (Canny Mean)</p></div>
</td>
<td>
<div id="attachment_1031" class="wp-caption aligncenter" style="width: 250px"><a href="http://www.kerrywong.com/blog/wp-content/uploads/2009/05/cloud_eq_canny_median.jpg"><img src="http://www.kerrywong.com/blog/wp-content/uploads/2009/05/cloud_eq_canny_median.jpg" alt="Cloud Equalized (Canny Median)" title="Cloud Equalized (Canny Median)" width="240" height="320" class="size-full wp-image-1030" /></a><p class="wp-caption-text">Cloud Equalized (Canny Median)</p></div>
</td>
</tr>
</table>
<p>So Canny edge detection using median value auto-thresholding seems to adapt to different types of images very well. Note that selecting the median value can be thought of as equalizing the histogram, except that the pixel values are not changed by the operation.</p>
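<p>To make the difference between the two statistics concrete, here is a minimal, library-independent sketch (it is not the IPP-based code used for the images above). It derives the mean and the median gray level from a histogram in which bin <em>i</em> counts the pixels with gray level <em>i</em>. The helper names <em>HistogramMean</em> and <em>HistogramMedian</em> are hypothetical and exist only for this illustration.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: mean gray level of a histogram in which bin i
// counts the pixels with gray level i.
double HistogramMean(const std::vector<unsigned int>& bins)
{
    double weighted = 0.0;
    double total = 0.0;
    for (std::size_t i = 0; i < bins.size(); i++)
    {
        weighted += static_cast<double>(i) * bins[i];
        total += bins[i];
    }
    return total > 0.0 ? weighted / total : 0.0;
}

// Hypothetical helper: median gray level, i.e. the first bin at which the
// cumulative pixel count reaches half of all pixels.
std::size_t HistogramMedian(const std::vector<unsigned int>& bins)
{
    assert(!bins.empty());

    unsigned long total = 0;
    for (std::size_t i = 0; i < bins.size(); i++) total += bins[i];

    unsigned long cumulative = 0;
    for (std::size_t i = 0; i < bins.size(); i++)
    {
        cumulative += bins[i];
        if (2 * cumulative >= total) return i;
    }
    return 0; // not reached for a non-empty histogram
}
```

<p>For a heavily skewed histogram with 90 pixels at level 10 and 10 pixels at level 250, the mean is 34 while the median stays at 10. The median always splits the pixel population in half, which is why it behaves like a threshold picked on an equalized image.</p>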
<p>The following is the code listing for the histogram calculation using IPP (based on the <a href="/2009/04/10/an-image-class-based-on-ipp/">image class I created earlier</a>):</p>
<pre class="brush: cpp;">
/** @brief Get the min max mean value statistics for the current image
 *   @param min, max, mean: these are output parameters that are passed
 *         back by reference.
 */
void IPPGrayImage::MinMaxMean(float&amp; min, float&amp; max, double&amp; mean)
{
    IppStatus sts;
    IppiSize origImgSize = {_width, _height};

    sts = ippiMinMax_32f_C1R(_imgBuffer, _width * PIXEL_SIZE, origImgSize, &amp;min, &amp;max);
    assert(sts == ippStsNoErr);
    sts = ippiMean_32f_C1R(_imgBuffer, _width * PIXEL_SIZE, origImgSize, &amp;mean, ippAlgHintFast);
    assert(sts == ippStsNoErr);
}

/** @brief Calculate the histogram of the image
 *   @param nLevel: the number of level boundaries (the histogram has nLevel - 1 bins)
 *        levels: optional level boundaries the user can pass in (the histogram is then
 *                calculated with these levels instead of the default uniform levels).
 *   @return an array of nLevel - 1 bin counts allocated with new[]; the caller must delete[] it.
 *   @note the histogram is by default calculated using uniform bins across the range of pixel values.
 */
unsigned int * IPPGrayImage::GetHistogram(unsigned int nLevel, float levels[])
{
    IppStatus sts;
    Ipp32f* l = new Ipp32f[nLevel];
    IppiSize origImgSize = {_width, _height};

    unsigned int* bins = new unsigned int[nLevel - 1];

    float minVal = 0, maxVal = 0, stepVal = 0;
    double meanVal = 0;
    if (levels != NULL)
    {
        for (unsigned int i = 0; i &lt; nLevel; i++)
        {
            l[i] = levels[i];
        }
    }
    else
    {
        MinMaxMean(minVal, maxVal, meanVal);
        // nLevel boundaries delimit nLevel - 1 bins, so the last boundary lands on maxVal
        stepVal = (maxVal - minVal) / (float) (nLevel - 1);

        for (unsigned int i = 0; i &lt; nLevel; i++)
        {
            l[i] = minVal + stepVal * i;
        }
    }

    sts = ippiHistogramRange_32f_C1R(_imgBuffer, _width * PIXEL_SIZE, origImgSize, (Ipp32s*) bins, (Ipp32f*) l, nLevel);
    assert(sts == ippStsNoErr);

    delete[] l; // the level array is only needed during the call
    return bins;
}
</pre>
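<p>The relationship between level boundaries and bins is easy to get wrong, so here is a small plain C++ sketch of it that does not depend on IPP. Both helpers, <em>UniformLevels</em> and <em>RangeHistogram</em>, are hypothetical names, and the half-open bin convention (lower boundary included, upper boundary excluded) is an assumption of this sketch.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: nLevel evenly spaced boundaries from minVal to maxVal.
std::vector<float> UniformLevels(float minVal, float maxVal, unsigned int nLevel)
{
    assert(nLevel >= 2);

    std::vector<float> levels(nLevel);
    const float step = (maxVal - minVal) / static_cast<float>(nLevel - 1);
    for (unsigned int i = 0; i < nLevel; i++)
    {
        levels[i] = minVal + step * i;
    }
    return levels;
}

// Hypothetical helper: bin k counts the values v with
// levels[k] <= v < levels[k + 1] (half-open bins, an assumption of this
// sketch). Values outside the boundary range are not counted.
std::vector<unsigned int> RangeHistogram(const std::vector<float>& pixels,
                                         const std::vector<float>& levels)
{
    assert(levels.size() >= 2);

    std::vector<unsigned int> bins(levels.size() - 1, 0u);
    for (std::size_t p = 0; p < pixels.size(); p++)
    {
        for (std::size_t k = 0; k < bins.size(); k++)
        {
            if (pixels[p] >= levels[k] && pixels[p] < levels[k + 1])
            {
                bins[k]++;
                break;
            }
        }
    }
    return bins;
}
```

<p>With five boundaries between 0 and 4 this produces four bins. Under the half-open convention a value exactly equal to the last boundary is not counted, so in practice the last boundary may need to sit slightly above the maximum pixel value.</p>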
]]></content:encoded>
			<wfw:commentRss>http://www.kerrywong.com/2009/05/07/canny-edge-detection-auto-thresholding/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>C++ IDEs Under Linux</title>
		<link>http://www.kerrywong.com/2009/04/18/c-ides-under-linux/</link>
		<comments>http://www.kerrywong.com/2009/04/18/c-ides-under-linux/#comments</comments>
		<pubDate>Sun, 19 Apr 2009 02:27:04 +0000</pubDate>
		<dc:creator>kwong</dc:creator>
				<category><![CDATA[Coding]]></category>
		<category><![CDATA[Linux/BSD]]></category>
		<category><![CDATA[C++]]></category>
		<category><![CDATA[Code::Blocks]]></category>
		<category><![CDATA[KDevelop]]></category>
		<category><![CDATA[NetBeans]]></category>

		<guid isPermaLink="false">http://www.kerrywong.com/?p=964</guid>
		<description><![CDATA[So far I have been mainly using KDevelop and Code::Blocks as my C++ development IDEs. Recently, I started using NetBeans IDE for C++ and I started to like it quite a bit.
In my opinion, the above three IDEs all have their strengths and weaknesses depending on what kind of project you are working on. Here [...]]]></description>
			<content:encoded><![CDATA[<p>So far I have been mainly using <a href="http://www.kdevelop.org">KDevelop</a> and <a href="http://www.codeblocks.org">Code::Blocks</a> as my C++ development IDEs. Recently, I started using <a href="http://www.netbeans.org/">NetBeans IDE</a> for C++ and I started to like it quite a bit.<span id="more-964"></span></p>
<p>In my opinion, the above three IDEs all have their strengths and weaknesses depending on what kind of project you are working on. Here are some of my observations.<br />
<strong><br />
KDevelop is probably the most comprehensive IDE of the three</strong>. This should not come as a surprise, as it has been around for more than 10 years and has been the de facto development environment for most <a href="http://www.kde.org">KDE</a> development work. It is very feature-rich, and you can use it to develop many different types of applications natively out of the box (e.g. KDE, GTK+, Qt, wxWidgets). If you are developing <a href="http://www.gnu.org">GNU</a>-style applications, you will benefit from the Automake project type it provides. Besides its vast functionality, KDevelop is also very fast, efficient and stable.</p>
<p>Nonetheless, KDevelop is not the most intuitive IDE and does not suit small projects (e.g. proof-of-concept code) well, as the initial project setup can be time-consuming.</p>
<p><strong>Code::Blocks is a very capable IDE for C++ development as well</strong>. It offers many project templates, even though it does not offer native KDE project types. This should not be a problem for most people, however. The feature I like the most is that it does not require explicit makefile configuration: the build dependencies are inferred by default. This makes it very attractive for rapid prototyping. Debugging under Code::Blocks is also a pleasant experience. It also integrates (at least in the later SVN versions) <a href="http://valgrind.org/">Valgrind</a>&#8217;s MemCheck and Cachegrind, which are very useful for detecting memory leaks and tuning algorithms for maximum performance.</p>
<p>The latest stable version of Code::Blocks is 8.02, which is a little dated, as a lot of functionality has been added in the later SVN builds. If you do not require maximum stability (I have run into <a href="/2009/04/15/how-to-revert-to-a-specific-svn-version-of-codeblocks/">some issues</a> recently), using an SVN build should not be a problem. The editor (e.g. syntax highlighting, collapsible regions) is a little bit buggy, though, and the contextual help does not always provide the level of detail I would like.</p>
<p><strong>NetBeans C++ IDE is probably the most beautiful one among the three</strong>. It offers the most detailed syntax highlighting, and can be configured to display class hierarchy and library function information. The refactoring tool works pretty well and is certainly a boon to large-scale development. Its contextual help is also top-notch.</p>
<p>All of this comes at a cost, of course. NetBeans C++ IDE is the most resource-intensive of the three. It can easily use 500 MB of memory during development and can be sluggish at times. It also comes with a very limited set of project templates (e.g. no out-of-the-box project templates for Qt or wxWidgets). Certain settings <a href="http://www.netbeans.org/kb/docs/cnd/toolchain.html">are hard to get at</a> as well. Nevertheless, if I am primarily writing back-end code, NetBeans C++ IDE could easily be my top choice.</p>
<p>Some people like <a href="http://www.anjuta.org">Anjuta</a> and compare it favorably to Code::Blocks, but I have not had a chance to use it yet.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.kerrywong.com/2009/04/18/c-ides-under-linux/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>An Image Class Based On IPP</title>
		<link>http://www.kerrywong.com/2009/04/10/an-image-class-based-on-ipp/</link>
		<comments>http://www.kerrywong.com/2009/04/10/an-image-class-based-on-ipp/#comments</comments>
		<pubDate>Sat, 11 Apr 2009 00:34:55 +0000</pubDate>
		<dc:creator>kwong</dc:creator>
				<category><![CDATA[Coding]]></category>
		<category><![CDATA[Linux/BSD]]></category>
		<category><![CDATA[C++]]></category>
		<category><![CDATA[FFT]]></category>
		<category><![CDATA[IPP]]></category>
		<category><![CDATA[Linux]]></category>

		<guid isPermaLink="false">http://www.kerrywong.com/?p=935</guid>
		<description><![CDATA[A couple of weeks ago, I wrote about how to interface Integrated Performance Primitives (IPP) with Magick++. While IPP offers excellent performance advantages, it does not come with the easiest programming model. Fortunately, it is easy enough to create a C++ wrapper on top of IPP and provide an easier to use programming interface.
In this [...]]]></description>
			<content:encoded><![CDATA[<p>A couple of weeks ago, I wrote about <a href="/2009/03/17/interfacing-ipp-with-magick/">how to interface Integrated Performance Primitives (IPP) with Magick++</a>. While IPP offers excellent performance advantages, it does not come with the easiest programming model. Fortunately, it is easy enough to create a C++ wrapper on top of IPP and provide an easier to use programming interface.<span id="more-935"></span></p>
<p>In this article, I will show a simple example of creating a wrapper class using <a href="http://www.intel.com/cd/software/products/asmo-na/eng/302910.htm">IPP</a> and <a href="http://www.imagemagick.org/Magick%2B%2B/">Magick++</a>. The example I am going to show can be used to calculate the 2-dimensional FFT spectrum of a gray-scale image. This framework can be easily extended to include other algorithms that can be applied to an image using IPP.</p>
<p>Before I show the implementation details, let me first show how easy it is to use the class. The code snippet below shows how to read in an image file, apply 2D FFT with and without a Hamming window and save the results into image files.</p>
<pre class="brush: cpp;">
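    IPPGrayImage *img = new IPPGrayImage();
    img-&gt;LoadFromFile(IMAGE_FILE);

    // FFT magnitude spectrum of the original image
    IPPGrayImage *fftMag = img-&gt;FFT(true);
    fftMag-&gt;SaveToFile(IMAGE_HOME + &quot;/test_fftmag.jpg&quot;);

    // Hamming-windowed image and its FFT magnitude spectrum
    IPPGrayImage *windowed = img-&gt;ApplyHammingWindow();
    windowed-&gt;SaveToFile(IMAGE_HOME + &quot;/test_hamming.jpg&quot;);
    IPPGrayImage *fftMagWindowed = windowed-&gt;FFT(true);
    fftMagWindowed-&gt;SaveToFile(IMAGE_HOME + &quot;/test_fftmag_hamming.jpg&quot;);

    // Note: FFT() and ApplyHammingWindow() return newly allocated images,
    // so each result is kept in its own pointer instead of being overwritten.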
</pre>
<p>And for the following IMAGE_FILE,<br />
<div id="attachment_941" class="wp-caption aligncenter" style="width: 266px"><a href="http://www.kerrywong.com/blog/wp-content/uploads/2009/04/testimg.jpeg"><img src="http://www.kerrywong.com/blog/wp-content/uploads/2009/04/testimg.jpeg" alt="Test image used for 2D FFT" title="Test Image" width="256" height="256" class="size-full wp-image-941" /></a><p class="wp-caption-text">Test image used for 2D FFT</p></div></p>
<p>Here are the results for the FFT spectrum with and without a Hamming window:<br />
<div id="attachment_943" class="wp-caption aligncenter" style="width: 266px"><a href="http://www.kerrywong.com/blog/wp-content/uploads/2009/04/test_hamming.jpg"><img src="http://www.kerrywong.com/blog/wp-content/uploads/2009/04/test_hamming.jpg" alt="Hamming window applied" title="Hamming window applied" width="256" height="256" class="size-full wp-image-943" /></a><p class="wp-caption-text">Hamming window applied</p></div><br />
<div id="attachment_944" class="wp-caption aligncenter" style="width: 266px"><a href="http://www.kerrywong.com/blog/wp-content/uploads/2009/04/test_fftmag.jpg"><img src="http://www.kerrywong.com/blog/wp-content/uploads/2009/04/test_fftmag.jpg" alt="FFT spectrum (without Hamming window)" title="FFT spectrum (without Hamming window)" width="256" height="256" class="size-full wp-image-944" /></a><p class="wp-caption-text">FFT spectrum (without Hamming window)</p></div><br />
<div id="attachment_945" class="wp-caption aligncenter" style="width: 266px"><a href="http://www.kerrywong.com/blog/wp-content/uploads/2009/04/test_fftmag_hamming.jpg"><img src="http://www.kerrywong.com/blog/wp-content/uploads/2009/04/test_fftmag_hamming.jpg" alt="FFT spectrum with Hamming window" title="FFT spectrum with Hamming window" width="256" height="256" class="size-full wp-image-945" /></a><p class="wp-caption-text">FFT spectrum with Hamming window</p></div></p>
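<p>For readers who want to see what the windowing step does to the pixel values, here is a minimal plain C++ sketch. It assumes the conventional separable form of the 2-D Hamming window, with each pixel multiplied by <em>w(x) * w(y)</em> and <em>w(n) = 0.54 - 0.46 cos(2&#960;n / (N - 1))</em>. It is not IPP&#8217;s implementation, and the helper names <em>HammingCoefficient</em> and <em>ApplyHamming2D</em> are hypothetical.</p>

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: 1-D Hamming coefficient for sample n of an N-point window.
double HammingCoefficient(std::size_t n, std::size_t N)
{
    const double kPi = 3.14159265358979323846;
    if (N < 2) return 1.0; // a one-point window leaves the sample unchanged
    return 0.54 - 0.46 * std::cos(2.0 * kPi * n / (N - 1));
}

// Hypothetical helper: multiply a row-major width x height image by the
// separable 2-D window w(x) * w(y).
std::vector<float> ApplyHamming2D(const std::vector<float>& image,
                                  std::size_t width, std::size_t height)
{
    assert(image.size() == width * height);

    std::vector<float> result(image.size());
    for (std::size_t y = 0; y < height; y++)
    {
        for (std::size_t x = 0; x < width; x++)
        {
            const double w = HammingCoefficient(x, width) * HammingCoefficient(y, height);
            result[x + y * width] = static_cast<float>(image[x + y * width] * w);
        }
    }
    return result;
}
```

<p>The window leaves the center of the image almost untouched and attenuates the borders (the corner weight is 0.08 * 0.08). This reduces the artificial edges that the FFT would otherwise see where the image wraps around.</p>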
<p>The header file for the class is as follows:</p>
<pre class="brush: cpp;">
#ifndef IPPGRAYIMAGE_H
#define IPPGRAYIMAGE_H

#include &lt;Magick++/Image.h&gt;
#include &lt;Magick++.h&gt;
#include &lt;ipp.h&gt;

using namespace std;
using namespace Magick;

namespace KDW
{
    class IPPGrayImage
    {
    public:
        unsigned int PIXEL_SIZE;

        IPPGrayImage();
        IPPGrayImage(const unsigned int width, const unsigned int height);
        IPPGrayImage(Ipp32f *imgBuffer, const unsigned int width, const unsigned int height);
        IPPGrayImage(const IPPGrayImage&amp; other);
        virtual ~IPPGrayImage();
        IPPGrayImage&amp; operator=(const IPPGrayImage&amp; other);

        void LoadFromFile(string fileName);
        void SaveToFile();
        void SaveToFile(string fileName);

        inline float GetPixel(unsigned int col, unsigned int row) {return _imgBuffer[row * _width + col];}
        inline void SetPixel(unsigned int col, unsigned int row, float clr) {_imgBuffer[row * _width + col] = clr;}

        unsigned int GetWidth() { return _width;}
        unsigned int GetHeight() { return _height;}

        Ipp32f* GetImageBuffer() { return _imgBuffer;}
        Image* GetImage() { return _img;}

        IPPGrayImage* Clone();
        IPPGrayImage* ApplyHammingWindow();
        IPPGrayImage* FFT(bool fftShift = false);
    protected:
    private:
        unsigned int _width;
        unsigned int _height;

        Image* _img;
        Pixels* _view;
        PixelPacket* _pixels;
        Ipp32f* _imgBuffer;
        string _fileName;

        void Init();
    };
}
#endif // IPPGRAYIMAGE_H
</pre>
<p>And here&#8217;s the implementation for the class:</p>
<pre class="brush: cpp;">
#include &lt;assert.h&gt;
#include &lt;math.h&gt;

#include &quot;IPPGrayImage.h&quot;

namespace KDW
{
    /** @brief Constructor()
     *  Initialize image buffer.
     */
    IPPGrayImage::IPPGrayImage()
    {
        Init();
    }

    /** @brief Constructor(width, height)
     *  Initialize an image size of width x height.
     */
    IPPGrayImage::IPPGrayImage(const unsigned int width, const unsigned int height)
    {
        Init();

        _width = width;
        _height = height;

        int stepByte = 0;
        _imgBuffer = ippiMalloc_32f_C1(_width, _height, &amp;stepByte);
    }

    /** @brief Constructor(imgBuffer, width, height)
     *  Initialize from an Ipp32f buffer (width x height). The image takes
     *  ownership of the buffer and frees it in the destructor.
     */
    IPPGrayImage::IPPGrayImage(Ipp32f *imgBuffer, const unsigned int width, const unsigned int height)
    {
        Init();

        _width = width;
        _height = height;
        _imgBuffer = imgBuffer;
    }

    /** @brief Copy Constructor
     *  Deep-copies the pixel buffer; the Magick++ image and view are not copied.
     */
    IPPGrayImage::IPPGrayImage(const IPPGrayImage&amp; other)
    {
        Init(); // set PIXEL_SIZE and reset the pointers that are not copied

        _width = other._width;
        _height = other._height;

        int stepByte = 0;

        _imgBuffer = ippiMalloc_32f_C1(_width, _height, &amp;stepByte);

        for (unsigned int row = 0; row &lt; _height ; row++)
        {
            for (unsigned int column = 0; column &lt; _width ; column++)
            {
                _imgBuffer[column + row * _width] = other._imgBuffer[column + row * _width];
            }
        }
    }

    /** @brief Overload =
     */
    IPPGrayImage&amp; IPPGrayImage::operator=(const IPPGrayImage&amp; rhs)
    {
        if (this == &amp;rhs) return *this; // handle self assignment

        return *this;
    }

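    /**
     * @brief Destructor
     */
    IPPGrayImage::~IPPGrayImage()
    {
        // The pixel buffer comes from ippiMalloc, so release it with ippiFree
        if (_imgBuffer != NULL) ippiFree(_imgBuffer);
        delete _view; // release the view before the image it refers to
        delete _img;
    }

    /**
     * @brief Common initialization code
     */
    void IPPGrayImage::Init()
    {
        PIXEL_SIZE = sizeof(Ipp32f);
        _width = 0;
        _height = 0;
        _pixels = NULL;
        _imgBuffer = NULL;
        _img = NULL;
        _view = NULL; // the destructor deletes _view, so it must never be left uninitialized
    }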

    /** @brief Load an image from file
      */
    void IPPGrayImage::LoadFromFile(string fileName)
    {
        _fileName = fileName; // remembered so that SaveToFile() can write back to the same file
        _img = new Image(fileName);
        Geometry g = _img-&gt;size();

        _width = g.width();
        _height= g.height();

        _view = new Pixels(*_img);
        _pixels = _view-&gt;get(0,0, _width, _height);

        int stepByte = 0;
        _imgBuffer = ippiMalloc_32f_C1(_width, _height, &amp;stepByte);

        for (unsigned int row = 0; row &lt; _height ; row++)
        {
            for (unsigned int column = 0; column &lt; _width ; column++)
            {
                PixelPacket *p = &amp;_pixels[column + row * _width];
                Color c = Color(p-&gt;red, p-&gt;green, p-&gt;blue);
                _imgBuffer[column + row * _width] = c.intensity();
            }
        }
    }

    /** @brief Save the current image buffer to file
      */
    void IPPGrayImage::SaveToFile()
    {
        SaveToFile(&quot;&quot;);
    }

    /** @brief SaveToFile(fileName)
     *  Saves the current image buffer to a file (by file name).
     */
    void IPPGrayImage::SaveToFile(string fileName)
    {
        if (_img != NULL &amp;&amp; _pixels != NULL)
        {
            for (unsigned int y = 0; y&lt; _height ; y++)
            {
                for (unsigned int x = 0; x &lt; _width; x++)
                {
                    float clr = (float) _imgBuffer[x + y * _width];
                    _pixels[x+ y * _width] = Color(clr, clr, clr);
                }
            }
            _view-&gt;sync();
            _img-&gt;syncPixels();

            if (fileName == &quot;&quot;)
            {
                _img-&gt;write(_fileName);
            }
            else
            {
                _img-&gt;write(fileName);
            }
        }
        else
        {
            Image img(Geometry(_width, _height),&quot;white&quot;);

            for (unsigned int y = 0; y&lt; _height ; y++)
            {
                for (unsigned int x = 0; x &lt; _width; x++)
                {
                    Color c;
                    float clr = (float) _imgBuffer[x + y * _width];
                    img.pixelColor(x,y, Color(clr, clr, clr));
                }
            }

            img.write(fileName);
        }
    }

    /** @brief Clone
    *   Duplicate the current image to another IPPGrayImage object.
    *   @return an IPPGrayImage pointer to the cloned image
    */
    IPPGrayImage* IPPGrayImage::Clone()
    {
        // The (width, height) constructor already allocates the pixel buffer,
        // so no second allocation is needed here.
        IPPGrayImage *newImg = new IPPGrayImage(_width, _height);

        for (unsigned int y = 0 ; y &lt; _height ; y++)
        {
            for (unsigned int x = 0 ; x &lt; _width ; x++)
            {
                newImg-&gt;_imgBuffer[x + y * _width] = _imgBuffer[x + y * _width];
            }
        }

        return newImg;
    }

    /** @brief Apply Hamming window to the image
     *  @return an IPPGrayImage pointer to the processed image
     */
    IPPGrayImage* IPPGrayImage::ApplyHammingWindow()
    {
        IPPGrayImage *newImg;
        newImg = new IPPGrayImage(_width , _height);
        IppiSize srcImgSize = {_width, _height};

        // Window the image directly into the new image's buffer, which the
        // (width, height) constructor has already allocated.
        IppStatus sts = ippiWinHamming_32f_C1R(_imgBuffer, _width * PIXEL_SIZE, newImg-&gt;_imgBuffer, _width * PIXEL_SIZE, srcImgSize);
        assert(sts ==ippStsNoErr);

        return newImg;
    }

    /** @brief Perform FFT on the image and return the magnitude component
     *  @param fftShift: if true, the result has the zero-frequency
     *         component shifted to the center of the spectrum
     *  @return the magnitude FFT component (zero-padded to power-of-two dimensions)
     */
    IPPGrayImage* IPPGrayImage::FFT(bool fftShift)
    {
        IPPGrayImage *newImg;
        IppiFFTSpec_R_32f *spec;
        IppStatus sts;

        unsigned int n = (int) (logf((float) _width) / logf(2.0f));
        unsigned int m = (int) (logf((float) _height) / logf(2.0f));

        unsigned int N = pow(2, n);
        unsigned int M = pow(2, m);

        if (N &lt; _width)
        {
            n = n + 1;
            N = pow(2, n);
        }

        if (M &lt; _height)
        {
            m = m + 1;
            M = pow(2, m);
        }

        int stepByte;
        // ippiMalloc takes the width first and then the height
        Ipp32f *src = ippiMalloc_32f_C1(N , M , &amp;stepByte);
        Ipp32f *dst = ippiMalloc_32f_C1(N , M , &amp;stepByte);
        Ipp32f *mag = ippiMalloc_32f_C1(N , M , &amp;stepByte);

        IppiSize srcImgSize = {_width, _height};
        IppiSize dstImgSize = {N, M};

        sts = ippiCopyConstBorder_32f_C1R(
                  _imgBuffer, _width * PIXEL_SIZE, srcImgSize,
                  src,  N * PIXEL_SIZE, dstImgSize,
                  0,0,0);
        assert(sts ==ippStsNoErr);

        sts = ippiFFTInitAlloc_R_32f(&amp;spec, n , m, IPP_FFT_DIV_BY_SQRTN, ippAlgHintAccurate);
        assert(sts ==ippStsNoErr);

        sts = ippiFFTFwd_RToPack_32f_C1R(src, N*PIXEL_SIZE, dst, N*PIXEL_SIZE, spec, 0);
        assert(sts ==ippStsNoErr);

        sts = ippiMagnitudePack_32f_C1R(dst, N*PIXEL_SIZE, mag, N*PIXEL_SIZE, dstImgSize);
        assert (sts ==ippStsNoErr);

        newImg = new IPPGrayImage(N , M);

        if (fftShift)
        {
            // "sections" only runs concurrently inside a parallel region
#pragma omp parallel sections
            {
#pragma omp section
                {
                    for (unsigned int y = 0 ; y &lt; M/2; y++)
                    {
                        for (unsigned int x = 0 ; x &lt; N/2; x++)
                        {
                            newImg-&gt;_imgBuffer[x+ y *N] = mag[x + N/2 + (y + M/2) * N];
                        }
                    }
                }
#pragma omp section
                {
                    for (unsigned int y = 0 ; y &lt; M/2; y++)
                    {
                        for (unsigned int x = N/2 ; x &lt; N; x++)
                        {
                            newImg-&gt;_imgBuffer[x+ y *N] = mag[x - N/2 + (y + M/2) * N];
                        }
                    }
                }
#pragma omp section
                {
                    for (unsigned int y = M/2 ; y &lt; M; y++)
                    {
                        for (unsigned int x = 0 ; x &lt; N/2; x++)
                        {
                            newImg-&gt;_imgBuffer[x+ y *N] = mag[x + N/2 + (y - M/2) * N];
                        }
                    }
                }
#pragma omp section
                {
                    for (unsigned int y = M/2 ; y &lt; M; y++)
                    {
                        for (unsigned int x = N/2 ; x &lt; N; x++)
                        {
                            newImg-&gt;_imgBuffer[x+ y *N] = mag[x - N/2 + (y - M/2) * N];
                        }
                    }
                }
            }
        }
        else
        {
            for (unsigned int y = 0; y&lt; M; y++)
            {
                for (unsigned int x = 0; x &lt;N; x++)
                {
                    newImg-&gt;_imgBuffer[x+ y * N] = mag[x + y * N];
                }
            }

        }

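        // Release the FFT specification and the temporary buffers
        ippiFFTFree_R_32f(spec);
        ippiFree(src);
        ippiFree(dst);
        ippiFree(mag);

        return newImg;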
    }
}
</pre>
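<p>Two steps of the FFT routine above can be sketched in plain C++ without IPP: finding the power-of-two order used for zero-padding, and the quadrant swap performed when <em>fftShift</em> is true. This is only an illustrative sketch. The helper names <em>NextPow2Order</em> and <em>FftShift</em> are hypothetical, and the integer loop replaces the <em>logf</em>-based computation with something that does not depend on floating-point rounding.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: smallest order n such that (1 << n) >= size.
// Intended for sizes up to 2^31.
unsigned int NextPow2Order(unsigned int size)
{
    unsigned int order = 0;
    while (order < 31 && (1u << order) < size) order++;
    return order;
}

// Hypothetical helper: move the zero-frequency component of a row-major
// N x M (width x height) spectrum to the center by swapping quadrants.
// N and M are assumed to be even, which holds for power-of-two sizes >= 2.
std::vector<float> FftShift(const std::vector<float>& mag, std::size_t N, std::size_t M)
{
    assert(mag.size() == N * M);
    assert(N % 2 == 0 && M % 2 == 0);

    std::vector<float> shifted(mag.size());
    for (std::size_t y = 0; y < M; y++)
    {
        for (std::size_t x = 0; x < N; x++)
        {
            const std::size_t dstX = (x + N / 2) % N;
            const std::size_t dstY = (y + M / 2) % M;
            shifted[dstX + dstY * N] = mag[x + y * N];
        }
    }
    return shifted;
}
```

<p>A 300-pixel-wide image, for instance, is padded to order 9 (512 pixels). The single modulo-based loop performs the same data movement as the four explicit quadrant loops in the listing, which were split up so that each quadrant could be handled by its own OpenMP section.</p>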
<p>The above example shows only the FFT function, but virtually all IPP image routines can be accommodated using the wrapper image class above. Intel IPP utilizes many different data types (e.g. Ipp8u, Ipp16s, Ipp32f etc.), but to provide the most flexibility and compatibility I chose to use only the 32-bit floating-point data type. For specific implementations, other data types can be used as well.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.kerrywong.com/2009/04/10/an-image-class-based-on-ipp/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Interfacing IPP with Magick++</title>
		<link>http://www.kerrywong.com/2009/03/17/interfacing-ipp-with-magick/</link>
		<comments>http://www.kerrywong.com/2009/03/17/interfacing-ipp-with-magick/#comments</comments>
		<pubDate>Wed, 18 Mar 2009 01:01:24 +0000</pubDate>
		<dc:creator>kwong</dc:creator>
				<category><![CDATA[Coding]]></category>
		<category><![CDATA[Algorithm]]></category>
		<category><![CDATA[C++]]></category>
		<category><![CDATA[Image Processing]]></category>
		<category><![CDATA[IPP]]></category>
		<category><![CDATA[Multi-threading]]></category>

		<guid isPermaLink="false">http://www.kerrywong.com/?p=864</guid>
		<description><![CDATA[Intel&#8217;s Integrated Performance Primitives (IPP) is a low-level C++ library. It provides routines that are highly optimized for Intel processors. I recently started using it because of its vast speed advantage in signal and image processing applications.
Since the implementation of many of the functions is threaded, it makes the task of writing high-performance applications [...]]]></description>
			<content:encoded><![CDATA[<p>Intel&#8217;s <a href="http://www.intel.com/cd/software/products/asmo-na/eng/302910.htm">Integrated Performance Primitives (IPP)</a> is a low-level C++ library. It provides routines that are highly optimized for Intel processors. I recently started using it because of its vast speed advantage in signal and image processing applications.<span id="more-864"></span></p>
<p>Since the implementation of many of the functions is threaded, it makes the task of writing high-performance applications much easier. Because it is a set of &#8220;performance primitives&#8221;, as the name suggests, it uses its own data types (e.g. Ipp8u) and does not provide functions to directly interoperate with data coming from other sources (e.g. image files).</p>
<p>Fortunately, such data conversion is pretty straightforward. In this post, I will illustrate how to convert an image file (e.g. .jpg, .png, .gif) to the data IPP uses, and how to save the result into a standard image file once the processing is done.</p>
<p><a href="http://www.imagemagick.org/Magick%2B%2B/">Magick++</a> is a very comprehensive image-processing C++ library and the Image class it provides handles image files quite well. So I chose to use Magick++&#8217;s API to convert image files to the data structure IPP uses. In this particular example, I will use a gray-level image. But in practice, color images can be easily handled in a similar fashion.</p>
<p>For the code mentioned below, the following headers and namespaces are used:</p>
<pre class="brush: cpp;">
#include &lt;Magick++/Image.h&gt;
#include &lt;Magick++.h&gt;
#include &lt;ipp.h&gt;

using namespace std;
using namespace Magick;
</pre>
<p>And the following code shows how to convert standard image file data into a format that is suitable for IPP.</p>
<pre class="brush: cpp;">
    IppStatus sts;
    Image img(&quot;{Image File Name}&quot;);
    Geometry g = img.size();

    unsigned int width = g.width();
    unsigned int height= g.height();

    Pixels view(img);
    PixelPacket *pixels = view.get(0,0,width,height);

    int stepByte = 0;
    //allocating a buffer of unsigned char (Ipp8u) for the image.
    Ipp8u *imgCache = ippiMalloc_8u_C1(width,height, &amp;stepByte);

    for (unsigned int row = 0; row &lt; height ; row++)
    {
        for (unsigned int column = 0; column &lt; width ; column++)
        {
            PixelPacket *p = &amp;pixels[column + row * width];
            Color c = Color(p-&gt;red, p-&gt;green, p-&gt;blue);
            double i = c.scaleQuantumToDouble(c.intensity()) * 255;
            // cast to the buffer's own unsigned type; a plain char may be signed
            imgCache[column + row * width] = (Ipp8u) i;
        }
    }
</pre>
<p>The above code snippet first reads the image data into a PixelPacket array, and the pixel buffer is then converted into a one-channel buffer of unsigned chars (color value 0-255). Note that the range of the image data Magick++ reads in is between 0 and <strong><em>QuantumRange</em></strong>, which needs to be converted to the range 0-255 accepted by the Ipp8u buffer. If other types of <strong><em>IPP</em></strong> image buffers are used (e.g. Ipp32f), this <strong><em>scaleQuantumToDouble()</em></strong> conversion may not be necessary. At the end, the image data is stored in the one-dimensional array <strong><em>imgCache</em></strong>, which can be used by <em>IPP</em> procedures.</p>
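<p>The scaling step can be illustrated in isolation with a small plain C++ sketch. The helper name <em>QuantumTo8Bit</em> is hypothetical, and the quantum range is passed in explicitly because it depends on how ImageMagick was built (65535 for a 16-bit quantum build is used here purely as an assumed example). Unlike the cast in the snippet above, this sketch rounds to the nearest integer instead of truncating.</p>

```cpp
#include <cassert>

// Hypothetical helper: map a quantum value in [0, quantumRange] to [0, 255],
// clamping out-of-range input and rounding to the nearest integer.
unsigned char QuantumTo8Bit(double quantum, double quantumRange)
{
    assert(quantumRange > 0.0);

    double scaled = quantum / quantumRange * 255.0;
    if (scaled < 0.0) scaled = 0.0;
    if (scaled > 255.0) scaled = 255.0;
    return static_cast<unsigned char>(scaled + 0.5);
}
```

<p>With an assumed range of 65535, a quantum value of 0 maps to 0, 65535 maps to 255, and 32768 maps to 128.</p>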
<p>Once we are in the <strong><em>IPP</em></strong> data domain, we can proceed with whatever processing we had in mind. Here I will show the code for edge detection using the <a href="http://en.wikipedia.org/wiki/Canny_edge_detector">Canny algorithm</a>.</p>
<pre class="brush: cpp;">
    IppiSize orgImgSize = {width, height};
    IppiSize newImgSize = {width + 2, height + 2};

    Ipp8u *imgCache1 = ippiMalloc_8u_C1(width + 2,height + 2, &amp;stepByte);
    sts = ippiCopyReplicateBorder_8u_C1R(imgCache, width, orgImgSize,  imgCache1,
            width + 2 , newImgSize, 2, 2);
    IppiSize roi = {width, height};

    Ipp32f low=30.0f, high=100.0f;
    int size, size1, srcStep, dxStep, dyStep, dstStep;

    Ipp8u *src, *dst, *buffer;
    Ipp16s *dx, *dy;

    sts = ippiFilterSobelNegVertGetBufferSize_8u16s_C1R(
            roi, ippMskSize3x3, &amp;size);
    sts = ippiFilterSobelHorizGetBufferSize_8u16s_C1R(
            roi, ippMskSize3x3, &amp;size1);

    if (size&lt;size1) size=size1;
    ippiCannyGetSize(roi, &amp;size1);
    if (size&lt;size1) size=size1;

    buffer = ippsMalloc_8u(size);
    dx = ippsMalloc_16s(size);
    dy = ippsMalloc_16s(size);
    dst = ippsMalloc_8u(size);

    sts = ippiFilterSobelNegVertBorder_8u16s_C1R (
            imgCache1, width + 2 , dx, (width + 2) * 2 ,
            roi, ippMskSize3x3, ippBorderRepl, 0, buffer);
    sts = ippiFilterSobelHorizBorder_8u16s_C1R(
            imgCache1, width + 2, dy, (width + 2) *2 ,
            roi, ippMskSize3x3, ippBorderRepl, 0, buffer);
    sts = ippiCanny_16s8u_C1R(dx,
            (width + 2) * 2, dy, (width + 2) * 2,
            dst, width, roi, low, high, buffer);
</pre>
<p>The code shown above is adapted from Intel&#8217;s IPP manual for image and video processing (by default it is located at /opt/intel/ipp/<em>{version number}</em>/em64t/doc/ippiman.pdf).</p>
<p>Please pay special attention to how the original image is extended via <strong><em>ippiCopyReplicateBorder_8u_C1R</em></strong>. The image is extended by 2 pixels in each direction because of the 3&#215;3 mask used for the filtering operation. I omitted error checking code for simplicity, but in general you need to check the return status of each ippi function call. When the call is successful, the status is <strong><em>ippStsNoErr</em></strong>. If you receive a value other than <strong><em>ippStsNoErr</em></strong> (e.g. <strong><em>ippStsStepErr</em></strong>), check that your buffer sizes and step values are adjusted according to the data type. For instance, in the code above <strong><em>dx</em></strong> and <strong><em>dy</em></strong> both hold 16-bit signed integers, so each element occupies two bytes.</p>
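<p>To illustrate why the step arguments for <strong><em>dx</em></strong> and <strong><em>dy</em></strong> are multiplied by two: IPP image functions expect the distance between rows in bytes, not in elements. The sketch below uses plain C++ with hypothetical helper names (<strong><em>stepBytes</em></strong>, <strong><em>pixelAt</em></strong>) and assumes tightly packed rows with no padding, which is how the code above indexes its buffers; the step returned by ippiMalloc may be larger than that.</p>
<pre class="brush: cpp;">
#include &lt;cstddef&gt;

// Row step in bytes for a tightly packed single-channel image whose
// elements have type T (no padding between rows is assumed here).
template &lt;typename T&gt;
std::size_t stepBytes(std::size_t widthInPixels)
{
    return widthInPixels * sizeof(T);
}

// Address of pixel (x, y) when rows are 'step' bytes apart.
// The row offset is applied in bytes, the column offset in elements.
template &lt;typename T&gt;
T *pixelAt(T *base, std::size_t step, std::size_t x, std::size_t y)
{
    unsigned char *row = reinterpret_cast&lt;unsigned char *&gt;(base) + y * step;
    return reinterpret_cast&lt;T *&gt;(row) + x;
}
</pre>
<p>For the 8-bit source, stepBytes&lt;unsigned char&gt;(width + 2) is simply width + 2, while for the 16-bit gradient buffers stepBytes&lt;short&gt;(width + 2) is (width + 2) * 2 on platforms where short is 16 bits wide. These are the values passed to the filter and Canny calls above.</p>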
<p>To save the result image, we convert the pixel buffer back to <strong><em>PixelPacket</em></strong> type. Again we need to convert the data in the buffer (<strong><em>Ipp8u</em></strong>) to the range accepted by Magick++ API.</p>
<pre class="brush: cpp;">
    for (unsigned int y = 0; y&lt; height ; y++)
    {
        for (unsigned int x = 0; x &lt; width; x++)
        {
            Color c;
            float clr = (float) dst[x + y * width] /255;
            float q = c.scaleDoubleToQuantum(clr);
            pixels[x+ y * width] = Color(q, q, q);
        }
    }

    view.sync();
    img.syncPixels();

    img.write(&quot;{Image File Name}&quot;);
</pre>
<p>We can also save the result data into a new image (instead of syncing the data back to the image object holding the original image data) using the code below:</p>
<pre class="brush: cpp;">
    Image img1(Geometry(width, height),&quot;white&quot;);

    for (unsigned int y = 0; y&lt; height ; y++)
    {
        for (unsigned int x = 0; x &lt; width; x++)
        {
            Color c;
            float clr = (float) dst[x + y * width] /255;
            float q = c.scaleDoubleToQuantum(clr);
            img1.pixelColor(x,y, Color(q, q, q));
        }
    }

    img1.write(&quot;{Image File Name}&quot;);
</pre>
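<p>And finally the buffers used are freed. Only two of them are shown here; the remaining buffers (<strong><em>imgCache1</em></strong>, <strong><em>dx</em></strong>, <strong><em>dy</em></strong> and <strong><em>dst</em></strong>) should be released the same way, using ippiFree for memory obtained from ippiMalloc and ippsFree for memory obtained from ippsMalloc.</p>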
<pre class="brush: cpp;">
    ippiFree(imgCache);
    ippsFree(buffer);
</pre>
<p>The following images show the Canny edge detector in action using the code in this article:</p>
<div id="attachment_876" class="wp-caption aligncenter" style="width: 573px"><a href="http://www.kerrywong.com/blog/wp-content/uploads/2009/03/test.jpg"><img src="http://www.kerrywong.com/blog/wp-content/uploads/2009/03/test.jpg" alt="Original" title="Original" width="563" height="422" class="size-full wp-image-876" /></a><p class="wp-caption-text">Original</p></div>
<div id="attachment_879" class="wp-caption aligncenter" style="width: 573px"><a href="http://www.kerrywong.com/blog/wp-content/uploads/2009/03/test_canny.png"><img src="http://www.kerrywong.com/blog/wp-content/uploads/2009/03/test_canny.png" alt="Canny Edge Detector applied" title="Canny Edge Detector applied" width="563" height="422" class="size-full wp-image-879" /></a><p class="wp-caption-text">Canny Edge Detector applied</p></div>
]]></content:encoded>
			<wfw:commentRss>http://www.kerrywong.com/2009/03/17/interfacing-ipp-with-magick/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Matrix Multiplication Performance in C++</title>
		<link>http://www.kerrywong.com/2009/03/07/matrix-multiplication-performance-in-c/</link>
		<comments>http://www.kerrywong.com/2009/03/07/matrix-multiplication-performance-in-c/#comments</comments>
		<pubDate>Sat, 07 Mar 2009 04:16:14 +0000</pubDate>
		<dc:creator>kwong</dc:creator>
				<category><![CDATA[Coding]]></category>
		<category><![CDATA[Linux/BSD]]></category>
		<category><![CDATA[Algorithm]]></category>
		<category><![CDATA[BLAS]]></category>
		<category><![CDATA[C++]]></category>
		<category><![CDATA[MATLAB]]></category>
		<category><![CDATA[matrix multiplication]]></category>

		<guid isPermaLink="false">http://www.kerrywong.com/?p=586</guid>
		<description><![CDATA[A few days ago, I ran across this article by Dmitri Nesteruk. In his article, he compared the performance between C# and C++ in matrix multiplication. From the data he provided, matrix multiplication using C# is two to three times slower than using C++ in comparable situations.
Even though a lot of optimizations have been done [...]]]></description>
			<content:encoded><![CDATA[<p>A few days ago, I ran across <a href="http://mindstudies.psy.soton.ac.uk/dmitri/blog/index.php/archives/160">this article by Dmitri Nesteruk</a>. In his article, he compared the performance between C# and C++ in <a href="http://en.wikipedia.org/wiki/Matrix_multiplication">matrix multiplication</a>. From the data he provided, matrix multiplication using C# is two to three times slower than using C++ in comparable situations.<span id="more-586"></span></p>
<p>Even though a lot of optimizations have been done in the .Net runtime to make it more efficient, it is apparent that scientific programming still favors C and C++ because the performance advantage is huge.</p>
<p>In this article, I will examine some commonly used matrix multiplication algorithms and illustrate the efficiencies of the various methods. All the tests are done using C++ only, with matrix sizes ranging from 500&#215;500 to 2000&#215;2000. When the matrices are small (e.g. &lt;50), you can use pretty much any matrix multiplication algorithm without observing any significant performance difference. This is largely because the typical stable matrix multiplication algorithms are O(n^3), and for small matrices the array operation overheads can outweigh the benefit of a more efficient algorithm. But for matrices of larger dimensions, the efficiency of the multiplication algorithm becomes extremely important.</p>
<p>Since <a href="http://mindstudies.psy.soton.ac.uk/dmitri/blog/index.php/archives/160">Dmitri&#8217;s article</a> has already captured pretty detailed data using the standard matrix multiplication algorithm, I will not repeat his findings in this article. What I intended to show was the performance data of <a href="http://www.boost.org/doc/libs/1_38_0/libs/numeric/ublas/doc/overview.htm">uBLAS</a>, <a href="http://openmp.org/wp/">OpenMP</a>, <a href="http://en.wikipedia.org/wiki/Basic_Linear_Algebra_Subprograms">cBLAS</a> and <a href="http://www.mathworks.com/">MATLAB</a>.</p>
<p>The following sample code was compiled under Ubuntu 8.10 64-bit (kernel 2.6.24.23) on an Intel Q9450@3.2GHz.</p>
<h4>Standard Matrix Multiplication (Single Threaded)</h4>
<p>This is our reference code. Later on, I will only show the critical portion of the code and not repeat the common portion of code that initializes/finalizes the arrays. Similarly, the timing method used is also the same across all the tests and will be omitted later on.</p>
<pre class="brush: cpp;">
float **A, **B, **C;

A = new float*[matrix_size];
B = new float*[matrix_size];
C = new float*[matrix_size];

for (int i = 0 ; i &lt; matrix_size; i++)
{
    A[i] = new float[matrix_size];
    B[i] = new float[matrix_size];
    C[i] = new float[matrix_size];
}

for (int i=0; i&lt;matrix_size; i++)
{
    for (int j = 0 ; j &lt; matrix_size; j++)
    {
        A[i][j]=rand();
        B[i][j]=rand();
    }
}

timeval t1, t2, t;
gettimeofday(&amp;t1, NULL);

for (int i = 0 ; i &lt; matrix_size; i++)
{
    for (int j = 0;  j &lt; matrix_size; j++)
    {
        C[i][j] = 0;
        for (int k = 0; k &lt; matrix_size; k++)
        {
            C[i][j] += A[i][k] * B[k][j];
        }
    }
}

gettimeofday(&amp;t2, NULL);
timersub(&amp;t2, &amp;t1, &amp;t);

cout &lt;&lt; t.tv_sec + t.tv_usec/1000000.0 &lt;&lt; &quot; Seconds -- Standard&quot; &lt;&lt; endl;

for (int i = 0 ; i &lt; matrix_size; i++)
{
    // Memory allocated with new[] must be released with delete[].
    delete[] A[i];
    delete[] B[i];
    delete[] C[i];
}

delete[] A;
delete[] B;
delete[] C;
</pre>
<h4>OpenMP With Two Dimensional Arrays</h4>
<p>Using OpenMP, we are able to use multiple threads via the #pragma omp directives. For the simple algorithm we used here, the speed increase is almost proportional to the number of available cores within the system.</p>
<pre class="brush: cpp;">
...
#pragma omp parallel for shared(a,b,c)
for (long i=0; i&lt;matrix_size; i++)
{
    for (long j = 0; j &lt; matrix_size; j++)
    {
        float sum = 0;
        for (long k = 0; k &lt; matrix_size; k++)
        {
            sum +=a[i][k]*b[k][j];
        }
        c[i][j] = sum;
    }
}
...
</pre>
<h4>OpenMP With One Dimensional Arrays</h4>
<p>Cache locality is poor with the simple algorithm shown above. The performance can easily be improved, however, by improving the locality of the references. One way to achieve better cache locality is to use a one dimensional array instead of a two dimensional array. As you will see later, the following implementation gains as much as 50% in speed over the previous OpenMP implementation using two dimensional arrays.</p>
<pre class="brush: cpp;">
float *a, *b, *c;

a = new float[matrix_size * matrix_size];
b = new float[matrix_size * matrix_size];
c = new float[matrix_size * matrix_size];

for (long i=0; i&lt;matrix_size * matrix_size; i++)
{
    a[i]=rand();
    b[i] = rand();
    c[i] = 0;
}

#pragma omp parallel for shared(a,b,c)
for (long i=0; i&lt;matrix_size; i++)
{
    for (long j = 0; j &lt; matrix_size; j++)
    {
        long idx = i * matrix_size;
        float sum = 0;
        for (long k = 0; k &lt; matrix_size; k++)
        {
            sum +=a[idx + k]*b[k * matrix_size +j];
        }
        c[idx + j] = sum;
    }
}

delete[] a;
delete[] b;
delete[] c;
</pre>
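<p>Another common way to improve locality, shown here only as a sketch and not something benchmarked in this article, is to swap the two inner loops so that the innermost loop walks both <strong><em>b</em></strong> and <strong><em>c</em></strong> row by row. The function name <strong><em>multiply_ikj</em></strong> is hypothetical; it assumes square row-major matrices stored in one dimensional arrays as above, and it zeroes each row of <strong><em>c</em></strong> itself.</p>
<pre class="brush: cpp;">
// c = a * b for n x n row-major matrices stored in one dimensional arrays.
// Loop order i-k-j: the innermost loop reads b and writes c contiguously.
void multiply_ikj(const float *a, const float *b, float *c, long n)
{
    for (long i = 0; i &lt; n; i++)
    {
        float *crow = c + i * n;
        for (long j = 0; j &lt; n; j++)
            crow[j] = 0.0f;

        for (long k = 0; k &lt; n; k++)
        {
            const float aik = a[i * n + k];
            const float *brow = b + k * n;
            for (long j = 0; j &lt; n; j++)
                crow[j] += aik * brow[j];
        }
    }
}
</pre>
<p>Because each iteration of the outer loop writes a distinct row of <strong><em>c</em></strong>, the outer loop could in principle be parallelized with the same #pragma omp parallel for directive used above.</p>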
<h4>Boost Library uBLAS (Single Threaded)</h4>
<p>The Boost library provides a convenient way to perform matrix multiplication. However, the performance is very poor compared to all the other approaches mentioned in this article. The performance of the uBLAS implementation is largely on par with that of C# (see the benchmarks towards the end of the article). Intel&#8217;s Math Kernel Library (MKL) 10.1 does provide functionality to dynamically convert code using uBLAS syntax into highly efficient code using MKL by including the header file mkl_boost_ublas_matrix_prod.hpp. I have not tried it myself, but the performance should be comparable to algorithms using the native MKL BLAS interface.</p>
<p>By default (without using MKL&#8217;s uBLAS capability), though, uBLAS is single threaded, and due to its poor performance I would strongly suggest avoiding uBLAS in any high performance scientific applications.</p>
<pre class="brush: cpp;">
matrix&lt;float&gt; A, B, C;

A.resize(matrix_size,matrix_size);
B.resize(matrix_size,matrix_size);

for (int i = 0; i &lt; matrix_size; ++ i)
{
    for (int j = 0; j &lt; matrix_size; ++ j)
    {
        A(i, j) = rand();
        B(i, j) = rand();
    }
}

C =prod(A, B);
</pre>
<h4>Intel Math Kernel Library (MKL) cBLAS</h4>
<p>Intel&#8217;s Math Kernel Library (MKL) is highly optimized for Intel&#8217;s microprocessor platforms. Given that Intel developed this library for its own processor platforms, we can expect significant performance gains. I am still surprised at how fast the code runs using cBLAS though. In fact, it was so fast that I doubted the validity of the result at first. But after checking the results against those obtained by other means, those doubts were put to rest.</p>
<p>The cBLAS matrix multiplication uses a blocked matrix multiplication method, which further improves cache locality. It is more than thirty times faster than the fastest OMP 1D algorithm listed above! Another benefit is that by default it automatically detects the number of CPUs/cores available and uses all available threads. This behavior greatly simplifies the code since threading is handled transparently within the library.</p>
<pre class="brush: cpp;">
float *A, *B, *C;

A = new float[matrix_size * matrix_size];
B = new float[matrix_size * matrix_size];
C = new float[matrix_size * matrix_size];

for (int i = 0; i &lt; matrix_size * matrix_size; i++)
{
    A[i] = rand();
    B[i] = rand();
    C[i] = 0;
}

cblas_sgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans,
    matrix_size,  matrix_size,  matrix_size, 1.0, A,matrix_size,
    B, matrix_size, 0.0, C, matrix_size);
</pre>
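<p>To give a feel for what a blocked (tiled) multiplication does, here is a minimal sketch. It is not MKL&#8217;s implementation, whose internals are not described in this article; the function name <strong><em>multiply_blocked</em></strong> and the default tile size of 64 are assumptions chosen purely for illustration, and a tuned library does considerably more than this.</p>
<pre class="brush: cpp;">
#include &lt;algorithm&gt;

// c = a * b for n x n row-major matrices, processed tile by tile so that
// the data touched by the inner loops is small enough to stay in cache.
void multiply_blocked(const float *a, const float *b, float *c, long n, long tile = 64)
{
    for (long i = 0; i &lt; n * n; i++)
        c[i] = 0.0f;

    for (long ii = 0; ii &lt; n; ii += tile)
    {
        const long iEnd = std::min(ii + tile, n);
        for (long kk = 0; kk &lt; n; kk += tile)
        {
            const long kEnd = std::min(kk + tile, n);
            for (long jj = 0; jj &lt; n; jj += tile)
            {
                const long jEnd = std::min(jj + tile, n);
                // Multiply the (ii, kk) tile of a by the (kk, jj) tile of b
                // and accumulate into the (ii, jj) tile of c.
                for (long i = ii; i &lt; iEnd; i++)
                {
                    for (long k = kk; k &lt; kEnd; k++)
                    {
                        const float aik = a[i * n + k];
                        for (long j = jj; j &lt; jEnd; j++)
                            c[i * n + j] += aik * b[k * n + j];
                    }
                }
            }
        }
    }
}
</pre>
<p>The result is the same as that of the straightforward triple loop; only the order in which the partial products are accumulated changes. A suitable tile size depends on the cache sizes of the target processor.</p>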
<h4>MATLAB (Single Threaded)</h4>
<p>MATLAB is known for its efficient algorithms. In fact, it uses BLAS libraries for its own matrix calculation routines. The version of MATLAB I have is a little dated (7.0.1), but it is nevertheless interesting to see how its performance compares with that of the latest MKL. MATLAB 7 is single threaded, and given the same matrix size, it runs roughly three times slower than the fastest MKL routine listed above (per core).</p>
<pre>    a = rand(i,i);
    b = rand(i,i);
    tic;
    c = a*b;
    t = toc</pre>
<p>
The following table shows the results I obtained by running the code listed above. The results are times in seconds. (Note: S.TH means single threaded and M.TH means multi-threaded.)</p>
<table border="0" cellspacing="0" frame="void" rules="none">
<colgroup>
<col width="116"></col>
<col width="116"></col>
<col width="116"></col>
<col width="116"></col>
<col width="116"></col>
<col width="116"></col>
<col width="116"></col>
</colgroup>
<tbody>
<tr>
<td style="border: 1px solid #000000;" width="116" height="17" align="center" bgcolor="#008080"><strong><span style="color: #ffffff;">Size/Algorithm</span></strong></td>
<td style="border: 1px solid #000000;" width="116" align="center" bgcolor="#008080"><strong><span style="color: #ffffff;">uBLAS S.TH</span></strong></td>
<td style="border: 1px solid #000000;" width="116" align="center" bgcolor="#008080"><strong><span style="color: #ffffff;">STD S.TH</span></strong></td>
<td style="border: 1px solid #000000;" width="116" align="center" bgcolor="#008080"><strong><span style="color: #ffffff;">OMP 2D</span></strong></td>
<td style="border: 1px solid #000000;" width="116" align="center" bgcolor="#008080"><strong><span style="color: #ffffff;">OMP 1D</span></strong></td>
<td style="border: 1px solid #000000;" width="116" align="center" bgcolor="#008080"><strong><span style="color: #ffffff;">MATLAB S.TH</span></strong></td>
<td style="border: 1px solid #000000;" width="116" align="center" bgcolor="#008080"><strong><span style="color: #ffffff;">cBLAS M.TH</span></strong></td>
</tr>
<tr>
<td style="border: 1px solid #000000;" height="17" align="right" bgcolor="#008080"><strong><span style="color: #ffffff;">500&#215;500</span></strong></td>
<td style="border: 1px solid #000000;" align="right">3.2435</td>
<td style="border: 1px solid #000000;" align="right">0.5253</td>
<td style="border: 1px solid #000000;" align="right">0.1939</td>
<td style="border: 1px solid #000000;" align="right">0.0536</td>
<td style="border: 1px solid #000000;" align="right">0.0810</td>
<td style="border: 1px solid #000000;" align="right">0.0206</td>
</tr>
<tr>
<td style="border: 1px solid #000000;" height="17" align="right" bgcolor="#008080"><strong><span style="color: #ffffff;">600&#215;600</span></strong></td>
<td style="border: 1px solid #000000;" align="right">5.7854</td>
<td style="border: 1px solid #000000;" align="right">0.9349</td>
<td style="border: 1px solid #000000;" align="right">0.3223</td>
<td style="border: 1px solid #000000;" align="right">0.1655</td>
<td style="border: 1px solid #000000;" align="right">0.1410</td>
<td style="border: 1px solid #000000;" align="right">0.0093</td>
</tr>
<tr>
<td style="border: 1px solid #000000;" height="17" align="right" bgcolor="#008080"><strong><span style="color: #ffffff;">700&#215;700</span></strong></td>
<td style="border: 1px solid #000000;" align="right">9.2292</td>
<td style="border: 1px solid #000000;" align="right">1.2928</td>
<td style="border: 1px solid #000000;" align="right">0.3529</td>
<td style="border: 1px solid #000000;" align="right">0.2797</td>
<td style="border: 1px solid #000000;" align="right">0.2230</td>
<td style="border: 1px solid #000000;" align="right">0.0122</td>
</tr>
<tr>
<td style="border: 1px solid #000000;" height="17" align="right" bgcolor="#008080"><strong><span style="color: #ffffff;">800&#215;800</span></strong></td>
<td style="border: 1px solid #000000;" align="right">13.7711</td>
<td style="border: 1px solid #000000;" align="right">2.3746</td>
<td style="border: 1px solid #000000;" align="right">0.7259</td>
<td style="border: 1px solid #000000;" align="right">0.4135</td>
<td style="border: 1px solid #000000;" align="right">0.3320</td>
<td style="border: 1px solid #000000;" align="right">0.0310</td>
</tr>
<tr>
<td style="border: 1px solid #000000;" height="17" align="right" bgcolor="#008080"><strong><span style="color: #ffffff;">900&#215;900</span></strong></td>
<td style="border: 1px solid #000000;" align="right">20.3245</td>
<td style="border: 1px solid #000000;" align="right">3.4983</td>
<td style="border: 1px solid #000000;" align="right">1.0146</td>
<td style="border: 1px solid #000000;" align="right">0.7449</td>
<td style="border: 1px solid #000000;" align="right">0.4740</td>
<td style="border: 1px solid #000000;" align="right">0.0306</td>
</tr>
<tr>
<td style="border: 1px solid #000000;" height="17" align="right" bgcolor="#008080"><strong><span style="color: #ffffff;">1000&#215;1000</span></strong></td>
<td style="border: 1px solid #000000;" align="right">28.8345</td>
<td style="border: 1px solid #000000;" align="right">3.4983</td>
<td style="border: 1px solid #000000;" align="right">1.4748</td>
<td style="border: 1px solid #000000;" align="right">1.0548</td>
<td style="border: 1px solid #000000;" align="right">0.6530</td>
<td style="border: 1px solid #000000;" align="right">0.0700</td>
</tr>
<tr>
<td style="border: 1px solid #000000;" height="17" align="right" bgcolor="#008080"><strong><span style="color: #ffffff;">1100&#215;1100</span></strong></td>
<td style="border: 1px solid #000000;" align="right">38.2545</td>
<td style="border: 1px solid #000000;" align="right">7.0240</td>
<td style="border: 1px solid #000000;" align="right">1.9383</td>
<td style="border: 1px solid #000000;" align="right">1.6257</td>
<td style="border: 1px solid #000000;" align="right">0.8620</td>
<td style="border: 1px solid #000000;" align="right">0.1250</td>
</tr>
<tr>
<td style="border: 1px solid #000000;" height="17" align="right" bgcolor="#008080"><strong><span style="color: #ffffff;">1200&#215;1200</span></strong></td>
<td style="border: 1px solid #000000;" align="right">50.4964</td>
<td style="border: 1px solid #000000;" align="right">9.9319</td>
<td style="border: 1px solid #000000;" align="right">2.8411</td>
<td style="border: 1px solid #000000;" align="right">2.1215</td>
<td style="border: 1px solid #000000;" align="right">1.1170</td>
<td style="border: 1px solid #000000;" align="right">0.0440</td>
</tr>
<tr>
<td style="border: 1px solid #000000;" height="17" align="right" bgcolor="#008080"><strong><span style="color: #ffffff;">1300&#215;1300</span></strong></td>
<td style="border: 1px solid #000000;" align="right">64.5064</td>
<td style="border: 1px solid #000000;" align="right">12.8344</td>
<td style="border: 1px solid #000000;" align="right">3.6277</td>
<td style="border: 1px solid #000000;" align="right">2.9720</td>
<td style="border: 1px solid #000000;" align="right">1.4250</td>
<td style="border: 1px solid #000000;" align="right">0.0440</td>
</tr>
<tr>
<td style="border: 1px solid #000000;" height="17" align="right" bgcolor="#008080"><strong><span style="color: #ffffff;">1400&#215;1400</span></strong></td>
<td style="border: 1px solid #000000;" align="right">81.1826</td>
<td style="border: 1px solid #000000;" align="right">17.1119</td>
<td style="border: 1px solid #000000;" align="right">4.8309</td>
<td style="border: 1px solid #000000;" align="right">3.5977</td>
<td style="border: 1px solid #000000;" align="right">1.7760</td>
<td style="border: 1px solid #000000;" align="right">0.0938</td>
</tr>
<tr>
<td style="border: 1px solid #000000;" height="17" align="right" bgcolor="#008080"><strong><span style="color: #ffffff;">1500&#215;1500</span></strong></td>
<td style="border: 1px solid #000000;" align="right">100.1330</td>
<td style="border: 1px solid #000000;" align="right">21.0622</td>
<td style="border: 1px solid #000000;" align="right">6.1689</td>
<td style="border: 1px solid #000000;" align="right">4.8022</td>
<td style="border: 1px solid #000000;" align="right">2.1870</td>
<td style="border: 1px solid #000000;" align="right">0.1111</td>
</tr>
<tr>
<td style="border: 1px solid #000000;" height="17" align="right" bgcolor="#008080"><strong><span style="color: #ffffff;">1600&#215;1600</span></strong></td>
<td style="border: 1px solid #000000;" align="right">120.3400</td>
<td style="border: 1px solid #000000;" align="right">26.4316</td>
<td style="border: 1px solid #000000;" align="right">7.3189</td>
<td style="border: 1px solid #000000;" align="right">5.0451</td>
<td style="border: 1px solid #000000;" align="right">2.6490</td>
<td style="border: 1px solid #000000;" align="right">0.1699</td>
</tr>
<tr>
<td style="border: 1px solid #000000;" height="17" align="right" bgcolor="#008080"><strong><span style="color: #ffffff;">1700&#215;1700</span></strong></td>
<td style="border: 1px solid #000000;" align="right">145.8550</td>
<td style="border: 1px solid #000000;" align="right">31.2706</td>
<td style="border: 1px solid #000000;" align="right">8.7525</td>
<td style="border: 1px solid #000000;" align="right">6.8915</td>
<td style="border: 1px solid #000000;" align="right">3.1870</td>
<td style="border: 1px solid #000000;" align="right">0.1452</td>
</tr>
<tr>
<td style="border: 1px solid #000000;" height="17" align="right" bgcolor="#008080"><strong><span style="color: #ffffff;">1800&#215;1800</span></strong></td>
<td style="border: 1px solid #000000;" align="right">174.6860</td>
<td style="border: 1px solid #000000;" align="right">38.9293</td>
<td style="border: 1px solid #000000;" align="right">11.1060</td>
<td style="border: 1px solid #000000;" align="right">8.1316</td>
<td style="border: 1px solid #000000;" align="right">3.7940</td>
<td style="border: 1px solid #000000;" align="right">0.1989</td>
</tr>
<tr>
<td style="border: 1px solid #000000;" height="17" align="right" bgcolor="#008080"><strong><span style="color: #ffffff;">1900&#215;1900</span></strong></td>
<td style="border: 1px solid #000000;" align="right">206.0520</td>
<td style="border: 1px solid #000000;" align="right">45.8589</td>
<td style="border: 1px solid #000000;" align="right">13.0832</td>
<td style="border: 1px solid #000000;" align="right">9.9527</td>
<td style="border: 1px solid #000000;" align="right">4.4450</td>
<td style="border: 1px solid #000000;" align="right">0.2725</td>
</tr>
<tr>
<td style="border: 1px solid #000000;" height="17" align="right" bgcolor="#008080"><strong><span style="color: #ffffff;">2000&#215;2000</span></strong></td>
<td style="border: 1px solid #000000;" align="right">240.7820</td>
<td style="border: 1px solid #000000;" align="right">55.4392</td>
<td style="border: 1px solid #000000;" align="right">16.0542</td>
<td style="border: 1px solid #000000;" align="right">11.0314</td>
<td style="border: 1px solid #000000;" align="right">5.1820</td>
<td style="border: 1px solid #000000;" align="right">0.3359</td>
</tr>
</tbody>
</table>
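<p>The ratios quoted in the text can be recomputed from any row of the table. The helpers below are a small sketch with hypothetical names (<strong><em>speedup</em></strong>, <strong><em>gflops</em></strong>); the throughput figure assumes the conventional count of 2n^3 floating point operations for an n&#215;n multiplication, which is an estimate and not something measured in this article.</p>
<pre class="brush: cpp;">
// How many times faster the second timing is than the first.
double speedup(double slowSeconds, double fastSeconds)
{
    return slowSeconds / fastSeconds;
}

// Approximate throughput in GFLOPS for an n x n multiplication,
// assuming 2 * n^3 floating point operations (n^3 multiplies and n^3 adds).
double gflops(long n, double seconds)
{
    const double ops = 2.0 * n * n * n;
    return ops / seconds / 1e9;
}
</pre>
<p>For the 2000&#215;2000 row, speedup(11.0314, 0.3359) compares OMP 1D with cBLAS and comes out at a little under 33, consistent with the &#8220;more than thirty times&#8221; figure mentioned earlier.</p>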
<p style="text-align: left;">
The following figure shows the results. Since uBLAS and single threaded matrix multiplications took significantly longer to compute, I did not include them in the figure below.
</p>
<p style="text-align: center;">
<a href="http://www.kerrywong.com/blog/wp-content/uploads/2009/03/linearplot.png"><img class="size-full wp-image-645" title="Matrix Multiplication (Linear)" src="http://www.kerrywong.com/blog/wp-content/uploads/2009/03/linearplot.png" alt="Matrix Multiplication (Linear)" width="560" height="420" /></a></p>
<p style="text-align: left;">
The following figure shows the same data but uses log-scale Y axis instead so that all the data can show up nicely. You can get a sense of various algorithms&#8217; efficiencies here:</p>
<p style="text-align: center;">
<a href="http://www.kerrywong.com/blog/wp-content/uploads/2009/03/logplot.png"><img class="alignnone size-full wp-image-646"  title="Matrix Multiplication (Log)" src="http://www.kerrywong.com/blog/wp-content/uploads/2009/03/logplot.png" alt="Matrix Multiplication (Log)" width="560" height="420" /></a></p>
]]></content:encoded>
			<wfw:commentRss>http://www.kerrywong.com/2009/03/07/matrix-multiplication-performance-in-c/feed/</wfw:commentRss>
		<slash:comments>11</slash:comments>
		</item>
		<item>
		<title>PD1001 Webcam on Hardy Heron</title>
		<link>http://www.kerrywong.com/2008/10/11/pd1001-webcam-on-hardy-heron/</link>
		<comments>http://www.kerrywong.com/2008/10/11/pd1001-webcam-on-hardy-heron/#comments</comments>
		<pubDate>Sun, 12 Oct 2008 01:07:18 +0000</pubDate>
		<dc:creator>kwong</dc:creator>
				<category><![CDATA[Coding]]></category>
		<category><![CDATA[Linux/BSD]]></category>
		<category><![CDATA[C++]]></category>
		<category><![CDATA[Ubuntu]]></category>
		<category><![CDATA[Webcam]]></category>

		<guid isPermaLink="false">http://www.kerrywong.com/?p=390</guid>
		<description><![CDATA[I have an old webcam (Creative PD1001) which is not officially supported on Linux. Fortunately, Endpoints EPCAM USB Camera Driver is known to work with PD1001 on many Linux distros. 
To get the driver built on Ubuntu 8.04 however, I needed to make some minor changes to epcam.c. The kernel version I was compiling against [...]]]></description>
			<content:encoded><![CDATA[<p>I have an old webcam (Creative PD1001) which is not officially supported on Linux. Fortunately, <a href="http://ubuntuforums.org/showpost.php?p=2626919&amp;postcount=29">Endpoints EPCAM USB Camera Driver</a> is known to work with PD1001 on many Linux distros. <span id="more-390"></span></p>
<p>To build the driver on Ubuntu 8.04, however, I needed to make some minor changes to epcam.c. The kernel version I was compiling against was 2.6.24-19-generic. To get the 0.7.3 driver to build successfully, I needed to comment out #include &lt;linux/config.h&gt; and then run</p>
<p>KBUILD_NOPEDANTIC=1 make install</p>
<p>If you do not want to set up the environment variables for the build, you could just modify the #include statements to point to where the kernel header files are located:</p>
<div>#include &quot;/usr/src/linux-headers-2.6.24-19-generic/include/linux/module.h&quot;</div>
<div>#include &quot;/usr/src/linux-headers-2.6.24-19-generic/include/linux/version.h&quot;</div>
<div>#include &quot;/usr/src/linux-headers-2.6.24-19-generic/include/linux/init.h&quot;</div>
<div>#include &quot;/usr/src/linux-headers-2.6.24-19-generic/include/linux/fs.h&quot;</div>
<div>#include &quot;/usr/src/linux-headers-2.6.24-19-generic/include/linux/vmalloc.h&quot;</div>
<div>#include &quot;/usr/src/linux-headers-2.6.24-19-generic/include/linux/slab.h&quot;</div>
<div>#include &quot;/usr/src/linux-headers-2.6.24-19-generic/include/linux/proc_fs.h&quot;</div>
<div>#include &quot;/usr/src/linux-headers-2.6.24-19-generic/include/linux/pagemap.h&quot;</div>
<div>#include &quot;/usr/src/linux-headers-2.6.24-19-generic/include/linux/usb.h&quot;</div>
<div>#include &quot;/usr/src/linux-headers-2.6.24-19-generic/include/asm/io.h&quot;</div>
<div>#include &quot;/usr/src/linux-headers-2.6.24-19-generic/include/asm/semaphore.h&quot;</div>
<p>&nbsp;</p>
<p>After the build the driver is automatically installed and when the webcam is plugged in, it should be recognized by apps such as camorama (note: /dev/video0 is automatically created when the webcam is connected).</p>
<p>&nbsp;</p>
]]></content:encoded>
			<wfw:commentRss>http://www.kerrywong.com/2008/10/11/pd1001-webcam-on-hardy-heron/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>TBB Mandelbrot Set</title>
		<link>http://www.kerrywong.com/2008/09/13/tbb-mandelbrot-set/</link>
		<comments>http://www.kerrywong.com/2008/09/13/tbb-mandelbrot-set/#comments</comments>
		<pubDate>Sun, 14 Sep 2008 02:35:48 +0000</pubDate>
		<dc:creator>kwong</dc:creator>
				<category><![CDATA[Coding]]></category>
		<category><![CDATA[C++]]></category>
		<category><![CDATA[Mandelbrot]]></category>
		<category><![CDATA[Multi-threading]]></category>
		<category><![CDATA[TBB]]></category>

		<guid isPermaLink="false">http://www.kerrywong.com/?p=352</guid>
		<description><![CDATA[In an earlier post, I created a simple prime finding program using Intel&#8217;s TBB (Threading Building Blocks). The main benefit of using TBB is that threading and thread synchronization mechanisms are abstracted away within the TBB library so we do not need to deal with threads explicitly. Also, TBB is optimized for performance and scales [...]]]></description>
			<content:encoded><![CDATA[<p>In an <a href="/2008/06/22/a-simple-tbb-program-tbb-prime/">earlier post</a>, I created a simple prime finding program using Intel&#8217;s TBB (<a href="http://www.threadingbuildingblocks.org/">Threading Building Blocks</a>). The main benefit of using TBB is that threading and thread synchronization mechanisms are abstracted away within the TBB library so we do not need to deal with threads explicitly. Also, TBB is optimized for performance and scales nicely as the number of processing units increases.<span id="more-352"></span> In this post, I will show you how to create a <a href="http://en.wikipedia.org/wiki/Mandelbrot_set">Mandelbrot Set</a> generator using TBB and how to optimize the algorithm using loop unrolling.</p>
<p>The standard algorithm for generating the Mandelbrot Set is extremely easy to adapt to TBB. In fact, the loops look almost identical to those in the single-threaded approach, except that the iterations are calculated within a 2D range block (<strong>blocked_range2d</strong>) instead of over the entire two dimensional space.</p>
<div style="background: white none repeat scroll 0% 0%; font-family: Courier New; font-size: 10pt; color: black; -moz-background-clip: -moz-initial; -moz-background-origin: -moz-initial; -moz-background-inline-policy: -moz-initial;">
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; <span style="color: blue; font-weight: bold;">void</span> <span style="color: blue; font-weight: bold;">operator</span><span style="color: rgb(128, 128, 192); font-weight: bold;">()</span> <span style="color: rgb(128, 128, 192); font-weight: bold;">(</span><span style="color: blue; font-weight: bold;">const</span> blocked_range2d<span style="color: rgb(128, 128, 192); font-weight: bold;">&lt;</span>size_t<span style="color: rgb(128, 128, 192); font-weight: bold;">&gt;</span> <span style="color: rgb(128, 128, 192); font-weight: bold;">&amp;</span>r<span style="color: rgb(128, 128, 192); font-weight: bold;">)</span> <span style="color: blue; font-weight: bold;">const</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; <span style="color: rgb(128, 128, 192); font-weight: bold;">{</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; drawing_area drawing<span style="color: rgb(128, 128, 192); font-weight: bold;">(</span><span style="color: teal;">0</span><span style="color: rgb(128, 128, 192); font-weight: bold;">,</span> <span style="color: teal;">0</span><span style="color: rgb(128, 128, 192); font-weight: bold;">,</span> screen_size<span style="color: rgb(128, 128, 192); font-weight: bold;">,</span>screen_size<span style="color: rgb(128, 128, 192); font-weight: bold;">);</span></div>
<div style="margin: 0px;">&nbsp;</div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <span style="color: blue; font-weight: bold;">for</span> <span style="color: rgb(128, 128, 192); font-weight: bold;">(</span>size_t x <span style="color: rgb(128, 128, 192); font-weight: bold;">=</span> r<span style="color: rgb(128, 128, 192); font-weight: bold;">.</span>rows<span style="color: rgb(128, 128, 192); font-weight: bold;">().</span>begin<span style="color: rgb(128, 128, 192); font-weight: bold;">();</span> x <span style="color: rgb(128, 128, 192); font-weight: bold;">&lt;</span> r<span style="color: rgb(128, 128, 192); font-weight: bold;">.</span>rows<span style="color: rgb(128, 128, 192); font-weight: bold;">().</span>end<span style="color: rgb(128, 128, 192); font-weight: bold;">();</span> x<span style="color: rgb(128, 128, 192); font-weight: bold;">++)</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <span style="color: rgb(128, 128, 192); font-weight: bold;">{</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <span style="color: blue; font-weight: bold;">for</span> <span style="color: rgb(128, 128, 192); font-weight: bold;">(</span>size_t y <span style="color: rgb(128, 128, 192); font-weight: bold;">=</span> r<span style="color: rgb(128, 128, 192); font-weight: bold;">.</span>cols<span style="color: rgb(128, 128, 192); font-weight: bold;">().</span>begin<span style="color: rgb(128, 128, 192); font-weight: bold;">();</span> y <span style="color: rgb(128, 128, 192); font-weight: bold;">&lt;</span> r<span style="color: rgb(128, 128, 192); font-weight: bold;">.</span>cols<span style="color: rgb(128, 128, 192); font-weight: bold;">().</span>end<span style="color: rgb(128, 128, 192); font-weight: bold;">();</span> y<span style="color: rgb(128, 128, 192); font-weight: bold;">++)</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <span style="color: rgb(128, 128, 192); font-weight: bold;">{</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <span style="color: blue; font-weight: bold;">float</span> zx <span style="color: rgb(128, 128, 192); font-weight: bold;">=</span> <span style="color: teal;">0</span><span style="color: rgb(128, 128, 192); font-weight: bold;">;</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <span style="color: blue; font-weight: bold;">float</span> zy <span style="color: rgb(128, 128, 192); font-weight: bold;">=</span> <span style="color: teal;">0</span><span style="color: rgb(128, 128, 192); font-weight: bold;">;</span></div>
<div style="margin: 0px;">&nbsp;</div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <span style="color: blue; font-weight: bold;">float</span> cx <span style="color: rgb(128, 128, 192); font-weight: bold;">=</span> <span style="color: rgb(128, 128, 192); font-weight: bold;">(</span><span style="color: blue; font-weight: bold;">float</span><span style="color: rgb(128, 128, 192); font-weight: bold;">)</span>x <span style="color: rgb(128, 128, 192); font-weight: bold;">/</span> <span style="color: rgb(128, 128, 192); font-weight: bold;">(</span><span style="color: blue; font-weight: bold;">float</span><span style="color: rgb(128, 128, 192); font-weight: bold;">)</span>screen_size <span style="color: rgb(128, 128, 192); font-weight: bold;">*</span> x_range <span style="color: rgb(128, 128, 192); font-weight: bold;">+</span> x_min<span style="color: rgb(128, 128, 192); font-weight: bold;">;</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <span style="color: blue; font-weight: bold;">float</span> cy <span style="color: rgb(128, 128, 192); font-weight: bold;">=</span> <span style="color: rgb(128, 128, 192); font-weight: bold;">(</span><span style="color: blue; font-weight: bold;">float</span><span style="color: rgb(128, 128, 192); font-weight: bold;">)</span>y <span style="color: rgb(128, 128, 192); font-weight: bold;">/</span> <span style="color: rgb(128, 128, 192); font-weight: bold;">(</span><span style="color: blue; font-weight: bold;">float</span><span style="color: rgb(128, 128, 192); font-weight: bold;">)</span>screen_size <span style="color: rgb(128, 128, 192); font-weight: bold;">*</span> y_range <span style="color: rgb(128, 128, 192); font-weight: bold;">+</span> y_min<span style="color: rgb(128, 128, 192); font-weight: bold;">;</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <span style="color: blue; font-weight: bold;">int</span> i <span style="color: rgb(128, 128, 192); font-weight: bold;">=</span> <span style="color: teal;">0</span><span style="color: rgb(128, 128, 192); font-weight: bold;">;</span></div>
<div style="margin: 0px;">&nbsp;</div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <span style="color: blue; font-weight: bold;">while</span> <span style="color: rgb(128, 128, 192); font-weight: bold;">(</span>zx <span style="color: rgb(128, 128, 192); font-weight: bold;">*</span> zx <span style="color: rgb(128, 128, 192); font-weight: bold;">+</span> zy <span style="color: rgb(128, 128, 192); font-weight: bold;">*</span> zy <span style="color: rgb(128, 128, 192); font-weight: bold;">&lt;=</span> <span style="color: teal;">4</span> <span style="color: rgb(128, 128, 192); font-weight: bold;">&amp;&amp;</span> i <span style="color: rgb(128, 128, 192); font-weight: bold;">&lt;</span> max_iteration<span style="color: rgb(128, 128, 192); font-weight: bold;">)</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <span style="color: rgb(128, 128, 192); font-weight: bold;">{</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <span style="color: blue; font-weight: bold;">float</span> xtemp <span style="color: rgb(128, 128, 192); font-weight: bold;">=</span> zx <span style="color: rgb(128, 128, 192); font-weight: bold;">*</span> zx <span style="color: rgb(128, 128, 192); font-weight: bold;">-</span> zy <span style="color: rgb(128, 128, 192); font-weight: bold;">*</span> zy <span style="color: rgb(128, 128, 192); font-weight: bold;">+</span> cx<span style="color: rgb(128, 128, 192); font-weight: bold;">;</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; zy <span style="color: rgb(128, 128, 192); font-weight: bold;">=</span> <span style="color: teal;">2</span> <span style="color: rgb(128, 128, 192); font-weight: bold;">*</span> zx <span style="color: rgb(128, 128, 192); font-weight: bold;">*</span> zy <span style="color: rgb(128, 128, 192); font-weight: bold;">+</span> cy<span style="color: rgb(128, 128, 192); font-weight: bold;">;</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; zx <span style="color: rgb(128, 128, 192); font-weight: bold;">=</span> xtemp<span style="color: rgb(128, 128, 192); font-weight: bold;">;</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; i<span style="color: rgb(128, 128, 192); font-weight: bold;">++;</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <span style="color: rgb(128, 128, 192); font-weight: bold;">}</span></div>
<div style="margin: 0px;">&nbsp;</div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <span style="color: blue; font-weight: bold;">int</span> itr <span style="color: rgb(128, 128, 192); font-weight: bold;">=</span> i <span style="color: rgb(128, 128, 192); font-weight: bold;">%</span> <span style="color: teal;">255</span><span style="color: rgb(128, 128, 192); font-weight: bold;">;</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; color_t c <span style="color: rgb(128, 128, 192); font-weight: bold;">=</span> itr <span style="color: rgb(128, 128, 192); font-weight: bold;">&lt;&lt;</span> <span style="color: teal;">16</span> <span style="color: rgb(128, 128, 192); font-weight: bold;">|</span> itr <span style="color: rgb(128, 128, 192); font-weight: bold;">&lt;&lt;</span> <span style="color: teal;">8</span> <span style="color: rgb(128, 128, 192); font-weight: bold;">|</span> itr<span style="color: rgb(128, 128, 192); font-weight: bold;">;</span></div>
<div style="margin: 0px;">&nbsp;</div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; drawing<span style="color: rgb(128, 128, 192); font-weight: bold;">.</span>set_pixel<span style="color: rgb(128, 128, 192); font-weight: bold;">(</span>x<span style="color: rgb(128, 128, 192); font-weight: bold;">,</span>y<span style="color: rgb(128, 128, 192); font-weight: bold;">,</span>c<span style="color: rgb(128, 128, 192); font-weight: bold;">);</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <span style="color: rgb(128, 128, 192); font-weight: bold;">}</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <span style="color: rgb(128, 128, 192); font-weight: bold;">}</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; <span style="color: rgb(128, 128, 192); font-weight: bold;">}</span></div>
</div>
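<p>The per-pixel work in the listing above can also be looked at in isolation. The following is a minimal sketch, not part of the original program: the helper names <strong>escape_iterations</strong> and <strong>gray_from_iterations</strong> are hypothetical, and the color is assumed to be a packed 0xRRGGBB value. It only restates the escape-time loop and the gray-scale mapping so that they can be checked without TBB or X11.</p>

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not in the original program): counts how many
// iterations of z = z*z + c keep |z|^2 <= 4, capped at max_iteration.
int escape_iterations(float cx, float cy, int max_iteration)
{
    assert(max_iteration >= 0);
    float zx = 0.0f;
    float zy = 0.0f;
    int i = 0;

    while (zx * zx + zy * zy <= 4.0f && i < max_iteration)
    {
        float xtemp = zx * zx - zy * zy + cx;  // real part of z*z + c
        zy = 2.0f * zx * zy + cy;              // imaginary part of z*z + c
        zx = xtemp;
        i++;
    }
    return i;
}

// Hypothetical helper: packs an iteration count into a gray value using
// the same "i % 255" mapping as the listing above (assumes 0xRRGGBB).
std::uint32_t gray_from_iterations(int i)
{
    assert(i >= 0);
    std::uint32_t itr = static_cast<std::uint32_t>(i % 255);
    return itr << 16 | itr << 8 | itr;
}
```

<p>A point inside the set, such as c = 0, runs all the way to the iteration cap, while a point far outside, such as c = 2 + 2i, escapes after a single iteration.</p>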
<p>Because Xlib by itself is not thread-safe, special attention must be paid when updating the display concurrently. One way to address this issue is to employ a shared memory region (<strong>X11/extensions/XShm.h</strong> and <strong>sys/shm.h</strong>): the display is first built in memory, and then the shared memory is attached to the display. In my examples I used code (<strong>video.h</strong>, <strong>xvideo.cpp</strong>) from the samples that come with the TBB library, which uses this shared memory method to make the X11 calls thread-safe.</p>
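<p>The general pattern of building the picture in memory first and touching the display only once can be sketched without X11 at all. The code below is an illustration under stated assumptions rather than the TBB sample code: a plain <strong>std::vector</strong> stands in for the shared memory image, <strong>std::thread</strong> stands in for the TBB workers, and the function name <strong>render_offscreen</strong> is hypothetical.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <thread>
#include <vector>

// Hypothetical stand-in for the shared-memory image: a plain pixel buffer
// that worker threads fill without calling into the display library.
std::vector<std::uint32_t> render_offscreen(std::size_t width,
                                            std::size_t height,
                                            std::size_t num_threads)
{
    assert(num_threads > 0);
    std::vector<std::uint32_t> pixels(width * height, 0);
    std::vector<std::thread> workers;

    for (std::size_t t = 0; t < num_threads; ++t)
    {
        workers.emplace_back([&pixels, width, height, num_threads, t]() {
            // Each worker owns rows t, t + num_threads, ... so no two
            // threads ever write the same pixel and no locking is needed.
            for (std::size_t y = t; y < height; y += num_threads)
                for (std::size_t x = 0; x < width; ++x)
                    pixels[y * width + x] = static_cast<std::uint32_t>(t + 1);
        });
    }

    for (std::thread &w : workers)
        w.join();

    // Only after all workers have finished would the buffer be handed to
    // the display, from a single thread.
    return pixels;
}
```

<p>Here each pixel simply records which worker wrote it. In a real renderer the value would be the computed color, but the division of labor is the same: concurrent writes go to memory, and the display is updated from one place.</p>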
<p>Many optimization methods can be used to further enhance the performance of the algorithm. One of the most effective is to use the SSE instructions found on all modern Intel processors (examples can be found here: <a href="http://softwarecommunity.intel.com/articles/eng/3426.htm">Using SSE3 Technology in Algorithms with Complex Arithmetic</a>). This approach, however, can be difficult to implement and debug, since parallel data structures must be used in order to benefit from SSE instructions. Explicit assembly-level coding also makes porting the code to other machine architectures a daunting task. Modern compilers can already take good advantage of the underlying machine architecture; for example, gcc (4.2.3) already generates SSE instructions for the code snippet above. While hand-tweaking with SSE instructions might further improve performance, we would certainly sacrifice code simplicity and portability.</p>
<p>The approach I am going to take to further optimize the code is loop unrolling. Since the inner loop of the standard algorithm is pretty short, unrolling it should reduce the loop overhead and decrease the chances of stalling the pipeline (which can happen when a branch is mispredicted). The unrolling is done by hand at the source level: two adjacent points (x and x + 1) are iterated in the same loop body, so this high-level loop unrolling should be able to improve the performance.</p>
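<p>Before the full TBB listing, the idea can be illustrated on its own. The following is a minimal standalone sketch, assuming two points that share the same imaginary part; the function name <strong>escape_iterations_pair</strong> and the use of <strong>std::pair</strong> for the result are my own choices for illustration and do not appear in the actual program.</p>

```cpp
#include <cassert>
#include <utility>

// Hypothetical standalone sketch: iterates two points (cx1, cy) and
// (cx2, cy) in one loop body and returns both iteration counts.
std::pair<int, int> escape_iterations_pair(float cx1, float cx2, float cy,
                                           int max_iteration)
{
    assert(max_iteration >= 0);
    float zx1 = 0.0f, zy1 = 0.0f;
    float zx2 = 0.0f, zy2 = 0.0f;
    int i1 = 0, i2 = 0;
    bool stop1 = false, stop2 = false;

    // Keep going until both points have either escaped or hit the cap.
    while (!(stop1 && stop2))
    {
        if (zx1 * zx1 + zy1 * zy1 <= 4.0f && i1 < max_iteration)
        {
            float xtemp1 = zx1 * zx1 - zy1 * zy1 + cx1;
            zy1 = 2.0f * zx1 * zy1 + cy;
            zx1 = xtemp1;
            i1++;
        }
        else
        {
            stop1 = true;  // z1 is no longer updated, so this stays true
        }

        if (zx2 * zx2 + zy2 * zy2 <= 4.0f && i2 < max_iteration)
        {
            float xtemp2 = zx2 * zx2 - zy2 * zy2 + cx2;
            zy2 = 2.0f * zx2 * zy2 + cy;
            zx2 = xtemp2;
            i2++;
        }
        else
        {
            stop2 = true;  // z2 is no longer updated, so this stays true
        }
    }
    return std::make_pair(i1, i2);
}
```

<p>Each point produces the same count it would get from a loop of its own; the only change is that the two independent iterations share one loop, which gives the processor two unrelated chains of arithmetic to overlap.</p>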
<p>Here is the code after the inner loop is unrolled:</p>
<div style="background: white none repeat scroll 0% 0%; font-family: Courier New; font-size: 10pt; color: black; -moz-background-clip: -moz-initial; -moz-background-origin: -moz-initial; -moz-background-inline-policy: -moz-initial;">
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; <span style="color: blue; font-weight: bold;">void</span> <span style="color: blue; font-weight: bold;">operator</span><span style="color: rgb(128, 128, 192); font-weight: bold;">()</span> <span style="color: rgb(128, 128, 192); font-weight: bold;">(</span><span style="color: blue; font-weight: bold;">const</span> blocked_range2d<span style="color: rgb(128, 128, 192); font-weight: bold;">&lt;</span>size_t<span style="color: rgb(128, 128, 192); font-weight: bold;">&gt;</span> <span style="color: rgb(128, 128, 192); font-weight: bold;">&amp;</span>r<span style="color: rgb(128, 128, 192); font-weight: bold;">)</span> <span style="color: blue; font-weight: bold;">const</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; <span style="color: rgb(128, 128, 192); font-weight: bold;">{</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; drawing_area drawing<span style="color: rgb(128, 128, 192); font-weight: bold;">(</span><span style="color: teal;">0</span><span style="color: rgb(128, 128, 192); font-weight: bold;">,</span> <span style="color: teal;">0</span><span style="color: rgb(128, 128, 192); font-weight: bold;">,</span> screen_size<span style="color: rgb(128, 128, 192); font-weight: bold;">,</span>screen_size<span style="color: rgb(128, 128, 192); font-weight: bold;">);</span></div>
<div style="margin: 0px;">&nbsp;</div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <span style="color: blue; font-weight: bold;">for</span> <span style="color: rgb(128, 128, 192); font-weight: bold;">(</span>size_t x <span style="color: rgb(128, 128, 192); font-weight: bold;">=</span> r<span style="color: rgb(128, 128, 192); font-weight: bold;">.</span>rows<span style="color: rgb(128, 128, 192); font-weight: bold;">().</span>begin<span style="color: rgb(128, 128, 192); font-weight: bold;">();</span> x <span style="color: rgb(128, 128, 192); font-weight: bold;">&lt;</span> r<span style="color: rgb(128, 128, 192); font-weight: bold;">.</span>rows<span style="color: rgb(128, 128, 192); font-weight: bold;">().</span>end<span style="color: rgb(128, 128, 192); font-weight: bold;">();</span> x<span style="color: rgb(128, 128, 192); font-weight: bold;">+=</span><span style="color: teal;">2</span><span style="color: rgb(128, 128, 192); font-weight: bold;">)</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <span style="color: rgb(128, 128, 192); font-weight: bold;">{</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <span style="color: blue; font-weight: bold;">for</span> <span style="color: rgb(128, 128, 192); font-weight: bold;">(</span>size_t y <span style="color: rgb(128, 128, 192); font-weight: bold;">=</span> r<span style="color: rgb(128, 128, 192); font-weight: bold;">.</span>cols<span style="color: rgb(128, 128, 192); font-weight: bold;">().</span>begin<span style="color: rgb(128, 128, 192); font-weight: bold;">();</span> y <span style="color: rgb(128, 128, 192); font-weight: bold;">&lt;</span> r<span style="color: rgb(128, 128, 192); font-weight: bold;">.</span>cols<span style="color: rgb(128, 128, 192); font-weight: bold;">().</span>end<span style="color: rgb(128, 128, 192); font-weight: bold;">();</span> y<span style="color: rgb(128, 128, 192); font-weight: bold;">++)</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <span style="color: rgb(128, 128, 192); font-weight: bold;">{</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <span style="color: blue; font-weight: bold;">float</span> zx1 <span style="color: rgb(128, 128, 192); font-weight: bold;">=</span> <span style="color: teal;">0</span><span style="color: rgb(128, 128, 192); font-weight: bold;">;</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <span style="color: blue; font-weight: bold;">float</span> zx2 <span style="color: rgb(128, 128, 192); font-weight: bold;">=</span> <span style="color: teal;">0</span><span style="color: rgb(128, 128, 192); font-weight: bold;">;</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <span style="color: blue; font-weight: bold;">float</span> zy1 <span style="color: rgb(128, 128, 192); font-weight: bold;">=</span> <span style="color: teal;">0</span><span style="color: rgb(128, 128, 192); font-weight: bold;">;</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <span style="color: blue; font-weight: bold;">float</span> zy2 <span style="color: rgb(128, 128, 192); font-weight: bold;">=</span> <span style="color: teal;">0</span><span style="color: rgb(128, 128, 192); font-weight: bold;">;</span></div>
<div style="margin: 0px;">&nbsp;</div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <span style="color: blue; font-weight: bold;">float</span> cx1 <span style="color: rgb(128, 128, 192); font-weight: bold;">=</span> <span style="color: rgb(128, 128, 192); font-weight: bold;">(</span><span style="color: blue; font-weight: bold;">float</span><span style="color: rgb(128, 128, 192); font-weight: bold;">)</span>x <span style="color: rgb(128, 128, 192); font-weight: bold;">/</span> <span style="color: rgb(128, 128, 192); font-weight: bold;">(</span><span style="color: blue; font-weight: bold;">float</span><span style="color: rgb(128, 128, 192); font-weight: bold;">)</span>screen_size <span style="color: rgb(128, 128, 192); font-weight: bold;">*</span> x_range <span style="color: rgb(128, 128, 192); font-weight: bold;">+</span> x_min<span style="color: rgb(128, 128, 192); font-weight: bold;">;</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <span style="color: blue; font-weight: bold;">float</span> cx2 <span style="color: rgb(128, 128, 192); font-weight: bold;">=</span> <span style="color: rgb(128, 128, 192); font-weight: bold;">(</span><span style="color: blue; font-weight: bold;">float</span><span style="color: rgb(128, 128, 192); font-weight: bold;">)(</span>x <span style="color: rgb(128, 128, 192); font-weight: bold;">+</span> <span style="color: teal;">1</span><span style="color: rgb(128, 128, 192); font-weight: bold;">)</span> <span style="color: rgb(128, 128, 192); font-weight: bold;">/</span> <span style="color: rgb(128, 128, 192); font-weight: bold;">(</span><span style="color: blue; font-weight: bold;">float</span><span style="color: rgb(128, 128, 192); font-weight: bold;">)</span>screen_size <span style="color: rgb(128, 128, 192); font-weight: bold;">*</span> x_range <span style="color: rgb(128, 128, 192); font-weight: bold;">+</span> x_min<span style="color: rgb(128, 128, 192); font-weight: bold;">;</span></div>
<div style="margin: 0px;">&nbsp;</div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <span style="color: blue; font-weight: bold;">float</span> cy <span style="color: rgb(128, 128, 192); font-weight: bold;">=</span> <span style="color: rgb(128, 128, 192); font-weight: bold;">(</span><span style="color: blue; font-weight: bold;">float</span><span style="color: rgb(128, 128, 192); font-weight: bold;">)</span>y <span style="color: rgb(128, 128, 192); font-weight: bold;">/</span> <span style="color: rgb(128, 128, 192); font-weight: bold;">(</span><span style="color: blue; font-weight: bold;">float</span><span style="color: rgb(128, 128, 192); font-weight: bold;">)</span>screen_size <span style="color: rgb(128, 128, 192); font-weight: bold;">*</span> y_range <span style="color: rgb(128, 128, 192); font-weight: bold;">+</span> y_min<span style="color: rgb(128, 128, 192); font-weight: bold;">;</span></div>
<div style="margin: 0px;">&nbsp;</div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <span style="color: blue; font-weight: bold;">int</span> i1 <span style="color: rgb(128, 128, 192); font-weight: bold;">=</span> <span style="color: teal;">0</span><span style="color: rgb(128, 128, 192); font-weight: bold;">;</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <span style="color: blue; font-weight: bold;">int</span> i2 <span style="color: rgb(128, 128, 192); font-weight: bold;">=</span> <span style="color: teal;">0</span><span style="color: rgb(128, 128, 192); font-weight: bold;">;</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <span style="color: blue; font-weight: bold;">bool</span> loop_stop1 <span style="color: rgb(128, 128, 192); font-weight: bold;">=</span> <span style="color: blue; font-weight: bold;">false</span><span style="color: rgb(128, 128, 192); font-weight: bold;">;</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <span style="color: blue; font-weight: bold;">bool</span> loop_stop2 <span style="color: rgb(128, 128, 192); font-weight: bold;">=</span> <span style="color: blue; font-weight: bold;">false</span><span style="color: rgb(128, 128, 192); font-weight: bold;">;</span></div>
<div style="margin: 0px;">&nbsp;</div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <span style="color: blue; font-weight: bold;">while</span> <span style="color: rgb(128, 128, 192); font-weight: bold;">(</span> <span style="color: rgb(128, 128, 192); font-weight: bold;">!(</span>loop_stop1 <span style="color: rgb(128, 128, 192); font-weight: bold;">&amp;&amp;</span> loop_stop2<span style="color: rgb(128, 128, 192); font-weight: bold;">))</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <span style="color: rgb(128, 128, 192); font-weight: bold;">{</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <span style="color: blue; font-weight: bold;">float</span> xtemp1<span style="color: rgb(128, 128, 192); font-weight: bold;">;</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <span style="color: blue; font-weight: bold;">float</span> xtemp2<span style="color: rgb(128, 128, 192); font-weight: bold;">;</span></div>
<div style="margin: 0px;">&nbsp;</div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <span style="color: blue; font-weight: bold;">if</span> <span style="color: rgb(128, 128, 192); font-weight: bold;">((</span>zx1 <span style="color: rgb(128, 128, 192); font-weight: bold;">*</span> zx1 <span style="color: rgb(128, 128, 192); font-weight: bold;">+</span> zy1 <span style="color: rgb(128, 128, 192); font-weight: bold;">*</span> zy1 <span style="color: rgb(128, 128, 192); font-weight: bold;">&lt;=</span> <span style="color: teal;">4</span><span style="color: rgb(128, 128, 192); font-weight: bold;">)</span> <span style="color: rgb(128, 128, 192); font-weight: bold;">&amp;&amp;</span> <span style="color: rgb(128, 128, 192); font-weight: bold;">(</span>i1 <span style="color: rgb(128, 128, 192); font-weight: bold;">&lt;</span> max_iteration<span style="color: rgb(128, 128, 192); font-weight: bold;">))</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <span style="color: rgb(128, 128, 192); font-weight: bold;">{</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; xtemp1 <span style="color: rgb(128, 128, 192); font-weight: bold;">=</span> zx1 <span style="color: rgb(128, 128, 192); font-weight: bold;">*</span> zx1 <span style="color: rgb(128, 128, 192); font-weight: bold;">-</span> zy1 <span style="color: rgb(128, 128, 192); font-weight: bold;">*</span> zy1 <span style="color: rgb(128, 128, 192); font-weight: bold;">+</span> cx1<span style="color: rgb(128, 128, 192); font-weight: bold;">;</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; zy1 <span style="color: rgb(128, 128, 192); font-weight: bold;">=</span> <span style="color: teal;">2</span> <span style="color: rgb(128, 128, 192); font-weight: bold;">*</span> zx1 <span style="color: rgb(128, 128, 192); font-weight: bold;">*</span> zy1 <span style="color: rgb(128, 128, 192); font-weight: bold;">+</span> cy<span style="color: rgb(128, 128, 192); font-weight: bold;">;</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; zx1 <span style="color: rgb(128, 128, 192); font-weight: bold;">=</span> xtemp1<span style="color: rgb(128, 128, 192); font-weight: bold;">;</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; i1<span style="color: rgb(128, 128, 192); font-weight: bold;">++;</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <span style="color: rgb(128, 128, 192); font-weight: bold;">}</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <span style="color: blue; font-weight: bold;">else</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <span style="color: rgb(128, 128, 192); font-weight: bold;">{</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; loop_stop1 <span style="color: rgb(128, 128, 192); font-weight: bold;">=</span> <span style="color: blue; font-weight: bold;">true</span><span style="color: rgb(128, 128, 192); font-weight: bold;">;</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <span style="color: rgb(128, 128, 192); font-weight: bold;">}</span></div>
<div style="margin: 0px;">&nbsp;</div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <span style="color: blue; font-weight: bold;">if</span> <span style="color: rgb(128, 128, 192); font-weight: bold;">((</span>zx2 <span style="color: rgb(128, 128, 192); font-weight: bold;">*</span> zx2 <span style="color: rgb(128, 128, 192); font-weight: bold;">+</span> zy2 <span style="color: rgb(128, 128, 192); font-weight: bold;">*</span> zy2 <span style="color: rgb(128, 128, 192); font-weight: bold;">&lt;=</span> <span style="color: teal;">4</span><span style="color: rgb(128, 128, 192); font-weight: bold;">)</span> <span style="color: rgb(128, 128, 192); font-weight: bold;">&amp;&amp;</span> <span style="color: rgb(128, 128, 192); font-weight: bold;">(</span>i2<span style="color: rgb(128, 128, 192); font-weight: bold;">&lt;</span> max_iteration<span style="color: rgb(128, 128, 192); font-weight: bold;">))</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <span style="color: rgb(128, 128, 192); font-weight: bold;">{</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; xtemp2 <span style="color: rgb(128, 128, 192); font-weight: bold;">=</span> zx2 <span style="color: rgb(128, 128, 192); font-weight: bold;">*</span> zx2 <span style="color: rgb(128, 128, 192); font-weight: bold;">-</span> zy2 <span style="color: rgb(128, 128, 192); font-weight: bold;">*</span> zy2 <span style="color: rgb(128, 128, 192); font-weight: bold;">+</span> cx2<span style="color: rgb(128, 128, 192); font-weight: bold;">;</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; zy2 <span style="color: rgb(128, 128, 192); font-weight: bold;">=</span> <span style="color: teal;">2</span> <span style="color: rgb(128, 128, 192); font-weight: bold;">*</span> zx2 <span style="color: rgb(128, 128, 192); font-weight: bold;">*</span> zy2 <span style="color: rgb(128, 128, 192); font-weight: bold;">+</span> cy<span style="color: rgb(128, 128, 192); font-weight: bold;">;</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; zx2 <span style="color: rgb(128, 128, 192); font-weight: bold;">=</span> xtemp2<span style="color: rgb(128, 128, 192); font-weight: bold;">;</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; i2<span style="color: rgb(128, 128, 192); font-weight: bold;">++;</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <span style="color: rgb(128, 128, 192); font-weight: bold;">}</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <span style="color: blue; font-weight: bold;">else</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <span style="color: rgb(128, 128, 192); font-weight: bold;">{</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; loop_stop2 <span style="color: rgb(128, 128, 192); font-weight: bold;">=</span> <span style="color: blue; font-weight: bold;">true</span><span style="color: rgb(128, 128, 192); font-weight: bold;">;</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <span style="color: rgb(128, 128, 192); font-weight: bold;">}</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <span style="color: rgb(128, 128, 192); font-weight: bold;">}</span></div>
<div style="margin: 0px;">&nbsp;</div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <span style="color: blue; font-weight: bold;">int</span> itr <span style="color: rgb(128, 128, 192); font-weight: bold;">=</span> <span style="color: teal;">0</span><span style="color: rgb(128, 128, 192); font-weight: bold;">;</span></div>
<div style="margin: 0px;">&nbsp;</div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; itr <span style="color: rgb(128, 128, 192); font-weight: bold;">=</span> <span style="color: rgb(128, 128, 192); font-weight: bold;">(</span>i1<span style="color: rgb(128, 128, 192); font-weight: bold;">)</span> <span style="color: rgb(128, 128, 192); font-weight: bold;">%</span> <span style="color: teal;">255</span><span style="color: rgb(128, 128, 192); font-weight: bold;">;</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; color_t c <span style="color: rgb(128, 128, 192); font-weight: bold;">=</span> itr <span style="color: rgb(128, 128, 192); font-weight: bold;">&lt;&lt;</span> <span style="color: teal;">16</span> <span style="color: rgb(128, 128, 192); font-weight: bold;">|</span> itr <span style="color: rgb(128, 128, 192); font-weight: bold;">&lt;&lt;</span> <span style="color: teal;">8</span> <span style="color: rgb(128, 128, 192); font-weight: bold;">|</span> itr<span style="color: rgb(128, 128, 192); font-weight: bold;">;</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; drawing<span style="color: rgb(128, 128, 192); font-weight: bold;">.</span>set_pixel<span style="color: rgb(128, 128, 192); font-weight: bold;">(</span>x<span style="color: rgb(128, 128, 192); font-weight: bold;">,</span>y<span style="color: rgb(128, 128, 192); font-weight: bold;">,</span>c<span style="color: rgb(128, 128, 192); font-weight: bold;">);</span></div>
<div style="margin: 0px;">&nbsp;</div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; itr <span style="color: rgb(128, 128, 192); font-weight: bold;">=</span> i2&nbsp; <span style="color: rgb(128, 128, 192); font-weight: bold;">%</span> <span style="color: teal;">255</span><span style="color: rgb(128, 128, 192); font-weight: bold;">;</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; c <span style="color: rgb(128, 128, 192); font-weight: bold;">=</span> itr <span style="color: rgb(128, 128, 192); font-weight: bold;">&lt;&lt;</span> <span style="color: teal;">16</span> <span style="color: rgb(128, 128, 192); font-weight: bold;">|</span> itr <span style="color: rgb(128, 128, 192); font-weight: bold;">&lt;&lt;</span> <span style="color: teal;">8</span> <span style="color: rgb(128, 128, 192); font-weight: bold;">|</span> itr<span style="color: rgb(128, 128, 192); font-weight: bold;">;</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; drawing<span style="color: rgb(128, 128, 192); font-weight: bold;">.</span>set_pixel<span style="color: rgb(128, 128, 192); font-weight: bold;">(</span>x<span style="color: rgb(128, 128, 192); font-weight: bold;">+</span><span style="color: teal;">1</span><span style="color: rgb(128, 128, 192); font-weight: bold;">,</span>y<span style="color: rgb(128, 128, 192); font-weight: bold;">,</span>c<span style="color: rgb(128, 128, 192); font-weight: bold;">);</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <span style="color: rgb(128, 128, 192); font-weight: bold;">}</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <span style="color: rgb(128, 128, 192); font-weight: bold;">}</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; <span style="color: rgb(128, 128, 192); font-weight: bold;">}</span></div>
</div>
<p>This code generates results identical to those of the code mentioned previously. As you can see, the inner loop is now unrolled to handle two data points at a time.</p>
<p>As it turned out, this algorithm runs almost twice as fast as the code mentioned earlier (280ms versus 510ms on an Intel Q9450 @ 3.4GHz).</p>
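<p>As a rough, standalone sketch of the two-points-at-a-time idea (this is not the code from the download; the function names <code>escape_counts_pair</code> and <code>gray_from_iterations</code>, and the escape radius of 2, are assumptions made purely for illustration), the unrolled inner loop and the gray-scale packing could look something like this:</p>

```cpp
#include <cstdint>
#include <utility>

// Hypothetical helper: pack an iteration count into a 0x00RRGGBB gray value,
// mirroring the "itr << 16 | itr << 8 | itr" expression in the listing.
// Assumes a non-negative iteration count.
std::uint32_t gray_from_iterations(int iterations)
{
    std::uint32_t v = static_cast<std::uint32_t>(iterations % 255);
    return (v << 16) | (v << 8) | v;
}

// Hypothetical helper: iterate z = z*z + c for two points in the same loop.
// Each point stops updating once it escapes (|z| > 2); the loop ends when
// both points have escaped or max_iter is reached.
std::pair<int, int> escape_counts_pair(double cr1, double ci1,
                                       double cr2, double ci2,
                                       int max_iter)
{
    double zr1 = 0.0, zi1 = 0.0;
    double zr2 = 0.0, zi2 = 0.0;
    int n1 = 0, n2 = 0;
    bool stop1 = false, stop2 = false;

    for (int n = 0; n < max_iter && !(stop1 && stop2); ++n)
    {
        if (!stop1)
        {
            double t = zr1 * zr1 - zi1 * zi1 + cr1;
            zi1 = 2.0 * zr1 * zi1 + ci1;
            zr1 = t;
            if (zr1 * zr1 + zi1 * zi1 > 4.0) stop1 = true; else ++n1;
        }

        if (!stop2)
        {
            double t = zr2 * zr2 - zi2 * zi2 + cr2;
            zi2 = 2.0 * zr2 * zi2 + ci2;
            zr2 = t;
            if (zr2 * zr2 + zi2 * zi2 > 4.0) stop2 = true; else ++n2;
        }
    }

    return std::make_pair(n1, n2);
}
```

<p>Under these assumptions, calling <code>escape_counts_pair</code> for the pixels at x and x+1 and passing each count through <code>gray_from_iterations</code> would give the two colors that the listing writes with <code>set_pixel</code>.</p>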
<p align="center"><img alt="Mandelbrot Set" src="/blog/wp-content/uploads/2008/09/mandelbrot_tbb.jpg" /></p>
<p><strong>Source code</strong> for this article can be downloaded <a href="/blog/wp-content/uploads/2008/09/mandelbrot_tbb.zip">here</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.kerrywong.com/2008/09/13/tbb-mandelbrot-set/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Code::Blocks Settings for Qt Development</title>
		<link>http://www.kerrywong.com/2008/07/12/codeblocks-settings-for-qt-development/</link>
		<comments>http://www.kerrywong.com/2008/07/12/codeblocks-settings-for-qt-development/#comments</comments>
		<pubDate>Sun, 13 Jul 2008 01:32:27 +0000</pubDate>
		<dc:creator>kwong</dc:creator>
				<category><![CDATA[Linux/BSD]]></category>
		<category><![CDATA[Miscellaneous]]></category>
		<category><![CDATA[C++]]></category>
		<category><![CDATA[Code::Blocks]]></category>
		<category><![CDATA[Qt]]></category>

		<guid isPermaLink="false">http://www.kerrywong.com/2008/07/12/codeblocks-settings-for-qt-development/</guid>
		<description><![CDATA[Qt is a cross-platform, object-oriented C++ framework for application development. KDE is written using Qt, and, not surprisingly, KDE comes with native support for building Qt-based applications. In KDE, KDevelop is the default IDE under Linux for creating Qt-based applications. For smaller projects, however, it seems that the feature richness of KDevelop [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://trolltech.com/products/qt">Qt</a> is a cross-platform, object-oriented C++ framework for application development.<span id="more-315"></span> <a href="http://www.kde.org/">KDE</a> is written using Qt, and, not surprisingly, KDE comes with native support for building Qt-based applications. In KDE, <a href="http://www.kerrywong.com/blog/wp-admin/www.kdevelop.org/">KDevelop</a> is the default IDE under Linux for creating Qt-based applications. For smaller projects, however, the feature richness of KDevelop seems almost overkill.</p>
<p align="left">I personally like the simplicity of Code::Blocks when writing small programs. Even though Code::Blocks does come with Qt support, you have to tell it where Qt is installed before you can compile your code correctly. As with any other C++ environment, the two things you must provide are the include path and the linker path.</p>
<p align="left">Ubuntu 8.04 comes with Qt3 installed. To install Qt4, please follow the instructions <a href="http://ubuntu-gamedev.wikispaces.com/Simple+GUI+Using+QT4">here</a>. Here are the build options you need to add with the default Qt4 install:</p>
<p><a title="qtlinkersettings.gif" href="http://www.kerrywong.com/blog/wp-content/uploads/2008/07/qtlinkersettings.gif"></p>
<p style="text-align: center;"><img alt="qtlinkersettings.gif" src="http://www.kerrywong.com/blog/wp-content/uploads/2008/07/qtlinkersettings.gif" /></p>
<p></a>  <a title="qtsearchdirs.gif" href="http://www.kerrywong.com/blog/wp-content/uploads/2008/07/qtsearchdirs.gif"></p>
<p style="text-align: center;"><img alt="qtsearchdirs.gif" src="http://www.kerrywong.com/blog/wp-content/uploads/2008/07/qtsearchdirs.gif" /></p>
<p></a>  As a bare minimum, you will need to set the paths to the libQtGui and libQtCore libraries. Note that because Ubuntu 8.04 comes with Qt3 installed, you need to make sure you specify the libraries with the correct version. In my example, the required Qt4 libraries are /usr/lib/libQtGui.so.4 and /usr/lib/libQtCore.so.4.  You also need to specify where the include files can be found. For typical applications, /usr/include/qt4 and /usr/include/qt4/QtGui are the paths you will need.  If you want contextual help to be available, you will have to provide the search paths for the include files under the &quot;C/C++ parser options&quot; tab in the Project/targets options settings.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.kerrywong.com/2008/07/12/codeblocks-settings-for-qt-development/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>A Simple TBB Program: TBB Prime</title>
		<link>http://www.kerrywong.com/2008/06/22/a-simple-tbb-program-tbb-prime/</link>
		<comments>http://www.kerrywong.com/2008/06/22/a-simple-tbb-program-tbb-prime/#comments</comments>
		<pubDate>Sun, 22 Jun 2008 16:12:08 +0000</pubDate>
		<dc:creator>kwong</dc:creator>
				<category><![CDATA[Coding]]></category>
		<category><![CDATA[C++]]></category>
		<category><![CDATA[Multi-threading]]></category>
		<category><![CDATA[TBB]]></category>

		<guid isPermaLink="false">http://www.kerrywong.com/2008/06/22/a-simple-tbb-program-tbb-prime/</guid>
		<description><![CDATA[I have been playing around with Intel&#8217;s Threading Building Blocks for a while and have started to really appreciate its simplicity and elegance: instead of thinking in threads and thread synchronization, one can simply concentrate on the problem at hand.  Take finding prime numbers, for example: while the problem itself (using the [...]]]></description>
			<content:encoded><![CDATA[<p>I have been playing around with Intel&#8217;s <a href="http://www.threadingbuildingblocks.org/">Threading Building Blocks</a> for a while and have started to really appreciate its simplicity and elegance: instead of thinking in threads and thread synchronization, one can simply concentrate on the problem at hand.<span id="more-310"></span></p>
<p>Take finding prime numbers, for example. While the problem itself (using the most rudimentary algorithm) is quite simple, getting it to work in a multi-threaded fashion does take a little bit of work. In this particular example, the prime-finding algorithm can be easily parallelized with threads, and thread synchronization is almost a non-issue since the problem domain can be divided into totally disjoint regions. In general, however, dividing the problem domain into multiple sub-domains and performing load balancing among them can take significant work.</p>
<p>In the listing below, I created two C++ classes that both find prime numbers in a given interval (since all prime numbers except 2 are odd, 2 is omitted in the calculations below), one sequential and the other parallel. In the main function, both methods are timed and the results are output (download <a title="tbbprime.cpp" href="/blog/wp-content/uploads/2008/06/tbbprime.zip">tbbprime.cpp</a>).</p>
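<p>To make the disjoint-region point concrete, here is a small standalone sketch (plain C++, no TBB; the helper names <code>is_odd_prime</code> and <code>count_odd_primes</code> are hypothetical and not part of tbbprime.cpp). It counts odd primes in a half-open interval and illustrates that splitting the interval into sub-ranges gives the same total without any shared state, provided each sub-range first aligns its start to an odd number:</p>

```cpp
#include <cstddef>

// Trial division by odd divisors only; assumes n is odd and n >= 3.
bool is_odd_prime(std::size_t n)
{
    for (std::size_t d = 3; d * d <= n; d += 2)
    {
        if (n % d == 0)
        {
            return false; // first divisor found, no need to keep testing
        }
    }
    return true;
}

// Count the odd primes in [begin, end). The "| 1" rounds an even start up to
// the next odd number, so a sub-range may begin anywhere.
std::size_t count_odd_primes(std::size_t begin, std::size_t end)
{
    std::size_t count = 0;
    for (std::size_t n = begin | 1; n < end; n += 2)
    {
        if (n >= 3 && is_odd_prime(n))
        {
            ++count;
        }
    }
    return count;
}
```

<p>For example, <code>count_odd_primes(3, 50) + count_odd_primes(50, 100)</code> equals <code>count_odd_primes(3, 100)</code>, even though the second sub-range starts on an even number. Each call touches only its own interval, which is why no synchronization would be needed if the calls ran on different threads.</p>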
<div style="background: white none repeat scroll 0% 0%; font-family: Courier New; font-size: 10pt; color: black; -moz-background-clip: -moz-initial; -moz-background-origin: -moz-initial; -moz-background-inline-policy: -moz-initial;">
<div style="margin: 0px;"><span style="color: rgb(0, 128, 192); font-weight: bold;">#include</span> <span style="color: rgb(163, 21, 21); font-weight: bold;">&lt;stdio.h&gt;</span></div>
<div style="margin: 0px;"><span style="color: rgb(0, 128, 192); font-weight: bold;">#include</span> <span style="color: rgb(163, 21, 21); font-weight: bold;">&lt;stdlib.h&gt;</span></div>
<div style="margin: 0px;">&nbsp;</div>
<div style="margin: 0px;"><span style="color: rgb(0, 128, 192); font-weight: bold;">#include</span> <span style="color: rgb(163, 21, 21); font-weight: bold;">&lt;iostream&gt;</span></div>
<div style="margin: 0px;"><span style="color: rgb(0, 128, 192); font-weight: bold;">#include</span> <span style="color: rgb(163, 21, 21); font-weight: bold;">&lt;iomanip&gt;</span></div>
<div style="margin: 0px;"><span style="color: rgb(0, 128, 192); font-weight: bold;">#include</span> <span style="color: rgb(163, 21, 21); font-weight: bold;">&lt;math.h&gt;</span></div>
<div style="margin: 0px;">&nbsp;</div>
<div style="margin: 0px;"><span style="color: rgb(0, 128, 192); font-weight: bold;">#include</span> <span style="color: rgb(163, 21, 21); font-weight: bold;">&quot;tbb/task_scheduler_init.h&quot;</span></div>
<div style="margin: 0px;"><span style="color: rgb(0, 128, 192); font-weight: bold;">#include</span> <span style="color: rgb(163, 21, 21); font-weight: bold;">&quot;tbb/tick_count.h&quot;</span></div>
<div style="margin: 0px;"><span style="color: rgb(0, 128, 192); font-weight: bold;">#include</span> <span style="color: rgb(163, 21, 21); font-weight: bold;">&quot;tbb/blocked_range.h&quot;</span></div>
<div style="margin: 0px;"><span style="color: rgb(0, 128, 192); font-weight: bold;">#include</span> <span style="color: rgb(163, 21, 21); font-weight: bold;">&quot;tbb/parallel_for.h&quot;</span></div>
<div style="margin: 0px;"><span style="color: rgb(0, 128, 192); font-weight: bold;">#include</span> <span style="color: rgb(163, 21, 21); font-weight: bold;">&quot;tbb/partitioner.h&quot;</span></div>
<div style="margin: 0px;">&nbsp;</div>
<div style="margin: 0px;"><span style="color: blue; font-weight: bold;">using</span> <span style="color: blue; font-weight: bold;">namespace</span> std<span style="color: rgb(128, 128, 192); font-weight: bold;">;</span></div>
<div style="margin: 0px;"><span style="color: blue; font-weight: bold;">using</span> <span style="color: blue; font-weight: bold;">namespace</span> tbb<span style="color: rgb(128, 128, 192); font-weight: bold;">;</span></div>
<div style="margin: 0px;">&nbsp;</div>
<div style="margin: 0px;"><span style="color: blue; font-weight: bold;">static</span> <span style="color: blue; font-weight: bold;">const</span> <span style="color: blue; font-weight: bold;">int</span> MAX_SIZE <span style="color: rgb(128, 128, 192); font-weight: bold;">=</span> <span style="color: teal;">1000000</span><span style="color: rgb(128, 128, 192); font-weight: bold;">;</span></div>
<div style="margin: 0px;">&nbsp;</div>
<div style="margin: 0px;"><span style="color: blue; font-weight: bold;">class</span> prime_single_thread</div>
<div style="margin: 0px;"><span style="color: rgb(128, 128, 192); font-weight: bold;">{</span></div>
<div style="margin: 0px;"><span style="color: blue; font-weight: bold;">public</span><span style="color: rgb(128, 128, 192); font-weight: bold;">:</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; <span style="color: blue; font-weight: bold;">void</span> run<span style="color: rgb(128, 128, 192); font-weight: bold;">(</span><span style="color: blue; font-weight: bold;">const</span> blocked_range<span style="color: rgb(128, 128, 192); font-weight: bold;">&lt;</span>size_t<span style="color: rgb(128, 128, 192); font-weight: bold;">&gt;</span> <span style="color: rgb(128, 128, 192); font-weight: bold;">&amp;</span>r<span style="color: rgb(128, 128, 192); font-weight: bold;">)</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; <span style="color: rgb(128, 128, 192); font-weight: bold;">{</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <span style="color: blue; font-weight: bold;">bool</span> is_prime <span style="color: rgb(128, 128, 192); font-weight: bold;">=</span> <span style="color: blue; font-weight: bold;">false</span><span style="color: rgb(128, 128, 192); font-weight: bold;">;</span></div>
<div style="margin: 0px;">&nbsp;</div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <span style="color: blue; font-weight: bold;">for</span> <span style="color: rgb(128, 128, 192); font-weight: bold;">(</span>size_t x <span style="color: rgb(128, 128, 192); font-weight: bold;">=</span> r<span style="color: rgb(128, 128, 192); font-weight: bold;">.</span>begin<span style="color: rgb(128, 128, 192); font-weight: bold;">();</span> x <span style="color: rgb(128, 128, 192); font-weight: bold;">&lt;</span> r<span style="color: rgb(128, 128, 192); font-weight: bold;">.</span>end<span style="color: rgb(128, 128, 192); font-weight: bold;">();</span> x<span style="color: rgb(128, 128, 192); font-weight: bold;">+=</span><span style="color: teal;">2</span><span style="color: rgb(128, 128, 192); font-weight: bold;">)</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <span style="color: rgb(128, 128, 192); font-weight: bold;">{</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; is_prime <span style="color: rgb(128, 128, 192); font-weight: bold;">=</span> <span style="color: blue; font-weight: bold;">true</span><span style="color: rgb(128, 128, 192); font-weight: bold;">;</span></div>
<div style="margin: 0px;">&nbsp;</div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <span style="color: blue; font-weight: bold;">for</span> <span style="color: rgb(128, 128, 192); font-weight: bold;">(</span><span style="color: blue; font-weight: bold;">int</span> i <span style="color: rgb(128, 128, 192); font-weight: bold;">=</span> <span style="color: teal;">3</span><span style="color: rgb(128, 128, 192); font-weight: bold;">;</span> i <span style="color: rgb(128, 128, 192); font-weight: bold;">&lt;=</span> sqrt<span style="color: rgb(128, 128, 192); font-weight: bold;">((</span><span style="color: blue; font-weight: bold;">double</span><span style="color: rgb(128, 128, 192); font-weight: bold;">)</span> x<span style="color: rgb(128, 128, 192); font-weight: bold;">);</span> i<span style="color: rgb(128, 128, 192); font-weight: bold;">+=</span><span style="color: teal;">2</span><span style="color: rgb(128, 128, 192); font-weight: bold;">)</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <span style="color: rgb(128, 128, 192); font-weight: bold;">{</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <span style="color: blue; font-weight: bold;">if</span> <span style="color: rgb(128, 128, 192); font-weight: bold;">(</span>x <span style="color: rgb(128, 128, 192); font-weight: bold;">%</span> i <span style="color: rgb(128, 128, 192); font-weight: bold;">==</span> <span style="color: teal;">0</span><span style="color: rgb(128, 128, 192); font-weight: bold;">)</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <span style="color: rgb(128, 128, 192); font-weight: bold;">{</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; is_prime <span style="color: rgb(128, 128, 192); font-weight: bold;">=</span> <span style="color: blue; font-weight: bold;">false</span><span style="color: rgb(128, 128, 192); font-weight: bold;">;</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <span style="color: blue; font-weight: bold;">continue</span><span style="color: rgb(128, 128, 192); font-weight: bold;">;</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <span style="color: rgb(128, 128, 192); font-weight: bold;">}</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <span style="color: rgb(128, 128, 192); font-weight: bold;">}</span></div>
<div style="margin: 0px;">&nbsp;</div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <span style="color: green;">// Output prime numbers:</span></div>
<div style="margin: 0px;"><span style="color: green;">//&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; if (is_prime)</span></div>
<div style="margin: 0px;"><span style="color: green;">//&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; {</span></div>
<div style="margin: 0px;"><span style="color: green;">//&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; cout &lt;&lt; x &lt;&lt; endl;</span></div>
<div style="margin: 0px;"><span style="color: green;">//&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; }</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <span style="color: rgb(128, 128, 192); font-weight: bold;">}</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; <span style="color: rgb(128, 128, 192); font-weight: bold;">}</span></div>
<div style="margin: 0px;"><span style="color: rgb(128, 128, 192); font-weight: bold;">};</span></div>
<div style="margin: 0px;">&nbsp;</div>
<div style="margin: 0px;"><span style="color: blue; font-weight: bold;">class</span> prime_tbb</div>
<div style="margin: 0px;"><span style="color: rgb(128, 128, 192); font-weight: bold;">{</span></div>
<div style="margin: 0px;"><span style="color: blue; font-weight: bold;">public</span><span style="color: rgb(128, 128, 192); font-weight: bold;">:</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; <span style="color: blue; font-weight: bold;">void</span> test_prime<span style="color: rgb(128, 128, 192); font-weight: bold;">(</span><span style="color: blue; font-weight: bold;">int</span> num<span style="color: rgb(128, 128, 192); font-weight: bold;">)</span> <span style="color: blue; font-weight: bold;">const</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; <span style="color: rgb(128, 128, 192); font-weight: bold;">{</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <span style="color: blue; font-weight: bold;">bool</span> is_prime <span style="color: rgb(128, 128, 192); font-weight: bold;">=</span> <span style="color: blue; font-weight: bold;">true</span><span style="color: rgb(128, 128, 192); font-weight: bold;">;</span></div>
<div style="margin: 0px;">&nbsp;</div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <span style="color: blue; font-weight: bold;">for</span> <span style="color: rgb(128, 128, 192); font-weight: bold;">(</span><span style="color: blue; font-weight: bold;">int</span> i <span style="color: rgb(128, 128, 192); font-weight: bold;">=</span> <span style="color: teal;">3</span><span style="color: rgb(128, 128, 192); font-weight: bold;">;</span> i <span style="color: rgb(128, 128, 192); font-weight: bold;">&lt;=</span> sqrt<span style="color: rgb(128, 128, 192); font-weight: bold;">((</span><span style="color: blue; font-weight: bold;">double</span><span style="color: rgb(128, 128, 192); font-weight: bold;">)</span> num<span style="color: rgb(128, 128, 192); font-weight: bold;">);</span> i<span style="color: rgb(128, 128, 192); font-weight: bold;">+=</span><span style="color: teal;">2</span><span style="color: rgb(128, 128, 192); font-weight: bold;">)</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <span style="color: rgb(128, 128, 192); font-weight: bold;">{</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <span style="color: blue; font-weight: bold;">if</span> <span style="color: rgb(128, 128, 192); font-weight: bold;">(</span>num <span style="color: rgb(128, 128, 192); font-weight: bold;">%</span> i <span style="color: rgb(128, 128, 192); font-weight: bold;">==</span> <span style="color: teal;">0</span><span style="color: rgb(128, 128, 192); font-weight: bold;">)</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <span style="color: rgb(128, 128, 192); font-weight: bold;">{</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; is_prime <span style="color: rgb(128, 128, 192); font-weight: bold;">=</span> <span style="color: blue; font-weight: bold;">false</span><span style="color: rgb(128, 128, 192); font-weight: bold;">;</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <span style="color: blue; font-weight: bold;">continue</span><span style="color: rgb(128, 128, 192); font-weight: bold;">;</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <span style="color: rgb(128, 128, 192); font-weight: bold;">}</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <span style="color: rgb(128, 128, 192); font-weight: bold;">}</span></div>
<div style="margin: 0px;">&nbsp;</div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <span style="color: green;">// Output prime numbers:</span></div>
<div style="margin: 0px;"><span style="color: green;">//&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; if (is_prime)</span></div>
<div style="margin: 0px;"><span style="color: green;">//&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; {</span></div>
<div style="margin: 0px;"><span style="color: green;">//&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; cout &lt;&lt; num &lt;&lt; endl;</span></div>
<div style="margin: 0px;"><span style="color: green;">//&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; }</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; <span style="color: rgb(128, 128, 192); font-weight: bold;">}</span></div>
<div style="margin: 0px;">&nbsp;</div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; <span style="color: blue; font-weight: bold;">void</span> <span style="color: blue; font-weight: bold;">operator</span><span style="color: rgb(128, 128, 192); font-weight: bold;">()</span> <span style="color: rgb(128, 128, 192); font-weight: bold;">(</span><span style="color: blue; font-weight: bold;">const</span> blocked_range<span style="color: rgb(128, 128, 192); font-weight: bold;">&lt;</span>size_t<span style="color: rgb(128, 128, 192); font-weight: bold;">&gt;</span> <span style="color: rgb(128, 128, 192); font-weight: bold;">&amp;</span>r<span style="color: rgb(128, 128, 192); font-weight: bold;">)</span> <span style="color: blue; font-weight: bold;">const</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; <span style="color: rgb(128, 128, 192); font-weight: bold;">{</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <span style="color: green;">// parallel_for may pass a sub-range that begins on an even number; &quot;| 1&quot; moves the start to the next odd value.</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <span style="color: blue; font-weight: bold;">for</span> <span style="color: rgb(128, 128, 192); font-weight: bold;">(</span>size_t i <span style="color: rgb(128, 128, 192); font-weight: bold;">=</span> r<span style="color: rgb(128, 128, 192); font-weight: bold;">.</span>begin<span style="color: rgb(128, 128, 192); font-weight: bold;">()</span> <span style="color: rgb(128, 128, 192); font-weight: bold;">|</span> <span style="color: teal;">1</span><span style="color: rgb(128, 128, 192); font-weight: bold;">;</span> i <span style="color: rgb(128, 128, 192); font-weight: bold;">&lt;</span> r<span style="color: rgb(128, 128, 192); font-weight: bold;">.</span>end<span style="color: rgb(128, 128, 192); font-weight: bold;">();</span> i<span style="color: rgb(128, 128, 192); font-weight: bold;">+=</span><span style="color: teal;">2</span><span style="color: rgb(128, 128, 192); font-weight: bold;">)</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <span style="color: rgb(128, 128, 192); font-weight: bold;">{</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; test_prime<span style="color: rgb(128, 128, 192); font-weight: bold;">(</span>i<span style="color: rgb(128, 128, 192); font-weight: bold;">);</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <span style="color: rgb(128, 128, 192); font-weight: bold;">}</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; <span style="color: rgb(128, 128, 192); font-weight: bold;">}</span></div>
<div style="margin: 0px;">&nbsp;</div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; <span style="color: blue; font-weight: bold;">void</span> run<span style="color: rgb(128, 128, 192); font-weight: bold;">(</span>blocked_range<span style="color: rgb(128, 128, 192); font-weight: bold;">&lt;</span>size_t<span style="color: rgb(128, 128, 192); font-weight: bold;">&gt;</span> r<span style="color: rgb(128, 128, 192); font-weight: bold;">)</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; <span style="color: rgb(128, 128, 192); font-weight: bold;">{</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; prime_tbb prime_tbb<span style="color: rgb(128, 128, 192); font-weight: bold;">;</span></div>
<div style="margin: 0px;">&nbsp;</div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; parallel_for<span style="color: rgb(128, 128, 192); font-weight: bold;">(</span>r<span style="color: rgb(128, 128, 192); font-weight: bold;">,</span> prime_tbb<span style="color: rgb(128, 128, 192); font-weight: bold;">);</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; <span style="color: rgb(128, 128, 192); font-weight: bold;">}</span></div>
<div style="margin: 0px;">&nbsp;</div>
<div style="margin: 0px;"><span style="color: rgb(128, 128, 192); font-weight: bold;">};</span></div>
<div style="margin: 0px;">&nbsp;</div>
<div style="margin: 0px;"><span style="color: blue; font-weight: bold;">int</span> main<span style="color: rgb(128, 128, 192); font-weight: bold;">()</span></div>
<div style="margin: 0px;"><span style="color: rgb(128, 128, 192); font-weight: bold;">{</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; task_scheduler_init init<span style="color: rgb(128, 128, 192); font-weight: bold;">;</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; <span style="color: blue; font-weight: bold;">static</span> tick_count t_start<span style="color: rgb(128, 128, 192); font-weight: bold;">,</span> t_end<span style="color: rgb(128, 128, 192); font-weight: bold;">;</span></div>
<div style="margin: 0px;">&nbsp;</div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; prime_single_thread p1<span style="color: rgb(128, 128, 192); font-weight: bold;">;</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; prime_tbb p2<span style="color: rgb(128, 128, 192); font-weight: bold;">;</span></div>
<div style="margin: 0px;">&nbsp;</div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; cout<span style="color: rgb(128, 128, 192); font-weight: bold;">.</span>setf<span style="color: rgb(128, 128, 192); font-weight: bold;">(</span>ios<span style="color: rgb(128, 128, 192); font-weight: bold;">::</span>fixed<span style="color: rgb(128, 128, 192); font-weight: bold;">);</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; cout<span style="color: rgb(128, 128, 192); font-weight: bold;">.</span>setf<span style="color: rgb(128, 128, 192); font-weight: bold;">(</span>ios<span style="color: rgb(128, 128, 192); font-weight: bold;">::</span>showpoint<span style="color: rgb(128, 128, 192); font-weight: bold;">);</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; cout<span style="color: rgb(128, 128, 192); font-weight: bold;">.</span>precision<span style="color: rgb(128, 128, 192); font-weight: bold;">(</span><span style="color: teal;">2</span><span style="color: rgb(128, 128, 192); font-weight: bold;">);</span></div>
<div style="margin: 0px;">&nbsp;</div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; <span style="color: green;">// Range starts at 3 and ends before MAX_SIZE, with a grain size of 100.</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; blocked_range<span style="color: rgb(128, 128, 192); font-weight: bold;">&lt;</span>size_t<span style="color: rgb(128, 128, 192); font-weight: bold;">&gt;</span> r<span style="color: rgb(128, 128, 192); font-weight: bold;">(</span><span style="color: teal;">3</span><span style="color: rgb(128, 128, 192); font-weight: bold;">,</span> MAX_SIZE<span style="color: rgb(128, 128, 192); font-weight: bold;">,</span> <span style="color: teal;">100</span><span style="color: rgb(128, 128, 192); font-weight: bold;">);</span></div>
<div style="margin: 0px;">&nbsp;</div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; t_start <span style="color: rgb(128, 128, 192); font-weight: bold;">=</span> tick_count<span style="color: rgb(128, 128, 192); font-weight: bold;">::</span>now<span style="color: rgb(128, 128, 192); font-weight: bold;">();</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; p1<span style="color: rgb(128, 128, 192); font-weight: bold;">.</span>run<span style="color: rgb(128, 128, 192); font-weight: bold;">(</span>r<span style="color: rgb(128, 128, 192); font-weight: bold;">);</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; t_end <span style="color: rgb(128, 128, 192); font-weight: bold;">=</span> tick_count<span style="color: rgb(128, 128, 192); font-weight: bold;">::</span>now<span style="color: rgb(128, 128, 192); font-weight: bold;">();</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; cout <span style="color: rgb(128, 128, 192); font-weight: bold;">&lt;&lt;</span> <span style="color: rgb(128, 128, 192); font-weight: bold;">(</span>t_end <span style="color: rgb(128, 128, 192); font-weight: bold;">-</span> t_start<span style="color: rgb(128, 128, 192); font-weight: bold;">).</span>seconds<span style="color: rgb(128, 128, 192); font-weight: bold;">()</span> <span style="color: rgb(128, 128, 192); font-weight: bold;">*</span> <span style="color: teal;">1000</span> <span style="color: rgb(128, 128, 192); font-weight: bold;">&lt;&lt;</span> <span style="color: rgb(163, 21, 21); font-weight: bold;">&quot; ms&quot;</span> <span style="color: rgb(128, 128, 192); font-weight: bold;">&lt;&lt;</span> endl<span style="color: rgb(128, 128, 192); font-weight: bold;">;</span></div>
<div style="margin: 0px;">&nbsp;</div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; t_start <span style="color: rgb(128, 128, 192); font-weight: bold;">=</span> tick_count<span style="color: rgb(128, 128, 192); font-weight: bold;">::</span>now<span style="color: rgb(128, 128, 192); font-weight: bold;">();</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; p2<span style="color: rgb(128, 128, 192); font-weight: bold;">.</span>run<span style="color: rgb(128, 128, 192); font-weight: bold;">(</span>r<span style="color: rgb(128, 128, 192); font-weight: bold;">);</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; t_end <span style="color: rgb(128, 128, 192); font-weight: bold;">=</span> tick_count<span style="color: rgb(128, 128, 192); font-weight: bold;">::</span>now<span style="color: rgb(128, 128, 192); font-weight: bold;">();</span></div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; cout <span style="color: rgb(128, 128, 192); font-weight: bold;">&lt;&lt;</span> <span style="color: rgb(128, 128, 192); font-weight: bold;">(</span>t_end <span style="color: rgb(128, 128, 192); font-weight: bold;">-</span> t_start<span style="color: rgb(128, 128, 192); font-weight: bold;">).</span>seconds<span style="color: rgb(128, 128, 192); font-weight: bold;">()</span> <span style="color: rgb(128, 128, 192); font-weight: bold;">*</span> <span style="color: teal;">1000</span> <span style="color: rgb(128, 128, 192); font-weight: bold;">&lt;&lt;</span> <span style="color: rgb(163, 21, 21); font-weight: bold;">&quot; ms&quot;</span> <span style="color: rgb(128, 128, 192); font-weight: bold;">&lt;&lt;</span> endl<span style="color: rgb(128, 128, 192); font-weight: bold;">;</span></div>
<div style="margin: 0px;">&nbsp;</div>
<div style="margin: 0px;">&nbsp;&nbsp;&nbsp; <span style="color: blue; font-weight: bold;">return</span> <span style="color: teal;">0</span><span style="color: rgb(128, 128, 192); font-weight: bold;">;</span></div>
<div style="margin: 0px;"><span style="color: rgb(128, 128, 192); font-weight: bold;">}</span></div>
</div>
<div style="background: white none repeat scroll 0% 0%; -moz-background-clip: -moz-initial; -moz-background-origin: -moz-initial; -moz-background-inline-policy: -moz-initial; font-family: Courier New; font-size: 10pt; color: black;">&nbsp;</div>
<p>For a very large interval (e.g. 3 to 1,000,000), the TBB version of the prime program achieved close to a 4x speedup given a reasonably large grain size (e.g. 100). Smaller grain sizes resulted in slightly more overhead. On a quad-core machine (Q9450 @ 3.2GHz), the single-threaded routine took 217.83 ms to find all the prime numbers up to 1,000,000, whereas the TBB version took only 58.32 ms, a speedup of roughly 3.7x. The TBB framework automatically took care of dividing the task according to the number of processors.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.kerrywong.com/2008/06/22/a-simple-tbb-program-tbb-prime/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Thoughts on &#8220;Where Are the Software Engineers of Tomorrow?&#8221;</title>
		<link>http://www.kerrywong.com/2008/02/09/thoughts-on-where-are-the-software-engineers-of-tomorrow/</link>
		<comments>http://www.kerrywong.com/2008/02/09/thoughts-on-where-are-the-software-engineers-of-tomorrow/#comments</comments>
		<pubDate>Sun, 10 Feb 2008 03:49:57 +0000</pubDate>
		<dc:creator>kwong</dc:creator>
				<category><![CDATA[Technology]]></category>
		<category><![CDATA[C++]]></category>

		<guid isPermaLink="false">http://www.kerrywong.com/2008/02/09/thoughts-on-where-are-the-software-engineers-of-tomorrow/</guid>
		<description><![CDATA[I recently ran across two very interesting articles (Computer Science Education: Where Are the Software Engineers of Tomorrow?, Who Killed the Software Engineer? (Hint: It Happened in College)) discussing how the current university education has become inadequate in terms of producing highly qualified software engineers and developers. As a software engineer myself, I think I would [...]]]></description>
			<content:encoded><![CDATA[<p>I recently ran across two very interesting articles (<a href="http://www.stsc.hill.af.mil/CrossTalk/2008/01/0801DewarSchonberg.html">Computer Science Education: Where Are the Software Engineers of Tomorrow?</a>, <a href="http://itmanagement.earthweb.com/career/article.php/11067_3722876_2">Who Killed the Software Engineer? (Hint: It Happened in College)</a>) discussing how the current university education has become inadequate in terms of producing highly qualified software engineers and developers.<span id="more-264"></span>As a software engineer myself, I think I would have to agree with Dr. Dewar&#8217;s view detailed in the above two articles.</p>
<p>As with any science, the essence of computer science education is not to teach any particular language, but to teach the theories that underlie all computational tasks.</p>
<p>A typical software engineer today is very different from one of even just a couple of decades ago. Back then, creating a computer program usually required intimate knowledge of both the software and the hardware. Of course, we had no choice, as we lacked modern high-level languages (e.g. Java, C#) and hardware resources were usually very limited. We valued algorithms that were efficient in terms of both footprint and speed, since the difference between a 600K and a 60K application usually meant that one could fit into main memory and the other could not. Similarly, an inefficient algorithm usually led to an unusable application.</p>
<p>Things have arguably changed quite a bit. Personal computers are becoming ever more powerful, and the limits imposed by storage are rapidly disappearing for all practical purposes. Nobody would even notice that an inefficient application takes one second to run when it could be optimized to run a thousand times faster. The difference between a 10 MB application and a 1 MB application rarely poses any problem, as today&#8217;s computers can easily handle multi-gigabyte working sets.</p>
<p>And with computers being as ubiquitous as they are today, almost everyone knows how to operate a computer and, to a certain degree, make it do what his or her heart desires. So naturally, the average developer today is far less technically savvy than the average developer of ten or twenty years ago.</p>
<p>This situation is made even worse by today&#8217;s market economy. Most companies&#8217; job postings require knowledge of and experience with specific computer languages, and a candidate without such credentials (e.g. a fresh college graduate) is simply not considered, even though we all know that a good developer in one language can typically master any other language within a very short period of time and be good at it as well. Unfortunately, it is human nature to put more emphasis on the surface value that we can see.</p>
<p>The society we live in is driven by demand. Given the corporate culture just mentioned, more and more universities, especially the lesser-known ones, have started to put more emphasis on what the market needs, and the fundamentals (e.g. Assembly, C/C++) have become less desirable to teach or learn, as such fundamentals hardly ever translate into marketplace value.</p>
<p>And it is only going to get worse. As companies like Microsoft make application development environments easier and easier to use, writing programs seems to become more and more trivial. Some people even believe that anyone, with some training, can easily master a computer language and thus become a software developer. There might be some truth to that as far as the &quot;surface value&quot; is concerned, but this is rather dangerous, as there is a lot more to writing good programs than being able to drag and drop.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.kerrywong.com/2008/02/09/thoughts-on-where-are-the-software-engineers-of-tomorrow/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>My Favorite Program of All Time</title>
		<link>http://www.kerrywong.com/2007/12/18/my-favorite-program-of-all-time/</link>
		<comments>http://www.kerrywong.com/2007/12/18/my-favorite-program-of-all-time/#comments</comments>
		<pubDate>Wed, 19 Dec 2007 03:07:43 +0000</pubDate>
		<dc:creator>kwong</dc:creator>
				<category><![CDATA[Coding]]></category>
		<category><![CDATA[C++]]></category>

		<guid isPermaLink="false">http://www.kerrywong.com/2007/12/18/my-favorite-program-of-all-time/</guid>
		<description><![CDATA[I remember that in one of my graduate school classes at the University of Wisconsin &#8211; Madison, Professor Rastislav Bodik (who has since left UW Madison for UC Berkeley) showed us the article Reverse Engineering the Twelve Days of Christmas from Microsoft Research while explaining some compiler theory. 
To this day, this is still [...]]]></description>
			<content:encoded><![CDATA[<p>I remember that in one of my graduate school classes at the <a href="http://www.wisc.edu">University of Wisconsin &#8211; Madison</a>, Professor <a href="http://www.cs.berkeley.edu/~bodik/">Rastislav Bodik</a> (who has since left UW Madison for UC Berkeley) showed us the article <a href="http://research.microsoft.com/~tball/papers/XmasGift/">Reverse Engineering the Twelve Days of Christmas</a> from <a href="http://research.microsoft.com/">Microsoft Research</a> while explaining some compiler theory. <span id="more-256"></span></p>
<p>To this day, this is still the program I admire the most. I still wonder how Jim Coplien created a program so elegant that even understanding the reverse-engineered version remains a daunting task.</p>
<p>In order for every &quot;modern&quot; programmer to grasp the true beauty of the original program, I have modified it slightly so it will run in Visual Studio 2005. <a href="/blog/wp-content/uploads/2007/12/twelveday.zip">Try it yourself</a>!</p>
]]></content:encoded>
			<wfw:commentRss>http://www.kerrywong.com/2007/12/18/my-favorite-program-of-all-time/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
