<?xml version="1.0" encoding="utf-8" ?>

<rss version="2.0" 
   xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
   xmlns:admin="http://webns.net/mvcb/"
   xmlns:dc="http://purl.org/dc/elements/1.1/"
   xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
   xmlns:wfw="http://wellformedweb.org/CommentAPI/"
   xmlns:content="http://purl.org/rss/1.0/modules/content/"
   >
<channel>
    <title>Portability Blog - Compilers</title>
    <link>http://portabilityblog.com/blog/</link>
    <description>tales about building software on many platforms</description>
    <dc:language>en</dc:language>
    <generator>Serendipity 1.2.1 - http://www.s9y.org/</generator>
    <managingEditor>df.portabilityblog@erinye.com</managingEditor>
<webMaster>df.portabilityblog@erinye.com</webMaster>
<pubDate>Sun, 16 Dec 2007 11:46:13 GMT</pubDate>

    <image>
        <url>http://portabilityblog.com/blog/templates/default/img/s9y_banner_small.png</url>
        <title>RSS: Portability Blog - Compilers - tales about building software on many platforms</title>
        <link>http://portabilityblog.com/blog/</link>
        <width>100</width>
        <height>21</height>
    </image>

<item>
    <title>Denormal Numbers</title>
    <link>http://portabilityblog.com/blog/archives/5-Denormal-Numbers.html</link>
            <category>Compilers</category>
            <category>CPUs</category>
    
    <comments>http://portabilityblog.com/blog/archives/5-Denormal-Numbers.html#comments</comments>
    <wfw:comment>http://portabilityblog.com/blog/wfwcomment.php?cid=5</wfw:comment>

    <slash:comments>0</slash:comments>
    <wfw:commentRss>http://portabilityblog.com/blog/rss.php?version=2.0&amp;type=comments&amp;cid=5</wfw:commentRss>
    

    <author>nospam@example.com (Daniel Fischer)</author>
    <content:encoded>
    In school, we learned that x - y = 0 is true if, and only if, x = y. In computing, we learned that we can only store so many accurate digits in a fixed-size register. For example, a register that is 8 bits wide can store exactly 256 different values. If we want to represent negative and positive values, there are a number of ways to express that, but we&#039;ll still only get 256 different individual values. The way computers typically do it will give us values from -128 to +127. &lt;br /&gt;
&lt;br /&gt;
Now we might want to represent real numbers like 1.5. There&#039;s no way to get more than 256 different values from 8 bits, but we could agree that there&#039;s a decimal point, and it is always before the last digit. Instead of -128 to +127, we get -12.8 to 12.7. We traded in range for accuracy. Sometimes, range is more important than accuracy. In such a case, we could pretend that the decimal point is always one digit to the right of the last digit we store, giving us a range from -1280 to 1270. Our range got boosted, but we lost accuracy: We can no longer store numbers like 1234, even though it is in the range of our type. This is aptly referred to as &lt;i&gt;fixed point arithmetic&lt;/i&gt;.&lt;br /&gt;
&lt;br /&gt;
Now, how do we get both range and accuracy? Instead of putting the decimal point in a fixed location, we could store its position together with the actual number. This is called &lt;i&gt;floating point arithmetic&lt;/i&gt;. For example, we could say that we use 2 bits for the position of the decimal point, and 6 bits for the actual number. The way computers really do it is a bit more complex and is defined in &lt;a href=&quot;http://en.wikipedia.org/wiki/IEEE_754&quot;&gt;IEEE 754&lt;/a&gt;. &lt;br /&gt;
&lt;br /&gt;
IEEE 754 contains one detail that might not be obvious. Floating point numbers are represented by a sign bit, an exponent to a fixed and previously agreed-upon base, and a mantissa. However, the mantissa isn&#039;t just a number that is shifted left or right based on the exponent. Instead, the mantissa is defined to be a number from 1 (inclusive) up to, but excluding, 2. Since this means there&#039;s always exactly one digit before the point, and it can only be 1, only the digits &lt;i&gt;after&lt;/i&gt; the point are stored. A number stored like this is called a &lt;i&gt;normal number&lt;/i&gt;.&lt;br /&gt;
&lt;br /&gt;
For simplicity, let&#039;s assume a similar system based on base 10, where the mantissa is a number from 1 up to, but excluding, 10. Let&#039;s say we can store the sign, exponents from -2 to +2, and we have room for a mantissa of three digits. This will let us express numbers from -9.99 * 10 ^ 2 to +9.99 * 10 ^ 2, and the smallest positive number we can store is 1.00 * 10 ^ -2. We can express x = 0.012 as 1.20 * 10 ^ -2 and y = 0.0105 as 1.05 * 10 ^ -2. &lt;br /&gt;
&lt;br /&gt;
However, we can&#039;t represent x - y = 0.0015! Normalising it and writing it as 1.5 * 10 ^ -3 violates the condition we agreed upon previously that we would only have exponents from -2 to +2. We&#039;ll have to live with it and just accept that the result of this computation is smaller than the smallest number that we can represent, and replace it with zero. But x and y aren&#039;t equal!&lt;br /&gt;
&lt;br /&gt;
This is a terrible situation for scientists, and thus, a solution was quickly found. Basically, it goes like this: We sacrifice one of our possible exponent values, and use it instead to indicate that the number we represent isn&#039;t normalised as agreed upon before, but is smaller than the smallest possible normalised number. Such a number is called a &lt;i&gt;denormal number&lt;/i&gt;. &lt;br /&gt;
&lt;br /&gt;
You might wonder where the lesson is - after all, the problem seems to be solved. By introducing denormals, we fixed our condition from the first paragraph, and now it holds again for all x and y that can be represented individually.&lt;br /&gt;
&lt;br /&gt;
In theory, all major platforms of today support denormals. In practice, many CPUs don&#039;t handle them in hardware, but instead trap to some software implementation. Implementing floating point operations in software can be rather slow. Programmers, however, don&#039;t like their programs running slowly and tell compilers to optimise. Compilers for CPUs that trap when denormals occur know that they&#039;re slow, and therefore, optimise by turning off denormals altogether. Instead, results of calculations that can&#039;t be normalised are flushed to zero.&lt;br /&gt;
&lt;br /&gt;
For example, on Itanium CPUs, the x - y operation can be slower by orders of magnitude if the result is a denormal as compared to the same operation when the result can be expressed as a normal number. Here&#039;s some (rather simple) code to try this if you have access to an Itanium box. Compile with gcc, without optimisation, once with -DSMALL, once without, and then compare the run time. It&#039;s not a proper benchmark, but should be sufficient to show the problem.&lt;br /&gt;
&lt;br /&gt;
&lt;pre style=&quot;font-size:9pt;&quot;&gt;&lt;span style=&quot;color:#733710&quot;&gt;#include &amp;lt;stdio.h&amp;gt;&lt;/span&gt;
&lt;span style=&quot;color:#733710&quot;&gt;#include &amp;lt;math.h&amp;gt;&lt;/span&gt;

&lt;span style=&quot;color:#8f0055&quot;&gt;int&lt;/span&gt; &lt;span style=&quot;color:#000000&quot;&gt;main&lt;/span&gt;&lt;span style=&quot;color:#000000&quot;&gt;() {&lt;/span&gt;
  /* write the result through a pointer so the store really happens */
  &lt;span style=&quot;color:#8f0055&quot;&gt;double&lt;/span&gt; x&lt;span style=&quot;color:#000000&quot;&gt;, *&lt;/span&gt;y&lt;span style=&quot;color:#000000&quot;&gt;;&lt;/span&gt;
  y &lt;span style=&quot;color:#000000&quot;&gt;= &amp;amp;&lt;/span&gt;x&lt;span style=&quot;color:#000000&quot;&gt;;&lt;/span&gt;

  &lt;span style=&quot;color:#733710&quot;&gt;#ifdef SMALL&lt;/span&gt;
  /* a - b is 2^-1023, which is a denormal */
  &lt;span style=&quot;color:#8f0055&quot;&gt;double&lt;/span&gt; a &lt;span style=&quot;color:#000000&quot;&gt;=&lt;/span&gt; &lt;span style=&quot;color:#000000&quot;&gt;pow&lt;/span&gt;&lt;span style=&quot;color:#000000&quot;&gt;(&lt;/span&gt;&lt;span style=&quot;color:#2300ff&quot;&gt;2&lt;/span&gt;&lt;span style=&quot;color:#000000&quot;&gt;,-&lt;/span&gt;&lt;span style=&quot;color:#2300ff&quot;&gt;1022&lt;/span&gt;&lt;span style=&quot;color:#000000&quot;&gt;) +&lt;/span&gt; &lt;span style=&quot;color:#000000&quot;&gt;pow&lt;/span&gt;&lt;span style=&quot;color:#000000&quot;&gt;(&lt;/span&gt;&lt;span style=&quot;color:#2300ff&quot;&gt;2&lt;/span&gt;&lt;span style=&quot;color:#000000&quot;&gt;,-&lt;/span&gt;&lt;span style=&quot;color:#2300ff&quot;&gt;1023&lt;/span&gt;&lt;span style=&quot;color:#000000&quot;&gt;);&lt;/span&gt;
  &lt;span style=&quot;color:#8f0055&quot;&gt;double&lt;/span&gt; b &lt;span style=&quot;color:#000000&quot;&gt;=&lt;/span&gt; &lt;span style=&quot;color:#000000&quot;&gt;pow&lt;/span&gt;&lt;span style=&quot;color:#000000&quot;&gt;(&lt;/span&gt;&lt;span style=&quot;color:#2300ff&quot;&gt;2&lt;/span&gt;&lt;span style=&quot;color:#000000&quot;&gt;,-&lt;/span&gt;&lt;span style=&quot;color:#2300ff&quot;&gt;1022&lt;/span&gt;&lt;span style=&quot;color:#000000&quot;&gt;);&lt;/span&gt;
  &lt;span style=&quot;color:#733710&quot;&gt;#else&lt;/span&gt;
  /* a - b is 2^-123, which is a normal number */
  &lt;span style=&quot;color:#8f0055&quot;&gt;double&lt;/span&gt; a &lt;span style=&quot;color:#000000&quot;&gt;=&lt;/span&gt; &lt;span style=&quot;color:#000000&quot;&gt;pow&lt;/span&gt;&lt;span style=&quot;color:#000000&quot;&gt;(&lt;/span&gt;&lt;span style=&quot;color:#2300ff&quot;&gt;2&lt;/span&gt;&lt;span style=&quot;color:#000000&quot;&gt;,-&lt;/span&gt;&lt;span style=&quot;color:#2300ff&quot;&gt;122&lt;/span&gt;&lt;span style=&quot;color:#000000&quot;&gt;) +&lt;/span&gt; &lt;span style=&quot;color:#000000&quot;&gt;pow&lt;/span&gt;&lt;span style=&quot;color:#000000&quot;&gt;(&lt;/span&gt;&lt;span style=&quot;color:#2300ff&quot;&gt;2&lt;/span&gt;&lt;span style=&quot;color:#000000&quot;&gt;,-&lt;/span&gt;&lt;span style=&quot;color:#2300ff&quot;&gt;123&lt;/span&gt;&lt;span style=&quot;color:#000000&quot;&gt;);&lt;/span&gt;
  &lt;span style=&quot;color:#8f0055&quot;&gt;double&lt;/span&gt; b &lt;span style=&quot;color:#000000&quot;&gt;=&lt;/span&gt; &lt;span style=&quot;color:#000000&quot;&gt;pow&lt;/span&gt;&lt;span style=&quot;color:#000000&quot;&gt;(&lt;/span&gt;&lt;span style=&quot;color:#2300ff&quot;&gt;2&lt;/span&gt;&lt;span style=&quot;color:#000000&quot;&gt;,-&lt;/span&gt;&lt;span style=&quot;color:#2300ff&quot;&gt;122&lt;/span&gt;&lt;span style=&quot;color:#000000&quot;&gt;);&lt;/span&gt;
  &lt;span style=&quot;color:#733710&quot;&gt;#endif&lt;/span&gt;

  /* repeat the subtraction often enough for the run time to be measurable */
  &lt;span style=&quot;color:#8f0055&quot;&gt;int&lt;/span&gt; i &lt;span style=&quot;color:#000000&quot;&gt;=&lt;/span&gt; &lt;span style=&quot;color:#2300ff&quot;&gt;0&lt;/span&gt;&lt;span style=&quot;color:#000000&quot;&gt;;&lt;/span&gt;
  &lt;span style=&quot;color:#8f0055&quot;&gt;for&lt;/span&gt;&lt;span style=&quot;color:#000000&quot;&gt;(&lt;/span&gt;i&lt;span style=&quot;color:#000000&quot;&gt;=&lt;/span&gt;&lt;span style=&quot;color:#2300ff&quot;&gt;0&lt;/span&gt;&lt;span style=&quot;color:#000000&quot;&gt;;&lt;/span&gt; i&lt;span style=&quot;color:#000000&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span style=&quot;color:#2300ff&quot;&gt;10000000&lt;/span&gt;&lt;span style=&quot;color:#000000&quot;&gt;; ++&lt;/span&gt;i&lt;span style=&quot;color:#000000&quot;&gt;) {&lt;/span&gt;
    &lt;span style=&quot;color:#000000&quot;&gt;*&lt;/span&gt;y &lt;span style=&quot;color:#000000&quot;&gt;=&lt;/span&gt; a &lt;span style=&quot;color:#000000&quot;&gt;-&lt;/span&gt; b&lt;span style=&quot;color:#000000&quot;&gt;;&lt;/span&gt;
  &lt;span style=&quot;color:#000000&quot;&gt;}&lt;/span&gt;

  &lt;span style=&quot;color:#8f0055&quot;&gt;return&lt;/span&gt; &lt;span style=&quot;color:#2300ff&quot;&gt;0&lt;/span&gt;&lt;span style=&quot;color:#000000&quot;&gt;;&lt;/span&gt;
&lt;span style=&quot;color:#000000&quot;&gt;}&lt;/span&gt;
&lt;/pre&gt;&lt;br /&gt;
However, the vendor compiler for Itanium, Intel&#039;s icc, knows about this. When you use -O3 with icc on ia64, it enables flush to zero mode, which causes all denormal results to be flushed to zero. Use the same compiler on x86_64 and it won&#039;t, because x86_64 handles denormals faster. The slowdown is still measurable there, but not as much of a problem.&lt;br /&gt;
&lt;br /&gt;
I saw this problem in a test case that expected a denormal number as a result, and therefore failed (only) on ia64 with optimisation. It&#039;s still possible to get icc to optimise without enabling flush to zero mode, by specifically disabling it with the -no-ftz flag.&lt;br /&gt;
&lt;br /&gt;
 
    </content:encoded>

    <pubDate>Mon, 17 Dec 2007 15:56:00 +0100</pubDate>
    <guid isPermaLink="false">http://portabilityblog.com/blog/archives/5-guid.html</guid>
    <category>floating point</category>
<category>ia64</category>
<category>icc</category>

</item>
<item>
    <title>sizeof(long)</title>
    <link>http://portabilityblog.com/blog/archives/3-sizeoflong.html</link>
            <category>Compilers</category>
    
    <comments>http://portabilityblog.com/blog/archives/3-sizeoflong.html#comments</comments>
    <wfw:comment>http://portabilityblog.com/blog/wfwcomment.php?cid=3</wfw:comment>

    <slash:comments>0</slash:comments>
    <wfw:commentRss>http://portabilityblog.com/blog/rss.php?version=2.0&amp;type=comments&amp;cid=3</wfw:commentRss>
    

    <author>nospam@example.com (Daniel Fischer)</author>
    <content:encoded>
    It should be an offense to rely on the size of a given type in C being the same across different platforms. Still, certain assumptions appear to be fairly common. One of them is the value of sizeof(long). I think we&#039;ve gotten over the idea that long and int are the same size now that 64 bit platforms are becoming more and more common. However, occasionally I still encounter a similar misconception: That the sizes of long and any pointer type are the same.&lt;br /&gt;
&lt;br /&gt;
&lt;pre style=&quot;font-size:9pt;&quot;&gt;&lt;span style=&quot;color:#733710&quot;&gt;#include &amp;lt;stdio.h&amp;gt;&lt;/span&gt;
/* sizeof yields a size_t, so cast it to match the %d conversion */
&lt;span style=&quot;color:#733710&quot;&gt;#define S(X) printf(#X&lt;/span&gt; &lt;span style=&quot;color:#733710&quot;&gt;&amp;quot; %d&lt;/span&gt;&lt;span style=&quot;color:#000000&quot;&gt;\n&lt;/span&gt;&lt;span style=&quot;color:#733710&quot;&gt;&amp;quot;&lt;/span&gt;&lt;span style=&quot;color:#733710&quot;&gt;, (int) sizeof(X))&lt;/span&gt;

&lt;span style=&quot;color:#8f0055&quot;&gt;int&lt;/span&gt; &lt;span style=&quot;color:#000000&quot;&gt;main&lt;/span&gt;&lt;span style=&quot;color:#000000&quot;&gt;() {&lt;/span&gt;
  &lt;span style=&quot;color:#000000&quot;&gt;S&lt;/span&gt;&lt;span style=&quot;color:#000000&quot;&gt;(&lt;/span&gt;&lt;span style=&quot;color:#8f0055&quot;&gt;int&lt;/span&gt; &lt;span style=&quot;color:#000000&quot;&gt;*);&lt;/span&gt;
  &lt;span style=&quot;color:#000000&quot;&gt;S&lt;/span&gt;&lt;span style=&quot;color:#000000&quot;&gt;(&lt;/span&gt;&lt;span style=&quot;color:#8f0055&quot;&gt;long&lt;/span&gt;&lt;span style=&quot;color:#000000&quot;&gt;);&lt;/span&gt;
  &lt;span style=&quot;color:#8f0055&quot;&gt;return&lt;/span&gt; &lt;span style=&quot;color:#2300ff&quot;&gt;0&lt;/span&gt;&lt;span style=&quot;color:#000000&quot;&gt;;&lt;/span&gt;
&lt;span style=&quot;color:#000000&quot;&gt;}&lt;/span&gt;
&lt;/pre&gt;&lt;br /&gt;
&lt;br /&gt;
This snippet will tell you that both a long and a pointer to int are of size 4 on most 32 bit platforms, and that both are of size 8 on most 64 bit platforms. There is one notable platform where this isn&#039;t the case. When compiled with Visual Studio&#039;s cl.exe on 64 bit Windows, the size of the pointer will be 8, but the size of the long will be 4.&lt;br /&gt;
&lt;br /&gt;
According to the C standard, this is perfectly legal. In reality, I&#039;ve seen variables of type long used to store pointers, or anything else that fits into a long on one of the other platforms. Please stop doing that: it&#039;s wrong, and it will break on Windows. 
    </content:encoded>

    <pubDate>Sun, 16 Dec 2007 11:00:00 +0100</pubDate>
    <guid isPermaLink="false">http://portabilityblog.com/blog/archives/3-guid.html</guid>
    <category>64 bit</category>
<category>visual studio</category>
<category>windows</category>

</item>

</channel>
</rss>