<?xml version="1.0" encoding="utf-8" ?>

<rss version="2.0" 
   xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
   xmlns:admin="http://webns.net/mvcb/"
   xmlns:dc="http://purl.org/dc/elements/1.1/"
   xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
   xmlns:wfw="http://wellformedweb.org/CommentAPI/"
   xmlns:content="http://purl.org/rss/1.0/modules/content/"
   >
<channel>
    <title>Portability Blog (Entries tagged as x86_64)</title>
    <link>http://portabilityblog.com/blog/</link>
    <description>tales about building software on many platforms</description>
    <dc:language>en</dc:language>
    <generator>Serendipity 1.2.1 - http://www.s9y.org/</generator>
    <managingEditor>df.portabilityblog@erinye.com</managingEditor>
<webMaster>df.portabilityblog@erinye.com</webMaster>
<pubDate>Fri, 29 May 2009 11:34:46 GMT</pubDate>

    <image>
        <url>http://portabilityblog.com/blog/templates/default/img/s9y_banner_small.png</url>
        <title>RSS: Portability Blog - tales about building software on many platforms</title>
        <link>http://portabilityblog.com/blog/</link>
        <width>100</width>
        <height>21</height>
    </image>

<item>
    <title>Bad register name &quot;dil&quot; (or &quot;sil&quot;)</title>
    <link>http://portabilityblog.com/blog/archives/11-Bad-register-name-dil-or-sil.html</link>
            <category>CPUs</category>
    
    <comments>http://portabilityblog.com/blog/archives/11-Bad-register-name-dil-or-sil.html#comments</comments>
    <wfw:comment>http://portabilityblog.com/blog/wfwcomment.php?cid=11</wfw:comment>

    <slash:comments>2</slash:comments>
    <wfw:commentRss>http://portabilityblog.com/blog/rss.php?version=2.0&amp;type=comments&amp;cid=11</wfw:commentRss>
    

    <author>nospam@example.com (Daniel Fischer)</author>
    <content:encoded>
    This is a piece of code from a project that runs on both the x86 and x86_64 architectures:&lt;br /&gt;
&lt;br /&gt;
&lt;pre style=&quot;font-size:9pt;&quot;&gt;&lt;span style=&quot;color:#8f0055&quot;&gt;static&lt;/span&gt; &lt;span style=&quot;color:#8f0055&quot;&gt;inline&lt;/span&gt; &lt;span style=&quot;color:#8f0055&quot;&gt;int&lt;/span&gt; &lt;span style=&quot;color:#000000&quot;&gt;swap_int&lt;/span&gt;&lt;span style=&quot;color:#000000&quot;&gt;(&lt;/span&gt;&lt;span style=&quot;color:#8f0055&quot;&gt;int&lt;/span&gt; &lt;span style=&quot;color:#000000&quot;&gt;*&lt;/span&gt;a&lt;span style=&quot;color:#000000&quot;&gt;,&lt;/span&gt; &lt;span style=&quot;color:#8f0055&quot;&gt;int&lt;/span&gt; b&lt;span style=&quot;color:#000000&quot;&gt;) {&lt;/span&gt;
    &lt;span style=&quot;color:#8f0055&quot;&gt;asm&lt;/span&gt; &lt;span style=&quot;color:#8f0055&quot;&gt;volatile&lt;/span&gt; &lt;span style=&quot;color:#000000&quot;&gt;(&lt;/span&gt;&lt;span style=&quot;color:#c00000&quot;&gt;&amp;quot;xchg %0, %1&amp;quot;&lt;/span&gt; &lt;span style=&quot;color:#000000&quot;&gt;:&lt;/span&gt; &lt;span style=&quot;color:#c00000&quot;&gt;&amp;quot;+r&amp;quot;&lt;/span&gt; &lt;span style=&quot;color:#000000&quot;&gt;(&lt;/span&gt;b&lt;span style=&quot;color:#000000&quot;&gt;) ,&lt;/span&gt; &lt;span style=&quot;color:#c00000&quot;&gt;&amp;quot;+m&amp;quot;&lt;/span&gt; &lt;span style=&quot;color:#000000&quot;&gt;(*&lt;/span&gt;a&lt;span style=&quot;color:#000000&quot;&gt;));&lt;/span&gt;
    &lt;span style=&quot;color:#8f0055&quot;&gt;return&lt;/span&gt; b&lt;span style=&quot;color:#000000&quot;&gt;;&lt;/span&gt;
&lt;span style=&quot;color:#000000&quot;&gt;}&lt;/span&gt;
&lt;/pre&gt;&lt;br /&gt;
&lt;br /&gt;
It&#039;s fairly easy to see what it does: it exchanges &lt;code&gt;b&lt;/code&gt; with the &lt;code&gt;int&lt;/code&gt; stored at &lt;code&gt;*a&lt;/code&gt; and returns the old value. This code works perfectly fine on both architectures, provided that you&#039;re using a compiler that understands this form of the &lt;code&gt;asm&lt;/code&gt; statement, such as gcc. Later, this similar piece of code appears:&lt;br /&gt;
&lt;br /&gt;
&lt;pre style=&quot;font-size:9pt;&quot;&gt;&lt;span style=&quot;color:#8f0055&quot;&gt;static&lt;/span&gt; &lt;span style=&quot;color:#8f0055&quot;&gt;inline&lt;/span&gt; &lt;span style=&quot;color:#8f0055&quot;&gt;int&lt;/span&gt; &lt;span style=&quot;color:#000000&quot;&gt;swap_char&lt;/span&gt;&lt;span style=&quot;color:#000000&quot;&gt;(&lt;/span&gt;&lt;span style=&quot;color:#8f0055&quot;&gt;char&lt;/span&gt; &lt;span style=&quot;color:#000000&quot;&gt;*&lt;/span&gt;a&lt;span style=&quot;color:#000000&quot;&gt;,&lt;/span&gt; &lt;span style=&quot;color:#8f0055&quot;&gt;char&lt;/span&gt; b&lt;span style=&quot;color:#000000&quot;&gt;) {&lt;/span&gt;
    &lt;span style=&quot;color:#8f0055&quot;&gt;asm&lt;/span&gt; &lt;span style=&quot;color:#8f0055&quot;&gt;volatile&lt;/span&gt; &lt;span style=&quot;color:#000000&quot;&gt;(&lt;/span&gt;&lt;span style=&quot;color:#c00000&quot;&gt;&amp;quot;xchg %0, %1&amp;quot;&lt;/span&gt; &lt;span style=&quot;color:#000000&quot;&gt;:&lt;/span&gt; &lt;span style=&quot;color:#c00000&quot;&gt;&amp;quot;+r&amp;quot;&lt;/span&gt; &lt;span style=&quot;color:#000000&quot;&gt;(&lt;/span&gt;b&lt;span style=&quot;color:#000000&quot;&gt;) ,&lt;/span&gt; &lt;span style=&quot;color:#c00000&quot;&gt;&amp;quot;+m&amp;quot;&lt;/span&gt; &lt;span style=&quot;color:#000000&quot;&gt;(*&lt;/span&gt;a&lt;span style=&quot;color:#000000&quot;&gt;));&lt;/span&gt;
    &lt;span style=&quot;color:#8f0055&quot;&gt;return&lt;/span&gt; b&lt;span style=&quot;color:#000000&quot;&gt;;&lt;/span&gt;
&lt;span style=&quot;color:#000000&quot;&gt;}&lt;/span&gt;
&lt;/pre&gt;&lt;br /&gt;
&lt;br /&gt;
For the most part, this code still compiles and works just fine. But when optimisation is turned on, you may get this error when building for x86 with gcc:&lt;br /&gt;
&lt;br /&gt;
&lt;blockquote&gt;&lt;code&gt;Error: bad register name `%dil&#039;&lt;/code&gt;&lt;/blockquote&gt;&lt;br /&gt;
&lt;br /&gt;
Then you try the same on x86_64 and the error is gone. No wonder: as opposed to x86, the x86_64 architecture actually has the &lt;code&gt;%dil&lt;/code&gt; register.&lt;br /&gt;
&lt;br /&gt;
At first, this appears to be a compiler bug. After all, the compiler is choosing a register that doesn&#039;t exist on x86. On closer look, it&#039;s a bug in the example code. The issue is that the &lt;code&gt;b&lt;/code&gt; argument uses the &lt;code&gt;r&lt;/code&gt; constraint, which tells the compiler that the value may be stored in any general-purpose register. In the first example, this is just fine: any of them will do for 32-bit operations. The second example, however, requires a register whose lower byte is accessible. On x86, this is true for only four general-purpose registers: EAX, EBX, ECX and EDX (as &lt;code&gt;%al&lt;/code&gt;, &lt;code&gt;%bl&lt;/code&gt;, &lt;code&gt;%cl&lt;/code&gt; and &lt;code&gt;%dl&lt;/code&gt;). ESI and EDI are general-purpose registers on x86 too, but their lower bytes only become accessible on x86_64, as &lt;code&gt;%sil&lt;/code&gt; and &lt;code&gt;%dil&lt;/code&gt;.&lt;br /&gt;
&lt;br /&gt;
So what happens is that the compiler correctly chooses the &lt;code&gt;%edi&lt;/code&gt; register, which satisfies the &lt;code&gt;r&lt;/code&gt; constraint. Because &lt;code&gt;b&lt;/code&gt; is a byte-sized &lt;code&gt;char&lt;/code&gt;, the compiler then replaces the register placeholder &lt;code&gt;%0&lt;/code&gt; with the 8-bit name of that register, which is &lt;code&gt;%dil&lt;/code&gt;, so that &lt;code&gt;xchg&lt;/code&gt; operates on two byte-sized values. During assembly, this fails because &lt;code&gt;%dil&lt;/code&gt; doesn&#039;t actually exist on x86.&lt;br /&gt;
&lt;br /&gt;
If there is a compiler bug, it is only that the error output is misleading. Instead of even trying to use &lt;code&gt;%dil&lt;/code&gt;, the compiler should warn about the real problem.&lt;br /&gt;
&lt;br /&gt;
The real bug is that in the example source, a byte-sized argument was qualified with a constraint that allows any general-purpose register to be used. Instead, the set should be restricted to registers whose lower byte is available. In gcc, this can be achieved by using the &lt;code&gt;q&lt;/code&gt; constraint instead of &lt;code&gt;r&lt;/code&gt;; on x86, &lt;code&gt;q&lt;/code&gt; limits the choice to the a, b, c and d registers.
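&lt;br /&gt;
&lt;br /&gt;
As a minimal sketch of that fix, assuming gcc or a compatible compiler, &lt;code&gt;swap_char&lt;/code&gt; could be written with &lt;code&gt;q&lt;/code&gt; roughly like this. The &lt;code&gt;__asm__&lt;/code&gt; spelling and the plain C fallback for non-x86 targets are assumptions added for illustration and are not part of the original project. Unlike &lt;code&gt;xchg&lt;/code&gt; with a memory operand, the fallback is not atomic.&lt;br /&gt;
&lt;br /&gt;
&lt;pre style=&quot;font-size:9pt;&quot;&gt;
```c
/* Sketch of swap_char with the byte-register constraint.
   Same signature as in the post: stores b in *a, returns the old *a. */
static inline int swap_char(char *a, char b) {
#if defined(__i386__) || defined(__x86_64__)
    /* "+q" only allows registers whose low byte is addressable:
       a, b, c, d in 32-bit mode, any integer register in 64-bit mode. */
    __asm__ volatile ("xchg %0, %1" : "+q" (b), "+m" (*a));
#else
    /* Assumed fallback for other targets: plain, non-atomic exchange. */
    char old = *a;
    *a = b;
    b = old;
#endif
    return b;
}
```
&lt;/pre&gt;&lt;br /&gt;
&lt;br /&gt;
With a &lt;code&gt;char c&lt;/code&gt; holding &lt;code&gt;&#039;A&#039;&lt;/code&gt;, calling &lt;code&gt;swap_char(&amp;amp;c, &#039;B&#039;)&lt;/code&gt; leaves &lt;code&gt;&#039;B&#039;&lt;/code&gt; in &lt;code&gt;c&lt;/code&gt; and returns &lt;code&gt;&#039;A&#039;&lt;/code&gt;. Keeping &lt;code&gt;r&lt;/code&gt; for &lt;code&gt;swap_int&lt;/code&gt; is fine, since every general-purpose register can hold a 32-bit operand.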
    </content:encoded>

    <pubDate>Fri, 29 May 2009 13:09:23 +0200</pubDate>
    <guid isPermaLink="false">http://portabilityblog.com/blog/archives/11-guid.html</guid>
    <category>32 bit</category>
<category>64 bit</category>
<category>gcc</category>
<category>x86</category>
<category>x86_64</category>

</item>

</channel>
</rss>