This is a piece of code from a project that runs on both the x86 and x86_64 architectures:
static inline int swap_int(int *a, int b) {
    asm volatile ("xchg %0, %1" : "+r" (b), "+m" (*a));
    return b;
}
It's fairly easy to see what it does: it swaps two values of type int, storing b in *a and returning the value that *a held before. This code works perfectly fine on both architectures, provided that you're using a compiler that understands the asm statement, such as gcc. Later, this similar piece of code appears:
static inline int swap_char(char *a, char b) {
    asm volatile ("xchg %0, %1" : "+r" (b), "+m" (*a));
    return b;
}
This code still compiles and works just fine, for the most part. But when optimisation is turned on, you may get this error from gcc (strictly speaking, from the assembler it invokes) when building for x86:
Error: bad register name `%dil'
Then you try the same on x86_64 and the error is gone. No wonder: unlike x86, the x86_64 architecture actually has the %dil register.
At first, this appears to be a compiler bug. After all, the compiler is choosing a register name that doesn't exist on x86. On closer look, it's a bug in the example code. The issue is that for the b argument, the constraint r is used, indicating that the value may be stored in any general-purpose register. In the first example, this is just fine: any of them will do for 32-bit operations. The second example, however, requires a register whose low byte is accessible. On x86, there are only four general-purpose registers where this is true: EAX, EBX, ECX and EDX. On x86_64, this is also true for ESI and EDI, which are general-purpose registers on x86 as well but have no byte-sized form there.
So what happens is that the compiler correctly chooses the %edi register, which satisfies the r constraint. Later, the xchg instruction is interpreted as referring to two byte-sized values due to the size of the arguments *a and b. Thus, the compiler translates the instruction to its 8-bit form and replaces the register placeholder %0 with the 8-bit form of the %edi register, which is %dil. During assembly, this fails because %dil doesn't actually exist on x86.
If there is a compiler bug, it is only that the error output is misleading. The compiler shouldn't even try to use %dil; it should warn about the real problem.
The real bug is that in the example source, a byte-sized argument was given a constraint that allows any general-purpose register to be used, where instead the set should be restricted to registers whose low byte is accessible. In gcc, this can be achieved by using the q constraint instead of r.