Friday, December 28. 2007
O_DIRECT
Posted by Daniel Fischer in Operating Systems at 11:03
Defined tags for this entry: linux
On UNIXoid operating systems, you can open(2) a file with many different flags. On some operating systems, one of them is O_DIRECT, which stands for direct I/O that bypasses the operating system's cache. To sum it all up:
The whole notion of "direct IO" is totally braindamaged. Just say no. --Linus Torvalds
Accessing files that were opened with O_DIRECT requires aligned buffers for reading and writing. For example, on Linux 2.6, buffers and file offsets must be aligned to the logical block size of the underlying storage, which is typically 512 bytes, and reads and writes can only happen in multiples of that size. It's fairly easy to align your data to 512 bytes, though.
On Linux 2.4, on the other hand, buffers have to be aligned to multiples of the underlying file system's logical block size, which is generally much larger than 512 bytes. The transfer size also has to be a multiple of that block size. It's not so easy to solve this generally. It's also often forgotten because nobody wants to use Linux 2.4 anymore, at least not for development work.
We ran into this problem at least three times in three different places in 2007. You can tell that I work for a database company.

Saturday, December 22. 2007
socklen_t confusion
Posted by Daniel Fischer in Operating Systems at 14:22
The BSD socket API (accept, bind, and so on) uses a struct sockaddr to pass socket addresses. Additionally, there's a parameter for passing the size of the memory block allocated for a struct sockaddr to the socket functions. This argument is passed as a pointer to the actual size, and upon completion of the API call, will contain the actual length of the address stored in the memory block instead of its size. In the original BSD API, this argument was of type int *.
At one point, a draft of the POSIX.1g standard defined this to be size_t *. This was a bit broken because size_t is usually not the same size as int on 64-bit platforms, so the redefinition was an unintended and incompatible change. The short of it is that people complained, and the type was changed again. Instead of reverting to int *, however, a new type socklen_t was introduced. This new type is defined to have the same size as int on most platforms. As Linus Torvalds puts it,
_Any_ sane library _must_ have "socklen_t" be the same size as int. Anything else breaks any BSD socket layer stuff. POSIX initially did make it a size_t, and I (and hopefully others, but obviously not too many) complained to them very loudly indeed. Making it a size_t is completely broken, exactly because size_t very seldom is the same size as "int" on 64-bit architectures, for example. And it has to be the same size as "int" because that's what the BSD socket interface is.
(Quote taken from man 2 accept on Linux.)
So for a small period of time, operating system vendors were preparing for POSIX.1g as it was known back then and started to use size_t instead of int. As the draft was changed to use socklen_t, and later SUSv2 included socklen_t, they mostly started to use socklen_t and defined it to be the same size as int. Some defined it to be the same type as size_t. One example that is mentioned in Linux's man 2 accept is SunOS 5. This includes the current release of Solaris, which is based on SunOS 5.10. However, that's not a big portability issue. It's broken as described above, but code that uses socklen_t in all places should be just fine.
On HP-UX, you get the worst of both worlds. On the one hand, you can use the old BSD API with pointers to int. On the other hand, you can define _XOPEN_SOURCE_EXTENDED and get the new API with socklen_t. However, if you don't define _XOPEN_SOURCE_EXTENDED, socklen_t is still defined, but as size_t.
The type exists in that case but is completely worthless, as it can't be used with the socket API, which expects int *. I've actually seen code that tripped over this in both possible ways, using int * in one place and socklen_t * in another...

Wednesday, December 19. 2007
UNIX domain sockets
Posted by Daniel Fischer in Operating Systems at 12:05
Unlike sockets in other domains, sockets in the UNIX domain are visible in the file system. Because of this, they're sometimes confused with regular files. When they're not being confused with regular files, their specific restrictions that don't apply to other files are still easily forgotten. One such restriction of a UNIX domain socket is the length of its name.
Path names can be rather long on UNIX-like systems these days. Gone are the days of file names limited to 14 characters, and for path names, POSIX-compliant operating systems generally support at least 256 characters. On many platforms, path names can be even longer than that. For example, PATH_MAX is 1024 on Mac OS X and 4096 on GNU/Linux.
In contrast, the full path name of a UNIX domain socket must fit into a struct sockaddr_un. Its member sun_path generally has much less room for the socket's name than 1024 characters. Typical sizes are 108 (Linux, Solaris, Cygwin), 104 (AIX, BSDs, Mac OS X), and 92 (HP-UX). At some point, it was only 14 in Interix 5.2. This means that, while UNIX domain sockets do appear as if they were files, they can't be placed in an arbitrary location in the file system.
Now, imagine an automated testing process that can test multiple instances of the software at a time, and keeps all files relevant to one test run within one directory. Path names can easily become longer than 92 characters in a scenario like this. In one case, this happened in a system where the name of one such instance's directory didn't have a constant length and occasionally brought the complete path to more than 92 characters, causing seemingly random total failures on HP-UX.

Monday, December 17. 2007
Denormal Numbers
In school, we learned that x - y = 0 is true if, and only if, x = y. In computing, we learned that we can only store so many accurate digits in a fixed-size register. For example, a register that is 8 bits wide can store exactly 256 different values. If we want to represent negative and positive values, there are a number of ways to express that, but we'll still only get 256 different individual values. The way computers typically do it will give us values from -128 to +127.
Now we might want to represent real numbers like 1.5. There's no way to get more than 256 different values from 8 bits, but we could agree that there's a decimal point, and that it is always before the last digit. Instead of -128 to +127, we get -12.8 to +12.7. We traded in range for accuracy. Sometimes, range is more important than accuracy. In such a case, we could pretend that the decimal point is always one digit to the right of the last digit we store, giving us a range from -1280 to +1270. Our range got boosted, but we lost accuracy: we can no longer store numbers like 1234, even though it is in the range of our type. This is aptly referred to as fixed-point arithmetic.
Now, how do we get both range and accuracy? Instead of putting the decimal point in a fixed location, we could store its position together with the actual number. This is called floating-point arithmetic. For example, we could say that we use 2 bits for the position of the decimal point, and 6 bits for the actual number. The way computers really do it is a bit more complex and is defined in IEEE 754.
IEEE 754 contains one detail that might not be obvious. Floating-point numbers are represented by a sign bit, an exponent to a fixed and previously agreed-upon base, and a mantissa. However, the mantissa isn't just a number that is shifted left or right based on the exponent. Instead, it is defined that the mantissa is a number from 1 (inclusive) to 2 (exclusive). Since this means there's always one digit before the decimal point that can only be 1, the part that is stored is only the digits after the decimal point. A number stored like this is called a normal number.
For simplicity, let's assume a similar system based on base 10. Let's say we can store the sign, exponents from -2 to +2, and we have room for a mantissa of three digits. This will let us express numbers with magnitudes from 1.000 * 10 ^ -2 to 1.999 * 10 ^ 2, with either sign. We can express x = 0.012 as 1.2 * 10 ^ -2 and y = 0.0105 as 1.05 * 10 ^ -2.
However, we can't represent x - y = 0.0015! Normalising it and writing it as 1.5 * 10 ^ -3 fails to satisfy the condition we agreed upon previously that we would only have exponents from -2 to +2. We'll have to live with it and just accept that the result of this computation is smaller than the smallest number that we can represent, and replace it with zero. But x and y aren't equal!
This is a terrible situation for scientists, and thus, a solution was quickly found. Basically, it goes like this: We sacrifice one of our possible exponent values, and use it instead to indicate that the number we represent isn't normalised as agreed upon before, and is smaller than the smallest possible normalised number. Such a number is called a denormal number.
You might wonder where the lesson is. After all, the problem seems to be solved. By introducing denormals, we fixed our condition from the first paragraph, and now it holds again for all x and y that can be represented individually. In theory, all major platforms of today support denormals. In practice, many CPUs don't handle them in hardware, but instead trap to some software implementation. Implementing floating-point operations in software can be rather slow.
Programmers, however, don't like their programs running slowly and tell compilers to optimise. Compilers for CPUs that trap when denormals occur know that they're slow, and therefore optimise by turning off denormals altogether. Instead, results of calculations that can't be normalised are flushed to zero.
For example, on Itanium CPUs, the x - y operation can be slower by orders of magnitude if the result is a denormal as compared to the same operation when the result can be expressed as a normal number. Here's some (rather simple) code to try this if you have access to an Itanium box. Compile with gcc, without optimisation, once with -DSMALL, once without, and then compare the run time. It's not a proper benchmark, but should be sufficient to show the problem.

    #include <stdio.h>
    #include <math.h>

    int main() {
        double x, *y;
        y = &x;

    #ifdef SMALL
        /* a - b is 2^-1023, which is a denormal */
        double a = pow(2,-1022) + pow(2,-1023);
        double b = pow(2,-1022);
    #else
        /* a - b is 2^-123, which is a normal number */
        double a = pow(2,-122) + pow(2,-123);
        double b = pow(2,-122);
    #endif

        int i = 0;
        /* repeat the subtraction; the result is written through y */
        for(i=0; i<10000000; ++i) {
            *y = a - b;
        }

        return 0;
    }

However, the vendor compiler for Itanium, Intel's icc, knows about this. When you use -O3 with icc on ia64, it enables flush to zero mode, which causes all denormal results to be flushed to zero. Use the same compiler on x86_64 and it won't, because x86_64 can handle denormals faster. The slowdown is still measurable there, but not as much of a problem.
I saw this problem in a test case that expected a denormal number as a result, and therefore failed (only) on ia64 with optimisation. It's still possible to get icc to optimise without enabling flush to zero mode, by specifically disabling it with the -no-ftz flag.

Sunday, December 16. 2007
sizeof(long)
It should be an offense to rely on a given C type having the same size across different platforms. Still, certain assumptions appear to be fairly common. One of them is the value of sizeof(long). I think we've gotten over the idea that long and int are the same size now that 64 bit platforms are becoming more and more common. However, occasionally I still encounter a similar misconception: that the sizes of long and any pointer type are the same.

    #include <stdio.h>

    /* Print the size of a type. The cast keeps %d correct on platforms
       where size_t is wider than int. */
    #define S(X) printf(#X " %d\n", (int)sizeof(X))

    int main() {
        S(int *);
        S(long);
        return 0;
    }

This snippet will tell you that both a long and a pointer to int are of size 4 on most 32 bit platforms, and that both are of size 8 on most 64 bit platforms. There is one notable platform where this isn't the case. When compiled with Visual Studio's cl.exe on 64 bit Windows, the size of the pointer will be 8, but the size of the long will be 4.
According to the C standard, this is perfectly legal. In reality, I've seen variables of type long used to store pointers, or anything else that fits into a long on one of the other platforms. Please stop doing that. It's wrong, and it will break on Windows.

Saturday, December 15. 2007
transparent_union
Posted by Daniel Fischer in Operating Systems at 14:32
Let's say you're writing a function that accepts a pointer to an int.

    void do_stuff_with_int(int *p) {
        /* ... */
    }

Later, you notice that the doing stuff operation doesn't depend on signedness at all, and decide that you want to use your function with unsigned int too. Just passing a pointer to an unsigned int will cause a warning. How would you get rid of it in a clean way? Easy, use a union.

    typedef union {
        int *si;
        unsigned int *ui;
    } signed_or_unsigned_int_pointer;

Now you have a type that can contain either a pointer to an int or a pointer to an unsigned int. Wouldn't it be convenient if you could use your shiny new type to indicate that do_stuff_with_int() really can accept both types? That's where gcc's transparent_union attribute comes in. (Some other compilers also support this.)

    typedef union {
        int *si;
        unsigned int *ui;
    } signed_or_unsigned_int_pointer
    __attribute__((transparent_union));

    void do_stuff_with_int(signed_or_unsigned_int_pointer p) {
        /* ... */
    }

    int main() {
        int a = 0;
        unsigned int b = 0;

        /* both calls compile without a cast or a warning */
        do_stuff_with_int(&a);
        do_stuff_with_int(&b);

        return 0;
    }

Now, what's the catch? You can only use transparent_union with unions that contain only types with the same representation. In other words, they need to be the same size. You can't, for example, use it with a 32-bit int and a 64-bit integer, in which case gcc will generate a warning. And lots of errors later on, because you're not passing unions to your function, but any one of their constituent types, which isn't allowed without transparent_union.
And now, the reason why this post is related to portability:

    typedef union {
        int an_int;
        long a_long;
    } int_union
    __attribute__((transparent_union));

The transparent_union attribute will be ignored on most 64 bit platforms. Treating int and long as the same type is a beginner's mistake. But what about this code?

    typedef union {
        int volatile *i;
        unsigned int volatile *u;
    } int_union
    __attribute__((transparent_union));

Should work, shouldn't it?
Turns out it doesn't. That is, it works in most places. It even works on Mac OS X, unless you build for the 64-bit PowerPC CPU. In that case, the attribute is ignored, even though the two types are still the same size. This only seems to happen on ppc64. On all three other architectures currently supported by Mac OS X, it works.

    $ gcc -c t.c -arch ppc
    $ gcc -c t.c -arch ppc64
    t.c:6: warning: 'transparent_union' attribute ignored
    $ gcc -c t.c -arch i386
    $ gcc -c t.c -arch x86_64

Saturday, December 15. 2007
Welcome to the Portability Blog!
Posted by Daniel Fischer in Miscellany at 14:26
Hello there. I'm Danny, and I'm going to use this blog to post stories about all the minor and not so minor annoyances in building software on different platforms from the same code base. If you think that doesn't sound like a lot of fun, you're probably not the type of person that enjoys digging into the build process and stepping through each individual stage between "source file" and "binary". I'm not saying that I enjoy it myself, but it's part of what I do for a living. And maybe I do actually enjoy it a little, too.
Blog posts are licensed under a Creative Commons Attribution-Share Alike 2.0 Germany License.

