How to Migrate from HP-UX to Linux: Undefined Behavior

The C and C++ languages don’t try to define the results of every syntactically correct program.  For example, if you dynamically allocate some memory, and then free it, and then try to access the memory you freed, you invoke undefined behavior.  The C and C++ standards don’t specify how the program will respond.

Undefined behavior means that anything can happen, because the standards place no requirements on what the compiler or the running program does.   The traditional formulation is that undefined behavior can make demons fly out your nose.  In practice the consequences are usually less dramatic.

If demons ever fly out your nose, you’ll know you have a bug.  You can track it down and fix it.  More insidious is undefined behavior that happens to be exactly what you want.

I ran across an example as I was preparing to port some code from HP-UX to Linux.  The program was freeing a linked list, using code similar to the following:

 

    Node * curr_node = first_node;
    while( curr_node )
    {
        free( curr_node );
        curr_node = curr_node->next;
    }

 

To a long-time C coder, this code immediately looks fishy, because the loop has only two statements in it.  Look a little closer.  The second statement reads curr_node->next after the node that curr_node points to has already been freed.  That read invokes undefined behavior.

This program has been running for years with no obvious ill effects from this bug.  Apparently HP-UX isn’t very persnickety about accessing previously freed memory.  That’s legal.  “Anything can happen” includes “what you want.”

When I first saw this code, I wasn’t ready to port the entire program to Linux yet, but I could experiment.  I dashed off a little test program that built a linked list and then freed it, using the logic shown above.  Under HP-UX this program ran to completion without incident.  Under Linux, the same program stopped abruptly in the first iteration.  It didn’t issue any messages, dump core, or even return a non-zero exit status; it just stopped cold.  That’s legal too.  Anything can happen.

I don’t know whether this difference is attributable to the operating systems, the compilers, the libraries, or the machine architectures.  I don’t care.  What matters is that I can’t run this program under Linux without fixing the loop:

 

    Node * curr_node = first_node;
    while( curr_node )
    {
        Node * temp = curr_node->next;
        free( curr_node );
        curr_node = temp;
    }
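
For anyone who wants to repeat the experiment, here is a minimal sketch of the corrected logic wrapped in functions that can be compiled and called from a small test program.  The layout of Node, and the names build_list and free_list, are my assumptions for illustration; the real program’s types are not shown here.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical node type; the original program's Node is not shown. */
typedef struct Node {
    int value;
    struct Node *next;
} Node;

/* Free every node, saving ->next BEFORE the node is released.
   Returns the number of nodes freed. */
size_t free_list(Node *first_node)
{
    size_t freed = 0;
    Node *curr_node = first_node;

    while (curr_node) {
        Node *temp = curr_node->next;  /* read while the node is still valid */
        free(curr_node);
        curr_node = temp;
        ++freed;
    }
    return freed;
}

/* Build a list holding the values 0 .. count-1 (hypothetical helper).
   Returns NULL if count is 0 or if an allocation fails. */
Node *build_list(size_t count)
{
    Node *head = NULL;

    /* Insert at the front, from the last value down to the first. */
    for (size_t i = count; i > 0; --i) {
        Node *node = malloc(sizeof *node);
        if (node == NULL) {
            free_list(head);  /* release the partial list */
            return NULL;
        }
        node->value = (int)(i - 1);
        node->next = head;
        head = node;
    }
    return head;
}
```

Calling free_list( build_list( 5 ) ) should free five nodes on any conforming platform, because the loop never touches a node after releasing it.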

 

A few days later, another example of undefined behavior popped up, in the form of a buffer overflow.  Under HP-UX the overflow had no visible effect, at least not until I started poking around with printf statements.  Under Linux the code just didn’t work.  Probably the variables are arranged differently in memory.  In HP-UX the overflow didn’t damage anything that mattered, and in Linux it did.
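
The overflowing code isn’t reproduced here, so the following is only a hypothetical sketch of one common remedy: a string copy that is told how big the destination is and refuses to write past it.  The name copy_bounded and its return convention are assumptions made for illustration, not anything taken from the program being ported.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: copy src into dst without writing more than
   dst_size bytes, always leaving dst NUL-terminated.
   Returns 1 if the whole string fit, 0 if it was truncated or the
   arguments were unusable. */
int copy_bounded(char *dst, size_t dst_size, const char *src)
{
    if (dst == NULL || src == NULL || dst_size == 0)
        return 0;

    /* snprintf writes at most dst_size bytes, including the NUL,
       and returns the length the full string would have needed. */
    int needed = snprintf(dst, dst_size, "%s", src);

    return needed >= 0 && (size_t)needed < dst_size;
}
```

With an eight-byte buffer, a copy of "short" fits and the function returns 1, while a longer string is cut to seven characters plus the terminator and the function returns 0, so the caller finds out about the problem on every platform instead of only on the unlucky one.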

These examples are just things that I stumbled across.  There will be more, and I won’t catch them all so painlessly.  Fancy code analyzers may help catch things in advance, but there is no substitute for vigilance.

It’s tempting to conclude that HP-UX is more forgiving of blunders than Linux is, since some things work in HP-UX but don’t work in Linux.  That conclusion is premature.  Maybe the two platforms are just forgiving about different things.  If the bugs had done obvious damage under HP-UX they would have been fixed already.

There’s a Darwinian process at work here.  Bugs survive when they’re well adapted to the environment.  When the environment changes, some of those bugs will go extinct.  Unfortunately, new species will probably replace them.