The Case Against 64-bit Builds
I wrote a version of this some time ago as an argument against what I
considered an ill-considered push, within the company I worked for, to
'modernize' the software on a particular high-availability,
high-reliability product line. That push seemed not to take into
account the considerable costs and side effects such a change would
have on the somewhat monolithic products in question. The argument
doesn't necessarily apply to software in general, but it should at
least be considered wherever you have a choice of build size.
Leaving aside esoteric number-crunching applications, where 64-bit
(default) integers are actually required, or at least advantageous,
the main reason to go to a '64-bit' build is to gain additional
address space. So-called 32-bit builds offer at most a 4 GB address
space, and usually less once realized on a real platform. (For
performance reasons, PPC and Intel Linux platforms often give
processes only 3 GB of address space, and MIPS Linux platforms only
2 GB.) It isn't all that hard anymore to craft a large application
that starts bumping its head on that ceiling in a 32-bit process.
Systems these days often have more than 4 GB of RAM, so a single
32-bit process cannot use all of it even if it wanted to.
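As a rough, purely illustrative sketch (not part of the original
argument), here is a tiny C program that probes how much address
space one process can actually obtain; built 32-bit (e.g. with
gcc -m32) it stops somewhere at or below the ceilings just described:

    /* probe_as.c: keep reserving 16 MB chunks until malloc() fails, then
     * report roughly how much address space the process was given.
     * The memory is never touched or freed; this measures address space,
     * not physical RAM, and the OS reclaims everything at exit. */
    #include <stdio.h>
    #include <stdlib.h>

    int main(void)
    {
        const size_t chunk = 16UL * 1024 * 1024;
        size_t total = 0;

        while (malloc(chunk) != NULL)
            total += chunk;

        printf("ran out of address space after ~%zu MB\n",
               total / (1024 * 1024));
        return 0;
    }

Built 64-bit, the same loop runs until the kernel's overcommit limits
intervene, which is the 'effectively unlimited' behaviour discussed
next.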
64-bit processes offer the advantage of effectively unlimited address
space, more than can physically be populated anyway, and so let a
single process use all the RAM that is available. (This is the exact
same argument that was made for 32-bit processes not all that many
years ago.)
Paradoxically, this same 'advantage' can also be a pretty
substantial disadvantage! To wit:
- A process is able to use all the RAM that is available. What if
it's not supposed to? Bugs happen. Ulimit quotas can be used to
prevent this, but they require extra effort to ensure that the limit
is reasonable, and that it is kept up to date as the code evolves, or
else the 'cure' will be worse than the disease. (A setrlimit() sketch
follows this list.) Usually all processes in a 64-bit environment end
up being 64-bit processes, so all of them need quotas if ultimate
system reliability is to be ensured. (This isn't really new.)
- You usually don't want to have a single giant process
with a bazillion threads in it. The threads are all vulnerable to
each other, and the larger the pie gets the more likely someone is
to drop it. A large 32-bit multifunction (multithreaded?) process
that is straining at the seams should be chopped up, if possible,
rather than having its address space made larger.
- 32-bit processes that are converted to 64-bit processes are
immediately larger than they were, even with no functional changes,
because all default integers and pointers are twice the size they
were before. With a fixed amount of RAM in the target system, that
means you're using what you have less efficiently than before, and
unless you have a task that itself must access more than 4 GB (3 GB?
2 GB?) of resources there's no inherent value in the move. Such tasks
are actually fairly rare, statistically. (The sizeof sketch after
this list shows how a typical structure doubles.)
This less-semantically-dense code and data not only effectively
reduces available RAM, it reduces the effective cache size of the
CPU, and the effective speed of the CPU, since the instruction stream
is bigger but the hardware's fetch rate is unchanged. It also
increases the size of a build's on-disk (in-flash) footprint, which
means slower software installation, and potentially an installation
that fails outright for lack of space. These effects are all bad.
- Core files can be large. Very large. That makes them
difficult and slow to handle, as they're usually full of a lot of
uninteresting information too. Huge core files are more
likely to be truncated due to a lack of system resources, both
on-DUT and off, and truncated core files are usually useless.
- Though it's a bad idea, older code often contains a lot of casts
that assume a pointer can be jammed into a U32. Some of these are
insidious, and can take a lot of time to shake out of what is
otherwise perfectly functional and reliable code. More work, for zero
functional gain. (The sizeof sketch after this list also shows the
offending pattern and its portable uintptr_t replacement.)
- Upgrades from 32-bit to 64-bit builds may cause problems with
saved data structures.
- In-service upgrades (from 32-bit to 64-bit) may not even be possible.
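On the quota point above: one way to impose such a limit from inside
the program is setrlimit(); the same thing can be done from a shell
wrapper with 'ulimit -v'. A minimal sketch, with the 512 MB figure
chosen purely for illustration:

    /* cap_as.c: cap this process's address space before it does any real
     * work, so that a runaway (64-bit) process fails its own allocations
     * instead of quietly absorbing all the RAM in the system. */
    #include <stdio.h>
    #include <sys/resource.h>

    int main(void)
    {
        struct rlimit rl;

        rl.rlim_cur = 512UL * 1024 * 1024;   /* soft limit: 512 MB (illustrative) */
        rl.rlim_max = 512UL * 1024 * 1024;   /* hard limit: same */

        if (setrlimit(RLIMIT_AS, &rl) != 0) {
            perror("setrlimit");
            return 1;
        }

        /* ... the application's real work goes here; allocations beyond
         * the cap now fail cleanly instead of starving everyone else. */
        return 0;
    }

The hard part, as noted above, isn't the call; it's choosing a limit
that stays correct as the code grows.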
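And to make the 'immediately larger' and 'insidious cast' points
concrete, here is a small sketch; the structure and the U32 typedef
are invented for illustration. Built with gcc -m32 the structure is
16 bytes; built with gcc -m64 the same source yields 32 bytes, and
the old pointer-into-a-U32 habit silently truncates:

    /* size64.c: the same source under ILP32 vs. LP64 (gcc -m32 / -m64). */
    #include <stdio.h>
    #include <stdint.h>

    typedef uint32_t U32;            /* the legacy 32-bit 'handle' type */

    struct node {                    /* a typical pointer-heavy record */
        struct node *next;
        struct node *prev;
        char        *name;
        long         flags;
    };                               /* 16 bytes on ILP32, 32 bytes on LP64 */

    int main(void)
    {
        struct node n = { 0 };

        printf("sizeof(struct node) = %zu bytes\n", sizeof(struct node));

    #if 0
        /* The insidious legacy cast: harmless in a 32-bit build, silently
         * throws away the upper half of the pointer in a 64-bit build. */
        U32 handle32 = (U32)&n;
    #endif

        /* The portable replacement: an integer defined to hold a pointer. */
        uintptr_t handle = (uintptr_t)&n;
        struct node *back = (struct node *)handle;

        printf("pointer round-trip intact: %s\n", back == &n ? "yes" : "no");
        return 0;
    }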
IMHO, the engineering effort needed to convert to a 64-bit build
might be better spent on slicing the product into multiple 32-bit
pieces. That would offer the following real, and potential,
advantages:
- Separate processes could be implemented as separate programs,
which opens up the possibility that these could be individually
replaced. (Incremental upgrades? In-service upgrades?
Individual subsystem restarts? Such features would require
additional work, and are not strictly necessary, but do become
possible once the surgery has been done.)
- Malfunctioning subsystems don't necessarily take down other
features. If one crashes, the rest of the system can continue on
undisturbed. (At the least this offers the choice of continuing to
run at reduced functionality until a maintenance window opens up,
should you want to implement that.)
- Memory and cache utilization remain as efficient as before, though
there is some per-process overhead that a single large process
wouldn't have. The system provides the most perceived functionality
this way.
- Core files can be smaller! If something crashes you only get the
core file from the crashing process, and not all the data from every
other unrelated subsystem that is part of your product. You can
afford to keep more core files than before, and both on-DUT and
off-DUT resource loads are minimized.
- There's no need to re-write already-functional support code that
assumes 32 bits, with the attendant risk of collateral damage.
- By considering the necessary interactions among major components,
and designing explicit interfaces to accomplish them rather than
reaching directly into one large shared address space, reliability
(and testability) is ultimately enhanced. Complexity is the bane of
reliability, and 'wide' interfaces like shared libraries and direct
access to data structures are ultimately more complex, due to the
lack of constraints. (A sketch of such a narrow interface follows
this list.)
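None of the following appears in the original argument, but as a
sketch of what a deliberately narrow interface between two 32-bit
processes might look like, here are a parent and child exchanging a
tiny fixed-format request and response over a socketpair (the message
layout and the doubling 'work' are invented for illustration):

    /* iface.c: two processes, one small explicit contract between them,
     * instead of one big process where everything can touch everything. */
    #include <stdint.h>
    #include <stdio.h>
    #include <sys/socket.h>
    #include <sys/types.h>
    #include <sys/wait.h>
    #include <unistd.h>

    /* The entire interface: small, explicit, versionable, testable. */
    struct request  { uint32_t op;     uint32_t arg;   };
    struct response { uint32_t status; uint32_t value; };

    int main(void)
    {
        int sv[2];

        if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0) {
            perror("socketpair");
            return 1;
        }

        pid_t pid = fork();
        if (pid == 0) {                       /* child: the 'subsystem' */
            struct request  req;
            struct response rsp;

            close(sv[0]);
            if (read(sv[1], &req, sizeof req) == (ssize_t)sizeof req) {
                rsp.status = 0;
                rsp.value  = req.arg * 2;     /* stand-in for real work */
                write(sv[1], &rsp, sizeof rsp);
            }
            return 0;
        }

        /* parent: the 'client' side of the interface */
        struct request  req = { 1, 21 };
        struct response rsp = { 0, 0 };

        close(sv[1]);
        write(sv[0], &req, sizeof req);
        if (read(sv[0], &rsp, sizeof rsp) == (ssize_t)sizeof rsp)
            printf("status=%u value=%u\n",
                   (unsigned)rsp.status, (unsigned)rsp.value);
        waitpid(pid, NULL, 0);
        return 0;
    }

A crash on either side of this boundary takes only its own process
with it, and the interface itself is something you can exercise in
isolation under test.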