I have argued that out of the infinite number of bugs in the world, memory safety bugs are the important bugs. They aren’t the only serious bugs in the world, but today they are overwhelmingly the ones that are being exploited to achieve complete control over a target (remote code execution).
There are many ways to mitigate memory safety bugs, but I claim that the best way to address them is to abandon unsafe languages like C/C++ and instead use safe programming languages. I’d like to expand on that argument here.
Consider the prototypical program that contains a memory safety bug:
x[y] = z
For this to be memory safe, we need x to be a pointer to an array and y to be an in-bounds index into the array.
Memory safety is violated when either of these conditions fails. For example, if x is not a pointer to an array but, say, a pointer into the machine code of the program, then this array update might be overwriting the program itself: a classic attack.
On the other hand, if y is an out-of-bounds index to the array, then the array update is not actually updating the array. It is updating some other part of the memory space of the process (possibly code, possibly data). One typical way this happens is through an off-by-one error. Another is an integer overflow: here the index y is a signed, fixed-width integer, and repeated increments eventually turn it negative. Once again, this can be used by attackers to take control of the program.
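The overflow scenario can be sketched concretely. This is a minimal illustration in Rust (my choice of an example safe language here; the names wrapped_index and safe_read are invented for this sketch): repeated increments wrap a signed, fixed-width index negative, but the resulting out-of-bounds access is rejected rather than executed.

```rust
// Sketch of the integer-overflow scenario in a safe language.
// Repeated increments of a signed, fixed-width index eventually wrap
// negative, but the bad access is caught, not executed.

fn wrapped_index(start: i32, increments: u32) -> i32 {
    // In C, signed overflow is undefined behavior; here the wraparound
    // is modeled explicitly with wrapping_add.
    let mut y = start;
    for _ in 0..increments {
        y = y.wrapping_add(1);
    }
    y
}

fn safe_read(x: &[i32], y: i32) -> Option<i32> {
    // usize::try_from fails for negative y; .get() fails for y >= x.len().
    // Either way, the access is rejected instead of reading arbitrary memory.
    usize::try_from(y).ok().and_then(|i| x.get(i).copied())
}

fn main() {
    let x = [10, 20, 30];
    let y = wrapped_index(i32::MAX, 1); // wraps to i32::MIN (negative)
    println!("y = {y}, x[y] = {:?}", safe_read(&x, y));
}
```

In C, the same negative index would silently address memory far outside the array; here the access simply yields nothing.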
To know that x[y] is memory safe, then, we need to know that x points at an array and that whatever arithmetic has been used to compute y keeps it in bounds. This in itself should be a warning flag. Deciding whether arbitrary arithmetic stays in bounds in all cases is mathematically intractable; in general, it is an undecidable problem. Humans writing this code in an unsafe language are very unlikely to get it right in all cases.
The way that a safe programming language handles this is to make sure that whenever a program needs to perform an array access, you have enough information on hand to check that the access is safe. In the case of simple arithmetic, the compiler may have enough mathematical knowledge to guarantee this at compile time (this is ideal: it means that the array access can execute with no overhead compared to C). In the general case, you make sure to have the bounds of x on hand, and you check that y is in-bounds at run time.
Remember that I am speaking from the perspective of the safe language, so when I say “you make sure,” what I mean is that the compiler makes sure that the array access is safe. And that is the first reason that safe languages are the best solution to memory safety.
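The two cases above can be sketched side by side, again using Rust as an illustrative safe language (the helper name element is invented): a compile-time-constant index the compiler can prove safe, and a dynamic index that gets a run-time check against the bounds the language keeps on hand.

```rust
// Static vs. dynamic bounds checking in a safe language (Rust sketch).

fn element(x: &[i32], y: usize) -> Option<i32> {
    // General case: y is not known at compile time, so the bounds of x
    // (every slice carries its length) are checked here at run time.
    x.get(y).copied()
}

fn main() {
    let x = [1, 2, 3];

    // Static case: the index is a constant the compiler can see is in
    // bounds, so the optimized code needs no check at all.
    let first = x[0];
    // let bad = x[5]; // rejected at compile time: index out of bounds

    println!("first = {first}, x[5] = {:?}", element(&x, 5));
}
```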
All programs in safe languages are safe
The interpreter or compiler of a safe language ensures that every program is checked for safety. Potentially unsafe programs are rejected out of hand (e.g., the compiler reports an error). In cases where safety cannot be guaranteed statically, the compiler or interpreter can insert dynamic checks that halt the program (or throw an exception) just before an unsafe memory access would otherwise occur.
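Here is what "halt just before an unsafe memory access" looks like in practice, sketched in Rust (the function try_write is invented for this illustration, and catch_unwind stands in for the runtime stopping the program): the out-of-bounds write is stopped before any memory is touched, and the data is left intact.

```rust
use std::panic;

// A dynamic check halting the program at the faulting access (sketch).

fn try_write(x: &mut [i32], y: usize, z: i32) -> bool {
    // catch_unwind lets us observe the halt; in a real program the
    // panic would simply terminate execution (or unwind to a handler).
    panic::catch_unwind(panic::AssertUnwindSafe(|| {
        x[y] = z; // bounds-checked: panics if y >= x.len()
    }))
    .is_ok()
}

fn main() {
    let mut x = [1, 2, 3];
    let ok = try_write(&mut x, 10, 99); // out of bounds: halted, not executed
    println!("write succeeded: {ok}, x = {:?}", x);
}
```

The key point: after the failed write, x still holds [1, 2, 3]; no neighboring memory was corrupted.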
Again, this applies to all programs. This includes programs written by newbies and by wizards, and tiny programs as well as enormous ones.
In unsafe languages like C/C++, it is of course possible to write safe programs. This is more likely when the program is small, and the programmer is an expert.
Nowadays, however, programs are not small; they are multi-million line programs written by teams of programmers. Most programmers are also not as smart as Dennis Ritchie, and even Dennis dumped core.
There are very few programmers that I would trust to write 50,000 lines of C without a memory safety bug. Actually the number is 0.
On the other hand, I’m confident that any team of programmers can write a very large program in a safe language without ever suffering a memory safety exploit. Out-of-bounds array accesses will still be written into the program, but the language’s checks will stop them before they can execute.
Ensuring the correctness of compilers
As you can see, the safety of safe languages comes from their compiler. If the compiler is implemented correctly, then we are in a very strong position: all programs that pass the compiler are safe.
But if the compiler is not implemented correctly, then we are in an extremely dangerous position: a great many of the programs (maybe all!) that pass the compiler may be unsafe. (This is not a weakness of safe languages—a buggy compiler can compile safe programs written in an unsafe language into unsafe programs.)
A safe language and its compiler comprise a large system and therefore are likely to include bugs. To a certain extent this can be addressed by putting your best team of programmers on the project (be sure to pay them well :-), but this does not ensure correctness.
For that, we need math.
The behavior of a program written in the safe language can be expressed mathematically through a semantics of the language. The safety of the program can be proven mathematically, typically through a type safety theorem. This proof can be verified by machine—any complex proof is made up of many simple steps that can be checked with a very high degree of assurance by a very small (therefore, likely to be correct) proof-checking program.
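A type safety theorem is conventionally split into two lemmas, often called progress and preservation; a standard statement looks like this:

```latex
% Progress: a well-typed program is never stuck. It is either a final
% value or it can take a step; in particular, it never performs an
% undefined memory access.
\text{If } \vdash e : \tau \text{ then either } e \text{ is a value, or } \exists e'.\; e \longrightarrow e'.

% Preservation: evaluation keeps the program well-typed, so Progress
% applies again at every step of execution.
\text{If } \vdash e : \tau \text{ and } e \longrightarrow e' \text{ then } \vdash e' : \tau.
```

Together these say that a well-typed program can run forever (or halt at a value) without ever reaching an unsafe state, and it is this combined statement that gets machine-checked.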
Mathematics, the oldest branch of human knowledge, can be boiled down to a very small number of axioms from which all else follows; say, the handful of axioms of Zermelo-Fraenkel set theory. This makes a mechanically checked proof of a mathematical statement a “gold standard” in assurance.
This gives us a first step: the safety of a language can be proven.
Now we come to the correctness of the compiler itself. We are concerned that a bug in the compiler will cause the compiler to output an unsafe program. Once again, this can be addressed through mathematics, in several ways.
First, the correctness of the compiler can be stated mathematically and mechanically checked. This is the most difficult way, but it has been done.
Second, we could use a certifying compiler. The compiler could be built to produce not only an executable program, but also a proof that the executable program correctly implements the source program. This proof might be incorrect, since it is computed by the fallible compiler, but we will insist on mechanically checking the proof of each compiler output. If the proof does not check, then we know there is a bug in the compiler: either the output program is incorrect (including, possibly, a memory safety bug), or the compiler has constructed an incorrect proof. In either case, we do not let the program execute.
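The certifying-compiler workflow can be caricatured in a few lines of Rust. This is a deliberately tiny toy, not a real certifying compiler: the "certificate" is just a list of claimed bounds, and every name here (UncheckedRead, Certificate, compile, check) is invented for the sketch. What it shows is the division of trust: the large, fallible compiler emits code plus a certificate, and a small, trusted checker validates the certificate before the code may run.

```rust
// Toy certifying-compiler workflow: code + certificate out of the
// compiler, independent check before execution.

/// Output "instruction": an unchecked read of array[index].
#[derive(Debug)]
struct UncheckedRead { index: usize }

/// Certificate: for each emitted read, the claimed (index, bound) pair.
struct Certificate { claimed_bounds: Vec<(usize, usize)> }

/// The (fallible) compiler: emits unchecked reads plus a certificate
/// claiming each one is in bounds.
fn compile(indices: &[usize], array_len: usize) -> (Vec<UncheckedRead>, Certificate) {
    let code = indices.iter().map(|&i| UncheckedRead { index: i }).collect();
    let cert = Certificate {
        claimed_bounds: indices.iter().map(|&i| (i, array_len)).collect(),
    };
    (code, cert)
}

/// The small, trusted checker: rejects the output unless every claim
/// actually holds. A compiler bug shows up as a rejected program,
/// never as an unsafe execution.
fn check(cert: &Certificate) -> bool {
    cert.claimed_bounds.iter().all(|&(i, len)| i < len)
}

fn main() {
    let (code, cert) = compile(&[0, 2, 7], 3); // index 7 is out of bounds
    println!("certificate accepted: {} ({} instructions)", check(&cert), code.len());
}
```

In a real system the certificate is a machine-checkable proof rather than a list of arithmetic claims, but the trust structure is the same: only the checker needs to be correct.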
Third, we could compile programs into a safe assembly language/machine code. This is a conventional assembly language (say, a subset of x86) extended with enough type annotations that program safety can be mechanically checked. These programs can be executed directly on hardware simply by removing their annotations; in other words, the compiler for a typed assembly language is essentially a no-op, and so its correctness is well-assured.
In contrast to the first two methods, compiling to typed assembly language does not, by itself, guarantee that the output program correctly computes the original source program; however, the output program is guaranteed to be safe.
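The typed-assembly idea can also be caricatured in Rust. This toy is far simpler than any real typed assembly language (the instruction set, the Ann annotation, and the checker are all invented for the sketch), but it shows the two essential moves: a checker verifies safety from the annotations alone, and erasing the annotations yields the plain code that actually runs.

```rust
// Toy "typed assembly": annotated instructions, a safety checker,
// and annotation erasure.

#[derive(Clone, Debug)]
enum Instr {
    SetIndex(usize), // put a constant into the index register
    Load,            // read array[index]; must be proven safe
}

/// Annotation on each instruction: the checker's knowledge of the
/// index register's value *before* the instruction runs.
#[derive(Clone, Copy)]
struct Ann { index_known: Option<usize> }

/// The trusted checker: accepts the program only if every Load happens
/// when the index is known to be within bounds.
fn check(prog: &[(Instr, Ann)], array_len: usize) -> bool {
    let mut known: Option<usize> = None;
    for (instr, ann) in prog {
        // Each annotation must agree with what the checker has inferred.
        if ann.index_known != known { return false; }
        match instr {
            Instr::SetIndex(n) => known = Some(*n),
            Instr::Load => match known {
                Some(i) if i < array_len => {},
                _ => return false, // possibly unsafe load: reject
            },
        }
    }
    true
}

/// "Compiling" to plain machine code is just erasing the annotations.
fn erase(prog: &[(Instr, Ann)]) -> Vec<Instr> {
    prog.iter().map(|(i, _)| i.clone()).collect()
}

fn main() {
    let prog = vec![
        (Instr::SetIndex(1), Ann { index_known: None }),
        (Instr::Load, Ann { index_known: Some(1) }),
    ];
    println!("safe for len 3: {}", check(&prog, 3));
    println!("erased code: {:?}", erase(&prog));
}
```

Note that erase does no real work, which is exactly the point made above: the step from checked typed assembly to executable code is trivial, so there is almost nothing left to get wrong.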
Degrees of math, degrees of safety
Mathematics can be used to varying degrees in compilers. This is good, because it means that we can start small, using less math (but allowing more bugs), and work our way up to a bullet-proof system. Even a little bit of math produces dividends.
I have built compilers for languages that were designed to be safe, but not mechanically checked (all the math was done by hand in designing the language specification; none of the other mathematical techniques described above were applied). In my experience, even this minimal level of mathematical assurance makes programs written in the language an order of magnitude more safe than programs written in C. Anyone who has programmed in Java and C has some idea of the difference. (This is not to say that these compilers are completely safe; the history of Java shows that this is not so.)
I have also built compilers with better mathematical assurance: they compile into a typed assembly language. In my experience, this gives another order of magnitude of safety to programming. I’m very comfortable stating that this eliminates memory safety bugs, except those caused by hardware failures (see below) or linking in libraries written in unsafe languages.
Of course there are many others who have studied these things and come to the same conclusion. There are even those who have measured these things. John Regehr has done some useful work along these lines. For example, here he talks about just how good the CompCert verified compiler is:
Although we love compilers, and we are on friendly terms with as many compiler development teams as possible, when we put on our testing hats, there is only one goal: break the compiler. This is how testing has to work; it is an inherently adversarial activity. From this point of view, CompCert can be viewed as an affront. It spits in the face of our tool which is designed to create maximally evil C code, and which had found silent, previously-unknown wrong code bugs in every single C compiler we have tested, including a number that are routinely used for creating safety-critical embedded systems. So what happens when our unstoppable tool meets an immovable wall of proofs? This is what we set out to determine.
I expected that we would find some flaws in CompCert’s formalisms. My meager experience in formal specification is that getting a large specification right is heroically difficult — perhaps not much easier than getting a complicated software system right in the first place. So far, however, we have found no errors in the CompCert PowerPC specification, their C-light specification, or in their proof strategy. This is pretty amazing and my opinion of the job done by the CompCert team is astronomically high.
CompCert is not perfect (see Regehr’s writings for his opinion) but it is a real-world example of just how well these techniques can work.
One of the ways that a proof can fail is when the assumptions of the proof do not match reality. Any compiler proof will include assumptions about how the hardware works, and an incorrect assumption here can lead directly to a memory safety bug.
There is one kind of behavior that I have never seen addressed correctly in any hardware specification: hardware faults such as bit-flips caused by cosmic rays. This makes sense, because a specification that allows any bit of memory to be randomly flipped at any time would not be a terribly useful specification.
Unfortunately, these kinds of faults can be induced by attackers (e.g., Rowhammer), and when they occur, they can cause memory safety violations in programs written in safe languages. The reason that this is possible is that the attackers have caused the hardware to perform outside of the hardware specification used in the proof of safety—the proof does not apply to this situation.
I don’t know of any effective protection against these attacks. That’s a research topic.
It’s the math, stupid
This post is too long to start a discussion of alternative approaches to memory safety. For now, let me just summarize that the reason safe programming languages are the best solution for memory safety is that they protect all programs written in the language, and they do this with mathematical assurance. Moreover, they permit a pay-as-you-go deployment strategy, where it is easy to get a high level of safety, and this level of safety can be strengthened arbitrarily through increased use of mathematics.
None of the other techniques that I’m aware of come close to this level of assurance.