Parsing is the weakest link
August 9, 2012  

In any software program, the parser is the first line of defense against malicious adversaries. An attacker can only compromise your system by interacting with it, and that means sending it input, which is processed first by your parser. The parser is therefore the ideal place to filter out bad inputs, before they reach the rest of your program.

The flip side is that the parser is also the part of your program most exposed to attacks. This makes it easy for attackers to find parser bugs and exploit them. Unfortunately, history shows that parser bugs are common. I believe that the great majority of buffer overflows that have been found and exploited by attackers are in parsers. In other words, parsing is the weakest link in software security.

Take the latest batch of Cisco vulnerabilities. This batch, for July 2012, lists ten vulnerabilities, each of which could lead to remote code execution or denial of service.

Nine out of the ten vulnerabilities are due to improper handling of “malformed” inputs. I do not have access to examples of the malformed inputs or the (proprietary) code, so I cannot be certain that these are parser bugs. However, the use of the word “malformed” is a red flag. A malformed input should not be processed by the program, and the parser is the logical place to reject it, as early as possible. If the parser is not rejecting it, odds are that the parser itself assumes the input is well-formed. That assumption often leads to buffer overflows, which are the most common way to achieve remote code execution or denial of service.
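
To make the idea of rejecting malformed input early more concrete, here is a minimal sketch in Python. Everything in it is hypothetical and chosen only for illustration: the wire format (a 2-byte big-endian length followed by the payload), the names `parse_message` and `MalformedInput`, and the `MAX_PAYLOAD` limit. None of it is taken from the Cisco advisories. The point is that the parser checks the length the sender claims against the bytes it actually received, instead of trusting it.

```python
import struct

# Hypothetical upper bound on payload size, chosen for illustration.
MAX_PAYLOAD = 1024


class MalformedInput(ValueError):
    """Raised when a message does not match the expected wire format."""


def parse_message(data: bytes) -> bytes:
    """Parse an assumed format: 2-byte big-endian length, then the payload.

    Every property the rest of the program relies on is checked here,
    so malformed input is rejected before it travels any further.
    """
    if len(data) < 2:
        raise MalformedInput("truncated header")

    # The length field is attacker-controlled; never trust it as-is.
    (declared,) = struct.unpack(">H", data[:2])
    if declared > MAX_PAYLOAD:
        raise MalformedInput("declared length exceeds limit")

    payload = data[2:]
    if len(payload) != declared:
        raise MalformedInput("declared length does not match payload size")

    return payload
```

In a memory-unsafe language, a parser that skipped these checks and copied `declared` bytes into a fixed-size buffer would be a plausible candidate for a buffer overflow. Python would not overflow, but the unchecked version would still hand inconsistent data to the rest of the program.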

The last of the ten vulnerabilities can be exploited by a script injection attack. Script injection is also closely related to parsing, as I’ve shown elsewhere.

I’m not the only one who has remarked on the importance of parsing in security. However, there is very little research directed at improving parser security compared with the enormous volume of research into buffer overflows, even though most buffer overflow vulnerabilities that are actually exploited occur in parsers.