Today, code for input data validation is typically written manually in an ad-hoc manner. For commonly-used electronic data formats, input validation is, at a minimum, a problem of scale whereby specifications of these formats comprise hundreds to thousands of pages. Input validation thus translates to thousands or more conditions to be checked against the input data before the data can be safely processed. Manually writing the code to parse and validate input, and then manually auditing whether that code implements all the necessary checks completely and correctly, does not scale. Moreover, manual parser coding and auditing typically fails even for electronic data formats specifically designed to be easier to perform such tasks, e.g., JSON and XML. A variety of critical vulnerabilities have been found in major parser implementations for these formats.
Widely deployed mitigations against crafted input attacks include (a) trying to prevent the flow of untrusted data to vulnerable software; and (b) testing software with randomized inputs to find and patch flaws that could be triggered by maliciously created inputs. Unfortunately, neither of these approaches offer security assurance guarantees.
Mitigations for preventing the flow of untrusted data to vulnerable software, which can be implemented via network or host-based measures such as firewalls, application proxies, antivirus scanners, etc., neither remove the underlying vulnerability from the target, nor encode complete knowledge of document or message format internals. Attacker bypasses of such mitigations exploit incompleteness of the mitigations' understanding of the data format to exploit the still-vulnerable targets.
The Safe Documents (SafeDocs) program will develop novel verified programming methodologies for building high assurance parsers for extant electronic data formats, and novel methodologies for comprehending, simplifying, and reducing these formats to their safe, unambiguous, verification-friendly subsets (“safe sub-setting”). SafeDocs will address the ambiguity and complexity obstacles that hinder the application of verified programming posed by extant electronic data formats. SafeDocs’ multi-pronged approach will combine:
The parser construction kits developed by SafeDocs will be usable by industry programmers who understand the syntax of electronic data formats but lack the theoretical background in verified programming. These tools will enable developers to construct verifiable parsers for new electronic data formats as well as extant ones. The tools will guide the syntactic design of new formats, by making verification-friendly format syntax easy to express, and vice versa.
Additional information is available in the SafeDocs BAA.
You are now leaving the DARPA.mil website that is under the control and
management of DARPA. The appearance of hyperlinks does not constitute
endorsement by DARPA of non-U.S. Government sites or the information,
products, or services contained therein. Although DARPA may or may not
use these sites as additional distribution channels for Department of
Defense information, it does not exercise editorial control over all of
the information that you may find at these locations. Such links are
provided consistent with the stated purpose of this website.
After reading this message, click to continue