Summary
Extracting symbolic representations of software’s algorithmic parts, such as control laws for a physical process encoded in a cyber-physical system, currently requires fully manual analysis by highly specialized experts.
There is no mechanized capability to translate relevant parts of the software and route them to experts, such as control engineers, in a form that lets them analyze the mathematical expressions effectively. In contrast, malware analysis has become considerably automated, with aspects such as provenance and behavioral characterization gaining traction in recent years.
Recent research demonstrates effective use of artificial intelligence (AI) translation techniques for tasks like mechanized symbolic differentiation and integration, in which going in one direction (differentiation) is relatively straightforward. Inverse problems like symbolic integration, however, are more challenging and require expert ingenuity in repeated recognition and application of patterns.
Although programming mathematical models is not as straightforward as symbolic differentiation, it is still a relatively straightforward skill with massive quantities of available examples and broadly taught practices. Conversely, recovering a mathematical description of software, even when the function of the software is known, has defied automation.
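As a small illustration of this asymmetry, the following is a minimal Python sketch of a textbook discrete PID control law written as code. It is not drawn from the ReMath program; the function name `pid_step` and all of its parameters are hypothetical and chosen only for this example.

```python
def pid_step(error, prev_error, integral, kp, ki, kd, dt):
    """One step of a textbook discrete PID controller (illustrative only).

    Encodes the control law
        u[k] = kp * e[k] + ki * sum_j(e[j] * dt) + kd * (e[k] - e[k-1]) / dt
    where e is the tracking error and dt is the sample period.
    """
    # Rectangle-rule accumulation of the error (the integral term's state).
    integral = integral + error * dt
    # Backward-difference approximation of the error derivative.
    derivative = (error - prev_error) / dt
    # Weighted sum of proportional, integral, and derivative terms.
    u = kp * error + ki * integral + kd * derivative
    return u, integral


# Hypothetical gains and a single unit error sample.
u, integral = pid_step(error=1.0, prev_error=0.0, integral=0.0,
                       kp=2.0, ki=0.5, kd=0.1, dt=0.1)
print(u, integral)
```

Writing this function from the formula in the docstring is routine. The reverse task, reading only the arithmetic in the function body (or a compiled form of it) and recognizing it as a PID law with particular gains, is the kind of recovery that the text above describes as resisting automation.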
The Recovery of Symbolic Mathematics from Code (ReMath) Artificial Intelligence Exploration (AIE) program aims to discover whether a combination of recent advances in AI techniques, such as neural machine translation and sequence-to-sequence encoders, can effectively recover mathematical structures implemented in software into their natural mathematical forms of symbolic expression. These techniques could improve the understanding of complex software and may enable future methods for analysis and testing of cyber-physical systems.
This program is now complete.
This content is available for reference purposes. This page is no longer maintained.