Program Summary
The Synergistic Discovery and Design (SD2) program aims to develop data-driven methods to accelerate scientific discovery and robust design in domains that lack complete models. Engineers regularly use high-fidelity simulations to create robust designs in complex domains such as aeronautics, automobiles, and integrated circuits. In contrast, robust design remains elusive in domains such as synthetic biology, neuro-computation, and polymer chemistry due to the lack of high-fidelity models. SD2 seeks to develop tools to enable robust design despite the lack of complete scientific models.
Examples of complex systems where inventors lack complete scientific models to support their design efforts include biological systems that have millions of protein-metabolite interactions, neuro-processes that require computations across billions of neurons, and advanced materials influenced by millions of monomer-protein combinations. These systems are part of domains that exhibit millions of unpredictable, interacting components for which robust models do not exist, and internal states are often only partially observable. In such domains, small perturbations in the environment can lead to unexpected design failures, and the number of engineering variables required to characterize stable operational envelopes remains unknown.
While domain experts remain geographically dispersed, they collectively analyze hundreds of terabytes of data to build models and refine designs. However, manually-intensive analysis of small datasets remains inefficient and often yields unreproducible results. In response, researchers have begun to outsource high-throughput experiments to automated labs and randomly search constrained parameter spaces for robust designs. However, these random search-based approaches work best with small parameter spaces and provide no insight into why some designs succeed and others fail.
SD2 will address the problem of design in domains that lack complete models by discovering models and refining designs via methods that extract information from data at petabyte scale. SD2 aims to develop data-driven methods to automatically discover models and refine designs in parallel at scale. To ensure realism, challenge problems drawn from cutting edge domains will drive the development of SD2 methods. By the end of the program, SD2 plans to provide new data-driven methods to accelerate discovery and design and create a cloud-based open data exchange for research communities in complex domains.