Defense Advanced Research Projects Agency

Data Analysis at Massive Scales

Extracting information and insights from massive datasets; "big data"; "data mining"

As computing devices become more pervasive, the software systems that control them have become increasingly complex and sophisticated. Consequently, despite the tremendous resources devoted to making software more robust and resilient, ensuring that programs are correct, especially at scale, remains a difficult endeavor. Uncaught errors triggered during program execution can lead to potentially crippling security violations, unexpected runtime failures, or unintended behavior, all of which can have profound negative consequences for economic productivity, the reliability of mission-critical systems, and the correct operation of important and sensitive cyber infrastructure.
The goal of the Modeling Adversarial Activity (MAA) program is to develop mathematical and computational techniques for modeling adversarial activity for the purpose of producing high-confidence indications and warnings of efforts to acquire, fabricate, proliferate, and/or deploy weapons of mass terror (WMTs). MAA assumes that an adversary’s WMT activities will result in observable transactions.
The Molecular Informatics program brings together a collaborative interdisciplinary community to explore completely new approaches to store and process information with molecules. Chemistry offers an untapped, rich palette of molecular diversity that may yield a vast design space to enable dense data representations and highly versatile computing concepts outside of traditional digital, logic-based approaches.
Warfighters encounter foreign-language images in many forms, including captured paper documents and computer files. Given the quantity of foreign-language material and the scarcity of linguists, military personnel and analysts can find it difficult to identify, translate, and interpret important information in a timely fashion. What these personnel and analysts have lacked to date is the capability to automatically and rapidly convert foreign-language text images into English transcripts that provide relevant, distilled, and actionable information.
The Physics of Artificial Intelligence (PAI) program is part of a broad DARPA initiative to develop and apply “Third Wave” AI technologies that are robust to sparse data and adversarial spoofing, and that incorporate domain-relevant knowledge through generative contextual and explanatory models.