Data Analysis at Massive Scales

Extracting information and insights from massive datasets; "big data"; "data mining"

The Molecular Informatics program brings together a collaborative interdisciplinary community to explore completely new approaches to store and process information with molecules. Chemistry offers an untapped, rich palette of molecular diversity that may yield a vast design space to enable dense data representations and highly versatile computing concepts outside of traditional digital, logic-based approaches.
The purpose of the Multi-Domain Analytics (MDA) program is to enable automated data analysis across networks at different security levels, without manually moving impracticably large amounts of data. Each network contains different sets of data, which must be correlated in order to create a comprehensive context.
Warfighters encounter foreign-language images in many forms, including captured paper documents and computer files. Given the quantity of foreign-language material and the scarcity of linguists, military personnel and analysts can find it difficult to identify, translate and interpret important information in a timely fashion. What these personnel and analysts have lacked to date is the capability to automatically and rapidly convert foreign-language text images into English transcripts that provide relevant, distilled and actionable information.
The Physics of Artificial Intelligence (PAI) program is part of a broad DARPA initiative to develop and apply “Third Wave” AI technologies that are robust to sparse data and adversarial spoofing, and that incorporate domain-relevant knowledge through generative contextual and explanatory models.
Machine learning – the ability of computers to understand data, manage results and infer insights from uncertain information – is the force behind many recent revolutions in computing. Email spam filters, smartphone personal assistants and self-driving vehicles are all based on research advances in machine learning. Unfortunately, even as the demand for these capabilities is accelerating, every new application requires a Herculean effort. Teams of hard-to-find experts must build expensive, custom tools that are often painfully slow and can perform unpredictably against large, complex data sets.