Data Analysis at Massive Scales

Extracting information and insights from massive datasets; "big data"; "data mining"

Social media, sensor feeds, and scientific studies generate large amounts of valuable data. However, understanding the relationships within these data can be challenging. Graph analytics has emerged as an approach by which analysts can efficiently examine the structure of the large networks produced from these data sources and draw conclusions from the observed patterns.
Rapid comprehension of world events is essential for informing U.S. national security, a task that becomes more difficult as the amount of unstructured, multimedia information grows exponentially. Humans make sense of events by organizing them into frequently occurring narrative structures. These structures are abstracted into schemas, which are organized units of knowledge that represent a pattern of memory used in human cognition.
In supervised machine learning (ML), the ML system learns by example to recognize things, such as objects in images or speech. Humans provide these examples to ML systems during their training in the form of labeled data. With enough labeled data, we can generally build accurate pattern recognition models.
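As a rough illustration of learning from labeled examples, the sketch below trains a nearest-centroid classifier in plain Python. It is not drawn from any program described here: the feature vectors, the "cat" and "dog" labels, and the function names `train_centroids` and `predict` are all hypothetical, and real systems use far richer models and much more data.

```python
from collections import defaultdict
from math import dist


def train_centroids(examples):
    """Average the feature vectors of each label to get one centroid per class."""
    grouped = defaultdict(list)
    for features, label in examples:
        grouped[label].append(features)
    return {
        label: tuple(sum(col) / len(rows) for col in zip(*rows))
        for label, rows in grouped.items()
    }


def predict(centroids, features):
    """Return the label whose centroid is closest to the given features."""
    return min(centroids, key=lambda label: dist(centroids[label], features))


# Hypothetical labeled data: (feature vector, human-provided label).
labeled_examples = [
    ((1.0, 1.2), "cat"),
    ((0.8, 1.0), "cat"),
    ((3.0, 3.1), "dog"),
    ((3.2, 2.9), "dog"),
]

model = train_centroids(labeled_examples)
print(predict(model, (0.9, 1.1)))
print(predict(model, (3.1, 3.0)))
```

The human-supplied labels are the only source of supervision in this sketch: without them, the model would have no notion of which group of points corresponds to which class.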
The Joint Logistics Enterprise of the Department of Defense (DoD), which spans both supply chain and logistics operations, provides the means to muster, transport, and sustain military power anywhere in the world at a high level of readiness.
The U.S. Government operates globally and frequently encounters so-called “low-resource” languages for which no automated human language technology capability exists. Historically, development of technology for automated exploitation of foreign language materials has required protracted effort and a large data investment. Current methods can require multiple years and tens of millions of dollars per language—mostly to construct translated or transcribed corpora.