
Analytics for Data at Massive Scales

Extracting information from large data sets

Social media, sensor feeds, and scientific studies generate large amounts of valuable data. However, understanding the relationships within these data can be challenging. Graph analytics has emerged as an approach by which analysts can efficiently examine the structure of the large networks produced from these data sources and draw conclusions from the observed patterns.
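As a rough illustration of the approach described above, the sketch below builds a small graph from hypothetical interaction records and surfaces its structure through two standard measures. The networkx library and the sample edge data are assumptions for illustration, not part of any specific DARPA program.

# A minimal sketch of graph analytics: build a network from hypothetical
# interaction records, then examine its structure with standard measures.
import networkx as nx

# Invented "who-contacted-whom" records, standing in for relationships
# extracted from social media or sensor feeds.
edges = [
    ("alice", "bob"), ("bob", "carol"), ("carol", "alice"),
    ("carol", "dave"), ("dave", "erin"), ("erin", "frank"),
]

G = nx.Graph()
G.add_edges_from(edges)

# Degree centrality highlights entities involved in many relationships.
centrality = nx.degree_centrality(G)

# Connected components reveal clusters of related entities.
clusters = list(nx.connected_components(G))

for node, score in sorted(centrality.items(), key=lambda kv: -kv[1]):
    print(f"{node}: {score:.2f}")
print("clusters:", clusters)

Here "carol" ranks highest because she sits on the most relationships; at scale, the same measures computed over billions of edges are what make such patterns visible to an analyst.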
Military intelligence analysts face the monumental and escalating task of analyzing massive volumes of complex data from multiple, diverse sources such as physical sensors, human contacts and contextual databases. These analysts consume and process information from all available sources to provide mission-relevant, timely insights to commanders. To enhance this largely manual process, analysts require more effective and efficient means to receive, correlate, analyze, report and share intelligence.
The Department of Defense’s information technology (IT) infrastructure is a large, complex network of connected local networks comprising thousands of devices. Cyber defenders must understand and monitor the entire environment to defend it effectively. Toward this end, they work to correlate and understand the information contained in log files, executable files, databases of varying formats, directory structures, communication paths, file and message headers, and the volatile and non-volatile memory of the devices on the network. Meanwhile, adversaries increasingly disguise targeted attacks as legitimate activity, making discovery far more difficult. It is within this complicated web of networked systems that cyber defenders must find targeted cyber attacks.
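To make the correlation task concrete, here is a minimal sketch that joins events from two hypothetical log feeds on a shared host field and flags hosts exhibiting a suspicious combination. The log formats, field names, and toy rule are invented for illustration and do not reflect any particular defensive tool.

# A minimal sketch of cross-source log correlation: index events from two
# hypothetical feeds by host, then apply a simple detection rule.
from collections import defaultdict

# Invented authentication and network log records.
auth_log = [
    {"host": "ws-17", "event": "login_failure"},
    {"host": "ws-42", "event": "login_success"},
    {"host": "ws-17", "event": "login_success"},
]
net_log = [
    {"host": "ws-17", "event": "outbound_transfer", "bytes": 48_000_000},
    {"host": "ws-99", "event": "outbound_transfer", "bytes": 1_200},
]

# Index heterogeneous sources by the field they share.
events_by_host = defaultdict(list)
for record in auth_log + net_log:
    events_by_host[record["host"]].append(record)

# Toy rule: a failed login plus a large outbound transfer on the same host
# may be an attack disguised as legitimate activity, worth a closer look.
for host, records in events_by_host.items():
    kinds = {r["event"] for r in records}
    large_transfer = any(r.get("bytes", 0) > 10_000_000 for r in records)
    if "login_failure" in kinds and large_transfer:
        print(f"review {host}: {sorted(kinds)}")

Real defensive pipelines must do this across far messier sources and formats, which is precisely what makes the correlation problem hard at enterprise scale.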
The U.S. Government operates globally and frequently encounters so-called “low-resource” languages for which no automated human language technology capability exists. Historically, development of technology for automated exploitation of foreign language materials has required protracted effort and a large data investment. Current methods can require multiple years and tens of millions of dollars per language—mostly to construct translated or transcribed corpora.
Historically, the U.S. Government deployed and operated a variety of collection systems that provided imagery with assured integrity. In recent years, however, consumer imaging technology (digital cameras, mobile phones, etc.) has become ubiquitous, allowing people the world over to take and share images and video instantaneously.