During the 1854 cholera epidemic in London, Dr. John Snow plotted cholera deaths on a map, and in the corner of a particularly hard-hit quadrangle of buildings was a water pump. Snow's maps, a 19th-century version of big data, suggested an association between cholera and the pump, but the germ theory of disease had not yet been established, and it took human ingenuity to realize that the pump was a causal mechanism of disease transmission.
Understanding local languages is essential for effective situational awareness in military operations, and particularly in humanitarian assistance and disaster relief efforts that require immediate and close coordination with local communities. With more than 7,000 languages spoken worldwide, however, the U.S. military frequently encounters languages for which translators are rare and no automated translation capabilities exist. DARPA’s Low Resource Languages for Emergent Incidents (LORELEI) program aims to change this state of affairs by providing real-time essential information in any language to support emergent missions such as humanitarian assistance/disaster relief, peacekeeping and infectious disease response. The program recently awarded Phase 1 contracts to 13 organizations.
Popular search engines are great at finding answers for point-of-fact questions like the elevation of Mount Everest or current movies running at local theaters. They are not, however, very good at answering what-if or predictive questions—questions that depend on multiple variables, such as “What influences the stock market?” or “What are the major drivers of environmental stability?” In many cases that shortcoming is not for lack of relevant data. Rather, what’s missing are empirical models of complex processes that influence the behavior and impact of those data elements.
Some of the systems that matter most to the Defense Department are very complicated. Ecosystems, brains, and economic and social systems have many parts and processes, but they are studied piecewise, and their literatures and data are fragmented, distributed and inconsistent. It is difficult to build complete, explanatory models of such systems, so effects that arise from many interacting factors are poorly understood.
Understanding the complex and increasingly data-intensive world around us relies on the construction of robust empirical models, i.e., representations of real, complex systems that enable decision makers to predict behaviors and answer “what-if” questions. Today, construction of complex empirical models is largely a manual process requiring a team of subject matter experts and data scientists.