Defense Advanced Research Projects Agency
Tagged Content List

Data Analysis at Massive Scales

Extracting information and insights from massive datasets; "big data"; "data mining"

Showing 16 results for Data + Language
The U.S. Government operates globally and frequently encounters so-called “low-resource” languages for which no automated human language technology capability exists. Historically, development of technology for automated exploitation of foreign language materials has required protracted effort and a large data investment. Current methods can require multiple years and tens of millions of dollars per language—mostly to construct translated or transcribed corpora.
Warfighters encounter foreign-language images in many forms, including captured paper documents and computer files. Given the quantity of foreign-language material and the scarcity of linguists, military personnel and analysts can find it difficult to identify, translate and interpret important information in a timely fashion. What these personnel and analysts have lacked to date is the capability to automatically and rapidly convert foreign-language text images into English transcripts that provide relevant, distilled and actionable information.
Program Manager
Dr. Boyan Onyshkevych joined I2O as a program manager in 2013. His research interests include human language technologies and knowledge-based systems applied to the areas of information extraction, language understanding and semantic computing.
Program Manager
Dr. William Corvey joined DARPA as a program manager in the Information Innovation Office (I2O) in June 2020 to develop, execute, and transition programs in language processing.
05/13/2020
I2O Thrust Areas