Program Summary
Social media, sensor feeds, and scientific studies generate large amounts of valuable data. However, understanding the relationships within this data can be challenging. Graph analytics has emerged as an approach by which analysts can efficiently examine the structure of the large networks produced from these data sources and draw conclusions from the observed patterns. Understanding the complex relationships both within and between data sources gives a more complete picture of the analysis problem. Drawing on lessons learned from innovations in the expanding realm of deep neural networks, the Hierarchical Identify Verify Exploit (HIVE) program seeks to advance the field of graph analytics.
The HIVE program aims to build a graph analytics processor that can process streaming graphs 1,000 times faster and at much lower power than current processing technology. If successful, the program will enable graph analytics techniques powerful enough to solve tough challenges in cyber security, infrastructure monitoring, and other areas of national interest. Graph analytic processing that currently requires racks of servers could become practical in tactical situations to support front-line decision making. What's more, these advanced graph analytics servers could have the power to analyze the billion- and trillion-edge graphs that will be generated by the Internet of Things, ever-expanding social networks, and future sensor networks.
In parallel with the hardware development of a HIVE processor, DARPA is working with MIT Lincoln Laboratory and Amazon Web Services (AWS) to host the HIVE Graph Challenge with the goal of developing a trillion-edge dataset. This freely available dataset will spur innovative software and hardware solutions in the broader graph analysis community that will contribute to the HIVE program.
The overall objective is to accelerate innovation in graph analytics to open new pathways for meeting the challenge of understanding an ever-increasing torrent of data. The HIVE program features two primary challenges:
- The first is a static graph problem focused on subgraph isomorphism. The task is to advance the ability to search a large graph for a particular subsection of that graph.
- The second is a dynamic graph problem focused on finding optimal clusters of data within the graph.
Both challenges will include a small graph problem in the billions of nodes and a large graph problem in the trillions of nodes.
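To make the static challenge concrete, the following is a minimal sketch of one of the simplest subgraph isomorphism searches: counting every occurrence of a triangle pattern in an undirected graph. The choice of a triangle as the pattern, the function name `count_triangles`, and the edge-list input format are assumptions made for illustration, not details taken from the program description. The approach is a plain adjacency-set intersection in Python and is not meant to reflect how a HIVE processor would work.

```python
from collections import defaultdict


def count_triangles(edges):
    """Count triangles in an undirected simple graph.

    `edges` is an iterable of (u, v) pairs; node labels must be
    mutually comparable (for example, all integers).
    """
    # Build an adjacency set for every node, ignoring self-loops.
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:
            adj[u].add(v)
            adj[v].add(u)

    count = 0
    for u in adj:
        for v in adj[u]:
            if u < v:
                # A common neighbour w of u and v closes a triangle.
                # Requiring u < v < w counts each triangle exactly once.
                count += sum(1 for w in adj[u] & adj[v] if w > v)
    return count


# Hypothetical toy graph: a square 0-1-2-3 with the diagonal 0-2.
edges = [(0, 1), (1, 2), (0, 2), (2, 3), (3, 0)]
print(count_triangles(edges))
```

On the toy graph the pattern occurs twice (nodes 0-1-2 and 0-2-3). The work grows with the sizes of the neighbour sets being intersected, which hints at why pattern search on billion- and trillion-scale graphs is a demanding workload.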