Biometrics Research Group, Inc. has observed that national security and military applications are driving a large proportion of “Big Data” research spending.
Big Data is a term used to describe large and complex data sets that can yield useful conclusions when analyzed and visualized in a meaningful way. Conventional database tools lack the capability to manage large volumes of unstructured data. The U.S. Government is therefore investing in programs to develop new tools and technologies to manage highly complex data. The basic components of Big Data include hardware, software, services and storage.
Biometrics Research Group estimates that federal agencies spent approximately US$5 billion on Big Data resources in the 2012 fiscal year. We estimate that annual spending will grow to US$6 billion in 2014 and then to US$8 billion by 2017, a compound annual growth rate of roughly 10 percent. Our industry analysis projects that most of this spending will be directed through the military apparatus of the U.S. government in the near to medium term. Currently, federal agencies are pursuing over 150 Big Data projects involving procurements, grants or related activities.
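The growth figures above can be checked with a few lines of arithmetic. The sketch below is only an illustration, not the model behind the estimates: it assumes the 10 percent rate compounds annually from the US$5 billion base in fiscal 2012, and the function names are hypothetical.

```python
def project(base: float, rate: float, years: int) -> float:
    """Compound `base` forward at `rate` per year for `years` years."""
    return base * (1 + rate) ** years


def cagr(start: float, end: float, years: int) -> float:
    """Compound annual growth rate implied by growing from `start` to `end`."""
    return (end / start) ** (1 / years) - 1


base_2012 = 5.0  # US$ billions, the fiscal 2012 estimate quoted in the text
rate = 0.10      # assumed to compound annually from the 2012 base

spend_2014 = project(base_2012, rate, 2)  # two years of growth
spend_2017 = project(base_2012, rate, 5)  # five years of growth
implied_rate = cagr(5.0, 8.0, 5)          # rate implied by US$5bn -> US$8bn

print(f"2014: US${spend_2014:.2f} billion")
print(f"2017: US${spend_2017:.2f} billion")
print(f"Implied CAGR 2012-2017: {implied_rate:.2%}")
```

Under that assumption the projection comes to about US$6.05 billion for 2014 and about US$8.05 billion for 2017, which agrees with the rounded figures quoted.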
Reminiscent of the emergence of the Internet, Big Data research is being driven mainly by the military establishment, with over 30 projects led by the U.S. Department of Defense. Specifically, the Defense Advanced Research Projects Agency (DARPA) is leading nine major projects focused on algorithmic improvement, espionage and surveillance. Some DARPA Big Data projects are also attempting to improve natural speech recognition and video and image retrieval systems.
DARPA’s Anomaly Detection at Multiple Scales (ADAMS) program addresses the problem of anomaly detection and characterization in massive data sets. In this context, anomalies in data are intended to cue collection of additional, actionable information in a wide variety of real-world contexts. The initial ADAMS application domain is insider threat detection, in which anomalous actions by an individual are detected against a background of routine network activity.
The Cyber-Insider Threat (CINDER) program seeks to develop novel approaches to detect activities consistent with cyber espionage in military computer networks. As a means to expose hidden operations, CINDER will apply various models of adversary missions to “normal” activity on internal networks. CINDER also aims to increase the accuracy, rate and speed with which cyber threats are detected.
The Insight program addresses key shortfalls in current intelligence, surveillance and reconnaissance systems. Automation and integrated human-machine reasoning enable operators to analyze greater numbers of potential threats ahead of time-sensitive situations. The Insight program aims to develop a resource management system to automatically identify threat networks and irregular warfare operations through the analysis of information from imaging and non-imaging sensors and other sources.
DARPA’s Machine Reading program seeks to realize artificial intelligence applications by developing learning systems that process natural text and insert the resulting semantic representation into a knowledge base. This would replace current knowledge representation processes, which are expensive and time-consuming because they require experts and associated knowledge engineers to handcraft information.
The Mind’s Eye program seeks to develop a capability for “visual intelligence” in machines. Whereas traditional study of machine vision has made progress in recognizing a wide range of objects and their properties—the nouns in the description of a scene—Mind’s Eye seeks to add the perceptual and cognitive underpinnings needed for recognizing and reasoning about the verbs in those scenes. Together, these technologies could enable a more complete visual narrative.
The Mission-oriented Resilient Clouds program aims to address security challenges inherent in cloud computing by developing technologies to detect, diagnose and respond to attacks, effectively building a “community health system” for the cloud. The program also aims to develop technologies to enable cloud applications and infrastructure to continue functioning while under attack. The loss of individual hosts and tasks within the cloud ensemble would be allowable as long as overall mission effectiveness was preserved.
The Programming Computation on Encrypted Data (PROCEED) research effort seeks to overcome a major challenge for information security in cloud-computing environments by developing practical methods and associated modern programming languages for computation on data that remains encrypted the entire time it is in use. Giving users the ability to manipulate encrypted data without first decrypting it would make interception by an adversary more difficult.
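To make the idea of computing on data that stays encrypted more concrete, the sketch below uses the Paillier cryptosystem, a well-known additively homomorphic scheme. It is chosen here purely as an illustration; the article does not say which techniques PROCEED uses. The primes are toy-sized and insecure, and all function names are hypothetical.

```python
import math
import secrets


def keygen(p: int, q: int):
    """Build a toy Paillier key pair from two distinct primes (illustrative sizes only)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    mu = pow(lam, -1, n)  # modular inverse of lam mod n (valid when g = n + 1)
    return n, (lam, mu)


def encrypt(n: int, m: int) -> int:
    """Encrypt m (0 <= m < n) under public key n, using generator g = n + 1."""
    n2 = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1  # random r in [1, n - 1]
        if math.gcd(r, n) == 1:
            break
    return (pow(n + 1, m, n2) * pow(r, n, n2)) % n2


def decrypt(n: int, priv, c: int) -> int:
    """Recover the plaintext from ciphertext c using the private key."""
    lam, mu = priv
    x = pow(c, lam, n * n)
    return ((x - 1) // n * mu) % n


def add_encrypted(n: int, c1: int, c2: int) -> int:
    """Multiplying ciphertexts yields an encryption of the sum of the plaintexts."""
    return (c1 * c2) % (n * n)


# Toy primes: far too small for real use, chosen so the example runs instantly.
n, priv = keygen(1009, 1013)
c_a = encrypt(n, 1200)
c_b = encrypt(n, 345)
c_sum = add_encrypted(n, c_a, c_b)  # computed without decrypting either input
total = decrypt(n, priv, c_sum)
print(total)
```

The party performing `add_encrypted` never sees 1200 or 345; only the holder of the private key can read the total. Paillier supports addition only, whereas general-purpose computation on encrypted data requires considerably heavier machinery.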
The Video and Image Retrieval and Analysis Tool (VIRAT) program aims to develop a system to provide military imagery analysts with the capability to exploit the vast amount of overhead video content being collected. If successful, VIRAT will enable analysts to establish alerts for activities and events of interest as they occur. VIRAT also seeks to develop tools that would enable analysts to rapidly retrieve, with high precision and recall, video content from extremely large video libraries.
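Precision and recall, the two retrieval measures mentioned above, have standard definitions: precision is the share of retrieved items that are relevant, and recall is the share of relevant items that were retrieved. The sketch below computes both for a made-up query; the clip identifiers are hypothetical and have nothing to do with VIRAT data.

```python
def precision_recall(retrieved: set, relevant: set):
    """Return (precision, recall) for one query's retrieved and relevant sets."""
    hits = len(retrieved & relevant)  # items that are both retrieved and relevant
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical example: four clips returned, three clips actually relevant.
retrieved = {"clip_01", "clip_02", "clip_03", "clip_04"}
relevant = {"clip_02", "clip_03", "clip_07"}

precision, recall = precision_recall(retrieved, relevant)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Here two of the four returned clips are relevant (precision 0.50) and two of the three relevant clips were found (recall about 0.67). A system can trivially raise one measure at the expense of the other, which is why retrieval goals usually name both.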
The XDATA program seeks to develop computational techniques and software tools for analyzing large volumes of semi-structured and unstructured data. Central challenges to be addressed include scalable algorithms for processing imperfect data in distributed data stores and effective human-computer interaction tools that are rapidly customizable to facilitate visual reasoning for diverse missions. The program envisions open source software toolkits for flexible software development that enable processing of large volumes of data for use in targeted defense applications.
In addition, the National Security Agency (NSA) has unveiled “Vigilant Net”, which is effectively a competition to foster and test cyber defense situational awareness at scale. The project will explore the feasibility of conducting an online contest for developing data visualizations in the defense of massive computer networks, beginning with the identification of best practices in the design and execution of such an event. The U.S. intelligence community has identified a set of coordination, outreach and program activities to collaborate with a wide variety of partners throughout the U.S. Government, academia and industry, combining cybersecurity and Big Data and making its perspective accessible to the unclassified science community.
Previous editorial commentary on BiometricUpdate.com has indicated that “big” government is already exploiting Big Data in surveillance for criminal and anti-terror investigations. BiometricUpdate.com reporting has also found that the U.S. Government is increasingly developing “predictive pattern-matching” techniques to identify suspicious patterns of behavior by actively collecting and collating metadata from the Internet, cellular phone networks and other publicly accessible sources.
While the aforementioned programs are only a sample of the experimental Big Data research and development under way, a recent survey of 150 IT executives in the U.S. Government, sponsored by EMC, found that 70 percent of respondents believe Big Data will be critical to all government operations within five years.
The EMC-sponsored survey argues that Big Data has the potential to transform government by substantially increasing efficiency, enabling smarter decisions and deepening insight, thereby creating the potential to save nearly US$500 billion – or 14 percent of agency budgets – across the entire federal government. Further, 54 percent of survey respondents believe that Big Data will be central to executing all future military, surveillance and reconnaissance missions.
Biometrics Research Group, Inc. predicts that in the long term, a large proportion of government Big Data spending will be directed at efforts to manage large amounts of biometric data. We believe this will include the development and maintenance of the FBI’s newly proposed Next Generation Identification program and the visa application systems maintained by the U.S. Department of Homeland Security.