The term Big Data covers capturing, storing, analysing, transferring, querying and visualizing large and complex data sets in order to extract value from them. The value of the extracted data is that algorithmic behavior analytics can use it to describe factually the essential elements of an individual's activity and to make predictions about his or her future actions.
Big Data is obviously built on recording and storing vast amounts of data. Apart from storage, the data also has to be analysed, and this is done either by clouds of commercially available personal computers or by supercomputers organized in clusters or grids. Cloud services opened the gates for all-inclusive provider data centers, which hold data on behalf of their subscribers (who are normally companies with large data volumes).
The availability of these huge data stores and the corresponding technological advances have, however, changed how intelligence services think. For a number of reasons, including access to ever larger budgets, nation-states are moving towards a way of working in which they collect everything rather than only what is necessary. This method is sometimes called “All-Source Intelligence”, but if you set aside the more traditional intelligence disciplines, such as HUMINT, SIGINT and GEOINT, what is left is “All Data”.
What uses could All Data have? Apparently there are novel ways to explore relationships between seemingly unrelated data, even in real time, using Activity-Based Intelligence toolsets. This enables an intelligence organization to notice events as they emerge, and for this reason it will be a prized tool.
Another consideration is that some information can remain valuable for years, and even encrypted data that cannot be decrypted today has value. Intelligence agencies collect and store huge amounts of potentially interesting data traffic for the future. If new technology or cryptanalytic techniques enable the decryption of these archives, the consequences could be devastating. Imagine critical information about important people, organizations, operations or political decisions being decrypted after 20 or 30 years. Many of the people involved would still be alive, or even in office.
All Data is often used as a pool from which target data is carved out. If there is a target group whose members are unknown, their communications can only be intercepted by intercepting the communications of everyone. As the investigation progresses, more and more group members can be identified and the relevant data separated from the noise. There are also algorithms aimed at finding connections that might suggest the formation of new, previously unknown groups of interest.
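The seed-and-expand process described above can be illustrated with a minimal sketch. It assumes the intercepted data has already been reduced to hypothetical (sender, receiver) pairs, and it uses a plain breadth-first search over the resulting contact graph; the function names, the record format and the hop limit are illustrative assumptions, not a description of any real toolset.

```python
from collections import defaultdict, deque


def build_contact_graph(records):
    """Build an undirected contact graph from (sender, receiver) pairs."""
    graph = defaultdict(set)
    for sender, receiver in records:
        graph[sender].add(receiver)
        graph[receiver].add(sender)
    return graph


def expand_from_seeds(graph, seeds, max_hops):
    """Breadth-first expansion from known members.

    Returns a dict mapping each reachable entity to its hop distance
    from the nearest seed, limited to max_hops.
    """
    distance = {seed: 0 for seed in seeds}
    queue = deque(seeds)
    while queue:
        current = queue.popleft()
        if distance[current] == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbour in graph.get(current, ()):
            if neighbour not in distance:
                distance[neighbour] = distance[current] + 1
                queue.append(neighbour)
    return distance


# Hypothetical records: one chain a-b-c-d plus an unrelated pair x-y (the "noise").
records = [("a", "b"), ("b", "c"), ("c", "d"), ("x", "y")]
graph = build_contact_graph(records)
candidates = expand_from_seeds(graph, ["a"], max_hops=2)
```

Starting from the single known member `a`, the expansion reaches `b` and `c` within two hops, while the unrelated pair `x` and `y` never enters the candidate set. In practice the hop limit trades coverage against noise: each extra hop pulls in many more entities that have nothing to do with the group.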
Mining All Data is usually done with computational agent-based models (ABM), which evaluate past actions and interactions and simulate future ones for both individual and collective entities, such as organizations or groups, with a view to assessing their effects on the security of the country as a whole.
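To make the idea of an agent-based model concrete, the following is a toy sketch only. Each agent carries a single hypothetical `inclination` score and, at every step, moves part of the way towards the average score of its contacts. The `Agent` class, the update rule and the `rate` parameter are assumptions chosen for illustration; real models of this kind would be far richer and calibrated on observed behaviour.

```python
from dataclasses import dataclass, field


@dataclass
class Agent:
    name: str
    inclination: float                      # hypothetical state in [0, 1]
    contacts: list = field(default_factory=list)


def step(agents, rate=0.5):
    """Synchronous update: every agent moves towards the mean of its contacts."""
    updated = {}
    for agent in agents.values():
        if agent.contacts:
            mean = sum(agents[c].inclination for c in agent.contacts) / len(agent.contacts)
            updated[agent.name] = agent.inclination + rate * (mean - agent.inclination)
        else:
            updated[agent.name] = agent.inclination  # isolated agents do not change
    for name, value in updated.items():
        agents[name].inclination = value


def simulate(agents, steps, rate=0.5):
    """Run the model forward and record the state after each step."""
    history = []
    for _ in range(steps):
        step(agents, rate)
        history.append({name: a.inclination for name, a in agents.items()})
    return history


agents = {
    "a": Agent("a", 0.0, ["b"]),
    "b": Agent("b", 1.0, ["a"]),
    "c": Agent("c", 0.2),                   # no contacts
}
history = simulate(agents, steps=3)
```

Here the two connected agents meet in the middle after the first step and stay there, while the isolated agent keeps its initial score. The point of the sketch is the structure (individual state, interaction rule, forward simulation), not the particular rule.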