Business intelligence is the process of using large data sets, processed using statistical techniques, to aid in business decision-making. The Internet has provided vast quantities of data that can be used to gain insight. For example, a large retailer gathers data on every purchase, such as which items were purchased together, when they were bought, how they were bought and, if possible, customer characteristics as well. This information can provide valuable insights. For example, the business might be able to learn about cross-price elasticities by identifying products that are frequently purchased together and then adjusting the price of one item to learn about the impact on sales of the other. A company can learn what the elasticity of demand is for, say, fans with respect to changes in temperature. It can learn more about the target markets for specific goods. All of this information helps businesses get an edge on the market and on the competition (Chen, Chiang & Storey, 2012). As data sets become larger, the quality of the data improves, and technology is helping to lower the price of this insight as well.
What is Investigational Computing?
The Data Warehousing Institute (TDWI, 2013) proposes that investigative analytics fills a gap between traditional analytics and predictive analytics. Predictive analytics are analytics that “consume data from structured and semi-structured sources” as in the examples above. Predictive analytics is commonly used to aid in business decision-making, because it can help managers examine the effects of changes in independent variables on any dependent variable for which they have enough data to support a significant analysis. One of the drawbacks of predictive analytics, it is argued, is that it requires a high level of statistical ability to interpret the results. As a tool, it is limited because its output needs to be translated into plain language that managers can understand. While there is software that can do this, actually building the models and knowing what data to capture requires a degree of statistical ability that makes predictive analytics an inherently specialized field. TDWI roughly delineates between the process (investigative analytics) and the hardware (investigative computing), though for practical purposes these concepts cannot be viewed separately.
In contrast with the closed-ended nature of predictive analytics, which is essentially the ability to answer a question that you have already asked, investigative computing is a more open-ended concept. Managers can use investigative computing to learn which questions they should be asking; in other words, for hypothesis generation (TDWI, 2013). Investigative computing does this by looking for patterns, anomalies and clusters. This will lead, eventually, to testing such hypotheses via predictive or traditional analytics, but it is in this initial stage where tremendous value is to be had. For example, consider the store that wants to know the elasticity of demand for fans with respect to hot days. Predictive analytics can deliver the answer to that question. Investigative computing would be more like asking a probing question, such as which products spike in demand on hot days. The idea is that the manager can gain insights that otherwise may have been overlooked. For example, if there is a difference between northern and southern states in which products spike in sales on hot days, that may be identified via investigative analytics. Perhaps the data shows that there is something worth investigating, such as differences in hot-day sales anomalies between stores located within two miles of the coast and stores further inland. There are many questions managers could ask if they think of them.
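The hot-day question above can be illustrated with a minimal sketch. It is not taken from TDWI or any particular product; the sales records, the 90°F cutoff and the function names `hot_day_lift` and `spiking_products` are all hypothetical, chosen only to show how an open-ended scan might surface products worth a closer look.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical daily sales records: (product, daily high in °F, units sold).
SALES = [
    ("fan", 95, 40), ("fan", 72, 10), ("fan", 98, 50), ("fan", 68, 8),
    ("ice cream", 95, 30), ("ice cream", 72, 20),
    ("ice cream", 98, 36), ("ice cream", 68, 16),
    ("soup", 95, 5), ("soup", 72, 12), ("soup", 98, 4), ("soup", 68, 15),
]

HOT_DAY_F = 90  # assumed cutoff for a "hot" day


def hot_day_lift(records, threshold=HOT_DAY_F):
    """Return {product: mean hot-day units / mean other-day units}."""
    hot, other = defaultdict(list), defaultdict(list)
    for product, temp, units in records:
        (hot if temp >= threshold else other)[product].append(units)
    # Only compare products seen on both kinds of day, with nonzero baseline.
    return {
        p: mean(hot[p]) / mean(other[p])
        for p in hot
        if p in other and mean(other[p]) > 0
    }


def spiking_products(records, min_lift=1.5):
    """List products whose hot-day sales exceed the baseline by min_lift."""
    lifts = hot_day_lift(records)
    ranked = sorted(lifts.items(), key=lambda kv: kv[1], reverse=True)
    return [(p, round(lift, 2)) for p, lift in ranked if lift >= min_lift]


print(spiking_products(SALES))
```

The output is only a ranked list of candidates. Whether fans really respond to temperature, and by how much, would still be a question for predictive analytics.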
What all this means is that investigational computing could be described as brainstorming on steroids. While it replicates the age-old technique of noticing things about your business and then investigating the phenomenon for insights, it does so in a way that harnesses the power of big data. When a business has a tremendous amount of data, investigational computing will run through that data to find insights that might otherwise remain overlooked because of the sheer volume. It would be impossible, without investigational computing, to notice all of the meaningful trends or anomalies in a data set as large as that held by a company like Google or Amazon, for example. Any one manager can notice a handful of trends or anomalies, but even a superstar manager who noticed a new trend every day would miss some, simply because the data set being analyzed is too large. What investigational computing does is allow for all of the trends and anomalies to be identified. Thus, investigative analytics (IA) is more powerful than any process by which the same task is done manually, for the same reasons that predictive analytics are so powerful: processing speed and the comprehensive nature of a technology that examines every single data point, rather than just those that someone happens to notice. In fact, processing speed has been cited as a major factor in the rise of predictive analytics, and because the data sets are the same size, it follows naturally that increased processing speeds have been critical to the development of investigational computing.
As of 2013, TDWI was reporting that there were two main components to the physical infrastructure of investigative computing. Hadoop is an open-source platform that distributes data storage across many machines. Originally conceived as a means of defending against storage hardware failure, Hadoop has a significant role to play in investigative computing. One of the benefits of Hadoop is that it allows for much faster processing of data, because it spreads the work across multiple nodes (similar to the way BitTorrent works). In this way it delivers much greater capacity for data storage and processing, allowing companies to process much greater volumes of data, which is critical to deriving value from data sets in the petabyte class (Swoyer, 2013).
Investigative computing is also iterative in nature. In other words, it learns. With each iteration of data processing the algorithms become more refined, and this results in better analysis of the data each time. As a consequence, the insights gained from this system are better as well. By running models more quickly, and allowing those models to be refined more frequently, investigational computing increases not only the speed at which analysis is performed but also its accuracy. This, in turn, facilitates more open-ended queries. Another critical benefit is that this entire process is so efficient that it can be embedded in operational processes. Typically, business analytics requires that the data be presented to a human, who then analyzes it and uses it to make decisions. The infrastructure that underlies investigational computing goes to the next level by being embedded directly into processes, using the analytical output to make decisions without a human intermediary. The result is potentially very powerful, as a key bottleneck (human intervention) can be bypassed in many instances.
How Investigational Computing Is Used
As noted by TDWI (2013) and Swoyer (2013), investigational computing performs a different role than either traditional or predictive analytics. Investigational computing is therefore a complement to these other forms. The first major aspect of its role is that the output of investigative computing can be presented to managers who do not have a firm grasp of statistics, and this can be done without translation. The programs themselves are not testing correlations and causal relationships, but rather simply identifying trends and anomalies. This lowers the bar with respect to understanding the basic concepts at work: it is easier by far to understand standard deviations than more advanced regression statistics. Thus, the information gained from investigational computing is easier to present in a way that can be understood by the average end user.
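To make the point about standard deviations concrete, the following is a minimal sketch of one simple way to flag anomalies: mark any value that lies more than a chosen number of standard deviations from the mean. The data, the cutoff of 2.0 and the function name `flag_anomalies` are assumptions made for illustration; the sources do not specify which technique investigative computing platforms actually use.

```python
from statistics import mean, stdev

# Hypothetical daily unit sales for one product at one store.
DAILY_UNITS = [20, 22, 19, 21, 23, 20, 18, 22, 21, 60, 20, 19]


def flag_anomalies(values, z_cutoff=2.0):
    """Return (index, value, z-score) for points far from the mean."""
    mu = mean(values)
    sigma = stdev(values)  # sample standard deviation
    if sigma == 0:
        return []  # every value is identical, so nothing stands out
    flagged = []
    for i, v in enumerate(values):
        z = (v - mu) / sigma  # distance from the mean in standard deviations
        if abs(z) >= z_cutoff:
            flagged.append((i, v, round(z, 2)))
    return flagged


for day, units, z in flag_anomalies(DAILY_UNITS):
    print(f"day {day}: {units} units is {z} standard deviations from the mean")
```

A manager can read the result ("day 9 was far outside the normal range") without any knowledge of regression, which is the accessibility argument made above.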
The end user can then, when presented with this information, start to ask more questions. The manager has the ability to determine which things are worth pursuing. Remember that investigational computing does not deliver correlations or anything of that kind; it just presents information that a manager can follow up on. An example would be something like Amazon’s “people who bought x also bought y.” Predictive analytics would try to assign odds to a given user’s likelihood of purchasing y, given that they have put x in their cart. The investigational aspect, by contrast, is just to determine what the most common co-purchases are. If management wants to find a causal relationship between the co-purchases, or to determine the odds of co-purchase, price elasticities or purchase triggers, those are all questions that a manager can follow up on using predictive analytics, once the basic data has been uncovered via investigational computing.
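The investigational half of the "people who bought x also bought y" example amounts to counting co-purchases. The sketch below is a hypothetical illustration, not a description of how Amazon implements the feature; the baskets and the function names `co_purchase_counts` and `also_bought` are invented for the example.

```python
from collections import Counter
from itertools import combinations

# Hypothetical baskets; each is the set of items in one order.
BASKETS = [
    {"tent", "sleeping bag", "lantern"},
    {"tent", "sleeping bag"},
    {"tent", "lantern"},
    {"sleeping bag", "pillow"},
    {"tent", "sleeping bag", "pillow"},
]


def co_purchase_counts(baskets):
    """Count how often each unordered pair of items shares a basket."""
    counts = Counter()
    for basket in baskets:
        # Sorting gives each pair a single canonical (a, b) ordering.
        for a, b in combinations(sorted(basket), 2):
            counts[(a, b)] += 1
    return counts


def also_bought(item, baskets, top_n=3):
    """Return the items most often bought together with `item`."""
    partners = Counter()
    for (a, b), n in co_purchase_counts(baskets).items():
        if a == item:
            partners[b] += n
        elif b == item:
            partners[a] += n
    return partners.most_common(top_n)


print(also_bought("tent", BASKETS))
```

The counts say nothing about why the items are bought together or how likely a given shopper is to add the second item; those remain follow-up questions for predictive analytics.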
Investigational computing can be embedded into operational processes (Swoyer, 2013). Conventional business intelligence (BI) has tended to lack the analytical edge that investigational computing delivers. So while BI can tell us what other products the people who bought x also bought, it requires investigational computing to combine this with other data and actually operationalize that information. In this way, investigative analytics “provides a foundation for both traditional and advanced analytic activities” (TDWI, 2013).
This concept also fits well with the Internet of Things concept.…