MOTIVATION
Data analysis is the process by which we glean understanding from data. While the origins of data analysis extend at least as far back as Francis Bacon, and certainly further, John Tukey first introduced “Data Analysis” as a field of academic study in 1962.
Improvements in technology have increased both the amount of data that we can store and the speed with which we can analyze it (Friedman 1997). With each improvement, data analysis becomes more relevant. Modern commentators now claim we live in the midst of a “data deluge,” where we no longer have the cognitive power to understand all of the data available (Hey 2003). Further advances in data collection technology will require further advances in data analysis methods.
The fields of Machine Learning, Data Mining, InfoVis, and Visual Analytics are all attempts to improve upon Data Analysis to better meet our analytical needs. But even with the research already done in these areas, scientists claim that there is very little Data Analysis theory to build upon, and that the theory that is available is hard to access (Unwin 2001, Mallows 2006, Cox 2007). This lack of theoretical understanding stymies improvement in the field. Many academic disciplines create innovations by extending existing theory in new ways. Data analysis, by contrast, appears to proceed through a trial and error process.
Researchers have offered multiple suggestions to remedy this. Cox and Mallows propose reviewing data analysis case studies to induce a general pattern of analysis. Unwin suggests creating a pattern language of Data Analysis similar to the pattern language first proposed by architects Alexander, Ishikawa, and Silverstein (1977), and used successfully in the field of software engineering (Coplien 1996). While we are intrigued by Unwin’s proposition, we do not presently have the resources to define a complete pattern language. However, we begin our examination of data analysis by reviewing the data analysis case studies that exist in the literature of statistical consulting, as suggested by Cox and Mallows.
RESEARCH QUESTION
Can the sensemaking model of cognitive science provide a theoretical model for data analysis?
PREVIOUS MODELS OF DATA ANALYSIS
Past efforts to describe data analysis reveal a lack of consensus about the process. Below are three illustrations of the process provided by Box (1976), Box, Hunter, and Hunter (1978), and Wild and Pfannkuch (1999).