Difference between Data Analysis and Statistical Analysis

Last updated: 03-26-2020

Fifty years ago, the lines between "data analysis" and "statistical analysis" were pretty clear. But as data analysis evolved, those lines blurred. The distinction between the two terms is now very much a grey area, but a few notable differences remain.

Data scientists and statisticians typically define "data analysis" in different ways.

Both data scientists and statisticians use data to make inferences about consumer cohorts, a general population, or a target market. However, they approach data analysis quite differently.

The lifecycle of data is key to the workflow in data science.

You can perform many data analysis steps in data science with very little statistical basis, such as data preparation and data transformation.
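As a rough illustration of that point, here is a minimal Python sketch of a data prep and transformation step that involves no statistics at all. The records, field names, and the `clean_row` helper are all invented for this example; they are assumptions, not part of any particular workflow.

```python
# Hypothetical raw records, invented purely for illustration.
raw_rows = [
    {"customer": "  Alice ", "spend": "120.50", "region": "north"},
    {"customer": "BOB", "spend": "", "region": "South"},
    {"customer": "carol", "spend": "80", "region": " SOUTH "},
]


def clean_row(row):
    """Return a tidied copy of one record, or None if spend is missing."""
    spend_text = row["spend"].strip()
    if not spend_text:
        return None  # data prep: drop records with no spend value
    return {
        # data transformation: normalize whitespace, case, and types
        "customer": row["customer"].strip().title(),
        "spend": float(spend_text),
        "region": row["region"].strip().lower(),
    }


# Keep only the records that survived cleaning.
cleaned = [r for r in (clean_row(row) for row in raw_rows) if r is not None]
```

Nothing here estimates, tests, or infers anything. It is still data analysis work, but it is housekeeping rather than statistics.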

Generally speaking, statistical analysis is the science of uncovering patterns and trends in data using statistics. Note that the key word here is "statistics": to perform any statistical analysis at all, you have to use statistics. Historically, only statisticians used statistical techniques on data, and data science wasn't even a thing in the mainframe days of tape mounting and COBOL programming. But as data science has evolved, it has blended with many areas once thought to be the exclusive realm of the statistician: data visualization, optimization, and high-dimensional analysis, to name but a few.
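By way of contrast with pure data prep, here is a small Python sketch of a step that does rest on statistics: comparing two groups with a Welch t-statistic. The two samples and the `welch_t` helper are made up for this example, and the choice of test is an assumption made for illustration, not a recommendation.

```python
import math
import statistics

# Hypothetical measurements for two groups, invented for illustration.
group_a = [12.1, 11.8, 12.4, 12.0, 12.3]
group_b = [11.2, 11.5, 11.0, 11.4, 11.3]


def welch_t(a, b):
    """Welch t-statistic: difference in means over its standard error."""
    mean_diff = statistics.mean(a) - statistics.mean(b)
    # statistics.variance is the sample variance (n - 1 denominator).
    std_err = math.sqrt(
        statistics.variance(a) / len(a) + statistics.variance(b) / len(b)
    )
    return mean_diff / std_err


t_stat = welch_t(group_a, group_b)
```

The larger the absolute value of `t_stat`, the harder it is to put the difference between the groups down to chance. Turning that into a p-value would also require the degrees of freedom, which this sketch leaves out.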

There is a large grey area: data analysis is a part of statistical analysis, and statistical analysis is a part of data analysis. Any competent data analyst will have a good grasp of statistical tools, and some statisticians will have some experience with programming languages like R.

If you're confused about where the line is, or where that separation occurs, the key question really is:

Are the two fields of data science and statistics really separate entities?

If you hold to the "old school" way of thinking, which pits statistics (i.e. a grey-haired statistician scribbling formulas in a binder, sifting through tables and performing obscure hypothesis tests understood by few) against data science (sexy, at the forefront of the technological revolution), then you could argue that yes, they are completely separate. However, if you hold the belief that modern statistics is more about "...the broader idea of greater data science (e.g. by putting more focus on computation in education, research and communication)" (Carmichael & Marron, 2018), then the answer is probably no.

Carmichael, I. & Marron, J. (2018). Data Science vs. Statistics: two cultures? Perspectives on data science for advanced statistics.

6 Methods of data collection and analysis - The Open University

Whats the difference between statistical analysis and data analysis?