The Data Daily

What is Data Mining?

Let's look at data mining, a process used to discover patterns in large data sets. What makes data mining interesting is that it draws on methods at the intersection of statistics, machine learning, and database systems. The overall goal is to extract the most relevant information from a given data set and structure it for further use. Beyond the raw analysis step, it also involves database and data management aspects, data pre-processing, model and inference considerations, interestingness metrics, complexity considerations, post-processing of discovered structures, visualization, and online updating.

Before we look at possible applications, I'd like to point out that, as with many terms in this field, the term data mining is a misnomer. Why is that? Taken literally, mining would refer to the extraction of the data itself. In data mining, however, the goal is to detect and extract patterns and knowledge from a large data set. The term is also a buzzword that is frequently applied to any kind of large-scale information processing, such as data warehousing. Another application area is artificial intelligence (e.g., machine learning), where data mining supports better-informed decisions through information systems for business intelligence and decision-making based on large data sets.

The actual data mining task is the semi-automatic or automatic analysis of large quantities of data to extract complex patterns such as groups of data records (cluster analysis), unusual records (anomaly detection), and dependencies (association rule mining, sequential pattern mining). This usually involves database techniques such as spatial indices. The resulting patterns are a kind of summary of the input data and can be used in predictive analysis or machine learning applications. For example, the data mining step might identify multiple groups in the data, which a decision support system can then use to obtain more accurate prediction results.
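To make two of these pattern types a bit more concrete, here is a minimal sketch in plain Python. It is not taken from any particular data mining library: the function names `pair_rules` and `flag_anomalies`, the thresholds, and the toy data are all assumptions chosen for illustration. The first function counts how often pairs of items occur together and derives simple association rules from support and confidence. The second flags unusual values with a modified z-score based on the median, which is one common heuristic among many.

```python
from itertools import combinations
from statistics import median


def pair_rules(baskets, min_support=0.5, min_confidence=0.7):
    """Return (antecedent, consequent, support, confidence) rules for item pairs.

    support    = share of baskets that contain both items
    confidence = share of baskets with the antecedent that also contain the consequent
    """
    n = len(baskets)
    item_count = {}
    pair_count = {}
    for basket in baskets:
        items = sorted(set(basket))  # ignore duplicates within one basket
        for item in items:
            item_count[item] = item_count.get(item, 0) + 1
        for pair in combinations(items, 2):
            pair_count[pair] = pair_count.get(pair, 0) + 1

    rules = []
    for (a, b), count in pair_count.items():
        support = count / n
        if support < min_support:
            continue
        # Each frequent pair can yield a rule in either direction.
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / item_count[lhs]
            if confidence >= min_confidence:
                rules.append((lhs, rhs, support, confidence))
    return sorted(rules)


def flag_anomalies(values, threshold=3.5):
    """Return values whose modified z-score (median and MAD based) exceeds threshold."""
    med = median(values)
    mad = median(abs(v - med) for v in values)  # median absolute deviation
    if mad == 0:
        return []  # no spread in the data, so nothing can be scored
    return [v for v in values if 0.6745 * abs(v - med) / mad > threshold]


# Hypothetical toy data, for illustration only.
baskets = [
    ["bread", "butter"],
    ["bread", "butter", "milk"],
    ["bread", "milk"],
    ["butter", "bread"],
    ["milk"],
]
amounts = [20, 22, 19, 21, 23, 20, 500]

for lhs, rhs, support, confidence in pair_rules(baskets):
    print(f"{lhs} -> {rhs}: support={support:.2f}, confidence={confidence:.2f}")
print("Unusual amounts:", flag_anomalies(amounts))
```

On real data sets you would typically reach for an established library and a proper algorithm (for example Apriori or FP-growth for association rules), since counting every pair like this does not scale to large item catalogs.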

Data mining has many use cases across industries, such as financial fraud detection, e-commerce, and other business applications. Detecting useful patterns in large data sets can speed up projects, power applications, and provide precise, tailored information for decision-making.

For this blog post, I have used the following resources:

As always, have a great day and stay curious!
