
Essential Guide to Machine Learning Model Monitoring in Production


Model monitoring is an important component of the end-to-end data science model development pipeline. The robustness of a model depends not only on how well it was trained on the engineered features but also on how well it is monitored after deployment.

A machine learning model's performance typically degrades over time, so it's essential to detect the cause of the decline. The main cause is usually drift in the independent and/or dependent features, which may violate the model's assumptions about the data distribution.

In this article, we will discuss various techniques to detect data drift in the independent or dependent features of the production inference data.

There are various reasons why the performance of a model degrades over time, chiefly: drift in the distribution of the input features (data drift), changes in the relationship between the features and the target (concept drift), and data-quality issues in the upstream pipelines.

These are the key reasons why model performance decays. The deployed model needs to be monitored to measure both its performance and the data distribution. Once the cause of the decay is identified, the existing model is retrained on an updated dataset.

The actual target labels for the inference data are usually not available upfront, so it's difficult to measure model performance using standard evaluation metrics such as precision, recall, accuracy, log-loss, etc.

Sometimes it takes a while before the actual target labels become available. In the meantime, one can still assess the robustness of the model by observing the data distribution. There are various techniques to measure data drift in the independent and dependent features.

There are various aspects to monitoring the drift in the independent features.

If we observe a change in the distribution of the engineered or raw features of the inference data, we can expect a decline in model performance. Popular statistical techniques to measure this deviation include two-sample tests such as the Kolmogorov-Smirnov (KS) test and divergence measures such as the Population Stability Index (PSI), as sketched below.
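As a minimal sketch on synthetic data, a two-sample KS test can flag a shifted numeric feature; the feature values and the 0.05 significance threshold are illustrative assumptions, not from the original article:

```python
# Univariate drift detection with the two-sample Kolmogorov-Smirnov test.
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(42)
baseline = rng.normal(loc=0.0, scale=1.0, size=1_000)   # training-time feature values
inference = rng.normal(loc=0.5, scale=1.0, size=1_000)  # production feature values (shifted)

statistic, p_value = ks_2samp(baseline, inference)
if p_value < 0.05:  # reject "same distribution" at the 5% level (illustrative threshold)
    print(f"Drift detected: KS statistic={statistic:.3f}, p={p_value:.4f}")
else:
    print(f"No significant drift: KS statistic={statistic:.3f}, p={p_value:.4f}")
```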

One needs to monitor the summary statistics of the inference and baseline data to observe divergence between the datasets. Useful statistics include the mean, standard deviation, and quantiles (e.g., the median and quartiles), compared side by side as in the sketch below.
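A minimal sketch of such a comparison with pandas, assuming a hypothetical "age" feature and an illustrative 10% tolerance on relative change:

```python
# Compare summary statistics of a feature between baseline and inference data.
import pandas as pd

def summarize(series: pd.Series) -> pd.Series:
    # Mean, standard deviation, and quartiles of a single feature.
    return series.describe()[["mean", "std", "25%", "50%", "75%"]]

baseline = pd.Series([10, 12, 11, 13, 12, 11, 10, 12], name="age")
inference = pd.Series([14, 16, 15, 17, 15, 16, 14, 15], name="age")

report = pd.DataFrame({"baseline": summarize(baseline),
                       "inference": summarize(inference)})
report["abs_rel_change"] = (report["inference"] - report["baseline"]).abs() / report["baseline"].abs()
print(report)
print("Possible drift in 'age'" if (report["abs_rel_change"] > 0.10).any() else "Stats look stable")
```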

Machine learning models learn interactions between features to make predictions. If the pattern or joint distribution of the features changes, model performance may decrease. One common technique to detect multivariate distribution shift is a domain classifier: train a model to distinguish baseline records from inference records, and treat better-than-chance separation as evidence of drift.
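A minimal sketch of the domain-classifier idea on synthetic data; the choice of model and the interpretation of the AUC score are illustrative assumptions:

```python
# Multivariate drift detection with a "domain classifier": label baseline rows 0
# and inference rows 1, then check whether a model can tell them apart.
# AUC near 0.5 => similar distributions; well above 0.5 => the joint feature
# distribution has likely shifted.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
baseline = rng.normal(0.0, 1.0, size=(1_000, 5))   # training-time features
inference = rng.normal(0.3, 1.2, size=(1_000, 5))  # shifted production features

X = np.vstack([baseline, inference])
y = np.concatenate([np.zeros(len(baseline)), np.ones(len(inference))])

auc = cross_val_score(RandomForestClassifier(n_estimators=100, random_state=0),
                      X, y, cv=5, scoring="roc_auc").mean()
print(f"Domain-classifier ROC-AUC: {auc:.3f}")  # ~0.5 means no detectable drift
```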

The dependent feature (the target label) may not be present upfront for inference data in production. Once it becomes available, there are various techniques to measure drift and conclude whether the model's performance has deteriorated.

For classification tasks, the target class label is categorical in nature. The idea is to compare the distribution of target class labels between the inference data and the baseline data.
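As a minimal sketch, a chi-square test can compare production label counts against the proportions observed at training time; the counts and the 0.05 threshold below are illustrative assumptions:

```python
# Compare categorical target-label distributions with a chi-square test.
import numpy as np
from scipy.stats import chisquare

baseline_counts = np.array([700, 300])    # class counts at training time
inference_counts = np.array([550, 450])   # class counts in production

# Scale baseline proportions to the inference sample size to get expected counts.
expected = baseline_counts / baseline_counts.sum() * inference_counts.sum()
statistic, p_value = chisquare(f_obs=inference_counts, f_exp=expected)
print(f"chi2={statistic:.2f}, p={p_value:.4f}")
if p_value < 0.05:  # illustrative threshold
    print("Target-label distribution has drifted")
```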

For regression tasks, a histogram plot or summary statistics of the continuous target label can be used to measure drift in the data.
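One way to quantify a histogram-level shift, sketched below on synthetic data, is the Population Stability Index computed over baseline-quantile buckets; the bucket count and the common 0.1/0.25 rule-of-thumb thresholds are assumptions, not from the original article:

```python
# Population Stability Index (PSI) for a continuous target.
import numpy as np

def psi(baseline: np.ndarray, inference: np.ndarray, buckets: int = 10) -> float:
    # Bucket edges come from baseline quantiles, so each baseline bin holds ~equal mass.
    edges = np.quantile(baseline, np.linspace(0, 1, buckets + 1))
    base_pct = np.histogram(baseline, bins=edges)[0] / len(baseline)
    inf_clipped = np.clip(inference, edges[0], edges[-1])  # keep values inside baseline range
    inf_pct = np.histogram(inf_clipped, bins=edges)[0] / len(inference)
    base_pct = np.clip(base_pct, 1e-6, None)  # avoid log(0)
    inf_pct = np.clip(inf_pct, 1e-6, None)
    return float(np.sum((inf_pct - base_pct) * np.log(inf_pct / base_pct)))

rng = np.random.default_rng(7)
score = psi(rng.normal(100, 15, 5_000), rng.normal(110, 15, 5_000))
print(f"PSI={score:.3f}")  # rule of thumb: <0.1 stable, 0.1-0.25 moderate, >0.25 major shift
```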

Once the actual target labels are made available, model drift can be detected by evaluating and comparing the model's performance on standard metrics. If the metrics fall below expectations, the model needs to be retrained.
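A minimal sketch of this check, assuming a hypothetical baseline F1 score and tolerance recorded at training time:

```python
# Performance-based drift detection once ground-truth labels arrive.
from sklearn.metrics import f1_score

y_true = [1, 0, 1, 1, 0, 1, 0, 0, 1, 1]   # ground-truth labels, now available
y_pred = [1, 0, 0, 1, 0, 0, 0, 1, 1, 0]   # predictions logged at inference time

baseline_f1 = 0.90   # F1 measured on the hold-out set at training time (assumed)
tolerance = 0.05     # acceptable drop before triggering retraining (assumed)

current_f1 = f1_score(y_true, y_pred)
print(f"Current F1: {current_f1:.3f} (baseline {baseline_f1:.2f})")
if current_f1 < baseline_f1 - tolerance:
    print("Performance degraded beyond tolerance: schedule retraining")
```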

In this article, we have discussed various techniques to detect drift in the inference dataset after a model is deployed in production. Model drift may cause model performance to deteriorate over time, so it's important to keep monitoring the model after productionizing it.

Article originally posted here by Satyam Kumar. Reposted with permission.
