As machine learning engineers or data scientists, we often build models that don’t behave the way we want, and we need to debug them. There can be many reasons for this: for instance, the error rate is high, or the model works well on training data but underperforms on real-world data.
Luckily, in these situations, we have quite a few ways to improve our models:
1. Adjust model parameters
2. Get more training data
3. Create a (relatively) more complex model
However, before we jump into the solutions, we first need to understand the problems we’re trying to solve. So in this post (part one), we’ll discuss two common problems in machine learning model development:
- High variance: This problem occurs when the algorithm fits the training data too closely, capturing noise along with the underlying pattern. In other words, the model is bad at generalizing: its generalization error, the error measured on previously unseen data, is much higher than its training error. As one can guess, such a model will perform poorly on unseen data. This problem is also called overfitting.
- High bias: This problem will occur when the algorithm…
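
To make the high-variance case concrete, here is a minimal sketch of a model that memorizes its training set. The noisy sine data, the 1-nearest-neighbour regressor, and all function names (`make_data`, `knn_predict`, `mse`) are assumptions chosen for illustration, not something this post prescribes. With `k=1`, each training point is its own nearest neighbour, so the training error is zero, while the error on fresh samples from the same distribution is not.

```python
import math
import random


def make_data(n, rng, noise=0.3):
    """Draw n noisy samples of y = sin(x) on [0, 2*pi] (illustrative data)."""
    xs = [rng.uniform(0, 2 * math.pi) for _ in range(n)]
    ys = [math.sin(x) + rng.gauss(0, noise) for x in xs]
    return xs, ys


def knn_predict(train_x, train_y, x, k):
    """Average the targets of the k training points nearest to x."""
    nearest = sorted(range(len(train_x)), key=lambda i: abs(train_x[i] - x))[:k]
    return sum(train_y[i] for i in nearest) / k


def mse(train_x, train_y, xs, ys, k):
    """Mean squared error of the k-NN model on the points (xs, ys)."""
    errors = [(knn_predict(train_x, train_y, x, k) - y) ** 2 for x, y in zip(xs, ys)]
    return sum(errors) / len(errors)


rng = random.Random(0)  # fixed seed so the run is repeatable
train_x, train_y = make_data(50, rng)
test_x, test_y = make_data(500, rng)  # stands in for "previously unseen data"

# k=1 memorizes the training set: each training point predicts itself.
train_error = mse(train_x, train_y, train_x, train_y, k=1)
# The same model evaluated on unseen data estimates the generalization error.
test_error = mse(train_x, train_y, test_x, test_y, k=1)

print(f"training MSE: {train_error:.3f}")
print(f"held-out MSE: {test_error:.3f}")
```

The gap between the two printed numbers is the symptom to look for: a training error near zero alongside a clearly larger held-out error suggests the model has fit the noise rather than the underlying pattern.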