To solve the mystery of the bias-variance trade-off, we first need to understand what bias and variance mean in data science. Why do they matter so much when analyzing a machine learning model? Do they cause problems during training or testing? If so, what techniques can handle those problems?
Let’s dive in.
In any machine learning model, the prediction error is made up of three components: bias, variance, and irreducible error (noise).
In this post, we will discuss the error caused by bias and variance.
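For readers who like to see this as a formula, the three components are commonly written as follows for squared-error loss. The notation here is an assumption and not from the post itself: the data follow y = f(x) + ε with zero-mean noise of variance σ², the trained model is written as f̂, and the expectation is taken over training sets and noise.

```latex
\mathbb{E}\left[\left(y - \hat{f}(x)\right)^2\right]
  = \underbrace{\left(\mathbb{E}[\hat{f}(x)] - f(x)\right)^2}_{\text{bias}^2}
  + \underbrace{\mathbb{E}\left[\left(\hat{f}(x) - \mathbb{E}[\hat{f}(x)]\right)^2\right]}_{\text{variance}}
  + \underbrace{\sigma^2}_{\text{irreducible noise}}
```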
What is variance?
In layman’s terms, the variance of a dataset describes how widely the values of its features are spread.
Here, however, I am referring to variance as a component of the prediction error.
Suppose you train the model on some dataset X. If you replace the training data with a different dataset Y, the predictions on the same test data will change. The variance term of the error measures how much the predictions vary from one training set to another.
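The idea of retraining on different data and watching the predictions move can be sketched in a few lines. This is only an illustration under assumed conditions: the sine-shaped target, the noise level, the polynomial models and the function name prediction_variance are all hypothetical choices, not something from this post.

```python
import numpy as np

def prediction_variance(degree, n_sets=200, n_train=20, noise=0.3, seed=0):
    """Average variance of predictions at fixed test points across training sets."""
    rng = np.random.default_rng(seed)
    x_test = np.linspace(0.0, 1.0, 50)
    preds = np.empty((n_sets, x_test.size))
    for i in range(n_sets):
        # Draw a fresh training set from the same (assumed) data source.
        x = rng.uniform(0.0, 1.0, n_train)
        y = np.sin(2 * np.pi * x) + rng.normal(0.0, noise, n_train)
        # Fit a polynomial of the given degree and predict on the fixed test points.
        coeffs = np.polyfit(x, y, degree)
        preds[i] = np.polyval(coeffs, x_test)
    # Variance over training sets at each test point, averaged over test points.
    return preds.var(axis=0).mean()

low = prediction_variance(degree=1)   # simple model
high = prediction_variance(degree=9)  # flexible model
print(f"degree 1: {low:.4f}   degree 9: {high:.4f}")
```

In this setup the flexible degree-9 model should show the larger number, because its predictions depend much more on which training points it happened to see.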
Issues with variance error
High variance typically shows up as a training error that is very low and a testing error that is much higher than the training error. This is the overfitting case.
e.g. Train set error: 2%, test set error: 10%
Because the training error is very low, we might think at first that the model has learnt well. But the testing error is far higher than the training error, so this is a case of overfitting, i.e. high variance.
What is bias?
Bias error measures how far the predicted output deviates from the actual output.
If a machine learning model is trained on different sets of data (as in the previous case), the predictions will be different each time. The bias error captures how far these predictions, on average, are from the actual value.
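Bias can be estimated in a similar way: average the predictions over many training sets and compare that average with the true value. The sketch below assumes we know the true function, which is only possible with synthetic data. The sine target, the polynomial models and the name squared_bias are hypothetical choices made for illustration.

```python
import numpy as np

def squared_bias(degree, n_sets=300, n_train=30, noise=0.3, seed=0):
    """Squared gap between the average prediction and the true function."""
    rng = np.random.default_rng(seed)
    x_test = np.linspace(0.0, 1.0, 50)
    true_y = np.sin(2 * np.pi * x_test)  # assumed ground truth
    mean_pred = np.zeros_like(x_test)
    for _ in range(n_sets):
        # New training set each round, same underlying function.
        x = rng.uniform(0.0, 1.0, n_train)
        y = np.sin(2 * np.pi * x) + rng.normal(0.0, noise, n_train)
        mean_pred += np.polyval(np.polyfit(x, y, degree), x_test) / n_sets
    # Bias^2: how far the average prediction is from the truth.
    return np.mean((mean_pred - true_y) ** 2)

bias_line = squared_bias(degree=1)     # a straight line cannot follow a sine curve
bias_quintic = squared_bias(degree=5)  # a degree-5 polynomial can get close
print(f"degree 1: {bias_line:.4f}   degree 5: {bias_quintic:.4f}")
```

The straight line stays far from the curve no matter which data it sees, which is what high bias means.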
Problem with bias error
If the difference between the predicted output and the actual value is high, the model has not learnt the underlying pattern well. This is a case of underfitting.
e.g. Training error: 17%, testing error: 18%
In this case the difference between training error and testing error is very low, but the error itself is high even on the training set. It means the model has not learnt properly.
Sometimes even a 15% error is tolerable, and sometimes even 3% is intolerable. So how do we decide the benchmark?
The benchmark is set by the optimal error, which is often approximated by human-level error. For example, if blurry images of dogs are given, even a human may have problems identifying them correctly. It is therefore quite possible that a machine learning model will show a high error (maybe 15%, 16% or more), and in that case the error is tolerable.
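The rules of thumb above can be put together in a small helper that compares training error with the benchmark and testing error with training error. The function name diagnose and the 3-point tolerance are assumptions made for this sketch. A suitable threshold depends on the problem.

```python
def diagnose(train_error, test_error, baseline_error=0.0, tolerance=0.03):
    """Label a model from its error rates (fractions, e.g. 0.17 for 17%).

    baseline_error is the benchmark (optimal or human-level error).
    tolerance is an arbitrary cut-off chosen for this illustration.
    """
    bias_gap = train_error - baseline_error  # how far training error is above the benchmark
    variance_gap = test_error - train_error  # how much worse the model is on unseen data
    issues = []
    if bias_gap > tolerance:
        issues.append("high bias (underfitting)")
    if variance_gap > tolerance:
        issues.append("high variance (overfitting)")
    return issues or ["acceptable"]

print(diagnose(0.02, 0.10))                       # the overfitting example
print(diagnose(0.17, 0.18))                       # the underfitting example
print(diagnose(0.17, 0.18, baseline_error=0.15))  # blurry images: humans also err about 15%
```

With a 15% benchmark, the same 17% and 18% errors that looked like underfitting become acceptable, which is the point of choosing the benchmark first.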
What is the bias-variance trade-off?
As we have seen above, to obtain the best output we need both bias and variance to be low.
But in practice it is difficult to make both low at the same time, because a change that reduces one tends to increase the other.
Suppose your model shows high variance, i.e. the overfitting case. You will then need more data or some regularization technique to reduce overfitting.
But with more data, the same network may no longer have enough capacity to learn properly. In that case you need to make the neural network or machine learning model bigger.
Now that the network is bigger, it can again fit every detail of the training data, and that may result in overfitting.
And this cycle goes on…
So, what’s the solution?
The best solution is a “deep learning network”.
With a deep learning network, you can make the network architecture larger and use more data at the same time.
By increasing the number of hidden layers, you can control the learning of features to some extent.
Regularization methods can also be used to manage the trade-off between bias and variance.
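To make the regularization point concrete, here is a minimal sketch using ridge (L2) regularization on polynomial features instead of a neural network, which is a swapped-in and simpler setting. The data, the degree-9 features, the lambda values and the name ridge_fit are assumptions for illustration. For simplicity the constant term is penalized as well.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Closed-form ridge solution: solve (X^T X + lam * I) w = X^T y."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n_features), X.T @ y)

rng = np.random.default_rng(0)
x_train = rng.uniform(0.0, 1.0, 15)
y_train = np.sin(2 * np.pi * x_train) + rng.normal(0.0, 0.3, 15)
X_train = np.vander(x_train, 10)  # degree-9 polynomial features, easy to overfit

for lam in (1e-8, 1e-3, 1.0):
    w = ridge_fit(X_train, y_train, lam)
    train_mse = np.mean((X_train @ w - y_train) ** 2)
    print(f"lambda={lam:g}  weight norm={np.linalg.norm(w):.2f}  train MSE={train_mse:.4f}")
```

A larger lambda shrinks the weights and lets the training error rise a little. The model becomes less sensitive to the particular training set (lower variance) at the cost of a slightly worse fit (higher bias), so lambda acts as a dial on the trade-off.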