Overfitting means that a model fits the training data too well, so it fails on data it has not seen. This tends to occur when you have a limited data set or too many attributes.
Let us look at the following example of US presidential elections.
With the US election less than a week away, it is a good time to look at this analysis.
1980 was the first time a president was elected after a divorce. So if we collect data that includes the candidates' marital status, then, since no divorced candidate had become president before 1980, our model would predict with 100% confidence that Ronald Reagan would lose!
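The divorce example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the records below are made-up toy data, not real election results, and `fit` and `predict_win_probability` are hypothetical helper names. The "model" simply memorizes the win rate for each value of the marital-status attribute, which is exactly the kind of rule that overfits a small data set.

```python
from collections import defaultdict

# Made-up toy records, NOT real election data: (divorced, won)
training = [
    (False, True), (False, False), (False, True),
    (False, False), (True, False), (False, True),
]

def fit(records):
    """Memorize the observed win rate for each value of 'divorced'."""
    wins, totals = defaultdict(int), defaultdict(int)
    for divorced, won in records:
        totals[divorced] += 1
        wins[divorced] += int(won)
    return {value: wins[value] / totals[value] for value in totals}

def predict_win_probability(model, divorced):
    """Return the memorized win rate for a candidate's marital status."""
    return model[divorced]

model = fit(training)

# A divorced candidate gets a win probability of zero, because the single
# divorced candidate in the toy training data happened to lose.
reagan_1980 = predict_win_probability(model, divorced=True)
print(reagan_1980)
```

The rule is perfectly consistent with the training records, yet it rests on one divorced candidate. More data, or dropping an attribute that has no real bearing on the outcome, would keep the model from drawing that conclusion.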
Let us look at overfitting in the context of Sri Lankan presidential elections. Unlike the USA, Sri Lanka has held only a limited number of elections. We have had eight presidential elections: in 1982, 1988, 1994, 1999, 2005, 2010, 2015 and 2019.
I hope you now understand how important it is to collect an adequate data set with the relevant attributes, rather than including every attribute you come across.