ONE HOT ENCODING (Including Dummy Variable Trap) and ORDINAL ENCODING


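For categorical variables with no ordinal relationship, integer encoding is insufficient at best and misleading at worst: the model may treat the integer codes as an order or magnitude that does not exist. One-hot encoding avoids this by giving each category its own binary column.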
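As a minimal sketch of that idea, the snippet below one-hot encodes a nominal feature with pandas. The `Color` column and its values are made up for illustration and do not come from any dataset in this article. It also shows one common way to sidestep the dummy variable trap: with all k indicator columns present, every row sums to 1, which is perfectly collinear with an intercept term in a linear model, so one level is dropped.

```python
import pandas as pd

# Hypothetical nominal feature: the colour names have no natural order.
df = pd.DataFrame({"Color": ["Red", "Green", "Blue", "Green"]})

# One binary indicator column per category (k columns for k categories).
one_hot = pd.get_dummies(df["Color"], prefix="Color", dtype=int)

# Dropping the first level keeps k - 1 columns; the dropped category is
# implied when all remaining indicators are 0.
reduced = pd.get_dummies(df["Color"], prefix="Color", drop_first=True, dtype=int)

print(one_hot)
print(reduced)
```

Whether to drop a level depends on the model: it matters for unregularised linear models with an intercept, while tree-based models are generally unaffected by the redundant column.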


When a categorical feature has an intrinsic natural order, such as Low, Medium and High, it must be encoded differently from a nominal variable: the encoding should preserve that order, for example Low = 1, Medium = 2 and High = 3.

   Score
0  Low
1  Low
2  Medium
3  Medium
4  High
5  Low
6  Medium
7  High
8  Low

   Score   Scale
0  Low     1
1  Low     1
2  Medium  2
3  Medium  2
4  High    3
5  Low     1
6  Medium  2
7  High    3
8  Low     1
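The two tables above can be reproduced with a short pandas sketch. The `Score` and `Scale` column names and the values are taken from the tables; the `scale_mapper` dictionary is an assumed name, and an explicit mapping is only one of several ways to do ordinal encoding.

```python
import pandas as pd

# The ordinal feature from the first table.
df = pd.DataFrame({
    "Score": ["Low", "Low", "Medium", "Medium", "High",
              "Low", "Medium", "High", "Low"]
})

# Explicit mapping that preserves the order Low < Medium < High.
scale_mapper = {"Low": 1, "Medium": 2, "High": 3}

# Any value missing from the mapping would become NaN, which makes
# unexpected categories easy to spot.
df["Scale"] = df["Score"].map(scale_mapper)

print(df)
```

Writing the mapping by hand keeps the order under your control. An encoder that assigns integers alphabetically would produce High < Low < Medium, which is not the intended order.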


