knn + vc dimension
- Used to classify data
- if the k in knn is =1. then we only use the nearest neighbour to define the category
- if k=11, then we use the 11 nearest neighbors
- it's always easier to work with an odd value of k
How to pick a value for k?
- Have to try out a few values of k before settling on one. Do this by pretending that a part of the training data is "unknown".
- low values for k can be noisy and subject to the effects of outliers
- large values of k smooth over things
#VC Dimension
- it is a value based on the no. of points that can be shattered by the classifier in a model.
- the no. of points or labels that the classifier in the model can correctly categorize.