Read about MNIST here: https://en.wikipedia.org/wiki/MNIST_database
Questions about curse of dimensionality noted here.
- Fork and clone this repo
- Load the 60,000 handwritten TRAIN images into Pandas
- Classify each of the 10,000 handwritten TEST images and compute the accuracy over that dataset
- For example, suppose you classified 9,000 correctly
- Then 1,000 were classified incorrectly
- So your accuracy would be 9,000 / 10,000 = 90%
- Try different values of "k" and see how each affects accuracy ("k" is a hyperparameter)
- Each image is a 28x28 matrix
- Flatten it into a 784-element vector and use Euclidean distance (L2) to find its k nearest neighbors
- What is your accuracy for the test dataset?
- Push your final code to GitHub
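
The steps above can be sketched roughly as follows. This is a minimal illustration, not a reference solution: it uses only the Python standard library and tiny made-up 2x2 "images" instead of the real 28x28 MNIST data, and the names `flatten`, `euclidean`, `knn_predict`, and `accuracy` are hypothetical helpers chosen for this example. Loading the actual files into Pandas is left out, since the file format you use is up to you.

```python
import math
from collections import Counter


def flatten(image):
    """Turn a 2-D image (a list of rows) into a flat vector."""
    return [pixel for row in image for pixel in row]


def euclidean(a, b):
    """Euclidean (L2) distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_predict(train_vectors, train_labels, query, k=1):
    """Return the majority label among the k training vectors nearest to query."""
    # Sort training indices by distance to the query, closest first.
    order = sorted(
        range(len(train_vectors)),
        key=lambda i: euclidean(train_vectors[i], query),
    )
    votes = Counter(train_labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]


def accuracy(train_vectors, train_labels, test_vectors, test_labels, k=1):
    """Fraction of test vectors whose predicted label matches the true label."""
    correct = sum(
        knn_predict(train_vectors, train_labels, vector, k) == label
        for vector, label in zip(test_vectors, test_labels)
    )
    return correct / len(test_labels)


# Toy data (an assumption for illustration): 2x2 images, two classes.
train_images = [
    [[0, 0], [0, 0]],
    [[1, 0], [0, 0]],
    [[9, 9], [9, 9]],
    [[8, 9], [9, 9]],
]
train_labels = [0, 0, 1, 1]
test_images = [
    [[0, 1], [0, 0]],
    [[9, 8], [9, 9]],
]
test_labels = [0, 1]

train_vectors = [flatten(image) for image in train_images]
test_vectors = [flatten(image) for image in test_images]

# Try a few odd values of k (odd values avoid tied votes with two classes).
for k in (1, 3):
    score = accuracy(train_vectors, train_labels, test_vectors, test_labels, k)
    print(f"k={k}: accuracy={score:.0%}")
```

A brute-force loop like this compares every test image against every training image, so on the full 60,000 x 10,000 problem it will be slow in pure Python; a vectorized distance computation (for example with NumPy) is one likely way to speed it up.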