Currently, the `/train` route pushes all validly solved Sudoku puzzles that came from uploaded images into the S3 folder `pre_validated_data`. `sagemaker_train` is the S3 folder the SageMaker endpoint pulls from to train the model.
The reason for this split is that, right now, data is submitted as-is, without any validation that the digits the model predicts are accurate for the newly uploaded digits.
Ideally, all of the newly generated values stored in the `pre_validated_data` folder would be grouped by digit classification, then validated by humans so that only accurate predictions are added to the SageMaker training data.
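As a rough illustration of the grouping step, here is a minimal sketch. It assumes the new predictions are available as CSV text with `cell_id` and `predicted_digit` columns; both column names, and the function `group_by_predicted_digit`, are hypothetical and not taken from the current codebase.

```python
import csv
import io
from collections import defaultdict


def group_by_predicted_digit(csv_text):
    """Group prediction rows by predicted digit, in ascending digit order.

    Assumes (hypothetically) that each row has `cell_id` and
    `predicted_digit` columns.
    """
    groups = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        groups[int(row["predicted_digit"])].append(row)
    # Plain dict with sorted keys, so a reviewer sees one digit class at a time.
    return {digit: groups[digit] for digit in sorted(groups)}


# Made-up sample data, for illustration only.
SAMPLE = """cell_id,predicted_digit
puzzle1_r0c0,7
puzzle1_r0c1,3
puzzle2_r4c4,7
"""

if __name__ == "__main__":
    for digit, rows in group_by_predicted_digit(SAMPLE).items():
        print(digit, [r["cell_id"] for r in rows])
```

Showing all cells predicted as the same digit side by side should make misclassifications easier to spot than reviewing puzzles one at a time.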
This could be done either in coordination with a future Labs Front End team to create an admin panel for digit validation, or as a simple Flask HTML page that loads the .csv of new predictions, sorted by predicted digit, and lets a reviewer update the predicted class for any misclassified digits. This would make it possible to train the model on an ongoing basis as users upload more Sudoku puzzles, with a training job automated in SageMaker to run either on a schedule or once a threshold number of new puzzles has been shared.
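The two pieces of logic behind this proposal, applying reviewer corrections and deciding when to start a training job, could look roughly like the sketch below. Everything here is an assumption for illustration: the row shape (`cell_id`, `predicted_digit`), the function names, and the default threshold and interval are hypothetical, and checking both triggers in one function is only one way to read "time basis or threshold". Moving validated rows to `sagemaker_train` and starting the SageMaker job are left out.

```python
from datetime import datetime, timedelta


def apply_corrections(rows, corrections):
    """Return validated rows, replacing predictions the reviewer marked wrong.

    rows: list of dicts with hypothetical keys `cell_id`, `predicted_digit`.
    corrections: {cell_id: correct_digit} collected from the review page.
    """
    validated = []
    for row in rows:
        # Keep the model's prediction unless the reviewer supplied a label.
        label = corrections.get(row["cell_id"], row["predicted_digit"])
        validated.append({"cell_id": row["cell_id"], "label": int(label)})
    return validated


def should_start_training(new_puzzles, last_run, now,
                          min_puzzles=100, max_age=timedelta(days=7)):
    """Decide whether to start a training job.

    Fires when enough new puzzles have arrived or enough time has passed.
    The defaults are placeholder values, not project settings.
    """
    if new_puzzles == 0:
        return False  # assumption: nothing new to learn from, so skip the run
    return new_puzzles >= min_puzzles or (now - last_run) >= max_age


if __name__ == "__main__":
    rows = [
        {"cell_id": "puzzle1_r0c0", "predicted_digit": "7"},
        {"cell_id": "puzzle1_r0c1", "predicted_digit": "1"},
    ]
    print(apply_corrections(rows, {"puzzle1_r0c1": 7}))
    print(should_start_training(12, datetime(2024, 1, 1), datetime(2024, 1, 9)))
```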