CNN practical notes
some notes on my training with CNN
Last updated
Was this helpful?
some notes on my training with CNN
Last updated
Was this helpful?
On 2019.12.6, I tried to use torchvision.models.resnet18
to train CIFAR10
Some findings in time sequential orders
without weight decay, model can only reach 70% something.
pretrained weights from ImageNet does not actually helps.
using torchvision.models.resnet18
can only achieve 85% something accuracy. This is because torchvision.models
is perfectly tuned for ImageNet, and when training on other datasets, the results usually won't went well.
See also
(I use this implementation and achieve comparable claimed test set accuracy 8.27%)
Takeaway
weight decay is important, yet torch.optim
disable it by default. set it to 1e-4 or 5e-4.
It's not preferrable to directly use torchvision.models
or other pretrained model architectures on datasets other than ImageNet. That's what so called 'Hyperparameter tuning is important'.