CNN practical notes
some notes on my training with CNN
Train CIFAR10
On 2019.12.6, I tried to use
torchvision.models.resnet18
to train CIFAR10Some findings in time sequential orders
without weight decay, model can only reach 70% something.
pretrained weights from ImageNet does not actually helps.
using
torchvision.models.resnet18
can only achieve 85% something accuracy. This is becausetorchvision.models
is perfectly tuned for ImageNet, and when training on other datasets, the results usually won't went well.See also
https://github.com/akamaster/pytorch_resnet_cifar10 (I use this implementation and achieve comparable claimed test set accuracy 8.27%)
Takeaway
weight decay is important, yet
torch.optim
disable it by default. set it to 1e-4 or 5e-4.It's not preferrable to directly use
torchvision.models
or other pretrained model architectures on datasets other than ImageNet. That's what so called 'Hyperparameter tuning is important'.
Pretrained models
Last updated