On 2019.12.6, I tried to use torchvision.models.resnet18 to train on CIFAR10.
Some findings, in chronological order:
Without weight decay, the model only reaches 70-something percent accuracy.
Pretrained weights from ImageNet do not actually help.
torchvision.models.resnet18 can only achieve 85-something percent accuracy. This is because torchvision.models is tuned for ImageNet, and when training on other datasets, the results usually don't go well.
https://github.com/akamaster/pytorch_resnet_cifar10 (I used this implementation and got results comparable to its claimed test set error of 8.27%.)
Weight decay is important, yet torch.optim disables it by default (weight_decay=0). Set it to 1e-4 or 5e-4.
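To make the effect of that setting concrete, here is a minimal pure-Python sketch of one plain SGD step with L2 weight decay. It assumes no momentum, in which case torch.optim.SGD adds weight_decay * w to the gradient before taking the step. The helper name sgd_step, the learning rate, and the toy numbers are made up for illustration and are not from my actual training run.

```python
def sgd_step(weights, grads, lr=0.1, weight_decay=0.0):
    """One plain SGD step (no momentum) with L2 weight decay.

    The decay term weight_decay * w is added to each gradient,
    then the usual update w <- w - lr * grad is applied.
    """
    return [w - lr * (g + weight_decay * w) for w, g in zip(weights, grads)]


# Toy values (hypothetical): zero gradients, so only the decay term
# can move the weights.
weights = [2.0, -4.0]
grads = [0.0, 0.0]

no_decay = sgd_step(weights, grads, weight_decay=0.0)     # weights unchanged
with_decay = sgd_step(weights, grads, weight_decay=5e-4)  # weights shrink toward 0

print(no_decay)
print(with_decay)
```

With the default of 0 the weights stay where they are, while a nonzero value pulls every weight slightly toward zero at each step. In PyTorch itself this is just the weight_decay keyword of the optimizer, e.g. torch.optim.SGD(model.parameters(), lr=0.1, weight_decay=5e-4), where model and the learning rate are placeholders.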
It's not preferable to directly use torchvision.models or other pretrained model architectures on datasets other than ImageNet. That's what the so-called 'hyperparameter tuning is important' means.