PyTorch 06: Saving and Loading Models in PyTorch

梦里梦外; 2023-03-13 15:29

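This part covers how to save and load models with PyTorch. We often need to load a previously trained model, either to make predictions with it or to continue training it on new data, so this is an important skill.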

    %matplotlib inline
    %config InlineBackend.figure_format = 'retina'

    import matplotlib.pyplot as plt
    import torch
    from torch import nn
    from torch import optim
    import torch.nn.functional as F
    from torchvision import datasets, transforms

    import helper
    import fc_model

    # Define a transform to normalize the data
    transform = transforms.Compose([transforms.ToTensor(),
                                    transforms.Normalize((0.5,), (0.5,))])

    # Download and load the training data
    trainset = datasets.FashionMNIST('~/.pytorch/F_MNIST_data/', download=True, train=True, transform=transform)
    trainloader = torch.utils.data.DataLoader(trainset, batch_size=64, shuffle=True)

    # Download and load the test data
    testset = datasets.FashionMNIST('~/.pytorch/F_MNIST_data/', download=True, train=False, transform=transform)
    testloader = torch.utils.data.DataLoader(testset, batch_size=64, shuffle=True)

Here is an example image.

    image, label = next(iter(trainloader))
    helper.imshow(image[0,:]);


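Training the network

I have moved the model architecture and training code from the previous part into the file fc_model. By importing this module, we can easily create a fully connected network with fc_model.Network and train it with fc_model.train. I will use the trained model to demonstrate saving and loading.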

    # Create the network, define the criterion and optimizer
    model = fc_model.Network(784, 10, [512, 256, 128])
    criterion = nn.NLLLoss()
    optimizer = optim.Adam(model.parameters(), lr=0.001)

    fc_model.train(model, trainloader, testloader, criterion, optimizer, epochs=2)

    Epoch: 1/2.. Training Loss: 1.703.. Test Loss: 0.997.. Test Accuracy: 0.659
    Epoch: 1/2.. Training Loss: 1.060.. Test Loss: 0.738.. Test Accuracy: 0.733
    ...
    Epoch: 2/2.. Training Loss: 0.528.. Test Loss: 0.445.. Test Accuracy: 0.840
    Epoch: 2/2.. Training Loss: 0.502.. Test Loss: 0.465.. Test Accuracy: 0.829
    Epoch: 2/2.. Training Loss: 0.540.. Test Loss: 0.439.. Test Accuracy: 0.837

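Saving and loading networks

Retraining a network every time you need it is impractical and inconvenient. Instead, we can save a trained network and load it later, either to continue training or to make predictions.

The parameters of a PyTorch network are stored in the model's state_dict. As the output shows, this state dictionary contains the weight and bias tensors for each layer.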

    print("Our model: \n\n", model, '\n')
    print("The state dict keys: \n\n", model.state_dict().keys())

    Our model:

    Network(
      (hidden_layers): ModuleList(
        (0): Linear(in_features=784, out_features=512, bias=True)
        (1): Linear(in_features=512, out_features=256, bias=True)
        (2): Linear(in_features=256, out_features=128, bias=True)
      )
      (output): Linear(in_features=128, out_features=10, bias=True)
      (dropout): Dropout(p=0.5, inplace=False)
    )

    The state dict keys:

    odict_keys(['hidden_layers.0.weight', 'hidden_layers.0.bias', 'hidden_layers.1.weight', 'hidden_layers.1.bias', 'hidden_layers.2.weight', 'hidden_layers.2.bias', 'output.weight', 'output.bias'])
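Note that the dropout layer appears in the printed model but has no key in the state dict, because it has no learnable parameters. A minimal sketch of the same effect on a much smaller network follows. The nn.Sequential model here is a hypothetical stand-in for fc_model.Network, chosen only so the example runs on its own.

```python
import torch
from torch import nn

# Hypothetical stand-in for fc_model.Network, small enough to inspect by eye.
model = nn.Sequential(
    nn.Linear(4, 3),
    nn.ReLU(),        # no parameters, so it contributes nothing to the state dict
    nn.Linear(3, 2),
)

# Map every state dict key to the shape of the tensor stored under it.
shapes = {name: tuple(tensor.shape) for name, tensor in model.state_dict().items()}

for name, shape in shapes.items():
    print(f"{name}: {shape}")
```

Each Linear layer contributes a weight of shape (out_features, in_features) and a bias of shape (out_features,), which matches the shapes reported for the larger network later in this post.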

The simplest approach is to save the state dictionary with torch.save. For example, we can save it to the file 'checkpoint.pth'.

    torch.save(model.state_dict(), 'checkpoint.pth')

Then load the state dictionary with torch.load.

    state_dict = torch.load('checkpoint.pth')
    print(state_dict.keys())

    odict_keys(['hidden_layers.0.weight', 'hidden_layers.0.bias', 'hidden_layers.1.weight', 'hidden_layers.1.bias', 'hidden_layers.2.weight', 'hidden_layers.2.bias', 'output.weight', 'output.bias'])

To load the state dictionary into the network, call model.load_state_dict(state_dict).

    model.load_state_dict(state_dict)

    <All keys matched successfully>
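To see what that message means, here is a small sketch of a save/load round trip between two networks with the same architecture. It assumes an in-memory io.BytesIO buffer instead of 'checkpoint.pth' so that nothing is written to disk, and the two nn.Linear layers are hypothetical stand-ins for a trained and a freshly created model.

```python
import io

import torch
from torch import nn

torch.manual_seed(0)
trained = nn.Linear(4, 2)  # stands in for the trained model
fresh = nn.Linear(4, 2)    # same architecture, different random weights

# Save the state dict to an in-memory buffer (assumption: used here in place of a file).
buffer = io.BytesIO()
torch.save(trained.state_dict(), buffer)
buffer.seek(0)
state_dict = torch.load(buffer)

# load_state_dict reports which keys were missing or unexpected.
result = fresh.load_state_dict(state_dict)

# After loading, every tensor in `fresh` equals its counterpart in `trained`.
same = all(
    torch.equal(a, b)
    for a, b in zip(trained.state_dict().values(), fresh.state_dict().values())
)
print(result.missing_keys, result.unexpected_keys, same)
```

The returned object carries two lists, missing_keys and unexpected_keys. When both are empty it prints as the "All keys matched successfully" message.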

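Looks simple? Not quite. The state dictionary loads successfully only when the model has exactly the same architecture as the one that produced the checkpoint. If I create a model with a different architecture, loading fails.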

    # Try this
    model = fc_model.Network(784, 10, [400, 200, 100])
    # This will throw an error because the tensor sizes are wrong!
    model.load_state_dict(state_dict)

    ---------------------------------------------------------------------------
    RuntimeError                              Traceback (most recent call last)
    <ipython-input-13-d859c59ebec0> in <module>
          2 model = fc_model.Network(784, 10, [400, 200, 100])
          3 # This will throw an error because the tensor sizes are wrong!
    ----> 4 model.load_state_dict(state_dict)

    ~/anaconda3/envs/tf/lib/python3.6/site-packages/torch/nn/modules/module.py in load_state_dict(self, state_dict, strict)
        845         if len(error_msgs) > 0:
        846             raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
    --> 847                                self.__class__.__name__, "\n\t".join(error_msgs)))
        848         return _IncompatibleKeys(missing_keys, unexpected_keys)
        849

    RuntimeError: Error(s) in loading state_dict for Network:
        size mismatch for hidden_layers.0.weight: copying a param with shape torch.Size([512, 784]) from checkpoint, the shape in current model is torch.Size([400, 784]).
        size mismatch for hidden_layers.0.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([400]).
        size mismatch for hidden_layers.1.weight: copying a param with shape torch.Size([256, 512]) from checkpoint, the shape in current model is torch.Size([200, 400]).
        size mismatch for hidden_layers.1.bias: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([200]).
        size mismatch for hidden_layers.2.weight: copying a param with shape torch.Size([128, 256]) from checkpoint, the shape in current model is torch.Size([100, 200]).
        size mismatch for hidden_layers.2.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([100]).
        size mismatch for output.weight: copying a param with shape torch.Size([10, 128]) from checkpoint, the shape in current model is torch.Size([10, 100]).
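The error lists every tensor whose shape differs. The same comparison can be done by hand before calling load_state_dict, which is sometimes handy when debugging a checkpoint. The sketch below is illustrative only: find_shape_mismatches is a hypothetical helper, not part of PyTorch, and the two nn.Sequential models are small stand-ins for networks built with different hidden sizes.

```python
import torch
from torch import nn


def find_shape_mismatches(model, state_dict):
    """Return (key, checkpoint_shape, model_shape) for each tensor whose shape differs.

    Hypothetical helper for illustration; keys absent from the model are ignored.
    """
    current = model.state_dict()
    return [
        (key, tuple(tensor.shape), tuple(current[key].shape))
        for key, tensor in state_dict.items()
        if key in current and tensor.shape != current[key].shape
    ]


# Two stand-in networks that differ only in the hidden layer width.
saved = nn.Sequential(nn.Linear(784, 512), nn.ReLU(), nn.Linear(512, 10))
other = nn.Sequential(nn.Linear(784, 400), nn.ReLU(), nn.Linear(400, 10))

mismatches = find_shape_mismatches(other, saved.state_dict())
for key, checkpoint_shape, model_shape in mismatches:
    print(f"{key}: checkpoint {checkpoint_shape} vs model {model_shape}")
```

As in the traceback, the final bias is not reported: both networks have 10 outputs, so that one tensor has the same shape in each.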

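This means we have to rebuild the model exactly as it was during training. The architecture information needs to be saved in the checkpoint alongside the state dictionary. To do that, build a dictionary that contains everything needed to fully reconstruct the model.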

    checkpoint = {'input_size': 784,
                  'output_size': 10,
                  'hidden_layers': [each.out_features for each in model.hidden_layers],
                  'state_dict': model.state_dict()}

    torch.save(checkpoint, 'checkpoint.pth')
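A checkpoint dictionary can hold more than the architecture. If the goal is to continue training rather than only to predict, one common extension is to store the optimizer's state as well. This goes beyond what the post itself does, so treat it as a sketch under assumptions: the 'epoch' and 'optimizer_state' keys are names made up for this example, a single nn.Linear layer stands in for the network, and an in-memory buffer replaces the checkpoint file.

```python
import io

import torch
from torch import nn, optim

model = nn.Linear(4, 2)  # hypothetical stand-in for the real network
optimizer = optim.Adam(model.parameters(), lr=0.001)

# Take one training step so that Adam has per-parameter state worth saving.
loss = model(torch.randn(8, 4)).pow(2).mean()
loss.backward()
optimizer.step()

# 'epoch' and 'optimizer_state' are illustrative key names, not a PyTorch convention.
checkpoint = {
    'epoch': 1,
    'state_dict': model.state_dict(),
    'optimizer_state': optimizer.state_dict(),
}
buffer = io.BytesIO()
torch.save(checkpoint, buffer)
buffer.seek(0)
restored = torch.load(buffer)

# Rebuild the model and optimizer, then restore both from the checkpoint.
new_model = nn.Linear(4, 2)
new_optimizer = optim.Adam(new_model.parameters(), lr=0.001)
new_model.load_state_dict(restored['state_dict'])
new_optimizer.load_state_dict(restored['optimizer_state'])
```

The optimizer must be created on the new model's parameters before its state is loaded, since the saved state refers to parameters by position rather than by name.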

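The checkpoint now contains all the information needed to rebuild the trained model. You can wrap the saving code in a function if you like. Likewise, we can write a function that loads a checkpoint.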

    def load_checkpoint(filepath):
        # Read the checkpoint dictionary back from disk
        checkpoint = torch.load(filepath)
        # Rebuild the network with the architecture stored in the checkpoint
        model = fc_model.Network(checkpoint['input_size'],
                                 checkpoint['output_size'],
                                 checkpoint['hidden_layers'])
        # Copy the trained weights and biases into the new network
        model.load_state_dict(checkpoint['state_dict'])

        return model

    model = load_checkpoint('checkpoint.pth')
    print(model)
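The whole round trip can be checked end to end. The sketch below is a self-contained approximation, not the code from fc_model: the Network class is a hypothetical reimplementation with the same constructor arguments and layer names as the printed model, save_checkpoint is the saving code wrapped in a function, and an in-memory buffer is assumed in place of 'checkpoint.pth'.

```python
import io

import torch
from torch import nn


class Network(nn.Module):
    """Hypothetical stand-in for fc_model.Network with the same layer names."""

    def __init__(self, input_size, output_size, hidden_layers, drop_p=0.5):
        super().__init__()
        sizes = [input_size] + list(hidden_layers)
        self.hidden_layers = nn.ModuleList(
            [nn.Linear(n_in, n_out) for n_in, n_out in zip(sizes[:-1], sizes[1:])]
        )
        self.output = nn.Linear(sizes[-1], output_size)
        self.dropout = nn.Dropout(p=drop_p)

    def forward(self, x):
        for layer in self.hidden_layers:
            x = self.dropout(torch.relu(layer(x)))
        return torch.log_softmax(self.output(x), dim=1)


def save_checkpoint(model, input_size, output_size, target):
    # Store the architecture next to the weights so the model can be rebuilt.
    checkpoint = {
        'input_size': input_size,
        'output_size': output_size,
        'hidden_layers': [layer.out_features for layer in model.hidden_layers],
        'state_dict': model.state_dict(),
    }
    torch.save(checkpoint, target)


def load_checkpoint(source):
    checkpoint = torch.load(source)
    model = Network(checkpoint['input_size'],
                    checkpoint['output_size'],
                    checkpoint['hidden_layers'])
    model.load_state_dict(checkpoint['state_dict'])
    return model


original = Network(784, 10, [512, 256, 128])
buffer = io.BytesIO()
save_checkpoint(original, 784, 10, buffer)
buffer.seek(0)
restored = load_checkpoint(buffer)

# Switch off dropout before comparing; in training mode the outputs are random.
original.eval()
restored.eval()
with torch.no_grad():
    batch = torch.randn(5, 784)
    outputs_match = torch.allclose(original(batch), restored(batch))
print(outputs_match)
```

Calling eval() matters here because the network contains a dropout layer. Without it, two forward passes through even the same model would generally differ.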
