
PyTorch: Defining a Neural Network


import torch
from torch import nn
from torch.nn import functional as F


class Net(nn.Module):
    def __init__(self):
        super().__init__()
        # 1 input image channel, 6 output channels, 5x5 square convolution kernel
        self.conv1 = nn.Conv2d(1, 6, 5)
        self.conv2 = nn.Conv2d(6, 16, 5)
        # affine operation: y = Wx + b
        self.fc1 = nn.Linear(16 * 5 * 5, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)

    def forward(self, x):
        # Max pooling over a (2, 2) window
        x = F.max_pool2d(F.relu(self.conv1(x)), (2, 2))
        # for a square window, only a single number needs to be given
        x = F.max_pool2d(F.relu(self.conv2(x)), 2)
        x = x.view(-1, self.num_flat_features(x))
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x

    def num_flat_features(self, x):
        size = x.size()[1:]  # all dimensions except the batch dimension
        num_features = 1
        for s in size:
            num_features *= s
        return num_features


net = Net()
print(net)
Net(
  (conv1): Conv2d(1, 6, kernel_size=(5, 5), stride=(1, 1))
  (conv2): Conv2d(6, 16, kernel_size=(5, 5), stride=(1, 1))
  (fc1): Linear(in_features=400, out_features=120, bias=True)
  (fc2): Linear(in_features=120, out_features=84, bias=True)
  (fc3): Linear(in_features=84, out_features=10, bias=True)
)
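
The in_features=400 of fc1 above is just 16 * 5 * 5. As a quick sanity check of my own, the sketch below traces a 32x32 input through the two conv/pool stages to show where the 5x5 spatial size comes from (in newer PyTorch, x.view(-1, self.num_flat_features(x)) in forward is often written as torch.flatten(x, 1) instead):

x = torch.randn(1, 1, 32, 32)
x = F.max_pool2d(F.relu(net.conv1(x)), 2)  # conv1: 32 -> 28, pool: 28 -> 14
print(x.size())                            # torch.Size([1, 6, 14, 14])
x = F.max_pool2d(F.relu(net.conv2(x)), 2)  # conv2: 14 -> 10, pool: 10 -> 5
print(x.size())                            # torch.Size([1, 16, 5, 5]); flattened: 16*5*5 = 400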
params = list(net.parameters())
print(len(params))
print(params[0].size())  # conv1's .weight
10
torch.Size([6, 1, 5, 5])
type(params[0])  # check the parameter's type
torch.nn.parameter.Parameter
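
The 10 parameter tensors are a weight and a bias for each of the two conv layers and three linear layers. A small sketch to list them by name and shape:

for name, p in net.named_parameters():
    print(name, tuple(p.size()))  # e.g. conv1.weight (6, 1, 5, 5), conv1.bias (6,), ...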
x_input = torch.randn(1, 1, 32, 32)
out = net(x_input)
print(out)
tensor([[-0.0391, -0.0291, -0.0254, -0.0962,  0.1101,  0.0504,  0.0392,  0.0566,
          0.0468,  0.0377]], grad_fn=<AddmmBackward>)
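
Note that torch.nn expects mini-batches: nn.Conv2d takes a 4D tensor of shape (nSamples, nChannels, Height, Width), which is why x_input has a leading batch dimension of 1. A single sample can get a fake batch dimension with unsqueeze; a minimal sketch (the variable names here are my own):

single = torch.randn(1, 32, 32)          # one 1-channel 32x32 image, no batch dimension
out_single = net(single.unsqueeze(0))    # unsqueeze(0) -> shape (1, 1, 32, 32)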

Zero the gradient buffers of all parameters and backprop with random gradients:

net.zero_grad()
out.backward(torch.randn(1, 10))
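
The explicit torch.randn(1, 10) argument is needed because out is a (1, 10) tensor rather than a scalar, so backward() must be told which gradient to propagate; a scalar can call backward() with no argument. A small sketch of the scalar case (re-running the forward pass first, since the previous graph has already been consumed):

net.zero_grad()
out = net(x_input)
out.sum().backward()   # scalar value, so no gradient argument is required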

Compute the loss:

output = net(x_input)
target = torch.randn(10)  # a dummy target, for example
target = target.view(1, -1)  # make it the same shape as output
criterion = nn.MSELoss()

loss = criterion(output, target)
print(loss)
tensor(0.9708, grad_fn=<MseLossBackward>)
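
MSELoss against a random target is only for demonstration. For a 10-class classifier one would more typically use nn.CrossEntropyLoss with an integer class label; a sketch (the names criterion_cls/class_target and the class index 3 are arbitrary choices of mine):

criterion_cls = nn.CrossEntropyLoss()
class_target = torch.tensor([3])            # one class index for the single sample
print(criterion_cls(output, class_target))  # cross-entropy loss over the 10 logits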

Backpropagation:

print(loss.grad_fn)  # MSELoss
print(loss.grad_fn.next_functions[0][0])  # Linear
print(loss.grad_fn.next_functions[0][0].next_functions[0][0])  # AccumulateGrad
<MseLossBackward object at 0x000002151D3BA908>
<AddmmBackward object at 0x000002151D3BA2B0>
<AccumulateGrad object at 0x000002151D3BAB00>
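
The chain can also be printed programmatically. Below is a small helper of my own (walk is not a PyTorch function) that recursively follows next_functions from loss.grad_fn and prints every backward node it reaches:

def walk(fn, depth=0):
    if fn is None:
        return
    print('  ' * depth + type(fn).__name__)
    for next_fn, _ in fn.next_functions:
        walk(next_fn, depth + 1)

walk(loss.grad_fn)   # MseLossBackward -> AddmmBackward -> AccumulateGrad / ReluBackward -> ...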
net.zero_grad()     # zeroes the gradient buffers of all parameters

print('conv1.bias.grad before backward')
print(net.conv1.bias.grad)

loss.backward()

print('conv1.bias.grad after backward')
print(net.conv1.bias.grad)
conv1.bias.grad before backward
tensor([0., 0., 0., 0., 0., 0.])
conv1.bias.grad after backward
tensor([ 0.0073,  0.0028, -0.0098,  0.0208, -0.0209, -0.0117])

Update the weights manually:

learning_rate = 0.01
for f in net.parameters():
    f.data.sub_(f.grad.data * learning_rate)
f.grad  # gradient of the last parameter in the loop (fc3.bias)
tensor([ 0.3875, -0.0547, -0.0328, -0.1468, -0.0062,  0.2169,  0.1387, -0.1924,
        -0.2262,  0.2408])
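
The same manual SGD step is usually written with torch.no_grad() so the update itself is not tracked by autograd; a sketch:

with torch.no_grad():
    for f in net.parameters():
        f -= learning_rate * f.grad   # in-place update, equivalent to f.data.sub_(...)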

Update the weights with torch.optim:

from torch import optim

# create your optimizer
optimizer = optim.SGD(net.parameters(), lr=0.01)

# in your training loop:
optimizer.zero_grad()   # zero the gradient buffers
output = net(x_input)
loss = criterion(output, target)
loss.backward()
optimizer.step()    # Does the update

Note: optimizer.zero_grad() has to be called manually at each iteration, because gradients are accumulated into the .grad buffers on every backward pass (as shown in the backprop section above); without it, gradients from previous iterations would be added into the current update.
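
Putting the pieces together, a minimal training-loop sketch (still using the random x_input/target pair from above, purely for illustration):

optimizer = optim.SGD(net.parameters(), lr=0.01)

for epoch in range(5):                 # a few dummy iterations
    optimizer.zero_grad()              # clear gradients accumulated by the previous step
    output = net(x_input)
    loss = criterion(output, target)
    loss.backward()
    optimizer.step()
    print(epoch, loss.item())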
