Talk is cheap, show me the code!
So, from here on, let's experiment and get started through code.
1. Import and version
import torch
print(torch.__version__)
1.0.1.post2
2. Tensor
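Like the ndarray in NumPy, PyTorch has a basic data structure that serves as the carrier for all operations; here, that data type is called Tensor
Everything used directly to build neural networks lives in torch.nn; apart from that part, torch is a computation library similar to NumPy
2.1 Creating a Tensor
2.1.1 A tensor can be created by listing its values, as follows
Print x together with its dtype and shape to inspect them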
x = torch.tensor([1, 2, 3])
print(x)
print(x.dtype)
print(x.shape)
tensor([1, 2, 3])
torch.int64
torch.Size([3])
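As a small illustration of how the dtype is chosen when listing values, the sketch below (variable names are illustrative only) shows inference from integer and float literals, an explicit dtype request, and a nested list producing a 2-D tensor.

```python
import torch

# Integer literals are inferred as int64, float literals as float32
a = torch.tensor([1, 2, 3])
b = torch.tensor([1.0, 2.0, 3.0])

# The dtype can also be requested explicitly
c = torch.tensor([1, 2, 3], dtype=torch.float64)

# A nested list gives a higher-dimensional tensor
m = torch.tensor([[1, 2], [3, 4]])

print(a.dtype, b.dtype, c.dtype, m.shape)
```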
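Besides listing values, we can fill a matrix of a given shape with particular values
2.1.2 Allocate a matrix without initializing it: torch.empty only reserves memory, so the values printed below are whatever happened to be there and carry no meaning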
x = torch.empty((5, 3))
print(x)
tensor([[ 0.0000e+00, 2.5244e-29, -3.7822e-12],
[ 8.5920e+09, 5.6052e-45, 9.8091e-45],
[ 0.0000e+00, 8.4078e-45, 0.0000e+00],
[ 4.2039e-45, 0.0000e+00, 5.6052e-45],
[ 0.0000e+00, 2.8026e-45, 0.0000e+00]])
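Since the contents of an uninitialized tensor are arbitrary, a typical pattern is to allocate first and overwrite before reading. A minimal sketch, where the fill value 0.5 is chosen only for illustration:

```python
import torch

x = torch.empty((5, 3))
# Only the shape and dtype are defined at this point; the values are not
print(x.shape, x.dtype)

# Overwrite every element in place before using the tensor
x.fill_(0.5)
print(x)
```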
2.1.3 Create a tensor filled with 0 or 1; only the shape needs to be given
zeros = torch.zeros(3, 3)
print(zeros)
print(zeros.dtype)
ones = torch.ones((3, 3))
print(ones)
print(ones.dtype)
tensor([[0., 0., 0.],
[0., 0., 0.],
[0., 0., 0.]])
torch.float32
tensor([[1., 1., 1.],
[1., 1., 1.],
[1., 1., 1.]])
torch.float32
You can also pass an existing tensor and use its shape as the target size
This gives the same result as the operation above
zeros_ = torch.zeros_like(zeros)
print((zeros_ == zeros).all())
tensor(1, dtype=torch.uint8)
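The same idea applies to the other "_like" constructors. The sketch below assumes a float64 base tensor (an arbitrary choice) to show that both the shape and the dtype are taken from the tensor passed in.

```python
import torch

base = torch.zeros(2, 3, dtype=torch.float64)

# Shape and dtype are copied from base; only the fill value differs
ones_ = torch.ones_like(base)
sevens = torch.full_like(base, 7.0)

print(ones_.shape, ones_.dtype)
print(sevens)
```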
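2.1.4 When the fill value is neither 0 nor 1, torch.full provides a more general way (the float32 result below comes from the version used here, 1.0.1; newer releases infer the dtype from the fill value, so an integer fill gives an integer tensor)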
twos = torch.full((3, 3), 2)
print(twos)
print(twos.dtype)
tensor([[2., 2., 2.],
[2., 2., 2.],
[2., 2., 2.]])
torch.float32
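To avoid depending on how a particular release infers the dtype from the fill value, the dtype can be stated explicitly. A minimal sketch:

```python
import torch

# Explicit dtype: the result is float32 regardless of the fill literal
twos = torch.full((3, 3), 2, dtype=torch.float32)

# A float literal as the fill value also yields the default float dtype
twos_from_float = torch.full((3, 3), 2.0)

print(twos.dtype, twos_from_float.dtype)
```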
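2.1.5 Random tensors
# These are pseudo-random numbers; setting the random seed makes the results reproducible
# After torch.manual_seed(2), the sequence of values produced by torch.rand(5, 3) is the same on every run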
print(torch.initial_seed())
x = torch.rand(5, 3)
print(x)
torch.manual_seed(2)
x = torch.rand(5, 3)
print(x)
17737129808135602629
tensor([[0.9797, 0.0224, 0.9102],
[0.4042, 0.2427, 0.1232],
[0.5472, 0.6808, 0.2440],
[0.1356, 0.5017, 0.3099],
[0.8007, 0.5330, 0.0159]])
tensor([[0.6147, 0.3810, 0.6371],
[0.4745, 0.7136, 0.6190],
[0.4425, 0.0958, 0.6142],
[0.0573, 0.5657, 0.5332],
[0.3901, 0.9088, 0.5334]])
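The effect of the seed can be checked directly. In the sketch below, two consecutive calls continue the same pseudo-random sequence and therefore differ, while resetting the seed replays the sequence from the start.

```python
import torch

torch.manual_seed(2)
a = torch.rand(5, 3)
b = torch.rand(5, 3)   # continues the sequence, so it differs from a

torch.manual_seed(2)   # reset the generator to the same state
c = torch.rand(5, 3)   # replays the first draw, so it matches a

print(torch.equal(a, c))
print(torch.equal(a, b))
```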
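Looking at the dtypes of the tensors printed above, factory functions such as torch.zeros, torch.ones and torch.rand default to floating point (torch.float32), while torch.tensor infers the dtype from its data. Defaulting to floats is a common choice because it supports a wider range of computations
2.2 Operating on Tensors
2.2.1 Arithmetic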
x = torch.ones(5, 3)
y = torch.randn(5, 3)
print(x+y)
tensor([[ 0.2241, -0.2011, 1.9231],
[-0.3245, 1.1724, 0.7176],
[ 1.0219, 0.6591, -0.1657],
[ 1.8022, 1.5602, 1.9671],
[ 1.2931, 0.2195, -1.4234]])
result = torch.empty(5, 3)
torch.add(x, y, out = result)
print(result)
tensor([[ 0.2241, -0.2011, 1.9231],
[-0.3245, 1.1724, 0.7176],
[ 1.0219, 0.6591, -0.1657],
[ 1.8022, 1.5602, 1.9671],
[ 1.2931, 0.2195, -1.4234]])
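Other arithmetic follows the same pattern as addition: an operator form and an equivalent function form. A short sketch with fixed values (chosen here so the results are easy to check by hand):

```python
import torch

x = torch.ones(5, 3)
y = torch.full((5, 3), 2.0)

diff = x - y                 # element-wise, same as torch.sub(x, y)
prod = x * y                 # element-wise, same as torch.mul(x, y)
mat = torch.mm(x, y.t())     # matrix product: (5, 3) times (3, 5) gives (5, 5)

print(diff[0], prod[0], mat.shape)
```

Note that * is element-wise; a matrix product needs torch.mm (or torch.matmul).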
2.2.2 In-place operations: append the suffix _ to the method name
print(x)
x.add_(1)
print(x)
tensor([[1., 1., 1.],
[1., 1., 1.],
[1., 1., 1.],
[1., 1., 1.],
[1., 1., 1.]])
tensor([[2., 2., 2.],
[2., 2., 2.],
[2., 2., 2.],
[2., 2., 2.],
[2., 2., 2.]])
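The difference between the two forms shows up when another tensor shares the same storage. A minimal sketch, where row is a view of x introduced only for illustration:

```python
import torch

x = torch.ones(2, 2)
row = x[0]                      # a view that shares storage with x

y = x.add(1)                    # out-of-place: returns a new tensor
unchanged = bool((x == 1).all())  # x still holds ones at this point

x.add_(1)                       # in-place: modifies the storage of x
print(row)                      # the view sees the update
```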
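2.2.3 Extracting values
x.item() returns the value of a tensor as a Python scalar
It works only on tensors with exactly one element: only one element tensors can be converted to Python scalars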
x = torch.tensor([1])
print(x.item())
1
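For tensors with more than one element, tolist() is the usual way to get plain Python values. The sketch below also shows item() failing on such a tensor; the exception type caught is an assumption that covers both ValueError and RuntimeError, since it differs between releases.

```python
import torch

single = torch.tensor([3.5])
value = single.item()        # a Python float

multi = torch.tensor([1, 2, 3])
values = multi.tolist()      # a Python list of ints

# item() refuses tensors with more than one element
try:
    multi.item()
    failed = False
except (ValueError, RuntimeError):
    failed = True

print(value, values, failed)
```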
2.2.4 Converting to and from NumPy
import numpy as np
x = torch.tensor([1])
print(type(x))
x_numpy = x.numpy()
print(type(x_numpy))
numpy_y = np.array(1)
print(type(numpy_y))
print(type(torch.from_numpy(numpy_y)))
# Common image-processing libraries store images as height*width*channel, whereas torch uses channel*height*width, so a conversion is needed
# torch.transpose swaps two dimensions at a time
# so the conversion above takes two calls
img = np.random.rand(14, 14, 3)
img_torch = torch.transpose(torch.transpose(torch.tensor(img), 0, 2), 1, 2)
print(img_torch.shape)
<class 'torch.Tensor'>
<class 'numpy.ndarray'>
<class 'numpy.ndarray'>
<class 'torch.Tensor'>
torch.Size([3, 14, 14])
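Two details are worth checking when moving between the libraries. The sketch below shows that torch.from_numpy shares memory with the source array while torch.tensor copies it, and that permute, which reorders all dimensions in one call, can be used as an alternative to the two transposes above.

```python
import numpy as np
import torch

arr = np.zeros(3)
shared = torch.from_numpy(arr)   # shares memory with arr
copied = torch.tensor(arr)       # makes an independent copy

arr[0] = 1.0
print(shared[0].item(), copied[0].item())  # only the shared tensor changes

# height*width*channel -> channel*height*width in a single call
img = np.random.rand(14, 14, 3)
chw = torch.from_numpy(img).permute(2, 0, 1)
print(chw.shape)
```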
2.3 Broadcast
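1. Deciding whether two tensors can be broadcast
Condition 1. Each tensor has at least one dimension.
Condition 2. When iterating over the dimension sizes, starting at the trailing dimension, the dimension sizes must either be equal, one of them is 1, or one of them does not exist.
Write out the two shapes, align them to the right, and compare the dimensions column by column
1. If the shapes are identical, the tensors can be broadcast
2. If the shapes differ, look at each position where the sizes differ: the tensors can be broadcast if, at every such position, one of the two sizes is 1 or does not exist
2. Determining the shape of the broadcast result
First pad the missing positions with 1 so that both shapes have the same length
Then take the larger size along each dimension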
x=torch.empty(5,3,4,1)
y=torch.empty( 3,1,1)
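# The two are broadcastable
# Reading from the right, the first sizes are both 1
# The second sizes differ, but one of them is 1
# The third sizes are equal
# The fourth size does not exist for y
# The result shape should be (5, 3, 4, 1)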
print((x+y).shape)
torch.Size([5, 3, 4, 1])
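The two rules above can be written out as a short function. This is a sketch of the rule as described here, not PyTorch's own implementation; broadcast_shape is a hypothetical helper name that returns None when the shapes are incompatible.

```python
import torch

def broadcast_shape(shape_a, shape_b):
    """Hypothetical helper: result shape of broadcasting, or None if impossible."""
    a, b = list(shape_a), list(shape_b)
    n = max(len(a), len(b))
    # Right-align the shapes by padding the missing leading positions with 1
    a = [1] * (n - len(a)) + a
    b = [1] * (n - len(b)) + b
    result = []
    for da, db in zip(a, b):
        if da != db and da != 1 and db != 1:
            return None  # sizes differ and neither is 1
        # Take the size that is not 1 (or the common size)
        result.append(db if da == 1 else da)
    return tuple(result)

x = torch.empty(5, 3, 4, 1)
y = torch.empty(3, 1, 1)
print(broadcast_shape(x.shape, y.shape))      # matches (x + y).shape
print(broadcast_shape((5, 2, 4, 1), (3, 1, 1)))  # 2 vs 3: not broadcastable
```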
3. Rewriting the KNN algorithm with PyTorch
import sklearn
from sklearn import datasets
iris = datasets.load_iris()
iris_data = torch.tensor(iris.data, dtype = torch.float)
iris_target = torch.tensor(iris.target, dtype = torch.float)
class PyTorch_KNN_Classifier():
    def __init__(self, K, data, target):
        self.K = K
        self.data = data
        self.target = target

    def fit(self):
        # KNN is a lazy learner: there is nothing to train
        return

    def pred(self, x):
        # One vote counter per distinct class label
        self.classes = dict()
        for i in torch.unique(self.target):
            self.classes[int(i.item())] = 0
        distance = self.distance(x)
        # Indices of the K smallest distances
        _, the_top_K_index = torch.topk(distance, self.K, largest=False)
        refs = self.target[the_top_K_index]
        for i in refs:
            self.classes[int(i.item())] += 1
        # Majority vote among the K nearest neighbours
        return max(self.classes, key=self.classes.get)

    def distance(self, sample, method='Euclidean'):
        """
        Distance from sample to every row of self.data
        use method "Euclidean" for the L2 norm, this is the default
        use method "Manhattan" for the L1 norm
        use method "Chebyshev" for the L-infinity norm
        """
        diff = torch.abs(self.data - sample)
        if method == 'Chebyshev':
            # torch.max along a dimension returns (values, indices); keep the values
            return torch.max(diff, 1)[0]
        if method == 'Euclidean':
            p = 2
        elif method == 'Manhattan':
            p = 1
        else:
            raise ValueError('unknown distance method: ' + method)
        # L-p norm: (sum |d|^p)^(1/p)
        distances = torch.pow(torch.sum(torch.pow(diff, p), 1), 1.0 / p)
        return distances
class PyTorch_KNN_Regressor():
    def __init__(self, K, data, target):
        self.K = K
        self.data = data
        self.target = target

    def fit(self):
        # KNN is a lazy learner: there is nothing to train
        return

    def pred(self, x):
        distance = self.distance(x)
        # Indices of the K smallest distances
        _, the_top_K_index = torch.topk(distance, self.K, largest=False)
        refs = self.target[the_top_K_index]
        # Prediction is the mean target of the K nearest neighbours
        return torch.mean(refs)

    def distance(self, sample, method='Euclidean'):
        """
        Distance from sample to every row of self.data
        use method "Euclidean" for the L2 norm, this is the default
        use method "Manhattan" for the L1 norm
        use method "Chebyshev" for the L-infinity norm
        """
        diff = torch.abs(self.data - sample)
        if method == 'Chebyshev':
            # torch.max along a dimension returns (values, indices); keep the values
            return torch.max(diff, 1)[0]
        if method == 'Euclidean':
            p = 2
        elif method == 'Manhattan':
            p = 1
        else:
            raise ValueError('unknown distance method: ' + method)
        # L-p norm: (sum |d|^p)^(1/p)
        distances = torch.pow(torch.sum(torch.pow(diff, p), 1), 1.0 / p)
        return distances
iris_classifier = PyTorch_KNN_Classifier(2, iris_data, iris_target)
iris_classifier.fit()
correct = 0
for i in range(len(iris_data)):
    if iris_classifier.pred(iris_data[i]) == iris_target[i]:
        correct += 1
print(correct)
147
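The loop above classifies one sample at a time. As a sketch of how the broadcasting rules from section 2.3 could be used to classify a whole batch at once, the function below computes all pairwise Euclidean distances in one step. The name knn_predict and the toy data are assumptions made for illustration, not part of the classes above.

```python
import torch

def knn_predict(train_x, train_y, queries, k):
    """Hypothetical batched KNN classifier using Euclidean distance."""
    # (Q, 1, D) - (1, N, D) broadcasts to (Q, N, D)
    diff = queries[:, None, :] - train_x[None, :, :]
    dist = diff.pow(2).sum(dim=2).sqrt()                 # (Q, N)
    # Indices of the k nearest training points for each query
    _, idx = torch.topk(dist, k, dim=1, largest=False)   # (Q, k)
    neighbor_labels = train_y[idx]                       # (Q, k)
    # Most frequent label among the neighbours of each query
    votes, _ = torch.mode(neighbor_labels, dim=1)
    return votes

# Toy data: two well-separated clusters
train_x = torch.tensor([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
                        [5.0, 5.0], [5.1, 5.0], [5.0, 5.1]])
train_y = torch.tensor([0, 0, 0, 1, 1, 1])
queries = torch.tensor([[0.05, 0.05], [4.9, 5.2]])

pred = knn_predict(train_x, train_y, queries, k=3)
print(pred)
```

The intermediate (Q, N, D) tensor trades memory for speed, so this form suits small datasets such as iris better than very large ones.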