How do I extract a feature vector from a single image in PyTorch?

3 Answers

萧十郎

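All the default nn.Modules in PyTorch expect an additional batch dimension. If the input to a module has shape (B, ...), then the output will be (B, ...) as well (though the later dimensions may change depending on the layer). This behavior allows efficient inference on a batch of B inputs at once. To make your code conform, unsqueeze an extra singleton dimension onto the front of the t_img tensor before sending it to your model, so that it becomes a (1, ...) tensor. You will also need to flatten the output of layer before storing it if you want to copy it into your one-dimensional my_embedding tensor.

A couple of other things:

You should run inference inside a torch.no_grad() context to avoid computing gradients, since you won't need them. Note that model.eval() only changes the behavior of certain layers such as dropout and batch normalization; it does not disable construction of the computation graph, but torch.no_grad() does.

I assume this is just a copy-paste issue, but transforms is the name of an imported module as well as a global variable.

o.data simply returns a tensor with the same data as o, detached from autograd. In the old Variable interface (circa PyTorch 0.3.1 and earlier) this used to be necessary, but the Variable interface was deprecated back in PyTorch 0.4.0 and no longer does anything useful; now its use only creates confusion. Unfortunately, many tutorials are still written using this old and unnecessary interface.

The updated code is as follows:

    import torch
    import torchvision
    import torchvision.models as models
    from PIL import Image

    img = Image.open("Documents/01235.png")

    # Load the pretrained model
    # (newer torchvision releases prefer the weights= argument over pretrained=True)
    model = models.resnet18(pretrained=True)

    # Use the model object to select the desired layer
    layer = model._modules.get('avgpool')

    # Set model to evaluation mode
    model.eval()

    transforms = torchvision.transforms.Compose([
        torchvision.transforms.Resize(256),
        torchvision.transforms.CenterCrop(224),
        torchvision.transforms.ToTensor(),
        torchvision.transforms.Normalize(mean=[0.485, 0.456, 0.406],
                                         std=[0.229, 0.224, 0.225]),
    ])


    def get_vector(image):
        # Create a PyTorch tensor with the transformed image
        t_img = transforms(image)
        # Create a vector of zeros that will hold our feature vector
        # The 'avgpool' layer has an output size of 512
        my_embedding = torch.zeros(512)

        # Define a function that will copy the output of a layer
        def copy_data(m, i, o):
            my_embedding.copy_(o.flatten())                 # <-- flatten

        # Attach that function to our selected layer
        h = layer.register_forward_hook(copy_data)
        # Run the model on our transformed image
        with torch.no_grad():                               # <-- no_grad context
            model(t_img.unsqueeze(0))                       # <-- unsqueeze
        # Detach our copy function from the layer
        h.remove()
        # Return the feature vector
        return my_embedding


    pic_vector = get_vector(img)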
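To make the batch-dimension and no_grad points concrete, here is a minimal sketch that uses a tiny stand-in network instead of ResNet18, so no weights need to be downloaded. The network `net` and the 32x32 input size are assumptions chosen for illustration; they are not part of the original question.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-in for a real backbone: one conv layer plus global pooling.
net = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),
    nn.AdaptiveAvgPool2d(1),
)
net.eval()  # changes dropout/batch-norm behavior only; autograd stays active

t_img = torch.rand(3, 32, 32)   # a single image without a batch dimension
batched = t_img.unsqueeze(0)    # add the leading batch dimension

# eval() alone still builds a computation graph for the output.
out_tracked = net(batched)

# Inside no_grad() no graph is recorded.
with torch.no_grad():
    out = net(batched)

print(batched.shape)             # shape with the batch dimension
print(out.shape)                 # pooled output keeps trailing 1x1 dims
print(out.flatten().shape)       # flattened into a 1-D feature vector
print(out_tracked.requires_grad, out.requires_grad)
```

The flattened output is what fits into a one-dimensional buffer such as my_embedding in the answer above.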

qq_笑_17

You can use create_feature_extractor from torchvision.models.feature_extraction to extract the features of the required layer from the model. The node name of the last hidden layer in ResNet18 is flatten, which is basically the avgpool output flattened to 1D. You can extract any layers you want by adding them to the return_nodes dictionary below.

    from torchvision.io import read_image
    from torchvision.models import resnet18, ResNet18_Weights
    from torchvision.models.feature_extraction import create_feature_extractor

    # Step 1: Initialize the model with the best available weights
    weights = ResNet18_Weights.DEFAULT
    model = resnet18(weights=weights)
    model.eval()

    # Step 2: Initialize the inference transforms
    preprocess = weights.transforms()

    # Step 3: Create the feature extractor with the required nodes
    return_nodes = {'flatten': 'flatten'}
    feature_extractor = create_feature_extractor(model, return_nodes=return_nodes)

    # Step 4: Load the image(s) and apply inference preprocessing transforms
    image = "?"
    image = read_image(image).unsqueeze(0)
    model_input = preprocess(image)

    # Step 5: Extract the features
    features = feature_extractor(model_input)
    flatten_fts = features["flatten"].squeeze()

    print(flatten_fts.shape)

潇潇雨雨

Instead of model(t_img), do this here: model(t_img[None]). This adds an extra leading dimension, so the image will have shape [1, 3, 224, 224] and can be passed to the model.
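As a quick check, indexing with None has the same effect as unsqueeze(0). The random tensor below is only a stand-in for a preprocessed image.

```python
import torch

t_img = torch.rand(3, 224, 224)  # stand-in for a preprocessed image tensor

a = t_img[None]           # indexing with None inserts a leading dimension
b = t_img.unsqueeze(0)    # the explicit equivalent

print(a.shape, b.shape)
print(torch.equal(a, b))

# Both results are views of the original tensor, so no data is copied.
print(a.data_ptr() == t_img.data_ptr())
```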