How do I generate accurate masks for images from Mask R-CNN predictions in PyTorch?

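I have trained a Mask R-CNN network for instance segmentation of apples. I am able to load the weights and generate predictions for my test images. The generated masks seem to be in the correct location, but the masks themselves have no real shape; they just look like a bunch of pixels.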

Training was done on the dataset from this paper, and here is the GitHub link to the code used for training and generating the weights.

The prediction code is as follows. (I have omitted the part where I create the path variables and assign the paths.)

import os
import glob
import numpy as np
import pandas as pd
import cv2 as cv
import fileinput

import torch
import torch.utils.data
import torchvision

from data.apple_dataset import AppleDataset
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor
from torchvision.models.detection.mask_rcnn import MaskRCNNPredictor

import utility.utils as utils
import utility.transforms as T

from PIL import Image
from matplotlib import pyplot as plt
%matplotlib inline


def get_transform(train):
    transforms = []
    transforms.append(T.ToTensor())
    if train:
        transforms.append(T.RandomHorizontalFlip(0.5))
    return T.Compose(transforms)


def get_maskrcnn_model_instance(num_classes):
    # build a Mask R-CNN model; pretrained=False because the trained
    # weights are loaded from the checkpoint further down
    model = torchvision.models.detection.maskrcnn_resnet50_fpn(pretrained=False)

    # get number of input features for the classifier
    in_features = model.roi_heads.box_predictor.cls_score.in_features
    # replace the box predictor head with a new one
    model.roi_heads.box_predictor = FastRCNNPredictor(in_features, num_classes)

    # now get the number of input features for the mask classifier
    in_features_mask = model.roi_heads.mask_predictor.conv5_mask.in_channels
    hidden_layer = 256
    # and replace the mask predictor with a new one
    model.roi_heads.mask_predictor = MaskRCNNPredictor(in_features_mask, hidden_layer, num_classes)
    return model


num_classes = 2
device = torch.device('cpu')

model = get_maskrcnn_model_instance(num_classes)
checkpoint = torch.load('model_49.pth', map_location=device)
model.load_state_dict(checkpoint['model'], strict=False)

dataset_test = AppleDataset(test_image_files_path, get_transform(train=False))
img, _ = dataset_test[1]
model.eval()

with torch.no_grad():
    prediction = model([img.to(device)])

慕斯709654
1 Answer

温温酱

Predictions from Mask R-CNN have the following structure:

During inference, the model requires only the input tensors, and returns the post-processed predictions as a List[Dict[Tensor]], one for each input image. The fields of the Dict are as follows:

boxes (FloatTensor[N, 4]): the predicted boxes in [x1, y1, x2, y2] format, with values between 0 and H and 0 and W
labels (Int64Tensor[N]): the predicted labels for each image
scores (Tensor[N]): the scores of each prediction
masks (UInt8Tensor[N, 1, H, W]): the predicted masks for each instance, in 0-1 range.

You can use OpenCV's findContours and drawContours functions to draw the masks as follows:

import cv2

# imread takes an IMREAD_* flag, not a color conversion code; the default loads BGR
img_cv = cv2.imread('input.jpg')
for i in range(len(prediction[0]['masks'])):
    # iterate over masks
    mask = prediction[0]['masks'][i, 0]
    mask = mask.mul(255).byte().cpu().numpy()
    contours, _ = cv2.findContours(
        mask.copy(), cv2.RETR_CCOMP, cv2.CHAIN_APPROX_NONE)
    cv2.drawContours(img_cv, contours, -1, (255, 0, 0), 2, cv2.LINE_AA)
cv2.imshow('img output', img_cv)
cv2.waitKey(0)  # keep the window open until a key is pressed
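The masks returned by the model are soft masks with values in the 0-1 range, and low-scoring detections are returned as well, so drawing every non-zero pixel can look like a shapeless blob. A common remedy is to threshold the masks (0.5 is the usual choice) and to drop detections with a low score. The following is a minimal sketch of that idea using NumPy only; the helper names binarize_masks and overlay_masks and the default thresholds are assumptions made for illustration, not part of torchvision or of the original answer.

```python
import numpy as np


def binarize_masks(soft_masks, scores, score_thresh=0.5, mask_thresh=0.5):
    """Turn soft masks into binary uint8 masks (hypothetical helper).

    soft_masks: float array of shape [N, 1, H, W] with values in [0, 1]
    scores:     float array of shape [N]
    Returns a uint8 array of shape [K, H, W] with values 0 or 255, where
    K is the number of detections whose score is >= score_thresh.
    """
    soft_masks = np.asarray(soft_masks, dtype=np.float32)
    scores = np.asarray(scores, dtype=np.float32)
    keep = scores >= score_thresh              # drop low-confidence detections
    kept = soft_masks[keep][:, 0]              # [K, H, W], channel axis removed
    binary = kept >= mask_thresh               # soft mask -> boolean mask
    return binary.astype(np.uint8) * 255


def overlay_masks(image, binary_masks, color=(255, 0, 0), alpha=0.5):
    """Blend each binary mask onto a copy of an H x W x 3 uint8 image."""
    out = image.astype(np.float32).copy()
    color = np.array(color, dtype=np.float32)
    for mask in binary_masks:
        region = mask > 0
        out[region] = (1.0 - alpha) * out[region] + alpha * color
    return out.astype(np.uint8)
```

With a real prediction, the inputs could be obtained with prediction[0]['masks'].cpu().numpy() and prediction[0]['scores'].cpu().numpy(). Each binary mask returned by binarize_masks can also be passed to cv2.findContours in place of the mask.mul(255).byte() result if outlines are preferred over a filled overlay.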