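A Field Log of Model-Training Problems

Pitfall collection

This post records the pitfalls I have run into while training models, along with how I got out of them.

When I was training on AutoDL I forgot to write most of them down... I only remember having to fix five or six bugs before almost every run.

So the log starts now, with PaddleX.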

TypeError: Argument 'bb' has incorrect type (expected numpy.ndarray, got list)

Problem description

While training on a COCO dataset that I had annotated myself, I hit the following error:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
/tmp/ipykernel_128/738473502.py in <module>
     15     warmup_start_lr=0.0,
     16     save_dir='output/mask_rcnn_r50_fpn',
---> 17     use_vdl=True)
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddlex/cv/models/detector.py in train(self, num_epochs, train_dataset, train_batch_size, eval_dataset, optimizer, save_interval_epochs, log_interval_steps, save_dir, pretrain_weights, learning_rate, warmup_steps, warmup_start_lr, lr_decay_epochs, lr_decay_gamma, metric, use_ema, early_stop, early_stop_patience, use_vdl, resume_checkpoint)
    289             early_stop=early_stop,
    290             early_stop_patience=early_stop_patience,
--> 291             use_vdl=use_vdl)
    292 
    293     def quant_aware_train(self,
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddlex/cv/models/base.py in train_loop(self, num_epochs, train_dataset, train_batch_size, eval_dataset, save_interval_epochs, log_interval_steps, save_dir, ema, early_stop, early_stop_patience, use_vdl)
    331                     outputs = self.run(ddp_net, data, mode='train')
    332                 else:
--> 333                     outputs = self.run(self.net, data, mode='train')
    334                 loss = outputs['loss']
    335                 loss.backward()
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddlex/cv/models/detector.py in run(self, net, inputs, mode)
    102 
    103     def run(self, net, inputs, mode):
--> 104         net_out = net(inputs)
    105         if mode in ['train', 'eval']:
    106             outputs = net_out
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/dygraph/layers.py in __call__(self, *inputs, **kwargs)
    900                 self._built = True
    901 
--> 902             outputs = self.forward(*inputs, **kwargs)
    903 
    904             for forward_post_hook in self._forward_post_hooks.values():
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddlex/ppdet/modeling/architectures/meta_arch.py in forward(self, inputs)
     24 
     25         if self.training:
---> 26             out = self.get_loss()
     27         else:
     28             out = self.get_pred()
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddlex/ppdet/modeling/architectures/mask_rcnn.py in get_loss(self)
    121 
    122     def get_loss(self, ):
--> 123         bbox_loss, mask_loss, rpn_loss = self._forward()
    124         loss = {}
    125         loss.update(rpn_loss)
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddlex/ppdet/modeling/architectures/mask_rcnn.py in _forward(self)
     98             # Mask Head needs bbox_feat in Mask RCNN
     99             mask_loss = self.mask_head(body_feats, rois, rois_num, self.inputs,
--> 100                                        bbox_targets, bbox_feat)
    101             return rpn_loss, bbox_loss, mask_loss
    102         else:
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/dygraph/layers.py in __call__(self, *inputs, **kwargs)
    900                 self._built = True
    901 
--> 902             outputs = self.forward(*inputs, **kwargs)
    903 
    904             for forward_post_hook in self._forward_post_hooks.values():
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddlex/ppdet/modeling/heads/mask_head.py in forward(self, body_feats, rois, rois_num, inputs, targets, bbox_feat, feat_func)
    244         if self.training:
    245             return self.forward_train(body_feats, rois, rois_num, inputs,
--> 246                                       targets, bbox_feat)
    247         else:
    248             im_scale = inputs['scale_factor']
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddlex/ppdet/modeling/heads/mask_head.py in forward_train(self, body_feats, rois, rois_num, inputs, targets, bbox_feat)
    182         tgt_labels, _, tgt_gt_inds = targets
    183         rois, rois_num, tgt_classes, tgt_masks, mask_index, tgt_weights = self.mask_assigner(
--> 184             rois, tgt_labels, tgt_gt_inds, inputs)
    185 
    186         if self.share_bbox_feat:
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddlex/ppdet/modeling/proposal_generator/target_layer.py in __call__(self, rois, tgt_labels, tgt_gt_inds, inputs)
    256 
    257         outs = generate_mask_target(gt_segms, rois, tgt_labels, tgt_gt_inds,
--> 258                                     self.num_classes, self.mask_resolution)
    259 
    260         # mask_rois, mask_rois_num, tgt_classes, tgt_masks, mask_index, tgt_weights
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddlex/ppdet/modeling/proposal_generator/target.py in generate_mask_target(gt_segms, rois, labels_int32, sampled_gt_inds, num_classes, resolution)
    351                 results.append(
    352                     rasterize_polygons_within_box(new_segm[j], boxes[j],
--> 353                                                   resolution))
    354         else:
    355             results.append(
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddlex/ppdet/modeling/proposal_generator/target.py in rasterize_polygons_within_box(poly, box, resolution)
    306 
    307     # 3. Rasterize the polygons with coco api
--> 308     mask = polygons_to_mask(polygons, resolution, resolution)
    309     mask = paddle.to_tensor(mask, dtype='int32')
    310     return mask
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddlex/ppdet/modeling/proposal_generator/target.py in polygons_to_mask(polygons, height, width)
    282     import pycocotools.mask as mask_util
    283     assert len(polygons) > 0, "COCOAPI does not support empty polygons"
--> 284     rles = mask_util.frPyObjects(polygons, height, width)
    285     rle = mask_util.merge(rles)
    286     return mask_util.decode(rle).astype(np.bool)
pycocotools/_mask.pyx in pycocotools._mask.frPyObjects()
TypeError: Argument 'bb' has incorrect type (expected numpy.ndarray, got list)

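Solution

The cause is that the segmentation data in the JSON file does not meet the requirements. Normally each segmentation is a flat, ordered sequence of point coordinates like [x, y, x, y, ..., x, y]. The number of values must be even, and there must be more than 2 points (4 is the safest), enough to enclose an area.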

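I had annotated with rectangles, though, which records only the two diagonal corners. My data therefore looked like [x1, y1, x2, y2]: 4 values, i.e. only 2 points, which does not meet the requirement and has to be converted. As far as I can tell, pycocotools treats a 4-value entry as a bounding box rather than a polygon, which would explain why the error complains about the 'bb' argument.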
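Before converting anything, it can help to list which annotations actually break these rules. The following is a minimal sketch, not part of PaddleX or pycocotools: `find_bad_segmentations` is a hypothetical helper, and it assumes the annotation file follows the usual COCO layout with an `annotations` list whose polygon segmentations are lists of flat coordinate lists.

```python
def find_bad_segmentations(coco):
    """Return (annotation id, reason) pairs for polygons that look malformed.

    `coco` is a dict in COCO layout, e.g. the result of json.load().
    """
    problems = []
    for ann in coco.get("annotations", []):
        seg = ann.get("segmentation")
        # RLE masks are stored as dicts; only polygon lists are checked here.
        if not isinstance(seg, list):
            continue
        if not seg:
            problems.append((ann.get("id"), "empty segmentation"))
            continue
        for poly in seg:
            if not isinstance(poly, list):
                problems.append((ann.get("id"), "polygon is not a list"))
            elif len(poly) % 2 != 0:
                problems.append((ann.get("id"), "odd number of coordinates"))
            elif len(poly) < 6:
                problems.append((ann.get("id"), "fewer than 3 points"))
    return problems


# Tiny in-memory example (made-up data) instead of a real annotation file.
sample = {
    "annotations": [
        {"id": 1, "segmentation": [[0, 0, 10, 10]]},               # 2 points only
        {"id": 2, "segmentation": [[0, 0, 10, 0, 10, 10, 0, 10]]}, # valid rectangle
    ]
}
for ann_id, reason in find_bad_segmentations(sample):
    print(ann_id, reason)
```

For a real file, load it with `json.load` and pass the resulting dict to the function. Every annotation reported as "fewer than 3 points" is a candidate for the rectangle-to-polygon conversion.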

A small script converts [x1, y1, x2, y2] into [x1, y1, x2, y1, x2, y2, x1, y2], the four corners of the rectangle in order, which meets the requirement.

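import json

# COCO annotation file; it is rewritten in place.
path = 'instances_val2017.json'

with open(path) as f:
    data = json.load(f)

for ann in data['annotations']:
    seg = ann.get('segmentation')
    # Only touch polygon lists whose first polygon is a 2-point rectangle;
    # RLE dicts and empty lists are left alone.
    if isinstance(seg, list) and seg and len(seg[0]) == 4:
        x1, y1, x2, y2 = seg[0]
        # Expand the two diagonal corners into all four corners, in order.
        seg[0] = [x1, y1, x2, y1, x2, y2, x1, y2]

with open(path, 'w') as f:
    json.dump(data, f, indent=4)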

SystemError: (Fatal) Blocking queue is killed because the data reader raises an exception.

Solution

See this issue.

AssertionError: Results do not correspond to current coco set

assert set(annsImgIds) == (set(annsImgIds) & set(self.getImgIds())), \
       'Results do not correspond to current coco set'
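Read literally, the assertion requires every image id found in the results to also exist in the ground-truth dataset, so it fails as soon as one result refers to an image the dataset does not know. A minimal sketch of the same subset check in plain Python follows. `stray_image_ids` is a hypothetical helper and the ids are made up for illustration.

```python
def stray_image_ids(result_image_ids, dataset_image_ids):
    """Return, sorted, the result image ids that are missing from the dataset."""
    return sorted(set(result_image_ids) - set(dataset_image_ids))


# Made-up ids: image 7 appears in the results but not in the dataset.
dataset_ids = [1, 2, 3]
result_ids = [1, 3, 3, 7]
print(stray_image_ids(result_ids, dataset_ids))  # [7]
```

An empty list means the assertion would pass. Anything else names the image ids worth tracing back to where they entered the evaluation set.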

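Solution

The COCO dataset is malformed. Check whether any empty JSON entries slipped in when the dataset was assembled, that is, entries that have only an id and no coordinates.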
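One way to run that check is sketched below. It assumes "no coordinates" means a missing or empty `bbox` field in a COCO-layout `annotations` list. `find_empty_annotations` is a hypothetical helper, not a PaddleX or pycocotools function.

```python
def find_empty_annotations(coco):
    """Return the ids of annotations whose bbox is missing or not 4 values long."""
    empty = []
    for ann in coco.get("annotations", []):
        bbox = ann.get("bbox")
        # A usable COCO bbox is [x, y, width, height].
        if not bbox or len(bbox) != 4:
            empty.append(ann.get("id"))
    return empty


# Made-up example: annotations 2 and 3 carry an id but no coordinates.
sample = {
    "annotations": [
        {"id": 1, "bbox": [0, 0, 5, 5]},
        {"id": 2},
        {"id": 3, "bbox": []},
    ]
}
print(find_empty_annotations(sample))  # [2, 3]
```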

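The real solution

In the end, all it took was commenting out eval_dataset.add_negative_samples(image_dir='o_Natural_empty_light')... and by then I had already regenerated the COCO dataset several times. I give up. Could Paddle please test its heavily modified libraries a bit more?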