[Notes] JSON: Reading the COCO Dataset with the COCO API

2021-10-06 13:01:31

Tags: return mask ids np JSON API COCO array def

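I. Structure of the COCO dataset

Assume dataDir contains the following directories: annotations, test2014, train2014, val2014.

The annotation file is a JSON file, so the json module can be used to inspect its basic structure: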

import json
dataDir=r'D:\data\coco\coco2014'
dataType='val2014'
annFile='{}/annotations/instances_{}.json'.format(dataDir,dataType)
# Use a context manager so the file is closed after loading
with open(annFile, 'r') as f:
    data = json.load(f)

First, look at the top-level keys:

for k in data:
    print(k)
-------------------------------
info
images
licenses
annotations
categories

The three most important keys are images, annotations, and categories.

1. Structure of images:

for k in data["images"][0]:
    print(k)
-----------------------------
license
file_name
coco_url
height
width
date_captured
flickr_url
id

2. Structure of annotations:

for k in data["annotations"][0]:
    print(k)
------------------------------------
segmentation
area
iscrowd
image_id
bbox
category_id
id

3. Structure of categories:

for k in data["categories"][0]:
    print(k)
-----------------------------------
supercategory
id
name
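
The three lists are linked by ids: each annotation points to an image through image_id and to a category through category_id. A minimal sketch of building lookup tables from them is shown here, using a tiny hand-made dictionary in place of the real file. The sample values and the helper name build_index are made up for illustration.

```python
from collections import defaultdict

# Hand-made stand-in for the loaded JSON; the values are illustrative only.
data = {
    "images": [{"id": 1, "file_name": "a.jpg", "height": 4, "width": 6},
               {"id": 2, "file_name": "b.jpg", "height": 4, "width": 6}],
    "categories": [{"id": 7, "name": "cat", "supercategory": "animal"}],
    "annotations": [
        {"id": 10, "image_id": 1, "category_id": 7, "bbox": [0, 0, 2, 2], "area": 4, "iscrowd": 0},
        {"id": 11, "image_id": 1, "category_id": 7, "bbox": [1, 1, 3, 2], "area": 6, "iscrowd": 0},
    ],
}

def build_index(data):
    """Return (images by id, category names by id, annotations grouped by image id)."""
    imgs = {img["id"]: img for img in data["images"]}
    cat_names = {cat["id"]: cat["name"] for cat in data["categories"]}
    img_to_anns = defaultdict(list)
    for ann in data["annotations"]:
        img_to_anns[ann["image_id"]].append(ann)
    return imgs, cat_names, img_to_anns

imgs, cat_names, img_to_anns = build_index(data)
for img_id, img in imgs.items():
    labels = [cat_names[a["category_id"]] for a in img_to_anns[img_id]]
    print(img["file_name"], labels)
```

An image without annotations simply maps to an empty list, which is the case for b.jpg here.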

II. The COCO API (cocoapi):

pycocotools contains four modules: coco, cocoeval, mask, and _mask.

1. The coco module:

# The following API functions are defined:
#  COCO       - COCO api class that loads COCO annotation file and prepare data structures.
#  getAnnIds  - Get ann ids that satisfy given filter conditions.
#  getCatIds  - Get cat ids that satisfy given filter conditions.
#  getImgIds  - Get img ids that satisfy given filter conditions.
#  loadAnns   - Load anns with the specified ids.
#  loadCats   - Load cats with the specified ids.
#  loadImgs   - Load imgs with the specified ids.
#  annToMask  - Convert segmentation in an annotation to binary mask.
#  showAnns   - Display the specified annotations.
#  loadRes    - Load algorithm results and create API for accessing them.
#  download   - Download COCO images from mscoco.org server.
# Throughout the API "ann"=annotation, "cat"=category, and "img"=image.
# Help on each functions can be accessed by: "help COCO>function".

The following 11 methods of the COCO class are covered here:

(1) Get annotation ids:

def getAnnIds(self, imgIds=[], catIds=[], areaRng=[], iscrowd=None):
        """
        Get ann ids that satisfy given filter conditions. default skips that filter
        :param imgIds  (int array)     : get anns for given imgs
               catIds  (int array)     : get anns for given cats
               areaRng (float array)   : get anns for given area range (e.g. [0 inf])
               iscrowd (boolean)       : get anns for given crowd label (False or True)
        :return: ids (int array)       : integer array of ann ids
        """

(2) Get category ids:

def getCatIds(self, catNms=[], supNms=[], catIds=[]):
        """
        filtering parameters. default skips that filter.
        :param catNms (str array)  : get cats for given cat names
        :param supNms (str array)  : get cats for given supercategory names
        :param catIds (int array)  : get cats for given cat ids
        :return: ids (int array)   : integer array of cat ids
        """

(3) Get image ids:

def getImgIds(self, imgIds=[], catIds=[]):
        '''
        Get img ids that satisfy given filter conditions.
        :param imgIds (int array) : get imgs for given ids
        :param catIds (int array) : get imgs with all given cats
        :return: ids (int array)  : integer array of img ids
        '''

(4) Load annotations:

def loadAnns(self, ids=[]):
        """
        Load anns with the specified ids.
        :param ids (int array)       : integer ids specifying anns
        :return: anns (object array) : loaded ann objects
        """

(5) Load categories:

def loadCats(self, ids=[]):
        """
        Load cats with the specified ids.
        :param ids (int array)       : integer ids specifying cats
        :return: cats (object array) : loaded cat objects
        """

(6) Load images:

def loadImgs(self, ids=[]):
        """
        Load anns with the specified ids.
        :param ids (int array)       : integer ids specifying img
        :return: imgs (object array) : loaded img objects
        """

(7) Display annotations on an image with matplotlib:

def showAnns(self, anns):
        """
        Display the specified annotations.
        :param anns (array of object): annotations to display
        :return: None
        """

(8) Load a result file:

def loadRes(self, resFile):
        """
        Load result file and return a result api object.
        :param   resFile (str)     : file name of result file
        :return: res (obj)         : result api object
        """

(9) Download the dataset (does this really work from within China? Baidu Netdisk is probably the better option):

def download(self, tarDir = None, imgIds = [] ):
        '''
        Download COCO images from mscoco.org server.
        :param tarDir (str): COCO results directory name
               imgIds (list): images to be downloaded
        :return:
        '''

(10) Convert an annotation to RLE format:

def annToRLE(self, ann):
        """
        Convert annotation which can be polygons, uncompressed RLE to RLE.
        :return: binary mask (numpy 2D array)
        """

(11) Get the mask:

def annToMask(self, ann):
        """
        Convert annotation which can be polygons, uncompressed RLE, or RLE to binary mask.
        :return: binary mask (numpy 2D array)
        """

2. The mask module defines four functions:

def encode(bimask):
def decode(rleObjs):
def area(rleObjs):
def toBbox(rleObjs):

3. The cocoeval module defines the COCOeval and Params classes:

    # The usage for CocoEval is as follows:
    #  cocoGt=..., cocoDt=...       # load dataset and results
    #  E = CocoEval(cocoGt,cocoDt); # initialize CocoEval object
    #  E.params.recThrs = ...;      # set parameters as desired
    #  E.evaluate();                # run per image evaluation
    #  E.accumulate();              # accumulate per image results
    #  E.summarize();               # display summary metrics of results

4. The lower-level module _mask: (omitted)

III. Example (Jupyter notebook):

%matplotlib inline
from pycocotools.coco import COCO
from pycocotools.mask import encode,decode,area,toBbox

import numpy as np
import skimage.io as io
import matplotlib.pyplot as plt
import pylab
pylab.rcParams['figure.figsize'] = (8.0, 10.0)

dataDir=r'D:\data\coco\coco2014'
dataType='val2014'
annFile='{}/annotations/instances_{}.json'.format(dataDir,dataType)

coco=COCO(annFile)

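# All image ids, and the corresponding image records
imgIds = coco.getImgIds()
imgs = coco.loadImgs(imgIds)

# All annotation ids for those images; take the first annotation
annIds = coco.getAnnIds(imgIds=imgIds)
ann = coco.loadAnns(annIds)[0]

# Convert the annotation to a binary mask and to RLE
mask = coco.annToMask(ann)
rle = coco.annToRLE(ann)

# Round trip between mask and RLE with the mask module
rle = encode(mask)
mask = decode(rle)

# Area and bounding box computed from the RLE
area(rle)
toBbox(rle)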
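
The calls to getImgIds and getAnnIds in this example pass few or no filters. According to the docstrings quoted earlier, any filter left at its default is skipped and the supplied ones are applied together. A rough pure-Python sketch of that idea for annotations follows, written against a hand-made list. filter_ann_ids is a hypothetical helper for illustration and not the pycocotools implementation.

```python
def filter_ann_ids(anns, img_ids=(), cat_ids=(), area_rng=(), iscrowd=None):
    """Return ids of annotations that pass every filter that was supplied."""
    ids = []
    for ann in anns:
        if img_ids and ann["image_id"] not in img_ids:
            continue
        if cat_ids and ann["category_id"] not in cat_ids:
            continue
        if area_rng and not (area_rng[0] < ann["area"] < area_rng[1]):
            continue
        if iscrowd is not None and ann["iscrowd"] != iscrowd:
            continue
        ids.append(ann["id"])
    return ids

# Illustrative annotations (values made up).
anns = [
    {"id": 1, "image_id": 100, "category_id": 1, "area": 50.0, "iscrowd": 0},
    {"id": 2, "image_id": 100, "category_id": 3, "area": 900.0, "iscrowd": 0},
    {"id": 3, "image_id": 200, "category_id": 1, "area": 400.0, "iscrowd": 1},
]

print(filter_ann_ids(anns))                          # [1, 2, 3]
print(filter_ann_ids(anns, img_ids=[100]))           # [1, 2]
print(filter_ann_ids(anns, cat_ids=[1], iscrowd=0))  # [1]
print(filter_ann_ids(anns, area_rng=[100, 1000]))    # [2, 3]
```

With the real API, the corresponding step would be a call such as coco.getAnnIds(imgIds=..., catIds=..., iscrowd=...) followed by coco.loadAnns on the returned ids.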

IV. The two segmentation formats: RLE (run-length encoding) and polygon:

1. iscrowd=1 means the format is RLE; iscrowd=0 means the format is polygon:

polygon:

{"segmentation": [[499.71, 397.28,......342.71, 172.31]], 
"area": 43466.12825, 
"iscrowd": 0, 
"image_id": 182155, 
"bbox": [338.89, 51.69, 205.82, 367.61], 
"category_id": 1, 
"id": 1248258},

RLE:

{"segmentation": {"counts": [66916, 6, 587,..... 1, 114303], "size": [594, 640]}, 
"area": 6197, 
"iscrowd": 1, 
"image_id": 284445, 
"bbox": [112, 322, 335, 94], 
"category_id": 1, 
"id": 9.001002844e+11}

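For a discussion of these two formats, see The RLE or Polygon format of "segmentation".

Most annotations in the COCO dataset appear to be in polygon format (RLE shows up for crowd regions with iscrowd=1, as in the example above), whereas understanding_cloud_organization uses RLE.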
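
In the RLE example above, counts holds alternating run lengths of 0s and 1s, starting with a run of 0s, with pixels read in column-major order and size given as [height, width]. A minimal NumPy sketch of that convention on a tiny mask is given here. counts_from_mask and mask_from_counts are hypothetical helper names; for real data, the encode and decode functions of pycocotools.mask are the tools to use.

```python
import numpy as np

def counts_from_mask(mask):
    """Run lengths of a binary mask in column-major order, starting with a run of 0s."""
    pixels = np.asarray(mask, dtype=np.uint8).flatten(order="F")
    counts = []
    current, run = 0, 0
    for p in pixels:
        if p == current:
            run += 1
        else:
            counts.append(run)
            current, run = p, 1
    counts.append(run)
    return counts

def mask_from_counts(counts, size):
    """Inverse of counts_from_mask; size is [height, width]."""
    height, width = size
    values = np.arange(len(counts)) % 2          # 0, 1, 0, 1, ...
    pixels = np.repeat(values, counts).astype(np.uint8)
    return pixels.reshape((height, width), order="F")

mask = np.array([[0, 1, 1],
                 [0, 1, 0]], dtype=np.uint8)
counts = counts_from_mask(mask)
print(counts)                                     # [2, 3, 1]
print(mask_from_counts(counts, [2, 3]))
```

A mask whose first pixel is 1 gets a leading count of 0, so the alternation always begins with the background.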

2. Converting between polygon and mask:

import cv2
import numpy as np

def mask2polygon(mask):
    # OpenCV 4.x returns (contours, hierarchy); OpenCV 3.x returns (image, contours, hierarchy)
    contours, hierarchy = cv2.findContours(mask.astype(np.uint8), cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
    # mask_new, contours, hierarchy = cv2.findContours(mask.astype(np.uint8), cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
    segmentation = []
    for contour in contours:
        contour_list = contour.flatten().tolist()
        if len(contour_list) > 4:# and cv2.contourArea(contour)>10000
            segmentation.append(contour_list)
    return segmentation

def polygons_to_mask(img_shape, polygons):
    mask = np.zeros(img_shape, dtype=np.uint8)
    # fillPoly requires int32 points and raises an error for other dtypes.
    # Each polygon is converted separately so that polygons with different
    # numbers of vertices can be handled in one call.
    pts = [np.asarray(p, dtype=np.int32).reshape(-1, 2) for p in polygons]
    cv2.fillPoly(mask, pts, color=1)
    return mask
#test------------------------------
import numpy as np
mask = np.ones((100, 100))
for i in range(10):
    for j in range(10):
        mask[i][j]=0
mask2polygon(mask)
--------------------------
[[10, 0, 10, 9, 9, 10, 0, 10, 0, 99, 99, 99, 99, 0]]

Another approach is binary_mask_to_polygon (untested; for reference only).
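
Each polygon in segmentation is a flat list [x1, y1, x2, y2, ...], which is why polygons_to_mask reshapes it into (x, y) pairs before filling. As a small illustration of that layout, the sketch below reshapes such a list and computes the enclosed area with the shoelace formula. polygon_area is a hypothetical helper written for this note.

```python
import numpy as np

def polygon_area(flat_polygon):
    """Area enclosed by a flat [x1, y1, x2, y2, ...] polygon (shoelace formula)."""
    pts = np.asarray(flat_polygon, dtype=np.float64).reshape(-1, 2)
    x, y = pts[:, 0], pts[:, 1]
    # Pair every vertex with the next one, wrapping around to the first.
    return 0.5 * abs(np.dot(x, np.roll(y, -1)) - np.dot(y, np.roll(x, -1)))

rect = [0, 0, 4, 0, 4, 3, 0, 3]    # 4 x 3 rectangle
print(polygon_area(rect))           # 12.0
```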

3. Converting between RLE and mask:

def mask2rle(img):
    '''
    img: numpy array, 1 - mask, 0 - background
    Returns the run lengths as a formatted string of "start length" pairs
    '''
    pixels= img.T.flatten()
    pixels = np.concatenate([[0], pixels, [0]])
    runs = np.where(pixels[1:] != pixels[:-1])[0] + 1
    runs[1::2] -= runs[::2]
    return ' '.join(str(x) for x in runs)

def rle2mask(rle, input_shape):
    '''
    rle: run-length string as produced by mask2rle (1-based starts, column-major order)
    input_shape: (rows, cols) of the mask
    '''
    rows, cols = input_shape[:2]
    mask = np.zeros(rows * cols, dtype=np.uint8)

    array = np.asarray([int(x) for x in rle.split()])
    starts = array[0::2] - 1  # mask2rle emits 1-based start positions
    lengths = array[1::2]

    for start, length in zip(starts, lengths):
        mask[start:start + length] = 1
    return mask.reshape(cols, rows).T
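
The string produced by mask2rle is a sequence of "start length" pairs, where start is a 1-based pixel position counted in column-major order (the mask is transposed before flattening). A small self-contained sketch of that format with a round-trip check is given here. pairs_from_mask and mask_from_pairs are hypothetical names written for this illustration and do not depend on the two functions above.

```python
import numpy as np

def pairs_from_mask(mask):
    """Encode a binary mask as 'start length' pairs (1-based, column-major)."""
    pixels = np.concatenate([[0], mask.T.flatten(), [0]])
    changes = np.where(pixels[1:] != pixels[:-1])[0] + 1
    starts, ends = changes[0::2], changes[1::2]
    return " ".join(f"{s} {e - s}" for s, e in zip(starts, ends))

def mask_from_pairs(rle, shape):
    """Decode 'start length' pairs into a mask of the given (rows, cols) shape."""
    rows, cols = shape
    flat = np.zeros(rows * cols, dtype=np.uint8)
    numbers = [int(x) for x in rle.split()]
    for start, length in zip(numbers[0::2], numbers[1::2]):
        flat[start - 1:start - 1 + length] = 1   # convert the 1-based start to 0-based
    return flat.reshape(cols, rows).T

mask = np.array([[0, 1, 0],
                 [1, 1, 0]], dtype=np.uint8)
rle = pairs_from_mask(mask)
print(rle)                                        # 2 3
print(np.array_equal(mask_from_pairs(rle, mask.shape), mask))  # True
```

Read column by column, the mask is 0, 1, 1, 1, 0, 0, so the single run of 1s starts at position 2 and has length 3.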

4. Computing the bbox of a mask:

def bounding_box(img):
    # return max and min of a mask to draw bounding box
    rows = np.any(img, axis=1)
    cols = np.any(img, axis=0)
    rmin, rmax = np.where(rows)[0][[0, -1]]
    cmin, cmax = np.where(cols)[0][[0, -1]]

    return rmin, rmax, cmin, cmax
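
bounding_box returns inclusive row and column indices, whereas the bbox field in the annotations shown earlier is [x, y, width, height]. A minimal sketch of producing the latter from a mask follows, assuming integer pixel indices with inclusive maxima. mask_to_coco_bbox is a hypothetical helper, and it expects a mask with at least one nonzero pixel.

```python
import numpy as np

def mask_to_coco_bbox(mask):
    """Return [x, y, width, height] of the nonzero region of a 2D mask."""
    rows = np.any(mask, axis=1)
    cols = np.any(mask, axis=0)
    rmin, rmax = np.where(rows)[0][[0, -1]]
    cmin, cmax = np.where(cols)[0][[0, -1]]
    # x follows columns and y follows rows; +1 because rmax and cmax are inclusive.
    return [int(cmin), int(rmin), int(cmax - cmin + 1), int(rmax - rmin + 1)]

mask = np.zeros((6, 8), dtype=np.uint8)
mask[2:5, 1:4] = 1                  # rows 2..4, columns 1..3
print(mask_to_coco_bbox(mask))      # [1, 2, 3, 3]
```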

V. Converting datasets in other formats to COCO format

See this example: convert-dataset-to-coco-format-tools

Source: https://blog.csdn.net/nyist_yangguang/article/details/120623603
