Documentation for the mmcv library

2020-11-23 16:33:37


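I previously reimplemented the mmcv library myself, and this post works through its API documentation. Official English documentation: https://mmcv.readthedocs.io/en/latest/api.html Project on GitHub: https://github.com/open-mmlab/mmcv Installing the library is often troublesome because it is updated so frequently, but its core really comes down to the features below.

I. File IO

(1) This module provides loading and dumping for common file types such as json/yaml/pkl.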

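import mmcv
# load data from a file; the format is inferred from the extension
data = mmcv.load('test.json')
data = mmcv.load('test.yaml')
data = mmcv.load('test.pkl')
# load data from a file-like object; the format must be given explicitly
with open('test.json', 'r') as f:
    data = mmcv.load(f, file_format='json')
# dump data to a string
json_str = mmcv.dump(data, file_format='json')
# dump data to a file; the format is inferred from the extension
mmcv.dump(data, 'out.pkl')
# dump data to a file through a file-like object
with open('test.yaml', 'w') as f:
    mmcv.dump(data, f, file_format='yaml')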

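It is also easy to extend the API to support more file formats. All you need to do is write a file handler that inherits from BaseFileHandler and register it with one or several file formats. You need to implement at least 3 methods.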

import mmcv
# To register multiple file formats, a list can be used as the argument.
# @mmcv.register_handler(['txt', 'log'])
@mmcv.register_handler('txt')
class TxtHandler1(mmcv.BaseFileHandler):
    def load_from_fileobj(self, file):
        return file.read()
    def dump_to_fileobj(self, obj, file):
        file.write(str(obj))
    def dump_to_str(self, obj, **kwargs):
        return str(obj)

Here is PickleHandler as an example.

import pickle
class PickleHandler(mmcv.BaseFileHandler):
    def load_from_fileobj(self, file, **kwargs):
        return pickle.load(file, **kwargs)
    def load_from_path(self, filepath, **kwargs):
        return super(PickleHandler, self).load_from_path(
            filepath, mode='rb', **kwargs)
    def dump_to_str(self, obj, **kwargs):
        kwargs.setdefault('protocol', 2)
        return pickle.dumps(obj, **kwargs)
    def dump_to_fileobj(self, obj, file, **kwargs):
        kwargs.setdefault('protocol', 2)
        pickle.dump(obj, file, **kwargs)
    def dump_to_path(self, obj, filepath, **kwargs):
        super(PickleHandler, self).dump_to_path(
            obj, filepath, mode='wb', **kwargs)
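To make the registration idea concrete, here is a minimal standard-library sketch of what a decorator such as register_handler could do: store one handler instance per file extension, then dispatch on the format name when dumping. The names HANDLERS, register_handler_sketch and dump_to_str_sketch are hypothetical, and this is not mmcv's actual implementation.

```python
import json

# Hypothetical format registry: extension -> handler instance.
HANDLERS = {}


def register_handler_sketch(extensions):
    """Class decorator that registers one handler under one or more extensions."""
    if isinstance(extensions, str):
        extensions = [extensions]

    def decorator(cls):
        instance = cls()  # a single shared instance for all listed extensions
        for ext in extensions:
            HANDLERS[ext] = instance
        return cls

    return decorator


@register_handler_sketch("json")
class JsonHandlerSketch:
    def dump_to_str(self, obj):
        return json.dumps(obj)


@register_handler_sketch(["txt", "log"])
class TextHandlerSketch:
    def dump_to_str(self, obj):
        return str(obj)


def dump_to_str_sketch(obj, file_format):
    """Look up the handler by format name and delegate to it."""
    if file_format not in HANDLERS:
        raise TypeError(f"Unsupported format: {file_format}")
    return HANDLERS[file_format].dump_to_str(obj)


print(dump_to_str_sketch({"a": 1}, "json"))
print(dump_to_str_sketch([1, 2], "log"))
```

Keeping the lookup table separate from the handlers is what lets a new format be added without touching the load/dump entry points.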

(2) Loading a text file as a list or dict. For example, a.txt is a text file with 5 lines.

a
b
c
d
e

Then use list_from_file to load the list from a.txt.

>>> mmcv.list_from_file('a.txt')
['a', 'b', 'c', 'd', 'e']
>>> mmcv.list_from_file('a.txt', offset=2)
['c', 'd', 'e']
>>> mmcv.list_from_file('a.txt', max_num=2)
['a', 'b']
>>> mmcv.list_from_file('a.txt', prefix='/mnt/')
['/mnt/a', '/mnt/b', '/mnt/c', '/mnt/d', '/mnt/e']
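The semantics of offset, max_num and prefix shown above can be reproduced in a few lines. This is a standard-library sketch that mirrors the outputs above, with the hypothetical name list_from_file_sketch; it is not the library's code, and treating max_num=0 as "no limit" is an assumption.

```python
import os
import tempfile


def list_from_file_sketch(path, prefix="", offset=0, max_num=0):
    """Read lines into a list.

    Skips the first `offset` lines, keeps at most `max_num` lines
    (0 is assumed to mean "no limit"), and prepends `prefix` to each item.
    """
    items = []
    with open(path, "r", encoding="utf-8") as f:
        for index, line in enumerate(f):
            if index < offset:
                continue
            if 0 < max_num <= len(items):
                break
            items.append(prefix + line.rstrip("\r\n"))
    return items


# Recreate the 5-line a.txt from the text in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "a.txt")
    with open(path, "w", encoding="utf-8") as f:
        f.write("a\nb\nc\nd\ne\n")
    whole = list_from_file_sketch(path)
    tail = list_from_file_sketch(path, offset=2)
    head = list_from_file_sketch(path, max_num=2)
    prefixed = list_from_file_sketch(path, prefix="/mnt/")

print(whole, tail, head, prefixed, sep="\n")
```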

For example, b.txt is a text file with 3 lines.

1 cat
2 dog cow
3 panda

Then use dict_from_file to load the dict from b.txt.

>>> mmcv.dict_from_file('b.txt')
{'1': 'cat', '2': ['dog', 'cow'], '3': 'panda'}
>>> mmcv.dict_from_file('b.txt', key_type=int)
{1: 'cat', 2: ['dog', 'cow'], 3: 'panda'}
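The parsing rule implied by the output above is: the first whitespace-separated token is the key, a single remaining token becomes a scalar value, and several remaining tokens become a list. A small sketch of that rule follows; parse_dict_lines is a hypothetical helper, and silently skipping lines without a value is an assumption.

```python
def parse_dict_lines(lines, key_type=str):
    """Turn 'key value [value ...]' lines into a dict."""
    mapping = {}
    for line in lines:
        parts = line.split()
        if len(parts) < 2:
            continue  # assumption: ignore lines that carry no value
        key = key_type(parts[0])
        values = parts[1:]
        # One value stays a scalar, several values become a list.
        mapping[key] = values[0] if len(values) == 1 else values
    return mapping


b_txt = ["1 cat", "2 dog cow", "3 panda"]
print(parse_dict_lines(b_txt))
print(parse_dict_lines(b_txt, key_type=int))
```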

II. Image

This module provides some image processing methods and requires opencv to be installed.

(1) Read/write/show. To read or write image files, use imread or imwrite.

import mmcv
img = mmcv.imread('test.jpg')
img = mmcv.imread('test.jpg', flag='grayscale')
img_ = mmcv.imread(img) # nothing will happen, img_ = img
mmcv.imwrite(img, 'out.jpg')

Read an image from bytes.

with open('test.jpg', 'rb') as f:
    data = f.read()
img = mmcv.imfrombytes(data)

Show an image file or a loaded image.

import numpy as np

# show an image file
mmcv.imshow('tests/data/color.jpg')
# show loaded images: ten random images, each displayed for 200 ms
for i in range(10):
    img = np.random.randint(256, size=(100, 100, 3), dtype=np.uint8)
    mmcv.imshow(img, win_name='test image', wait_time=200)

(2) Color space conversion. The following methods are provided.

bgr2gray
gray2bgr
bgr2rgb
rgb2bgr
bgr2hsv
hsv2bgr

img = mmcv.imread('tests/data/color.jpg')
img1 = mmcv.bgr2rgb(img)
img2 = mmcv.rgb2gray(img1)
img3 = mmcv.bgr2hsv(img)

(3) Resize. There are three resize methods. All imresize_* methods take a return_scale argument: if it is False, the return value is just the resized image; otherwise it is a tuple (resized_img, scale).

# resize to a given size
mmcv.imresize(img, (1000, 600), return_scale=True)

# resize to the same size of another image
mmcv.imresize_like(img, dst_img, return_scale=False)

# resize by a ratio
mmcv.imrescale(img, 0.5)

# resize so that the long edge is no longer than 1000 and the short edge
# no longer than 800, without changing the aspect ratio
mmcv.imrescale(img, (1000, 800))
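The tuple form is easiest to understand through its arithmetic. The sketch below assumes that the scale factor is the largest one satisfying both edge limits and that the new size is rounded to the nearest integer; rescale_size_sketch is a hypothetical name, not a documented function.

```python
def rescale_size_sketch(width, height, scale):
    """Return ((new_width, new_height), factor) for a ratio or a (long, short) limit."""
    if isinstance(scale, (int, float)):
        factor = float(scale)
    else:
        max_long_edge, max_short_edge = max(scale), min(scale)
        # The tighter of the two limits decides the factor.
        factor = min(max_long_edge / max(width, height),
                     max_short_edge / min(width, height))
    new_width = int(width * factor + 0.5)
    new_height = int(height * factor + 0.5)
    return (new_width, new_height), factor


print(rescale_size_sketch(2000, 1000, (1000, 800)))  # long edge is the limit
print(rescale_size_sketch(1200, 1000, (1000, 800)))  # short edge is the limit
print(rescale_size_sketch(640, 480, 0.5))            # plain ratio
```

In the first call the long edge hits 1000 first; in the second the short edge hits 800 first, so the long edge ends up below its limit.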

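(4) Rotate. Use imrotate to rotate an image by a given angle. The center can be specified and defaults to the center of the original image. There are two modes: one keeps the image size unchanged, so some parts of the rotated image are cropped; the other expands the image size to fit the rotated image.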

img = mmcv.imread('tests/data/color.jpg')

# rotate the image clockwise by 30 degrees.
img_ = mmcv.imrotate(img, 30)

# rotate the image counterclockwise by 90 degrees.
img_ = mmcv.imrotate(img, -90)

# rotate the image clockwise by 30 degrees, and rescale it by 1.5x at the same time.
img_ = mmcv.imrotate(img, 30, scale=1.5)

# rotate the image clockwise by 30 degrees, with (100, 100) as the center.
img_ = mmcv.imrotate(img, 30, center=(100, 100))

# rotate the image clockwise by 30 degrees, and extend the image size.
img_ = mmcv.imrotate(img, 30, auto_bound=True)
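How large the canvas has to grow when the image size is extended follows from the standard bounding-box formula for a rotated rectangle. A minimal sketch, assuming rotation about the image center and rounding to the nearest pixel (rotated_bounds is a hypothetical helper):

```python
import math


def rotated_bounds(width, height, angle_degrees):
    """Size of the axis-aligned box that contains a rotated width x height image."""
    theta = math.radians(angle_degrees)
    cos_t = abs(math.cos(theta))
    sin_t = abs(math.sin(theta))
    new_width = height * sin_t + width * cos_t
    new_height = height * cos_t + width * sin_t
    return int(round(new_width)), int(round(new_height))


print(rotated_bounds(400, 300, 0))   # unchanged
print(rotated_bounds(400, 300, 90))  # width and height swap
print(rotated_bounds(400, 300, 30))  # canvas grows in both directions
```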

(5) Flip. To flip an image, use imflip.

img = mmcv.imread('tests/data/color.jpg')

# flip the image horizontally
mmcv.imflip(img)

# flip the image vertically
mmcv.imflip(img, direction='vertical')

(6) Crop. imcrop can crop an image with one or several regions, each expressed as (x1, y1, x2, y2).

import mmcv
import numpy as np

img = mmcv.imread('tests/data/color.jpg')

# crop the region (10, 10, 100, 120)
bboxes = np.array([10, 10, 100, 120])
patch = mmcv.imcrop(img, bboxes)

# crop two regions (10, 10, 100, 120) and (0, 0, 50, 50)
bboxes = np.array([[10, 10, 100, 120], [0, 0, 50, 50]])
patches = mmcv.imcrop(img, bboxes)

# crop two regions, and rescale the patches by 1.2x
patches = mmcv.imcrop(img, bboxes, scale_ratio=1.2)

(7) Padding. Two methods, impad and impad_to_multiple, pad an image to a specific size with given values.

img = mmcv.imread('tests/data/color.jpg')

# pad the image to (1000, 1200) with all zeros
img_ = mmcv.impad(img, shape=(1000, 1200), pad_val=0)

# pad the image to (1000, 1200) with different values for three channels.
img_ = mmcv.impad(img, shape=(1000, 1200), pad_val=[100, 50, 200])

# pad the image on left, top, right, bottom borders with all zeros
img_ = mmcv.impad(img, padding=(10, 20, 30, 40), pad_val=0)

# pad the image on left, top, right, bottom borders with different values
# for three channels.
img_ = mmcv.impad(img, padding=(10, 20, 30, 40), pad_val=[100, 50, 200])

# pad an image so that each edge is a multiple of some value.
img_ = mmcv.impad_to_multiple(img, 32)
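Padding to a multiple only needs the next multiple of the divisor for each edge. A short sketch of that computation, where padded_shape is a hypothetical helper and the example sizes are made up:

```python
import math


def padded_shape(height, width, divisor):
    """Smallest (height, width) that is >= the input and divisible by `divisor`."""
    padded_height = int(math.ceil(height / divisor)) * divisor
    padded_width = int(math.ceil(width / divisor)) * divisor
    return padded_height, padded_width


print(padded_shape(300, 400, 32))
print(padded_shape(64, 64, 32))
```

Edges that are already a multiple of the divisor are left unchanged, as the second call shows.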

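III. Video

This module provides the following functionality.

1. A VideoReader class with a friendly API for reading and converting videos.
2. Some methods for editing (cut, concat, resize) videos.
3. Optical flow read/write/warp.

(1) VideoReader. The VideoReader class provides a sequence-like API for accessing video frames. Internally it caches the frames that have been visited.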

video = mmcv.VideoReader('test.mp4')

# obtain basic information
print(len(video))
print(video.width, video.height, video.resolution, video.fps)

# iterate over all frames
for frame in video:
    print(frame.shape)

# read the next frame
img = video.read()

# read a frame by index
img = video[100]

# read some frames
img = video[5:10]

Convert a video to images, or generate a video from a directory of images.

# split a video into frames and save to a folder
video = mmcv.VideoReader('test.mp4')
video.cvt2frames('out_dir')

# generate video from frames
mmcv.frames2video('out_dir', 'test.avi')

(2) Editing utilities. There are also some methods for editing videos, which wrap ffmpeg commands.

# cut a video clip
mmcv.cut_video('test.mp4', 'clip1.mp4', start=3, end=10, vcodec='h264')

# join a list of video clips
mmcv.concat_video(['clip1.mp4', 'clip2.mp4'], 'joined.mp4', log_level='quiet')

# resize a video with the specified size
mmcv.resize_video('test.mp4', 'resized1.mp4', (360, 240))

# resize a video with a scaling ratio of 2
mmcv.resize_video('test.mp4', 'resized2.mp4', ratio=2)

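(3) Optical flow. Two options are provided for dumping optical flow files: uncompressed and compressed. The uncompressed way simply dumps the floating-point numbers to a binary file. It is lossless, but the dumped file is larger. The compressed way quantizes the optical flow to 0-255 and dumps it as a jpeg image. The x-dim and y-dim flows are concatenated into a single image.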

import numpy as np

flow = np.random.rand(800, 600, 2).astype(np.float32)
# dump the flow to a flo file (~3.7M)
mmcv.flowwrite(flow, 'uncompressed.flo')
# dump the flow to a jpeg file (~230K)
# the shape of the dumped image is (800, 1200)
mmcv.flowwrite(flow, 'compressed.jpg', quantize=True, concat_axis=1)

# read the flow file, the shape of loaded flow is (800, 600, 2) for both ways
flow = mmcv.flowread('uncompressed.flo')
flow = mmcv.flowread('compressed.jpg', quantize=True, concat_axis=1)
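The trade-off between the two options can be seen with a toy quantizer. The sketch below assumes a symmetric clipping range [-max_val, max_val] and 255 levels; quantize_value, dequantize_value and the chosen range are illustrative, not the library's parameters. It also checks the size quoted in the comments above: 800 x 600 x 2 float32 values.

```python
def quantize_value(value, max_val, levels=255):
    """Map a float in [-max_val, max_val] to an integer bin in [0, levels - 1]."""
    shifted = min(max(value, -max_val), max_val) + max_val  # now in [0, 2 * max_val]
    bin_index = int(levels * shifted / (2 * max_val))
    return min(bin_index, levels - 1)


def dequantize_value(bin_index, max_val, levels=255):
    """Map a bin index back to the center of its bin."""
    return (bin_index + 0.5) * (2 * max_val) / levels - max_val


in_range = [-1.0, -0.3, 0.0, 0.42, 1.0]
codes = [quantize_value(v, 1.0) for v in in_range]
restored = [dequantize_value(c, 1.0) for c in codes]
max_error = max(abs(a - b) for a, b in zip(in_range, restored))

# Raw payload of an (800, 600, 2) float32 array, in bytes and in MiB.
raw_bytes = 800 * 600 * 2 * 4
print(codes, max_error, raw_bytes, round(raw_bytes / 2**20, 2))
```

Within the clipping range, the round-trip error is bounded by half a bin width, which is the price paid for the much smaller file.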

Optical flow can be visualized with mmcv.flowshow().

mmcv.flowshow(flow)

IV. Visualization

mmcv can show images and annotations (currently supported types include bounding boxes).

# show an image file
mmcv.imshow('a.jpg')

# show a loaded image
img = np.random.rand(100, 100, 3)
mmcv.imshow(img)

# show image with bounding boxes
img = np.random.rand(100, 100, 3)
bboxes = np.array([[0, 0, 50, 50], [20, 20, 60, 60]])
mmcv.imshow_bboxes(img, bboxes)

mmcv can also visualize special images such as optical flow.

flow = mmcv.flowread('test.flo')
mmcv.flowshow(flow)

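V. Utils

(1) Config. This class is very often used to configure networks. The Config class is used for manipulating configs and config files. It supports loading configs from multiple file formats, including python, json and yaml, and provides dict-like APIs to get and set values. Here is an example of the config file test.py.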

a = 1
b = dict(b1=[0, 1, 2], b2=None)
c = (1, 2)
d = 'string'

>>> from mmcv import Config
>>> cfg = Config.fromfile('test.py')
>>> print(cfg)
dict(a=1,
     b=dict(b1=[0, 1, 2], b2=None),
     c=(1, 2),
     d='string')

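All config formats support some predefined variables. A variable written as {{ var }} is replaced with its actual value. Currently four predefined variables are supported:

{{ fileDirname }} - the dirname of the currently opened file, e.g. /home/your-username/your-project/folder
{{ fileBasename }} - the basename of the currently opened file, e.g. file.ext
{{ fileBasenameNoExtension }} - the basename of the currently opened file without the file extension, e.g. file
{{ fileExtname }} - the extension of the currently opened file, e.g. .ext

These variable names are borrowed from VS Code. Here is an example of a config with predefined variables.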

# config_a.py
a = 1
b = './work_dir/{{ fileBasenameNoExtension }}'
c = '{{ fileExtname }}'

>>> cfg = Config.fromfile('./config_a.py')
>>> print(cfg)
dict(a=1,
     b='./work_dir/config_a',
     c='.py')
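The substitution itself can be sketched with os.path and a regular expression. This is a standard-library illustration of the four variables described above, with the hypothetical name substitute_predefined; leaving unknown variables untouched is an assumption, and the file path in the example is made up.

```python
import os
import re


def substitute_predefined(text, filename):
    """Replace {{ var }} placeholders using values derived from `filename`."""
    basename = os.path.basename(filename)
    stem, extension = os.path.splitext(basename)
    values = {
        "fileDirname": os.path.dirname(filename),
        "fileBasename": basename,
        "fileBasenameNoExtension": stem,
        "fileExtname": extension,
    }

    def replace(match):
        # Assumption: unknown variable names are left as they are.
        return values.get(match.group(1), match.group(0))

    return re.sub(r"\{\{\s*(\w+)\s*\}\}", replace, text)


config_path = "/home/user/project/config_a.py"  # hypothetical location
print(substitute_predefined("./work_dir/{{ fileBasenameNoExtension }}", config_path))
print(substitute_predefined("{{ fileExtname }}", config_path))
```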

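All config formats support inheritance. To reuse fields from other config files, specify _base_='./config_a.py', or a list of configs, _base_=['./config_a.py', './config_b.py']. Here are 4 examples of config inheritance.

Case 1: inherit from a base config without overlapping keys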

# config_a.py
a = 1
b = dict(b1=[0, 1, 2], b2=None)

# config_b.py
_base_ = './config_a.py'
c = (1, 2)
d = 'string'

>>> cfg = Config.fromfile('./config_b.py')
>>> print(cfg)
dict(a=1,
     b=dict(b1=[0, 1, 2], b2=None),
     c=(1, 2),
     d='string')

The new fields in config_b.py are combined with the old fields in config_a.py.

Case 2: inherit from a base config with overlapping keys

# config_c.py
_base_ = './config_a.py'
b = dict(b2=1)
c = (1, 2)

>>> cfg = Config.fromfile('./config_c.py')
>>> print(cfg)
dict(a=1,
     b=dict(b1=[0, 1, 2], b2=1),
     c=(1, 2))

b.b2=1 in config_c.py replaces b.b2=None in config_a.py.

Case 3: inherit from a base config with ignored fields

# config_d.py
_base_ = './config_a.py'
b = dict(_delete_=True, b2=None, b3=0.1)
c = (1, 2)

>>> cfg = Config.fromfile('./config_d.py')
>>> print(cfg)
dict(a=1,
     b=dict(b2=None, b3=0.1),
     c=(1, 2))

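You can also set _delete_=True to ignore some fields of the base config. All old keys in b (b1 and b2) are replaced with the new keys b2 and b3.

Case 4: inherit from multiple base configs (the base configs should not contain the same keys)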

# config_e.py
c = (1, 2)
d = 'string'

# config_f.py
_base_ = ['./config_a.py', './config_e.py']

>>> cfg = Config.fromfile('./config_f.py')
>>> print(cfg)
dict(a=1,
     b=dict(b1=[0, 1, 2], b2=None),
     c=(1, 2),
     d='string')
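The four cases boil down to a recursive dictionary merge with one special key. The following sketch reproduces the outputs above on plain dicts; merge_child_into_base and combine_bases are hypothetical helpers written for illustration and are not the Config class's internals.

```python
import copy

DELETE_KEY = "_delete_"


def merge_child_into_base(child, base):
    """Return a new dict: `base` updated recursively with the fields of `child`."""
    merged = copy.deepcopy(base)
    for key, value in child.items():
        if isinstance(value, dict):
            value = dict(value)  # shallow copy so the caller's dict is untouched
            replace_whole = value.pop(DELETE_KEY, False)
            if isinstance(merged.get(key), dict) and not replace_whole:
                merged[key] = merge_child_into_base(value, merged[key])
            else:
                merged[key] = merge_child_into_base(value, {})
        else:
            merged[key] = copy.deepcopy(value)
    return merged


def combine_bases(*bases):
    """Combine several base configs, refusing duplicated top-level keys."""
    combined = {}
    for base in bases:
        duplicated = combined.keys() & base.keys()
        if duplicated:
            raise KeyError(f"Duplicate keys in base configs: {sorted(duplicated)}")
        combined.update(copy.deepcopy(base))
    return combined


config_a = dict(a=1, b=dict(b1=[0, 1, 2], b2=None))
config_c = dict(b=dict(b2=1), c=(1, 2))
config_d = dict(b=dict(_delete_=True, b2=None, b3=0.1), c=(1, 2))
config_e = dict(c=(1, 2), d="string")

merged_c = merge_child_into_base(config_c, config_a)  # case 2
merged_d = merge_child_into_base(config_d, config_a)  # case 3
merged_f = combine_bases(config_a, config_e)          # case 4
print(merged_c, merged_d, merged_f, sep="\n")
```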

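(2) ProgressBar. If you want to apply a method to a list of items and track the progress, track_progress is a good choice. It displays a progress bar showing the progress and the ETA.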

import mmcv
def func(item):
    # do something
    pass
tasks = [item_1, item_2, ..., item_n]
mmcv.track_progress(func, tasks)

There is also another method, track_parallel_progress, which wraps multiprocessing and progress visualization.

mmcv.track_parallel_progress(func, tasks, 8)  # 8 workers

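If you want to iterate over or enumerate a list of items and track the progress, track_iter_progress is a good choice. It displays a progress bar showing the progress and the ETA.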

import mmcv

tasks = [item_1, item_2, ..., item_n]

for task in mmcv.track_iter_progress(tasks):
    # do something like print
    print(task)

for i, task in enumerate(mmcv.track_iter_progress(tasks)):
    # do something like print
    print(i)
    print(task)

(3) Timer. It is convenient to measure the running time of a code block with Timer.

import time
with mmcv.Timer():
    # simulate some code block
    time.sleep(1)

Or try since_start() and since_last_check(). The former returns the running time since the timer started, and the latter returns the time since the last check.

timer = mmcv.Timer()
# code block 1 here
print(timer.since_start())
# code block 2 here
print(timer.since_last_check())
print(timer.since_start())
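The difference between the two calls is easier to see in a stripped-down timer. This is a sketch built on time.perf_counter; SimpleTimer is a hypothetical class, and the choice that every call (including since_start) updates the "last check" timestamp is an assumption.

```python
import time


class SimpleTimer:
    """Minimal timer with a start timestamp and a last-check timestamp."""

    def __init__(self):
        self._start = time.perf_counter()
        self._last = self._start

    def since_start(self):
        # Assumption: asking for the total also counts as a check.
        self._last = time.perf_counter()
        return self._last - self._start

    def since_last_check(self):
        now = time.perf_counter()
        elapsed = now - self._last
        self._last = now
        return elapsed


timer = SimpleTimer()
time.sleep(0.01)                # code block 1
total_1 = timer.since_start()
time.sleep(0.01)                # code block 2
lap = timer.since_last_check()  # roughly the duration of code block 2
total_2 = timer.since_start()   # roughly both blocks together
print(f"{total_1:.3f}s {lap:.3f}s {total_2:.3f}s")
```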

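VI. Runner

The runner module aims to help users start training with less code while staying flexible and configurable. Documentation and examples are still being updated.

VII. Registry

MMCV implements a registry to manage different modules that share similar functionality, e.g. backbones, heads and necks in detectors. Most projects in OpenMMLab use registries to manage dataset and model modules, such as MMDetection, MMDetection3D, MMClassification and MMEditing.

What is a registry? In MMCV, a registry can be regarded as a mapping between classes and strings. The classes contained in a single registry usually have similar APIs but implement different algorithms or support different datasets. With a registry, users can look up and instantiate a class through its corresponding string and use the instantiated module as needed. A typical example is the config system in most OpenMMLab projects, which uses registries to create hooks, runners, models and datasets from configs.

To manage the modules in a codebase with a registry, there are three steps.

(1) Create a registry
(2) Create a build method
(3) Use this registry to manage the modules

A simple example: here we show how to use a registry to manage the modules in a package. You can find more practical examples in OpenMMLab projects. Suppose we want to implement a series of dataset converters for converting data of different formats to the expected format. We create a directory as a package named converters. In the package, we first create a file that implements the builder, named converters/builder.py, as shown below.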

from mmcv.utils import Registry
# create a registry for converters
CONVERTERS = Registry('converter')
# create a build function
def build_converter(cfg, *args, **kwargs):
    cfg_ = cfg.copy()
    converter_type = cfg_.pop('type')
    if converter_type not in CONVERTERS:
        raise KeyError(f'Unrecognized task type {converter_type}')
    else:
        converter_cls = CONVERTERS.get(converter_type)
    converter = converter_cls(*args, **kwargs, **cfg_)
    return converter

Then we can implement different converters in the package. For example, implement Converter1 in converters/converter1.py

from .builder import CONVERTERS
# use the registry to manage the module
@CONVERTERS.register_module()
class Converter1(object):
    def __init__(self, a, b):
        self.a = a
        self.b = b

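The key step in using a registry to manage modules is to register the implemented module into the CONVERTERS registry through @CONVERTERS.register_module() when creating the module. If the module is registered successfully, you can use this converter through configs as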

converter_cfg = dict(type='Converter1', a=a_value, b=b_value)
converter = build_converter(converter_cfg)
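For readers who want to see the whole mechanism without installing anything, the three steps can be imitated in plain Python. SimpleRegistry and build_from_cfg_sketch below are hypothetical stand-ins for illustration only; mmcv's own Registry has more features and may differ in details.

```python
class SimpleRegistry:
    """A string -> class mapping with a decorator for registration."""

    def __init__(self, name):
        self.name = name
        self._modules = {}

    def __contains__(self, key):
        return key in self._modules

    def get(self, key):
        return self._modules.get(key)

    def register_module(self, name=None):
        def decorator(cls):
            key = name or cls.__name__
            if key in self._modules:
                raise KeyError(f"{key} is already registered in {self.name}")
            self._modules[key] = cls
            return cls

        return decorator


def build_from_cfg_sketch(cfg, registry, **extra_kwargs):
    """Instantiate the class named by cfg['type'] with the remaining fields."""
    args = dict(cfg)  # copy so the caller's config is not modified
    type_name = args.pop("type")
    cls = registry.get(type_name)
    if cls is None:
        raise KeyError(f"{type_name} is not in the {registry.name} registry")
    return cls(**args, **extra_kwargs)


CONVERTERS = SimpleRegistry("converter")   # step 1: create a registry


@CONVERTERS.register_module()              # step 3: register modules
class Converter1:
    def __init__(self, a, b):
        self.a = a
        self.b = b


converter_cfg = dict(type="Converter1", a=1, b=2)
converter = build_from_cfg_sketch(converter_cfg, CONVERTERS)  # step 2: build
print(type(converter).__name__, converter.a, converter.b)
```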

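VIII. CNN

We provide some building blocks for CNNs, including layer building, module bundles and weight initialization.

(1) Layer building. When running experiments, we may need to try different layers of the same type without modifying the code every time. Here we provide some methods to build layers from a dict, which can be written in configs or specified through command-line arguments.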

from mmcv.cnn import build_conv_layer

cfg = dict(type='Conv3d')
layer = build_conv_layer(cfg, in_channels=3, out_channels=8, kernel_size=3)

build_conv_layer: Supported types are Conv1d, Conv2d, Conv3d, Conv (alias for Conv2d).
build_norm_layer: Supported types are BN1d, BN2d, BN3d, BN (alias for BN2d), SyncBN, GN, LN, IN1d, IN2d, IN3d, IN (alias for IN2d).
build_activation_layer: Supported types are ReLU, LeakyReLU, PReLU, RReLU, ReLU6, ELU, Sigmoid, Tanh.
build_upsample_layer: Supported types are nearest, bilinear, deconv, pixel_shuffle.
build_padding_layer: Supported types are zero, reflect, replicate.

The build methods can also be extended with custom layers and operators.

# Write and register your own module.
from mmcv.cnn import UPSAMPLE_LAYERS
@UPSAMPLE_LAYERS.register_module()
class MyUpsample:
    def __init__(self, scale_factor):
        pass
    def forward(self, x):
        pass

# Import MyUpsample somewhere (e.g. in __init__.py), then use it.
cfg = dict(type='MyUpsample', scale_factor=2)
layer = build_upsample_layer(cfg)

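(2) Module bundles. We also provide common module bundles to ease network construction. ConvModule is a bundle of a convolution layer, a normalization layer and an activation layer; see the API for details.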

# conv + bn + relu
conv = ConvModule(3, 8, 2, norm_cfg=dict(type='BN'))
# conv + gn + relu
conv = ConvModule(3, 8, 2, norm_cfg=dict(type='GN', num_groups=2))
# conv + relu
conv = ConvModule(3, 8, 2)
# conv
conv = ConvModule(3, 8, 2, act_cfg=None)
# conv + leaky relu
conv = ConvModule(3, 8, 3, padding=1, act_cfg=dict(type='LeakyReLU'))
# bn + conv + relu
conv = ConvModule(
    3, 8, 2, norm_cfg=dict(type='BN'), order=('norm', 'conv', 'act'))

(3) Weight initialization

constant_init
xavier_init
normal_init
uniform_init
kaiming_init
caffe2_xavier_init
bias_init_with_prob

import torch.nn as nn
from mmcv.cnn import normal_init, xavier_init

conv1 = nn.Conv2d(3, 3, 1)
normal_init(conv1, std=0.01, bias=0)
xavier_init(conv1, distribution='uniform')

(4) Besides the pre-trained models of torchvision, we also provide pre-trained models of the following CNNs:

VGG Caffe
ResNet Caffe
ResNeXt
ResNet with Group Normalization
ResNet with Group Normalization and Weight Standardization
HRNetV2
Res2Net
RegNet

The model zoo links in MMCV are managed by JSON files. A JSON file consists of key-value pairs of a model name and its url or path. An example JSON file:

{
    "model_a": "https://example.com/models/model_a_9e5bac.pth",
    "model_b": "pretrain/model_b_ab3ef2c.pth"
}
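Because such a file is a flat JSON object, combining a default file with a user-supplied one comes down to a dictionary update in which the later file wins on duplicate names. A minimal sketch with made-up entries (model_c and the external paths are hypothetical, and merge_model_urls is not a library function):

```python
import json

default_text = """
{
    "model_a": "https://example.com/models/model_a_9e5bac.pth",
    "model_b": "pretrain/model_b_ab3ef2c.pth"
}
"""

external_text = """
{
    "model_b": "/my/checkpoints/model_b_custom.pth",
    "model_c": "https://example.com/models/model_c.pth"
}
"""


def merge_model_urls(default_json, external_json):
    """Merge two name -> location mappings; external entries take precedence."""
    urls = json.loads(default_json)
    urls.update(json.loads(external_json))
    return urls


model_urls = merge_model_urls(default_text, external_text)
for name, location in sorted(model_urls.items()):
    print(name, "->", location)
```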

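Default links are provided for the pre-trained models hosted on OpenMMLab AWS. You can override the default links by putting an open-mmlab.json under MMCV_HOME. If MMCV_HOME is not found in the environment, ~/.cache/mmcv is used by default. You can export MMCV_HOME=/your/path to use your own path. The external JSON files are merged into the default one. If the same key is present in both the external JSON and the default JSON, the external one is used.

(5) Loading checkpoints. The filename argument of mmcv.load_checkpoint() supports the following types.

1. filepath: the file path of the checkpoint.
2. http://xxx and https://xxx: the link to download the checkpoint. The SHA256 postfix should be contained in the filename.
3. torchvision://xxx: the model links in torchvision.models. Please refer to torchvision for details.
4. open-mmlab://xxx: the model links or file paths provided in the default and additional JSON files.

IX. CUDA ops

We implement common CUDA ops used in detection, segmentation, etc.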

BBoxOverlaps
CARAFE
CrissCrossAttention
ContextBlock
CornerPool
Deformable Convolution v1/v2
Deformable RoIPool
GeneralizedAttention
MaskedConv
NMS
PSAMask
RoIPool
RoIAlign
SimpleRoIAlign
SigmoidFocalLoss
SoftmaxFocalLoss
SoftNMS
Synchronized BatchNorm
Weight standardization

Source: https://www.cnblogs.com/HIKSEEKER/p/14025194.html
