ICode9

精准搜索请尝试: 精确搜索
首页 > 编程语言> 文章详细

PaddlePaddle inference 源码分析(四)

2021-12-24 18:03:23  阅读:357  来源: 互联网

标签:inference tensor PaddlePaddle paddle platform 源码 place VarType scope


本节介绍预测处理的流程。预测处理流程主要分为3部分,包括准备输入数据、执行、获取输出数据。

一、放入输入数据

简单的使用方法如下所示:

vector<string> input_names = predictor->GetInputNames();
unique_ptr<Tensor> input_t = predictor->GetInputHandle(input_names[0]);
input_t->Reshape(input_shape);
input_t->CopyFromCpu(input.data());

我们按照这个流程一步一步来深入

1、GetInputNames

这个调用有点绕,因为对外提供的头文件是paddle_infer作用域,因此这里的实际实现是先在paddle_infer下函数调用,然后调用了实际创建出来的AnalysisPredictor::GetInputNamse。

这一步是获取输入的节点名称。这里idx2feeds_是std::map<size_t, std::string>,保存的是模型文件中op->Type==feed的名称

// 接口类实现
namespace paddle_infer {
std::vector<std::string> Predictor::GetInputNames() {
  return predictor_->GetInputNames();
}
}

// 实际实现
std::vector<std::string> AnalysisPredictor::GetInputNames() {
  std::vector<std::string> input_names;
  for (auto &item : idx2feeds_) {
    input_names.push_back(item.second);
  }
  return input_names;
}

2、GetInputHandle

作用是根据节点名称获取到对应的内存区域。前文介绍过Scope中保存了所有节点的信息,这里就是拿到输入节点Scope的内存区域.这里executor保存的scope是predictor的sub_scope。

namespace paddle_infer {
std::unique_ptr<Tensor> Predictor::GetInputHandle(const std::string &name) {
  return predictor_->GetInputTensor(name);
}
}
std::unique_ptr<ZeroCopyTensor> AnalysisPredictor::GetInputTensor(
    const std::string &name) {
  PADDLE_ENFORCE_NOT_NULL(
      executor_->scope()->FindVar(name),
      platform::errors::PreconditionNotMet(
          "The variable named %s is not found in the scope of the exector.",
          name));
  // 拿到scope
  std::unique_ptr<ZeroCopyTensor> res(
      new ZeroCopyTensor(static_cast<void *>(executor_->scope())));
  res->input_or_output_ = true;
  res->SetName(name);
  // 根据设备获取对应place
  if (platform::is_cpu_place(place_)) {
    res->SetPlace(PaddlePlace::kCPU);
  } else if (platform::is_xpu_place(place_)) {
    if (config_.lite_engine_enabled()) {
      // Currently, Paddle-Lite's XPU user interface only supports the transfer
      // of host data pointers. If it is currently used as a subgraph, execution
      // efficiency will be sacrificed, so it is temporarily set to cpu place.
      // And, the current lite engine of xpu must execute all parts of the
      // model.
      res->SetPlace(PaddlePlace::kCPU);
    } else {
      auto xpu_place = BOOST_GET_CONST(platform::XPUPlace, place_);
      res->SetPlace(PaddlePlace::kXPU, xpu_place.GetDeviceId());
    }
  } else if (platform::is_npu_place(place_)) {
    auto npu_place = BOOST_GET_CONST(platform::NPUPlace, place_);
    res->SetPlace(PaddlePlace::kNPU, npu_place.GetDeviceId());
  } else {
    auto gpu_place = BOOST_GET_CONST(platform::CUDAPlace, place_);
    res->SetPlace(PaddlePlace::kGPU, gpu_place.GetDeviceId());
  }
  return res;
}

3、ZeroCopyTensor::Reshape

这一步骤的作用就是操作输入tensort,重新确定输入数据的维度信息。这里我们会详细介绍一下tensor的操作。

3.1 基类及接口是paddle_infer::Tensor(paddle_tensor.h/inference/api/details/zero_copy_tensor.cc).ZeroCopyTensor(paddle_api.h)是paddle_infer::Tensor的子类,主要重写了copy相关的函数。会在下一小结具体讲述。

3.2 实际的reshape操作作用在Tensor::Reshape中,实际逻辑为从sub_scope中取出对应名称的Variable(framework/variable.h)并对其进行操作。

void Tensor::Reshape(const std::vector<int> &shape) {
  // 判断是否设置了name
  PADDLE_ENFORCE_EQ(
      name_.empty(), false,
      paddle::platform::errors::PreconditionNotMet(
          "Need to SetName first, so that the corresponding tensor can "
          "be retrieved."));
  // 判断是否为input,只有input才能重新设置
  PADDLE_ENFORCE_EQ(input_or_output_, true,
                    paddle::platform::errors::PermissionDenied(
                        "Can't reshape the output tensor, it is readonly"));
  // 获取scope,然后取出对应名称节点的变量并进行设置。这里使用的是sub_scope,其中保存的都是非永久性的节点
  auto *scope = static_cast<paddle::framework::Scope *>(scope_);
  auto *var = scope->FindVar(name_);
  PADDLE_ENFORCE_NOT_NULL(
      var, paddle::platform::errors::PreconditionNotMet(
               "No tensor called [%s] in the runtime scope", name_));
  auto *tensor = var->GetMutable<paddle::framework::LoDTensor>();
  tensor->Resize(paddle::framework::make_ddim(shape));
}

3.3 var->GetMutable,这里实际在Variable中创建对应类型的存储数据。存储数据用LoDTensor(lod_tensor.h),创建一个LoDTensor的对象赋值给

  template <typename T>
  T* GetMutable() {
    if (!holder_) {
      holder_.reset(new PlaceholderImpl<T>());
    } else {
      PADDLE_ENFORCE_EQ(
          holder_->Type(), VarTypeTrait<T>::kId,
          platform::errors::InvalidArgument(
              "The Variable type must be %s, but the type it holds is %s.",
              ToTypeName(VarTypeTrait<T>::kId), ToTypeName(holder_->Type())));
    }
    return static_cast<T*>(holder_->Ptr());
  }

PlaceholderImpl是一个模板类,用于包装T,这样Variable类在构造时不需要包含模板,只需要把Placeholder指针作为成员变量即可std::shared_ptr<Placeholder> holder_;PlaceholderImpl构造时会保存obj指针,同时保存obj的类型序号,序号实际在proto::VarType中定义。对应关系实现已注册好。

  // Placeholder hides type T, so it doesn't appear as a template
  // parameter of Variable.
  template <typename T>
  struct PlaceholderImpl : public Placeholder {
    static_assert(
        IsRegisteredVarType<T>(),
        "Not registered type. Please register T inside var_type_traits.h");
    PlaceholderImpl() { this->Init(&obj_, VarTypeTrait<T>::kId); }

   private:
    T obj_;
  };

这里会检查T类型是否已注册,注册列表详见framework/var_type_traits.h

REG_PROTO_VAR_TYPE_TRAIT(LoDTensor, proto::VarType::LOD_TENSOR);
REG_PROTO_VAR_TYPE_TRAIT(SelectedRows, proto::VarType::SELECTED_ROWS);
REG_PROTO_VAR_TYPE_TRAIT(std::vector<Scope *>, proto::VarType::STEP_SCOPES);
REG_PROTO_VAR_TYPE_TRAIT(LoDRankTable, proto::VarType::LOD_RANK_TABLE);
REG_PROTO_VAR_TYPE_TRAIT(LoDTensorArray, proto::VarType::LOD_TENSOR_ARRAY);
REG_PROTO_VAR_TYPE_TRAIT(platform::PlaceList, proto::VarType::PLACE_LIST);
REG_PROTO_VAR_TYPE_TRAIT(ReaderHolder, proto::VarType::READER);
REG_PROTO_VAR_TYPE_TRAIT(FeedList, proto::VarType::FEED_LIST);
REG_PROTO_VAR_TYPE_TRAIT(FetchList, proto::VarType::FETCH_LIST);
REG_PROTO_VAR_TYPE_TRAIT(int, proto::VarType::INT32);
REG_PROTO_VAR_TYPE_TRAIT(float, proto::VarType::FP32);
REG_PROTO_VAR_TYPE_TRAIT(Vocab, proto::VarType::VOCAB);
REG_PROTO_VAR_TYPE_TRAIT(String, proto::VarType::STRING);
REG_PROTO_VAR_TYPE_TRAIT(Strings, proto::VarType::STRINGS);

3.4 LoDTensor这里命名空间为paddle::framework,注意与之前paddle_infer::Tensor区分开。LoDTensor的父类为paddle::framework::Tensor(framework/tensor.h),Resize操作也是直接使用父类函数

Tensor& Tensor::Resize(const DDim& dims) {
  dims_ = dims;
  return *this;
}

4. ZeroCopyTensor::CopyFromCpu 

这一步真正进行内存拷贝。我们分4步详细介绍

template <typename T>
void Tensor::CopyFromCpu(const T *data) {
  1.EAGER_GET_TENSOR(paddle::framework::LoDTensor);
  PADDLE_ENFORCE_GE(tensor->numel(), 0,
                    paddle::platform::errors::PreconditionNotMet(
                        "You should call Tensor::Reshape(const "
                        "std::vector<int> &shape)"
                        "function before copying data from cpu."));
  2.size_t ele_size = tensor->numel() * sizeof(T);

  3.if (place_ == PlaceType::kCPU) {
    auto *t_data = tensor->mutable_data<T>(paddle::platform::CPUPlace());
    std::memcpy(static_cast<void *>(t_data), data, ele_size);
  4.} else if (place_ == PlaceType::kGPU) {
#if defined(PADDLE_WITH_CUDA) || defined(PADDLE_WITH_HIP)
    paddle::platform::DeviceContextPool &pool =
        paddle::platform::DeviceContextPool::Instance();
    paddle::platform::CUDAPlace gpu_place(device_);
    auto *t_data = tensor->mutable_data<T>(gpu_place);
    auto *dev_ctx = static_cast<const paddle::platform::CUDADeviceContext *>(
        pool.Get(gpu_place));

    paddle::memory::Copy(gpu_place, static_cast<void *>(t_data),
                         paddle::platform::CPUPlace(), data, ele_size,
                         dev_ctx->stream());
#else
    PADDLE_THROW(paddle::platform::errors::Unavailable(
        "Can not create tensor with CUDA place because paddle is not compiled "
        "with CUDA."));
#endif
  } else if (place_ == PlaceType::kXPU) {
   ...// 昆仑xpu相关
  } else if (place_ == PlaceType::kNPU) {
   ...// 华为昇腾相关
  } else {
    PADDLE_THROW(paddle::platform::errors::InvalidArgument(
        "The analysis predictor supports CPU, GPU, NPU and XPU now."));
  }
}

4.1 取出scope对应var中创建的LoDTensor指针,赋值给tensor_

1调用入口
EAGER_GET_TENSOR(paddle::framework::LoDTensor);
2 调用FindTensor获取指针
#define EAGER_GET_TENSOR(tensor_type)    \
  if (!tensor_) {                        \
    tensor_ = FindTensor<tensor_type>(); \
  }                                      \
  auto *tensor = static_cast<tensor_type *>(tensor_);
3 实际逻辑,在scope对应var中使用GetMutable,由于Reshape时已经调用该接口进行了创建,而且本地调用类型与创建类型一致,会直接获取之前创建的LoDTensor对象指针。
template <typename T>
void *Tensor::FindTensor() const {
  PADDLE_ENFORCE_EQ(
      name_.empty(), false,
      paddle::platform::errors::PreconditionNotMet(
          "Need to SetName first, so that the corresponding tensor can "
          "be retrieved."));
  auto *scope = static_cast<paddle::framework::Scope *>(scope_);
  auto *var = scope->FindVar(name_);
  PADDLE_ENFORCE_NOT_NULL(
      var, paddle::platform::errors::PreconditionNotMet(
               "No tensor called [%s] in the runtime scope", name_));
  auto *tensor = var->GetMutable<T>();
  return tensor;
}

4.2 

标签:inference,tensor,PaddlePaddle,paddle,platform,源码,place,VarType,scope
来源: https://www.cnblogs.com/zl1991/p/15728406.html

本站声明: 1. iCode9 技术分享网(下文简称本站)提供的所有内容,仅供技术学习、探讨和分享;
2. 关于本站的所有留言、评论、转载及引用,纯属内容发起人的个人观点,与本站观点和立场无关;
3. 关于本站的所有言论和文字,纯属内容发起人的个人观点,与本站观点和立场无关;
4. 本站文章均是网友提供,不完全保证技术分享内容的完整性、准确性、时效性、风险性和版权归属;如您发现该文章侵犯了您的权益,可联系我们第一时间进行删除;
5. 本站为非盈利性的个人网站,所有内容不会用来进行牟利,也不会利用任何形式的广告来间接获益,纯粹是为了广大技术爱好者提供技术内容和技术思想的分享性交流网站。

专注分享技术,共同学习,共同进步。侵权联系[81616952@qq.com]

Copyright (C)ICode9.com, All Rights Reserved.

ICode9版权所有