A Deep Dive into Android System.loadLibrary

2022-01-04 09:02:00


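Background:

Starting with Android 6.0 and AGP 3.6.0, the system can load uncompressed so files directly from the apk. In other words, when android:extractNativeLibs is set to false, the system no longer extracts the so files from the apk at install time; it loads them straight from the apk when they are needed.
For details, see: https://developer.android.com/guide/topics/manifest/application-element#extractNativeLibs
However, programmers familiar with glibc know that the dlopen family of functions does not support this, so Android must have extended the so-loading capability of its libc (bionic). This article explains that extension and, along the way, takes a deep dive into the whole so-loading process: it starts from the Java code, descends into the libart source, then works through the bionic source, all the way down to the Linux syscall level.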

The source code discussed here is based on Android 11.0.0_r46.

Test demo: https://github.com/huchao/MySystemLoadLibrary

The source analysis follows.

I. libcore (Java)

We start from the Java code, which is implemented in libcore.

1. System.loadLibrary

Source: libcore/ojluni/src/main/java/java/lang/System.java

public static void loadLibrary(String libname) {
    Runtime.getRuntime().loadLibrary0(Reflection.getCallerClass(), libname);
}

As in the OpenJDK source, System.loadLibrary forwards to Runtime.getRuntime().loadLibrary0.

2. Runtime.loadLibrary0

Source: libcore/ojluni/src/main/java/java/lang/Runtime.java

private synchronized void loadLibrary0(ClassLoader loader, Class<?> callerClass, String libname) {
    if (libname.indexOf((int)File.separatorChar) != -1) {
        throw new UnsatisfiedLinkError(
"Directory separator should not appear in library name: " + libname);
    }
    String libraryName = libname;
    // Android-note: BootClassLoader doesn't implement findLibrary(). http://b/111850480
    // Android's class.getClassLoader() can return BootClassLoader where the RI would
    // have returned null; therefore we treat BootClassLoader the same as null here.
    if (loader != null && !(loader instanceof BootClassLoader)) {
        String filename = loader.findLibrary(libraryName);
        if (filename == null &&
                (loader.getClass() == PathClassLoader.class ||
                    loader.getClass() == DelegateLastClassLoader.class)) {
            // Don't give up even if we failed to find the library in the native lib paths.
            // The underlying dynamic linker might be able to find the lib in one of the linker
            // namespaces associated with the current linker namespace. In order to give the
            // dynamic linker a chance, proceed to load the library with its soname, which
            // is the fileName.
            // Note that we do this only for PathClassLoader  and DelegateLastClassLoader to
            // minimize the scope of this behavioral change as much as possible, which might
            // cause problem like b/143649498. These two class loaders are the only
            // platform-provided class loaders that can load apps. See the classLoader attribute
            // of the application tag in app manifest.
            filename = System.mapLibraryName(libraryName);
        }
        if (filename == null) {
            // It's not necessarily true that the ClassLoader used
            // System.mapLibraryName, but the default setup does, and it's
            // misleading to say we didn't find "libMyLibrary.so" when we
            // actually searched for "liblibMyLibrary.so.so".
            throw new UnsatisfiedLinkError(loader + " couldn't find \"" +
                                            System.mapLibraryName(libraryName) + "\"");
        }
        String error = nativeLoad(filename, loader);
        if (error != null) {
            throw new UnsatisfiedLinkError(error);
        }
        return;
    }

    // We know some apps use mLibPaths directly, potentially assuming it's not null.
    // Initialize it here to make sure apps see a non-null value.
    getLibPaths();
    String filename = System.mapLibraryName(libraryName);
    String error = nativeLoad(filename, loader, callerClass);
    if (error != null) {
        throw new UnsatisfiedLinkError(error);
    }
}

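After passing through a few overloads, the call arrives at the loadLibrary0 method shown above. It first calls loader.findLibrary with libraryName to locate the path of the so file, then passes that path to nativeLoad to load the so into the process. nativeLoad is a native method.

2.1. DexPathList.findLibrary

Source: libcore/dalvik/src/main/java/dalvik/system/DexPathList.java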

public String findLibrary(String libraryName) {
    String fileName = System.mapLibraryName(libraryName);

    for (NativeLibraryElement element : nativeLibraryPathElements) {
        String path = element.findNativeLibrary(fileName);

        if (path != null) {
            return path;
        }
    }

    return null;
}
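  1. Before stepping into nativeLoad, take a short detour into the loader.findLibrary call mentioned above to see how the system turns libraryName into the full path of the library to load (in this example, how the string "mytest" is resolved to the string "/data/app/~~fZ_4-EauEUei3h26P_847A==/com.huchao.mysystemloadlibrary-iVnUuIYOZ3V4zuSqrYZ5uw==/base.apk!/lib/arm64-v8a/libmytest.so").
  2. loader.findLibrary is ultimately implemented by DexPathList.findLibrary. Here libraryName is "mytest"; System.mapLibraryName converts it to the fileName "libmytest.so" (System.mapLibraryName calls the native System_mapLibraryName, which is really just a string format operation).
  3. Next comes a for loop over nativeLibraryPathElements, asking each element to look up fileName. nativeLibraryPathElements is the collection of native library search paths. It is initialized at app startup, before Application.attachBaseContext, and contains, in order, nativeLibraryDirectories (the app's own paths) and systemNativeLibraryDirectories (system paths). In this example its value is: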
"directory "/data/app/~~pgatC4H9zh6_9M5Okay-PA==/com.huchao.mysystemloadlibrary-R-Bnf1LWAGqkpWIltJG6_w==/lib/arm64""
"zip file "/data/app/~~pgatC4H9zh6_9M5Okay-PA==/com.huchao.mysystemloadlibrary-R-Bnf1LWAGqkpWIltJG6_w==/base.apk", dir "lib/arm64-v8a""
"directory "/system/lib64""
"directory "/system/system_ext/lib64""
"directory "/system/product/lib64""
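  4. This shows that the so lookup searches the app's paths first and the system paths second. The "zip file" prefix also shows that loading a so directly from the zip file base.apk is supported.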
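
The lookup order described above can be sketched in C++ (the language of the native code examined later in this article). This is only an illustration, not the libcore implementation: MapLibraryName, FindLibrary, and the exists callback are hypothetical names, and the sketch handles plain directories only.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-in for System.mapLibraryName on Android:
// "mytest" -> "libmytest.so".
std::string MapLibraryName(const std::string& library_name) {
  assert(!library_name.empty());
  return "lib" + library_name + ".so";
}

// Walks the search directories in order and returns the first hit, or an
// empty string when nothing matches. `exists` abstracts the file-system
// probe so the sketch does not depend on a real device layout.
std::string FindLibrary(const std::string& library_name,
                        const std::vector<std::string>& search_dirs,
                        const std::function<bool(const std::string&)>& exists) {
  const std::string file_name = MapLibraryName(library_name);
  for (const std::string& dir : search_dirs) {
    const std::string candidate = dir + "/" + file_name;
    if (exists(candidate)) {
      return candidate;  // Earlier entries (the app's paths) win.
    }
  }
  return std::string();
}
```

Because the app's directories are listed before the system ones, a library that exists in both places resolves to the app's copy.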

2.2. DexPathList.NativeLibraryElement.findNativeLibrary

Source: libcore/dalvik/src/main/java/dalvik/system/DexPathList.java

public String findNativeLibrary(String name) {
    maybeInit();

    if (zipDir == null) {
        String entryPath = new File(path, name).getPath();
        if (IoUtils.canOpenReadOnly(entryPath)) {
            return entryPath;
        }
    } else if (urlHandler != null) {
        // Having a urlHandler means the element has a zip file.
        // In this case Android supports loading the library iff
        // it is stored in the zip uncompressed.
        String entryName = zipDir + '/' + name;
        if (urlHandler.isEntryStored(entryName)) {
            return path.getPath() + zipSeparator + entryName;
        }
    }

    return null;
}
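  1. Next, look at element.findNativeLibrary. It first checks whether zipDir is null. zipDir indicates whether the so must be looked up inside a zip file, that is, inside the apk; it corresponds to the "zip file" entry in nativeLibraryPathElements above.
  2. When zipDir == null, no apk lookup is needed: the path is assembled and IoUtils.canOpenReadOnly checks whether the so file can be opened read-only. If so, the full path is returned, for example: /data/app/~~pgatC4H9zh6_9M5Okay-PA==/com.huchao.mysystemloadlibrary-R-Bnf1LWAGqkpWIltJG6_w==/lib/arm64/libmytest.so
  3. When zipDir != null, the so must be looked up in the apk: the entry name is assembled and urlHandler.isEntryStored checks whether the so inside the apk is usable. If so, the path is returned, for example: /data/app/~~pgatC4H9zh6_9M5Okay-PA==/com.huchao.mysystemloadlibrary-R-Bnf1LWAGqkpWIltJG6_w==/base.apk!/lib/arm64-v8a/libmytest.so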
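
The two branches can be sketched as follows. This is a minimal sketch under stated assumptions: PathElement, Probe, and FindNativeLibrarySketch are illustrative names, and the two callbacks stand in for IoUtils.canOpenReadOnly and urlHandler.isEntryStored so that the code runs without a device.

```cpp
#include <cassert>
#include <functional>
#include <string>

// Simplified stand-in for a search-path element.
struct PathElement {
  std::string path;     // a directory, or the path of the apk/zip file
  std::string zip_dir;  // directory inside the zip; empty for a plain directory
};

typedef std::function<bool(const std::string&)> Probe;

// Returns the path to hand to the loader, or an empty string if this
// element cannot provide `name`. `can_open` and `is_stored` replace the
// file-system and zip checks so the sketch runs anywhere.
std::string FindNativeLibrarySketch(const PathElement& element, const std::string& name,
                                    const Probe& can_open, const Probe& is_stored) {
  assert(!name.empty());
  if (element.zip_dir.empty()) {
    const std::string entry_path = element.path + "/" + name;
    return can_open(entry_path) ? entry_path : std::string();
  }
  const std::string entry_name = element.zip_dir + "/" + name;
  // "!/" separates the zip file from the entry inside it.
  return is_stored(entry_name) ? element.path + "!/" + entry_name : std::string();
}
```

The "!/" in the returned string is what later tells the dynamic linker to look inside the apk instead of opening a regular file.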

2.3. ClassPathURLStreamHandler.isEntryStored

Source: libcore/luni/src/main/java/libcore/io/ClassPathURLStreamHandler.java

public boolean isEntryStored(String entryName) {
  ZipEntry entry = jarFile.getEntry(entryName);
  return entry != null && entry.getMethod() == ZipEntry.STORED;
}
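  1. The isEntryStored call above lands in ClassPathURLStreamHandler.isEntryStored, which uses jarFile to check whether entryName exists. In this example entryName is lib/arm64-v8a/libmytest.so
  2. If the entry exists and its compression method is ZipEntry.STORED, the method returns true, meaning the so was found. ZipEntry defines two compression methods: STORED (stored without compression) and DEFLATED (Deflate compression).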
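
To make the STORED check concrete, here is a rough C++ sketch that inspects the compression method in a ZIP local file header. It is not what JarFile does internally (the function and constant names are made up for illustration); it only shows where the method field lives in the ZIP format: 0 means stored, 8 means deflated.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Compression method value for "stored" in the ZIP format (8 is deflate).
const uint16_t kMethodStored = 0;

// Reads a little-endian 16-bit value at `pos`.
uint16_t ReadLe16(const std::vector<uint8_t>& buf, size_t pos) {
  assert(pos + 2 <= buf.size());
  return static_cast<uint16_t>(buf[pos] | (buf[pos + 1] << 8));
}

// Returns true when `buf` begins with a local file header ("PK\3\4")
// whose compression method is STORED.
bool IsLocalEntryStored(const std::vector<uint8_t>& buf) {
  if (buf.size() < 30) return false;  // the fixed part of the header is 30 bytes
  const bool has_signature =
      buf[0] == 'P' && buf[1] == 'K' && buf[2] == 3 && buf[3] == 4;
  if (!has_signature) return false;
  return ReadLe16(buf, 8) == kMethodStored;  // the method field is at offset 8
}
```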

II. libcore (Native)

We have now reached libcore's native code.

1. Runtime_nativeLoad

Source: libcore/ojluni/src/main/native/Runtime.c

JNIEXPORT jstring JNICALL
Runtime_nativeLoad(JNIEnv* env, jclass ignored, jstring javaFilename,
                   jobject javaLoader, jclass caller)
{
    return JVM_NativeLoad(env, javaFilename, javaLoader, caller);
}

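Picking up from Runtime.loadLibrary0: once the path lookup is done, the JNI call to nativeLoad ends up in Runtime_nativeLoad, which in turn forwards to JVM_NativeLoad in libart.

III. libart (Native)

We have now reached the libart code.

1. JVM_NativeLoad

Source: art/openjdkjvm/OpenjdkJvm.cc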

JNIEXPORT jstring JVM_NativeLoad(JNIEnv* env,
                                 jstring javaFilename,
                                 jobject javaLoader,
                                 jclass caller) {
  ScopedUtfChars filename(env, javaFilename);
  if (filename.c_str() == nullptr) {
    return nullptr;
  }

  std::string error_msg;
  {
    art::JavaVMExt* vm = art::Runtime::Current()->GetJavaVM();
    bool success = vm->LoadNativeLibrary(env,
                                         filename.c_str(),
                                         javaLoader,
                                         caller,
                                         &error_msg);
    if (success) {
      return nullptr;
    }
  }

  // Don't let a pending exception from JNI_OnLoad cause a CheckJNI issue with NewStringUTF.
  env->ExceptionClear();
  return env->NewStringUTF(error_msg.c_str());
}

After checking its arguments, it forwards to JavaVMExt::LoadNativeLibrary.

2. JavaVMExt::LoadNativeLibrary

Source: art/runtime/jni/java_vm_ext.cc

bool JavaVMExt::LoadNativeLibrary(JNIEnv* env,
                                  const std::string& path,
                                  jobject class_loader,
                                  jclass caller_class,
                                  std::string* error_msg) {
  ......

  void* handle = android::OpenNativeLibrary(
      env,
      runtime_->GetTargetSdkVersion(),
      path_str,
      class_loader,
      (caller_location.empty() ? nullptr : caller_location.c_str()),
      library_path.get(),
      &needs_native_bridge,
      &nativeloader_error_msg);

  ......
}

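The JavaVMExt::LoadNativeLibrary excerpt omits the non-essential code (including the initial check: if the so has already been loaded and is usable, the function returns true right away). On a first load it forwards to android::OpenNativeLibrary. The returned handle is an opaque handle to the loaded so, analogous to the return value of dlopen.

3. OpenNativeLibrary

Source: art/libnativeloader/native_loader.cpp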

void* OpenNativeLibrary(JNIEnv* env, int32_t target_sdk_version, const char* path,
                        jobject class_loader, const char* caller_location, jstring library_path,
                        bool* needs_native_bridge, char** error_msg) {
  ......

  return OpenNativeLibraryInNamespace(ns, path, needs_native_bridge, error_msg);
}

The OpenNativeLibrary excerpt omits the non-essential code; the function then forwards to OpenNativeLibraryInNamespace.

4. OpenNativeLibraryInNamespace

Source: art/libnativeloader/native_loader.cpp

void* OpenNativeLibraryInNamespace(NativeLoaderNamespace* ns, const char* path,
                                   bool* needs_native_bridge, char** error_msg) {
  ......                                     
  auto handle = ns->Load(path);
  ......
}

The OpenNativeLibraryInNamespace excerpt omits the non-essential code; the function then forwards to NativeLoaderNamespace::Load.

5. NativeLoaderNamespace::Load

Source: art/libnativeloader/native_loader_namespace.cpp

Result<void*> NativeLoaderNamespace::Load(const char* lib_name) const {
  if (!IsBridged()) {
    android_dlextinfo extinfo;
    extinfo.flags = ANDROID_DLEXT_USE_NAMESPACE;
    extinfo.library_namespace = this->ToRawAndroidNamespace();
    void* handle = android_dlopen_ext(lib_name, RTLD_NOW, &extinfo);
    if (handle != nullptr) {
      return handle;
    }
  } else {
    void* handle =
        NativeBridgeLoadLibraryExt(lib_name, RTLD_NOW, this->ToRawNativeBridgeNamespace());
    if (handle != nullptr) {
      return handle;
    }
  }
  return Error() << GetLinkerError(IsBridged());
}

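The Bridged case in the else branch is out of scope here. NativeLoaderNamespace::Load finally calls android_dlopen_ext to load the requested so, passing the RTLD_NOW flag so that symbols are resolved immediately rather than lazily. android_dlopen_ext is Android's extended version of dlopen. At this point it is clear that Android's System.loadLibrary relies on android_dlopen_ext to load the so, not on dlopen as OpenJDK does (for a source-level analysis of OpenJDK's System.loadLibrary, see the references at the bottom).

IV. libdl (bionic)

We have now reached libdl, bionic's dynamic linking library.

1. android_dlopen_ext

Source: bionic/libdl/libdl.cpp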

void* android_dlopen_ext(const char* filename, int flag, const android_dlextinfo* extinfo) {
  const void* caller_addr = __builtin_return_address(0);
  return __loader_android_dlopen_ext(filename, flag, extinfo, caller_addr);
}

android_dlopen_ext forwards directly to the internal __loader_android_dlopen_ext.

2. __loader_android_dlopen_ext

Source: bionic/linker/dlfcn.cpp

void* __loader_android_dlopen_ext(const char* filename,
                           int flags,
                           const android_dlextinfo* extinfo,
                           const void* caller_addr) {
  return dlopen_ext(filename, flags, extinfo, caller_addr);
}

__loader_android_dlopen_ext forwards directly to dlopen_ext.

3. dlopen_ext

Source: bionic/linker/dlfcn.cpp

static void* dlopen_ext(const char* filename,
                        int flags,
                        const android_dlextinfo* extinfo,
                        const void* caller_addr) {
  ScopedPthreadMutexLocker locker(&g_dl_mutex);
  g_linker_logger.ResetState();
  void* result = do_dlopen(filename, flags, extinfo, caller_addr);
  if (result == nullptr) {
    __bionic_format_dlerror("dlopen failed", linker_get_error_buffer());
    return nullptr;
  }
  return result;
}

After taking care of thread synchronization, dlopen_ext forwards to do_dlopen.

4. do_dlopen

Source: bionic/linker/linker.cpp

void* do_dlopen(const char* name, int flags,
                const android_dlextinfo* extinfo,
                const void* caller_addr) {
  ......
  soinfo* si = find_library(ns, translated_name, flags, extinfo, caller);
  ......
}

After some argument processing, logging, and tracing, do_dlopen forwards to find_library.

5. find_library

Source: bionic/linker/linker.cpp

static soinfo* find_library(android_namespace_t* ns,
                            const char* name, int rtld_flags,
                            const android_dlextinfo* extinfo,
                            soinfo* needed_by) {

  soinfo* si = nullptr;

  if (name == nullptr) {
    si = solist_get_somain();
  } else if (!find_libraries(ns,
                             needed_by,
                             &name,
                             1,
                             &si,
                             nullptr,
                             0,
                             rtld_flags,
                             extinfo,
                             false /* add_as_children */,
                             true /* search_linked_namespaces */)) {
    if (si != nullptr) {
      soinfo_unload(si);
    }
    return nullptr;
  }

  si->increment_ref_count();

  return si;
}

Here name is not nullptr, so the function calls find_libraries right away and, on success, increments the reference count. Next we dig into the core function find_libraries.

6. find_libraries

Source: bionic/linker/linker.cpp

bool find_libraries(android_namespace_t* ns,
                    soinfo* start_with,
                    const char* const library_names[],
                    size_t library_names_count,
                    soinfo* soinfos[],
                    std::vector<soinfo*>* ld_preloads,
                    size_t ld_preloads_count,
                    int rtld_flags,
                    const android_dlextinfo* extinfo,
                    bool add_as_children,
                    bool search_linked_namespaces,
                    std::vector<android_namespace_t*>* namespaces) {

  // Step 0: prepare.
  std::unordered_map<const soinfo*, ElfReader> readers_map;
  LoadTaskList load_tasks;

  for (size_t i = 0; i < library_names_count; ++i) {
    const char* name = library_names[i];
    load_tasks.push_back(LoadTask::create(name, start_with, ns, &readers_map));
  }

  // If soinfos array is null allocate one on stack.
  // The array is needed in case of failure; for example
  // when library_names[] = {libone.so, libtwo.so} and libone.so
  // is loaded correctly but libtwo.so failed for some reason.
  // In this case libone.so should be unloaded on return.
  // See also implementation of failure_guard below.

  if (soinfos == nullptr) {
    size_t soinfos_size = sizeof(soinfo*)*library_names_count;
    soinfos = reinterpret_cast<soinfo**>(alloca(soinfos_size));
    memset(soinfos, 0, soinfos_size);
  }

  // list of libraries to link - see step 2.
  size_t soinfos_count = 0;

  auto scope_guard = android::base::make_scope_guard([&]() {
    for (LoadTask* t : load_tasks) {
      LoadTask::deleter(t);
    }
  });

  ZipArchiveCache zip_archive_cache;

  // Step 1: expand the list of load_tasks to include
  // all DT_NEEDED libraries (do not load them just yet)
  for (size_t i = 0; i<load_tasks.size(); ++i) {
    LoadTask* task = load_tasks[i];
    soinfo* needed_by = task->get_needed_by();

    bool is_dt_needed = needed_by != nullptr && (needed_by != start_with || add_as_children);
    task->set_extinfo(is_dt_needed ? nullptr : extinfo);
    task->set_dt_needed(is_dt_needed);

    LD_LOG(kLogDlopen, "find_libraries(ns=%s): task=%s, is_dt_needed=%d", ns->get_name(),
           task->get_name(), is_dt_needed);

    // Note: start from the namespace that is stored in the LoadTask. This namespace
    // is different from the current namespace when the LoadTask is for a transitive
    // dependency and the lib that created the LoadTask is not found in the
    // current namespace but in one of the linked namespace.
    if (!find_library_internal(const_cast<android_namespace_t*>(task->get_start_from()),
                               task,
                               &zip_archive_cache,
                               &load_tasks,
                               rtld_flags,
                               search_linked_namespaces || is_dt_needed)) {
      return false;
    }

    soinfo* si = task->get_soinfo();

    if (is_dt_needed) {
      needed_by->add_child(si);
    }

    // When ld_preloads is not null, the first
    // ld_preloads_count libs are in fact ld_preloads.
    if (ld_preloads != nullptr && soinfos_count < ld_preloads_count) {
      ld_preloads->push_back(si);
    }

    if (soinfos_count < library_names_count) {
      soinfos[soinfos_count++] = si;
    }
  }

  // Step 2: Load libraries in random order (see b/24047022)
  LoadTaskList load_list;
  for (auto&& task : load_tasks) {
    soinfo* si = task->get_soinfo();
    auto pred = [&](const LoadTask* t) {
      return t->get_soinfo() == si;
    };

    if (!si->is_linked() &&
        std::find_if(load_list.begin(), load_list.end(), pred) == load_list.end() ) {
      load_list.push_back(task);
    }
  }
  bool reserved_address_recursive = false;
  if (extinfo) {
    reserved_address_recursive = extinfo->flags & ANDROID_DLEXT_RESERVED_ADDRESS_RECURSIVE;
  }
  if (!reserved_address_recursive) {
    // Shuffle the load order in the normal case, but not if we are loading all
    // the libraries to a reserved address range.
    shuffle(&load_list);
  }

  // Set up address space parameters.
  address_space_params extinfo_params, default_params;
  size_t relro_fd_offset = 0;
  if (extinfo) {
    if (extinfo->flags & ANDROID_DLEXT_RESERVED_ADDRESS) {
      extinfo_params.start_addr = extinfo->reserved_addr;
      extinfo_params.reserved_size = extinfo->reserved_size;
      extinfo_params.must_use_address = true;
    } else if (extinfo->flags & ANDROID_DLEXT_RESERVED_ADDRESS_HINT) {
      extinfo_params.start_addr = extinfo->reserved_addr;
      extinfo_params.reserved_size = extinfo->reserved_size;
    }
  }

  for (auto&& task : load_list) {
    address_space_params* address_space =
        (reserved_address_recursive || !task->is_dt_needed()) ? &extinfo_params : &default_params;
    if (!task->load(address_space)) {
      return false;
    }
  }

  // Step 3: pre-link all DT_NEEDED libraries in breadth first order.
  ......

  // Step 4: Construct the global group. Note: DF_1_GLOBAL bit of a library is
  // determined at step 3.

  // Step 4-1: DF_1_GLOBAL bit is force set for LD_PRELOADed libs because they
  // must be added to the global group
  ......

  // Step 4-2: Gather all DF_1_GLOBAL libs which were newly loaded during this
  // run. These will be the new member of the global group
  ......

  // Step 4-3: Add the new global group members to all the linked namespaces
  ......

  // Step 5: Collect roots of local_groups.
  // Whenever needed_by->si link crosses a namespace boundary it forms its own local_group.
  // Here we collect new roots to link them separately later on. Note that we need to avoid
  // collecting duplicates. Also the order is important. They need to be linked in the same
  // BFS order we link individual libraries.
  ......

  // Step 6: Link all local groups
  ......

  // Step 7: Mark all load_tasks as linked and increment refcounts
  // for references between load_groups (at this point it does not matter if
  // referenced load_groups were loaded by previous dlopen or as part of this
  // one on step 6)
  ......
  return true;
}

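The whole find_libraries function is divided into 8 steps (Step 0 through Step 7):

  1. (Step 0: prepare.) In this example System.loadLibrary("mytest") is called, so library_names_count is 1. This step wraps each so to be loaded in a LoadTask and stores it in the container named load_tasks. A few key LoadTask members deserve a mention:
  • LoadTask.name_ is the full path of the so this task has to load
  • LoadTask.file_offset_ is the offset of the so within the file it is loaded from, such as an apk (when the so is loaded directly from an apk, this field is > 0)
  • LoadTask.is_dt_needed_ indicates whether the so is loaded as a dependency. libmytest.so is the so being loaded explicitly, so the field is false for it; libmytest.so depends on libc.so, so the field is true when libc.so is loaded.
    This step also prepares for atomicity when several so files are loaded together: if 2 so files are requested and the second fails to load, the first one must be unloaded as well.
  2. (Step 1: expand the list of load_tasks to include all DT_NEEDED libraries (do not load them just yet)) Expand the dependencies of the so, and their dependencies in turn; the so dependencies form a tree. This step is mainly a for loop. From the previous step we know load_tasks.size() is 1, so at first glance the loop body would run only once, which is clearly not enough to expand all the dependencies of libmytest.so. The catch is that the loop body passes the address of load_tasks to find_library_internal, which can append the dependencies to load_tasks while the loop is running; because the loop condition re-reads load_tasks.size() on every iteration, the loop keeps going. The net effect of this step is to add the whole dependency tree to load_tasks.
  • To confirm this, take a detour into find_library_internal. It first calls find_loaded_library_by_soname with the task's name (here the full path of the so) to check whether the so is already loaded. In this example we assume System.loadLibrary("mytest") is being called for the first time, so libmytest.so is certainly not loaded yet (in later loop iterations, common libraries such as libdl and libc are already loaded in the process, so find_library_internal returns true as soon as find_loaded_library_by_soname finds them). It then calls load_library with task and load_tasks as arguments.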
static bool find_library_internal(android_namespace_t* ns,
                                 LoadTask* task,
                                 ZipArchiveCache* zip_archive_cache,
                                 LoadTaskList* load_tasks,
                                 int rtld_flags,
                                 bool search_linked_namespaces) {
 soinfo* candidate;

 if (find_loaded_library_by_soname(ns, task->get_name(), search_linked_namespaces, &candidate)) {
   LD_LOG(kLogDlopen,
          "find_library_internal(ns=%s, task=%s): Already loaded (by soname): %s",
          ns->get_name(), task->get_name(), candidate->get_realpath());
   task->set_soinfo(candidate);
   return true;
 }

 // Library might still be loaded, the accurate detection
 // of this fact is done by load_library.
 TRACE("[ \"%s\" find_loaded_library_by_soname failed (*candidate=%s@%p). Trying harder... ]",
       task->get_name(), candidate == nullptr ? "n/a" : candidate->get_realpath(), candidate);

 if (load_library(ns, task, zip_archive_cache, load_tasks, rtld_flags, search_linked_namespaces)) {
   return true;
 }

 ......
 return false;
}
  • Continuing the detour into load_library: after some flag checks it calls open_library. If that succeeds (fd != -1), it forwards to the overloaded load_library, again passing task and load_tasks.
static bool load_library(android_namespace_t* ns,
                         LoadTask* task,
                         ZipArchiveCache* zip_archive_cache,
                         LoadTaskList* load_tasks,
                         int rtld_flags,
                         bool search_linked_namespaces) {

  const char* name = task->get_name();
  soinfo* needed_by = task->get_needed_by();

  ......

  // Open the file.
  int fd = open_library(ns, zip_archive_cache, name, needed_by, &file_offset, &realpath);
  if (fd == -1) {
    if (task->is_dt_needed()) {
      if (needed_by->is_main_executable()) {
        DL_OPEN_ERR("library \"%s\" not found: needed by main executable", name);
      } else {
        DL_OPEN_ERR("library \"%s\" not found: needed by %s in namespace %s", name,
                    needed_by->get_realpath(), task->get_start_from()->get_name());
      }
    } else {
      DL_OPEN_ERR("library \"%s\" not found", name);
    }
    return false;
  }

  task->set_fd(fd, true);
  task->set_file_offset(file_offset);
  return load_library(ns, task, load_tasks, rtld_flags, realpath, search_linked_namespaces);
}
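  • Skip open_library for now and assume it succeeds; we will come back to it. Going deeper into the overloaded load_library: it first calls soinfo_alloc to allocate the soinfo structure, then reads the ELF header and the segments needed for loading, and then calls for_each_dt_needed to add every DT_NEEDED entry to the load_tasks container. This confirms the guess above: if libmytest.so has dependencies, the for loop runs more than once, because the loop body appends items to load_tasks. We can go one step further and conclude that all the dependencies of a so form a dependency tree, which is traversed here breadth-first.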
static bool load_library(android_namespace_t* ns,
                         LoadTask* task,
                         LoadTaskList* load_tasks,
                         int rtld_flags,
                         const std::string& realpath,
                         bool search_linked_namespaces) {

  off64_t file_offset = task->get_file_offset();

  ......

  soinfo* si = soinfo_alloc(ns, realpath.c_str(), &file_stat, file_offset, rtld_flags);
  if (si == nullptr) {
    return false;
  }

  task->set_soinfo(si);

  // Read the ELF header and some of the segments.
  if (!task->read(realpath.c_str(), file_stat.st_size)) {
    soinfo_free(si);
    task->set_soinfo(nullptr);
    return false;
  }

  ......

  for_each_dt_needed(task->get_elf_reader(), [&](const char* name) {
    LD_LOG(kLogDlopen, "load_library(ns=%s, task=%s): Adding DT_NEEDED task: %s",
           ns->get_name(), task->get_name(), name);
    load_tasks->push_back(LoadTask::create(name, si, ns, task->get_readers_map()));
  });

  return true;
}
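
The growing-list traversal described above can be reproduced in a few lines. This is a sketch, not linker code: NeededMap and ExpandLoadTasks are hypothetical names, the DT_NEEDED data is supplied as a plain table, and the seen set is a simplification of how the linker recognizes libraries it already knows about.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical dependency table: library -> its DT_NEEDED entries.
typedef std::map<std::string, std::vector<std::string> > NeededMap;

// Expands `root` into the breadth-first list of everything it needs.
// The loop re-reads tasks.size() on each iteration, so entries appended in
// the body are visited later, just like load_tasks in find_libraries.
std::vector<std::string> ExpandLoadTasks(const std::string& root, const NeededMap& needed) {
  assert(!root.empty());
  std::vector<std::string> tasks(1, root);
  std::set<std::string> seen;
  seen.insert(root);
  for (size_t i = 0; i < tasks.size(); ++i) {
    const std::string current = tasks[i];  // copy: push_back may reallocate
    NeededMap::const_iterator it = needed.find(current);
    if (it == needed.end()) continue;  // leaf, or unknown library
    for (size_t j = 0; j < it->second.size(); ++j) {
      const std::string& dep = it->second[j];
      if (seen.insert(dep).second) {
        tasks.push_back(dep);  // visited by a later iteration of the outer loop
      }
    }
  }
  return tasks;
}
```

Appending to the end of the list while an index walks it from the front is exactly what makes the traversal breadth-first: direct dependencies are queued before any of their own dependencies.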
  • As an aside, readelf -d also lists the so files that a so depends on: readelf -d libmytest.so | grep NEEDED (for details, see the reference "Linux so analysis"). For example:
huchao@ubuntu:~$ readelf -d libmytest.so | grep NEEDED
 0x0000000000000001 (NEEDED)             Shared library: [liblog.so]
 0x0000000000000001 (NEEDED)             Shared library: [libandroid.so]
 0x0000000000000001 (NEEDED)             Shared library: [libm.so]
 0x0000000000000001 (NEEDED)             Shared library: [libdl.so]
 0x0000000000000001 (NEEDED)             Shared library: [libc.so]
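  • Now back to open_library, which we skipped earlier. In this example we are loading libmytest.so, whose path necessarily contains "/", so it is opened directly rather than searched for.
  • When the so is loaded straight from the apk, name looks like /data/app/~~WdKfQO1G6r3htDT7Rgo1DQ==/com.huchao.mysystemloadlibrary-6wbqoASC9saPFntEre3_MQ==/base.apk!/lib/arm64-v8a/libmytest.so. Because the path contains kZipFileSeparator (!/), the lookup goes into the apk file: open_library_in_zipfile is called and returns the fd of the opened apk file.
  • When the so is loaded from a local path, name looks like /data/app/~~bK-kneb_uAxNsrsy-CEmDw==/com.huchao.mysystemloadlibrary-wRz7Al17VLjWqWAtbV_l0A==/lib/arm64/libmytest.so; the Linux syscall open is called and returns the fd of the so file.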
static int open_library(android_namespace_t* ns,
                        ZipArchiveCache* zip_archive_cache,
                        const char* name, soinfo *needed_by,
                        off64_t* file_offset, std::string* realpath) {
  TRACE("[ opening %s from namespace %s ]", name, ns->get_name());

  // If the name contains a slash, we should attempt to open it directly and not search the paths.
  if (strchr(name, '/') != nullptr) {
    int fd = -1;

    if (strstr(name, kZipFileSeparator) != nullptr) {
      fd = open_library_in_zipfile(zip_archive_cache, name, file_offset, realpath);
    }

    if (fd == -1) {
      fd = TEMP_FAILURE_RETRY(open(name, O_RDONLY | O_CLOEXEC));
      if (fd != -1) {
        *file_offset = 0;
        if (!realpath_fd(fd, realpath)) {
          if (!is_first_stage_init()) {
            PRINT("warning: unable to get realpath for the library \"%s\". Will use given path.",
                  name);
          }
          *realpath = name;
        }
      }
    }

    return fd;
  }

  ......
}
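
Before the apk can be opened, a name of the form apk!/entry has to be split at the separator. The excerpt elides that part of the linker, so the following is only a sketch of one way to do it; SplitZipPath is a hypothetical helper, and only the "!/" separator itself comes from the source above.

```cpp
#include <cassert>
#include <string>

// Same separator the linker looks for in open_library.
const char kZipFileSeparator[] = "!/";

// Splits "<zip>!/<entry>" into its two halves. Returns false when the
// separator is absent, in which case the outputs are left untouched.
bool SplitZipPath(const std::string& input, std::string* zip_path, std::string* entry_path) {
  assert(zip_path != nullptr && entry_path != nullptr);
  const std::string::size_type pos = input.find(kZipFileSeparator);
  if (pos == std::string::npos) return false;
  *zip_path = input.substr(0, pos);
  *entry_path = input.substr(pos + 2);  // skip the two separator characters
  return true;
}
```

For the apk-style name above, the first half is the path passed to open and the second half is the entry looked up inside the archive.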
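  • Continuing with the apk case, look inside open_library_in_zipfile. The key line is *file_offset = entry.offset;. Once the entry for the so is found in the apk, and provided its compression method is kCompressStored and its offset is aligned to the memory page size (4096), file_offset is assigned and the fd is returned. These values are then set on the LoadTask and thus recorded in load_tasks for the later load. Note: up to this point, only the so dependency tree has been traversed; no so has been loaded yet.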
static int open_library_in_zipfile(ZipArchiveCache* zip_archive_cache,
                                   const char* const input_path,
                                   off64_t* file_offset, std::string* realpath) {
  ......

  int fd = TEMP_FAILURE_RETRY(open(zip_path, O_RDONLY | O_CLOEXEC));
  if (fd == -1) {
    return -1;
  }

  ZipArchiveHandle handle;
  if (!zip_archive_cache->get_or_open(zip_path, &handle)) {
    // invalid zip-file (?)
    close(fd);
    return -1;
  }

  ZipEntry entry;

  if (FindEntry(handle, file_path, &entry) != 0) {
    // Entry was not found.
    close(fd);
    return -1;
  }

  // Check if it is properly stored
  if (entry.method != kCompressStored || (entry.offset % PAGE_SIZE) != 0) {
    close(fd);
    return -1;
  }

  *file_offset = entry.offset;

  ......
  return fd;
}
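
The "properly stored" condition boils down to two checks, sketched below. EntryInfo and CanMapFromApk are illustrative names, not linker types; the page size is passed in explicitly so the example does not assume 4096.

```cpp
#include <cassert>
#include <cstdint>

// Minimal stand-in for the fields of a zip entry that matter here.
struct EntryInfo {
  uint16_t method;      // 0 = stored, 8 = deflated
  int64_t data_offset;  // byte offset of the entry's data inside the apk
};

// An entry can be mapped in place only if its bytes are stored as-is
// and start on a page boundary, because mmap file offsets must be
// multiples of the page size.
bool CanMapFromApk(const EntryInfo& entry, int64_t page_size) {
  assert(page_size > 0);
  return entry.method == 0 && entry.data_offset % page_size == 0;
}
```

Both conditions follow from the mmap-based loading shown later: compressed bytes cannot be executed in place, and a misaligned offset cannot be passed to mmap at all.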
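  3. (Step 2: Load libraries in random order (see b/24047022)) This step finally gets to the long-awaited loading of the so. Before digging into how it works, let's recap the basics of dlopen on GNU/Linux. man dlopen shows that dlopen lives in manual section 3, Library calls (functions within program libraries), which means that loading a so with dlopen is not a capability provided by the Linux kernel but something libc builds on top of Linux syscalls. In principle, all Linux executables and so files are in ELF format. Loading a so into a process mainly means mapping the segments defined by ELF into the process's virtual address space, where the kernel tracks each mapping with a struct vm_area_struct (the Linux kernel exposes the procfs pseudo file system to user space, so /proc/[pid]/maps shows the mappings of the loaded so files), and then performing relocations such as PLT/GOT. Going one step further, the usual way to associate a file with memory, and with a struct vm_area_struct in the kernel, is the Linux syscall mmap. Let's look for it in the source.
DLOPEN(3)                  Linux Programmer's Manual                  DLOPEN(3)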

NAME
       dlclose, dlopen, dlmopen - open and close a shared object
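
As a side illustration of the /proc/[pid]/maps view mentioned above, the sketch below parses one line of that file into its file offset and pathname. MapsEntry and ParseMapsLine are hypothetical names, and the sample line in the usage note is made up. For a so mapped straight out of an apk one would expect the pathname column to name the apk and the offset column to be non-zero, but that expectation is an assumption based on the mmap call examined later, not something the parser verifies.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// One parsed line of /proc/[pid]/maps.
struct MapsEntry {
  unsigned long long file_offset;  // offset column, parsed from hex
  std::string path;                // empty for anonymous mappings
};

// Line layout: address perms offset dev inode [pathname]
MapsEntry ParseMapsLine(const std::string& line) {
  assert(!line.empty());
  std::istringstream in(line);
  std::string address, perms, dev, inode;
  MapsEntry entry;
  entry.file_offset = 0;
  in >> address >> perms >> std::hex >> entry.file_offset >> dev >> inode;
  // The pathname is optional; skip the padding and take the rest of the line.
  std::getline(in >> std::ws, entry.path);
  return entry;
}
```

Feeding it a line such as "7b1c000000-7b1c001000 r-xp 00002000 fd:05 1234 /data/app/base.apk" yields the offset 0x2000 and the path /data/app/base.apk; reading /proc/self/maps line by line with std::getline and passing each line to ParseMapsLine gives the full picture for the current process.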
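  • This step uses a new container, load_list, in place of the earlier load_tasks. load_tasks holds libmytest.so and all of its dependencies, whereas load_list is a subset that contains only the so files that still need to be loaded, since, as mentioned above, some may already have been loaded. Once load_list is ready, the code iterates over load_list and calls LoadTask.load on each task to actually load the so.

  • LoadTask.load mainly forwards to ElfReader.Load through elf_reader.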

bool load(address_space_params* address_space) {
  ElfReader& elf_reader = get_elf_reader();
  if (!elf_reader.Load(address_space)) {
    return false;
  }

  ......
  return true;
}
  • Going deeper into ElfReader.Load, it in turn calls three functions: ReserveAddressSpace, LoadSegments, and FindPhdr. If all three succeed, the so has been mapped into memory successfully.
bool ElfReader::Load(address_space_params* address_space) {
  CHECK(did_read_);
  if (did_load_) {
    return true;
  }
  if (ReserveAddressSpace(address_space) && LoadSegments() && FindPhdr()) {
    did_load_ = true;
  }

  return did_load_;
}
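  • The comment on ElfReader::ReserveAddressSpace says that the function uses mmap to reserve an anonymous virtual address range large enough for the subsequent load. The actual mmap call happens in ReserveAligned, which also aligns the range to the Linux page boundary (4096).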
// Reserve a virtual address range big enough to hold all loadable
// segments of a program header table. This is done by creating a
// private anonymous mmap() with PROT_NONE.
bool ElfReader::ReserveAddressSpace(address_space_params* address_space) {
  ......
  ReserveAligned(load_size_, kLibraryAlignment);
  ......
}

// Reserve a virtual address range such that if it's limits were extended to the next 2**align
// boundary, it would not overlap with any existing mappings.
static void* ReserveAligned(size_t size, size_t align) {
  int mmap_flags = MAP_PRIVATE | MAP_ANONYMOUS;
  if (align == PAGE_SIZE) {
    void* mmap_ptr = mmap(nullptr, size, PROT_NONE, mmap_flags, -1, 0);
    if (mmap_ptr == MAP_FAILED) {
      return nullptr;
    }
    return mmap_ptr;
  }

  // Allocate enough space so that the end of the desired region aligned up is still inside the
  // mapping.
  size_t mmap_size = align_up(size, align) + align - PAGE_SIZE;
  uint8_t* mmap_ptr =
      reinterpret_cast<uint8_t*>(mmap(nullptr, mmap_size, PROT_NONE, mmap_flags, -1, 0));
  if (mmap_ptr == MAP_FAILED) {
    return nullptr;
  }

  ......
}
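
The over-allocate-then-trim idea behind an aligned reservation can be sketched like this. It is a simplified illustration, not the bionic function: ReserveAlignedSketch and AlignUp are names chosen here, and the part of ReserveAligned elided above is not reproduced. The sketch always keeps the first aligned block inside the oversized mapping.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <sys/mman.h>
#include <unistd.h>

// Rounds `value` up to a multiple of `align` (a power of two).
size_t AlignUp(size_t value, size_t align) {
  assert(align != 0 && (align & (align - 1)) == 0);
  return (value + align - 1) & ~(align - 1);
}

// Reserves `size` bytes of address space aligned to `align` without
// committing memory: the pages are PROT_NONE until something is mapped
// over them. Returns nullptr on failure.
void* ReserveAlignedSketch(size_t size, size_t align) {
  const size_t page = static_cast<size_t>(sysconf(_SC_PAGESIZE));
  if (align < page) align = page;
  size = AlignUp(size, page);
  // Over-allocate so that an aligned block of `size` bytes must fit inside.
  const size_t mmap_size = size + align - page;
  void* raw = mmap(nullptr, mmap_size, PROT_NONE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  if (raw == MAP_FAILED) return nullptr;
  uint8_t* start = static_cast<uint8_t*>(raw);
  uint8_t* aligned =
      reinterpret_cast<uint8_t*>(AlignUp(reinterpret_cast<uintptr_t>(start), align));
  // Trim the unused head and tail so only the aligned block stays reserved.
  if (aligned > start) munmap(start, static_cast<size_t>(aligned - start));
  uint8_t* end = start + mmap_size;
  if (end > aligned + size) {
    munmap(aligned + size, static_cast<size_t>(end - (aligned + size)));
  }
  return aligned;
}
```

Since the kernel returns a page-aligned address, at most align minus one page of slack is needed before the next aligned boundary, which is why the extra allocation is align - page rather than a full align.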

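  • With the virtual address range reserved, the real loading comes next, in ElfReader::LoadSegments. The function starts with a for loop; phdr_num_ is the number of program headers of the so (that is, the number of Elf64_Phdr structures), each of which describes one segment. Then we finally see the long-awaited mmap64, which maps the loadable segments of the so into virtual memory. A word on the last two arguments of mmap64: fd_ is the FD of the file opened earlier with open, which may be the so file or the apk file on disk. file_offset_ + file_page_start is the offset into that file. When fd_ refers to a so, which usually has several segments, file_offset_ is 0 and the offset is simply the page-aligned offset of the segment within the so file; when fd_ refers to an apk, the offset is the offset of the uncompressed so within the apk plus the offset of the segment within the so.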
bool ElfReader::LoadSegments() {
  for (size_t i = 0; i < phdr_num_; ++i) {
    ......
    {
      void* seg_addr = mmap64(reinterpret_cast<void*>(seg_page_start),
                            file_length,
                            prot,
                            MAP_FIXED|MAP_PRIVATE,
                            fd_,
                            file_offset_ + file_page_start);
      if (seg_addr == MAP_FAILED) {
        DL_ERR("couldn't map \"%s\" segment %zd: %s", name_.c_str(), i, strerror(errno));
        return false;
      }
    }

    ......
  }
  return true;
}
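
The reserve-then-map pattern can be tried out on any Linux system with an ordinary file standing in for the apk. The following is a minimal sketch under stated assumptions: MapSecondPageDemo is a made-up helper, the temporary file under /tmp plays the role of the apk, and its second page plays the role of a so stored at a page-aligned offset.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <string>
#include <sys/mman.h>
#include <unistd.h>

// Writes a two-page file whose second page starts with `payload`, then maps
// only that second page over a PROT_NONE reservation, the way a segment at
// a non-zero file offset is mapped. Returns the bytes read back through the
// mapping, or an empty string on any failure.
std::string MapSecondPageDemo(const std::string& payload) {
  const size_t page = static_cast<size_t>(sysconf(_SC_PAGESIZE));
  assert(!payload.empty() && payload.size() <= page);

  char path[] = "/tmp/mmap_offset_demo_XXXXXX";
  const int fd = mkstemp(path);
  if (fd == -1) return std::string();
  unlink(path);  // the file lives on until fd is closed

  std::string result;
  void* reserved = MAP_FAILED;
  if (ftruncate(fd, static_cast<off_t>(2 * page)) == 0 &&
      pwrite(fd, payload.data(), payload.size(), static_cast<off_t>(page)) ==
          static_cast<ssize_t>(payload.size())) {
    // Step 1: reserve address space only.
    reserved = mmap(nullptr, page, PROT_NONE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  }
  if (reserved != MAP_FAILED) {
    // Step 2: replace the reservation with file contents at offset `page`.
    void* seg = mmap(reserved, page, PROT_READ, MAP_FIXED | MAP_PRIVATE, fd,
                     static_cast<off_t>(page));
    if (seg != MAP_FAILED) {
      result.assign(static_cast<const char*>(seg), payload.size());
    }
    munmap(reserved, page);
  }
  close(fd);
  return result;
}
```

Nothing in the second mmap call cares whether the file is a so or an archive containing one; only the fd and the page-aligned offset matter, which is the property the apk case relies on.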
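  4. What follows is pre-linking, global group setup, and so on. For reasons of space this article does not go into them; interested readers can continue reading the source.

Summary

To wrap up, here is the overall flow of System.loadLibrary:

  1. The Java code in libcore provides the System.loadLibrary API and, after some simple wrapping, calls into libcore's native code through JNI. The Java side mostly contains policy logic, such as working out the so search paths
  2. The native code in libcore is again a thin wrapper that forwards to libart
  3. libart mainly serves the Java layer above it and then forwards to libdl. As the analysis shows, it still does not touch the core of so loading
  4. libdl brings us into bionic, which parses the so file format step by step and then mmaps the so, segment by segment, into the process's virtual address space. Only then does the flow end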

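There is plenty more to dig into further down, for example:

  • How is PLT/GOT relocation implemented?
  • After mmap traps into the kernel, how does the kernel manage each segment through struct vm_area_struct?
  • What exactly is the paging mechanism mentioned above? Why is the value always 4096, and can it be changed?

Finally, a word on the motivation. Starting with Android 6.0 and AGP 3.6.0, if so files are left uncompressed in the apk, the list of so files loaded by a running app can no longer be obtained by parsing /proc/[pid]/maps. Investigating this led to the uncompressed-so feature, and then to the question of how Linux supports loading a so directly from an apk. Reading the Java code and the libart code did not reveal the answer; it only became clear after digging into libc. In hindsight, Linux has supported this all along; it is just that dlopen as wrapped by glibc does not, and Android's bionic extends it.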

References:

An analysis of the OpenJDK System.loadLibrary source: https://blog.csdn.net/xt_xiaotian/article/details/122194883
Linux so analysis: https://blog.csdn.net/xt_xiaotian/article/details/116446531

Source: https://blog.csdn.net/xt_xiaotian/article/details/122296084
