`map_images` and `load_images` (Part 1)


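In the previous article we explored dyld's loading flow. While dyld initializes the dynamic libraries, `_objc_init` uses `_dyld_objc_notify_register()` to register `map_images` and `load_images` as callbacks, and dyld then calls them. In this article we look at what these two functions actually do.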

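In the previous article we also saw that `_objc_init` performs a number of other `init` steps when it runs.

```cpp
void _objc_init(void)
{
    // fixme defer initialization until an objc-using image is found?
    environ_init();   // read the environment variables
    tls_init();       // set up per-thread destructors (TLS: thread-local storage)
    static_init();    // run the C++ static constructors
    runtime_init();   // initialize the category table and the class table
    exception_init(); // initialize the exception handling system
#if __OBJC2__
    cache_t::init();  // initialize the cache
#endif
    _imp_implementationWithBlock_init(); // macOS-specific handling
    _dyld_objc_notify_register(&map_images, load_images, unmap_image);
}
```

Let's first look at what some of these initialization steps do.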

environ_init

```cpp
void environ_init(void)
{
    bool PrintHelp = false;
    bool PrintOptions = false;
    bool maybeMallocDebugging = false;

    for (char **p = *_NSGetEnviron(); *p != nil; p++) {
        // ...
        if (0 == strncmp(*p, "OBJC_HELP=", 10)) {
            PrintHelp = true;
            continue;
        }
        if (0 == strncmp(*p, "OBJC_PRINT_OPTIONS=", 19)) {
            PrintOptions = true;
            continue;
        }
        // ...
    }
    // ...

    // Print OBJC_HELP and OBJC_PRINT_OPTIONS output.
    if (PrintHelp  ||  PrintOptions) {
        if (PrintHelp) {
            _objc_inform("Objective-C runtime debugging. Set variable=YES to enable.");
            _objc_inform("OBJC_HELP: describe available environment variables");
            if (PrintOptions) {
                _objc_inform("OBJC_HELP is set");
            }
            _objc_inform("OBJC_PRINT_OPTIONS: list which options are set");
        }
        if (PrintOptions) {
            _objc_inform("OBJC_PRINT_OPTIONS is set");
        }

        for (size_t i = 0; i < sizeof(Settings)/sizeof(Settings[0]); i++) {
            const option_t *opt = &Settings[i];
            if (PrintHelp) _objc_inform("%s: %s", opt->env, opt->help);
            if (PrintOptions && opt->var) _objc_inform("%s is set", opt->env);
        }
    }
}
```

As we can see, this function handles the environment variables we configure. We can add variables under the project's `Scheme` → `Arguments` and check what is printed to the console.

[Screenshot: 20220513122934337.png]

[Screenshot: 20220513133434609.png]

The output lists the configuration of the runtime's environment variables. From this list we can pick the variable we want to change and set it in the Scheme. For example, to turn off the non-pointer isa optimization, set `OBJC_DISABLE_NONPOINTER_ISA` to `YES` in the Scheme. Other examples are `OBJC_PRINT_LOAD_METHODS`, which prints every class and category that implements `+load`, and `OBJC_PRINT_IMAGES`, which prints every image loaded by dyld.
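The scan above can be tried outside the runtime. The following is a minimal sketch, not the runtime's code: `Option` and `scanEnvironment` are hypothetical names, and it assumes (as the help text "Set variable=YES to enable" suggests) that an option is switched on only when its value is exactly `YES`.

```cpp
#include <cassert>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical stand-in for one entry of the runtime's option table.
struct Option {
    const char *env;   // variable name, e.g. "OBJC_PRINT_IMAGES"
    bool enabled;      // becomes true when the variable is set to "YES"
};

// Scan "NAME=VALUE" strings: skip everything without the OBJC_ prefix,
// then match the name against the table and enable it only for "YES".
inline void scanEnvironment(const std::vector<std::string> &env,
                            std::vector<Option> &options) {
    for (const std::string &entry : env) {
        const char *p = entry.c_str();
        if (0 != std::strncmp(p, "OBJC_", 5)) continue;   // not a runtime option
        const char *eq = std::strchr(p, '=');
        if (!eq) continue;                                 // malformed entry
        std::size_t nameLen = static_cast<std::size_t>(eq - p);
        for (Option &opt : options) {
            if (std::strlen(opt.env) == nameLen &&
                0 == std::strncmp(p, opt.env, nameLen)) {
                opt.enabled = (0 == std::strcmp(eq + 1, "YES"));
                break;
            }
        }
    }
}
```

Passing `{"OBJC_PRINT_LOAD_METHODS=YES", "OBJC_PRINT_IMAGES=NO"}` to this sketch enables only the first option, which matches how setting a variable to anything other than `YES` in the Scheme leaves it off.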

runtime_init

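```cpp
void runtime_init(void)
{
    objc::unattachedCategories.init(32); // initialize the category table
    objc::allocatedClasses.init();       // initialize the class table
}
```

This initializes the category table and the class table. When categories and classes are loaded later, their entries are inserted into these two tables.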

load_images

Let's look at its source first.

```cpp
void
load_images(const char *path __unused, const struct mach_header *mh)
{
    if (!didInitialAttachCategories && didCallDyldNotifyRegister) {
        didInitialAttachCategories = true;
        loadAllCategories();
    }

    // Return without taking locks if there are no +load methods here.
    if (!hasLoadMethods((const headerType *)mh)) return;

    recursive_mutex_locker_t lock(loadMethodLock);

    // Discover load methods
    {
        mutex_locker_t lock2(runtimeLock);
        prepare_load_methods((const headerType *)mh);
    }

    // Call +load methods (without runtimeLock - re-entrant)
    call_load_methods();
}
```

It is straightforward: load all categories, find the `+load` methods, then call every `+load` method. Let's first see how the `+load` methods are found.

```cpp
void prepare_load_methods(const headerType *mhdr)
{
    size_t count, i;

    runtimeLock.assertLocked();

    // Get all non-lazy classes
    classref_t const *classlist = _getObjc2NonlazyClassList(mhdr, &count);
    for (i = 0; i < count; i++) {
        schedule_class_load(remapClass(classlist[i]));
    }

    // Get all non-lazy categories
    category_t * const *categorylist = _getObjc2NonlazyCategoryList(mhdr, &count);
    for (i = 0; i < count; i++) {
        category_t *cat = categorylist[i];
        Class cls = remapClass(cat->cls);
        if (!cls) continue;  // category for ignored weak-linked class
        if (cls->isSwiftStable()) {
            _objc_fatal("Swift class extensions and categories on Swift "
                        "classes are not allowed to have +load methods");
        }
        realizeClassWithoutSwift(cls, nil);
        ASSERT(cls->ISA()->isRealized());
        add_category_to_loadable_list(cat);
    }
}
```

The function first fetches all non-lazy classes. Classes are either lazy or non-lazy, and any class or category that implements `+load` is non-lazy. For classes, `schedule_class_load` first calls itself recursively on the superclass and then uses `add_class_to_loadable_list` to append the class to `loadable_classes`. Each element of that list is a `loadable_class` struct that holds the class and the `IMP` of its `+load` method.

```cpp
void add_class_to_loadable_list(Class cls)
{
    IMP method;

    loadMethodLock.assertLocked();

    method = cls->getLoadMethod(); // get the +load implementation
    if (!method) return;  // Don't bother if cls has no +load method

    if (PrintLoading) {
        _objc_inform("LOAD: class '%s' scheduled for +load", cls->nameForLogging());
    }

    // Grow the array when it is full
    if (loadable_classes_used == loadable_classes_allocated) {
        loadable_classes_allocated = loadable_classes_allocated*2 + 16;
        loadable_classes = (struct loadable_class *)
            realloc(loadable_classes,
                    loadable_classes_allocated * sizeof(struct loadable_class));
    }

    // Store the class and its +load method
    loadable_classes[loadable_classes_used].cls = cls;
    loadable_classes[loadable_classes_used].method = method;
    loadable_classes_used++; // advance to the next free slot
}
```
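The superclass-first ordering and the array growth rule can be sketched in isolation. This is a simplified model, not the runtime's implementation: `MiniClass`, `scheduleClassLoad`, and `nextCapacity` are hypothetical names, and the `scheduled` flag is assumed to play the role of the "already queued" check that keeps a class from being added twice.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical miniature class record; the real runtime works on objc_class.
struct MiniClass {
    std::string name;
    MiniClass *superclass;
    bool hasLoad;     // does this class implement +load?
    bool scheduled;   // has this class already been visited?

    MiniClass(std::string n, MiniClass *s, bool l)
        : name(std::move(n)), superclass(s), hasLoad(l), scheduled(false) {}
};

// Visit the superclass chain first, then queue the class itself.
inline void scheduleClassLoad(MiniClass *cls, std::vector<std::string> &loadable) {
    if (!cls) return;
    if (cls->scheduled) return;                       // never queue a class twice
    scheduleClassLoad(cls->superclass, loadable);     // parents go first
    if (cls->hasLoad) loadable.push_back(cls->name);  // only classes with +load are stored
    cls->scheduled = true;
}

// Growth rule of the loadable_classes array: n -> n * 2 + 16.
inline int nextCapacity(int allocated) { return allocated * 2 + 16; }
```

With a chain `Root` ← `Mid` ← `Leaf` where only `Root` and `Leaf` implement `+load`, scheduling `Leaf` queues `Root` before `Leaf` and skips `Mid`. The capacity grows 0 → 16 → 48 → 112, so reallocation becomes rarer as the list fills.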
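Once the class `+load` methods have been collected, the `+load` methods of the non-lazy categories are collected. `add_category_to_loadable_list` appends each category to `loadable_categories`, where every element is a `loadable_category` struct holding the category and the `IMP` of its `+load` method. The difference from classes is that a class recursively schedules its superclasses, while a category only deals with its own class. Next, let's see how `call_load_methods()` executes the `+load` methods.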

```cpp
void call_load_methods(void)
{
    static bool loading = NO;
    bool more_categories;

    loadMethodLock.assertLocked();

    // Re-entrant calls do nothing; the outermost call will finish the job.
    if (loading) return;
    loading = YES;

    void *pool = objc_autoreleasePoolPush();

    do {
        // 1. Repeatedly call class +loads until there aren't any more
        while (loadable_classes_used > 0) {
            call_class_loads();
        }

        // 2. Call category +loads ONCE
        more_categories = call_category_loads();

        // 3. Run more +loads if there are classes OR more untried categories
    } while (loadable_classes_used > 0  ||  more_categories);

    objc_autoreleasePoolPop(pool);

    loading = NO;
}

static void call_class_loads(void)
{
    int i;

    // Detach current loadable list.
    struct loadable_class *classes = loadable_classes;
    int used = loadable_classes_used;
    loadable_classes = nil;
    loadable_classes_allocated = 0;
    loadable_classes_used = 0;

    // Call all +loads for the detached list.
    for (i = 0; i < used; i++) {
        Class cls = classes[i].cls;
        load_method_t load_method = (load_method_t)classes[i].method;
        if (!cls) continue;

        if (PrintLoading) {
            _objc_inform("LOAD: +[%s load]\n", cls->nameForLogging());
        }
        (*load_method)(cls, @selector(load));
    }

    // Destroy the detached list.
    if (classes) free(classes);
}

static bool call_category_loads(void)
{
    int i, shift;
    bool new_categories_added = NO;

    // Detach current loadable list.
    struct loadable_category *cats = loadable_categories;
    int used = loadable_categories_used;
    int allocated = loadable_categories_allocated;
    loadable_categories = nil;
    loadable_categories_allocated = 0;
    loadable_categories_used = 0;

    // Call all +loads for the detached list.
    for (i = 0; i < used; i++) {
        Category cat = cats[i].cat;
        load_method_t load_method = (load_method_t)cats[i].method;
        Class cls;
        if (!cat) continue;

        cls = _category_getClass(cat);
        if (cls  &&  cls->isLoadable()) {
            if (PrintLoading) {
                _objc_inform("LOAD: +[%s(%s) load]\n",
                             cls->nameForLogging(), _category_getName(cat));
            }
            (*load_method)(cls, @selector(load));
            cats[i].cat = nil;
        }
    }
    // ...
    return new_categories_added;
}
```

These functions iterate over the two lists collected earlier, one for classes and one for categories. Because the `IMP` was stored together with each entry, `+load` can be called directly through it. In the class list each struct holds the class and the `IMP` of its `+load`, so `(*load_method)(cls, @selector(load))` calls the class's `+load` method directly. For categories, the class is first obtained from the category and the call is then made in the same way.
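The "detach the list, then call" pattern and the outer loop can be modelled in a few lines. This is a simplified sketch under stated assumptions: `LoadQueues`, `drain`, and `callLoadMethods` are hypothetical names, each `+load` is modelled as a callback that may queue further work, and the real `call_category_loads` additionally keeps categories whose class is not loadable yet, which is left out here.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

struct LoadQueues;
typedef std::function<void(LoadQueues &)> LoadFn;

// Hypothetical queues standing in for loadable_classes / loadable_categories.
struct LoadQueues {
    std::vector<LoadFn> classes;
    std::vector<LoadFn> categories;
};

// Detach the pending list before calling into it, so a +load that queues more
// work appends to a fresh list instead of the one being iterated.
inline bool drain(std::vector<LoadFn> &pending, LoadQueues &q) {
    std::vector<LoadFn> detached;
    detached.swap(pending);
    for (LoadFn &load : detached) load(q);
    return !pending.empty();   // true if new entries were queued meanwhile
}

inline void callLoadMethods(LoadQueues &q) {
    bool moreCategories;
    do {
        while (!q.classes.empty()) drain(q.classes, q);  // 1. class +loads until none remain
        moreCategories = drain(q.categories, q);         // 2. category +loads, one pass
    } while (!q.classes.empty() || moreCategories);      // 3. repeat if work was added
}
```

If class `A`'s callback queues another class `B` while a category is already waiting, the calls happen in the order `A`, `B`, category: every pending class runs before any category does.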

map_images

Again, source code first.

```cpp
void
map_images(unsigned count, const char * const paths[],
           const struct mach_header * const mhdrs[])
{
    mutex_locker_t lock(runtimeLock);
    return map_images_nolock(count, paths, mhdrs);
}

void
map_images_nolock(unsigned mhCount, const char * const mhPaths[],
                  const struct mach_header * const mhdrs[])
{
    static bool firstTime = YES;
    header_info *hList[mhCount];
    uint32_t hCount;
    size_t selrefCount = 0;

    // Set up the shared cache optimizations
    if (firstTime) {
        preopt_init();
    }
    // ...

    // Find all images with Objective-C metadata.
    hCount = 0;

    // Count all classes
    int totalClasses = 0;
    int unoptimizedTotalClasses = 0;
    {
        uint32_t i = mhCount;
        while (i--) {
            // ...
        }
    }

    if (firstTime) {
        sel_init(selrefCount); // set up selectors, including the C++ constructor/destructor selectors
        arr_init();            // initialize the autorelease pool, side tables, and associated objects
    }

    if (hCount > 0) {
        _read_images(hList, hCount, totalClasses, unoptimizedTotalClasses);
    }

    firstTime = NO;

    // Call image load funcs after everything is set up.
    for (auto func : loadImageFuncs) {
        for (uint32_t i = 0; i < mhCount; i++) {
            func(mhdrs[i]);
        }
    }
}
```

`map_images` simply calls `map_images_nolock`. On the first call, `map_images_nolock` runs `preopt_init`, which sets up the dyld shared cache optimizations.

```cpp
void preopt_init(void)
{
    // ...
    const uintptr_t start = (uintptr_t)_dyld_get_shared_cache_range(&length);
    if (start) {
        objc::dataSegmentsRanges.setSharedCacheRange(start, start + length);
    }

    // `opt` not set at compile time in order to detect too-early usage
    const char *failure = nil;
    opt = &_objc_opt_data; // shared cache optimization data

    if (DisablePreopt) { // environment variable that turns off the dyld shared cache optimizations
        // OBJC_DISABLE_PREOPTIMIZATION is set
        // If opt->version != VERSION then you continue at your own risk.
        failure = "(by OBJC_DISABLE_PREOPTIMIZATION)";
    }
    // ...

    if (failure) {
        // All preoptimized selector references are invalid.
        preoptimized = NO;
        opt = nil;
        disableSharedCacheOptimizations();
    } else {
        // Valid optimization data written by dyld shared cache
        preoptimized = YES;
    }
}
```

Next, `map_images_nolock` loops over the images to count all classes, calls `sel_init` (which registers the C++ constructor and destructor selectors) and `arr_init` (autorelease pool, side tables, associated objects), and then calls `_read_images`.

```cpp
void _read_images(header_info **hList, uint32_t hCount, int totalClasses, int unoptimizedTotalClasses)
{
    header_info *hi;
    uint32_t hIndex;
    size_t count;
    size_t i;
    Class *resolvedFutureClasses = nil;
    size_t resolvedFutureClassCount = 0;
    static bool doneOnce;
    bool launchTime = NO;
    TimeLogger ts(PrintImageTimes);

    runtimeLock.assertLocked();

#define EACH_HEADER \
    hIndex = 0;         \
    hIndex < hCount && (hi = hList[hIndex]); \
    hIndex++

    if (!doneOnce) {
        doneOnce = YES;
        launchTime = YES;

        // ... non-pointer isa optimization setup (elided) ...

        if (DisableTaggedPointers) { // turn off the Tagged Pointer optimization for small objects such as NSNumber and NSString
            disableTaggedPointers();
        }
        initializeTaggedPointerObfuscator(); // Tagged Pointer setup

        // Create gdb_objc_realized_classes, the table of named classes
        // that are not in the dyld shared cache
        int namedClassesSize =
            (isPreoptimized() ? unoptimizedTotalClasses : totalClasses) * 4 / 3;
        gdb_objc_realized_classes =
            NXCreateMapTable(NXStrValueMapPrototype, namedClassesSize);

        ts.log("IMAGE TIMES: first time tasks");
    }

    // Fix up @selector references
    static size_t UnfixedSelectors;
    {
        mutex_locker_t lock(selLock);
        for (EACH_HEADER) {
            if (hi->hasPreoptimizedSelectors()) continue;

            bool isBundle = hi->isBundle();
            SEL *sels = _getObjc2SelectorRefs(hi, &count);
            UnfixedSelectors += count;
            for (i = 0; i < count; i++) {
                const char *name = sel_cname(sels[i]);
                SEL sel = sel_registerNameNoLock(name, isBundle);
                if (sels[i] != sel) {
                    sels[i] = sel;
                }
            }
        }
    }

    ts.log("IMAGE TIMES: fix up selector references");

    // Discover classes. Fix up unresolved future classes. Mark bundle classes.
    bool hasDyldRoots = dyld_shared_cache_some_image_overridden();

    for (EACH_HEADER) {
        // ...
        classref_t const *classlist = _getObjc2ClassList(hi, &count);

        bool headerIsBundle = hi->isBundle();
        bool headerIsPreoptimized = hi->hasPreoptimizedClasses();

        for (i = 0; i < count; i++) {
            Class cls = (Class)classlist[i];
            Class newCls = readClass(cls, headerIsBundle, headerIsPreoptimized);

            if (newCls != cls  &&  newCls) {
                // Class was moved but not deleted. Currently this occurs
                // only when the new class resolved a future class.
                // Non-lazily realize the class below.
                resolvedFutureClasses = (Class *)
                    realloc(resolvedFutureClasses,
                            (resolvedFutureClassCount+1) * sizeof(Class));
                resolvedFutureClasses[resolvedFutureClassCount++] = newCls;
            }
        }
    }

    ts.log("IMAGE TIMES: discover classes");
    // ...
    ts.log("IMAGE TIMES: discover categories");
    ts.log("IMAGE TIMES: realize non-lazy classes");
    ts.log("IMAGE TIMES: realize future classes");

    if (DebugNonFragileIvars) {
        realizeAllClasses();
    }
    // ...
}
```

Images are loaded at a randomized address (ASLR), so dyld has to perform rebase and binding to fix up the pointers inside an image so that they point at the correct memory addresses. `_read_images` first runs the `doneOnce` block exactly once. That block sets up the pointer optimizations and Tagged Pointer handling, and creates the `gdb_objc_realized_classes` table, which holds the named classes that are not in the dyld shared cache. Next it fixes up the `selector` references: each reference is looked up by name through `sel_registerNameNoLock`, and a reference that holds the wrong address is overwritten with the address of the registered selector. This is comparable to rebase, where a pointer to the wrong address is rewritten to point at the right one. The function then goes on to fix up `protocol`, `objc_msgSend_fixup`, and so on. After that it processes all categories (for both classes and metaclasses), realizes all non-lazy classes (setting up `rw`, `ro`, and so on), walks the resolved future classes and realizes them, and, when `DebugNonFragileIvars` is set, realizes every class that is still unrealized. One very important step in here is `Discover classes`, which finds all classes and uses `readClass` to read each one.

```cpp
Class readClass(Class cls, bool headerIsBundle, bool headerIsPreoptimized)
{
    const char *mangledName = cls->nonlazyMangledName();

    if (missingWeakSuperclass(cls)) {
        // ...
    }

    cls->fixupBackwardDeployingStableSwift();

    Class replacing = nil;
    if (mangledName != nullptr) {
        if (Class newCls = popFutureNamedClass(mangledName)) {
            if (newCls->isAnySwift()) {
                // ...
            }
            class_rw_t *rw = newCls->data();
            const class_ro_t *old_ro = rw->ro();
            memcpy(newCls, cls, sizeof(objc_class));

            // Manually set address-discriminated ptrauthed fields
            // so that newCls gets the correct signatures.
            newCls->setSuperclass(cls->getSuperclass());
            newCls->initIsa(cls->getIsa());

            rw->set_ro((class_ro_t *)newCls->data());
            newCls->setData(rw);
            freeIfMutable((char *)old_ro->getName());
            free((void *)old_ro);

            addRemappedClass(cls, newCls);

            replacing = cls;
            cls = newCls;
        }
    }

    if (headerIsPreoptimized  &&  !replacing) {
        // ...
    } else {
        if (mangledName) { // some Swift generic classes can lazily generate their names
            addNamedClass(cls, mangledName, replacing);
        } else {
            Class meta = cls->ISA();
            const class_ro_t *metaRO = meta->bits.safe_ro();
            ASSERT(metaRO->getNonMetaclass() && "Metaclass with lazy name must have a pointer to the corresponding nonmetaclass.");
            ASSERT(metaRO->getNonMetaclass() == cls && "Metaclass nonmetaclass pointer must equal the original class.");
        }
        addClassTableEntry(cls);
    }

    // for future reference: shared cache never contains MH_BUNDLEs
    if (headerIsBundle) {
        cls->data()->flags |= RO_FROM_BUNDLE;
        cls->ISA()->data()->flags |= RO_FROM_BUNDLE;
    }

    return cls;
}

static void
addClassTableEntry(Class cls, bool addMeta = true)
{
    runtimeLock.assertLocked();

    // This class is allowed to be a known class via the shared cache or via
    // data segments, but it is not allowed to be in the dynamic table already.
    auto &set = objc::allocatedClasses.get(); // the table initialized in runtime_init
    ASSERT(set.find(cls) == set.end());

    if (!isKnownClass(cls))
        set.insert(cls);
    if (addMeta)
        addClassTableEntry(cls->ISA(), false);
}
```

However, when we set breakpoints on the `rw` and `ro` code in `readClass`, they are never hit, so `rw` and `ro` are not initialized here. What `readClass` does do is call `addClassTableEntry` to register the class. `addClassTableEntry` then calls itself once more with `false` as the second argument to register the metaclass, so both the class and its metaclass end up in the `allocatedClasses` table that `runtime_init` created. In short, the job of `readClass` is to add classes whose memory has already been allocated, together with their metaclasses, to the `allocatedClasses` table.
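The two-step registration (class first, then its metaclass with the recursion switched off) can be sketched with a plain set. This is an illustrative model only: `TableClass` and `registerClass` are hypothetical names, the set stands in for `allocatedClasses`, and unlike the runtime the sketch tolerates registering the same class twice.

```cpp
#include <cassert>
#include <unordered_set>

// Hypothetical class record: isa points at the metaclass.
struct TableClass {
    TableClass *isa;
};

// Register a class and, one level deep, its metaclass. The recursive call
// passes addMeta = false, so the metaclass's own isa is never followed.
inline void registerClass(std::unordered_set<const TableClass *> &table,
                          const TableClass *cls, bool addMeta = true) {
    table.insert(cls);
    if (addMeta) registerClass(table, cls->isa, false);
}
```

Passing `false` on the inner call matters because a root metaclass's `isa` points back at itself; without the flag the recursion would never end.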

Let's pull out the code that realizes the non-lazy classes.

```cpp
    // Realize non-lazy classes (for +load methods and static instances)
    for (EACH_HEADER) {
        classref_t const *classlist = hi->nlclslist(&count);
        for (i = 0; i < count; i++) {
            Class cls = remapClass(classlist[i]);
            if (!cls) continue;

            addClassTableEntry(cls);

            if (cls->isSwiftStable()) {
                if (cls->swiftMetadataInitializer()) {
                    _objc_fatal("Swift class %s with a metadata initializer "
                                "is not allowed to be non-lazy",
                                cls->nameForLogging());
                }
                // fixme also disallow relocatable classes
                // We can't disallow all Swift classes because of
                // classes like Swift.__EmptyArrayStorage
            }
            realizeClassWithoutSwift(cls, nil);
        }
    }
```

This code also calls `addClassTableEntry` to register the class, and in addition it calls `realizeClassWithoutSwift`. Reading the source shows that `realizeClassWithoutSwift` is where the class itself, its superclass, and its metaclass are actually realized, and where `rw` and `ro` are set up.

```cpp
static Class realizeClassWithoutSwift(Class cls, Class previously)
{
    runtimeLock.assertLocked();

    class_rw_t *rw;
    Class supercls;
    Class metacls;
    // ...

    auto ro = (const class_ro_t *)cls->data();
    auto isMeta = ro->flags & RO_META;
    if (ro->flags & RO_FUTURE) {
        // This was a future class. rw data is already allocated.
        rw = cls->data();
        ro = cls->data()->ro();
        ASSERT(!isMeta);
        cls->changeInfo(RW_REALIZED|RW_REALIZING, RW_FUTURE);
    } else {
        // Normal class. Allocate writeable class data.
        rw = objc::zalloc<class_rw_t>();
        rw->set_ro(ro);
        rw->flags = RW_REALIZED|RW_REALIZING|isMeta;
        cls->setData(rw);
    }

    cls->cache.initializeToEmptyOrPreoptimizedInDisguise();

#if FAST_CACHE_META
    if (isMeta) cls->cache.setBit(FAST_CACHE_META);
#endif

    // Choose an index for this class.
    // Sets cls->instancesRequireRawIsa if indexes no more indexes are available
    cls->chooseClassArrayIndex();
    // ...

    // Recursively realize supercls and metacls
    supercls = realizeClassWithoutSwift(remapClass(cls->getSuperclass()), nil);
    metacls = realizeClassWithoutSwift(remapClass(cls->ISA()), nil);

#if SUPPORT_NONPOINTER_ISA
    if (isMeta) {
        cls->setInstancesRequireRawIsa();
    } else {
        // Disable non-pointer isa for some classes and/or platforms.
        // Set instancesRequireRawIsa.
        bool instancesRequireRawIsa = cls->instancesRequireRawIsa();
        bool rawIsaIsInherited = false;
        static bool hackedDispatch = false;

        if (DisableNonpointerIsa) {
            // Non-pointer isa disabled by environment or app SDK version
            instancesRequireRawIsa = true;
        } else if (!hackedDispatch  &&  0 == strcmp(ro->getName(), "OS_object")) {
            // hack for libdispatch et al - isa also acts as vtable pointer
            hackedDispatch = true;
            instancesRequireRawIsa = true;
        } else if (supercls  &&  supercls->getSuperclass()  &&
                   supercls->instancesRequireRawIsa()) {
            // This is also propagated by addSubclass()
            // but nonpointer isa setup needs it earlier.
            // Special case: instancesRequireRawIsa does not propagate
            // from root class to root metaclass
            instancesRequireRawIsa = true;
            rawIsaIsInherited = true;
        }

        if (instancesRequireRawIsa) {
            cls->setInstancesRequireRawIsaRecursively(rawIsaIsInherited);
        }
    }
// SUPPORT_NONPOINTER_ISA
#endif

    // Update superclass and metaclass in case of remapping
    cls->setSuperclass(supercls); // set the superclass
    cls->initClassIsa(metacls);   // set the metaclass

    // Reconcile instance variable offsets / layout.
    // This may reallocate class_ro_t, updating our ro variable.
    if (supercls  &&  !isMeta) reconcileInstanceVariables(cls, supercls, ro);

    // Set fastInstanceSize if it wasn't set already.
    cls->setInstanceSize(ro->instanceSize);

    // Copy some flags from ro to rw
    if (ro->flags & RO_HAS_CXX_STRUCTORS) {
        cls->setHasCxxDtor();
        if (! (ro->flags & RO_HAS_CXX_DTOR_ONLY)) {
            cls->setHasCxxCtor();
        }
    }

    // Propagate the associated objects forbidden flag from ro or from
    // the superclass.
    if ((ro->flags & RO_FORBIDS_ASSOCIATED_OBJECTS) ||
        (supercls && supercls->forbidsAssociatedObjects())) {
        rw->flags |= RW_FORBIDS_ASSOCIATED_OBJECTS;
    }

    // Connect this class to its superclass's subclass lists
    if (supercls) {
        addSubclass(supercls, cls);
    } else {
        addRootClass(cls);
    }

    // Attach categories
    methodizeClass(cls, previously);

    return cls;
}
```

Here we can clearly see that `rw` is allocated inside this function, whereas `ro` already exists at compile time; this function only assigns it into `rw`.
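The relationship between the two structures can be sketched as follows. This is a deliberately small model with hypothetical names (`ReadOnlyData`, `ReadWriteData`, `SketchClass`, `realize`): the read-only data is constant and exists before the program runs, while the read-write data is allocated on first realization and merely points at it. The early returns for a null or already-realized class are assumptions of the sketch, since that part of the function is elided above.

```cpp
#include <cassert>
#include <memory>

struct ReadOnlyData {            // stands in for class_ro_t, emitted at compile time
    const char *name;
    unsigned instanceSize;
};

struct ReadWriteData {           // stands in for class_rw_t, allocated at runtime
    const ReadOnlyData *ro;      // rw only references ro, it does not copy it
    bool realized;
};

struct SketchClass {
    const ReadOnlyData *ro;              // all a class has before realization
    SketchClass *superclass;
    std::unique_ptr<ReadWriteData> rw;   // filled in by realize()
};

// Allocate rw for the class, point it at the existing ro, then do the same
// for the superclass chain.
inline SketchClass *realize(SketchClass *cls) {
    if (!cls) return nullptr;
    if (cls->rw) return cls;                           // already realized
    cls->rw.reset(new ReadWriteData{cls->ro, true});   // runtime allocation
    realize(cls->superclass);                          // recursive step
    return cls;
}
```

Realizing a subclass in this sketch leaves both it and its superclass with an `rw` whose `ro` pointer is the original constant data, which is the same shape as `rw->set_ro(ro)` in the listing.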

Summary

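`load_images` is what calls the `+load` methods, which is why `+load` runs before `main`. It puts the `+load` methods of all non-lazy classes into `loadable_classes` and those of the categories into `loadable_categories`, then iterates over both lists and calls each stored `+load` `IMP` directly on its class.

- `+load` methods are collected and called in the order superclass -> subclass -> category.
- Because scheduling a subclass recursively schedules its superclass first, there is no need to call `[super load]` when implementing `+load`.
- If two categories both implement `+load`, both are called, in the order in which they were added to `loadable_categories`, which follows the compile order of the source files.
- Both collecting and calling the `+load` methods happen under locks, so `+load` invocation is thread-safe.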

The real initialization of a non-lazy class is done by `realizeClassWithoutSwift`, which allocates the memory for `rw` and stores `ro` inside it.

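rebase: fixing a pointer that points at the wrong memory address so that it points at the correct one, that is, adjusting the address recorded in the image to the address at which the image was actually loaded.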
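The arithmetic behind rebase is a single addition. The sketch below uses hypothetical helper names (`computeSlide`, `rebase`) and made-up addresses; it only illustrates the idea that every pointer stored in the image is shifted by the distance between the address the image was linked for and the address it was loaded at.

```cpp
#include <cassert>
#include <cstdint>

// slide = where the image actually landed - where it was linked to land
inline std::uint64_t computeSlide(std::uint64_t actualBase,
                                  std::uint64_t preferredBase) {
    return actualBase - preferredBase;
}

// A pointer stored in the image is corrected by adding the slide.
inline std::uint64_t rebase(std::uint64_t storedPointer, std::uint64_t slide) {
    return storedPointer + slide;
}
```

For example, an image linked at `0x100000000` but loaded at `0x104a3c000` has a slide of `0x4a3c000`, so a stored pointer `0x100008010` becomes `0x104a44010`.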