|
| 1 | +# Shared libraries in .NET for Android applications |
| 2 | + |
| 3 | +Applications contain a number of shared libraries which are placed in the |
| 4 | +per-ABI directories inside APK/AAB archives (`lib/{ABI}/lib*.so`). The libraries |
| 5 | +have different purposes and come from different sources: |
| 6 | + |
| 7 | + 1. .NET PAL (Platform Abstraction Layer), used by various Base Class Library |
| 8 | + assemblies. |
| 9 | + 2. .NET runtime (`libmonosgen-2.0.so` containing the Mono VM) |
| 10 | + 3. AOT images (`libaot*.so`, containing pre-JITed **data** which is loaded by |
| 11 | + MonoVM at runtime and processed to turn into executable code) |
| 12 | + 4. .NET for Android runtime and support libraries |
| 13 | + 5. .NET for Android data payload libraries |
| 14 | + |
| 15 | +Most of those libraries have a fairly obvious purpose and layout, so this document |
| 16 | +focuses on the `.NET for Android` data payload libraries. |
| 17 | + |
| 18 | +# `.NET for Android` data payload libraries |
| 19 | + |
| 20 | +## Android packaging introduction |
| 21 | + |
| 22 | +Android allows applications to ship ABI-specific code inside the APK/AAB archives in |
| 23 | +order to enable applications which need some sort of native code, while otherwise written |
| 24 | +in a managed language like C#, Java or Kotlin. These libraries must be compiled to target |
| 25 | +the platforms supported by Android and they must somehow co-exist in the same APK/AAB |
| 26 | +archive (they always have the same name, just target a different platform/ABI). The way |
| 27 | +chosen by Android to implement it is to place the per-ABI libraries in the `lib/{ABI}/` |
| 28 | +directory of the archive. |
| 29 | + |
| 30 | +All of the libraries placed in the `lib/{ABI}` directories are expected to be ELF shared |
| 31 | +library images, as required by the Android Linux kernel. |
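
To see which libraries an archive ships for each ABI, one can filter its entry list. The following is a minimal sketch; the `list_abi_libs` helper name and the `app.apk` file name are placeholders chosen for illustration and are not part of `.NET for Android`.

```shell
# Hypothetical helper: reads archive entry names on stdin (one per line) and
# prints an "ABI library-name" pair for every entry matching lib/{ABI}/lib*.so.
list_abi_libs() {
    awk -F/ 'NF == 3 && $1 == "lib" && $3 ~ /^lib.*\.so$/ { print $2, $3 }'
}

# Example (app.apk is a placeholder):
#   unzip -Z1 app.apk | list_abi_libs
```

Entries outside `lib/{ABI}/`, or entries whose names do not follow the `lib*.so` pattern, are ignored by the filter.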
| 32 | + |
| 33 | +## .NET for Android runtime, libraries and data |
| 34 | + |
| 35 | +The `.NET for Android` runtime is composed of two libraries, one being the pre-compiled runtime |
| 36 | +itself (`libmonodroid.so` in the APK) and another library being built together with the |
| 37 | +application, containing application-specific dynamically generated code (`libxamarin-app.so` |
| 38 | +in the APK). These two libraries together contain all the code and data to make the application |
| 39 | +run properly on all the supported targets. |
| 40 | + |
| 41 | +In addition to the above, `.NET for Android` ships a number of managed assemblies. For a number |
| 42 | +of years (starting with `Mono for Android`, through `Xamarin.Android`), all the assemblies had |
| 43 | +been completely platform agnostic and, thus, were shipped in a custom directory in the APK archive |
| 44 | +named `assemblies/`. However, at some point during the transition to `dotnet/runtime` and its BCL, a |
| 45 | +handful of managed libraries became platform specific and, thus, had to be shipped in a way that took |
| 46 | +the platform requirement into account. As all those libraries shared the same name across platforms, |
| 47 | +we had to find a way to package them so that they wouldn't conflict with each other. Thus the |
| 48 | +`assemblies/` directory gained a subdirectory per ABI, which contained the platform specific assemblies. |
| 49 | +Later on, the same was implemented in [assembly stores](AssemblyStores.md) - they would contain both kinds |
| 50 | +of managed assemblies. |
| 51 | + |
| 52 | +The downside of packaging all the assemblies (or assembly stores) in the `assemblies/` directory was that |
| 53 | +all the platforms would get copies of platform specific assemblies for the other supported ABIs, thus wasting |
| 54 | +storage on the end user devices. |
| 55 | + |
| 56 | +Introduction of platform specific assemblies posed another problem. We discovered that in some instances, the |
| 57 | +dotnet linker/trimmer would generate assemblies that might fail on certain platforms without us having any |
| 58 | +prior warning. The solution to this was to make **all** the assemblies platform specific, making sure that |
| 59 | +whatever the trimmer did, we'd always have the correct assembly loaded on the right platform. |
| 60 | + |
| 61 | +Making all assemblies platform specific, however, poses a problem of APK/AAB size - every assembly would |
| 62 | +exist in one copy per supported ABI and we couldn't allow such a big increase of archive size. Thus, all the assemblies (and also |
| 63 | +assembly stores as well as a runtime configuration blob file) were moved to the `lib/{ABI}/` directories and |
| 64 | +"masqueraded" as ELF shared libraries, by giving them the `lib*.so` names. However, the files were still managed |
| 65 | +assemblies, not valid ELF images. |
| 66 | + |
| 67 | +In August 2024, however, Google [announced](https://android-developers.googleblog.com/2024/08/adding-16-kb-page-size-to-android.html) that |
| 68 | +Android 15 would enable shared libraries aligned to 16k instead of the "traditional" 4k and that, at some point, the alignment |
| 69 | +would become a requirement for submission to the Play Store. This made us suspect that the libraries in `lib/{ABI}/` would |
| 70 | +eventually be verified to be valid ELF images, so we decided to proactively turn the data files shipped in |
| 71 | +those directories into actual ELF shared libraries. The way it is done is described in the following section. |
| 72 | + |
| 73 | +## Data payload stub library |
| 74 | + |
| 75 | +ELF binaries consist of a number of sections, which contain code, data (read-only and read-write), debug symbols etc. |
| 76 | +However, the ELF specification doesn't dictate names of any of those sections and, thus, developers are free to lay out |
| 77 | +ELF binaries any way they see fit, as long as the binary conforms to the ELF specification and the operating system |
| 78 | +requirements. This gave us the idea of placing our data files (assemblies, assembly stores, debug data, config files etc.) |
| 79 | +in a custom section inside the ELF image. The resulting file should pass any ELF verification Android may perform in the |
| 80 | +future and, at the same time, it doesn't slow down our operation because we can still load data directly from the shared |
| 81 | +library (by using the `mmap(2)` Unix call) without having to load the ELF image into memory. |
| 82 | + |
| 83 | +To implement that, we added to our distribution a "stub" of a shared ELF library, which is essentially a small, valid |
| 84 | +but otherwise empty ELF image. This stub is built together with the rest of the `.NET for Android` runtime and its |
| 85 | +layout is discovered and remembered, so that at runtime we can quickly move to the location where our data lives and |
| 86 | +load it as we see fit. The runtime `mmap`s the entire file, looks at the file header and finds the start of payload |
| 87 | +section, then stores that location in a pointer for further use. |
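
The "seek to a known location and read from there" idea can be illustrated from the command line. The sketch below is only an analogy for what the runtime does in native code with `mmap`: given a byte offset and a size, it copies that region of a file to standard output without interpreting the rest of the file. The `extract_at_offset` helper name is hypothetical.

```shell
# Hypothetical helper, not part of .NET for Android.
# Usage: extract_at_offset FILE OFFSET SIZE
# Writes SIZE bytes of FILE, starting at byte OFFSET (0-based), to stdout.
extract_at_offset() {
    # tail -c +N counts from 1, hence the "+ 1" on the 0-based offset.
    tail -c "+$(($2 + 1))" "$1" | head -c "$3"
}
```

The offset and size passed to the helper would come from the ELF section header table, where both values are recorded for the payload section.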
| 88 | + |
| 89 | +The way the data is placed in the ELF image is by appending a new section, called `payload`, to the stub binary at |
| 90 | +application build time. This is done by using the `llvm-objcopy` utility, which we ship, and then the result is |
| 91 | +packaged into the `lib/{ABI}/` directory. The section is properly aligned and the entire file remains a valid ELF image. |
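
A minimal sketch of such an invocation follows. It is an illustration and not necessarily the exact command line used by the build: the `wrap_payload` helper, the `OBJCOPY` variable and all file names are assumptions, and the 16384-byte alignment is assumed from the 16k figure discussed in this document. `--add-section` and `--set-section-alignment` are standard `llvm-objcopy` options.

```shell
# Hypothetical helper, not part of .NET for Android.
# Usage: wrap_payload STUB PAYLOAD OUTPUT
# Copies STUB to OUTPUT, appending PAYLOAD as a section named "payload".
# OBJCOPY may be set to point at a specific llvm-objcopy binary.
wrap_payload() {
    stub=$1 payload=$2 output=$3
    "${OBJCOPY:-llvm-objcopy}" \
        --add-section "payload=$payload" \
        --set-section-alignment payload=16384 \
        "$stub" "$output"
}

# Example (file names are placeholders):
#   wrap_payload stub.so assemblies.blob libassemblies.arm64-v8a.blob.so
```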
| 92 | + |
| 93 | +One downside of this approach is that if one were to run the `llvm-strip` or `strip` utility on the resulting |
| 94 | +shared library, the `payload` section (as it uses a "non-standard" name) would be considered by the strip utility |
| 95 | +to be unnecessary and summarily removed. |
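
A packaging pipeline that includes a strip step could guard against this by checking that the section is still present afterwards. The sketch below assumes the section listing format printed by `llvm-readelf --section-headers`; the `has_payload_section` helper name is hypothetical.

```shell
# Hypothetical helper, not part of .NET for Android.
# Reads `llvm-readelf --section-headers` output on stdin and succeeds only if
# a section named exactly "payload" is listed.
has_payload_section() {
    # Drop the leading "[Nr]" index so the section name becomes field 1.
    sed 's/^[^]]*\]//' | awk '$1 == "payload" { found = 1 } END { exit !found }'
}

# Example (the library name is a placeholder):
#   llvm-readelf --section-headers libfoo.so | has_payload_section \
#       || echo "payload section is missing" >&2
```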
| 96 | + |
| 97 | +### Layout of the payload library |
| 98 | + |
| 99 | +In order to examine the content of our "payload" ELF shared library, one can run the `llvm-readelf` utility which is |
| 100 | +shipped with the Android NDK (and also part of native developer tools on macOS and Linux distributions which have |
| 101 | +the LLVM Clang toolchain installed), or the `readelf` utility which is part of GNU binutils. |
| 102 | + |
| 103 | +The file used in the samples below is the `.NET for Android` assembly store, wrapped in an ELF image for the Arm64 |
| 104 | +(`AArch64`) architecture. |
| 105 | + |
| 106 | +The first command verifies that the file is a valid ELF image and shows the header information, including the |
| 107 | +target platform/ABI/machine: |
| 108 | + |
| 109 | +```shell |
| 110 | +$ llvm-readelf --file-header libassemblies.arm64-v8a.blob.so |
| 111 | +ELF Header: |
| 112 | + Magic: 7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00 |
| 113 | + Class: ELF64 |
| 114 | + Data: 2's complement, little endian |
| 115 | + Version: 1 (current) |
| 116 | + OS/ABI: UNIX - System V |
| 117 | + ABI Version: 0 |
| 118 | + Type: DYN (Shared object file) |
| 119 | + Machine: AArch64 |
| 120 | + Version: 0x1 |
| 121 | + Entry point address: 0x0 |
| 122 | + Start of program headers: 64 (bytes into file) |
| 123 | + Start of section headers: 849480 (bytes into file) |
| 124 | + Flags: 0x0 |
| 125 | + Size of this header: 64 (bytes) |
| 126 | + Size of program headers: 56 (bytes) |
| 127 | + Number of program headers: 8 |
| 128 | + Size of section headers: 64 (bytes) |
| 129 | + Number of section headers: 11 |
| 130 | + Section header string table index: 9 |
| 131 | +``` |
| 132 | +
|
| 133 | +The second command lists the sections contained within the ELF image, their alignment, sizes and offsets |
| 134 | +into the file where the sections begin: |
| 135 | +
|
| 136 | +```shell |
| 137 | +$ llvm-readelf --section-headers libassemblies.arm64-v8a.blob.so |
| 138 | +There are 11 section headers, starting at offset 0xcf648: |
| 139 | +
|
| 140 | +Section Headers: |
| 141 | + [Nr] Name Type Address Off Size ES Flg Lk Inf Al |
| 142 | + [ 0] NULL 0000000000000000 000000 000000 00 0 0 0 |
| 143 | + [ 1] .note.gnu.build-id NOTE 0000000000000200 000200 000024 00 A 0 0 4 |
| 144 | + [ 2] .dynsym DYNSYM 0000000000000228 000228 000030 18 A 5 1 8 |
| 145 | + [ 3] .gnu.hash GNU_HASH 0000000000000258 000258 000020 00 A 2 0 8 |
| 146 | + [ 4] .hash HASH 0000000000000278 000278 000018 04 A 2 0 4 |
| 147 | + [ 5] .dynstr STRTAB 0000000000000290 000290 000032 00 A 0 0 1 |
| 148 | + [ 6] .dynamic DYNAMIC 00000000000042c8 0002c8 0000b0 10 WA 5 0 8 |
| 149 | + [ 7] .relro_padding NOBITS 0000000000004378 000378 000c88 00 WA 0 0 1 |
| 150 | + [ 8] .data PROGBITS 0000000000008378 000378 000001 00 WA 0 0 1 |
| 151 | + [ 9] .shstrtab STRTAB 0000000000000000 000379 00005e 00 0 0 1 |
| 152 | + [10] payload PROGBITS 0000000000000000 004000 0cb647 00 0 0 16384 |
| 153 | +Key to Flags: |
| 154 | + W (write), A (alloc), X (execute), M (merge), S (strings), I (info), |
| 155 | + L (link order), O (extra OS processing required), G (group), T (TLS), |
| 156 | + C (compressed), x (unknown), o (OS specific), E (exclude), |
| 157 | + R (retain), p (processor specific) |
| 158 | +``` |
| 159 | +
|
| 160 | +Of interest to us are the presence of the `payload` section, its starting offset and its size. The offset will usually |
| 161 | +be `0x4000`, that is 16k into the file, but it might be a multiple of that value if the stub ever grows. The size will, |
| 162 | +obviously, differ depending on the payload. |
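
The offset and size can also be pulled out of the listing by a script, for example to check the 16k alignment automatically. The following is a sketch under the assumption that the input has the column layout shown above; both helper names are hypothetical.

```shell
# Hypothetical helpers, not part of .NET for Android.

# Reads `llvm-readelf --section-headers` output on stdin and prints the
# payload section's file offset and size as decimal numbers.
payload_offset_and_size() {
    # After dropping the "[Nr]" index, the columns are:
    # Name Type Address Off Size ...
    fields=$(sed 's/^[^]]*\]//' | awk '$1 == "payload" { print $4, $5 }')
    [ -n "$fields" ] || return 1
    set -- $fields
    echo "$((0x$1)) $((0x$2))"
}

# Succeeds if the given decimal offset is a multiple of 16384 (16k).
is_16k_aligned() {
    [ "$(($1 % 16384))" -eq 0 ]
}
```

For the sample listing above, the first helper would print `16384 833095`, and `is_16k_aligned 16384` succeeds.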
| 163 | +
|
| 164 | +The information above is sufficient to verify that the file is a valid `.NET for Android` payload |
| 165 | +shared library. |
| 166 | +
|
| 167 | +In order to extract the payload from the ELF image, one can use the following command: |
| 168 | +
|
| 169 | +```shell |
| 170 | +$ llvm-objcopy --dump-section=payload=payload.bin libassemblies.arm64-v8a.blob.so |
| 171 | +$ ls -gG payload.bin |
| 172 | +-rw-rw-r-- 1 833095 Sep 12 11:32 payload.bin |
| 173 | +``` |
| 174 | +
|
| 175 | +To verify the size is correct, we can convert the section size indicated in the section headers |
| 176 | +output from hexadecimal to decimal: |
| 177 | +
|
| 178 | +```shell |
| 179 | +$ printf "%d\n" 0x0cb647 |
| 180 | +833095 |
| 181 | +``` |
| 182 | +
|
| 183 | +In this case, the payload file is an assembly store, whose first 4 bytes should read |
| 184 | +`XABA`. We can verify this with the following command: |
| 185 | +
|
| 186 | +```shell |
| 187 | +$ hexdump -c -n 4 payload.bin |
| 188 | +0000000 X A B A |
| 189 | +0000004 |
| 190 | +``` |
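
The size and magic checks above can be combined into a single step. This is a minimal sketch; the `check_store_payload` helper name is hypothetical, and the hexadecimal size argument is expected in the form printed in the `Size` column of the section listing (without a `0x` prefix).

```shell
# Hypothetical helper, not part of .NET for Android.
# Usage: check_store_payload FILE HEX_SIZE
# Succeeds if FILE is exactly HEX_SIZE bytes long and starts with "XABA".
check_store_payload() {
    # Arithmetic expansion strips the padding some `wc` versions print.
    actual=$(( $(wc -c < "$1") ))
    expected=$((0x$2))
    [ "$actual" -eq "$expected" ] || return 1
    [ "$(head -c 4 "$1")" = "XABA" ]
}

# Example, using the values from the listings above:
#   check_store_payload payload.bin 0cb647 && echo "payload looks valid"
```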