llvm-project

Commit Graph

Author	SHA1	Message	Date
Fangrui Song	d98c172712	[ELF] Fix TimeTraceScope for "Finalize .eh_frame"	2022-12-03 18:00:51 +00:00
Guillaume Chatelet	08e2a76381	[lld][NFC] rename ELF alignment into addralign	2022-12-01 16:20:12 +00:00
Fangrui Song	910204cfbd	[ELF] createSyntheticSections: simplify config->relocatable. NFC We can add .riscv.attributes synthetic section here in the future.	2022-11-22 20:09:15 -08:00
Fangrui Song	8610cb0460	[ELF] -r: don't define _TLS_MODULE_BASE_ _TLS_MODULE_BASE_ is supposed to be defined by the final link. Defining it in a relocatable link may render the final link value incorrect. GNU ld i386/x86-64 have the same issue: https://sourceware.org/bugzilla/show_bug.cgi?id=29820	2022-11-22 12:59:45 -08:00
Fangrui Song	d9ef5574d4	[ELF] -r: don't define __global_pointer$ This symbol is supposed to be defined by the final executable link. The new behavior matches GNU ld.	2022-11-22 12:37:51 -08:00
Fangrui Song	9015e41f0f	[ELF] addRelIpltSymbols: make it explicit some passes are for non-relocatable links. NFC and prepare for __global_pointer$ and _TLS_MODULE_BASE_ fix.	2022-11-22 11:38:57 -08:00
Fangrui Song	2bf5d86422	[ELF] Change rawData to content() and data() to contentMaybeDecompress() Clarify data() which may trigger decompression and make it feasible to refactor the member variable rawData.	2022-11-20 22:43:22 +00:00
Kazu Hirata	d42357007d	[lld] Use llvm::reverse (NFC)	2022-11-06 08:39:41 -08:00
Fangrui Song	14f996dca8	[ELF] Move inputSections/ehInputSections into Ctx. NFC	2022-10-16 00:49:48 -07:00
Fangrui Song	1837333dac	[ELF] --check-sections: allow address 0xffffffff for ELFCLASS32 Fix https://github.com/llvm/llvm-project/issues/58101	2022-10-01 15:37:07 -07:00
Fangrui Song	9c626d4a0d	[ELF] Remove symtab indirection. NFC Add LLVM_LIBRARY_VISIBILITY to remove unneeded GOT and unique_ptr indirection.	2022-10-01 14:46:49 -07:00
Fangrui Song	34fa860048	[ELF] Remove ctx indirection. NFC Add LLVM_LIBRARY_VISIBILITY to remove unneeded GOT and unique_ptr indirection. We can move other global variables into ctx without indirection concern. In the long term we may consider passing Ctx as a parameter to various functions and eliminate global state as much as possible and then remove `Ctx::reset`.	2022-10-01 12:06:33 -07:00
Nico Weber	cd7ffa2e52	lld: Include name of output file in "failed to write output" diag Differential Revision: https://reviews.llvm.org/D133110	2022-09-14 14:57:47 -04:00
Fangrui Song	12607f57da	[ELF] Cache compute_thread_count. NFC	2022-09-12 19:09:08 -07:00
Fangrui Song	e6aebff674	[ELF] Parallelize relocation scanning * Change `Symbol::flags` to a `std::atomic<uint16_t>` * Add `llvm::parallel::threadIndex` as a thread-local non-negative integer * Add `relocsVec` to part.relaDyn and part.relrDyn so that relative relocations can be added without a mutex * Arbitrarily change -z nocombreloc to move relative relocations to the end. Disable parallelism for deterministic output. MIPS and PPC64 use global states for relocation scanning. Keep serial scanning. Speed-up with mimalloc and --threads=8 on an Intel Skylake machine: * clang (Release): 1.27x as fast * clang (Debug): 1.06x as fast * chrome (default): 1.05x as fast * scylladb (default): 1.04x as fast Speed-up with glibc malloc and --threads=16 on a ThunderX2 (AArch64): * clang (Release): 1.31x as fast * scylladb (default): 1.06x as fast Reviewed By: andrewng Differential Revision: https://reviews.llvm.org/D133003	2022-09-12 12:56:35 -07:00
Fangrui Song	94ca041905	[ELF] Move scanRelocations into Relocations.cpp. NFC	2022-09-04 21:31:18 -07:00
Fangrui Song	3b4d800911	[ELF] Parallelize writes of different OutputSections We currently process one OutputSection at a time and for each OutputSection write contained input sections in parallel. This strategy does not leverage multi-threading well. Instead, parallelize writes of different OutputSections. The default TaskSize for parallelFor often leads to inferior sharding. We prepare the task in the caller instead. * Move llvm::parallel::detail::TaskGroup to llvm::parallel::TaskGroup * Add llvm::parallel::TaskGroup::execute. * Change writeSections to declare TaskGroup and pass it to writeTo. Speed-up with --threads=8: * clang -DCMAKE_BUILD_TYPE=Release: 1.11x as fast * clang -DCMAKE_BUILD_TYPE=Debug: 1.10x as fast * chrome -DCMAKE_BUILD_TYPE=Release: 1.04x as fast * scylladb build/release: 1.09x as fast On M1, many benchmarks are a small fraction of a percentage faster. Mozilla showed the largest difference with the patch being about 1.03x as fast. Differential Revision: https://reviews.llvm.org/D131247	2022-08-24 09:40:03 -07:00
Sam Clegg	2cd4cd9a32	[lld][ELF] Rename SymbolTable::symbols() to SymbolTable::getSymbols(). NFC This change renames this method to match its original name and the name used in the wasm linker. Back in `d8f8abbd4a` the ELF SymbolTable method `getSymbols()` was replaced with `forEachSymbol`. Then in `a2fc964417` `forEachSymbol` was replaced with a `llvm::iterator_range`. Then in `e9262edf0d` we came full circle and the `llvm::iterator_range` was replaced with a `symbols()` accessor that was identical to the original `getSymbols()`. `getSymbols` also matches the name used elsewhere in the ELF linker as well as in both COFF and wasm backend (e.g. `InputFiles.h` and `SyntheticSections.h`) Differential Revision: https://reviews.llvm.org/D130787	2022-08-19 14:56:08 -07:00
Alex Brachet	dbd04b853b	[ELF] Support --package-metadata This was recently introduced in GNU linkers and it makes sense for ld.lld to have the same support. This implementation omits checking if the input string is valid json to reduce size bloat. Differential Revision: https://reviews.llvm.org/D131439	2022-08-08 21:31:58 +00:00
Fangrui Song	c09d323599	[ELF] Move EhInputSection out of inputSections. NFC inputSections temporarily contains EhInputSection objects mainly for combineEhSections. Place EhInputSection objects into a new vector ehInputSections instead of inputSections.	2022-07-31 11:58:08 -07:00
Fangrui Song	0a28cfdff5	[ELF] Simplify getRankProximity. NFC	2022-07-30 16:32:42 -07:00
Fangrui Song	2e2d5304f0	[ELF] Move combineEhSections from Writer to SyntheticSections. NFC This not only places the function in the right place, but also allows inlining addSection.	2022-07-29 00:47:30 -07:00
Fangrui Song	c72973608d	[ELF] Combine EhInputSection removal and MergeInputSection removal. NFC	2022-07-29 00:39:57 -07:00
Fangrui Song	8d4b11b4f1	[ELF] Remove redundant isa<InputSection>(sec). NFC combineEhSections has been called to remove EhInputSection.	2022-07-29 00:30:52 -07:00
Fangrui Song	85cfd91723	[ELF] Optimize some non-constant alignTo with alignToPowerOf2. NFC My x86-64 lld executable is 2KiB smaller. .eh_frame writing gets faster as there were lots of divisions.	2022-07-24 11:20:49 -07:00
Fangrui Song	51b9e099d5	[ELF] Reword --no-allow-shlib-undefined diagnostic Use a format more similar to unresolved references from regular object files. It's probably easier to read for people who are less familiar with the linker diagnostics. Reviewed By: ikudrin Differential Revision: https://reviews.llvm.org/D129790	2022-07-15 01:29:58 -07:00
YongKang Zhu	2324c2e3c3	[LLD] Two tweaks to symbol ordering scheme When `--symbol-ordering-file` is specified, the linker today will always put hot contributions in the middle of cold ones when targeting RISC machine, so to minimize the chances that branch thunks need be generated for hot code calling into cold code. This is not necessary when user specifies an ordering of read-only data (vs. function) symbols, or when output section is small such that no branch thunk would ever be required. The latter is common for mobile apps. For example, among all the native ARM64 libraries in Facebook Instagram App for Android, 80% of them have text section smaller than 64KB and the largest text section seen is less than 8MB, well below the distance that a BRANCH26 can reach. Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D128382	2022-07-12 11:34:17 -07:00
Fangrui Song	6611d58f5b	[ELF] Relax R_RISCV_ALIGN Alternative to D125036. Implement R_RISCV_ALIGN relaxation so that we can handle -mrelax object files (i.e. -mno-relax is no longer needed) and creates a framework for future relaxation. `relaxAux` is placed in a union with InputSectionBase::jumpInstrMod, storing auxiliary information for relaxation. In the first pass, `relaxAux` is allocated. The main data structure is `relocDeltas`: when referencing `relocations[i]`, the actual offset is `r_offset - (i ? relocDeltas[i-1] : 0)`. `relaxOnce` performs one relaxation pass. It computes `relocDeltas` for all text sections. Then, adjust st_value/st_size for symbols relative to this section based on `SymbolAnchor`. `bytesDropped` is set so that `assignAddresses` knows that the size has changed. Run `relaxOnce` in the `finalizeAddressDependentContent` loop to wait for convergence of text sections and other address dependent sections (e.g. SHT_RELR). Note: extracting `relaxOnce` into a separate loop works for many cases but has issues in some linker script edge cases. After convergence, compute section contents: shrink the NOP sequence of each R_RISCV_ALIGN as appropriate. Instead of deleting bytes, we run a sequence of memcpy on the content delimited by relocation locations. For R_RISCV_ALIGN let the next memcpy skip the desired number of bytes. Section content computation is parallelizable, but let's ensure the implementation is mature before optimizations. Technically we can save a copy if we interleave some code with `OutputSection::writeTo`, but let's not pollute the generic code (we don't have templated relocation resolving, so using conditions can impose overhead to non-RISCV.) Tested: `make ARCH=riscv CROSS_COMPILE=riscv64-linux-gnu- LLVM=1 defconfig all` built Linux kernel using -mrelax is bootable. FreeBSD RISCV64 system using -mrelax is bootable. bash/curl/firefox/libevent/vim/tmux using -mrelax works. Differential Revision: https://reviews.llvm.org/D127581	2022-07-07 10:16:09 -07:00
Fangrui Song	e0612c91cd	[ELF] Optimize getInputSections. NFC In the majority of cases (e.g. orphan sections), an OutputSection has at most one InputSectionDescription (isd). By changing the return type to ArrayRef<InputSection *> we can just reference the isd->sections. For OutputSections with more than one InputSectionDescription we use a caller provided SmallVector to copy the elements as before. Reviewed By: peter.smith Differential Revision: https://reviews.llvm.org/D129111	2022-07-05 23:31:09 -07:00
Fangrui Song	9a572164d5	[ELF] Move InputFiles global variables (memoryBuffers, objectFiles, etc) into Ctx. NFC	2022-06-29 18:53:38 -07:00
Nico Weber	7effcbda49	Rename parallelForEachN to just parallelFor Patch created by running: rg -l parallelForEachN | xargs sed -i '' -c 's/parallelForEachN/parallelFor/' No behavior change. Differential Revision: https://reviews.llvm.org/D128140	2022-06-19 17:49:00 -04:00
Fangrui Song	2ac8ce5d56	Revert D125410 "[ELF] Align the end of PT_GNU_RELRO to max-page-size instead of common-page-size" This reverts commit `ebdb9d635a`. Changing p_memsz is insufficient and may make PT_GNU_RELRO extend beyond the PT_LOAD.	2022-05-12 20:41:22 -07:00
Fangrui Song	ebdb9d635a	[ELF] Align the end of PT_GNU_RELRO to max-page-size instead of common-page-size We picked common-page-size to match GNU ld. Recently, the resolution to GNU ld https://sourceware.org/bugzilla/show_bug.cgi?id=28824 (milestone: 2.39) switched to max-page-size so that the last page can be protected by RELRO in case the system page size is larger than common-page-size. Thanks to our two RW PT_LOAD scheme (D58892), switching to max-page-size does not change file size (while GNU ld's scheme may increase file size). Reviewed By: peter.smith Differential Revision: https://reviews.llvm.org/D125410	2022-05-12 11:03:12 -07:00
Fangrui Song	5a44980f0a	[ELF] Support custom sections between DATA_SEGMENT_ALIGN and DATA_SEGMENT_RELRO_END We currently hard code RELRO sections. When a custom section is between DATA_SEGMENT_ALIGN and DATA_SEGMENT_RELRO_END, we may report a spurious `error: section: ... is not contiguous with other relro sections`. GNU ld makes such sections RELRO. glibc recently switched to default --with-default-link=no. This configuration places `__libc_atexit` and others between DATA_SEGMENT_ALIGN and DATA_SEGMENT_RELRO_END. This patch allows such a ld.bfd --verbose linker script to be fed into lld. Reviewed By: peter.smith Differential Revision: https://reviews.llvm.org/D124656	2022-05-04 01:10:46 -07:00
Fangrui Song	be01af4a0f	[ELF] Fix non-relocatable-non-emit-relocs --gc-sections to discard .L symbols This reverts commit `764cd491b1`, which I incorrectly assumed NFC partly because there was no test coverage for the non-relocatable non-emit-relocs case before 9d6d936243fe343abe89323a27c7241b395af541. The interaction of {,-r,--emit-relocs} {,--discard-locals} {,--gc-sections} is complex but without -r/--emit-relocs, --gc-sections does need to discard .L symbols like --no-gc-sections. The behavior matches GNU ld.	2022-04-07 14:34:32 -07:00
Mitch Phillips	786c89fed3	[ELF][MTE] Add --android-memtag-* options to synthesize ELF notes This ELF note is aarch64 and Android-specific. It specifies to the dynamic loader that specific work should be scheduled to enable MTE protection of stack and heap regions. Current synthesis of the ".note.android.memtag" ELF note is done in the Android build system. We'd like to move that to the compiler. This patch adds the --memtag-stack, --memtag-heap, and --memtag-mode={async, sync, none} flags to the linker, which synthesises the note for us. Future changes will add -fsanitize=memtag* flags to clang which will pass these through to lld. Depends on D119381. Differential Revision: https://reviews.llvm.org/D119384	2022-04-04 11:17:36 -07:00
Fangrui Song	7370a489b1	[ELF] --emit-relocs: fix missing STT_SECTION when the first input section is synthetic addSectionSymbols suppresses the STT_SECTION symbol if the first input section is non-SHF_MERGE synthetic. This is incorrect when the first input section is synthetic while a non-synthetic input section exists: * `.bss : { *(COMMON) *(.bss) }` (`abc388ed3c` regressed the case because COMMON symbols precede .bss in the absence of a linker script) * Place a synthetic section in another section: `.data : { *(.got) *(.data) }` For `%t/a1` in the new test emit-relocs-synthetic.s, ld.lld produces incorrect relocations with symbol index 0. ``` 0000000000000000 <_start>: 0: 8b 05 33 00 00 00 movl 51(%rip), %eax # 0x39 <bss> 0000000000000002: R_X86_64_PC32 ABS+0xd 6: 8b 05 1c 00 00 00 movl 28(%rip), %eax # 0x28 <common> 0000000000000008: R_X86_64_PC32 common-0x4 c: 8b 05 06 00 00 00 movl 6(%rip), %eax # 0x18 000000000000000e: R_X86_64_GOTPCRELX ABS+0x4 ``` Fix the issue by checking every input section. Reviewed By: ikudrin Differential Revision: https://reviews.llvm.org/D122463	2022-03-29 08:56:21 -07:00
Fangrui Song	8565a87fd4	[ELF] Simplify MergeInputSection::getParentOffset. NFC and remove overly verbose comments.	2022-03-28 10:02:35 -07:00
Fangrui Song	940bd4c771	[ELF] addSectionSymbols: simplify isec->getOutputSection(). NFC	2022-03-24 21:54:20 -07:00
Fangrui Song	d3e5b6f753	[ELF] Implement --build-id={md5,sha1} with truncated BLAKE3 --build-id was introduced as "approximation of true uniqueness across all binaries that might be used by overlapping sets of people". It does not require the some resistance mentioned below. In practice, people just use --build-id=md5 for 16-byte build ID and --build-id=sha1 for 20-byte build ID. BLAKE3 has 256-bit key length, which provides 128-bit security against (second-)preimage, collision, and differentiability attacks. Its portable implementation is fast. It additionally provides Arm Neon/AVX2/AVX-512. Just implement --build-id={md5,sha1} with truncated BLAKE3. Linking clang 14 RelWithDebInfo with --threads=8 on a Skylake CPU: * 1.13x as fast with --build-id=md5 * 1.15x as fast with --build-id=sha1 --threads=4 on Apple m1: * 1.25x as fast with --build-id=md5 * 1.17x as fast with --build-id=sha1 Reviewed By: ikudrin Differential Revision: https://reviews.llvm.org/D121531	2022-03-24 11:31:39 -07:00
Fangrui Song	6c814931bc	[ELF] Don't use multiple inheritance for OutputSection. NFC Add an OutputDesc class inheriting from SectionCommand. An OutputDesc wraps an OutputSection. This change allows InputSection::getParent to be inlined. Differential Revision: https://reviews.llvm.org/D120650	2022-03-08 11:23:42 -08:00
Fangrui Song	9e9c86fd67	[ELF] Change some non-null pointer parameters to references. NFC To decrease difference for D120650. Also, rename some `OutputSection *sec` (and `cmd`) to the more common `osec`.	2022-02-28 11:19:00 -08:00
Fangrui Song	8d01ac75e7	[ELF] Replace an unneeded dyn_cast_or_null with dyn_cast. NFC	2022-02-28 00:50:06 -08:00
Fangrui Song	7fd3849b35	[ELF] Move --print-archive-stats= and --why-extract= beside --warn-backrefs report So that early errors don't suppress their output.	2022-02-27 20:23:09 +00:00
Fangrui Song	8ca46bba23	[ELF] Move isUsedInRegularObj assignment from ctor to call sites. NFC This removes the tricky `isUsedInRegularObj(!file || file->kind() == InputFile::ObjKind)` and the copy from `Symbol::mergeProperties`.	2022-02-23 21:32:50 -08:00
Fangrui Song	b01430a04f	[ELF] Don't rely on Symbols.h's transitive inclusion of InputFiles.h. NFC	2022-02-23 19:18:24 -08:00
Fangrui Song	fc0aa8424c	[ELF] Check COMMON symbols for PROVIDE and don't redefine COMMON symbols edata/end/etext In GNU ld, the definition precedence is: regular symbol assignment > relocatable object definition > `PROVIDE` symbol assignment. GNU ld's internal linker scripts define the non-reserved (by C and C++) edata/end/etext with `PROVIDE` so the relocatable object definition takes precedence. This makes sense because `int end;` is valid. We currently redefine such symbols if they are COMMON, but not if they are regular definitions, so `int end;` with -fcommon is essentially a UB in ld.lld. Fix this (also improve consistency and match GNU ld) by using the `isDefined` code path for `isCommon`. In GNU ld, reserved identifiers like `__ehdr_start` do not use `PROVIDE`, while we treat them all as `PROVIDE`, this seems fine. Reviewed By: peter.smith Differential Revision: https://reviews.llvm.org/D120389	2022-02-23 10:15:42 -08:00
Fangrui Song	ae1ba6194f	[ELF] Replace uncompressed InputSectionBase::data() with rawData. NFC In many call sites we know uncompression cannot happen (non-SHF_ALLOC, or the data (even if compressed) must have been uncompressed by a previous pass). Prefer rawData in these cases. data() increases code size and prevents optimization on rawData.	2022-02-21 00:39:26 -08:00
Jez Ng	69297cf639	[lld-macho] Don't include CommandFlags.h in CommonLinkerContext.h Main motivation: including `llvm/CodeGen/CommandFlags.h` in `CommonLinkerContext.h` means that the declaration of `llvm::Reloc` is visible in any file that includes `CommonLinkerContext.h`. Since our cpp files have both `using namespace llvm` and `using namespace lld::macho`, this results in conflicts with `lld::macho::Reloc`. I suppose we could put `llvm::Reloc` into a nested namespace, but in general, I think we should avoid transitively including too many header files in a very widely used header like `CommonLinkerContext.h`. RegisterCodeGenFlags' ctor initializes a bunch of function-`static` structures and does nothing else, so it should be fine to "initialize" it as a temporary stack variable rather than as a file static. Reviewed By: aganea Differential Revision: https://reviews.llvm.org/D119913	2022-02-16 20:05:07 -05:00
Fangrui Song	27bb799095	[ELF] Clean up headers. NFC	2022-02-07 21:53:34 -08:00


1731 Commits