[lld-macho] Enable EH frame relocation / pruning

This just removes the code that gates the logic. The main issue here is perf
impact: without {D122258}, LLD takes a significant perf hit because it now has
to do a lot more work in the input parsing phase. But with that change to
eliminate unnecessary EH frames from input object files, the perf overhead here
is minimal. Concretely, here are the numbers for some builds as measured on my
16-core Mac Pro:

**chromium_framework**

This is without the use of `-femit-dwarf-unwind=no-compact-unwind`:

             base           diff           difference (95% CI)
  sys_time   1.826 ± 0.019  1.962 ± 0.034  [  +6.5% ..   +8.4%]
  user_time  9.306 ± 0.054  9.926 ± 0.082  [  +6.2% ..   +7.1%]
  wall_time  8.225 ± 0.068  8.947 ± 0.128  [  +8.0% ..   +9.6%]
  samples    15             22

With that flag enabled, the regression mostly disappears, as hoped:

             base           diff           difference (95% CI)
  sys_time   1.839 ± 0.062  1.866 ± 0.068  [  -0.9% ..   +3.8%]
  user_time  9.452 ± 0.068  9.490 ± 0.067  [  -0.1% ..   +0.9%]
  wall_time  8.383 ± 0.127  8.452 ± 0.114  [  -0.1% ..   +1.8%]
  samples    17             21

**Unnamed internal app**

Without `-femit-dwarf-unwind`, this is the perf hit:

             base           diff           difference (95% CI)
  sys_time   1.372 ± 0.029  1.317 ± 0.024  [  -4.6% ..   -3.5%]
  user_time  2.835 ± 0.028  2.980 ± 0.027  [  +4.8% ..   +5.4%]
  wall_time  3.205 ± 0.079  3.383 ± 0.066  [  +4.9% ..   +6.2%]
  samples    102            83

With `-femit-dwarf-unwind`, the perf hit almost disappears:

             base           diff           difference (95% CI)
  sys_time   1.274 ± 0.026  1.270 ± 0.025  [  -0.9% ..   +0.3%]
  user_time  2.812 ± 0.023  2.822 ± 0.035  [  +0.1% ..   +0.7%]
  wall_time  3.166 ± 0.047  3.174 ± 0.059  [  -0.2% ..   +0.7%]
  samples    95             97

Just for fun, I measured the impact of `-femit-dwarf-unwind` on ld64 (`base` has
the extra DWARF unwind info in the input object files, `diff` doesn't):

             base           diff           difference (95% CI)
  sys_time   1.128 ± 0.010  1.124 ± 0.023  [  -1.3% ..   +0.6%]
  user_time  7.176 ± 0.030  7.106 ± 0.094  [  -1.5% ..   -0.4%]
  wall_time  7.874 ± 0.041  7.795 ± 0.121  [  -1.7% ..   -0.3%]
  samples    16             25

And for LLD:

             base           diff           difference (95% CI)
  sys_time   1.315 ± 0.019  1.280 ± 0.019  [  -3.2% ..   -2.0%]
  user_time  2.980 ± 0.022  2.822 ± 0.016  [  -5.5% ..   -5.0%]
  wall_time  3.369 ± 0.038  3.175 ± 0.033  [  -6.2% ..   -5.3%]
  samples    47             47

So parsing the extra EH frames is a lot more expensive for us than for ld64. But
given that we are quite a lot faster than ld64 to begin with, I guess this isn't
entirely unexpected...

Reviewed By: #lld-macho, oontvoo

Differential Revision: https://reviews.llvm.org/D129540
parent
422e6e7629
commit
403d61aedd
@@ -131,9 +131,6 @@ struct Configuration {
   bool omitDebugInfo = false;
   bool warnDylibInstallName = false;
   bool ignoreOptimizationHints = false;
-  // Temporary config flag that will be removed once we have fully implemented
-  // support for __eh_frame.
-  bool parseEhFrames = false;
   uint32_t headerPad;
   uint32_t dylibCompatibilityVersion = 0;
   uint32_t dylibCurrentVersion = 0;
@@ -1305,7 +1305,6 @@ bool macho::link(ArrayRef<const char *> argsArr, llvm::raw_ostream &stdoutOS,
   config->callGraphProfileSort = args.hasFlag(
       OPT_call_graph_profile_sort, OPT_no_call_graph_profile_sort, true);
   config->printSymbolOrder = args.getLastArgValue(OPT_print_symbol_order);
-  config->parseEhFrames = static_cast<bool>(getenv("LLD_IN_TEST"));

   // FIXME: Add a commandline flag for this too.
   config->zeroModTime = getenv("ZERO_AR_DATE");
@@ -347,7 +347,7 @@ void ObjFile::parseSections(ArrayRef<SectionHeader> sectionHeaders) {
       section.subsections.push_back({0, isec});
     } else if (auto recordSize = getRecordSize(segname, name)) {
       splitRecords(*recordSize);
-    } else if (config->parseEhFrames && name == section_names::ehFrame &&
+    } else if (name == section_names::ehFrame &&
                segname == segment_names::text) {
       splitEhFrames(data, *sections.back());
     } else if (segname == segment_names::llvm) {
@@ -1117,7 +1117,7 @@ template <class LP> void ObjFile::parse() {
   }
   if (compactUnwindSection)
     registerCompactUnwind(*compactUnwindSection);
-  if (config->parseEhFrames && ehFrameSection)
+  if (ehFrameSection)
     registerEhFrames(*ehFrameSection);
 }

@@ -65,7 +65,11 @@ MinGW Improvements
 MachO Improvements
 ------------------

-* Item 1.
+* We now support proper relocation and pruning of EH frames. **Note:** this
+  comes at some performance overhead on x86_64 builds, and we recommend adding
+  the ``-femit-dwarf-unwind=no-compact-unwind`` compile flag to avoid it.
+  (`D129540 <https://reviews.llvm.org/D129540>`_,
+  `D122258 <https://reviews.llvm.org/D122258>`_)

 WebAssembly Improvements
 ------------------------
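The `splitEhFrames` call that this change un-gates has to divide `__eh_frame` into individual records before they can be relocated or pruned. Its implementation is not part of this diff, so the following is only a minimal sketch of the general idea, based on the standard `.eh_frame` layout: each record starts with a 4-byte length (or `0xffffffff` followed by an 8-byte extended length), and a zero in the next 4-byte ID field marks a CIE rather than an FDE. The names `EhRecord` and `splitEhRecords` are hypothetical, and the code assumes the section data is in host byte order.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// One length-prefixed record found in the section data (hypothetical type).
struct EhRecord {
  size_t offset; // start of the record, including its length field
  size_t size;   // total size, including its length field
  bool isCie;    // true when the ID field after the length is zero
};

// Hypothetical helper, not LLD's API. Assumes host-endian section data.
std::vector<EhRecord> splitEhRecords(const uint8_t *data, size_t size) {
  std::vector<EhRecord> records;
  size_t off = 0;
  while (size - off >= 4) {
    uint32_t len32;
    std::memcpy(&len32, data + off, 4);
    if (len32 == 0)
      break; // a zero length terminates the section
    uint64_t length = len32;
    size_t headerSize = 4;
    if (len32 == 0xffffffffu) { // extended 64-bit length follows
      if (size - off < 12)
        break;
      std::memcpy(&length, data + off + 4, 8);
      headerSize = 12;
    }
    // The length must cover the ID field and stay inside the buffer.
    if (length < 4 || length > size - off - headerSize)
      break;
    uint32_t id;
    std::memcpy(&id, data + off + headerSize, 4);
    size_t total = headerSize + static_cast<size_t>(length);
    records.push_back({off, total, id == 0});
    off += total;
  }
  return records;
}
```

Stopping at the first malformed record keeps the sketch short; a real linker would report an error instead.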