Commit Graph

356 Commits

Author SHA1 Message Date
Fangrui Song 89fab98e88 [DebugInfo] llvm::Optional => std::optional
https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2022-12-05 00:09:22 +00:00
Fangrui Song f4c16c4473 [MC] llvm::Optional => std::optional
https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2022-12-04 21:36:08 +00:00
Fangrui Song ea47ccc78f [BOLT] Fix after DebugInfoMetadata change 0ca43d4488 2022-12-04 18:57:52 +00:00
Kazu Hirata e324a80fab [BOLT] Use std::nullopt instead of None (NFC)
This patch mechanically replaces None with std::nullopt where the
compiler would warn if None were deprecated.  The intent is to reduce
the amount of manual work required in migrating from Optional to
std::optional.

This is part of an effort to migrate from llvm::Optional to
std::optional:

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2022-12-02 23:12:38 -08:00
Kazu Hirata 1028b165ee [BOLT] Fix a build error
This patch fixes:

  bolt/lib/Profile/DataAggregator.cpp:264:66: error: no viable
  conversion from 'Optional<llvm::StringRef>[3]' to
  'ArrayRef<std::optional<StringRef>>'
2022-12-01 15:48:03 -08:00
Kazu Hirata 04b59e7af9 [BOLT] Fix unused function warnings
This patch fixes:

  bolt/lib/Passes/CallGraph.cpp:27:15: error: unused function
  'hash_int64_fallback' [-Werror,-Wunused-function]

  bolt/lib/Passes/CallGraph.cpp:40:15: error: unused function
  'hash_int64' [-Werror,-Wunused-function]
2022-11-29 11:13:14 -08:00
Guillaume Chatelet 702126aec5 [NFC] Add helper method to ensure min alignment on MCSection
Follow up on D138653.

Differential Revision: https://reviews.llvm.org/D138686
2022-11-28 10:00:34 +00:00
Guillaume Chatelet 6c09ea3fdd [Alignment][NFC] Use Align in MCStreamer::emitValueToAlignment
Differential Revision: https://reviews.llvm.org/D138674
2022-11-24 16:09:44 +00:00
Guillaume Chatelet 4f17734175 [Alignment][NFC] Use Align in MCStreamer::emitCodeAlignment
This patch makes code less readable but it will clean itself after all functions are converted.

Differential Revision: https://reviews.llvm.org/D138665
2022-11-24 14:51:46 +00:00
Guillaume Chatelet e647b4f519 [reland][Alignment][NFC] Use the Align type in MCSection
Differential Revision: https://reviews.llvm.org/D138653
2022-11-24 13:19:18 +00:00
Kazu Hirata 34bcadc38c Use std::nullopt_t instead of NoneType (NFC)
This patch replaces those occurrences of NoneType that would trigger
an error if the definition of NoneType were missing in None.h.

To keep this patch focused, I am deliberately not replacing None with
std::nullopt in this patch or updating comments.  They will be
addressed in subsequent patches.

This is part of an effort to migrate from llvm::Optional to
std::optional:

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716

Differential Revision: https://reviews.llvm.org/D138539
2022-11-23 14:16:04 -08:00
Nico Weber e8ce5f1ec9 [bolt] Use llvm::sys::RWMutex instead of std::shared_timed_mutex
This has the following advantages:
- std::shared_timed_mutex is macOS 10.12+ only. llvm::sys::RWMutex
  automatically switches to a different implementation internally
  when targeting older macOS versions.
- bolt only needs std::shared_mutex, not std::shared_timed_mutex.
  llvm::sys::RWMutex automatically uses std::shared_mutex internally
  where available.

std::shared_mutex and RWMutex have the same API, so no code changes
other than types and includes are needed.

Differential Revision: https://reviews.llvm.org/D138423
2022-11-21 19:24:32 -05:00
Kazu Hirata 1fa870b1bd Use None consistently (NFC)
This patch replaces NoneType() and NoneType::None with None in
preparation for migration from llvm::Optional to std::optional.

In the std::optional world, we are not guranteed to be able to
default-construct std::nullopt_t or peek what's inside it, so neither
NoneType() nor NoneType::None has a corresponding expression in the
std::optional world.

Once we consistently use None, we should even be able to replace the
contents of llvm/include/llvm/ADT/None.h with something like:

  using NoneType = std::nullopt_t;
  inline constexpr std::nullopt_t None = std::nullopt;

to ease the migration from llvm::Optional to std::optional.

Differential Revision: https://reviews.llvm.org/D138376
2022-11-20 00:24:40 -08:00
Nico Weber f65e8c3c51 [bolt] Fix std::prev()-past-begin in veneer handling code
matchLinkerVeneer() returns 3 if `Instruction` and the last
two instructions in `[Instructions.begin, Instructions.end())`
match the pattern

    ADRP  x16, imm
    ADD   x16, x16, imm
    BR    x16

BinaryContext.cpp used to use

    --Count;
    for (auto It = std::prev(Instructions.end()); Count != 0;
         It = std::prev(It), --Count) {
      ...use It...
    }

to walk these instructions. The first `--Count` skips the
instruction that's in `Instruction` instead of in `Instructions`.
The loop then walks over `Instructions`.

However, on the last iteration, this calls `std::prev()` on an
iterator that points at the container's begin(), which can blow
up.

Instead, use rbegin(), which sidesteps this issue.

Fixes test/AArch64/veneer-gold.s on a macOS host.
With this, check-bolt passes on macOS.

Differential Revision: https://reviews.llvm.org/D138313
2022-11-18 14:42:08 -05:00
Nico Weber d731d6df64 [bolt] add missing space in "llvm-bolt -help" output 2022-11-18 09:47:11 -05:00
revunov.denis@huawei.com c92ff2a3c4 [BOLT][NFC] Fix possible use-after-free
If NewName twine has reference to the old name, then after
Section.Name = NewName.str(); this reference is invalidated,
so we cannot use NewName.str() anymore.

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D137616
2022-11-14 13:30:22 +00:00
Rafael Auler 3698994492 [BOLT] Always move JTs in jump-table=move
We should always move jump tables when requested. Previously,
we were not moving jump tables of non-simple functions in relocation
mode. That caused a bug detailed in the attached test case: in PIC
jump tables, we force jump tables to be moved, but if they are not
moved because the function is not simple, we could incorrectly update
original entries in .rodata, corrupting it under special circumstances
(see testcase).

Reviewed By: #bolt, maksfb

Differential Revision: https://reviews.llvm.org/D137357
2022-11-04 13:20:11 -07:00
Alexey Moksyakov 1fb186198a adds huge pages support of PIE/no-PIE binaries
This patch adds the huge pages support (-hugify) for PIE/no-PIE
binaries. Also returned functionality to support the kernels < 5.10
where there is a problem in a dynamic loader with the alignment of
pages addresses.

Differential Revision: https://reviews.llvm.org/D129107
2022-11-04 15:14:21 +03:00
serge-sans-paille f71d32a0ee
Honor LLVM_LIBDIR_SUFFIX
Some distribution install libraries under lib64. LLVM supports this
through LLVM_LIBDIR_SUFFIX, have bolt do the same.

Differential Revision: https://reviews.llvm.org/D137039
2022-11-01 23:54:06 +01:00
Hongtao Yu d5a963ab8b [PseudoProbe] Replace relocation with offset for entry probe.
Currently pseudo probe encoding for a function is like:
	- For the first probe, a relocation from it to its physical position in the code body
	- For subsequent probes, an incremental offset from the current probe to the previous probe

The relocation could potentially cause relocation overflow during link time. I'm now replacing it with an offset from the first probe to the function start address.

A source function could be lowered into multiple binary functions due to outlining (e.g, coro-split). Since those binary function have independent link-time layout, to really avoid relocations from .pseudo_probe sections to .text sections, the offset to replace with should really be the offset from the probe's enclosing binary function, rather than from the entry of the source function. This requires some changes to previous section-based emission scheme which now switches to be function-based. The assembly form of pseudo probe directive is also changed correspondingly, i.e, reflecting the binary function name.

Most of the source functions end up with only one binary function. For those don't, a sentinel probe is emitted for each of the binary functions with a different name from the source. The sentinel probe indicates the binary function name to differentiate subsequent probes from the ones from a different binary function. For examples, given source function

```
Foo() {
  …
  Probe 1
  …
  Probe 2
}
```

If it is transformed into two binary functions:

```
Foo:
   …

Foo.outlined:
   …
```

The encoding for the two binary functions will be separate:

```

GUID of Foo
  Probe 1

GUID of Foo
  Sentinel probe of Foo.outlined
  Probe 2
```

Then probe1 will be decoded against binary `Foo`'s address, and Probe 2 will be decoded against `Foo.outlined`. The sentinel probe of `Foo.outlined` makes sure there's not accidental relocation from `Foo.outlined`'s probes to `Foo`'s entry address.

On the BOLT side, to be minimal intrusive, the pseudo probe re-encoding sticks with the old encoding format. This is fine since unlike linker, Bolt processes the pseudo probe section as a whole and it is free from relocation overflow issues.

The change is downwards compatible as long as there's no mixed use of the old encoding and the new encoding.

Reviewed By: wenlei, maksfb

Differential Revision: https://reviews.llvm.org/D135912
Differential Revision: https://reviews.llvm.org/D135914
Differential Revision: https://reviews.llvm.org/D136394
2022-10-27 13:28:22 -07:00
Maksim Panchenko 20204db503 [BOLT] Add mold-style PLT support
mold linker creates symbols for PLT entries and that caught BOLT by
surprise. Add the support for marked PLT entries.

Fixes: #58498

Reviewed By: yota9

Differential Revision: https://reviews.llvm.org/D136655
2022-10-25 11:03:52 -07:00
Rafael Auler c0d954a068 [BOLT] Ignore duplicate global symbols
We noticed some binaries with duplicated global symbol
entries (same name, address and size). Ignore them as it is possibly a
bug in the linker, and continue processing, unless the symbol has a
different size or address.

Reviewed By: #bolt, maksfb

Differential Revision: https://reviews.llvm.org/D136122
2022-10-19 11:52:06 -07:00
Alexander Yermolovich fcd7717ddf [BOLT][DWARF] Add support for DW_FORM_addr for DW_AT_call_return_pc
GCC 12 produces DW_FORM_addr for DW_AT_call_return_pc. Added support for that.
Fixes facebookincubator/BOLT#307

Reviewed By: maksfb

Differential Revision: https://reviews.llvm.org/D136204
2022-10-19 10:44:09 -07:00
Maksim Panchenko 28d70d3f1e [BOLT][NFC] Refactor EFMM initialization
Move EFMM initialization code to emitAndLink(), where EFMM is used.

Reviewed By: yavtuk

Differential Revision: https://reviews.llvm.org/D136205
2022-10-18 20:31:10 -07:00
Maksim Panchenko bcc4c90954 [BOLT] Fix instruction encoding validation
Always use non-symbolizing disassembler for instruction encoding
validation as symbols will be treated as undefined/zeros be the encoder
and causing byte sequence mismatches.

Reviewed By: Amir

Differential Revision: https://reviews.llvm.org/D136118
2022-10-18 13:50:00 -07:00
Maksim Panchenko dc8035bddd [BOLT][NFCI] Avoid calling registerName() twice
Calling registerName() for the same symbol twice, even with a different
size, has no effect other than the lookup overhead. Avoid the
redundancy.

Fixes facebookincubator/BOLT#299

Reviewed By: Amir

Differential Revision: https://reviews.llvm.org/D136115
2022-10-17 16:16:31 -07:00
Maksim Panchenko 4d3a0cade2 [BOLT] Section-handling refactoring/overhaul
Simplify the logic of handling sections in BOLT. This change brings more
direct and predictable mapping of BinarySection instances to sections in
the input and output files.

* Only sections from the input binary will have a non-null SectionRef.
  When a new section is created as a copy of the input section,
  its SectionRef is reset to null.

* RewriteInstance::getOutputSectionName() is removed as the section name
  in the output file is now defined by BinarySection::getOutputName().

* Querying BinaryContext for sections by name uses their original name.
  E.g., getUniqueSectionByName(".rodata") will return the original
  section even if the new .rodata section was created.

* Input file sections (with relocations applied) are emitted via MC with
  ".bolt.org" prefix. However, their name in the output binary is
  unchanged unless a new section with the same name is created.

* New sections are emitted internally with ".bolt.new" prefix if there's
  a name conflict with an input file section. Their original name is
  preserved in the output file.

* Section header string table is properly populated with section names
  that are actually used. Previously we used to include discarded
  section names as well.

* Fix the problem when dynamic relocations were propagated to a new
  section with a name that matched a section in the input binary.
  E.g., the new .rodata with jump tables had dynamic relocations from
  the original .rodata.

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D135494
2022-10-13 23:10:39 -07:00
Rafael Auler 4f158995b9 [BOLT] Add pass to fix ambiguous memory references
This adds a round of checks to memory references, looking for
incorrect references to jump table objects. Fix them by replacing the
jump table reference with another object reference + offset.

This solves bugs related to regular data references in code
accidentally being bound to a jump table, and this reference being
updated to a new (incorrect) location because we moved this jump
table.

Fixes #55004

Reviewed By: #bolt, maksfb

Differential Revision: https://reviews.llvm.org/D134098
2022-10-12 18:39:50 -07:00
Rafael Auler 8d1fc45dc3 [BOLT][NFC] Refactor creation of symbol+addend references
Put code that creates references to symbol+addend behind MCPlusBuilder.
Will use this later in validate memory references pass.

Reviewed By: #bolt, maksfb, yota9

Differential Revision: https://reviews.llvm.org/D134097
2022-10-12 18:39:26 -07:00
Maksim Panchenko 5fca9c5763 [BOLT] Change order of new sections
While the order of new sections in the output binary was deterministic
in the past (i.e. there was no run-to-run variation), it wasn't always
rational as we used size to define the precedence of allocatable
sections within "code" or "data" groups (probably unintentionally).
Fix that by defining stricter section-ordering rules.

Other than the order of sections, this should be NFC.

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D135235
2022-10-07 11:20:42 -07:00
Maksim Panchenko 0b213c9090 [BOLT] Fix writing out unmarked .eh_frame section
When BOLT updates .eh_frame section, it concatenates newly-generated
contents (from CFI directives) with the original .eh_frame that has
relocations applied to it. However, if no new content is generated,
the original .eh_frame has to be left intact. In that case, BOLT was
still writing out the relocatable copy of the original .eh_frame section
to the new segment, even though this copy was never used and was not
even marked in the section header table.

Detect the scenario above and skip allocating extra space for .eh_frame.

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D135223
2022-10-07 11:19:51 -07:00
Maksim Panchenko c683e281cd [BOLT] Properly set _end symbol
To properly set the "_end" symbol, we need to track the last allocatable
address. Simply emitting "_end" at the end of some section is not
sufficient since the order of section allocation is unknown during the
emission step.

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D135121
2022-10-07 11:19:14 -07:00
Maksim Panchenko 3e097fab5a [BOLT][NFC] Remove text section assertion
We can emit a binary without a new text section. Hence, the text section
assertion is not needed.

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D135120
2022-10-07 11:18:37 -07:00
Gabriel Ravier 9966b3e728 [BOLT] Fixed some typos
I went over the output of the following mess of a command:

`(ulimit -m 2000000; ulimit -v 2000000; git ls-files -z | parallel --xargs -0 cat | aspell list --mode=none --ignore-case | grep -E '^[A-Za-z][a-z]*$' | sort | uniq -c | sort -n | grep -vE '.{25}' | aspell pipe -W3 | grep : | cut -d' ' -f2 | less)`

and proceeded to spend a few days looking at it to find probable typos
and fixed a few hundred of them in all of the llvm project (note, the
ones I found are not anywhere near all of them, but it seems like a
good start).

Reviewed By: Amir, maksfb

Differential Revision: https://reviews.llvm.org/D130824
2022-09-30 17:07:04 +02:00
Amir Ayupov 90d87dbf4b [BOLT] Report BB reordering %-age vs profiled and total number of functions
Reviewed By: spupyrev

Differential Revision: https://reviews.llvm.org/D134819
2022-09-29 12:35:45 +02:00
Rafael Auler ba9cc6537c [PERF2BOLT] Fix unittest failure
Fix failure caused by commit e549ac072b "Do not issue parsing error on
weird build ids".
2022-09-28 16:01:57 -07:00
Rafael Auler e549ac072b [PERF2BOLT] Do not issue parsing error on weird build ids
In weird entries we were issueing a parse error. For example, in line 5 here:

6862acc063b0aa86595f52ff81628577df4296ff a.so
6862acc063b0aa86595f52ff81628577df4296ff a.so
6862acc063b0aa86595f52ff81628577df4296ff a.so
db758cb3c970044e78d5a4c99b011708a9995636 bin1
60326683eab31acfd03435d9ed4ff9a8         bin2
7d448e51851b4bdb33eac84f90e74628a14a5f00 b.so
742aa26e0211794356cc25f415c25230a26aa045 c.so

Error reading BOLT data input file: line 89, column 33: malformed field

Fix that.

Reviewed By: #bolt, Amir

Differential Revision: https://reviews.llvm.org/D134822
2022-09-28 14:41:55 -07:00
Huan Nguyen 153eeb4a5e [BOLT] Disable -lite when split function is present
In lite mode, BOLT only transforms a subset of functions, leave the
remaining functions intact.

For NoPIC, it is fine. BOLT can scan relocations and fix-up all refs
that point to any function body in the subset.

For no-split function PIC, it is fine. Since jump tables are intra-
procedural transfer, BOLT can find both the jump table base and the
target within same function. Thus, BOLT can update and/or move jump
tables.

However, it is wrong to process a subset of functions in split function
PIC. This is because BOLT does not know if functions in the subset are
isolated, i.e., cannot be accessed by functions out of the subset,
especially via split jump table.

For example, BOLT only process three functions A, B and C. Suppose that
A is reached via jump table from A.cold, which is not processed. When
A is moved (due to optimization), the jump table in A.cold is invalid.
We cannot fix-up this jump table since it is only recognized in A.cold,
which BOLT does not process.

Solution: Disable lite mode if split function is present.

Future improvement: In lite mode, if split function is found, BOLT
processes both functions in the subset and all of their sibling
fragments.

Test Plan:
```
ninja check-bolt
```

Reviewed By: Amir, maksfb

Differential Revision: https://reviews.llvm.org/D131283
2022-09-28 19:26:17 +02:00
serge-sans-paille 61cff9079c [BOLT] Support building bolt when LLVM_LINK_LLVM_DYLIB is ON
This does *not* link with libLLVM, but with static archives instead. Not
super-great, but at least the build works, which is probably better than
failing.

Related to #57551

Differential Revision: https://reviews.llvm.org/D134434
2022-09-23 07:59:30 +02:00
serge-sans-paille 9029ed2e4b [BOLT] Fix (part of) dylib compatibility
Non-LLVM components should not be listed as part of LLVM_LINK_COMPONENTS.

Differential Revision: https://reviews.llvm.org/D134278
2022-09-22 10:41:40 +02:00
serge-sans-paille 3ca61941c1 Revert "[bolt] Fix (part of) dylib compatibility"
This reverts commit 34ad83d883.
2022-09-22 10:41:21 +02:00
serge-sans-paille 34ad83d883 [bolt] Fix (part of) dylib compatibility
Non-LLVM component should not be listed as part of LLVM_LINK_COMPONENTS

Differential Revision: https://reviews.llvm.org/D134278
2022-09-22 10:32:40 +02:00
Amir Ayupov 39336fc09c [BOLT] Control aggregation mode output profile file format
In perf2bolt and `-aggregate-only` BOLT mode, the output profile file is written
in fdata format by default. Provide a knob `-profile-format=[fdata,yaml]` to
control the format.
Note that `-w` option still dumps in YAML format.

Reviewed By: #bolt, maksfb

Differential Revision: https://reviews.llvm.org/D133995
2022-09-19 13:37:10 -07:00
Kazu Hirata 981fa1c15c Fix unused variable warnings:
This patch fixes warnings during a release build:

  mlir/lib/Dialect/Transform/IR/TransformInterfaces.cpp:198:52: error:
  lambda capture 'this' is not used [-Werror,-Wunused-lambda-capture]

  bolt/lib/Rewrite/RewriteInstance.cpp:5318:18: error: unused variable
  'HasNoAddress' [-Werror,-Wunused-variable]
2022-09-19 10:42:50 -07:00
spupyrev 539b6c68cb [BOLT] Unifying implementations of ext-tsp
After BOLT's merge to LLVM, there are two (almost identical) versions of the
code layout algorithm. The diff unifies the implementations by keeping the one
in LLVM.

There are mild changes in the resulting block orders. I tested the changes
extensively both on the clang binary and on prod services. Didn't see stat sig
differences on average.

Reviewed By: Amir

Differential Revision: https://reviews.llvm.org/D129895
2022-09-19 08:29:08 -07:00
Kazu Hirata c9696322bd [BOLT] Use x.empty() instead of llvm::empty(x) (NFC)
I'm planning to deprecate and eventually remove llvm::empty.

Note that no use of llvm::empty requires the ability of llvm::empty to
determine the emptiness from begin/end only.
2022-09-18 11:01:56 -07:00
Maksim Panchenko f1a11d770e [BOLT][NFC] Remove unreachable assertion
Reviewed By: ayermolo

Differential Revision: https://reviews.llvm.org/D134094
2022-09-16 17:03:35 -07:00
Maksim Panchenko 1d5393526c [BOLT] Change base class of ExecutableFileMemoryManager
When we derive EFMM from SectionMemoryManager, it brings into EFMM extra
functionality, such as the registry of exception handling sections,
page permission management, etc. Such functionality is of no use to
llvm-bolt and can even be detrimental (see
https://github.com/llvm/llvm-project/issues/56726).

Change the base class of ExecutableFileMemoryManager to MemoryManager,
avoid registering EH sections, and skip memory finalization.

Fixes #56726

Reviewed By: yota9

Differential Revision: https://reviews.llvm.org/D133994
2022-09-16 13:39:12 -07:00
Maksim Panchenko 9742c25b98 [BOLT] Fix empty function emission in non-relocation mode
In non-relocation mode, every function is emitted in its own section. If
a function is empty, RuntimeDyld will still allocate 1-byte section
for the function and initialize it with zero. As a result, we will
overwrite the first byte of the original function contents with zero.
Such scenario can happen when the input function had only NOP
instructions which BOLT removes by default. Even though such functions
likely cause undefined behavior, it's better to preserve their contents.

Reviewed By: yota9

Differential Revision: https://reviews.llvm.org/D133978
2022-09-16 13:38:32 -07:00
Amir Ayupov e002523b65 [BOLT] Verify externally referenced blocks against jump table targets
For functions with references to internal offsets from data, verify externally
referenced blocks against the set of jump table targets. Mark the function
as non-simple if there are any unclaimed data to code references.

Reviewed By: #bolt, maksfb

Differential Revision: https://reviews.llvm.org/D132495
2022-09-16 11:44:33 -07:00