DWARF v5 6.2.4 The Line Number Program Header says:
> The first entry is the current directory of the compilation. Each additional
> path entry is either a full path name or is relative to the current directory of
> the compilation.
When forming a path, relative DW_AT_comp_dir and directories[0] are not supposed
to be joined together. Fix getFileNameByIndex to special case DWARF v5 DirIdx == 0.
Reviewed By: #debug-info, dblaikie
Differential Revision: https://reviews.llvm.org/D131804
Implements the pc element for the symbolizing filter, including it's
"ra" and "pc" modes. Return addresses ("ra") are adjusted by
decrementing one. By default, {{{pc}}} elements are assumed to point to
precise code ("pc") locations. Backtrace elements will adopt the
opposite convention.
Along the way, some minor refactors of value printing and colorization.
Reviewed By: peter.smith
Differential Revision: https://reviews.llvm.org/D131115
These are new debug types that ships with the latest
Windows SDK and would warn and finally fail lld-link.
The symbols seems to be related to Microsoft's XFG
which is their version of CFG. We can't handle any of
this yet, so for now we can just ignore these types
so that lld doesn't fail with a new version of Windows
SDK.
Fixes: #56285
Reviewed By: rnk
Differential Revision: https://reviews.llvm.org/D129378
Given a TypeIndex or CVType return its size in bytes. Basically it
is the inverse to 'CodeViewDebug::lowerTypeBasic', that returns a
TypeIndex based in a size.
Reviewed By: rnk, djtodoro
Differential Revision: https://reviews.llvm.org/D129846
This connects the Symbolizer to the markup filter and enables the first
working end-to-end flow using the filter.
Reviewed By: peter.smith
Differential Revision: https://reviews.llvm.org/D130187
This review is extracted from D96035.
DWARF Debuginfo classes have two representations for DIEs: DWARFDebugInfoEntry
(short) and DWARFDie(extended). Depending on the task, it might be more convenient
to use DWARFDebugInfoEntry or/and DWARFDie. DWARFUnit class already has methods
working with DWARFDie and DWARFDebugInfoEntry. This patch adds more
methods working with DWARFDebugInfoEntry to have paired functionality.
Reviewed By: aprantl
Differential Revision: https://reviews.llvm.org/D126059
Teach libDebugInfo (llvm-dwarfdump) and lldb about DWARF tags and
attributes for pointer authentication. These values have been emitted by
Apple clang for several releases. Although upstream LLVM doesn't emit
these values yet, we hope to upstream that part sometime soon.
Differential revision: https://reviews.llvm.org/D130215
This change implements the contextual symbolizer markup elements: reset,
module, and mmap. These provide information about the runtime context of
the binary necessary to resolve addresses to symbolic values.
Summary information is printed to the output about this context.
Multiple mmap elements for the same module line are coalesced together.
The standard requires that such elements occur on their own lines to
allow for this; accordingly, anything after a contextual element on a
line is silently discarded.
Implementing this cleanly requires that the filter drive the parser;
this allows skipped sections to avoid being parsed. This also makes the
filter quite a bit easier to use, at the cost of some unused
flexibility.
Reviewed By: peter.smith
Differential Revision: https://reviews.llvm.org/D129519
clang 14 removed -gz=zlib-gnu and ld.lld/llvm-objcopy removed .zdebug support
recently. llvm-dwp currently doesn't support SHF_COMPRESSED. Add support and
remove .zdebug support.
Simplify llvm::object::Decompressor which has no .zdebug user now.
While here, add tests for ELF32LE, ELF32BE, and ELF64BE.
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D129728
(Reapply after revert in e9ce1a5880 due to
Fuchsia test failures. Removed changes in lib/ExecutionEngine/ other
than error categories, to be checked in more detail and reapplied
separately.)
Bulk remove many of the more trivial uses of ManagedStatic in the llvm
directory, either by defining a new getter function or, in many cases,
moving the static variable directly into the only function that uses it.
Differential Revision: https://reviews.llvm.org/D129120
Bulk remove many of the more trivial uses of ManagedStatic in the llvm
directory, either by defining a new getter function or, in many cases,
moving the static variable directly into the only function that uses it.
Differential Revision: https://reviews.llvm.org/D129120
The DWARF spec says:
Any debugging information entry representing the declaration of an object,
module, subprogram or type may have DW_AT_decl_file, DW_AT_decl_line and
DW_AT_decl_column attributes, each of whose value is an unsigned integer
^^^^^^^^
constant.
If however, a producer happens to emit DW_AT_decl_file /
DW_AT_decl_line using a signed integer form, llvm-dwarfdump crashes,
like so:
(... snip ...)
0x000000b4: DW_TAG_structure_type
DW_AT_name ("test_struct")
DW_AT_byte_size (136)
DW_AT_decl_file (llvm-dwarfdump: (... snip ...)/llvm/include/llvm/ADT/Optional.h:197: T& llvm::optional_detail::OptionalStorage<T, true>::getValue() &
[with T = long unsigned int]: Assertion `hasVal' failed.
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
Stack dump:
0. Program arguments: /opt/rocm/llvm/bin/llvm-dwarfdump ./testsuite/outputs/gdb.rocm/lane-pc-vega20/lane-pc-vega20-kernel.so
#0 0x000055cc8e78315f PrintStackTraceSignalHandler(void*) Signals.cpp:0:0
#1 0x000055cc8e780d3d SignalHandler(int) Signals.cpp:0:0
#2 0x00007f8f2cae8420 __restore_rt (/lib/x86_64-linux-gnu/libpthread.so.0+0x14420)
#3 0x00007f8f2c58d00b raise /build/glibc-SzIz7B/glibc-2.31/signal/../sysdeps/unix/sysv/linux/raise.c:51:1
#4 0x00007f8f2c56c859 abort /build/glibc-SzIz7B/glibc-2.31/stdlib/abort.c:81:7
#5 0x00007f8f2c56c729 get_sysdep_segment_value /build/glibc-SzIz7B/glibc-2.31/intl/loadmsgcat.c:509:8
#6 0x00007f8f2c56c729 _nl_load_domain /build/glibc-SzIz7B/glibc-2.31/intl/loadmsgcat.c:970:34
#7 0x00007f8f2c57dfd6 (/lib/x86_64-linux-gnu/libc.so.6+0x33fd6)
#8 0x000055cc8e58ceb9 llvm::DWARFDie::dump(llvm::raw_ostream&, unsigned int, llvm::DIDumpOptions) const (/opt/rocm/llvm/bin/llvm-dwarfdump+0x2e0eb9)
#9 0x000055cc8e58bec3 llvm::DWARFDie::dump(llvm::raw_ostream&, unsigned int, llvm::DIDumpOptions) const (/opt/rocm/llvm/bin/llvm-dwarfdump+0x2dfec3)
#10 0x000055cc8e5b28a3 llvm::DWARFCompileUnit::dump(llvm::raw_ostream&, llvm::DIDumpOptions) (.part.21) DWARFCompileUnit.cpp:0:0
Likewise with DW_AT_call_file / DW_AT_call_line.
The problem is that the code in llvm/lib/DebugInfo/DWARF/DWARFDie.cpp
dumping these attributes assumes that
FormValue.getAsUnsignedConstant() returns an armed optional. If in
debug mode, we get an assertion line the above. If in release mode,
and asserts are compiled out, then we proceed as if the optional had a
value, running into undefined behavior, printing whatever random
value.
Fix this by checking whether the optional returned by
FormValue.getAsUnsignedConstant() has a value, like done in other
places.
In addition, DWARFVerifier.cpp is validating DW_AT_call_file /
DW_AT_decl_file, but not AT_call_line / DW_AT_decl_line. This commit
fixes that too.
The llvm-dwarfdump/X86/verify_file_encoding.yaml testcase is extended
to cover these cases. Current llvm-dwarfdump crashes running the
newly-extended test.
"make check-llvm-tools-llvm-dwarfdump" shows no regressions, on x86-64
GNU/Linux.
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D129392
This library implements the class `DebuginfodCollection`, which scans a set of directories for binaries, classifying them according to whether they contain debuginfo. This also provides the `DebuginfodServer`, an `HTTPServer` which serves debuginfod's `/debuginfo` and `/executable` endpoints. This is intended as the final new supporting library required for `llvm-debuginfod`.
As implemented here, `DebuginfodCollection` only finds ELF binaries and DWARF debuginfo. All other files are ignored. However, the class interface is format-agnostic. Generalizing to support other platforms will require refactoring of LLVM's object parsing libraries to eliminate use of `report_fatal_error` ([[ https://github.com/llvm/llvm-project/blob/main/llvm/lib/Object/WasmObjectFile.cpp#L74 | e.g. when reading WASM files ]]), so that the debuginfod daemon does not crash when it encounters a malformed file on the disk.
The `DebuginfodCollection` is tested by end-to-end tests of the debuginfod server (D114846).
Reviewed By: mysterymath
Differential Revision: https://reviews.llvm.org/D114845
llvm::codeview::visitMemberRecordStream expects to receive an array ref that's FieldListRecord's Data not a CVType's data which has 4 more bytes preceeding. The first 2 bytes indicate the size of the FieldListRecord, and following 2 bytes is always 0x1203. Inside llvm::codeview::visitMemberRecordStream, it iterates to the data to check if first two bytes matching some type record kinds. If the size coincidentally matches one type kind, it will start parsing from there and causing crash.
This adds a --filter option to llvm-symbolizer. This takes log-bearing
symbolizer markup from stdin and writes a human-readable version to
stdout.
For now, this only implements the "symbol" markup tag; all others are
passed through unaltered. This is a proof-of-concept bit of
functionalty; implement the various tags is more-or-less just a matter
of hooking up various parts of the Symbolize library to the architecture
established here.
Reviewed By: peter.smith
Differential Revision: https://reviews.llvm.org/D126980
This allows registering certain tags as possibly beginning multi-line
elements in the symbolizer markup parser. The parser is kept agnostic to
how lines are delimited; it reports the entire contents, including line
endings, once the end of element marker is reached.
Reviewed By: peter.smith
Differential Revision: https://reviews.llvm.org/D124798
Running iwyu-diff on LLVM codebase since fb67d683db detected a few
regressions, fixing them.
The impact on preprocessed output is negligible: -4k lines.
Patch created by running:
rg -l parallelForEachN | xargs sed -i '' -c 's/parallelForEachN/parallelFor/'
No behavior change.
Differential Revision: https://reviews.llvm.org/D128140
Type attributes are currently printed as:
DW_AT_type (<address> "<name>")
For example:
DW_AT_type (0x00000086 "double")
However, containing_type attributes omit the name, for example:
DW_AT_containing_type (0x00000086)
In order to make the dwarf dumps easier to read, and to have consistency
between the type-like attributes, this commit changes the way
DW_AT_containing_type is printed so that it includes the name of the
type it refers to:
DW_AT_containing_type (0x00000086 "double")
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D127078
This adds a parser for the log symbolizer markup format discussed in
https://discourse.llvm.org/t/rfc-log-symbolizer/61282. The parser
operates in a line-by-line fashion with minimal memory requirements.
This doesn't yet include support for multi-line tags or specific parsing
for ANSI X3.64 SGR control sequences, but it can be extended to do so.
The latter can also be relatively easily handled by examining the
resulting text elements.
Reviewed By: peter.smith
Differential Revision: https://reviews.llvm.org/D124686
The bug was introduced when the AddressRange class was no longer able to modify the End address directly and the entire range of the .text address range that contained the trailing empty symbol was replaced. There was no unit test for this, so it wasn't caught. I fixed the bug and added a unit test for it.
The effects of this bug are serious as the AddressOffsetSize in the header would be incorrectly calculated and an invalid GSYM would be created.
Differential Revision: https://reviews.llvm.org/D127811
Add proper CodeView encoding for positive constant integer values greater than
127. In addition, use the two byte encoding form for positive values less
than LF_NUMERIC.
Differential Revision: https://reviews.llvm.org/D126968
- truncateQuotedNameFront: The last use was removed on Jul 10, 2017 in
commit a9d944fd6f.
- truncateQuotedNameBack: The last use was removed on Mar 26, 2018 in
commit 7b84b678a9.
- truncateStringMiddle: The last use was removed on Mar 26, 2018 in
commit 7b84b678a9.
- truncateStringBack: The last use is in truncateQuotedNameBack being
removed above.
- truncateStringFront: The last use is in truncateQuotedNameFront
being removed above.
This is minimum changes extracted from https://reviews.llvm.org/D78950. The old patch tried to add LRU eviction of caching data structure. Due to multiple layers of interfaces that users could be using, it was not clear where to put the functionality. While we work out on where to put that functionality, it'll be great to add this minimum interface change so that the user could implement their own memory management. More specifically:
* Add a clearLineTable method for DWARFDebugLine which erases the given offset from the LineTableMap.
* DWARFDebugContext adds the clearLineTableForUnit method that leverages clearLineTable to remove the object corresponding to a given compile unit, for memory management purposes. When it is referred to again, the line table object will be repopulated.
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D90006
Currently, llvm-symbolizer doesn't like to parse .debug_info in order to
show the line info for global variables. addr2line does this. In the
future, I'm looking to migrate AddressSanitizer off of internal metadata
over to using debuginfo, and this is predicated on being able to get the
line info for global variables.
This patch adds the requisite support for getting the line info from the
.debug_info section for symbolizing global variables. This only happens
when you ask for a global variable to be symbolized as data.
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D123538
Discovered in a large object that would need a 64 bit index (but the
cu/tu index format doesn't include a 64 bit offset/length mode in
DWARF64 - a spec bug) but instead binutils dwp overflowed the offsets
causing overlapping regions.
llvm-profgen gives error message when the input binary contains premature terminator in .debug_aranges section. These zero length items point to some rodata with zero size type in embed Rust Library. Considering Zero-Sized Types are a valid feature in Rust. They are not real error. This change makes the "error:" message into a warning to avoid misleading.
Why do we still want a warning on such case? because it doesn't follow dwarf standard. https://bugs.llvm.org/show_bug.cgi?id=46805 contains early discussion.
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D124121
Right now, if we want to dump symbol at specified offset, we need to use `grep`.
And it can only show surrounding symbols in layout (not in lexical scope sense).
This adds similar options to `dump` command as `llvm-dwarfdump` to allow users
to dump symbol record at specified offset and its parents or children with
spcified depth.
`--symbol-offset=` must be used with `--modi` to dump only one symbol at given
offset.
`--show-parents`/`--show-children` must be used with `--symbol-offset` to
dump all symbols that are parents/children of the symbol at given offset.
`--parent-recurse-depth`/`--children-recurse-depth` must be used with
`--show-parents`/`--show-children` to specify the max up/down depth.
Reviewed By: rnk
Differential Revision: https://reviews.llvm.org/D124317
llvm-gsymutil has an implementation of AddressRange and AddressRanges
classes. That implementation might be reused in other parts of llvm.
This patch moves AddressRange and AddressRanges classes into llvm/ADT.
Differential Revision: https://reviews.llvm.org/D124350
The change described by:
https://reviews.llvm.org/D122226
Moved some llvm-pdbutil functionality to the debug PDB library.
This patch addresses a broken '-modi' argument handling, which
causes an assertion if its value is other than '0' or '1'.
In addition, it moves the assertion for the number of occurrences
of the '-modi' argument from the PDB library into the llvm-pdbutil
driver.
Reviewed By: zequanwu
Differential Revision: https://reviews.llvm.org/D123483
The changes described by:
https://reviews.llvm.org/D121801https://reviews.llvm.org/D122226
Moved some llvm-pdbutil functionality to the debug PDB library.
This patch addresses one outstanding issue concerning the global
state (Filters) created in the PDB library.
- Move 'Filters' inside the 'LinePrinter' class.
- Omit 'Optional' and just pass 'PrintScope &HeaderScope' everywhere.
Reviewed By: aganea
Differential Revision: https://reviews.llvm.org/D122887
This fixes the issue when the current line offset is actually for next range.
Maintain a current code range with current line offset and cache next file/line
offset. Update file/line offset after finishing current range.
Differential Revision: https://reviews.llvm.org/D123151
Returning `std::array<uint8_t, N>` is better ergonomics for the hashing functions usage, instead of a `StringRef`:
* When returning `StringRef`, client code is "jumping through hoops" to do string manipulations instead of dealing with fixed array of bytes directly, which is more natural
* Returning `std::array<uint8_t, N>` avoids the need for the hasher classes to keep a field just for the purpose of wrapping it and returning it as a `StringRef`
As part of this patch also:
* Introduce `TruncatedBLAKE3` which is useful for using BLAKE3 as the hasher type for `HashBuilder` with non-default hash sizes.
* Make `MD5Result` inherit from `std::array<uint8_t, 16>` which improves & simplifies its API.
Differential Revision: https://reviews.llvm.org/D123100
Since enumerators may not be available in every translation unit they
can't be reliably used to name entities. (this also makes simplified
template name roundtripping infeasible - since the expected name could
only be rebuilt if the enumeration definition could be found (or only if
it couldn't be found, depending on the context of the original name))
At Sony we are developing llvm-dva
https://lists.llvm.org/pipermail/llvm-dev/2020-August/144174.html
For its PDB support, it requires functionality already present in
llvm-pdbutil.
We intend to move that functionaly into the PDB library to be
shared by both tools. That change will be done in 2 steps, that
will be submitted as 2 patches:
(1) Replace 'ExitOnError' with explicit error handling.
(2) Move the intended shared code to the PDB library.
Patch for step (1): https://reviews.llvm.org/D121801
This patch is for step (2).
Move InputFile.cpp[h], FormatUtil.cpp[h] and LinePrinter.cpp[h]
files to the debug PDB library.
It exposes the following functionality that can be used by tools:
- Open a PDB file.
- Get module debug stream.
- Traverse module sections.
- Traverse module subsections.
Most of the needed functionality is in InputFile, but there are
dependencies from LinePrinter and FormatUtil.
Some other functionality is in the following functions in
DumpOutputStyle.cpp file:
- iterateModuleSubsections
- getModuleDebugStream
- iterateOneModule
- iterateSymbolGroups
- iterateModuleSubsections
Only these specific functions from DumpOutputStyle are moved to
the PDB library.
Reviewed By: aganea, dblaikie, rnk
Differential Revision: https://reviews.llvm.org/D122226
This change adds a simple LRU cache to the Symbolize class to put a cap
on llvm-symbolizer memory usage. Previously, the Symbolizer's virtual
memory footprint would grow without bound as additional binaries were
referenced.
I'm putting this out there early for an informal review, since there may be
a dramatically different/better way to go about this. I still need to
figure out a good default constant for the memory cap and benchmark the
implementation against a large symbolization workload. Right now I've
pegged max memory usage at zero for testing purposes, which evicts the whole
cache every time.
Unfortunately, it looks like StringRefs in the returned DI objects can
directly refer to the contents of binaries. Accordingly, the cache
pruning must be explicitly requested by the caller, as the caller must
guarantee that none of the returned objects will be used afterwards.
For llvm-symbolizer this a light burden; symbolization occurs
line-by-line, and the returned objects are discarded after each.
Implementation wise, there are a number of nested caches that depend
on one another. I've implemented a simple Evictor callback system to
allow derived caches to register eviction actions to occur when the
underlying binaries are evicted.
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D119784
On some standard library configurations these have a dependency on the
complete type of SymbolizableModule. They also do a lot of
copying/freeing so no point in inlining them.
As usual with that header cleanup series, some implicit dependencies now need to
be explicit:
llvm/DebugInfo/DWARF/DWARFContext.h no longer includes:
- "llvm/DebugInfo/DWARF/DWARFAcceleratorTable.h"
- "llvm/DebugInfo/DWARF/DWARFCompileUnit.h"
- "llvm/DebugInfo/DWARF/DWARFDebugAbbrev.h"
- "llvm/DebugInfo/DWARF/DWARFDebugAranges.h"
- "llvm/DebugInfo/DWARF/DWARFDebugFrame.h"
- "llvm/DebugInfo/DWARF/DWARFDebugLoc.h"
- "llvm/DebugInfo/DWARF/DWARFDebugMacro.h"
- "llvm/DebugInfo/DWARF/DWARFGdbIndex.h"
- "llvm/DebugInfo/DWARF/DWARFSection.h"
- "llvm/DebugInfo/DWARF/DWARFTypeUnit.h"
- "llvm/DebugInfo/DWARF/DWARFUnitIndex.h"
Plus llvm/Support/Errc.h not included by a bunch of llvm/DebugInfo/DWARF/DWARF*.h files
Preprocessed lines to build llvm on my setup:
after: 1065629059
before: 1066621848
Which is a great diff!
Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D119723
Most notably,
llvm/Object/Binary.h no longer includes llvm/Support/MemoryBuffer.h
llvm/Object/MachOUniversal*.h no longer include llvm/Object/Archive.h
llvm/Object/TapiUniversal.h no longer includes llvm/Object/TapiFile.h
llvm-project preprocessed size:
before: 1068185081
after: 1068324320
Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D119457
This adds a --build-id=<hex build ID> flag to llvm-symbolizer. If --obj
is unspecified, this will attempt to look up the provided build ID using
whatever mechanisms are available to the Symbolizer (typically,
debuginfod). The semantics are then as if the found binary were given
using the --obj flag.
Reviewed By: jhenderson, phosek
Differential Revision: https://reviews.llvm.org/D118633
Debuginfod can pull in libcurl as a dependency, which isn't appropriate
for libLLVM. (See
https://gitlab.freedesktop.org/mesa/mesa/-/issues/5732).
This change breaks out debuginfod into a separate non-component library
that can be used directly in llvm-symbolizer. The tool can inject
debuginfod into the Symbolizer library via an abstract DebugInfoFetcher
interface, breaking the dependency of Symbolizer on debuinfod.
See https://github.com/llvm/llvm-project/issues/52731
Reviewed By: phosek
Differential Revision: https://reviews.llvm.org/D118413
Major user-facing changes:
Many headers in llvm/DebugInfo/CodeView no longer include
llvm/Support/BinaryStreamReader.h or llvm/Support/BinaryStreamWriter.h,
those headers may need to be included manually.
Several headers in llvm/DebugInfo/CodeView no longer include
llvm/DebugInfo/CodeView/EnumTables.h or llvm/DebugInfo/CodeView/CodeView.h,
those headers may need to be included manually.
Some statistics:
$ clang++ -E -Iinclude -I../llvm/include ../llvm/lib/DebugInfo/CodeView/*.cpp -std=c++14 -fno-rtti -fno-exceptions | wc -l
after: 2794466
before: 2832765
Discourse thread on the topic: https://discourse.llvm.org/t/include-what-you-use-include-cleanup/
Differential Revision: https://reviews.llvm.org/D119092
This change moves the SymbolizableObjectFile header to
include/llvm/DebugInfo/Symbolize. Making this header available to other
llvm libraries simplifies use cases where implicit caching, multiple
platform support and other features of the Symbolizer class are not
required. This also makes the dependent libraries easier to unit test
by having mocks which derive from SymbolizableModule.
Differential Revision: https://reviews.llvm.org/D116781
The convert only worked on CUs in main binary.
If it's a skeleton CU it will now use the DWO CU
when invoking handleDie.
Test Plan:
llvm-lit
Reviewed By: clayborg
Differential Revision: https://reviews.llvm.org/D118521
This patch creates functions which might be used to dump types.
This functionality was already implemented by DWARFTypePrinter.
Now it could be reused. It will help D96035, which uses DWARFTypePrinter.
Differential Revision: https://reviews.llvm.org/D117134