This reverts commit ea1826ee57.
This change is breaking the ability of tests to override the profile
output file. Need to add a mechanism to do that before resubmitting.
For the RAII lock usage we need to create a local var. There were some headers which clang-tidy identified as unused.
Reviewed By: tejohnson
Differential Revision: https://reviews.llvm.org/D138593
With all of the writing of the memprof profile consolidated into one
place, there is no need to set up the profile file (which creates the
file and also redirects all printing from the runtime to it) until we
are ready to dump the profile.
This allows errors and other messages to be dumped to stderr instead of
the profile file, which by default is in a binary format. Additionally,
reset the output file to stderr after dumping the profile so that any
requested memprof allocator statistics are printed to stderr.
Differential Revision: https://reviews.llvm.org/D138175
When in-tree libcxx is selected as the sanitizer C++ ABI, use
libcxx-abi-* targets rather than libcxxabi and libunwind directly.
Differential Revision: https://reviews.llvm.org/D134855
Currently memprof profile is dumped when program exits (call `FinishAndWrite()` in ~Allocator) or `__memprof_profile_dump` is manually called.
For programs that never exit (e.g. server-side application), it will be useful to dump memprof profile when specific signal is received.
This patch installs a signal handler for deadly signals(SIGSEGV, SIGBUS, SIGABRT, SIGILL, SIGTRAP, SIGFPE) like we do in other sanitizers. In the signal handler `__memprof_profile_dump` is called to dump memprof profile.
Reviewed By: tejohnson
Differential Revision: https://reviews.llvm.org/D134795
It casued some runtimes builds to fail with cmake error
No target "libcxx-abi-static"
see code review.
> When in-tree libcxx is selected as the sanitizer C++ ABI, use
> libcxx-abi-* targets rather than libcxxabi and libunwind directly.
>
> Differential Revision: https://reviews.llvm.org/D134855
This reverts commit 414f9b7d2f.
When in-tree libcxx is selected as the sanitizer C++ ABI, use
libcxx-abi-* targets rather than libcxxabi and libunwind directly.
Differential Revision: https://reviews.llvm.org/D134855
We already link libunwind explicitly so avoid trying to link toolchain's
default libunwind which may be missing. This matches what we already do
for libcxx and libcxxabi.
Differential Revision: https://reviews.llvm.org/D129472
This is a follow up to D118200 which applies a similar cleanup to
headers when using in-tree libc++ to avoid accidentally picking up
the system headers.
Differential Revision: https://reviews.llvm.org/D128035
This is a follow up to D118200 which applies a similar cleanup to
headers when using in-tree libc++ to avoid accidentally picking up
the system headers.
Differential Revision: https://reviews.llvm.org/D128035
While investigating another issue, I noticed that `MaybeReexec()` never
actually "re-executes via `execv()`" anymore. `DyldNeedsEnvVariable()`
only returned true on macOS 10.10 and below.
Usually, I try to avoid "unnecessary" cleanups (it's hard to be certain
that there truly is no fallout), but I decided to do this one because:
* I initially tricked myself into thinking that `MaybeReexec()` was
relevant to my original investigation (instead of being dead code).
* The deleted code itself is quite complicated.
* Over time a few other things were mushed into `MaybeReexec()`:
initializing `MonotonicNanoTime()`, verifying interceptors are
working, and stripping the `DYLD_INSERT_LIBRARIES` env var to avoid
problems when forking.
* This platform-specific thing leaked into `sanitizer_common.h`.
* The `ReexecDisabled()` config nob relies on the "strong overrides weak
pattern", which is now problematic and can be completely removed.
* `ReexecDisabled()` actually hid another issue with interceptors not
working in unit tests. I added an explicit `verify_interceptors`
(defaults to `true`) option instead.
Differential Revision: https://reviews.llvm.org/D129157
While investigating another issue, I noticed that `MaybeReexec()` never
actually "re-executes via `execv()`" anymore. `DyldNeedsEnvVariable()`
only returned true on macOS 10.10 and below.
Usually, I try to avoid "unnecessary" cleanups (it's hard to be certain
that there truly is no fallout), but I decided to do this one because:
* I initially tricked myself into thinking that `MaybeReexec()` was
relevant to my original investigation (instead of being dead code).
* The deleted code itself is quite complicated.
* Over time a few other things were mushed into `MaybeReexec()`:
initializing `MonotonicNanoTime()`, verifying interceptors are
working, and stripping the `DYLD_INSERT_LIBRARIES` env var to avoid
problems when forking.
* This platform-specific thing leaked into `sanitizer_common.h`.
* The `ReexecDisabled()` config nob relies on the "strong overrides weak
pattern", which is now problematic and can be completely removed.
* `ReexecDisabled()` actually hid another issue with interceptors not
working in unit tests. I added an explicit `verify_interceptors`
(defaults to `true`) option instead.
Differential Revision: https://reviews.llvm.org/D129157
This reverts commit 857ec0d01f.
Fixes -DLLVM_ENABLE_MODULES=On build by adding the new textual
header to the modulemap file.
Reviewed in https://reviews.llvm.org/D117722
This patch refactors out the MemInfoBlock definition into a macro based
header which can be included to generate enums, structus and code for
each field recorded by the memprof profiling runtime.
Differential Revision: https://reviews.llvm.org/D117722
An infinite loop without any effects is illegal C++ and can be optimized
away by the compiler.
Reviewed By: vitalybuka
Differential Revision: https://reviews.llvm.org/D119575
The definition of the MemInfoBlock is shared between the memprof
compiler-rt runtime and llvm/lib/ProfileData/. This change removes the
memprof_meminfoblock header and moves the struct to the shared include
file. To enable this sharing, the Print method is moved to the
memprof_allocator (the only place it is used) and the remaining uses are
updated to refer to the MemInfoBlock defined in the MemProfData.inc
file.
Also a couple of other minor changes which improve usability of the
types in MemProfData.inc.
* Update the PACKED macro to handle commas.
* Add constructors and equality operators.
* Don't initialize the buildid field.
Differential Revision: https://reviews.llvm.org/D116780
When using the in-tree libc++, we should be using the full path to
ensure that we're using the right library and not accidentally pick up
the system library.
Differential Revision: https://reviews.llvm.org/D118200
Currently we use very common names for macros like ACQUIRE/RELEASE,
which cause conflicts with system headers.
Prefix all macros with SANITIZER_ to avoid conflicts.
Reviewed By: vitalybuka
Differential Revision: https://reviews.llvm.org/D116652
According comments on D44404, something like that was the goal.
Reviewed By: morehouse, kstoimenov
Differential Revision: https://reviews.llvm.org/D114991
The added test demonstrates loading a dynamic library with static TLS.
Such static TLS is a hack that allows a dynamic library to have faster TLS,
but it can be loaded only iff all threads happened to allocate some excess
of static TLS space for whatever reason. If it's not the case loading fails with:
dlopen: cannot load any more object with static TLS
We used to produce a false positive because dlopen will write into TLS
of all existing threads to initialize/zero TLS region for the loaded library.
And this appears to be racing with initialization of TLS in the thread
since we model a write into the whole static TLS region (we don't what part
of it is currently unused):
WARNING: ThreadSanitizer: data race (pid=2317365)
Write of size 1 at 0x7f1fa9bfcdd7 by main thread:
0 memset
1 init_one_static_tls
2 __pthread_init_static_tls
[[ this is where main calls dlopen ]]
3 main
Previous write of size 8 at 0x7f1fa9bfcdd0 by thread T1:
0 __tsan_tls_initialization
Fix this by ignoring accesses during dlopen.
Reviewed By: melver
Differential Revision: https://reviews.llvm.org/D114953
The first 8b of each raw profile section need to be aligned to 8b since
the first item in each section is a u64 count of the number of items in
the section.
Summary of changes:
* Assert alignment when reading counts.
* Update test to check alignment, relax some size checks to allow padding.
* Update raw binary inputs for llvm-profdata tests.
Differential Revision: https://reviews.llvm.org/D114826
The memprof unit tests use an older version of gmock (included in the
repo) which does not build cleanly with -pedantic:
https://github.com/google/googletest/issues/2650
For now just silence the warning by disabling pedantic and add the
appropriate flags for gcc and clang.
This commit adds initial support to llvm-profdata to read and print
summaries of raw memprof profiles.
Summary of changes:
* Refactor shared defs to MemProfData.inc
* Extend show_main to display memprof profile summaries.
* Add a simple raw memprof profile reader.
* Add a couple of tests to tools/llvm-profdata.
Differential Revision: https://reviews.llvm.org/D114286
We dropped the printing of live on exit blocks in rG1243cef245f6 -
the commit changed the insertOrMerge logic. Remove the message since it
is no longer needed (all live blocks are inserted into the hashmap)
before serializing/printing the profile. Furthermore, the original
intent was to capture evicted blocks so it wasn't entirely correct.
Also update the binary format test invocation to remove the redundant
print_text directive now that it is the default.
Differential Revision: https://reviews.llvm.org/D114285
MemProfUnitTest now depends on libcxx but the dependency is not
explicitly expressed in build system, causing build races. This patch
addresses this issue.
Differential Revision: https://reviews.llvm.org/D114267
memprof does not use user_id for anything,
so don't pass it to ThreadCreate.
Passing a random field of MemprofThread as user_id
does not make much sense anyway.
Depends on D113920.
Reviewed By: vitalybuka
Differential Revision: https://reviews.llvm.org/D113921
Since glibc 2.34, dlsym does
1. malloc 1
2. malloc 2
3. free pointer from malloc 1
4. free pointer from malloc 2
These sequence was not handled by trivial dlsym hack.
This fixes https://bugs.llvm.org/show_bug.cgi?id=52278
Reviewed By: eugenis, morehouse
Differential Revision: https://reviews.llvm.org/D112588
Set the default memprof serialization format as binary. 9 tests are
updated to use print_text=true. Also fixed an issue with concatenation
of default and test specified options (missing separator).
Differential Revision: https://reviews.llvm.org/D113617
This change implements the raw binary format discussed in
https://lists.llvm.org/pipermail/llvm-dev/2021-September/153007.html
Summary of changes
* Add a new memprof option to choose binary or text (default) format.
* Add a rawprofile library which serializes the MIB map to profile.
* Add a unit test for rawprofile.
* Mark sanitizer procmaps methods as virtual to be able to mock them.
* Extend memprof_profile_dump regression test.
Differential Revision: https://reviews.llvm.org/D113317
The existing implementation uses a cache + eviction based scheme to
record heap profile information. This design was adopted to ensure a
constant memory overhead (due to fixed number of cache entries) along
with incremental write-to-disk for evictions. We find that since the
number to entries to track is O(unique-allocation-contexts) the overhead
of keeping all contexts in memory is not very high. On a clang workload,
the max number of unique allocation contexts was ~35K, median ~11K.
For each context, we (currently) store 64 bytes of data - this amounts
to 5.5MB (max). Given the low overheads for a complex workload, we can
simplify the implementation by using a hashmap without eviction.
Other changes:
* Memory map is dumped at the end rather than startup. The relative
order in the profile dump is unchanged since we no longer have evicted
entries at runtime.
* Added a test to check meminfoblocks are merged.
Differential Revision: https://reviews.llvm.org/D111676
Move the memprof MemInfoBlock struct to it's own header as requested
during the review of D111676.
Differential Revision: https://reviews.llvm.org/D113315
Previously for mem* intrinsics we only incremented the access count for
the first word in the range. However, after thinking it through I think
it makes more sense to record an access for every word in the range.
This better matches the behavior of inlined memory intrinsics, and also
allows better analysis of utilization at a future date.
Differential Revision: https://reviews.llvm.org/D110799
Previously we used a global Allocator-scope mutex to lock when adding a
deallocation to the MIB cache. This resulted in a lot of contention.
Instead add and use per-set mutexes.
Along with this, we now need to remove the global miss and access count
variables and instead utilize the per-set statistics to report the
overall miss rate.
Differential Revision: https://reviews.llvm.org/D109853