Commit Graph

25 Commits

Author SHA1 Message Date
Kazuaki Ishizaki a1e7e401d2 [compiler-rt] NFC: Fix trivial typo
Reviewed By: xgupta

Differential Revision: https://reviews.llvm.org/D77457
2021-09-04 14:12:58 +05:30
Vitaly Buka f0c9d1e95f [tsan] Remove special SyncClock::kInvalidTid
Followup for D101428.

Reviewed By: dvyukov

Differential Revision: https://reviews.llvm.org/D101604
2021-04-30 13:39:15 -07:00
Dmitry Vyukov 92a3a2dc3e sanitizer_common: introduce kInvalidTid/kMainTid
Currently we have a bit of a mess related to tids:
 - sanitizers re-declare kInvalidTid multiple times
 - some call it kUnknownTid
 - implicit assumptions that main tid is 0
 - asan/memprof claim their tids need to fit into 24 bits,
   but this does not seem to be true anymore
 - inconsistent use of u32/int to store tids

Introduce kInvalidTid/kMainTid in sanitizer_common
and use them consistently.

Reviewed By: vitalybuka

Differential Revision: https://reviews.llvm.org/D101428
2021-04-30 15:58:05 +02:00
Dmitry Vyukov aff73487c9 tsan: increase dense slab alloc capacity
We've got a user report about heap block allocator overflow.
Bump the L1 capacity of all dense slab allocators to maximum
and be careful to not page the whole L1 array in from .bss.
If OS uses huge pages, this still may cause a limited RSS increase
due to boundary huge pages, but avoiding that looks hard.

Reviewed By: vitalybuka

Differential Revision: https://reviews.llvm.org/D101161
2021-04-29 07:34:50 +02:00
Dmitry Vyukov 4408eeed0f tsan: fix false positives in AcquireGlobal
Add ThreadClock:: global_acquire_ which is the last time another thread
has done a global acquire of this thread's clock.

It helps to avoid problem described in:
https://github.com/golang/go/issues/39186
See test/tsan/java_finalizer2.cpp for a regression test.
Note the failuire is _extremely_ hard to hit, so if you are trying
to reproduce it, you may want to run something like:
$ go get golang.org/x/tools/cmd/stress
$ stress -p=64 ./a.out

The crux of the problem is roughly as follows.
A number of O(1) optimizations in the clocks algorithm assume proper
transitive cumulative propagation of clock values. The AcquireGlobal
operation may produce an inconsistent non-linearazable view of
thread clocks. Namely, it may acquire a later value from a thread
with a higher ID, but fail to acquire an earlier value from a thread
with a lower ID. If a thread that executed AcquireGlobal then releases
to a sync clock, it will spoil the sync clock with the inconsistent
values. If another thread later releases to the sync clock, the optimized
algorithm may break.

The exact sequence of events that leads to the failure.
- thread 1 executes AcquireGlobal
- thread 1 acquires value 1 for thread 2
- thread 2 increments clock to 2
- thread 2 releases to sync object 1
- thread 3 at time 1
- thread 3 acquires from sync object 1
- thread 1 acquires value 1 for thread 3
- thread 1 releases to sync object 2
- sync object 2 clock has 1 for thread 2 and 1 for thread 3
- thread 3 releases to sync object 2
- thread 3 sees value 1 in the clock for itself
  and decides that it has already released to the clock
  and did not acquire anything from other threads after that
  (the last_acquire_ check in release operation)
- thread 3 does not update the value for thread 2 in the clock from 1 to 2
- thread 4 acquires from sync object 2
- thread 4 detects a false race with thread 2
  as it should have been synchronized with thread 2 up to time 2,
  but because of the broken clock it is now synchronized only up to time 1

The global_acquire_ value helps to prevent this scenario.
Namely, thread 3 will not trust any own clock values up to global_acquire_
for the purposes of the last_acquire_ optimization.

Reviewed-in: https://reviews.llvm.org/D80474
Reported-by: nvanbenschoten (Nathan VanBenschoten)
2020-05-27 16:27:47 +02:00
Dmitry Vyukov 180d211770 tsan: Adding releaseAcquire() to ThreadClock
realeaseAcquire() is a new function added to TSan in support of the Go data-race detector.
It's semantics is:

void ThreadClock::releaseAcquire(SyncClock *sc) const {
  for (int i = 0; i < kMaxThreads; i++) {
    tmp = clock[i];
    clock[i] = max(clock[i], sc->clock[i]);
    sc->clock[i] = tmp;
  }
}

For context see: https://go-review.googlesource.com/c/go/+/220419

Reviewed-in: https://reviews.llvm.org/D76322
Author: dfava (Daniel Fava)
2020-03-24 11:27:46 +01:00
Chandler Carruth 2946cd7010 Update the file headers across all of the LLVM projects in the monorepo
to reflect the new license.

We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.

Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.

llvm-svn: 351636
2019-01-19 08:50:56 +00:00
Dmitry Vyukov 9f2c6207d5 tsan: optimize sync clock memory consumption
This change implements 2 optimizations of sync clocks that reduce memory consumption:

Use previously unused first level block space to store clock elements.
Currently a clock for 100 threads consumes 3 512-byte blocks:

2 64-bit second level blocks to store clock elements
+1 32-bit first level block to store indices to second level blocks
Only 8 bytes of the first level block are actually used.
With this change such clock consumes only 2 blocks.

Share similar clocks differing only by a single clock entry for the current thread.
When a thread does several release operations on fresh sync objects without intervening
acquire operations in between (e.g. initialization of several fields in ctor),
the resulting clocks differ only by a single entry for the current thread.
This change reuses a single clock for such release operations. The current thread time
(which is different for different clocks) is stored in dirty entries.

We are experiencing issues with a large program that eats all 64M clock blocks
(32GB of non-flushable memory) and crashes with dense allocator overflow.
Max number of threads in the program is ~170 which is currently quite unfortunate
(consume 4 blocks per clock). Currently it crashes after consuming 60+ GB of memory.
The first optimization brings clock block consumption down to ~40M and
allows the program to work. The second optimization further reduces block consumption
to "modest" 16M blocks (~8GB of RAM) and reduces overall RAM consumption to ~30GB.

Measurements on another real world C++ RPC benchmark show RSS reduction
from 3.491G to 3.186G and a modest speedup of ~5%.

Go parallel client/server HTTP benchmark:
https://github.com/golang/benchmarks/blob/master/http/http.go
shows RSS reduction from 320MB to 240MB and a few percent speedup.

Reviewed in https://reviews.llvm.org/D35323

llvm-svn: 308018
2017-07-14 11:30:06 +00:00
Dmitry Vyukov 62b9ad718f tsan: refactor SyncClock code
1. Add SyncClock::ResetImpl which removes code
   duplication between ctor and Reset.
2. Move SyncClock::Resize to SyncClock methods,
   currently it's defined between ThreadClock methods.

llvm-svn: 307785
2017-07-12 12:50:36 +00:00
Dmitry Vyukov 5f924089e5 tsan: prepare clock for future changes
Pass ClockCache to ThreadClock::set and introduce ThreadCache::ResetCached.
For now both are unused, but will reduce future diffs.

llvm-svn: 307784
2017-07-12 12:45:20 +00:00
Alexey Samsonov a49cfd8f94 Revert "Apply modernize-use-default to compiler-rt."
This reverts commit r250823.

Replacing at least some of empty
constructors with "= default" variants is a semantical change which we
don't want. E.g. __tsan::ClockBlock contains a union of large arrays,
and it's critical for correctness and performance that we don't memset()
these arrays in the constructor.

llvm-svn: 251717
2015-10-30 18:52:31 +00:00
Angel Garcia Gomez ea61047c6f Apply modernize-use-default to compiler-rt.
Summary: Replace empty bodies of default constructors and destructors with '= default'.

Reviewers: klimek, bkramer

Subscribers: alexfh, cfe-commits

Differential Revision: http://reviews.llvm.org/D13892

llvm-svn: 250823
2015-10-20 12:53:50 +00:00
Dmitry Vyukov dc1caa7cb8 tsan: address comments in r214912
See http://reviews.llvm.org/D4794

llvm-svn: 216900
2014-09-02 09:34:34 +00:00
Dmitry Vyukov 70db9d4d72 tsan: allocate vector clocks using slab allocator
Vector clocks is the most actively allocated object in tsan runtime.
Current internal allocator is not scalable enough to handle allocation
of clocks in scalable way (too small caches). This changes transforms
clocks to 2-level array with 512-byte blocks. Since all blocks are of
the same size, it's possible to cache them more efficiently in per-thread caches.

llvm-svn: 214912
2014-08-05 18:45:02 +00:00
Dmitry Vyukov bde4c9c773 tsan: refactor storage of meta information for heap blocks and sync objects
The new storage (MetaMap) is based on direct shadow (instead of a hashmap + per-block lists).
This solves a number of problems:
 - eliminates quadratic behaviour in SyncTab::GetAndLock (https://code.google.com/p/thread-sanitizer/issues/detail?id=26)
 - eliminates contention in SyncTab
 - eliminates contention in internal allocator during allocation of sync objects
 - removes a bunch of ad-hoc code in java interface
 - reduces java shadow from 2x to 1/2x
 - allows to memorize heap block meta info for Java and Go
 - allows to cleanup sync object meta info for Go
 - which in turn enabled deadlock detector for Go

llvm-svn: 209810
2014-05-29 13:50:54 +00:00
Dmitry Vyukov b5eb8f0212 tsan: fix vector clocks
the new optimizations break when thread ids gets reused (clocks go backwards)
add the necessary tests as well

llvm-svn: 206035
2014-04-11 15:38:03 +00:00
Dmitry Vyukov d23118c3b2 tsan: optimize vector clock operations
Make vector clock operations O(1) for several important classes of use cases.
See comments for details.
Below are stats from a large server app, 77% of all clock operations are handled as O(1).

Clock acquire                     :         25983645
  empty clock                     :          6288080
  fast from release-store         :         14917504
  contains my tid                 :          4515743
  repeated (fast)                 :          2141428
  full (slow)                     :          2636633
  acquired something              :          1426863
Clock release                     :          2544216
  resize                          :             6241
  fast1                           :           197693
  fast2                           :          1016293
  fast3                           :             2007
  full (slow)                     :          1797488
  was acquired                    :           709227
  clear tail                      :                1
  last overflow                   :                0
Clock release store               :          3446946
  resize                          :           200516
  fast                            :           469265
  slow                            :          2977681
  clear tail                      :                0
Clock acquire-release             :           820028

llvm-svn: 204656
2014-03-24 18:54:20 +00:00
Dmitry Vyukov e11f2920c9 tsan: more precise handling of finalizers
llvm-svn: 167530
2012-11-07 15:08:20 +00:00
Dmitry Vyukov ba827dfdae tsan: don't release disabled clocks
llvm-svn: 167451
2012-11-06 13:16:25 +00:00
Dmitry Vyukov 904d3f9c06 tsan: add ReleaseStore() function that merely copies vector clock rather than combines two clocks
fix clock setup for finalizer goroutine (Go runtime)

llvm-svn: 160918
2012-07-28 15:27:41 +00:00
Dmitry Vyukov dfc8e52400 tsan: suport for Go finalizers
llvm-svn: 160723
2012-07-25 13:16:35 +00:00
Dmitry Vyukov 302cebb8f1 tsan: add shadow memory flush + fix few bugs
llvm-svn: 157270
2012-05-22 18:07:45 +00:00
Dmitry Vyukov fee5b7d2e0 tsan: detect accesses to freed memory
http://codereview.appspot.com/6214052

llvm-svn: 156990
2012-05-17 14:17:51 +00:00
Kostya Serebryany 07c4805175 [tsan] run more kinds of builds as presubmit test (and fix gcc debug build)
llvm-svn: 156616
2012-05-11 14:42:24 +00:00
Kostya Serebryany 4ad375f0a9 [tsan] First commit of ThreadSanitizer (TSan) run-time library.
Algorithm description: http://code.google.com/p/thread-sanitizer/wiki/ThreadSanitizerAlgorithm

Status:
The tool is known to work on large real-life applications, but still has quite a few rough edges.
Nothing is guaranteed yet.

The tool works on x86_64 Linux.
Support for 64-bit MacOS 10.7+ is planned for late 2012.
Support for 32-bit OSes is doable, but problematic and not yet planed.

Further commits coming:
  - tests
  - makefiles
  - documentation
  - clang driver patch

The code was previously developed at http://code.google.com/p/data-race-test/source/browse/trunk/v2/
by Dmitry Vyukov and Kostya Serebryany with contributions from
Timur Iskhodzhanov, Alexander Potapenko, Alexey Samsonov and Evgeniy Stepanov.

llvm-svn: 156542
2012-05-10 13:48:04 +00:00