Commit Graph

3530 Commits

Author SHA1 Message Date
Dmitry Ilvokhin 143f458188 Fix `hpa_strict_min_purge_interval` option logic
We update `shard->last_purge` on each call of `hpa_try_purge` if we
purged something. This means, when `hpa_strict_min_purge_interval`
option is set only one slab will be purged, because on the next
call condition for too frequent purge protection
`since_last_purge_ms < shard->opts.min_purge_interval_ms` will always
be true. This is not an intended behaviour.

Instead, we need to check `min_purge_interval_ms` once and purge as many
pages as needed to satisfy requirements for `hpa_dirty_mult` option.

Make possible to count number of actions performed in unit tests (purge,
hugify, dehugify) instead of binary: called/not called. Extended current
unit tests with cases where we need to purge more than one page for a
purge phase.
2024-08-20 10:02:38 -07:00
Dmitry Ilvokhin 0a9f51d0d8 Simplify `hpa_shard_maybe_do_deferred_work`
It doesn't make much sense to repeat purging once we done with
hugification, because we can de-hugify pages that were hugified just
moment ago for no good reason. Let them wait next deferred work phase
instead. And if they still meeting purging conditions then, purge them.
2024-08-20 10:02:38 -07:00
Amaury Séchet a25b9b8ba9 Simplify the logic when bumping lg_fill_div. 2024-08-06 13:31:49 -07:00
Shirui Cheng 8fefabd3a4 increase the ncached_max in fill_flush test case to 1024 2024-08-06 13:16:09 -07:00
Shirui Cheng 47c9bcd402 Use a for-loop to fulfill flush requests that are larger than CACHE_BIN_NFLUSH_BATCH_MAX items 2024-08-06 13:16:09 -07:00
Shirui Cheng 48f66cf4a2 add a size check when declare a stack array to be less than 2048 bytes 2024-08-06 13:16:09 -07:00
Burton Li 8dc97b1108 Fix NSTIME_MONOTONIC for win32 implementation 2024-07-30 10:30:41 -07:00
Nathan Slingerland bc32ddff2d Add usize to prof_sample_hook_t 2024-07-30 10:29:30 -07:00
Dmitry Ilvokhin b66f689764 Emit long string values without truncation
There are few long options (`bin_shards` and `slab_sizes` for example)
when they are specified and we emit statistics value gets truncated.

Moved emitting logic for strings into separate `emitter_emit_str`
function. It will try to emit string same way as before and if value is
too long will fallback emiting rest partially with chunks of `BUF_SIZE`.

Justification for long strings (longer than `BUF_SIZE`) is not
supported.
2024-07-29 13:58:31 -07:00
Danny Lin c893fcd169 Change macOS mmap tag to fix conflict with CoreMedia
Tag 101 is assigned to "CoreMedia Capture Data", which makes for confusing output when debugging.

To avoid conflicts, use a tag in the reserved application-specific range from 240–255 (inclusive).

All assigned tags: 94d3b45284/osfmk/mach/vm_statistics.h (L773-L775)
2024-06-26 14:53:48 -07:00
Shirui Cheng a1fcbebb18 skip tcache GC for tcache_max unit test 2024-06-25 12:59:45 -07:00
Guangli Dai 8477ec9562 Set dependent as false for all rtree reads without ownership 2024-06-24 10:50:20 -07:00
Guangli Dai 21bcc0a8d4 Make JEMALLOC_CXX_THROW definition compatible with newer C++ versions 2024-06-13 11:03:05 -07:00
Dmitry Ilvokhin 867c6dd7dc Option to guard `hpa_min_purge_interval_ms` fix
Change in `hpa_min_purge_interval_ms` handling logic is not backward
compatible as it might increase memory usage. Now this logic guarded by
`hpa_strict_min_purge_interval` option.

When `hpa_strict_min_purge_interval` is true, we will purge no more than
`hpa_min_purge_interval_ms`. When `hpa_strict_min_purge_interval` is
false, old purging logic behaviour is preserved.

Long term strategy migrate all users of hpa to new logic and then delete
`hpa_strict_min_purge_interval` option.
2024-06-07 10:52:41 -07:00
Dmitry Ilvokhin 91a6d230db Respect `hpa_min_purge_interval_ms` option
Currently, hugepages aware allocator backend works together with classic
one as a fallback for not yet supported allocations. When background
threads are enabled wake up time for classic interfere with hpa as there
were no checks inside hpa purging logic to check if we are not purging too
frequently. If background thread is running and `hpa_should_purge`
returns true, then we will purge, even if we purged less than
hpa_min_purge_interval_ms ago.
2024-06-07 10:52:41 -07:00
Dmitry Ilvokhin 90c627edb7 Export hugepage size with `arenas.hugepage` 2024-06-05 15:37:41 -07:00
David Goldblatt f9c0b5f7f8 Bin batching: add some stats.
This lets us easily see what fraction of flush load is being taken up by the
bins, and helps guide future optimization approaches (for example: should we
prefetch during cache bin fills? It depends on how many objects the average fill
pops out of the batch).
2024-05-22 10:30:31 -07:00
David Goldblatt fc615739cb Add batching to arena bins.
This adds a fast-path for threads freeing a small number of allocations to
bins which are not their "home-base" and which encounter lock contention in
attempting to do so. In producer-consumer workflows, such small lock hold times
can cause lock convoying that greatly increases overall bin mutex contention.
2024-05-22 10:30:31 -07:00
David Goldblatt 44d91cf243 Tcache flush: Partition by bin before locking.
This accomplishes two things:
- It avoids a full array scan (and any attendant branch prediction misses, etc.)
  while holding the bin lock.
- It allows us to know the number of items that will be flushed before flushing
  them, which will (in an upcoming commit) let us know if it's safe to use the
  batched flush (in which case we won't acquire the bin mutex).
2024-05-22 10:30:31 -07:00
David Goldblatt 6e56848850 Tcache: Split up small/large handling.
The main bits of shared code are the edata filtering and the stats flushing
logic, both of which are fairly simple to read and not so painful to duplicate.
The shared code comes at the cost of guarding all the subtle logic with
`if (small)`, which doesn't feel worth it.
2024-05-22 10:30:31 -07:00
David Goldblatt c085530c71 Tcache batching: Plumbing
In the next commit, we'll start using the batcher to eliminate mutex traffic.
To avoid cluttering up that commit with the random bits of busy-work it entails,
we'll centralize them here.  This commit introduces:
- A batched bin type.
- The ability to mix batched and unbatched bins in the arena.
- Conf parsing to set batches per size and a max batched size.
- mallctl access to the corresponding opt-namespace keys.
- Stats output of the above.
2024-05-22 10:30:31 -07:00
David Goldblatt 70c94d7474 Add batcher module.
This can be used to batch up simple operation commands for later use by another
thread.
2024-05-22 10:30:31 -07:00
David Goldblatt 86f4851f5d Add clang static analyzer suppression macro. 2024-05-22 10:30:31 -07:00
Amaury Séchet 5afff2e44e Simplify the logic in tcache_gc_small. 2024-05-02 18:52:19 -07:00
Qi Wang 8d8379da44 Fix background_thread creation for the oversize_arena.
Bypassing background thread creation for the oversize_arena used to be an
optimization since that arena had eager purging.  However #2466 changed the
purging policy for the oversize_arena -- specifically it switched to the default
decay time when background_thread is enabled.

This issue is noticable when the number of arenas is low: whenever the total #
of arenas is <= 4 (which is the default max # of background threads), in which
case the purging will be stalled since no background thread is created for the
oversize_arena.
2024-05-02 14:45:18 -07:00
Dmitry Ilvokhin 47d69b4eab HPA: Fix infinite purging loop
One of the condition to start purging is `hpa_hugify_blocked_by_ndirty`
function call returns true. This can happen in cases where we have no
dirty memory for this shard at all. In this case purging loop will be an
infinite loop.

`hpa_hugify_blocked_by_ndirty` was introduced at 0f6c420, but at that
time purging loop has different form and additional `break` was not
required. Purging loop form was re-written at 6630c5989, but additional
exit condition wasn't added there at the time.

Repo code was shared by Patrik Dokoupil at [1], I stripped it down to
minimum to reproduce issue in jemalloc unit tests.

[1]: https://github.com/jemalloc/jemalloc/pull/2533
2024-04-30 13:46:32 -07:00
Qi Wang fa451de17f Fix the tcache flush sanity checking around ncached and nstashed.
When there were many items stashed, it's possible that after flushing stashed,
ncached is already lower than the remain, in which case the flush can simply
return at that point.
2024-04-12 16:01:55 -07:00
debing.sun 630434bb0a Fixed type error with allocated that caused incorrect printing on 32bit 2024-04-09 14:44:43 -07:00
Shirui Cheng 4b555c11a5 Enable heap profiling on MacOS 2024-04-09 12:57:01 -07:00
Daniel Hodges 11038ff762 Add support for namespace pids in heap profile names
This change adds support for writing pid namespaces to the filename of a
heap profile. When running with namespaces pids may reused across
namespaces and if mounts are shared where profiles are written there is
not a great way to differentiate profiles between pids.

Signed-off-by: Daniel Hodges <hodges.daniel.scott@gmail.com>
Signed-off-by: Daniel Hodges <hodgesd@fb.com>
2024-04-09 10:27:52 -07:00
Qi Wang 83b075789b rallocx path: only set errno on the realloc case. 2024-04-05 17:41:43 -07:00
Shirui Cheng 5081c16bb4 Experimental calloc implementation with using memset on larger sizes 2024-04-04 15:31:56 -07:00
Juhyung Park 38056fea64 Set errno to ENOMEM on rallocx() OOM failures
realloc() and rallocx() shares path, and realloc() should set errno to
ENOMEM upon OOM failures.

Fixes: ee961c2310 ("Merge realloc and rallocx pathways.")
Signed-off-by: Juhyung Park <qkrwngud825@gmail.com>
2024-04-04 15:13:22 -07:00
Dmitry Ilvokhin 268e8ee880 Include HPA ndirty into page allocator ndirty stat 2024-04-04 12:17:30 -07:00
Dmitry Ilvokhin b2e59a96e1 Introduce getters for page allocator shard stats
Access nactive, ndirty and nmuzzy throught getters and not directly.
There are no functional change, but getters are required to propagate
HPA's statistics up to Page Allocator's statitics.
2024-04-04 12:17:30 -07:00
Amaury Séchet 92aa52c062 Reduce nesting in phn_merge_siblings using an early return. 2024-03-14 13:08:17 -07:00
Amaury Séchet 10d713151d Ensure that the root of a heap is always the best element. 2024-03-14 13:07:45 -07:00
Minsoo Choo 1978e5cdac Update acitons/checkout and actions/upload-artifact to v4 2024-03-12 12:59:15 -07:00
XChy ed9b00a96b Replace unsigned induction variable with size_t in background_threads_enable
This patch avoids unnecessary vectorizations in clang and missed recognition of memset in gcc. See also https://godbolt.org/z/aoeMsjr4c.
2024-03-05 14:54:50 -08:00
Shirui Cheng 373884ab48 print out all malloc_conf settings in stats 2024-02-29 12:12:44 -08:00
Qi Wang 1aba4f41a3 Allow zero sized memalign to pass.
Instead of failing on assertions.  Previously the same change was made for
posix_memalign and aligned_alloc (#1554).  Make memalign behave the same way
even though it's obsolete.
2024-02-16 13:06:07 -08:00
Qi Wang 6d181bc1b7 Fix Cirrus CI.
13.0-RELEASE does not exist anymore.  "The resource
'projects/freebsd-org-cloud-dev/global/images/family/freebsd-13-0' was not
found"
2024-02-16 13:05:40 -08:00
David Goldblatt f96010b7fa gitignore: Start ignoring clangd dirs. 2024-01-23 17:02:01 -08:00
Qi Wang a2c5267409 HPA: Allow frequent reused alloc to bypass the slab_max_alloc limit, as long as
it's within the huge page size.  These requests do not concern internal
fragmentation with huge pages, since the entire range is expected to be
accessed.
2024-01-18 14:51:04 -08:00
guangli-dai b1792c80d2 Add LOGs when entrying and exiting free and sdallocx. 2024-01-11 14:37:20 -08:00
Qi Wang 05160258df When safety_check_fail, also embed hint msg in the abort function name
because there are cases only logging crash stack traces.
2024-01-11 14:19:54 -08:00
Qi Wang 3a6296e1ef Disable FreeBSD on Travis CI since it's not working.
Travis CI currently provides only FreeBSD 12 which is EOL.
2024-01-04 14:47:52 -08:00
Minsoo Choo d284aad027 Test on more FreeBSD versions
Added 14.0-RELEASE
Added 15-CURRENT
Added 14-STABLE
Added 13-STABLE

13.0-RELEASE will be updated when 13.3-RELEASE comes out.
2024-01-04 12:48:24 -08:00
Connor dfb3260b97 Fix missing cleanup message for collected profiles.
```
sub cleanup {
  unlink($main::tmpfile_sym);
  unlink(keys %main::tempnames);

  # We leave any collected profiles in $HOME/jeprof in case the user wants
  # to look at them later.  We print a message informing them of this.
  if ((scalar(@main::profile_files) > 0) &&
      defined($main::collected_profile)) {
    if (scalar(@main::profile_files) == 1) {
      print STDERR "Dynamically gathered profile is in $main::collected_profile\n";
    }
    print STDERR "If you want to investigate this profile further, you can do:\n";
    print STDERR "\n";
    print STDERR "  jeprof \\\n";
    print STDERR "    $main::prog \\\n";
    print STDERR "    $main::collected_profile\n";
    print STDERR "\n";
  }
}
```
On cleanup, it would print out a message for the collected profile.
If there is only one collected profile, it would pop by L691, then `scalar(@main::profile_files)` would be 0, and no message would be printed.
2024-01-03 14:24:38 -08:00
Honggyu Kim f6fe6abdcb build: Make autogen.sh accept quoted extra options
The current autogen.sh script doesn't allow receiving quoted extra
options.

If someone wants to pass extra CFLAGS that is split into multiple
options with a whitespace, then a quote is required.

However, the configure inside autogen.sh fails in this case as follows.

  $ ./autogen.sh CFLAGS="-Dmmap=cxl_mmap -Dmunmap=cxl_munmap"
  autoconf
  ./configure --enable-autogen CFLAGS=-Dmmap=cxl_mmap -Dmunmap=cxl_munmap
  configure: error: unrecognized option: `-Dmunmap=cxl_munmap'
  Try `./configure --help' for more information
  Error 0 in ./configure

It's because the quote discarded unexpectedly when calling configure.

This patch is to fix this problem.

Signed-off-by: Honggyu Kim <honggyu.kim@sk.com>
2024-01-03 14:20:34 -08:00