docs: link to the SSWG's `perf` guide (#1882)

Motivation:

Instead of reproducing partial information, it makes more sense to link to the SSWG's guide, which is more complete.

Modification:

Remove incomplete information and swap for a link.

Result:

Better docs.

Co-authored-by: Cory Benfield <lukasa@apple.com>
Johannes Weiss 2021-07-21 08:33:38 +01:00 committed by GitHub
parent 25da619b78
commit 6e6465de31
1 changed file with 1 addition and 48 deletions


@ -177,51 +177,4 @@ In case you're wondering what `dsb2mite_switches.penalty_cycles` is, a web searc
## The bad news: PMU access
### Bare metal
If you have access to a bare-metal Linux machine that reproduces the performance difference you would like to analyse, your life is easy. Install `perf` and use it. In Ubuntu/Debian distributions, installation is usually `apt-get install linux-tools-generic`, then run `perf`. If it recommends installing a specific package like `linux-tools-5.4.0-65-generic`, go ahead and install that too. After this step, `perf` should work in full fidelity.
### Docker for Mac
Getting `perf` to run in Docker for Mac is not too hard but it won't support all features. Sadly, _you will not have access to the PMU_. That means you will not be able to replicate what is described above. To install `perf` run
    apt-get update && apt-get install linux-tools-generic
but when you actually run `perf` it will tell you that you need to install a package called `linux-tools-4.19.121-linuxkit` which does not exist. Instead, you need to execute this command first:
    alias perf=/usr/lib/linux-tools/*/perf
and after that, typing `perf` will work as expected (minus the unsupported features). Running `perf stat sleep 1` as an example, you will see
```
 Performance counter stats for 'sleep 1':

          0.675405      task-clock (msec)         #    0.001 CPUs utilized
                 2      context-switches          #    0.003 M/sec
                 0      cpu-migrations            #    0.000 K/sec
                52      page-faults               #    0.077 M/sec
   <not supported>      cycles
   <not supported>      instructions
   <not supported>      branches
   <not supported>      branch-misses

       1.006128401 seconds time elapsed
```
which shows you that all the information that is retrieved from the CPU's performance counters (through the PMU) is not supported. The problem is that the hypervisor that Docker for Mac uses does not support "PMU passthrough" or "PMU virtualisation".
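As a rough way to automate the check described above, the following sketch scans `perf stat` output for the `<not supported>` marker. The function name `pmu_status` and its three result strings are hypothetical and only serve as an illustration; the sketch assumes that `perf stat` prints a `cycles` line whenever hardware counters are requested.

```shell
#!/bin/sh
# Hypothetical helper: reads `perf stat` output on stdin and reports whether
# the hardware counters (and therefore the PMU) appear to be usable.
pmu_status() {
    output=$(cat)
    case "$output" in
        # perf marks counters it cannot read with "<not supported>".
        *"<not supported>"*) echo "pmu-unavailable" ;;
        # A "cycles" line without that marker suggests the PMU is reachable.
        *cycles*)            echo "pmu-available" ;;
        # Anything else (for example empty input) is inconclusive.
        *)                   echo "unknown" ;;
    esac
}
```

Since `perf stat` writes its summary to standard error, one possible invocation is `perf stat sleep 1 2>&1 | pmu_status`.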
### In a VM
In a virtual machine, you would install `perf` just like on bare metal. Either `perf` will work just fine with all its features, or its output will look similar to what you get on Docker for Mac.
Your hypervisor needs to support (and allow) "PMU passthrough" or "PMU virtualisation". VMware Fusion supports PMU virtualisation, which it calls vPMC (VM settings -> Processors & Memory -> Advanced -> Allow code profiling applications in this VM). If you're on a Mac, this setting is unfortunately only supported up to and including macOS Catalina (and [not on Big Sur](https://kb.vmware.com/s/article/81623)).
If you use `libvirt` to manage your hypervisor and VMs, you can use `sudo virsh edit your-domain` and replace the `<cpu .../>` XML tag with
    <cpu mode='host-passthrough' check='none'/>
to allow the PMU to be passed through to the guest. For other hypervisors, an internet search will usually reveal how to enable PMU passthrough.
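To confirm that a domain already uses this setting, a small filter over the domain XML may help. This is a minimal sketch: the function name `has_host_passthrough` is hypothetical, and it assumes the `<cpu>` element carries the `mode` attribute on its opening tag, as in the snippet above.

```shell
#!/bin/sh
# Hypothetical helper: reads libvirt domain XML on stdin and succeeds
# (exit status 0) if the <cpu> element uses mode='host-passthrough'.
has_host_passthrough() {
    # Accept either single or double quotes around the attribute value.
    grep -Eq "<cpu[^>]*mode=['\"]host-passthrough['\"]"
}
```

One possible use is `sudo virsh dumpxml your-domain | has_host_passthrough && echo "host-passthrough enabled"`, where `your-domain` is the same placeholder as above.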
### In Docker (running on bare-metal Linux)
You will need to launch your container with `docker run --privileged` and then you should have access to the PMU like on bare-metal Linux.
Unfortunately, in some environments, like Docker for Mac and other solutions relying on virtualisation, it can be hard to get access to the PMU (the CPU's performance monitoring unit). Check out this [guide](https://github.com/swift-server/guides/blob/main/docs/linux-perf.md#getting-perf-to-work) for tips on getting `perf` with PMU support to work in your environment.