### Motivation:
In my previous PR https://github.com/apple/swift-nio/pull/2010, I already reduced the allocations for both `scheduleTask` and `execute` by one. Happily, there are no more allocations left to remove from `execute`; however, `scheduleTask` still has a couple of allocations that we can try to get rid of.
### Modifications:
This PR removes two allocations inside `Scheduled`, where we were using the passed-in `EventLoopPromise` to call the `cancellationTask` once the `EventLoopFuture` of the promise failed. That approach required two allocations: one inside `whenFailure` and one inside `_whenComplete`. However, since we pass the `cancellationTask` to `Scheduled` anyway, and `Scheduled` is also the one that fails the promise from its `cancel()` method, we can simply store the `cancellationTask` inside `Scheduled` and call it directly from `cancel()` instead of going through the future.
Importantly, the `cancellationTask` must not retain the `ScheduledTask.task`; otherwise we would change the semantics and keep `ScheduledTask.task` alive longer than necessary. My previous PR https://github.com/apple/swift-nio/pull/2010 already removed that retain from the `cancellationTask` closure, so we are good to store the `cancellationTask` inside `Scheduled` now.
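A minimal sketch of the new shape (simplified and illustrative only; the real `Scheduled` type in NIO has more to it):

```swift
import NIOCore

// Simplified sketch, not the actual NIO source: `Scheduled` now stores the
// cancellation task and calls it directly from `cancel()` instead of
// registering it on the promise's future via `whenFailure`.
struct Scheduled<T> {
    private let promise: EventLoopPromise<T>
    private let cancellationTask: () -> Void  // must not capture ScheduledTask.task

    init(promise: EventLoopPromise<T>, cancellationTask: @escaping () -> Void) {
        self.promise = promise
        self.cancellationTask = cancellationTask
        // Previously: promise.futureResult.whenFailure { _ in cancellationTask() },
        // which cost one allocation in `whenFailure` and one in `_whenComplete`.
    }

    func cancel() {
        self.promise.fail(EventLoopError.cancelled)
        self.cancellationTask()  // no detour through the future any more
    }
}
```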
### Result:
`scheduleTask` requires two fewer allocations
### Motivation:
In my previous PR https://github.com/apple/swift-nio/pull/2009, I added baseline performance and allocation tests around `scheduleTask` and `execute`. After analysing the various allocations that happen when scheduling a task, there were only a few that could potentially be optimized away.
### Modifications:
This PR converts the `ScheduledTask` class to a struct, which reduces the number of allocations for scheduling tasks by one. The only thing that needs to be worked around when converting to a struct is giving it an identity so that we can implement the `Equatable` conformance properly. I explored two options: first, passing an `ObjectIdentifier` to the init; second, using an atomic counter per `EventLoop`. I went with the latter since the former requires an additional allocation in the case of calling `execute`.
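A rough sketch of the struct-based approach (illustrative only; the field names are simplified and not the exact NIO code):

```swift
import NIOCore

// Illustrative sketch: identity comes from a per-EventLoop atomic counter
// instead of class identity, which keeps `Equatable` working without an
// extra allocation.
struct ScheduledTask: Equatable {
    let id: UInt64              // handed out by the owning EventLoop's counter
    let readyTime: NIODeadline
    let task: () -> Void

    static func == (lhs: ScheduledTask, rhs: ScheduledTask) -> Bool {
        lhs.id == rhs.id        // the identity the class version had for free
    }
}
```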
### Result:
`scheduleTask` and `execute` require one fewer allocation
### Motivation:
In issue https://github.com/apple/swift-nio/issues/1316, we see a large number of allocations happening when scheduling tasks. This can definitely be optimized. This PR adds a number of baseline allocation and performance tests for both `scheduleTask` and `execute`. In follow-up PRs, I am going to try a few optimizations to reduce the number of allocations.
### Modifications:
Added baseline performance and allocation tests for `scheduleTask` and `execute`
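The measured workloads look roughly like the following sketch (the names and the exact loop bodies in the PR differ):

```swift
import NIOCore

// Sketch of the shape of the measured workloads, not the actual test code.
func measureScheduleTask(iterations: Int, on loop: EventLoop) {
    for _ in 0..<iterations {
        let scheduled = loop.scheduleTask(in: .hours(1)) { }
        scheduled.cancel()
    }
}

func measureExecute(iterations: Int, on loop: EventLoop) {
    for _ in 0..<iterations {
        loop.execute { }
    }
}
```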
Motivation:
To justify performance changes we need to measure the code being
changed. We believe that `HTTPHeaders.subscript(canonicalForm:)` is a
little slow.
Modifications:
- Add allocation and performance tests for fetching header values in
their canonical form
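For reference, the subscript under test splits comma-separated header
values into their canonical form; a usage sketch:

```swift
import NIOHTTP1

var headers = HTTPHeaders()
headers.add(name: "Connection", value: "keep-alive, Upgrade")

// Splits on commas and trims whitespace around each value.
let values = headers[canonicalForm: "connection"]  // ["keep-alive", "Upgrade"]
```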
Results:
More benchmarks!
Motivation:
Peter thinks that result-erasing maps should not allocate, and we have
special code paths that try to make Void -> Void maps not allocate.
Sadly, both code paths currently do allocate.
Per our rules for not trying to make optimizations without data, we
should start measuring these closures so we can make optimizations.
Modifications:
- Added an alloc counter test for result-erasing maps.
Result:
Alloc counter test suitable for any fix of #1697.
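For context, the result-erasing shape being measured is along these
lines (a minimal sketch):

```swift
import NIOCore

// A result-erasing map: the closure drops the incoming value and yields Void.
// Ideally neither this nor the Void -> Void special case should allocate.
func eraseResultSketch(_ future: EventLoopFuture<Int>) -> EventLoopFuture<Void> {
    future.map { _ in }
}
```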
motivation: update syntax for release images
changes: replace use of "base_image" with the "ubuntu_version" and "swift_version" pair, which is the intended way to use release images
Motivation:
We alloc quite a lot with our implementations of flatMapThrowing and flatMapErrorThrowing.
While we don't use Futures a lot in NIO itself, a lot of our users
depend quite a bit on them. Let's make their code faster.
Modifications:
Create a Promise and use _whenComplete directly instead of going through another flatMap method
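The shape of the change looks roughly like the sketch below; note that
`_whenComplete` is internal, so the sketch uses the public
`whenComplete` and a placeholder name to illustrate the pattern rather
than the actual implementation:

```swift
import NIOCore

// Sketch only: create the promise ourselves and complete it straight from the
// parent future's result instead of routing through another flatMap.
extension EventLoopFuture {
    func flatMapThrowingSketch<NewValue>(
        _ callback: @escaping (Value) throws -> NewValue
    ) -> EventLoopFuture<NewValue> {
        let promise = self.eventLoop.makePromise(of: NewValue.self)
        self.whenComplete { result in
            switch result {
            case .success(let value):
                do {
                    promise.succeed(try callback(value))
                } catch {
                    promise.fail(error)
                }
            case .failure(let error):
                promise.fail(error)
            }
        }
        return promise.futureResult
    }
}
```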
Result:
In my testing I see a reduction of 3 allocs per invocation 🎉
Motivation:
Due to https://bugs.swift.org/browse/SR-14516, we sometimes get
allocating (!?) `subscript.read` accessors in `CircularBuffer.first`
depending on the `Element` type...
Modifications:
Implement `CircularBuffer.first` instead of inheriting it from
Collection.
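A minimal sketch of the idea (the real change implements `first`
itself; the name here is only to make clear this is an illustration):

```swift
import NIOCore

extension CircularBuffer {
    // Provide `first` directly instead of inheriting the default Collection
    // implementation, whose subscript.read accessor can allocate for some
    // Element types (SR-14516).
    var firstSketch: Element? {
        self.isEmpty ? nil : self[self.startIndex]
    }
}
```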
Result:
Fewer allocs in some cases.
Motivation:
Usually, we add a more or less random number of slack allocations to
make sure the tests don't spuriously fail. This makes it quite costly to
support new Swift versions.
Modifications:
Add a script which spits out the right allocation limits for you,
including slack.
Result:
Easier to support new Swift versions
Motivation:
Any version of ChannelHandler removal that does not have a
ChannelHandlerContext already in hand is currently excessively
expensive. This is because it allocates a promise and a callback for
finding the context, despite already having a promise in hand for users
to complete.
We can remove a pair of allocations here by jumping to the event loop
directly and then running our operations synchronously.
Modifications:
- Rewrite removeHandler(name:promise:) and removeHandler(_:promise:) to
jump directly to the event loops and then work synchronously.
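The pattern is roughly the following sketch; `findAndRemoveHandler` is
a hypothetical stand-in for the pipeline's internal synchronous
lookup-and-remove logic:

```swift
import NIOCore

// Sketch of the shape: one hop to the event loop, then everything runs
// synchronously, so no extra promise/callback is allocated just to find
// the context.
func removeHandlerSketch(name: String, promise: EventLoopPromise<Void>?, in pipeline: ChannelPipeline) {
    func findAndRemoveHandler() {
        // ... locate the ChannelHandlerContext by name and remove it, completing
        // `promise` when done (internal pipeline machinery, elided here).
    }

    if pipeline.eventLoop.inEventLoop {
        findAndRemoveHandler()
    } else {
        pipeline.eventLoop.execute {
            findAndRemoveHandler()
        }
    }
}
```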
Result:
Cheaper code
Motivation:
Allocation counter tests are good, and we aren't measuring this today.
Modifications:
- Wrote some add/remove tests that use different remove functions.
Result:
Better insight into performance.
Motivation:
We recently added a synchronous view of the `ChannelPipeline` so that
callers can avoid allocating futures when they know they're on the right
event loop. We also offer convenience APIs to configure the pipeline for
particular use cases, like an HTTP/1 server, but we don't have
synchronous versions of these APIs yet. We should have parity between
the synchronous and asynchronous APIs where feasible.
Modifications:
- Add synchronous helpers to configure HTTP1 client and server pipelines
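From the caller's side this looks roughly as follows (a sketch assuming
the synchronous helpers mirror their asynchronous counterparts and that
both channels are used from their own event loop):

```swift
import NIOCore
import NIOHTTP1

func configurePipelinesSketch(server: Channel, client: Channel) throws {
    // Server side: configure the HTTP/1 pipeline without allocating futures.
    try server.pipeline.syncOperations.configureHTTPServerPipeline()

    // Client side: add the HTTP/1 client handlers synchronously.
    try client.pipeline.syncOperations.addHTTPClientHandlers()
}
```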
Result:
Callers can synchronously configure HTTP1 client and server pipelines.
Motivation:
We added synchronous pipeline operations to allow the caller to save
allocations when they know they are already on the correct event loop.
However, we missed a trick! In some cases the caller cannot guarantee
they are on the correct event loop and must use an asynchronous method
instead. If that method returns a void future and is called on the event
loop, then we can perform the operation synchronously and return a
cached void future.
Modifications:
- Add API to `EventLoop` for creating a 'completed' future with a
`Result` (similar to `EventLoopPromise.completeWith`)
- Add an equivalent for making completed void futures
- Use these when asynchronously adding handlers and the caller is
already on the right event loop.
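From the caller's point of view the new API surface looks roughly like
this sketch (`MyError` is a placeholder error type):

```swift
import NIOCore

enum MyError: Error { case failed }  // placeholder for the sketch

func completedFuturesSketch(on eventLoop: EventLoop) {
    // Complete a future directly from a Result, similar to
    // EventLoopPromise.completeWith.
    let succeeded: EventLoopFuture<Int> = eventLoop.makeCompletedFuture(.success(42))
    let failed: EventLoopFuture<Int> = eventLoop.makeCompletedFuture(.failure(MyError.failed))

    // The void variant lets the event loop hand back a cached future on the
    // happy path, avoiding an allocation.
    let done: EventLoopFuture<Void> = eventLoop.makeSucceededVoidFuture()

    _ = (succeeded, failed, done)
}
```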
Result:
- Fewer allocations on the happiest of happy paths when adding handlers
asynchronously to a pipeline.
* Add synchronous channel options
Motivation:
The functions for getting and setting channel options are currently
asynchronous. This ensures that options are set and retrieved safely.
However, in some cases the caller knows they are on the correct event
loop but still has to pay the cost of allocating a future to either get
or set an option.
Modifications:
- Add a 'NIOSynchronousChannelOptions' protocol for getting and setting
options
- Add a customisation point to 'Channel' to return 'NIOSynchronousChannelOptions'.
- Default implementation returns nil so as to not break API.
- Add implementations for 'EmbeddedChannel' and 'BaseSocketChannel'
- Allocation tests for getting and setting autoRead
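A usage sketch, assuming the customisation point is exposed as
`channel.syncOptions` and returns nil for channels that don't support
synchronous access:

```swift
import NIOCore

func setAutoReadSketch(on channel: Channel) throws {
    if let sync = channel.syncOptions {
        // The caller knows it is on the channel's event loop: no future allocated.
        try sync.setOption(ChannelOptions.autoRead, value: false)
        let autoRead = try sync.getOption(ChannelOptions.autoRead)
        print("autoRead is now \(autoRead)")
    } else {
        // Fall back to the asynchronous API.
        _ = channel.setOption(ChannelOptions.autoRead, value: false)
    }
}
```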
Results:
Options can be set and retrieved synchronously.