Motivation:
When turning a BBV into a Collection, several of the collection
algorithms can do better if you implement the semi-private
`_copyContents` hook. We can easily implement it, so we should. While
I'm here we should clean up some other bits.
Modifications:
- Implement _copyContents.
- Implement a faster `.count` for `ByteBufferView`.
- Make `ByteBuffer.capacity` `@inlinable` to reduce ARC traffic
Result:
Copying a BBV into an Array is faster.
Motivation:
Protocols like Sequence have "private" hooks that can be implemented to
provide fast-paths for some operations. We missed a few on BBV, and I'd
like to add them. This is one use-case where they can help.
Modifications:
- Add an allocation-counter benchmark and a runtime benchmark for
copying BBV to Array.
Result:
We have some benchmarks.
### Motivation:
`read` should never return `EINVAL`. If it does return it though it is a bug inside NIO and we should precondition for it similar to what we do for `EBADF` & `EFAULT`. Closes https://github.com/apple/swift-nio/issues/1339
### Modifications:
This PR adds a new precondition for `EINVAL` and adds test that exercise this precondition.
### Result:
We are now preconditioning against `read` returning `EINVAL`.
Motivation:
An upcoming pull, apple/swift#35707, moves Android to the same single-header
modulemap for Bionic as used on linux, but that doesn't work unless this is
changed. This approach works both with the current release toolchain and
with that new modulemap.
Modifications:
Change how these Bionic constants are imported.
Result:
This repo compiles for Android and the tests pass with both modulemap approaches.
Co-authored-by: Cory Benfield <lukasa@apple.com>
### Motivation:
When scheduling tasks with the same deadline the current order of execution is undefined. Fixes https://github.com/apple/swift-nio/issues/1541
### Modifications:
In my previous PR https://github.com/apple/swift-nio/pull/2010, I added an internal id to every `ScheduledTask` to give them an identity for cancellation purposes. In this PR, I am now using the same id to also ensure that the execution of tasks with the same deadline is the same as the order they were scheduled in.
### Result:
`ScheduledTask`s are now executed in their scheduled order when they have the same deadline.
### Motivation:
Sometimes, it is nice to check `AddressedEnvelope`s for equality or hash them. Especially, in tests this comes in handy.
### Modifications:
This PR, adds conditional conformances to `AddressedEnvelope` for `Equatable` and `Hashable` depending on its generic `DataType` .
I also added a new module `NIOCoreTests` which was an open todoleft from the creation of the `NIOCore` module. To not add more tests that we have to migrate in the future, I started to create the new tests in the new module right away. I also created issue https://github.com/apple/swift-nio/issues/2016, to keep track of our open task to move the other tests over.
### Result:
`AddressedEnvelope` is conditionally `Equatable` and `Hashable`
motivation: fix broken doc generation script
changes:
* pass the --spm flag now required by source-kitten
* update location of source-kitten source to make it more robust
* update list of module to generate docs for to match the structural changes in nio
### Motivation:
In my previous PR https://github.com/apple/swift-nio/pull/2010, I was able to decrease the allocations for both `scheduleTask` and `execute` by 1 already. Gladly, there are no more allocations left to remove from `execute` now; however, `scheduleTask` still provides a couple of allocations that we can try to get rid of.
### Modifications:
This PR removes two allocations inside `Scheduled` where we were using the passed in `EventLoopPromise` to call the `cancellationTask` once the `EventLoopFuture` of the promise fails. This requires two allocations inside `whenFailure` and inside `_whenComplete`. However, since we are passing the `cancellationTask` to `Scheduled` anyhow and `Scheduled` is also the one that is failing the promise from the `cancel()` method. We can just go ahead and store the `cancellationTask` inside `Scheduled` and call it from the `cancel()` method directly instead of going through the future.
Importantly, here is that the `cancellationTask` is not allowed to retain the `ScheduledTask.task` otherwise we would change the semantics and retain the `ScheduledTask.task` longer than necessary. My previous PR https://github.com/apple/swift-nio/pull/2010, already implemented the work to get rid of the retain from the `cancellationTask` closure. So we are good to go ahead and store the `cancellationTask` inside `Scheduled` now
### Result:
`scheduleTask` requires two fewer allocations
### Motivation:
In my previous PR https://github.com/apple/swift-nio/pull/2009, I added baseline performance and allocation tests around `scheduleTask` and `execute`. After analysing, the various allocations that happen when scheduling a task there were only a few that could be optimized away potentially.
### Modifications:
This PR converts the `ScheduledTask` class to a struct which will reduce the number of allocations for scheduling tasks by 1. The only thing that needs to be worked around when converting to a struct is giving it an identity so that we can implement `Equatable` conformance properly. I explored two options. First, using an `ObjectIdentifier` passed to the init. Second, using an atomic counter per EventLoop. I went with the latter since the former requires an additional allocation in the case of calling `execute`
### Result:
`scheduleTask` and `execute` require one less allocation
### Motivation:
In issue https://github.com/apple/swift-nio/issues/1316, we see a large number of allocations to happen when scheduling tasks. This can definitely be optimized. This PR adds a number of baseline allocation and performance tests for both `scheduleTask` and `execute`. In the next PRs, I am going to try a few optimizations to reduce the number of allocations.
### Modifications:
Added baseline performance and allocation tests for `scheduleTask` and `execute`
Motivation:
Frequently when folks have issues with SwiftNIO or related technologies,
we need to ask the same questions over and over again.
It'd be much nicer to just have a script that collects them.
Modifications:
Add `scripts/nio-diagnose`.
Result:
Users/devs can easily create nio-diagnose files which are markdown and
can easily be pasted into GH.
Motivation:
Newer copies of liburing have changed header imports, making some
of our implicit includes fail, and have also changed the types
of some methods in a way that would only produce warnings in C but
which produces errors in Swift. Additionally, liburing broke when
we did the module split, but we didn't notice.
Modifications:
- Import NIOCore for the Uring Swift code.
- Import poll.h in CNIOLinux
- Support arguments that are either pointers or UInt64s.
Results:
liburing code compiles again
* Change Swift Concurrency Availability
* Label GitHub Action Jobs
* Incorporate the insights from Johannes Weiss
* Remove GitHub Actions
* Incorporate the feedback of Cory Benfield
Co-authored-by: Cory Benfield <lukasa@apple.com>
Motivation:
For NIO's 'promise leak detector' we added file:line: labels to
makePromise and there it makes sense. A user might create a promise and
then never fulfil it, bad. With the file:line: arguments we can give
good diagnostics.
However, we (probably that @weissi again) also added it to flatMap and
friends where it doesn't make sense at all.
Sure, EventLoopFuture's implementation may create a promise in the
implementation of flatMap but this promise is never leaked unless the
previous future is never fulfilled (or NIO has a terrible bug). Suffice
to say that in a future chain, it's never a flatMap etc which is
responsible for leaking the first promise...
Explain here the context, and why you're making that change.
What is the problem you're trying to solve.
Modifications:
Remove all unnecessary `file:line:` parameters whilst keeping the public
API intact.
Result:
More sensible code.
Motivation:
A lot of libraries that use SwiftNIO don't allow/require the user to
specify what `EventLoop`s a certain function runs on. Something like
```
myHTTPClient.get("https://example.com) -> EventLoopFuture<...>
```
Internally `MyHTTPClient` has not much choice but using the
`EventLoopGroup.next()` method to obtain an `EventLoop` on which to
create the `EventLoopFuture` (that is returned).
This all works fine but unfortunately it usually forces a thread switch
which is most of the time avoidable if we're already running on an
`EventLoop`.
Modifications:
Provide an `EventLoopGroup.any()` method which can be used like so:
```swift
func get(_ url: String) -> EventLoopFuture<Response> {
let promise = self.group.any().makePromise(of: Response.self)
[...]
return promise.futureResult
}
```
`EventLoopGroup.any()` very much works like `EventLoopGroup.next()`
except that it tries -- if possible -- to return the _current_
`EventLoop`.
Note that this means that `any()` is _not_ the right solution if you
want to load balance. Likely, everything will now stay on the same
`EventLoop`.
Result:
Fewer thread switches.
Motivation:
It's like that #1999 was a little wide, and we should have forbidden
EBDAF on close.
Modifications:
Still forbid EBADF on close.
Result:
EBADF still preconditions on close.
Motivation:
`pthread_mutex_t` locks can do advanced error checking with
`PTHREAD_MUTEX_ERRORCHECK`, the cost of which was unknown. With #1994 I
could now measure it.
The differences from my local machine (M1 MBP) for uncontended locks
are:
- on Linux: 10% faster without `PTHREAD_MUTEX_ERRORCHECK`
- on macOS: 25% faster without `PTHREAD_MUTEX_ERRORCHECK`!
Modification:
Do the extended error checking only in debug mode. Note that we still
`precondition` on all pthread return values (which is basically free).
Result:
Faster code.
Co-authored-by: Cory Benfield <lukasa@apple.com>
Motivation:
To judge the cost of `PTHREAD_MUTEX_ERRORCHECK` as well as
`os_unfair_lock` vs `pthread_mutex_t` it's useful to have a few simple
benchmarks.
Modification:
Add lock benchmarks.
Result:
Better data.
Motivation:
This test is flaky, and has always been flaky.
The issue here is simply a timing one. It has always been possible
for the thread running in the background DispatchQueue to be take
longer between calling semaphore.signal and promise.cascade(to:)
than it does for the event loop to process the tasks. If that
happens, this test will fail by eventually timing out.
Modifications:
- Adjust the test to avoid using promise.cascade(to:), and so prevent it
from having at timing window.
Result:
The test will be not flaky, or at least less flaky.
Resolves#1971.
Co-authored-by: George Barnett <gbarnett@apple.com>
Motivation:
ByteBuffer is quite core to NIO programs' perf and yet it doesn't even
try to reduce the number of under/overflow & bounds checks.
That's a lot of code that will never be run and that also introduces
unncessary branches which hurt perf.
Modification:
Remove a bunch of low hanging fruit checks.
Result:
Faster (and less) code.
Motivation:
Many network protocols (especially for example NFS) have quite a number
of integer values next to each other. In NIO, you'd normally parse/write
them with multiple read/writeInteger calls.
Unfortunately, that's a bit wasteful because we're checking the bounds
as well as the CoW state every time.
Modifications:
- Provide read/writeMultipleIntegers for up to 15 FixedWidthIntegers.
- Benchmarks
Result:
Faster code. For 10 UInt32s, this is a 5x performance win on my machine,
see benchmarks.
Motivation:
- Currently http 1xx response heads are handled incorrectly
Modifications:
- Add an `InformationalResponseStrategy` that allows the user to specify whether http 1xx responses shall be forwarded or dropped.
- Implement the necessary cases in `HTTPDecoder` and `BetterHTTPParser`
Co-authored-by: Cory Benfield <lukasa@apple.com>
Motivation:
I noticed that when replacing `_UInt24` to `UInt32` in `ByteBuffer`'s
code (which unfortunately increases its size and is therefore
unacceptable), then `getSlice` gets fully inlined into `readSlice`.
But in the normal `ByteBuffer` code (with `_UInt24`) that is not the
case. So I had a quick go a shrinking the `getSlice` code. Sadly, it's
still not quite enough to get `getSlice` inlined into `readSlice`.
There are quite a lot of extra shifts for the UInt32 <--> UInt24
conversions and an extra branch (to check the `sliceBeginIndex >= 16
MiB` condition).
Modifications:
- use unchecked arithmetic where safe
- outline the slow path
Result:
much smaller code size
In my local testing I have seen that the compiler adds a retain and release cycle around the use of `Optional.map` in ByteBuffer's read methods.
Modifications:
- Replaced the use of `Optional.map` with `guard let` in ByteBuffer's read methods.
Motivation:
Currently you cannot create a NIOPipeBootstrap when you are on
an event loop due to a precondition check. This check seeks to prevent
consumers of the API from passing in a file descriptor thats referencing
a file on disk or on the network.
The method at hand 'validateFileDescriptorIsNotAFile' uses fstat, which potentially
could block the event loop especially if the fd is referencing a file over the network.
This check however prevents certain use cases of SwiftNIO, where the downsides
of a consumer of this api blocking their own event loop do not weigh in against preventing
an entire use case from SwiftNIO.
Modifications:
Removed the precondition
Result:
After this change it will be possible to feed file descriptors into
the bootstrap which potentially block the current event loop.
A potential (portable) replacement for fstat still has to be found in order to solve this problem completely.