Motivation:
As outlined in #1761, io_uring is a new async I/O facility on Linux.
This commit includes a second stab at adding this to SwiftNIO.
Modifications:
Added Uring Selector implementation.
Added liburing support shims.
Disabled one assert that trips during normal usage likely due to async nature of poll updates, for discussion
Added shared kernel sqpoll ring support (can be run with normal user privs in 5.13)
Support for both single shot polls (should work all the way back to 5.1 kernels, needs testing) and multishot streaming polls and modifications for polls (scheduled due in 5.13) for slightly better performance (and better impedance match to SwiftNIO usage)
Added extensive debug logs which can be enabled with -D compiler flags (should likely be removed when bringup and testing is complete)
Adjusted tests.
Added documentation.
Result:
Basic liburing support is in place.
Co-authored-by: Johannes Weiss <johannesweiss@apple.com>
* Add synchronous channel options
Motivation:
The functions for getting and setting channel options are currently
asynchronous. This ensures that options are set and retrieved safely.
However, in some cases the caller knows they are on the correct event
loop but still has to pay the cost of allocating a future to either get
or set an option.
Modifications:
- Add a 'NIOSynchronousChannelOptions' protocol for getting and setting
options
- Add a customisation point to 'Channel' to return 'NIOSynchronousChannelOptions'.
- Default implementation returns nil so as to not break API.
- Add implementations for 'EmbeddedChannel' and 'BaseSocketChannel'
- Allocation tests for getting and setting autoRead
Results:
Options can be get and set synchronously.
Motivation:
The names of most of the BSD socket option values were brought over into
their NIO helper properties, with their namespace removed. This vastly
increases the risk of collision, particularly for things like TCP_INFO,
which just became .info.
Modifications:
- Added the prefixes back, e.g. `.info` is now `.tcp_info`.
- Renamed socket type `.dgram` to `.datagram`, as this is the nice clean
API and we should use nice clean names.
Result:
Better APIs
* NIO: Add new `BSDSocket` namespace
This starts the split of the POSIX/Linux/Darwin/BSD interfaces and the
BSD Socket interfaces in order to support Windows. The constants on
Windows are not part of the C standard library and need to be explicitly
prefixed. Use the import by name to avoid the `#if` conditions on the
wrapped versions. This will allow us to cleanly cleave the dependency
on the underlying C library.
This change adds a `BSDSocket.OptionLevel` and `BSDSocket.Option` types
which allow us to get proper enumerations for these values and pipe them
throughout the NIO codebase.
* Apply suggestions from code review
Co-Authored-By: Cory Benfield <lukasa@apple.com>
improve flushNow re-entrancy protection
Motivation:
The fix in #1344 wasn't complete, there is another, quite complicated scenario where we would still forget to send flushed bytes until we see another `flush` call. The scenario is
1: writable()
2: --> flushNow (result: .writtenCompletely)
3: --> writabilityChanged callout
4: --> flushNow because user calls writeAndFlush (result: .couldNotWriteEverything)
5: --> registerForWritable (because line 4 could not write everything and flushNow returned .register)
6: --> unregisterForWritable (because line 2 wrote everything and flushNow returned .unregister)
Line 6 undoes the registeration in line 5.
Modification:
This fix makes sure that flushNow never re-enters and therefore the problem described above cannot happen anymore.
Result:
Hopefully actually fixes rdar://58571521
Motivation:
AtomicBox doesn't properly work (#1286) and despite fixing it in #1287,
it's really not worth using AtomicBox (which as of #1287 instroduces a
CAS loop) anymore.
Modifications:
Remove the two uses of AtomicBox.
Result:
No usage of broken/slow data types.
Motivation:
The existing Atomic class holds an UnsafeEmbeddedAtomic which holds an OpaquePointer to raw memory for the C atomic. This results in multiple allocations on init. The new NIOAtomic class uses ManagedBufferPointer which tail allocates memory for the C atomic as part of the NIOAtomic class allocation.
Modifications:
Created NIOAtomic class that uses ManagedBufferPointer to allocate memory for the C atomic. Created version of catmc_atomic_* C functions that expect memory to be allocated/deallocated by caller. Created NIOAtomicPrimitive protocol for the new version of catmc_atomic_* functions.
Result:
Purely additive. NIOAtomic is available as a replacement for Atomic which results in fewer memory allocations.
Motivation:
In Swift, writing
var something: T?
init() {
self.something = someValue
}
means that the compiler will first set `self.something` to `nil` and
then in the init override it with `self.someValue`
(https://bugs.swift.org/browse/SR-11777). Unfortunately, because of
https://bugs.swift.org/browse/SR-11768 , stored property initialisation
cannot be made `@inlinable` (short of using `@frozen` which isn't
available in Swift 5.0).
The combination of SR-11768 and SR-11777 leads to `var something: T?`
having much worse code than `var something: Optional<T>` iff the `init`
is `public` and `@inlinable`.
Modifications:
Change all `var something: T?` to `var something: Optional<T>`
Result:
Faster code, sad NIO developers.
Motivation:
ChannelOptions types like WriteBufferWaterMarkOption are global, public types. These types should be namespaced.
Modifications:
Moved global, public ChannelOptions types into ChannelOptions.Types enum. Created global, public typealiases with deprecation attributes which point to the namespaced types. Switched usage of these types in the codebase to use the namespaced versions.
Result:
Usage of the global, public versions of these types will cause deprecation warnings.
Motivation:
There are use-cases for SwiftNIO where there are no actual sockets but
rather two pipe file descriptors - one for input, one for output.
There's no real reason why SwiftNIO shouldn't work for those.
Modifications:
Add a PipeChannel.
Result:
More use-cases for SwiftNIO.
Motivation:
BaseSocketChannel's description method accessed the (mutable) socket
property of the channel which must only be accessed on the EventLoop.
Description however call be called on any thread.
Modifications:
Don't access the socket from description.
Result:
- more thread-safety
- fixes#1141
Motivation:
In the grpc-swift test suite, we saw a case where the server would
always immediately close the accepted socket. This lead NIO to misbehave
badly because kqueue would send us the `readEOF` before the `writable`
event that finishes an asynchronous `connect`.
What happened is that we just dropped the `readEOF` on the floor so we
would never actually tell the user if the channel ever went away.
Modifications:
Only register for `readEOF` after becoming active.
Result:
- we're happy with servers that immediately close the socket
Motivation:
Darwin does sometimes hand us back an (as of the docs) illegal error
code of EINVAL. So far, NIO has done the non-sensical thing of closing
the server socket as a result.
Modifications:
Just close the accepted socket, leave the server socket open, and fire
the error through the pipeline.
Result:
- fixes#1030
- better behaviour
Motivation:
Unhandled outbound user events should be failed but previously they were
succeeded.
Modifications:
Succeeding them is odd because nothing happened to them, so they should
be failed.
Result:
fixes#487
Motivation:
ChannelOptions used a too complicated mechanism for what they are.
Modifications:
- change the ChannelOptions to a much simpler mechanism
- rename the generic parameters from T to Option
Result:
code easier to read and understand
Motivation:
The `ELF.cascade` methods have a parameter label `promise` that does not match Swift API Guidelines, and a way to cascade just successes is not available - while for failures there is.
Modifications:
`ELF.cascade*` methods that already exist have had their `promise` label renamed to `to`, and a new `ELF.cascadeSuccess` method has been added.
Result:
EventLoopFuture now has the cascade methods `ELF.cascade(to:)`, `ELF.cascadeFailure(to:)`, and `ELF.cascadeSuccess(to:)`
Motivation:
EventLoopPromise.succeed/fail has extraneous labels that don't add
anything but noise.
Modifications:
remove extraneous labels
Result:
less noise in the code
* Make EventLoopFuture.cascade and cascadeFailure accept an optional promise
Motivation:
fixes#756
Modifications:
Change EventLoopFuture.cascade to func cascade(promise: EventLoopPromise<T>?)
Result:
EventLoopFuture.cascade can be called without needing to check if the promise is nil
Motivation:
BaseSocketChannel used to assert that `channelRead0` is only called when
the channel is active. That's bogus and it's trivial to hit this issue
when calling `fireChannelRead` in the last `ChannelHandler` when the
channel has gone inactive.
Modifications:
remove assertion, add comment
Result:
fewer crashes
Motivation:
If we're going to have errors in enumerations, we should aim to have as
few as possible. We wanted to throw away these two, so let's do it.
Modifications:
- Removed `ChannelLifecycleError`, moved case to `ChannelError`.
- Removed `MulticastError`, moved cases to `ChannelError`.
- Removed manual `Equatable` conformance, used synthesised one.
- Pulled `connectFailed` out of `ChannelError`, as it made it impossible to
write a correct implementation of `Equatable`.
- Conformed `NIOConnectionError` to `Error` instead.
Result:
Better error handling.
Resolves#620
Motivation:
Swift naming guidelines mandate that factory methods start with `make`,
like `makeSomething`. We had a few that were `newSomething`.
Modifications:
make all factories start with make
Result:
more compliant to Swift naming rules
Motivation:
making a promise of a type is more grammatical than making a promise for
a type.
Modifications:
s/newPromise(for/newPromise(of/g
Result:
more grammar
Motivation:
When creating new promises I always find it very frustrating to type the
`: EventLoopPromise<Type>` type annotation but it's necessary for the
compiler to know type the promise will be fulfilled with.
Modifications:
allow an optional `for: SomeType.self` parameter for `newPromise` as
let p = eventLoop.newPromise(for: Int.self)
is much easier to type than
let p: EventLoopPromise<Int> = eventLoop.newPromise()
Result:
easier to write code
Motivation:
We were complaining that we can't use `@convention(thin)` in a place
where we really need to make sure we don't get allocations for returning
a closure (that doesn't capture any context), so it should just be a
function pointer.
In reality however, we were actually capturing context...
Modifications:
Don't capture context anymore.
Result:
Closures should not actually not enclose anything.
Motivation:
Dispatch can only precondition that code is run from a certain queue, it
can't return if code is running on a certain queue.
To support that this introduces new EventLoop.assertInEventLoop and
EventLoop.preconditionInEventLoop methods instead of
assert(eventLoop.inEventLoop).
Modifications:
add EventLoop.assertInEventLoop and EventLoop.preconditionInEventLoop
Result:
better support for Network.framework in NIOTS
Motivation:
It's generally good style to explain why something needs to be force
unwrapped/tried if not obvious.
Modifications:
Add explanations to a bunch of places.
Result:
Code easier to understand.
Motivation:
A small number of socket options have values that do not fit into a C
int type. Our current ChannelOption based approach for setting these
simply does not work, and cannot be extended to support the truly arbitrary
types that the setsockopt/getsockopt functions allow here.
This makes it impossible to use some socket options, which is hardly a
good place to be.
There were a number of ways we could have addressed this: we could have
special-cased all socket options with non-integer types in ChannelOption,
but I believe that would be too manual, and risk limiting users that need
to set other socket options. We could also have added a ChannelOption
that simply allows users to pass a buffer to write into or read from,
but that is a very un-Swift-like API that feels pretty gross to hold.
Ultimately, the nicest seemed to be a new protocol users could check
for, and that would provide APIs that let users hold the correct concrete
type. As with setsockopt/getsockopt, while this API is typed it is
not type-safe: ultimately, the struct we have here is treated just as a
buffer by setsockopt/getsockopt. We do not attempt to prevent users from
shooting themselves in the foot here.
This PR does not include an example use case in any server, as I will
provide such an example in a subsequent multicast PR.
Modifications:
- Added a SocketOptionChannel protocol.
- Conformed BaseSocketChannel to SocketOptionChannel.
- Wrote some tests for this.
Result:
Users can set and get sockopts with data that is larger than a C int.
Motivation:
Previously we asserted that a Channel cannot go inactive after calling
out for a read that was triggered by (draining the receive buffer after)
a write error.
This assertion was put in place to guard against `readComplete` events
sent on inactive channels. It did that job just fine but crashes aren't
great so we now conditionally fire the `readComplete` event if the
Channel stays active.
Modifications:
make the readComplete event firing conditional
Result:
fewer crashes, more happy faces
Motivation:
We have one case in the code where we expect a socket error to be set
when we hit a connection that got the `.reset` event. On Linux that's
expected and therefore we just map this to `ECONNRESET`.
On other platforms however that is unexpected. Now I just found an
[Envoy commit](62f8e51bdd (diff-3315c7b2) 91f78a482ebd4cbfc0545362R345)
which suggests this might not work on FreeBSD. This should prompt our
users to let us know if we ever see this error on non-Linux systems.
Modifications:
Add a message that makes it clear this is unexpected and link our
ticket.
Result:
we should learn if this actually happens on macOS.
Motivation:
Swift code that doesn't use explicit `self.` is hard to read, even
harder to search for and quite bug prone. During my analysis I found
quite a number of them, this fixes the ones I came across.
Modifications:
Add a bunch of missing explicit `self.`s
Result:
code easier to read and less bug prone
Motivation:
In Darwin it is possible for readEOF to be received before writable,
if we race connection teardown with registering for writable. See
https://github.com/apple/swift-nio/issues/452 for a more in-depth
explanation.
Modifications:
Widened the assertion.
Result:
People's code won't go bang.
Motivation:
If registration of a Channel fails we need to close that Channel.
Previously we'd just drop it on the floor which triggers an assertion
that an open channel got leaked.
Modifications:
In cases where the Channel registration fails, we need to close that
channel.
Result:
- No crashes if registration fails.
- fixes#417
Motivation:
Previously we would eagerly register any fd with the selector
(epoll/kqueue) which works well with kqueue. With epoll however, you'll
get `EPOLLHUP` immediately if you supplied a socket that hasn't had
connect/bind called on it.
We got away with this because we took quite a bit of care to always make
sure we call connect/bind pretty much immediately after we registered
it. And crucially before we ever asked the selector to tell us about
new events.
Modifications:
made selector registration lazy (when we try activating the channel by
calling connect/bind)
Result:
the user can now register and connect whenever they feel like it.
Motivation:
Once again, we had an extra event loop hop between a channel
registration and its activation. Usually this shows up as `EPOLLHUP` but
not so for accepted channels. What happened instead is that we had a
small race window after we accepted a channel. It was in a state where
it was not marked active _yet_ and therefore we'd not read out its data
in case we received a `.readEOF`. That usually leads to a stale
connection. Fortunately it doesn't happen very often that the client
connects, immediately sends some data and then shuts the write end of
the socket.
Modifications:
prevent the event loop hop between registration and activation
Result:
will always read out the read buffer on .readEOF
Motivation:
@vlm hit an interesting situation which is very likely the sign of
another bug that we have yet to find:
During a close, @vlm hit an error in close which lead to a user callout
which then closed again.
The immediate fix is simple (this PR) is as always not to call out to
the user before reconciling our own state. But hitting this bug also
means that either a `socket.close` or a `selectableEventLoop.deregister`
failed as we only called out in those situations. That definitely hides
a condition that is hard to imagine without another bug that we still
need to find.
Modifications:
in `BaseSocketChannel.close0` follow rule 0 and don't call out before
reconciling state.
Result:
fewer crashes, less potential to hide other bugs.
Motivation:
We can't nil out the connectPromise before we fail it. This totally
slipped through because we never fully waited for it because of the
reasons below.
It's quite easy to misuse `eventLoop.submit` as if you do some
asynchronous operation (will return `EventLoopFuture<T>`) the result of
the submit will be `EventLoopFuture<EventLoopFuture<T>>`. So if you just
do
try eventLoop.submit {
return somethingAsynchronous
}.wait()
then you will have only waited for the asynchronous operation to have
started (it might still be running). Fortunately, the compiler usually
warns you about that unless it returns `Void`. So it naturally tells you
to write
try eventLoop.submit {
return somethingAsynchronous
}.wait().wait()
which will wait for both. Unfortunately, `XCTAssertNoThrow` will just
eat away any type so there it's advisable to write
XCTAssertNoThrow(try eventLoop.submit { ... }.wait.wait() as Void)
to be sure it's actually a void. The Swift compiler however has an
annoying bug where it won't always realise that that's a problem
(https://bugs.swift.org/browse/SR-7478). Fortunately the 4.1 release
behaves okay so CI will catch it.
Modifications:
- replace the bad pattern
_ = try el.submit { ... }
- added `as Void`s inside `XCTAssertNoThrow`s that do submit
- added missing `.wait()`s
Result:
tests should be more correct.
Motivation:
Often its useful to be still be able to access the local / remote address during channelInactive / handlerRemoved callbacks to for example log it. We should ensure its still accessible during it.
Modifications:
- Fallback to slow-path in ChannelHandlerContext.localAddress0 / remoteAddress0 if fast-path fails to try accessing the address via the cache.
- Clear cached addresses after all callbacks are run.
- Add unit test.
Result:
Be able to access addresses while handlers are notified.
Motivation:
Since forever we had a major bug in the Selector: In this condition:
- kqueue/epoll had many events
- in one of the earlier events we unregister a Channel whose fd is on of
the later events
- we subsequently (still in the same event loop tick) register a new
channel which gets the same fd as the previously closed one
then we would deliver an event that was meant for a previous channel to
a newly opened one.
Thanks to @mcdappdev for hitting this bug, helping us debug it and also
providing a repeatedly working repro.
Modifications:
if during event delivery any fd gets unregistered, we stop delivering
the remaining events and rely on the selector to redeliver them
again next time.
Result:
we don't deliver events for previously closed channels to new ones.
Motivation:
When a pending connect is still in process and the Channel is closed we need to ensure we use the correct ordering when notify the promises and ChannelHandlers in the pipeline.
The ordering should be:
- Notify pending connect promise
- Notify close promise
- Call channelInactive(...)
Modifications:
- Correct ordering of notification
- Add unit test
Result:
Ensure correct ordering when connect is still pending
Motivation:
In some contexts it is important to be able to pass file descriptors
around instead of specifying the particular host and port. This allows removing
certain windows for race conditions, and has positive security implications.
Modifications:
Added `withConnectedSocket(_:)` call to the `ClientBootstrap` that can
be used to create a new `Channel` out of existing file descriptor
representing the connected socket.
Result:
Allows initializing `Channel`s off the existing file descriptors.
Motivation:
Since #286, we do eagerly detect connection resets which is good. To
resolve the situation we first try to read everything off the socket. If
something fails there, everything is good and the client gets the
correct error. If that however doesn't lead to any errors, we would
close the connection with `error: ChannelError.eof` which the client
would then see for example on a connect that happened to get
`ECONNRESET` or `ECONNREFUSED`. That's not right so these situations are
fixed here.
Modifications:
- on reset, try to get the socket error if any
- if no socket error was detected, we send a `ECONNRESET` because we
know we have been reset but not why and didn't encounter any errors
- write tests for these situations
Result:
connection resets that don't hit any other errors will now see the
correct error
Motivation:
When a connect() call returns an error other than EINPROGRESS, we
currently leave the FD registered and don't close the channel. This is
wrong, we should take this as a signal to close the socket immediately,
rathern than letting the selector tell us this FD is dead: after all,
we know it's dead.
Modifications:
Call `close0()` in synchronous connect failures.
Result:
Faster failures!
Motivation:
We had a number of problems:
1. We wanted to lazily process input EOFs and connection resets only
when the user actually calls `read()`. On Linux however you cannot
unsubscribe from `EPOLLHUP` so that's not possible.
2. Lazily processing input EOFs/connection resets wastes kernel
resources and that could potentially lead to a DOS
3. The very low-level `Selector` interpreted the eventing mechanism's
events quite a lot so the `EventLoop`/`Channel` only ever saw
`readable` or `writable` without further information what exactly
happened.
4. We completely ignored `EPOLLHUP` until now which on Unix Domain
Socket close leads to a 100% CPU spin (issue #277)
Modifications:
- made the `Selector` interface richer, it now sends the following
events: `readable`, `writable`, `readEOF` (input EOF), `reset`
(connection reset or some error)
- process input EOFs and connection resets/errors eagerly
- change all tests which relied on using unconnected and unbound sockets
to user connected/bound ones as `epoll_wait` otherwise would keep
sending us a stream of `EPOLLHUP`s which would now lead to an eager
close
Result:
- most importantly: fix issue #277
- waste less kernel resources (by dealing with input EOFs/connection
resets eagerly)
- bring kqueue/epoll more in line
Motivation:
Callouts to user code allows the user to make calls that re-enter
channel code. In the case of channel closure, we could call out to the
user before the channel knew it was completely closed, which would
trigger a safety assertion (in debug mode) or hit a fatalError
(in release mode).
Modifications:
Reconciled channel state before we call out to user code.
Added a test for this case.
Result:
Fewer crashes, better channel state management.
Motivation:
When an asynchronous connect failed we would crash because we didn't
correctly close the connection to start tearing everything down.
That correctly made the `SocketChannelLifecycleManager` trip.
This was filed as #302.
Modifications:
Correctly deregister and close the connection if an async connect fails.
Result:
Fewer crashes, fixes#302.
Motivation:
NIO 1.3.1 was TSan clean but `master` regressed. In total I could see
two breakages both proudly presented by `SocketChannelLifecycleManager`.
Sorry :| :
1. accessing the `isActiveAtomic` from arbitrary threads:
this was clearly a bug because `SocketChannelLifecycleManager` is
usually held mutabe. So `isActiveAtomic` needs to be injected into
`SocketChannelLifecycleManager` rather than contstructed there
2. accessing the `ChannelPipeline` through `SocketChannelLifecycleManager`:
same problem as above really.
Modifications:
inject both the ChannelPipeline and the isActiveAtomic in the
appropriate places.
Result:
NIO much more correct, TSan happy
Motivation:
We had a lot of problems with the Channel lifecycle statemachine as it
wasn't explicit, this fixes this. Additionally, it asserts a lot more.
Modifications:
- made Channel lifecycle statemachine explicit
- lots of asserts
Result:
- hopefully the state machine works better
- the asserts should guide our way to work on in corner cases as well