Motivation:
A small number of socket options have values that do not fit into a C
int type. Our current ChannelOption based approach for setting these
simply does not work, and cannot be extended to support the truly arbitrary
types that the setsockopt/getsockopt functions allow here.
This makes it impossible to use some socket options, which is hardly a
good place to be.
There were a number of ways we could have addressed this: we could have
special-cased all socket options with non-integer types in ChannelOption,
but I believe that would be too manual, and risk limiting users that need
to set other socket options. We could also have added a ChannelOption
that simply allows users to pass a buffer to write into or read from,
but that is a very un-Swift-like API that feels pretty gross to hold.
Ultimately, the nicest approach seemed to be a new protocol users could check
for, and that would provide APIs that let users hold the correct concrete
type. As with setsockopt/getsockopt, while this API is typed it is
not type-safe: ultimately, the struct we have here is treated just as a
buffer by setsockopt/getsockopt. We do not attempt to prevent users from
shooting themselves in the foot here.
This PR does not include an example use case in any server, as I will
provide such an example in a subsequent multicast PR.
Modifications:
- Added a SocketOptionChannel protocol.
- Conformed BaseSocketChannel to SocketOptionChannel.
- Wrote some tests for this.
Result:
Users can set and get sockopts with data that is larger than a C int.
Motivation:
Previously we asserted that a Channel cannot go inactive after calling
out for a read that was triggered by (draining the receive buffer after)
a write error.
This assertion was put in place to guard against `readComplete` events
sent on inactive channels. It did that job just fine, but crashes aren't
great, so we now fire the `readComplete` event only if the
Channel stays active.
Modifications:
make the readComplete event firing conditional
Result:
fewer crashes, more happy faces
Motivation:
We have one case in the code where we expect a socket error to be set
when we hit a connection that got the `.reset` event. On Linux that's
expected and therefore we just map this to `ECONNRESET`.
On other platforms, however, that is unexpected. I just found an
[Envoy commit](62f8e51bdd#diff-3315c7b291f78a482ebd4cbfc0545362R345)
which suggests this might not work on FreeBSD. This message should prompt
our users to let us know if they ever see this error on non-Linux systems.
Modifications:
Add a message that makes it clear this is unexpected and link our
ticket.
Result:
we should learn if this actually happens on macOS.
Motivation:
Swift code that doesn't use explicit `self.` is hard to read, even
harder to search for, and quite bug prone. During my analysis I found
quite a number of missing ones; this fixes the ones I came across.
Modifications:
Add a bunch of missing explicit `self.`s
Result:
code easier to read and less bug prone
Motivation:
On Darwin it is possible for readEOF to be received before writable,
if we race connection teardown with registering for writable. See
https://github.com/apple/swift-nio/issues/452 for a more in-depth
explanation.
Modifications:
Widened the assertion.
Result:
People's code won't go bang.
Motivation:
If registration of a Channel fails we need to close that Channel.
Previously we'd just drop it on the floor which triggers an assertion
that an open channel got leaked.
Modifications:
In cases where the Channel registration fails, we need to close that
channel.
Result:
- No crashes if registration fails.
- fixes #417
Motivation:
Previously we would eagerly register any fd with the selector
(epoll/kqueue) which works well with kqueue. With epoll however, you'll
get `EPOLLHUP` immediately if you supplied a socket that hasn't had
connect/bind called on it.
We got away with this because we took quite a bit of care to always make
sure we call connect/bind pretty much immediately after we registered
it. And crucially before we ever asked the selector to tell us about
new events.
Modifications:
made selector registration lazy (when we try activating the channel by
calling connect/bind)
Result:
the user can now register and connect whenever they feel like it.
Motivation:
Once again, we had an extra event loop hop between a channel
registration and its activation. Usually this shows up as `EPOLLHUP` but
not so for accepted channels. What happened instead is that we had a
small race window after we accepted a channel. It was in a state where
it was not marked active _yet_ and therefore we'd not read out its data
in case we received a `.readEOF`. That usually leads to a stale
connection. Fortunately it doesn't happen very often that the client
connects, immediately sends some data and then shuts the write end of
the socket.
Modifications:
prevent the event loop hop between registration and activation
Result:
will always read out the read buffer on .readEOF
Motivation:
@vlm hit an interesting situation which is very likely the sign of
another bug that we have yet to find:
During a close, @vlm hit an error in close which led to a user callout
which then closed again.
The immediate fix (this PR) is, as always, simply not to call out to the
user before reconciling our own state. But hitting this bug also means
that either a `socket.close` or a `selectableEventLoop.deregister`
failed, as those were the only situations in which we called out. That
points at a condition that is hard to imagine without another bug that we
still need to find.
Modifications:
in `BaseSocketChannel.close0` follow rule 0 and don't call out before
reconciling state.
Result:
fewer crashes, less potential to hide other bugs.
Motivation:
We can't nil out the connectPromise before we fail it. This totally
slipped through because, for the reasons below, we never fully waited
for it.
It's quite easy to misuse `eventLoop.submit`: if you do some
asynchronous operation (which returns an `EventLoopFuture<T>`), the
result of the submit will be `EventLoopFuture<EventLoopFuture<T>>`. So
if you just do

    try eventLoop.submit {
        return somethingAsynchronous
    }.wait()
then you will have only waited for the asynchronous operation to have
started (it might still be running). Fortunately, the compiler usually
warns you about that unless it returns `Void`. So it naturally tells you
to write
    try eventLoop.submit {
        return somethingAsynchronous
    }.wait().wait()
which will wait for both. Unfortunately, `XCTAssertNoThrow` will just
eat away any type so there it's advisable to write
    XCTAssertNoThrow(try eventLoop.submit { ... }.wait().wait() as Void)
to be sure it's actually a void. The Swift compiler however has an
annoying bug where it won't always realise that that's a problem
(https://bugs.swift.org/browse/SR-7478). Fortunately the 4.1 release
behaves okay so CI will catch it.
Modifications:
- replace the bad pattern `_ = try el.submit { ... }`
- added `as Void`s inside `XCTAssertNoThrow`s that do submit
- added missing `.wait()`s
Result:
tests should be more correct.
Motivation:
Often it's useful to still be able to access the local / remote address during the channelInactive / handlerRemoved callbacks, for example to log it. We should ensure it's still accessible during them.
Modifications:
- Fall back to the slow path in ChannelHandlerContext.localAddress0 / remoteAddress0 if the fast path fails, trying to access the address via the cache.
- Clear cached addresses after all callbacks are run.
- Add unit test.
Result:
Be able to access addresses while handlers are notified.
Motivation:
Since forever we had a major bug in the Selector: In this condition:
- kqueue/epoll had many events
- in one of the earlier events we unregister a Channel whose fd is in one
of the later events
- we subsequently (still in the same event loop tick) register a new
channel which gets the same fd as the previously closed one
then we would deliver an event that was meant for a previous channel to
a newly opened one.
Thanks to @mcdappdev for hitting this bug, helping us debug it and also
providing a repeatedly working repro.
Modifications:
if during event delivery any fd gets unregistered, we stop delivering
the remaining events and rely on the selector to redeliver them
again next time.
Result:
we don't deliver events for previously closed channels to new ones.
Motivation:
When a pending connect is still in progress and the Channel is closed we need to ensure we use the correct ordering when notifying the promises and ChannelHandlers in the pipeline.
The ordering should be:
- Notify pending connect promise
- Notify close promise
- Call channelInactive(...)
Modifications:
- Correct ordering of notification
- Add unit test
Result:
Ensure correct ordering when connect is still pending
Motivation:
In some contexts it is important to be able to pass file descriptors
around instead of specifying the particular host and port. This allows removing
certain windows for race conditions, and has positive security implications.
Modifications:
Added `withConnectedSocket(_:)` call to the `ClientBootstrap` that can
be used to create a new `Channel` out of existing file descriptor
representing the connected socket.
Result:
Allows initializing `Channel`s off the existing file descriptors.
Motivation:
Since #286, we eagerly detect connection resets, which is good. To
resolve the situation we first try to read everything off the socket. If
something fails there, everything is good and the client gets the
correct error. If that however doesn't lead to any errors, we would
close the connection with `error: ChannelError.eof` which the client
would then see for example on a connect that happened to get
`ECONNRESET` or `ECONNREFUSED`. That's not right so these situations are
fixed here.
Modifications:
- on reset, try to get the socket error if any
- if no socket error was detected, we send an `ECONNRESET` because we
know we have been reset but not why, and didn't encounter any errors
- write tests for these situations
Result:
connection resets that don't hit any other errors will now see the
correct error
Motivation:
When a connect() call returns an error other than EINPROGRESS, we
currently leave the FD registered and don't close the channel. This is
wrong, we should take this as a signal to close the socket immediately,
rather than letting the selector tell us this FD is dead: after all,
we know it's dead.
Modifications:
Call `close0()` in synchronous connect failures.
Result:
Faster failures!
Motivation:
We had a number of problems:
1. We wanted to lazily process input EOFs and connection resets only
when the user actually calls `read()`. On Linux however you cannot
unsubscribe from `EPOLLHUP` so that's not possible.
2. Lazily processing input EOFs/connection resets wastes kernel
resources and that could potentially lead to a DoS
3. The very low-level `Selector` interpreted the eventing mechanism's
events quite a lot so the `EventLoop`/`Channel` only ever saw
`readable` or `writable` without further information what exactly
happened.
4. We completely ignored `EPOLLHUP` until now which on Unix Domain
Socket close leads to a 100% CPU spin (issue #277)
Modifications:
- made the `Selector` interface richer, it now sends the following
events: `readable`, `writable`, `readEOF` (input EOF), `reset`
(connection reset or some error)
- process input EOFs and connection resets/errors eagerly
- change all tests which relied on using unconnected and unbound sockets
to use connected/bound ones as `epoll_wait` otherwise would keep
sending us a stream of `EPOLLHUP`s which would now lead to an eager
close
Result:
- most importantly: fix issue #277
- waste less kernel resources (by dealing with input EOFs/connection
resets eagerly)
- bring kqueue/epoll more in line
Motivation:
Callouts to user code allow the user to make calls that re-enter
channel code. In the case of channel closure, we could call out to the
user before the channel knew it was completely closed, which would
trigger a safety assertion (in debug mode) or hit a fatalError
(in release mode).
Modifications:
Reconciled channel state before we call out to user code.
Added a test for this case.
Result:
Fewer crashes, better channel state management.
Motivation:
When an asynchronous connect failed we would crash because we didn't
correctly close the connection to start tearing everything down.
That correctly made the `SocketChannelLifecycleManager` trip.
This was filed as #302.
Modifications:
Correctly deregister and close the connection if an async connect fails.
Result:
Fewer crashes, fixes #302.
Motivation:
NIO 1.3.1 was TSan clean but `master` regressed. In total I could see
two breakages, both proudly presented by `SocketChannelLifecycleManager`.
Sorry :|
1. accessing the `isActiveAtomic` from arbitrary threads:
this was clearly a bug because `SocketChannelLifecycleManager` is
usually held mutable. So `isActiveAtomic` needs to be injected into
`SocketChannelLifecycleManager` rather than constructed there
2. accessing the `ChannelPipeline` through `SocketChannelLifecycleManager`:
same problem as above really.
Modifications:
inject both the ChannelPipeline and the isActiveAtomic in the
appropriate places.
Result:
NIO much more correct, TSan happy
Motivation:
We had a lot of problems with the Channel lifecycle state machine as it
wasn't explicit; this fixes that. Additionally, it asserts a lot more.
Modifications:
- made the Channel lifecycle state machine explicit
- lots of asserts
Result:
- hopefully the state machine works better
- the asserts should guide our way when we work on corner cases as well
Motivation:
The Socket channels and PendingDatagramWritesManager had a bunch of
overhead left over from when they were handling flush promises. This was
only adding to code complexity and perf cost.
Modifications:
Removed the promise label in markPendingWritePoint.
Reworked the code to stop expecting write promises in
PendingDatagramWritesManager.
Result:
Smaller, faster code.
Motivation:
{,Server,Datagram,Base}SocketChannel hold quite a bit of complexity and
also many lines of code, yet they all shared the very same file,
which isn't great.
Modifications:
- moved BaseSocketChannel into its own file
- made the methods that need overriding internal (from `fileprivate`)
Result:
SocketChannel.swift has a lot fewer lines of code.