Motivation:
I recently discovered that UnsafeRawBufferPointer.init(rebasing:) is
surprisingly expensive, with 7 traps and 11 branches. A simple
replacement can make it a lot cheaper, down to two traps and four
branches. This ends up having pretty drastic effects on
ByteBuffer-heavy NIO code, which often outlines the call to that
initializer and loses the ability to make a bunch of site-local
optimisations.
While this has been potentially fixed upstream with
https://github.com/apple/swift/pull/34879, there is no good reason to
wait until Swift 5.4 for this improvement.
Due to the niche use-case, I didn't bother doing this for _every_
rebasing in the program. In particular, there is at least one
UnsafeBufferPointer(rebasing:) that I didn't do this with, and there are
uses in both NIOTLS and NIOHTTP1 that I didn't change. While we can fix
those if we really need to, it would be nice to avoid this helper
proliferating too far through our codebase.
Modifications:
- Replaced the use of URBP.init(rebasing:) with a custom hand-rolled
version that avoids Slice.count.
Result:
Cheaper code. One NIOHTTP2 benchmark sees a 2.9% speedup from this
change alone.
Motivation:
The exisiting ByteBuffer`s `capacity` property exposes the number of
addressable bytes in the buffer (that is, the difference between its
upper and lower bound). This value can be significantly different than
the number of bytes allocated to the buffer's underlying storage.
For example, when constructing of COW slice from the original buffer the
underlying storage maybe tens of thousands of bytes while the slice's
capacity is 1.
To avoid any ambiguitity, a new property - `storageCapacity` - will be
added.
Modifications:
Added new `storageCapacity` property to expose the capacity of the buffer (or
derived COW slice) underlying storage (that is allocated to the buffer).
This property is part of public interface ans is typed as an Int.
Updated unit tests to verify `storedCapacity` remains constant when
slicing the ByteBuffer and also to account for the addition of the new
property in the CustomStringConvertible desciption.
Result:
The `storageCapacity` of a buffer and a COW slice taken from that buffer
with remain the same data is written to the slice.
Co-authored-by: Graeme Jenkinson <gcjenkinson@Graemes-MacBook-Pro.local>
Co-authored-by: Cory Benfield <lukasa@apple.com>
Motivation:
There are multiple sub-optimal ByteBuffer creation patterns that occur
in the wild. Most often they happen when people don't actually have
access to a `Channel` just want to "convert" a `String` into a
`ByteBuffer`. To do this, they are forced to type
var buffer = ByteBufferAllocator().buffer(capacity: string.utf8.count)
buffer.writeString(string)
Sometimes, they don't get the capacity calculation right or just put a
`0`.
Similar problems happen if NIO users want to cache a ByteBuffer in their
`ChannelHandler`. You will then find this code:
```swift
if self.buffer == nil {
self.buffer = receivedBuffer
} else {
var receivedBuffer = receivedBuffer
self.buffer!.writeBuffer(&receivedBuffer)
}
```
And lastly, sometimes people want to append one `ByteBuffer` to another
without mutating the appendee. That's also cumbersome because we only
support a mutable version of `writeBuffer`.
Modifications:
- add `ByteBuffer` convenience initialisers
- add convenience `writeBuffer` methods to `Optional<ByteBuffer>`
- add `writeBufferImmutable` which doesn't mutate the appendee.
Result:
More convenience.
Co-authored-by: Cory Benfield <lukasa@apple.com>
Co-authored-by: Cory Benfield <lukasa@apple.com>
Motivation:
Often we know ahead of time that we will need to write a certain number
of bytes into a buffer. It is useful to ensure the buffer has enough
capacity to store these bytes without causing unnecessary allocations.
However, `reserveCapacity()` is defined in terms of the capacity of the
buffer and it's can be easy to forget that this does not take into
account the writer index.
Modifications:
- Add a method which reserves enough capacity to write at least the
specified number of bytes.
- Update `writeWithUnsafeMutableBytes(minimumWritableBytes:body:)` to use the
new method.
Result:
It's easier for users to reserve enough writable capacity ahead of time.
Motivation:
Calling `function(otherFunction)` allocates, but
`function({ otherFunction($0) }` does not.
See: SR-12115 and SR-12116.
Modifications:
Avoid passing functions directly; pass them in new closures instead.
Result:
Fewer allocations.
Motivation:
`ByteBuffer` should implement the `Hashable` protocol as stated by: https://github.com/apple/swift-nio/issues/1320
Modifications:
Implemented the `Hashable` protocol for `ByteBuffer` and added a test `testHasherUsesReadBuffersOnly`.
Result:
`ByteBuffer` implements the `Hashable` protocol. The test `testHasherUsesReadBuffersOnly` is failing for the moment though.
Motivation:
In some cases it may be possible to write the (usually small) web socket
frame header to the buffer provided by the user. If we do this we can
avoid an extra pipeline traversal for the small write. This is only
going to be a performance win in cases where we can avoid a
copy-on-write operation on the buffer, but if we can then it's a useful
small win to achieve.
Modifications:
- Refactored the WebSocketFrameEncoder to potentially prepend the frame
header.
- Added a missing @inlinable attribute.
- Tests.
Result:
Potentially improved performance.
Motivation:
ByteBufferView isn't a mutable collection, but it probably should be.
Modifications:
- Add `copyBytes(at:to:length)` to `ByteBuffer` to copy bytes from a
readable region of a buffer to another part of the buffer
- Conform `ByteBufferView` to `MutableCollection`, `RangeReplaceableCollection`
and `MutableDataProtocol`
- Add an allocation counting test
Result:
`ByteBufferView` is now mutable.
Motivation:
The ExpressibleBy*Literal conformances require `@usableFromInline` inits
which make it harder to make NIO compatible with library evolution
because we can no longer just remove all `@inlinable`s and call it a
day.
Modifications:
Remove the ExpressibleBy*Literal conformances for internal types.
Result:
NIO easy to convert into a library evolution mode.
Motivation:
The Swift compiler actively tries to make our live hard and puts all
initialisations of variables that are initialised using the shorthand
syntax
@usableFromInline var myVar: MyVarType = .init()
into a _non-inlinable_ function
variable initialization expression of MyModule.MyType.myVar : MyVarType
This means that the following pattern
public struct MyType {
@usableFromInline var myVar: MyVarType = .init()
@inlinable public init() {
}
}
leads to code for the `init` that looks like
CALL variable_initialization_expression_...
instead of also inlining the the code for the variable initialiser.
Modifications:
Don't use `@usableFromInline var myVar: MyVarType = ...` anywhere,
instead put it explicitly into the @inlinable `init`s which makes
actually inlinable.
I even removed that syntax from places where it doesn't matter
(eg. non-@inlinable inits) for consistency and in case someone were to
make the init inlinable later.
Result:
Minor performance wins working around https://bugs.swift.org/browse/SR-11768
Motivation:
When calling ByteBuffer setBytes/writeBytes with CircularBuffer<Uint8> we take the slow path. Before we make any changes we want a baseline for current performance.
Modifications:
- This gives a very significant performance increase for CircularBuffer
and likely other types that use the slow path.
- Adds ByteBuferBenchmark.swift and tests circular_buffer_into_byte_buffer_1kb, circular_buffer_into_byte_buffer_1mb.
- Make _setSlowPath @inlinable
- Implement == for CircularBuffer.Index to make it @inlinable
Result:
- Extra performance tests
- Improve performance
Motivation:
Docs give unexpected results since behavior was changed in NIO 2
to only see readable bytes. Fixes#1232.
Modifications:
Documentation updated.
Result:
Documentation gives expected results with no errors.
Motivation:
In complex ByteBuffer parsing code a surprising amount of cost comes
from retain/release operations that are emitted by the compiler around
calls to readerIndex/writerIndex, readableBytes/writableBytes, and
ByteBufferSlice._lowerBound. These are all straightforward computed
properties based on usableFromInline operations, and so they can be made
inlinable.
Modifications:
Joannis' sample benchmark runtime is halved by eliminating
retains/releases.
Result:
Better, faster parsing code.
Motivation:
ByteBuffer.clear doesn't reset the ByteBuffer._slice. Imagine having a byte buffer slice (ByteBuffer.getSlice). Calling ByteBuffer.clear on the slice will reset the reader and writer, but the ByteBuffer.capacity will return the old slice size, not the full storage capacity. After this change ByteBuffer.clear will also reset the ByteBuffer._slice to regain the whole ByteBuffer._storage.
Modifications:
Just setting the full storage slice in the ByteBuffer.clear method.
Result:
After that change ByteBuffer.clear will reset byte buffer slice to full capacity of underlying storage.
Motivation:
writeWithUnsafeMutableBytes cannot tolerate arbitrarily large writes because it is not possible to resize the buffer by the time body receives the buffer pointer. This change provides an option for the user to reserve writable capacity before the pointer is passed to body.
Modifications:
Added a backward compatible optional parameter to writeWithUnsafeMutableBytes which can reserve writable capacity before the pointer is passed to body.
Result:
Additive change only. Users can optionally specify a minimum writable capacity when calling writeWithUnsafeMutableBytes.
Motivation:
From time to time a NIO application holds a ByteBuffer that it may want
to edit. Whether it wants to do that may potentially be contingent on
whether having to do so will cause an allocation, as it may be
preferable to prefer a different operation.
Modifications:
- Added API for modifying a ByteBuffer if it will not CoW.
Result:
People can write allocation-sensitive code on ByteBuffer.
Motivation:
ByteBuffer didn't document well that its capacity is really an initial
capacity only.
Modifications:
Clearly state that it's an initial capacity.
Result:
Better docs.
Motivation:
I noticed some closures were referred to by the wrong name in the
documentation.
Modifications:
Updated the documentation to use the correct names.
Result:
More correct documentation.
Motivation:
For historic reasons, the get* implementations checked the same indices
repeatedly and the checks were also repetitive.
Modifications:
unify and de-deplicate get* range checks
Result:
fixes#884
Motivation:
ByteBuffer's get* methods are now safe, docs claimed they were unsafe
though.
Modifications:
Document the gets as safe
Result:
More correct docs.
Motivation:
Previously, ByteBuffers get* methods would get bytes from anywhere, even
if the bytes were not readable. That can lead to security issues in code
that doesn't expect this. NIO shouldn't contain such unsafe methods
without the word 'unsafe'.
Modifications:
Made all `get*` methods respect reader/writerIndex.
Result:
safer user code
Motivation:
NIOAny is used extensively across SwiftNIO but can't be reasonably debugged. Users may want to simply call String(describing: CustomStringConvertible) to inspect what is the data they are working with at the moment instead of seeing an output that has no debug value.
Modifications:
Conform NIOAny to CustomStringConvertible, along with other types it currently uses that lacked string representation.
Result:
Users can perform simple debugging by printing their NIOAny object.
Motivation:
ByteBuffer methods like `set(string:)` never felt very Swift-like and
also didn't look the same as their counterparts like `getString(...)`.
Modifications:
- rename all `ByteBuffer.set/write(<type>:,...)` methods to
`ByteBuffer.set/write<Type>(...)`
- polyfill the old spellings in `_NIO1APIShims`
Result:
code more Swift-like
Motivation:
We now depend on Swift 5.0, so we can remove all the backporting of
functionality.
Modifications:
deleted all backported Swift functionality
Result:
less code
Motivation:
No code is the best code, let's have the compiler generate more
Equatable instances for us.
Modifications:
remove some hand-written Equatable conformances
Result:
less code, possibly fewer bugs
Motivation:
DispatchData's APIs are less than optimal and the current one we used
did allocate at least on Linux to copy bytes out.
Modifications:
introduce ByteBuffer.write/read/set/getDispatchData
Result:
when Dispatch gets fixed, lower allocations
Motivation:
The rest of the code in NIO uses multiline literals.
Modifications:
Converted string concatenations to multiline literals.
Result:
NFC, prettier code.
Motivation:
ContiguousCollection is no longer necessary because Sequence gained
withContiguousStorageIfAvailable.
Modifications:
Removed ContiguousCollection and switch to withContiguousStorageIfAvailable.
Result:
less code
Motivation:
The latest Swift 5.0s bring us new goodies (hence tools version) and
also more warnings, hence the fixes for them.
Modifications:
- set tools-version to 5.0
- fix warnings
- workaround https://bugs.swift.org/browse/SR-9514
Result:
happier developers
Motivation:
- `Int` is the currency type for integral things and we already sort of
changed the the type for ports but not everywhere.
- generic parameters that aren't just 'any type' shouldn't be just `T`
or `S` or something
Modifications:
- make more use of `Int`
- rename generic parameters to something more descriptive like `Bytes`
Result:
easier to read/write NIO code
Motivation:
NIO2 development starts now.
Modifications:
Made NIO Swift 5-only for everything else see docs/public-api-changes-NIO1-to-NIO2.md
Result:
NIO2 development can start.
Motivation:
urbp[start ..< end] is quite a bit faster than when done in two
operations urbp.dropFirst(start).prefix(length). This improves small (4
bytes) & medium sized (24 bytes) String copies by over 10%. Good for
HTTP headers etc.
Modifications:
replace urbp.dropFirst(start).prefix(length) by urbp[start ..< start +
length]
Result:
small & medium sized String copies faster
Motivation:
Quite embarrasingly, we previously would only store one `ChannelOption`
per `ChannelOption` type. Most channel option types are distinct and
that's probably why it took so long to find this issue. Thanks
@pushkarnk for reporting. Unfortunately though, the most important
`ChannelOption` is `.socket` which crucially also holds a level and a
name. That means if you set two `ChannelOptions.socket` options with
distinct name/level, one would still override the other.
Example:
.serverChannelOption(ChannelOptions.socket(SocketOptionLevel(SOL_SOCKET), SO_REUSEPORT), value: 1)
.serverChannelOption(ChannelOptions.socket(SocketOptionLevel(SOL_SOCKET), SO_REUSEADDR), value: 1)
would only actually set the latter.
Modifications:
- made all common `ChannelOption` types equatable (for 2.0 this will
be a protocol requirement)
- deprecated non-Equatable `ChannelOption` types
- zero out buffer before calling getsockopt as Linux doesn't do that
Result:
you can now set two distinct `ChannelOptions` for one type
Motivation:
All the mutable variants and also some slices of clearly contiguous
collections weren't marked as ContiguousCollections but they should be.
Modifications:
- made UnsafeMutable[Raw]BufferPointer a ContiguousCollection
- made Slice<UnsafeRawBufferPointer> a ContiguousCollection
Result:
faster code
Motivation:
We should have different constants for 32-bit systems where necessary
for example all the maximums should be Int32.max and not UInt32.max
because UInt32.max doesn't fit into an Int on 32-bit platforms. But
where unnecessary 32-bit and 64-bit should share maximum values.
Modifications:
- removed some constraints that should be unnecessary
- skip tests that can't work on 32-bit systems
Result:
better and more consistent 32-bit runs
Motivation:
ByteBuffer._Storage.fullSlice can always be trivially computed, so it
should be.
Modifications:
make ByteBuffer._Storage.fullSlice computed
Result:
saved almost 56 bits per ByteBuffer
Motivation:
Unsafe(Mutable)RawBufferPointers have a few APIs that make our code
easier to reason about. In this PR I'm trying to introduce some of it.
Modifications:
Improved UnsafeRawBufferPointer usage
Result:
code easier to reason about which hopefully leads to fewer bugs. Also:
In many cases we now get free bounds checking in debug mode, yay :).
Motivation:
I spotted this while flamegraphing: apparently we missed the
declaration for inlineability here.
Modifications:
Made these two functions inlineable.
Result:
Better optimizations.
Motivation:
It's poor style to assume a certain piece of memory is bound to a
certain type across a whole program. That's because any piece of code
run in between might have rebound the memory and also it's hard to prove
that the binding assumption is actually correct. Therefore we should
restrain from using them.
Modifications:
remove more assumptions that memory is bound to a certain type.
Result:
better code
Motivation:
As described in #537, ByteBuffer.changeCapacity is not a very good
function. We want to replace it with a better one, more similar to
reserveCapacity on the standard library data structures, and ideally
substantially faster.
Modifications:
- Deprecated ByteBuffer.changeCapacity.
- Added ByteBuffer.reserveCapacity.
- Changed the implementation of changeCapacity to call reserveCapacity
when we're increasing the capacity.
Result:
Better APIs, better performance in the legacy ones, everything is
better.
Motivation:
`ByteBuffer`'s `get*` methods can be used in an unsafe way and we didn't
warn the user enough.
Modifications:
Warn the user in the documentation of all the `get*` methods.
Result:
Less confusion.
Motivation:
ContiguousCollections shouldn't give access to more than `count` bytes
via `withUnsafeBytes` but if they do we should still handle this
correctly and only copy `count` bytes.
Modifications:
- copy an upper limit of `count` bytes
- fail hard if `withUnsafeBytes` offers fewer bytes as we can't do
anything sensible then.
Result:
even illegal `ContiguousCollection` implementations don't cause issues.
Motivation:
In a few situations we used `Int` in places where malloc/realloc
actually use `size_t`. On 64-bit platforms this doesn't actually matter
but on 32-bit it does!
Modifications:
- replace `Int` with `size_t` where appropriate
Result:
- should be able to use more than 2GB on 32-bit platforms
Motivation:
In Swift 4.2 you'll get a warning if you use an internal typealias from an
`@inlinable`/`@_inlineable` function. To fix it, you usually make it
`@usableFromInline`/`@_versioned`. Swift
4.0 however doesn't support those attributes for typealiases. The workaround is
to just make the typealias public (and underscored).
Modifications:
make `ByteBuffer.Capacity` and `ByteBuffer.Index` underscored and public
Result:
compiles warning-free on Swift 4.0, 4.1 and 4.2
Motivation:
ByteBuffer._ensureAvailableCapacity is quite complicated and the code
can be simplified by re-using existing code.
Modifications:
reuse more existing code
Result:
ByteBuffer code easier to understand