Fix possible header / URI corruption when discardReadBytes() is called in HTTPDecoder #385

normanmaurer · 2018-05-02T18:17:09Z

Motivation:

When discardReadBytes() was called before the Head had been passed to the user, the headers / URI could be corrupted because the stored readerIndex no longer matched up with the buffer contents.
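As an illustration of the failure mode described above, here is a minimal sketch. `ToyBuffer` is a hypothetical stand-in written for this example, not the real NIO `ByteBuffer`, and the offsets are made up. It only shows why an absolute offset remembered before a discard points at the wrong bytes afterwards.

```swift
// Toy stand-in for a byte buffer with a reader index; NOT the real NIO ByteBuffer.
struct ToyBuffer {
    var storage: [UInt8]
    var readerIndex = 0

    init(_ text: String) { self.storage = Array(text.utf8) }

    // Drops the already-read bytes and shifts the remainder down to offset 0.
    mutating func discardReadBytes() {
        storage.removeFirst(readerIndex)
        readerIndex = 0
    }

    func getString(at index: Int, length: Int) -> String {
        return String(decoding: storage[index..<(index + length)], as: UTF8.self)
    }
}

var buffer = ToyBuffer("XXXXGET /index HTTP/1.1\r\n")
buffer.readerIndex = 4          // "XXXX" was consumed earlier
let storedURIIndex = 8          // absolute offset of "/index", remembered by the parser state
let uriLength = 6

let before = buffer.getString(at: storedURIIndex, length: uriLength)
buffer.discardReadBytes()       // every remaining byte shifts left by 4
let stale = buffer.getString(at: storedURIIndex, length: uriLength)
let adjusted = buffer.getString(at: storedURIIndex - 4, length: uriLength)
```

After the discard, the remembered offset yields garbage (`stale`), while an offset adjusted by the number of discarded bytes still yields the URI (`adjusted`).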

Modifications:

- Override most of the functionality of ByteToMessageDecoder in HTTPDecoder to work better with the state machine provided by http_parser.
- Remove allocations by removing the pendingInOut array, as it is no longer needed.
- Add guards against re-entrant calls.
- Add a unit test that shows everything now works as expected.

Result:

Fixed bug and reduced allocations (8% - 10% perf win).

Lukasa

Cool, generally looks really good! Notes inline.

Lukasa · 2018-05-02T19:26:59Z

Sources/NIOHTTP1/HTTPDecoder.swift

@@ -82,6 +84,7 @@ private struct HTTPParserState {
            let (index, length) = consumeSlice()
            self.currentStatus = self.cumulationBuffer!.getString(at: index, length: length)!
        case .body:
+            self.currentNameIndex = nil


Can we ever hit this branch without having gone through .headerValue? It seems like this should be an assertion instead.

yep sounds correct... same for the slice imho

Lukasa · 2018-05-02T19:29:18Z

Sources/NIOHTTP1/HTTPDecoder.swift

-    public func decoderRemoved(ctx: ChannelHandlerContext) {
-        // Remove the stored reference to ChannelHandlerContext
-        parser.data = UnsafeMutableRawPointer(bitPattern: 0x0000deadbeef0000)
+    private var decoding: Bool = false


Can we move this up to the other state declarations rather than hiding it just above channelRead?

Lukasa · 2018-05-02T19:29:31Z

Sources/NIOHTTP1/HTTPDecoder.swift

-        settings.on_message_begin = nil
-    }
+        // Guard against re-entrance calls of channelRead(...)
+        guard !decoding else {


Explicit self please. 😄
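For readers unfamiliar with the pattern under review, the following is a rough sketch of a re-entrancy guard with explicit `self`. `ToyDecoder`, `pending`, and `onMessage` are hypothetical names invented for this example; the real handler works on a cumulation buffer and a `ChannelHandlerContext`, which are not modelled here.

```swift
final class ToyDecoder {
    private var decoding = false
    private var pending: [String] = []
    private(set) var delivered: [String] = []
    // Hypothetical hook that lets a consumer feed more data while we are mid-decode.
    var onMessage: ((String) -> Void)?

    func channelRead(_ chunk: String) {
        self.pending.append(chunk)
        // Guard against re-entrant calls: the outer invocation will drain `pending`.
        guard !self.decoding else { return }
        self.decoding = true
        defer { self.decoding = false }

        while !self.pending.isEmpty {
            let next = self.pending.removeFirst()
            self.delivered.append(next)
            self.onMessage?(next)   // may call channelRead(...) again
        }
    }
}

let decoder = ToyDecoder()
decoder.onMessage = { message in
    // Simulate a re-entrant read triggered while the first message is delivered.
    if message == "first" { decoder.channelRead("second") }
}
decoder.channelRead("first")
```

The nested call only queues its data and returns early; the outer loop then picks it up, so messages are delivered in order.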

Lukasa · 2018-05-02T19:29:47Z

Sources/NIOHTTP1/HTTPDecoder.swift

-            // also need to update the reader index to whatever it is now.
-            state.slice = (buffer.readerIndex, slice.length)
-            buffer.moveReaderIndex(forwardBy: state.readerIndexAdjustment)
+        // Needed to guard again re-entrace calls.


s/entrace/entrant/

Lukasa · 2018-05-02T19:30:49Z

Sources/NIOHTTP1/HTTPDecoder.swift

        }
+    }


This method's 80 lines long, can we factor it into some smaller sub-functions?

Lukasa · 2018-05-02T19:31:32Z

Sources/NIOHTTP1/HTTPDecoder.swift

+                    // There was no change to the readable bytes of the cumulationBuffer by a re-entrant call of channelRead. Its safe to break the loop.
+                    break
+                }
+            } while true


repeat...while true is a strange idiom. I know you didn't add it, but can we swap this to just while true { }?

Actually, no, let's use while var bufferSlice = self.cumulationBuffer, bufferSlice.readableBytes > 0. That will cover your loop exit condition at the bottom and also create the buffer slice we need.
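The suggested loop shape can be sketched as follows. `ToySlice` and `ToyLoop` are hypothetical types made up for this example, and the "parser" simply consumes up to two bytes per pass; the point is only that the optional is re-read on every iteration and that the loop ends when it is nil or empty.

```swift
struct ToySlice {
    var bytes: [UInt8]
    var readableBytes: Int { return bytes.count }
}

final class ToyLoop {
    var cumulationBuffer: ToySlice?
    private(set) var iterations = 0

    func drain() {
        // Re-read the optional on every pass; stop when it is nil or has nothing readable.
        while let bufferSlice = self.cumulationBuffer, bufferSlice.readableBytes > 0 {
            self.iterations += 1
            // Pretend the parser consumed at most two bytes of the snapshot.
            let consumed = min(2, bufferSlice.readableBytes)
            self.cumulationBuffer?.bytes.removeFirst(consumed)
        }
    }
}

let loop = ToyLoop()
loop.cumulationBuffer = ToySlice(bytes: [1, 2, 3, 4, 5])
loop.drain()

let emptyLoop = ToyLoop()   // cumulationBuffer stays nil
emptyLoop.drain()
```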

Lukasa · 2018-05-02T19:33:15Z

Sources/NIOHTTP1/HTTPDecoder.swift

+                    return result
+                }
+
+                guard self.cumulationBuffer != nil else {


Would it be better just to not nil the buffer out on channel closure? It'll simplify the code here, and the buffer will get dropped anyway, one way or another.

I can do this, but it would also need to be changed in ByteToMessageDecoder, and it may increase memory usage if a user still holds a reference to the handler after the channel becomes inactive or the handler is removed.

WDYT ?

Lukasa · 2018-05-02T19:34:31Z

Sources/NIOHTTP1/HTTPDecoder.swift

+                // Update readerIndex of the cumulationBuffer itself as we will refetch it in the next loop run if needed.
+                self.cumulationBuffer!.moveReaderIndex(forwardBy: result)
+
+                if bufferSlice.readableBytes == self.cumulationBuffer!.readableBytes + result {


Given that we have an assertion above that bufferSlice.readableBytes == result, and then the next line is self.cumulationBuffer!.moveReaderIndex(forwardBy: result), I think this conditional is just a very complex way of spelling if self.cumulationBuffer!.readableBytes == 0.

yep... refactored to use your while loop :)
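The equivalence claimed above can be checked with a small brute-force sketch. The function and parameter names are invented for this example: `slice` is the readable byte count of the snapshot passed to the parser, and `cumulation` is the live buffer's readable byte count before its reader index moves. The sketch assumes the asserted invariant `result == slice`.

```swift
// The conditional as written in the diff, with the post-move readable count spelled out.
func originalCondition(slice: Int, cumulation: Int, result: Int) -> Bool {
    let remaining = cumulation - result   // readableBytes after moveReaderIndex(forwardBy: result)
    return slice == remaining + result
}

// The simplified spelling proposed in the review.
func simplifiedCondition(cumulation: Int, result: Int) -> Bool {
    return cumulation - result == 0
}

var mismatches = 0
for slice in 0...8 {
    // A re-entrant read can only append, so the live buffer holds at least `slice` bytes.
    for cumulation in slice...16 {
        let result = slice   // the asserted invariant
        let a = originalCondition(slice: slice, cumulation: cumulation, result: result)
        let b = simplifiedCondition(cumulation: cumulation, result: result)
        if a != b { mismatches += 1 }
    }
}
```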

Lukasa · 2018-05-02T19:37:43Z

Sources/NIOHTTP1/HTTPDecoder.swift

-            return .continue
-        }
+    public func decode(ctx: ChannelHandlerContext, buffer: inout ByteBuffer) throws -> DecodingState {
+        return DecodingState.needMoreData


Can you add a code comment here indicating that we don't actually expect this function to be called?

Lukasa · 2018-05-02T19:39:15Z

Sources/NIOHTTP1/HTTPDecoder.swift

        self.state.seenEOF = true
+
+        guard !self.decoding else {
+            // We are currently decoding, return early as we will handle it in the decoding loop.


This comment is wrong: we don't handle the EOF in the decoding loop at all. We need to explicitly call c_nio_http_parser_execute with a length of 0, which the other code never does. You may want to add a test that triggers this case to validate that it doesn't work, and then we can fix it. 😄

Lukasa · 2018-05-03T08:40:20Z

Package.swift

@@ -51,6 +51,8 @@ var targets: [PackageDescription.Target] = [
            dependencies: ["NIO", "NIOHTTP1", "CNIOSHA1"]),
    .target(name: "NIOWebSocketServer",
            dependencies: ["NIO", "NIOHTTP1", "NIOWebSocket"]),
+    .target(name: "NIOHTTP1HelloWorldServer",
+            dependencies: ["NIO", "NIOHTTP1", "NIOConcurrencyHelpers"]),


Any reason this isn't just part of NIOHTTP1Server?

it should not be part of the pr at all... included by mistake.

Lukasa · 2018-05-03T08:42:52Z

Sources/NIOHTTP1/HTTPDecoder.swift

-        }
+                let result = state.baseAddress!.withMemoryRebound(to: Int8.self, capacity: pointer.count, { p in
+                    c_nio_http_parser_execute(&parser, &settings, p.advanced(by: bufferSlice.readerIndex), bufferSlice.readableBytes)
+                })


We can use the trailing closure syntax instead of including the closure in the parenthesis.
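A small sketch of the two spellings, using only standard-library calls on a plain array rather than the parser state from the diff. The byte values and the summing body are placeholders chosen for this example.

```swift
let bytes: [UInt8] = [71, 69, 84]   // "GET"

// Closure passed inside the parentheses, as in the diff under review.
let sumInParens = bytes.withUnsafeBufferPointer { (buf: UnsafeBufferPointer<UInt8>) -> Int in
    return buf.baseAddress!.withMemoryRebound(to: Int8.self, capacity: buf.count, { (p: UnsafePointer<Int8>) -> Int in
        var total = 0
        for i in 0..<buf.count { total += Int(p[i]) }
        return total
    })
}

// The same call written with trailing closure syntax.
let sumTrailing = bytes.withUnsafeBufferPointer { (buf: UnsafeBufferPointer<UInt8>) -> Int in
    return buf.baseAddress!.withMemoryRebound(to: Int8.self, capacity: buf.count) { (p: UnsafePointer<Int8>) -> Int in
        var total = 0
        for i in 0..<buf.count { total += Int(p[i]) }
        return total
    }
}
```

Both forms compile to the same call; the trailing form just avoids the closing `})`.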

Lukasa · 2018-05-03T08:43:37Z

Sources/NIOHTTP1/HTTPDecoder.swift

+                    self.state.currentError = HTTPParserError.httpError(fromCHTTPParserErrno: http_errno(rawValue: httpError))!
+                    throw self.state.currentError!
+                }
+                return result
            }


This block seems wildly too long to me: do we need the pointer to be valid for this long?

nope let me move some stuff outside.

Lukasa · 2018-05-03T08:44:06Z

Sources/NIOHTTP1/HTTPDecoder.swift

+    private func decodeHTTP(ctx: ChannelHandlerContext) throws {
+        // We need to refetch the cumulationBuffer on each loop as it may has changed due re-entrance calls of channelRead(...)
+        while let bufferSlice = self.cumulationBuffer, bufferSlice.readableBytes > 0 {
+            let result = try bufferSlice.withVeryUnsafeBytes { (pointer) -> size_t in


Why does this use withVeryUnsafeBytes and then manipulate the pointer, instead of using withUnsafeReadableBytes?

because it makes things easier when we calculate the readerIndex based on the baseAddress.

Lukasa · 2018-05-03T08:45:17Z

Sources/NIOHTTP1/HTTPDecoder.swift

-            state.baseAddress = nil
+            guard !self.state.seenEOF else {
+                // We need to notify the parser about the EOF as we received it while in http_parser_excecute.
+                self.notifyParserEOF(ctx: ctx)


I think this notifies too early. While there are still bytes in the cumulation buffer, we need to parse them. This should be outside the while loop.

makes sense
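The ordering agreed on here can be sketched roughly as below. `ToyEOFDecoder` and its `execute` method are hypothetical stand-ins written for this example; the only behaviour borrowed from the thread is that EOF is signalled to the parser with a zero-length execute call, and that this should happen only after the buffered bytes have been parsed.

```swift
final class ToyEOFDecoder {
    private var cumulation: [UInt8] = []
    private var seenEOF = false
    private(set) var events: [String] = []

    // Stand-in for the parser's execute call: a zero-length call signals EOF.
    private func execute(_ bytes: ArraySlice<UInt8>) -> Int {
        if bytes.isEmpty {
            self.events.append("eof")
        } else {
            self.events.append("parsed \(bytes.count)")
        }
        return bytes.count
    }

    func feed(_ bytes: [UInt8]) { self.cumulation.append(contentsOf: bytes) }
    func markEOF() { self.seenEOF = true }

    func decode() {
        // First parse everything that is still buffered...
        while !self.cumulation.isEmpty {
            let consumed = self.execute(self.cumulation.prefix(3))
            self.cumulation.removeFirst(consumed)
        }
        // ...and only then tell the parser about the EOF, outside the loop.
        if self.seenEOF { _ = self.execute([]) }
    }
}

let eofDecoder = ToyEOFDecoder()
eofDecoder.feed([1, 2, 3, 4, 5, 6, 7])
eofDecoder.markEOF()    // EOF arrives while bytes are still buffered
eofDecoder.decode()
```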

Lukasa · 2018-05-03T08:46:33Z

Sources/NIOHTTP1/HTTPDecoder.swift

-            if httpError != 0 {
-                self.state.currentError = HTTPParserError.httpError(fromCHTTPParserErrno: http_errno(rawValue: httpError))!
-                throw self.state.currentError!
+            guard self.cumulationBuffer != nil else {


Rather than have this guard, just change the movement of the reader index on line 439 to self.cumulationBuffer?.moveReaderIndex(forwardBy: result). That will cover the buffer being nild out.
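A minimal sketch of why optional chaining can replace the guard: calling a mutating method through `?` is simply skipped when the value is nil. `ToyCursor` is a hypothetical type for this example, not the NIO buffer.

```swift
struct ToyCursor {
    var readerIndex = 0
    mutating func moveReaderIndex(forwardBy offset: Int) { self.readerIndex += offset }
}

var present: ToyCursor? = ToyCursor()
var absent: ToyCursor? = nil

present?.moveReaderIndex(forwardBy: 3)   // mutates the wrapped value in place
absent?.moveReaderIndex(forwardBy: 3)    // skipped: nothing to mutate, no crash
```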

normanmaurer · 2018-05-03T12:59:34Z

@weissi @Lukasa PTAL again

weissi · 2018-05-03T14:34:21Z

Sources/NIOHTTP1/HTTPDecoder.swift

    fileprivate var state = HTTPParserState()

    fileprivate init(type: HTTPMessageT.Type) {
        /* this is a private init, the public versions only allow HTTPClientResponsePart and HTTPServerRequestPart */
        assert(HTTPMessageT.self == HTTPClientResponsePart.self || HTTPMessageT.self == HTTPServerRequestPart.self)
    }

+    deinit {
+        // Remove the stored reference to ChannelHandlerContext
+        self.parser.data = UnsafeMutableRawPointer(bitPattern: 0x0000deadbeef0000)


to make @helje5's life easier and not cause needless merge conflicts, could you make this value 0xdeadbeef? That way it'll work on 32-bit platforms too
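To make the 32-bit concern concrete, this sketch only checks numeric ranges, under the assumption that a pointer bit pattern on a 32-bit platform has to fit in 32 bits. It does not construct a pointer.

```swift
let wide: UInt64 = 0x0000_dead_beef_0000   // the value currently in the diff
let narrow: UInt64 = 0xdead_beef           // the suggested replacement

// UInt32(exactly:) returns nil when the value cannot be represented in 32 bits.
let wideFits = UInt32(exactly: wide) != nil
let narrowFits = UInt32(exactly: narrow) != nil
```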

weissi

generally looks really good but having a hard time thinking all this through :)

Lukasa · 2018-05-03T14:45:20Z

Sources/NIOHTTP1/HTTPDecoder.swift

+            assert(result == bufferSlice.readableBytes)
+
+            // Update readerIndex of the cumulationBuffer itself as we will refetch it in the next loop run if needed.
+            self.cumulationBuffer?.moveReaderIndex(forwardBy: result)


It's marginally faster to use moveReaderIndex(to: writerIndex) I think.

This will mess up with the loop that handles re-entrance.

Ah shoot you're right. Nevermind then.

Lukasa · 2018-05-03T14:46:25Z

Sources/NIOHTTP1/HTTPDecoder.swift

-        let result = try buffer.withVeryUnsafeBytes { (pointer) -> size_t in
-            state.baseAddress = pointer.baseAddress!.assumingMemoryBound(to: UInt8.self)
+        if self.state.seenEOF {
+            // We need to notify the parser about the EOF as we received it while in http_parser_excecute.


Nit: http_parser_execute.

normanmaurer · 2018-05-03T17:30:00Z

@weissi @Lukasa PTAL again

normanmaurer · 2018-05-03T19:21:27Z

@weissi @Lukasa ok now it does handle discarding of decoded bytes a lot better than before (the code is more complex tho :( ). PTAL again.

Lukasa

A quick partial review, will re-review tomorrow.

Lukasa · 2018-05-03T20:01:18Z

Sources/NIOHTTP1/HTTPDecoder.swift

+            self.cumulationBuffer = nil
+
+        case .headerField, .headerValue:
+            if let headerStartIdx = self.state.headerStartIndex {


The whole body of this case is in the if block, so this may as well be guard.

Lukasa · 2018-05-04T08:53:58Z

Sources/NIOHTTP1/HTTPDecoder.swift

-            if let error = state.currentError {
-                throw error
+        case .headerField, .headerValue:
+            guard let headerStartIdx = self.state.headerStartIndex else {


Is it valid to be in this state without a headerStartIndex?

@Lukasa yes as we reset it to nil if we discarded the decoded bytes.

Is there any reason not to set it to 0 instead and avoid this branch?

Yes as otherwise we will try to increase the writerIndex etc later on which is even more expensive.

Lukasa · 2018-05-04T08:58:58Z

Sources/NIOHTTP1/HTTPDecoder.swift

+    }
+
+    /// Will discard bytes till readerIndex if its needed and then call `fn`.
+    private func mayDiscardDecodedBytes(readerIndex: Int, _ fn: () -> Void) {


I think that the "readerIndex" label could be improved: maybe "upTo" so that this now reads mayDiscardDecodedBytes(upTo:).

Lukasa · 2018-05-04T09:01:35Z

Sources/NIOHTTP1/HTTPDecoder.swift

+            fn()
+        }
+
+        self.cumulationBuffer!.moveReaderIndex(to: self.cumulationBuffer!.writerIndex)


Should we assert that at the start of this function the reader index was equal to the writer index?

This was done

Lukasa · 2018-05-04T13:44:52Z

Ok, this looks good to me.

normanmaurer · 2018-05-04T17:23:05Z

Thanks for the review.. merged

normanmaurer requested review from Lukasa and weissi May 2, 2018 18:17

normanmaurer force-pushed the decoder_fix_and_allocation_reduce branch from 3bc7012 to d6087cc Compare May 2, 2018 19:15

Lukasa requested changes May 2, 2018

Lukasa requested changes May 3, 2018

normanmaurer force-pushed the decoder_fix_and_allocation_reduce branch from e3d2f5e to 27f3508 Compare May 3, 2018 09:13

weissi reviewed May 3, 2018

Lukasa requested changes May 3, 2018

normanmaurer force-pushed the decoder_fix_and_allocation_reduce branch from 5c984d4 to 0189aa5 Compare May 3, 2018 19:21

Lukasa requested changes May 3, 2018

Lukasa requested changes May 4, 2018

Lukasa approved these changes May 4, 2018

Lukasa added the semver/patch No public API change. label May 4, 2018

Lukasa added this to the 1.7.0 milestone May 4, 2018

normanmaurer force-pushed the decoder_fix_and_allocation_reduce branch from 26a9275 to be9cd2f Compare May 4, 2018 17:11

normanmaurer merged commit 62a9ff1 into apple:master May 4, 2018

normanmaurer deleted the decoder_fix_and_allocation_reduce branch May 4, 2018 17:22
