Skip to content

Conversation

@buchgr
Copy link
Contributor

@buchgr buchgr commented Sep 6, 2016

The DefaultHttp2Headers class is a general-purpose Http2Headers implementation
and provides much more functionality than we need in gRPC. In gRPC, when reading
headers off the wire, we only inspect a handful of them, before converting to
Metadata.

This commit introduces a Http2Headers implementation that aims for insertion
efficiency, a low memory footprint and fast conversion to Metadata.

  • Header names and values are stored in plain byte[].
  • Insertion is O(1), while lookup is now O(n).
  • Binary header values are base64 decoded as they are inserted.
  • The byte[][] returned by namesAndValues() can directly be used to construct
    a new Metadata object.
  • For HTTP/2 request headers, the pseudo headers are no longer carried over to
    Metadata.

A microbenchmark aiming to replicate the usage of Http2Headers in NettyClientHandler
and NettyServerHandler shows decent throughput gains when compared to DefaultHttp2Headers.

Benchmark                                             Mode  Cnt     Score    Error  Units
InboundHeadersBenchmark.defaultHeaders_clientHandler  avgt   10   262.158 ±  3.624  ns/op
InboundHeadersBenchmark.defaultHeaders_serverHandler  avgt   10  1016.699 ± 14.174  ns/op
InboundHeadersBenchmark.grpcHeaders_clientHandler     avgt   10   203.998 ±  2.329  ns/op
InboundHeadersBenchmark.grpcHeaders_serverHandler     avgt   10   549.832 ±  5.796  ns/op

Additionally, the memory footprint is reduced by more than 50%!

gRPC Request Headers: 864 bytes
Netty Request Headers: 1728 bytes
gRPC Response Headers: 216 bytes
Netty Response Headers: 528 bytes

Furthermore, this change does most of the gRPC groundwork necessary to be able
to cache higher ordered objects in HPACK's dynamic table, as discussed in [1].

[1] #2217

iterations = 10
fork = 1
jvmArgs = "-server -Xms2g -Xmx2g"
include = "io.grpc.netty.InboundHeadersBenchmark"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

revert

@buchgr buchgr force-pushed the avoidheaders branch 2 times, most recently from 2197b4b to 64c674a Compare September 7, 2016 11:26
@buchgr
Copy link
Contributor Author

buchgr commented Sep 7, 2016

PTAL @carl-mastrangelo

bh.consume(Utils.convertHeaders(headers));
}

// /**
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wanted to keep it so in case of changes to the headers we can undocument it and re-evaluate the memory usage. I only commented it out, so that we don't have the JOL dependency.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ack.

@buchgr
Copy link
Contributor Author

buchgr commented Sep 8, 2016

@carl-mastrangelo PTAL

}
byte[] nameBytes = bytes(name);
byte[] valueBytes = toRawSeralizedHeader(nameBytes, bytes(value));
values[namesAndValuesIdx >> 1] = value;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

optional: I personally find the * and / operator more readable when doing semantically arithmetic operations. I believe they are the same speed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh me too :-). Then I misunderstood. Cause I had it that way in the first place. They are the same speed, cause static powers of two get compiled to shift instructions. I ll change them back :)

@carl-mastrangelo
Copy link
Contributor

@buchgr I don't have any other comments, except for the outstanding ones.

@buchgr
Copy link
Contributor Author

buchgr commented Sep 8, 2016

@carl-mastrangelo I just noticed a bug. If we always add 1 to the header size estimate, this will grow forever 😅 . Worst case, By 1 for every header decoded 😅

@carl-mastrangelo
Copy link
Contributor

Oh, the plus one would work on the prev estimate I think:

new = val * .2
old = val * .8 + 1
next = new + old

@buchgr
Copy link
Contributor Author

buchgr commented Sep 8, 2016

@carl-mastrangelo I pushed a change. PTAL. Good to go?

@carl-mastrangelo
Copy link
Contributor

@buchgr LGTM

The DefaultHttp2Headers class is a general-purpose Http2Headers implementation
and provides much more functionality than we need in gRPC. In gRPC, when reading
headers off the wire, we only inspect a handful of them, before converting to
Metadata.

This commit introduces a Http2Headers implementation that aims for insertion
efficiency, a low memory footprint and fast conversion to Metadata.

  - Header names and values are stored in plain byte[].
  - Insertion is O(1), while lookup is now O(n).
  - Binary header values are base64 decoded as they are inserted.
  - The byte[][] returned by namesAndValues() can directly be used to construct
    a new Metadata object.
  - For HTTP/2 request headers, the pseudo headers are no longer carried over to
    Metadata.

A microbenchmark aiming to replicate the usage of Http2Headers in NettyClientHandler
and NettyServerHandler shows decent throughput gains when compared to DefaultHttp2Headers.

Benchmark                                             Mode  Cnt     Score    Error  Units
InboundHeadersBenchmark.defaultHeaders_clientHandler  avgt   10   283.830 ±  4.063  ns/op
InboundHeadersBenchmark.defaultHeaders_serverHandler  avgt   10  1179.975 ± 21.810  ns/op
InboundHeadersBenchmark.grpcHeaders_clientHandler     avgt   10   190.108 ±  3.510  ns/op
InboundHeadersBenchmark.grpcHeaders_serverHandler     avgt   10   561.426 ±  9.079  ns/op

Additionally, the memory footprint is reduced by more than 50%!

gRPC Request Headers: 864 bytes
Netty Request Headers: 1728 bytes
gRPC Response Headers: 216 bytes
Netty Response Headers: 528 bytes

Furthermore, this change does most of the gRPC groundwork necessary to be able
to cache higher ordered objects in HPACK's dynamic table, as discussed in [1].

[1] grpc#2217
@buchgr
Copy link
Contributor Author

buchgr commented Sep 9, 2016

@carl-mastrangelo I added some additional tests (in a separate commit) to increase the code coverage. PTAL.

@carl-mastrangelo
Copy link
Contributor

@buchgr Still LGTM

@buchgr buchgr merged commit 8c18a0d into grpc:master Sep 9, 2016
@buchgr
Copy link
Contributor Author

buchgr commented Sep 9, 2016

thanks for the review @carl-mastrangelo

@lock lock bot locked as resolved and limited conversation to collaborators Jan 21, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants