
Introduce Brotli compression for keyserver -> client socket messages
ClosedPublic

Authored by ashoat on Sep 25 2023, 1:36 PM.
Tags
None
Subscribers

Details

Summary

This addresses ENG-5018, which is one of the follow-ups to ENG-5007.

Scoping this just to staff members for now. If this goes well, we can try the same thing for client -> keyserver socket messages and for HTTP REST endpoints as well.

Depends on D9293

Test Plan

I tested sending a UTF-8 message (an emoji with a skin tone) from web to native and from native to web. I then forced a socket reconnection for the sender and confirmed that the message didn't change once the socket was re-established.

I also tested adding somebody to the Comm org in my local environment. Note that I have only 275 chats inside the Comm org in my local environment, compared to 971 in production.

compressing message with size 7249341 to size 49747 took 1305ms
compressing message with size 7259629 to size 52163 took 1389ms
compressing message with size 6950442 to size 49431 took 1225ms
compressing message with size 7249345 to size 49683 took 1329ms

The compression time was approximately the same whether the sync or async Brotli compression function was used.

While this will increase CPU usage, the massive benefit of shrinking a 6.91 MiB payload to 48.58 KiB definitely outweighs the cost. I also made sure to apply compression only when the payload is at least 4 KiB.

Diff Detail

Repository
rCOMM Comm
Branch
ashoat/brotli
Lint
No Lint Coverage
Unit
No Test Coverage

Event Timeline

native/utils/decompress.js
4 ↗(On Diff #31417)

brotli.js relies on Node.js's Buffer. Recent versions of React Native include this spec-compliant implementation of it

web/.eslintrc.json
8 ↗(On Diff #31417)

Buffer is from Node.js and does not appear on the web. However we polyfill it here
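The referenced line isn't shown in this excerpt, so the shape below is an assumption, but an ESLint polyfill declaration of this kind typically just registers `Buffer` as a known global:

```json
{
  "globals": {
    "Buffer": "readonly"
  }
}
```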

web/utils/decompress.js
1–16 ↗(On Diff #31417)

Identical to native, except there's no need to import Buffer

I was testing compression while trying to reduce report sizes, but I used different packages (ENG-3890) and failed to find anything that worked well on reports. This one seems promising, especially since on native we only use decompression, which is less CPU-intensive than compression. But could you somehow test this with large payloads to make sure it won't kill the app's performance while decompressing?

atul accepted this revision. Edited Sep 26 2023, 9:55 AM

Accepting to unblock; however, it would be good to get thoughts on the following:

  1. It looks like Brotli is included in Zlib in (at least) Node >= 18 LTS: https://nodejs.org/dist/latest-v18.x/docs/api/zlib.html#class-brotlioptions. Would it be possible to leverage this instead of brotli.js (which doesn't seem to have had a "real" commit since ~2016)?
  2. Further, instead of "rolling our own" compressMessage, could we use the WebSocket permessage-deflate extension (https://datatracker.ietf.org/doc/html/rfc7692)? It seems like it can be seamlessly integrated with ws: https://github.com/websockets/ws#websocket-compression. This leverages the Node/zlib library mentioned above out of the box.
  3. For non-WebSocket keyserver endpoints, we could use e.g. Express compression middleware (https://expressjs.com/en/resources/middleware/compression.html) instead of rolling something like compressMessage on our own and applying it on an endpoint-by-endpoint basis.
  4. From the "Express compression middleware" docs:

For a high-traffic website in production, the best way to put compression in place is to implement it at a reverse proxy level (see Use a reverse proxy). In that case, you do not need to use compression middleware. For details on enabling gzip compression in Nginx, see Module ngx_http_gzip_module in the Nginx documentation.

We don't use Nginx, but Apache has a mod_brotli module that we could use to achieve the same thing: https://httpd.apache.org/docs/2.4/mod/mod_brotli.html

At a high level, it seems like we might be able to leverage existing compression capabilities "below" the application layer "for free."

This revision is now accepted and ready to land. Sep 26 2023, 9:55 AM
native/utils/decompress.js
11–14 ↗(On Diff #31417)

It seems like we're making four copies of the data here, I wonder if there's a way to reduce that?

Example of using built-in Node functionality


In D9294#273227, @atul wrote:
  2. Further, instead of "rolling our own" compressMessage, could we use the WebSocket permessage-deflate extension (https://datatracker.ietf.org/doc/html/rfc7692)? It seems like it can be seamlessly integrated with ws: https://github.com/websockets/ws#websocket-compression. This leverages the Node/zlib library mentioned above out of the box.

Looks like this was addressed on Linear: https://linear.app/comm/issue/ENG-5018/compress-large-keyserver-client-payloads#comment-77a4df74

  1. It looks like Brotli is included in Zlib in (at least) Node >= 18 LTS: https://nodejs.org/dist/latest-v18.x/docs/api/zlib.html#class-brotlioptions. Would it be possible to leverage this instead of brotli.js (which doesn't seem to have a "real" commit since ~2016)?

Good call. Made this update!

  2. Further, instead of "rolling our own" compressMessage, could we use the WebSocket permessage-deflate extension (https://datatracker.ietf.org/doc/html/rfc7692)? It seems like it can be seamlessly integrated with ws: https://github.com/websockets/ws#websocket-compression. This leverages the Node/zlib library mentioned above out of the box.

As you pointed out, this is addressed on Linear here.

  3. For non-WebSocket keyserver endpoints, we could use e.g. Express compression middleware (https://expressjs.com/en/resources/middleware/compression.html) instead of rolling something like compressMessage on our own and applying it on an endpoint-by-endpoint basis.

I'll make sure to explore this when looking at compressing e.g. get_initial_redux_state and log_in. I agree that applying it on an endpoint-by-endpoint basis doesn't seem ideal.

  4. From the "Express compression middleware" docs:

For a high-traffic website in production, the best way to put compression in place is to implement it at a reverse proxy level (see Use a reverse proxy). In that case, you do not need to use compression middleware. For details on enabling gzip compression in Nginx, see Module ngx_http_gzip_module in the Nginx documentation.

We don't use Nginx, but Apache has a mod_brotli module that we could use to achieve the same thing: https://httpd.apache.org/docs/2.4/mod/mod_brotli.html

I don't love this solution. I'd rather keep the reverse proxy layer minimally complex. Long-term, the keyserver shouldn't need a reverse proxy at all (it should connect directly to Tunnelbroker).

In D9294#273177, @kamil wrote:

I was testing compression while trying to reduce report sizes, but I used different packages (ENG-3890) and failed to find anything that worked well on reports. This one seems promising, especially since on native we only use decompression, which is less CPU-intensive than compression. But could you somehow test this with large payloads to make sure it won't kill the app's performance while decompressing?

Good thinking... I'll do some additional testing and add it to the Test Plan before landing.

native/utils/decompress.js
11–14 ↗(On Diff #31417)

I don't believe so. I experimented with this a bit and wasn't able to find a better solution. I spent some time trying to do it without base64, since that would significantly reduce the size of the result, but I couldn't get it to work. Feel free to take a stab at it yourself.

PS – it might be 3 copies rather than 4. decompressed is a Uint8Array, and passing that to Buffer.from might just create a "wrapper" around the original data rather than copying it (but I'm not 100% sure).

Remove brotli.js from keyserver in favor of the built-in Node.js utility from the zlib library

keyserver/src/utils/compress.js
7–11

Docs here

19

Node.js's version of Brotli doesn't have a "minimum size" the way brotli.js does, so I had to come up with a minimum myself. I got this from here

22

Docs here

native/utils/decompress.js
4

brotli.js relies on Node.js's Buffer. Recent versions of React Native include this spec-compliant implementation of it

web/.eslintrc.json
8

Buffer is from Node.js and does not appear on the web. However we polyfill it here

web/utils/decompress.js
1–16

Identical to native, except there's no need to import Buffer

keyserver/src/utils/compress.js
22

I update this to use the async version in the next diff: D9298

Looks good, thanks for addressing the feedback!

ashoat edited the test plan for this revision. (Show Details)