Details
I tested sending a UTF-8 message (an emoji with a skin tone modifier) from web to native and from native to web. I then forced a socket reconnection for the sender and confirmed that the message didn't change once the socket was re-established.
I also tested adding somebody to the Comm org in my local environment. Note that my local environment has only 275 chats inside the Comm org, compared to 971 in production.
```
compressing message with size 7249341 to size 49747 took 1305ms
compressing message with size 7259629 to size 52163 took 1389ms
compressing message with size 6950442 to size 49431 took 1225ms
compressing message with size 7249345 to size 49683 took 1329ms
```
The compression time was approximately the same whether the sync or async Brotli compression function was used.
While this will increase CPU usage, the massive benefit of shrinking a 6.91 MiB payload to 48.58 KiB definitely outweighs the cost. I also made sure to apply compression only when the payload is at least 4 KiB.
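As an illustration of that thresholding, here's a minimal sketch assuming Node's built-in zlib Brotli bindings; the function name, return shape, and base64 encoding are assumptions for illustration, not the exact code in this diff:

```js
// Minimal sketch: skip compression for payloads under 4 KiB, where the CPU
// cost outweighs the savings. brotliCompressSync is Node's built-in binding.
const zlib = require('zlib');

const MIN_COMPRESSION_SIZE = 4096; // 4 KiB threshold from the test plan

function compressMessage(message) {
  const buffer = Buffer.from(message, 'utf8');
  if (buffer.byteLength < MIN_COMPRESSION_SIZE) {
    return { compressed: false, result: message };
  }
  const compressed = zlib.brotliCompressSync(buffer);
  return { compressed: true, result: compressed.toString('base64') };
}
```

The flag in the return value lets the receiving side know whether it needs to decompress before parsing.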
Diff Detail
- Repository: rCOMM Comm
- Branch: ashoat/brotli
- Lint: No Lint Coverage
- Unit: No Test Coverage
Event Timeline
native/utils/decompress.js

| Line(s) | Comment |
|---|---|
| 4 | brotli.js relies on Node.js's Buffer. Recent versions of React Native include this spec-compliant implementation of it. |

web/.eslintrc.json

| Line(s) | Comment |
|---|---|
| 8 | Buffer is from Node.js and does not appear on the web. However, we polyfill it here. |

web/utils/decompress.js

| Line(s) | Comment |
|---|---|
| 1–16 | Identical to native, except there's no need to import Buffer. |
I was testing compression while trying to reduce report sizes (ENG-3890), but I used different packages and failed to find something that worked well on reports. This one seems promising, especially since on native we only use decompression, which is less CPU-intensive than compression. That said, could you somehow test this with large payloads to make sure decompression won't kill the app's performance?
Accepting to unblock. However, it would be good to get thoughts on the following:
- It looks like Brotli is included in Zlib in (at least) Node >= 18 LTS: https://nodejs.org/dist/latest-v18.x/docs/api/zlib.html#class-brotlioptions. Would it be possible to leverage this instead of brotli.js (which doesn't seem to have a "real" commit since ~2016)?
- Further, instead of "rolling our own" compressMessage, could we use the WebSocket permessage-deflate extension? (https://datatracker.ietf.org/doc/html/rfc7692). Seems like it can seamlessly be integrated w/ ws: https://github.com/websockets/ws#websocket-compression. This leverages the Node/zlib library mentioned above out of the box.
- For non-WebSocket keyserver endpoints, we could use e.g. Express compression middleware (https://expressjs.com/en/resources/middleware/compression.html) instead of rolling something like compressMessage on our own and applying it on an endpoint-by-endpoint basis.
- From the "Express compression middleware" docs:
> For a high-traffic website in production, the best way to put compression in place is to implement it at a reverse proxy level (see Use a reverse proxy). In that case, you do not need to use compression middleware. For details on enabling gzip compression in Nginx, see Module ngx_http_gzip_module in the Nginx documentation.

We don't use Nginx, but could Apache's mod_brotli module (https://httpd.apache.org/docs/2.4/mod/mod_brotli.html) achieve the same thing?
At a high level, it seems like we might be able to leverage existing compression capabilities "below" the application layer "for free."
native/utils/decompress.js

| Line(s) | Comment |
|---|---|
| 11–14 | It seems like we're making four copies of the data here; I wonder if there's a way to reduce that? |
Looks like this was addressed on Linear: https://linear.app/comm/issue/ENG-5018/compress-large-keyserver-client-payloads#comment-77a4df74
- It looks like Brotli is included in Zlib in (at least) Node >= 18 LTS: https://nodejs.org/dist/latest-v18.x/docs/api/zlib.html#class-brotlioptions. Would it be possible to leverage this instead of brotli.js (which doesn't seem to have a "real" commit since ~2016)?
Good call. Made this update!
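As a rough sketch of what that swap might look like (the option values here are assumptions; the actual diff may tune them differently), zlib exposes both sync and async Brotli functions, and the async variant avoids blocking the event loop:

```js
// Sketch: Node's built-in Brotli bindings from zlib, promisified for async use.
// BROTLI_PARAM_SIZE_HINT lets the encoder optimize when the input size is known.
const { promisify } = require('util');
const zlib = require('zlib');

const brotliCompress = promisify(zlib.brotliCompress);

async function compress(data) {
  return await brotliCompress(data, {
    params: {
      // Quality level is an assumption for illustration; the diff may differ.
      [zlib.constants.BROTLI_PARAM_QUALITY]: 11,
      [zlib.constants.BROTLI_PARAM_SIZE_HINT]: data.byteLength,
    },
  });
}
```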
- Further, instead of "rolling our own" compressMessage, could we use the WebSocket permessage-deflate extension? (https://datatracker.ietf.org/doc/html/rfc7692). Seems like it can seamlessly be integrated w/ ws: https://github.com/websockets/ws#websocket-compression. This leverages the Node/zlib library mentioned above out of the box.
As you pointed out, this is addressed on Linear here.
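For reference, a minimal sketch of the ws suggestion above, with illustrative option values mirroring the ws README (not code from this diff):

```js
// Sketch: enabling the permessage-deflate extension in the ws server.
const WebSocket = require('ws');

const wss = new WebSocket.Server({
  port: 8080,
  perMessageDeflate: {
    zlibDeflateOptions: { level: 6 }, // illustrative compression level
    threshold: 4096, // only compress messages at least this many bytes
  },
});
```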
- For non-WebSocket keyserver endpoints, we could use e.g. Express compression middleware (https://expressjs.com/en/resources/middleware/compression.html) instead of rolling something like compressMessage on our own and applying it on an endpoint-by-endpoint basis.
I'll make sure to explore this when looking at compressing eg. get_initial_redux_state and log_in. I agree that applying it on an endpoint-by-endpoint basis doesn't seem ideal.
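A minimal sketch of that middleware approach, assuming the same 4 KiB threshold used for socket messages (purely illustrative):

```js
// Sketch: Express compression middleware applied globally, so individual
// endpoints don't each need their own compression logic.
const express = require('express');
const compression = require('compression');

const app = express();
app.use(compression({ threshold: 4096 })); // skip responses smaller than 4 KiB
```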
- From the "Express compression middleware" docs:
> For a high-traffic website in production, the best way to put compression in place is to implement it at a reverse proxy level (see Use a reverse proxy). In that case, you do not need to use compression middleware. For details on enabling gzip compression in Nginx, see Module ngx_http_gzip_module in the Nginx documentation.

We don't use Nginx, but could Apache's mod_brotli module (https://httpd.apache.org/docs/2.4/mod/mod_brotli.html) achieve the same thing?
I don't love this solution. I'd rather keep the reverse proxy layer minimally complex; long-term, the keyserver shouldn't need a reverse proxy at all (it should connect directly to Tunnelbroker).
Good thinking... I'll do some additional testing and add it to the Test Plan before landing.
native/utils/decompress.js

| Line(s) | Comment |
|---|---|
| 11–14 | I don't believe so. I experimented with this a bit and wasn't able to find a better solution. I spent some time trying to do it without base64, since that would significantly reduce the size of the result, but I couldn't get it to work. Feel free to take a stab at it yourself! PS: it might be 3 and not 4 copies. decompressed is a Uint8Array, and passing that to Buffer.from might just create a "wrapper" around the original data rather than copying (but not 100% sure). |
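To make the copy-counting concrete, here is a hypothetical sketch of the decompression path under discussion; the function name and the base64 wire format are assumptions based on this thread, not the exact code in the diff:

```js
// Hypothetical sketch of client-side decompression using brotli.js's
// decompress() and a base64 wire format. Comments mark where copies happen.
import { decompress } from 'brotli';
import { Buffer } from 'buffer'; // Buffer polyfill (see the inline comments above)

function decompressMessage(base64Data) {
  const compressed = Buffer.from(base64Data, 'base64'); // copy 1: decoded bytes
  const decompressed = decompress(compressed); // copy 2: decompressed Uint8Array
  // Buffer.from(decompressed) may be a third copy, or possibly just a wrapper
  // around the same memory (per the uncertainty above); toString is the final copy.
  return Buffer.from(decompressed).toString('utf8');
}
```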
Remove brotli.js from keyserver in favor of built-in Node.js utility from zlib library
keyserver/src/utils/compress.js

| Line(s) | Diff | Comment |
|---|---|---|
| 7–11 ↗ | (On Diff #31427) | Docs here |
| 19 ↗ | (On Diff #31427) | Node.js's version of Brotli doesn't have a "minimum size" like brotli.js, so I had to come up with a minimum. I got this from here. |
| 22 ↗ | (On Diff #31427) | Docs here |

native/utils/decompress.js

| Line(s) | Diff | Comment |
|---|---|---|
| 4 ↗ | (On Diff #31427) | brotli.js relies on Node.js's Buffer. Recent versions of React Native include this spec-compliant implementation of it. |

web/.eslintrc.json

| Line(s) | Diff | Comment |
|---|---|---|
| 8 ↗ | (On Diff #31427) | Buffer is from Node.js and does not appear on the web. However, we polyfill it here. |

web/utils/decompress.js

| Line(s) | Diff | Comment |
|---|---|---|
| 1–16 ↗ | (On Diff #31427) | Identical to native, except there's no need to import Buffer. |