Details

Reviewers

marcin
bartek
kamil
ashoat

Commits

rCOMM2dce5a391b66: [backup-client] Compile to wasm

Summary

Compiles backup client to WASM for web. Uses a similar approach to SQliteQueryExecutor where we keep the files inside of the git repository.

tokio-tungstenite-wam allows us to have one API for native (tungstenite) and web websockets. It has very similar API to tungstenite but required a few small changes that I will annotate
web.rs contains a wasm-only file providing JS<->Rust conversions and helpers
A few structs got either serde implementations or conditional wasm_bindgen attributes so that they are available in JS or can be deserialized as arguments from JS

Adding @ashoat as reviewer because of dependency changes.

Depends on D11040

Test Plan

Check that native still compiles (both Android and iOS)
Check that compiling to WASM works (yarn build)
Testing that backup client in WASM works is tested in the future diffs

Diff Detail

Repository

rCOMM Comm

Lint

No Lint Coverage

Unit

No Test Coverage

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes

ashoat added inline comments.Feb 14 2024, 10:53 AM

web/backup-client-wasm/package.json
1–14 ↗	(On Diff #37069)	You appear to be adding a new NPM project here, but excluding it from our workspace. Can you explain the motivations for this decision, and why it's being excluded from the `workspaces` list in our root `package.json`? Additionally – if this approach diverges from the approach we took with `comm_query_executor.wasm`, can you explain the motivations for that as well?
6–9 ↗	(On Diff #37069)	Not sure this does anything if we never publish
12 ↗	(On Diff #37069)	Should we have a clean step? Should the clean step be run when the user runs `yarn cleaninstall` at the root?
web/package.json
14 ↗	(On Diff #37069)	Why does this NPM script need to exist? It seems like it's just an alias for another NPM script. Can't we run that first NPM script directly? If we keep this, can we try `yarn workspace backup-client-wasm build`?

ashoat added inline comments.Feb 14 2024, 10:56 AM

web/backup-client-wasm/package.json
1–14 ↗	(On Diff #37069)	Looks like it's added in D11042. I'm still curious about differences with the approach we took with `comm_query_executor.wasm`, though

Updated the approach to be more similar to the comm_query_executor (so no additional yarn workspace, the rust crate is now part of web).

Confused why Phabricator isn't hiding the files in the _generated folder.

Me too, I'm not sure why. I wonder if it's broken? D10838 (a random diff which modified comm_query_executor) also doesn't have hidden files.

I modified the existing CI job to also check the backup client WASM. I'm not sure how to proceed with the filename though. Should I just rename it to something like "wasm-build.yml" (and change filename in buildkite pipeline definition) and devs would be expected to rebase onto these changes to fix the CI?
Task created here: ENG-6818
The code isn't tightly coupled and changes in one won't result in changes to the other. Also the build systems are completely different (emscripten vs cargo+wasm-pack) so I don't think the emscripten build script should build the backup client code. Additionally I don't think we need to have a separate script for the backup client as wasm-pack deals with the complexity for use. We don't need to specify all the different files we want to include in the resulting WASM, it's done automatically as a part of rust build. A package.json command that runs web-pack with required flags should be enough.

I've decided to go with the comm_query_executor.wasm approach for the same reasons we decided to with it for the database code: having the .wasm file in the repo greatly reduces the developer friction compared to having to upload a new package to NPM, especially for developers who haven't done it before.

The tradeoffs for the the different approaches for the comm_query_executor were discussed in this comment. In particular for the NPM package:

NPM package
Similar to olm, but I am afraid that it will be really painful, after looking at the codebase a think changes to the database are rather frequent, which means we will need to (each time there are some changes) build this package, publish, bump version in package.json and after make a web release. And also somehow setup dev environment to reflect changes in database files, which is basically similar to option 2 but without the docker problem.

On the other hand we have already added some rust build step during yarn cleaninstall -> for the keyserver we build the rust-node-addon. So there would already be some precedence for every dev building it. If we decided to go with this approach we would:

Add clean-backup-client-wasm to yarn clean
Add build-backup-client-wasm to web's postinstall script
.gitignore the generated files

On my machine (M2 Pro, 16GB), running build-backup-client-wasm after a yarn cleaninstall takes about ~22s with the global cargo cache having all dependencies (so everything is downloaded but needs to be compiled from scratch).

shared/backup_client/Cargo.toml
27–32 ↗	(On Diff #37069)	`wasm-bindgen` was already used by `comm-opaque2`. Licenses: `wasm-bindgen`, `wasm-bindgen-futures`, `web-sys` (they share the repo): double licensed: MIT, Apache `tokio-tungstenite-wasm`: MIT `serde-wasm-bindgen`: MIT
web/backup-client-wasm/package.json
1–14 ↗	(On Diff #37069)	The differences between this and comm_query_executor were mostly because one is C++ with emscripten and another is a Rust project so I was mixing approaches between `comm_query_executor` and `comm-opaque2-wasm`. I've simplified the code a bit now, so that it's now more similar to `comm_query_executor` and it's no longer a new yarn workspace/npm project. I've explained a bit more about why I chose this general approach in the top level comment.
12 ↗	(On Diff #37069)	Added a clean step in the `web/package.json`. Using the approach where we commit the files to the repo, we shouldn't add it to the `yarn cleaninstall` command (the files should always be current so there shouldn't be a need to re-build them).
web/package.json
14 ↗	(On Diff #37069)	I wanted to keep the `build-db-wasm` and `build-backup-client-wasm` commands together so they are more discoverable. I've now moved the whole logic into these top-level (from the `web` workspace perspective) commands.

Harbormaster failed remote builds in B26925: Diff 37262!Feb 15 2024, 7:43 AM

Seems like the emscripten CI uses emscripten/emsdk:3.1.17 docker image which probably doesn't have yarn. In that case it would probably be easier to introduce a new buildkite job for backup-client-wasm.

New approach makes sense to me!

In D11041#320318, @michal wrote:

Seems like the emscripten CI uses emscripten/emsdk:3.1.17 docker image which probably doesn't have yarn. In that case it would probably be easier to introduce a new buildkite job for backup-client-wasm.

That makes sense to me

Added a clean step in the web/package.json.

Using the approach where we commit the files to the repo, we shouldn't add it to the yarn cleaninstall command (the files should always be current so there shouldn't be a need to re-build them).

Hmm... how about the target folder?

.eslintignore
11 ↗	(On Diff #37262)	Any reason you gave a specific filename here instead of the whole `_generated`?

Reverted the changes to emscripten buildkite CI job for now, I will add a new buildkite job in a later revision. Add the target folder to the general yarn workspace web clean command.

Harbormaster completed remote builds in B26957: Diff 37300.Feb 16 2024, 3:58 AM

michal added inline comments.Feb 16 2024, 4:01 AM

.eslintignore
11 ↗	(On Diff #37262)	Sorry, I forgot to flag this in an inline comment because there were a lot of changes. I'm having a bit of a problem with specifying flow types for the WASM module. In this revision I've created a `web/backup-client-wasm/_generated/backup-client-wasm.js.flow` file with flow types according to docs. And it works great except that it's in a `_generated` folder... Do you maybe have any ideas for a better way to solve this? Docs for library definitions say For defining modules that are accessed via a relative require/import path, please see the docs on the .flow files So I don't think we can use them for this purpose. `comm-query-executor` defines a function `getDatabaseModule` that casts `any` to a type defined in `web/database/types/module.js`. This works well because it only has one export. I guess we could go with something similar for backup client but it default exports an init function, a `RequestedData` enum and a `BackupClient` class so it's not as clean.

In D11041#319521, @ashoat wrote:

Confused why Phabricator isn't hiding the files in the _generated folder.

I think we need to add a @generated phrase to the file if we want to hide it - like here.

web/backup-client-wasm/_generated/backup-client-wasm.js.flow
1 ↗	(On Diff #37300)	I think the fact that this file is in `/_generated` means that after cleaning a building `backup-client` I will end up with deleted types. Maybe there is some workaround to improve it?

kamil added inline comments.Feb 16 2024, 7:47 AM

web/backup-client-wasm/_generated/backup-client-wasm.js.flow
3 ↗	(On Diff #37300)	shouldn't we type it better like here?

You could potentially define the types in a different place entirely in web. For instance, consider how we type the ethers.js package – instead of a libdef, we just have a EthersProvider type that covers the four functions we use on it. We import directly from the package, and then we cast the import to our type.

Another option is to keep the .flow types in the same folder, but to name the folder something different than generated, and to be more specific (specific to files rather than general to the whole folder) in our .gitignore, .eslintignore, and rm-rf workflows. This is what we do with our grpc-web bindings, which are stored in web/protobufs. We could use the @generated pragma for the files that are actually generated, and skip it for the files that aren't.

Back to you for the Flow stuff, but this is looking almost ready

This revision now requires changes to proceed.Feb 16 2024, 9:51 AM

Renamed _generated folder to wasm because it will also include the flow types
Made the clean-backup-client-wasm skip the flow type file, and the eslint skip only the generated file
Added a run_wasmpack.sh script (inspired by run_emscripten.sh) as there are two additional actions we need to take and I think this is enough complexity to warrant a script file:
- remove the autogenerated .gitignore file
- add "@generated" at the top of the generated JS file
Update the flow types

Additionally I have created a task for the buildkite CI job: ENG-6873. I will add a diff later in the stack with this code, and land this only after the buildkite job is accepted. I did this because I wasn't able to get to implementing it today, but I wanted feedback on this (and later) diffs faster.

Harbormaster completed remote builds in B26973: Diff 37332.Feb 19 2024, 9:53 AM

ashoat accepted this revision.Feb 19 2024, 7:50 PM

This revision is now accepted and ready to land.Feb 19 2024, 7:50 PM

Would be good to have somebody else approve the Rust code, as I'm not capable of reviewing that. @bartek is out so maybe @kamil?

This revision now requires review to proceed.Feb 19 2024, 7:51 PM

michal added a child revision: D11115: [web] Check backup client wasm integrity buildkite.Feb 20 2024, 2:40 AM

kamil accepted this revision.Feb 21 2024, 4:40 AM

kamil added inline comments.

web/backup-client-wasm/wasm/backup-client-wasm.js.flow
3 ↗	(On Diff #37332)	shouldn't we type it like this?

This revision is now accepted and ready to land.Feb 21 2024, 4:40 AM

Update lock file

Harbormaster completed remote builds in B27107: Diff 37515.Feb 23 2024, 5:20 AM

Rebase

Harbormaster failed remote builds in B27126: Diff 37543!Feb 23 2024, 9:50 AM

Rebase

Harbormaster failed remote builds in B27175: Diff 37599!Feb 26 2024, 6:04 AM

Harbormaster completed remote builds in B27126: Diff 37543.Feb 26 2024, 2:14 PM

Harbormaster completed remote builds in B27175: Diff 37599.Feb 26 2024, 3:44 PM

Rebase

Harbormaster completed remote builds in B27203: Diff 37652.Feb 27 2024, 6:26 AM

As discussed in ENG-6873 : Add a buildkite job for checking integrity of backup client wasm file, we are changing thr approach and instead of keeping the wasm file in git repo we will be building it on cleaninstall as we do with the keyserver node addon.

Remove the generated files from repo and add them to the gitignore
Make cleaninstall also clear these files
Add a postinstall script to web which runs the backup client wasm build
No longer add the @generated header to the generated js file

Harbormaster failed remote builds in B27289: Diff 37766!Mar 4 2024, 4:13 AM

Fixes for keyserver docker build.

Try adding wasm pack as a yarn dependency for non-nix workflows.

Harbormaster completed remote builds in B27295: Diff 37772.Mar 4 2024, 6:56 AM

Requesting review again because there were some changes:

As mentioned in one of the previous comments, the generated files are no longer in repo and instead we have a postinstall script
wasm-pack (the CLI we are using for easy compiling to ready-to-use wasm) is provided on nix which is problematic for other platforms like CI. There are other download options and I decided to go with a pre-built npm dependency in web, as the easiest and the fastest (cargo install would compile from scratch)
There were some changes required in the keyserver dockerfile to bring in our rust shared/ deps

keyserver/Dockerfile
59	`wasm-pack` uses a `wasm-opt` tool which optimizes wasm files size. It's part of the binaryen suite of tools. For some reason when using building the keyserver image locally I'm getting: Error: no prebuilt wasm-opt binaries are available for this platform: Unrecognized target! To disable wasm-opt, add wasm-opt = false to your package metadata in your 'Cargo.toml'. This does not happen when `wasm-pack` is installed with nix and building the WASM outside of docker I didn't want to disable `wasm-opt` Installing binaryen makes the `wasm-pack` use the package on `$PATH` and the docker builds successfuly This PR added the option of using the version installed globally There are a few related github issues: https://github.com/rustwasm/wasm-pack/issues/913 https://github.com/WebAssembly/binaryen/issues/4334 Some of them are closed but we are using the versions with the fixes Some other suggestions from the issues: Disabling the wasm opt but I think we should avoid that Using `cargo install wasm-pack --git https://github.com/rustwasm/wasm-pack --rev ae10c23cc14b79ed37a7be222daf5fd851b9cd0d` but 1. this will compile it from scratch 2. uses an older version of the wasm-pack 3. Is more of a workaround than a fix

As mentioned in one of the previous comments, the generated files are no longer in repo and instead we have a postinstall script

I know this was discussed earlier, but can you remind me why we're taking the rust-node-addon approach here instead of the opaque-ke-wasm approach? Is it because of issues getting the CI pipeline working?

.eslintignore
9	Is there a similar `pkg` folder we should add to `.eslintignore` for `backup-client-wasm`?
keyserver/Dockerfile
54	Please add a comment here explaining why we need `binaryen`
59	Thanks for explaining all of the debugging you've done!
59	I wonder if this could be "solved" by installing `wasm-opt` via NPM?
117	Can we use a dash for the naming? I can see that `web/scripts/run_emscripten.sh` does not follow this convention, but in general I think we use dashes for filenames instead of underscores
web/package.json
9	It looks like when the input to the WASM changes, `yarn dev` won't know to recompile. In those cases, does the dev need to run `yarn build-backup-client-wasm` manually? And then after, run `yarn dev` again?
16	Should this new `postinstall.sh` file be in `web/scripts`?

Renamed the script to run-emscripten.sh, moved postinstall.sh to scripts. Added web/backup-client-wasm/target to .dockerignore.

I know this was discussed earlier, but can you remind me why we're taking the rust-node-addon approach here instead of the opaque-ke-wasm approach? Is it because of issues getting the CI pipeline working?

Yes, basically the it WASM compilation wasn't as deterministic as I had hoped and I wasn't able to make the CI work. Because of this we can't guarantee that the repo has the most current changes. With these changes we can be sure that after yarn cleaninstall we have the current WASM code.

.eslintignore
9	I rename the `pkg` folder to `wasm` with the `--out-dir "${OUT_DIR}"` flag in `run-wasmpack.sh` script. We don't `.eslintignore` the whole folder because in case of backup client it also contains the `.flow` file with types. We could stop passing the flag and instead just use the `pkg` folder but because we have some non-generated code inside I think the `wasm` folder name makes more sense.
keyserver/Dockerfile
59	I was able to build the docker image with the binaryen NPM package. So I removed the native package here and added it to `package.json`.
web/package.json
9	Yes, the dev will need to run `yarn build-backup-client-wasm` manually. But webpack (more specifically the `CopyPlugin`) should pick up the changes automatically after that.

Harbormaster failed remote builds in B27325: Diff 37818!Mar 5 2024, 3:35 AM

Fix script path in dockerfile

Harbormaster failed remote builds in B27326: Diff 37819!Mar 5 2024, 4:00 AM

michal planned changes to this revision.Mar 5 2024, 5:16 AM

Try reverting binaryen to the apt install because of CI failure.

Harbormaster completed remote builds in B27333: Diff 37826.Mar 5 2024, 6:34 AM

When using binaryen from npm I was getting this error in CI:

<...binary blob...>

RangeError: Invalid typed array length: 4289064482
    at new Int8Array (<anonymous>)
    at Object.write (/comm/node_modules/binaryen/bin/wasm-opt:2:9155272)
    at Object.write (/comm/node_modules/binaryen/bin/wasm-opt:2:9201027)
    at doWritev (/comm/node_modules/binaryen/bin/wasm-opt:2:9193410)
    at _fd_write (/comm/node_modules/binaryen/bin/wasm-opt:2:9193623)
    at wasm://wasm/01a172d6:wasm-function[2458]:0x2a68c7
    at wasm://wasm/01a172d6:wasm-function[1276]:0x1138a7
    at wasm://wasm/01a172d6:wasm-function[534]:0xa4e68
    at wasm://wasm/01a172d6:wasm-function[2403]:0x2a5068
    at wasm://wasm/01a172d6:wasm-function[2451]:0x2a6628
Error: failed to execute `wasm-opt`: exited with exit status: 7
  full command: "/comm/web/node_modules/.bin/wasm-opt" "/comm/web/scripts/../backup-client-wasm/wasm/backup-client-wasm_bg.wasm" "-o" "/comm/web/scripts/../backup-client-wasm/wasm/backup-client-wasm_bg.wasm-opt.wasm" "-O"
To disable `wasm-opt`, add `wasm-opt = false` to your package metadata in your `Cargo.toml`.

so I decided to revert back to the binaryen installed from apt. I don't think using either one over the other is much of a difference so I didn't do much investigation why the error happened (also the apt version might a bit faster but this is just a gut feeling, I didn't do any benchmarks). If the npm version is more preferable and I should spend some time trying to make it work, let me know.

Thanks for trying!

This revision is now accepted and ready to land.Mar 5 2024, 4:20 PM

Closed by commit rCOMM2dce5a391b66: [backup-client] Compile to wasm (authored by Michal Gniadek <michal.gniadek@swmansion.com>). · Explain WhyMar 6 2024, 6:09 AM

This revision was automatically updated to reflect the committed changes.

Michal Gniadek <michal.gniadek@swmansion.com> added a commit: rCOMM2dce5a391b66: [backup-client] Compile to wasm.

[backup-client] Compile to wasm
ClosedPublic
Actions

Details

Diff Detail

Event Timeline

Revision Contents
Changeset List

Diff 37772

.eslintignore

.gitignore

keyserver/Dockerfile

native/native_rust_library/Cargo.lock

services/commtest/Cargo.lock

shared/backup_client/Cargo.lock

shared/backup_client/Cargo.toml

shared/backup_client/src/lib.rs

shared/backup_client/src/web.rs

web/backup-client-wasm/Cargo.lock

web/backup-client-wasm/Cargo.toml

web/backup-client-wasm/src/lib.rs

web/backup-client-wasm/wasm/backup-client-wasm.js.flow

web/package.json

web/postinstall.sh

web/scripts/run_wasmpack.sh

yarn.lock

[backup-client] Compile to wasmClosedPublicActions

Details

Diff Detail

Event Timeline

Revision ContentsChangeset List

Diff 37772

.eslintignore

.gitignore

keyserver/Dockerfile

native/native_rust_library/Cargo.lock

services/commtest/Cargo.lock

shared/backup_client/Cargo.lock

shared/backup_client/Cargo.toml

shared/backup_client/src/lib.rs

shared/backup_client/src/web.rs

web/backup-client-wasm/Cargo.lock

web/backup-client-wasm/Cargo.toml

web/backup-client-wasm/src/lib.rs

web/backup-client-wasm/wasm/backup-client-wasm.js.flow

web/package.json

web/postinstall.sh

web/scripts/run_wasmpack.sh

yarn.lock

[backup-client] Compile to wasm
ClosedPublic
Actions

Revision Contents
Changeset List