[keyserver] mark prekeys and OTKs as published right before calling identity service
AbandonedPublic
Actions

Authored by varun on Apr 26 2024, 12:10 PM.

Details

Reviewers

marcin
kamil
ashoat

Summary

we want to mark prekeys and one-time keys as published right before calling the identity service to prevent a scenario where the identity service is vending keys that have not been published by the client (e.g. the client crashes right after uploading to identity)

Test Plan

https://gist.github.com/vdhanan/be35b6552248428fce2d20a55c6b65cb
https://gist.github.com/vdhanan/a256d9e9290f7aad3effb22260b27026

Diff Detail

Repository

rCOMM Comm

Lint

No Lint Coverage

Unit

No Test Coverage

Event Timeline

varun created this revision.Apr 26 2024, 12:10 PM

Herald added subscribers: tomek, ashoat. · View Herald TranscriptApr 26 2024, 12:10 PM

varun added a child revision: D11811: [web][native] mark keys as published before uploading to identity service.Apr 26 2024, 12:11 PM

varun added inline comments.Apr 26 2024, 12:12 PM

keyserver/src/responders/keys-responders.js
48	this didn't need to be async since `retrieveSessionInitializationKeysSet` isn't an async function
keyserver/src/user/login.js
31	we don't upload one-time keys on login

Harbormaster completed remote builds in B28533: Diff 39588.Apr 26 2024, 12:35 PM

varun requested review of this revision.Apr 26 2024, 12:35 PM

varun edited the test plan for this revision. (Show Details)Apr 26 2024, 12:52 PM

varun edited the test plan for this revision. (Show Details)Apr 26 2024, 12:59 PM

we want to mark prekeys and one-time keys as published right before calling the identity service to prevent a scenario where the identity service is vending keys that have not been published by the client (e.g. the client crashes right after uploading to identity)

I'm not sure I follow this. Can you more specifically describe the scenario you're concerned about?

It was my impression that we were concerned about OTKs being uploaded, dequeued, and then re-uploaded. But I'm not sure there are any risks around re-uploading the same prekey.

The scenario where the identity service is vending keys that have not been published by the client can definitely happen. But I don't think marking prekeys (or even OTKs) as published before calling the identity service mitigates or prevents this scenario.

keyserver/src/user/login.js
199–202	Marking the one-time keys as published has a slightly different effect than marking prekeys as published. Marking a prekey as published won't stop it being returned from `olmAccount.prekey()`, but marking the one-time keys as published WILL stop them from being returned from `olmAccount.one_time_keys()`. On the other hand, the significance of marking-as-published for the prekey is that it unblocks the next key rotation. We pause prekey rotation until the current prekey is marked as published.) Since we're only dealing with prekeys here, I think we should undo this change. With prekeys I think it's best to mark them as published after confirming identity upload. Prekey mark-as-published unblocks the new prekey rotation... failing to do it shouldn't have any effect, other than forcing it to be published again in the next cycle of `validateAndUploadAccountPrekeys`. Double-uploading a prekey doesn't have the same risks that double-uploading an OTK does. Curious for what other people think.
224–227	I considered a similar risk here as the one I described above for the prekeys. In this scenario, the prekey is marked as published but isn't really published, and then eventually the prekey that is on identity gets thrown away on the client during a `forget_old_prekey` call as part of prekey rotation (`validateAndUploadAccountPrekeys`) because the new one is marked as published. We should arguably separate out marking the prekeys as published (after identity upload) from marking the OTKs as published (before identity upload). That said, the concern about prekey rotation is hypothetical, and I think that prekey rotation can't happen without a CSAT anyways, which seems to imply that a successful prekey upload would have to occur first (since you need to call `registerOrLogIn` to get a CSAT).
keyserver/src/utils/olm-utils.js
164–165	`validateAndUploadAccountPrekeys` is called within `fetchCallUpdateOlmAccount`, which can fail to commit the updated Olm account to MariaDB if another `fetchCallUpdateOlmAccount` finishes before it does. When this happens, the client should retry and upload a new prekey, which then should get saved. `fetchCallUpdateOlmAccount` only gives up after 5 attempts, but if it gets there then we'll be in a bad state where the identity service is vending an invalid prekey that the keyserver never persisted to MariaDB. I think this would result in all attempts to auth with this keyserver failing. I feel like a more standard locking mechanism would have probably been wise for `fetchCallUpdateOlmAccount`. The custom retry logic presents nuanced risks. I've brought this up previously here, and in the comment referenced from there. @marcin, what do you think? Barring a rearchitecture of `fetchCallUpdateOlmAccount`, it might make sense to refactor this logic to look more like `registerOrLogIn`, with multiple discrete `fetchCallUpdateOlmAccount` steps: First, a `fetchCallUpdateOlmAccount` step where, if we need to, we generate a new prekey. Next, we upload the new prekey to the identity service. Finally, a `fetchCallUpdateOlmAccount` step where we mark the prekey as published.