Page MenuHomePhabricator

[keyserver] mark prekeys and OTKs as published right before calling identity service
Needs RevisionPublic

Authored by varun on Fri, Apr 26, 12:10 PM.
Tags
None
Referenced Files
F1712440: D11810.id.diff
Tue, May 7, 3:14 AM
Unknown Object (File)
Thu, May 2, 12:56 AM
Unknown Object (File)
Wed, May 1, 1:05 PM
Unknown Object (File)
Mon, Apr 29, 9:33 AM
Subscribers

Details

Summary

we want to mark prekeys and one-time keys as published right before calling the identity service to prevent a scenario where the identity service is vending keys that have not been published by the client (e.g. the client crashes right after uploading to identity)

Test Plan

Diff Detail

Repository
rCOMM Comm
Lint
No Lint Coverage
Unit
No Test Coverage

Event Timeline

keyserver/src/responders/keys-responders.js
48

this didn't need to be async since retrieveSessionInitializationKeysSet isn't an async function

keyserver/src/user/login.js
31

we don't upload one-time keys on login

ashoat requested changes to this revision.Sat, Apr 27, 6:04 PM

we want to mark prekeys and one-time keys as published right before calling the identity service to prevent a scenario where the identity service is vending keys that have not been published by the client (e.g. the client crashes right after uploading to identity)

I'm not sure I follow this. Can you more specifically describe the scenario you're concerned about?

It was my impression that we were concerned about OTKs being uploaded, dequeued, and then re-uploaded. But I'm not sure there are any risks around re-uploading the same prekey.

The scenario where the identity service is vending keys that have not been published by the client can definitely happen. But I don't think marking prekeys (or even OTKs) as published before calling the identity service mitigates or prevents this scenario.

keyserver/src/user/login.js
199–202

Marking the one-time keys as published has a slightly different effect than marking prekeys as published. Marking a prekey as published won't stop it being returned from olmAccount.prekey(), but marking the one-time keys as published WILL stop them from being returned from olmAccount.one_time_keys().

On the other hand, the significance of marking-as-published for the prekey is that it unblocks the next key rotation. We pause prekey rotation until the current prekey is marked as published.)

Since we're only dealing with prekeys here, I think we should undo this change. With prekeys I think it's best to mark them as published after confirming identity upload. Prekey mark-as-published unblocks the new prekey rotation... failing to do it shouldn't have any effect, other than forcing it to be published again in the next cycle of validateAndUploadAccountPrekeys. Double-uploading a prekey doesn't have the same risks that double-uploading an OTK does.

Curious for what other people think.

224–227

I considered a similar risk here as the one I described above for the prekeys. In this scenario, the prekey is marked as published but isn't really published, and then eventually the prekey that is on identity gets thrown away on the client during a forget_old_prekey call as part of prekey rotation (validateAndUploadAccountPrekeys) because the new one is marked as published.

We should arguably separate out marking the prekeys as published (after identity upload) from marking the OTKs as published (before identity upload). That said, the concern about prekey rotation is hypothetical, and I think that prekey rotation can't happen without a CSAT anyways, which seems to imply that a successful prekey upload would have to occur first (since you need to call registerOrLogIn to get a CSAT).

keyserver/src/utils/olm-utils.js
164–165

validateAndUploadAccountPrekeys is called within fetchCallUpdateOlmAccount, which can fail to commit the updated Olm account to MariaDB if another fetchCallUpdateOlmAccount finishes before it does.

When this happens, the client should retry and upload a new prekey, which then should get saved. fetchCallUpdateOlmAccount only gives up after 5 attempts, but if it gets there then we'll be in a bad state where the identity service is vending an invalid prekey that the keyserver never persisted to MariaDB. I think this would result in all attempts to auth with this keyserver failing.

I feel like a more standard locking mechanism would have probably been wise for fetchCallUpdateOlmAccount. The custom retry logic presents nuanced risks. I've brought this up previously here, and in the comment referenced from there. @marcin, what do you think?

Barring a rearchitecture of fetchCallUpdateOlmAccount, it might make sense to refactor this logic to look more like registerOrLogIn, with multiple discrete fetchCallUpdateOlmAccount steps:

  1. First, a fetchCallUpdateOlmAccount step where, if we need to, we generate a new prekey.
  2. Next, we upload the new prekey to the identity service.
  3. Finally, a fetchCallUpdateOlmAccount step where we mark the prekey as published.
This revision now requires changes to proceed.Sat, Apr 27, 6:04 PM