Page MenuHomePhabricator

[Identity] Implement GetKeyserverKeys
AbandonedPublic

Authored by jon on Aug 3 2023, 9:59 AM.
Tags
None
Referenced Files
F1666600: D8722.id29516.diff
Fri, Apr 26, 10:00 AM
Unknown Object (File)
Tue, Apr 16, 7:50 AM
Unknown Object (File)
Tue, Apr 16, 7:39 AM
Unknown Object (File)
Tue, Apr 16, 7:38 AM
Unknown Object (File)
Tue, Apr 16, 5:33 AM
Unknown Object (File)
Sun, Apr 14, 11:59 PM
Unknown Object (File)
Fri, Apr 12, 8:49 PM
Unknown Object (File)
Fri, Apr 12, 5:04 PM
Subscribers

Details

Reviewers
bartek
varun
tomek
Summary

Implement the GetKeyserverKeys endpoint. Used by clients to establish
X3DH connections with a keyserver.

https://linear.app/comm/issue/ENG-4445

Depends on D8721

Test Plan
  • Setup localstack and run terraform
# Run identity service
cd services/identity
RUST_LOG=debug cargo run -- server

# Separate terminal
cd services/commtest
cargo test --test identity_keyserver_tests

Diff Detail

Repository
rCOMM Comm
Lint
No Lint Coverage
Unit
No Test Coverage

Event Timeline

bartek requested changes to this revision.Aug 4 2023, 12:22 AM
bartek added inline comments.
services/identity/src/database.rs
292–296 ↗(On Diff #29516)
374 ↗(On Diff #29516)

This way you're doing unnecessary reads - you're reading all device's onetime keys just to try deleting one. That's inefficient.

My idea is to do the following (pseudocode written by hand)

loop {
  let otk = self.client.query()
    .key(partition_key_for_device_id_and_account_type)
    .limit(1) // try getting one key
    .send().await?
    .items[0]; // get first item, this is a pseudocode

  let Some(otk) = otk else { /* No one-time keys left */ };

  let result = self.client.delete_item()
    .key(otk_composite_primary_key)
    .condition_expression('attribute_exists(sort_key)') // check again for races
    .send().await;
  
  let Err(failure) = result else {
    return Ok(otk); // Yay! we popped a key
  };
  
  // here you can optionally match error type
  // e.g. DynamoDBError::ConditionalCheckFailedException means that race happened
}
// if we went here, either 
// - no keys left 
// - each time we had a failure, so might be good to distinguish and log error types ;)
return Ok(None);

This way, in optimistic case you do only 1 read and 1 delete instead of N reads + 1 delete

What do you think?

375–381 ↗(On Diff #29516)

This way you won't have to unwrap below

This revision now requires changes to proceed.Aug 4 2023, 12:22 AM
jon added inline comments.
services/identity/src/database.rs
265 ↗(On Diff #29516)
292–296 ↗(On Diff #29516)

TIL you could have an else block when doing assignment.

317–340 ↗(On Diff #29516)

Should be moved into a TryFrom implementation for a struct.

Can be done later though.

374 ↗(On Diff #29516)

Makes sense. Used to conventional RDMS like Prostgresql where searching by partition key is cheap, and returning 1-10 items isn't a significant difference.

375–381 ↗(On Diff #29516)

really liking the else clause... which i knew about it earlier.

Looks like it was added after I read the rust book: https://rust-lang.github.io/rfcs/3137-let-else.html

jon marked 3 inline comments as done.

Address feedback

services/identity/src/database.rs
377–388 ↗(On Diff #29797)

Up to you ;)

P.S. Not sure if the unwrap_or(None) is still needed in my suggestion

418 ↗(On Diff #29797)

Just curious - is anything after this place considered dead code by the compiler?

It's likely smarter than me but let's consider the following case:

  1. The self.get_onetime_key() always succeeds - there still are keys
  2. For some reason, the delete_item() always fails.

Isn't it an infinite loop?

It's a very unlikely scenario though. One possible solution is to introduce a counter variable for last fetched key. If the same key is fetched three or more times (two is still very okay due to data races, the condition expression saves us here) we can break the loop with failure.
But this is rather a follow-up

jon marked 2 inline comments as done.

Revert back to using initial "read all" logic

services/identity/src/database.rs
377–388 ↗(On Diff #29797)

going to avoid let-else until we update cargo in nix environment

418 ↗(On Diff #29797)

I'm going to remove this loop, and put back the "read all, then try to delete one key" logic I had before. In a later diff I need the item count to see if a device's keys are depleted, and I can only do that if I have the count.

Since doing select(COUNT) is usually as expensive, I think it makes sense to just return all items anyway.

Also avoid cases where we retrieve and attempt to delete the same key indefinitely.

292–296 ↗(On Diff #29516)

until our git commit lint stage allows for this to occur with out error, I'm going to use the match unfortunately

tomek requested changes to this revision.Aug 10 2023, 2:21 AM
tomek added inline comments.
services/identity/src/database.rs
418 ↗(On Diff #29797)

If the only reason behind reading all the items is to get their count, then why don't we maintain it as a separate value and update using transactions https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/transaction-apis.html?

This revision now requires changes to proceed.Aug 10 2023, 2:21 AM
jon marked an inline comment as done.
jon added inline comments.
services/identity/src/database.rs
418 ↗(On Diff #29797)

Do you mind if I create a follow up task for this? A naive but easy to reason about solution seems fine for now. We can re-address this in the future if it does become an issue.

Created https://linear.app/comm/issue/ENG-4650

services/identity/src/database.rs
435–443

If we decide to go this way and defer this, can we at least add these lines for easier monitoring?

tomek added inline comments.
services/identity/src/database.rs
418 ↗(On Diff #29797)

Ok, we can handle this as a follow-up, but I think this should be done before releasing the service to prod.

In D8722 @tomek wrote:

Ok, we can handle this as a follow-up, but I think this should be done before releasing the service to prod.

Agreed

This revision is now accepted and ready to land.Aug 16 2023, 5:05 AM
jon marked 3 inline comments as done.

When trying to patch this in, it got added to the previous diff. Will add @bartek suggestion to later diff

services/identity/src/database.rs
435–443

Seems reasonable to me