change how we store and retrieve OTKs from DynamoDB
Details
- Reviewers: bartek
- Commits: rCOMMcf700b99ee8d: [identity] otk db changes
- Test Plan: commtest passes, manual testing
Diff Detail
- Repository: rCOMM Comm
- Lint: No Lint Coverage
- Unit: No Test Coverage
Event Timeline
services/identity/src/constants.rs | ||
---|---|---|
122–125 ↗ | (On Diff #38965) | To be fully correct, these should also be added to the DeviceRow struct in device_list.rs. |
services/identity/src/database.rs | ||
---|---|---|
465 ↗ | (On Diff #38965) | Maybe we should add a retry counter to these loops? See mod batch_operations in comm_lib/database.rs. Ideally, we could extract something like transact_write_helper, similar to the batch write helper, but that's a follow-up - not sure if it's worth doing now. |
507–511 ↗ | (On Diff #38965) | The AWS guide here https://aws.amazon.com/blogs/database/implement-resource-counters-with-amazon-dynamodb/ describes this pattern (Approach 1 - Atomic counters); see the sketch after this table. |
614–616 ↗ | (On Diff #38965) | I guess we have two separate transactions because we want to track counts of content and notif keys separately. An alternative approach would be having a single (and less transient-error-prone) DDB transaction, at the cost of code complexity. |
690–697 ↗ | (On Diff #38965) | Same as above - we have helpers for that |
711–719 ↗ | (On Diff #38965) | The builder pattern helps avoid some memory operations and simplifies the code ;) |
726–733 ↗ | (On Diff #38965) |

```rust
fn is_transaction_canceled(err: &DynamoDBError, code: &str) -> bool {
  match err {
    DynamoDBError::TransactionCanceledException(
      TransactionCanceledException {
        cancellation_reasons: Some(reasons),
        ..
      },
    ) => reasons.iter().any(|reason| reason.code() == Some(code)),
    _ => false,
  }
}
```
759–760 ↗ | (On Diff #38965) | It's good to add .projection_expression(devices_table::ATTR_CONTENT_OTK_COUNT) to avoid unnecessarily getting other fields. Otherwise, you can just use db_client.get_device_data() (it's in device_list.rs) |
765–772 ↗ | (On Diff #38965) | We have a dedicated helper struct that does this (device_list.rs) |
781–809 ↗ | (On Diff #38965) | Simplify |
783–786 ↗ | (On Diff #38965) | Unfortunately this is the correct way now, because there's a problem with TryFromAttribute for numbers - https://linear.app/comm/issue/ENG-4658/improve-database-type-conversions-for-numbers |
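For reference, a minimal sketch of the atomic-counter pattern (Approach 1 from the AWS post linked above), assuming the aws-sdk-dynamodb crate; the table name, key, and attribute names are illustrative placeholders, not the actual identity-service schema:

```rust
use aws_sdk_dynamodb::{types::AttributeValue, Client};

// Atomically decrements a per-device one-time-key counter via an ADD
// update expression. The condition expression makes DynamoDB reject the
// update if the counter would drop below zero; the same condition on a
// transactional Update item cancels the whole transaction.
async fn decrement_otk_count(
  client: &Client,
  device_id: &str,
) -> Result<(), aws_sdk_dynamodb::Error> {
  client
    .update_item()
    .table_name("devices") // illustrative table name
    .key("deviceID", AttributeValue::S(device_id.to_string()))
    .update_expression("ADD #count :minus_one")
    .condition_expression("#count >= :one")
    .expression_attribute_names("#count", "contentOTKCount")
    .expression_attribute_values(":minus_one", AttributeValue::N("-1".into()))
    .expression_attribute_values(":one", AttributeValue::N("1".into()))
    .send()
    .await?;
  Ok(())
}
```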
services/identity/src/ddb_utils.rs | ||
---|---|---|
25–30 ↗ | (On Diff #38965) | nit |
services/identity/src/database.rs | ||
---|---|---|
588–594 ↗ | (On Diff #38965) | One thing that came to my mind - we should not allow uploading more than 98 keys in total. If ever uploading more at once, the UploadOneTimeKeys RPC should return the keys that were successfully uploaded (or inverse - failed). And this would be too much complexity. This way, the caller will know which keys were successfully uploaded. Also, this limit will let us avoid splitting into chunks server-side. |
services/identity/src/database.rs | ||
---|---|---|
588–594 ↗ | (On Diff #38965) | olm only remembers the last 100 one-time keys so clients will never upload more than 100 |
614–616 ↗ | (On Diff #38965) | From the aws docs: grouping operations together into a single transaction should be avoided unless it's actually necessary. Following this principle, I think we should avoid combining content and notif updates in a single transaction |
services/identity/src/constants.rs | ||
---|---|---|
122–125 ↗ | (On Diff #38965) | good point. going to just add a comment for now, since i can't think of a good reason to add these fields to the struct |
services/identity/src/database.rs | ||
---|---|---|
465 ↗ | (On Diff #38965) | i'll add a basic retry counter that returns an error after the 8th try (see the sketch after this table). i'll also leave a todo comment to introduce and use a transact write helper |
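A minimal sketch of what that bounded retry loop might look like, assuming the aws-sdk-dynamodb and tokio crates (module paths vary across SDK versions); the function name, linear backoff, and retry-everything error handling are illustrative simplifications, not the actual implementation:

```rust
use std::time::Duration;

use aws_sdk_dynamodb::{
  error::SdkError,
  operation::transact_write_items::TransactWriteItemsError,
  types::TransactWriteItem,
  Client,
};

// Give up and surface the error after the 8th attempt.
const MAX_TRANSACTION_RETRIES: u32 = 8;

// Runs a DynamoDB transaction, retrying with a simple linear backoff
// until it succeeds or the retry counter is exhausted.
async fn transact_write_with_retries(
  client: &Client,
  items: Vec<TransactWriteItem>,
) -> Result<(), SdkError<TransactWriteItemsError>> {
  let mut attempt: u32 = 0;
  loop {
    attempt += 1;
    match client
      .transact_write_items()
      .set_transact_items(Some(items.clone()))
      .send()
      .await
    {
      Ok(_) => return Ok(()),
      // A real implementation should retry only transient failures, e.g.
      // a TransactionCanceledException with a TransactionConflict reason.
      Err(_transient) if attempt < MAX_TRANSACTION_RETRIES => {
        tokio::time::sleep(Duration::from_millis(100 * attempt as u64)).await;
      }
      Err(err) => return Err(err),
    }
  }
}
```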
services/identity/src/database.rs | ||
---|---|---|
507–511 ↗ | (On Diff #38965) | i think it should fail the whole transaction |
588–594 ↗ | (On Diff #38965) | I think we can assume that the list of OTKs that the client provides is sorted (i think oldest to newest, but need to confirm this). so if the client provides 150 content otks, we can just throw away the first 50 in the list since the olm account will have thrown them away already |
588–594 ↗ | (On Diff #38965) | Did you mean wouldn't be too much complexity? |
588–594 ↗ | (On Diff #38965) | *should never, i suppose it's possible though |
588–594 ↗ | (On Diff #38965) | Ah I understand your concern now. Let's just immediately fail any request that comes in (register, upload_one_time_keys) that has more than 98 keys total so that we can do everything in a single transaction and avoid the scenario where, say, 99 content keys are uploaded but 0 notif keys are |
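A minimal sketch of that up-front rejection, assuming a tonic-based gRPC service; the names here are illustrative, and the 98-key cap presumably comes from DynamoDB's 100-item transaction limit minus the two counter updates:

```rust
// Illustrative: 98 keys + 2 count updates = DynamoDB's 100-item
// transaction ceiling (assumption inferred from the discussion above).
const MAX_ONE_TIME_KEYS: usize = 98;

// Rejects register / upload_one_time_keys requests before any DB work,
// so an upload either fully succeeds or fully fails in one transaction.
fn validate_otk_upload(
  content_keys: &[String],
  notif_keys: &[String],
) -> Result<(), tonic::Status> {
  if content_keys.len() + notif_keys.len() > MAX_ONE_TIME_KEYS {
    return Err(tonic::Status::invalid_argument(
      "too many one-time keys in a single request",
    ));
  }
  Ok(())
}
```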
services/identity/src/database.rs | ||
---|---|---|
614–616 ↗ | (On Diff #38965) | After thinking about this a little more, the alternative you propose seems like the right way to go (except we don't need chunks since we'll just fail the request immediately if the caller provides more than 49 content or notif otks) |
services/identity/src/database.rs | ||
---|---|---|
711–719 ↗ | (On Diff #38965) | transact_items() expects a single TransactWriteItem object, so we have to use set_transact_items() since we potentially have 100 items here (see the sketch below) |
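To illustrate the distinction between the two builder methods (a sketch; `one_item` and `many_items` are placeholders):

```rust
use aws_sdk_dynamodb::{types::TransactWriteItem, Client};

fn build_requests(
  client: &Client,
  one_item: TransactWriteItem,
  many_items: Vec<TransactWriteItem>,
) {
  // transact_items() appends exactly one TransactWriteItem per call:
  let _single = client.transact_write_items().transact_items(one_item);

  // set_transact_items() installs the whole Vec at once, which is what a
  // dynamically sized batch of up to 100 items needs:
  let _batch = client
    .transact_write_items()
    .set_transact_items(Some(many_items));
}
```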
services/identity/src/database.rs | ||
---|---|---|
661–662 | | Should work |
588–594 ↗ | (On Diff #38965) | Yeah this is what I meant |
614–616 ↗ | (On Diff #38965) | Yes, if we don't need chunks, this gets much simpler |
711–719 ↗ | (On Diff #38965) | Right, I missed it |
services/terraform/modules/shared/dynamodb.tf | ||
---|---|---|
241–242 | Good idea with informative column names |