Details

Reviewers

varun
bartek
marcin

Commits

rCOMM6dc2a953186c: [Identity] Fix opaque login

Summary

The DDB query operation coupled with KeyConditionExpression
will project only the PartitionKey and the conditional attribute that
you filtered by. So the resulting query would only return username and
user_id for a username query.

Re-querying by partition_key allows for the entirity of the object to be
returned.

Depends on D8400

Test Plan

Diff Detail

Repository

rCOMM Comm

Lint

No Lint Coverage

Unit

No Test Coverage

Event Timeline

• jon created this revision.Jun 30 2023, 3:32 PM

Herald added subscribers: tomek, ashoat. · View Herald TranscriptJun 30 2023, 3:32 PM

• jon added a child revision: D8403: [Keyserver/rust] Enable logging of rust addon.Jun 30 2023, 3:37 PM

• jon added a child revision: D8404: [Keyserver/rust] Remove legacy auth_token configuration.

Harbormaster completed remote builds in B20662: Diff 28318.Jun 30 2023, 4:07 PM

• jon requested review of this revision.Jun 30 2023, 4:07 PM

The DDB query operation coupled with KeyConditionExpression
will project only the PartitionKey and the conditional attribute that
you filtered by.

I think it's more about the index projection than the KeyConditionExpression. It is now defined as projection_type = "KEYS_ONLY" in terraform.

services/identity/src/database.rs
541 ↗	(On Diff #28318)	IMO awaiting inside calls/parentheses looks a bit messy. We don't do it in JS either

This revision is now accepted and ready to land.Jun 30 2023, 11:08 PM

• jon added a parent revision: D8401: [Identity] Allow for setting logging levels through env.Jul 6 2023, 9:17 AM

• jon removed a parent revision: D8400: [Keyserver] Apply DB Migrations.

• jon removed a child revision: D8404: [Keyserver/rust] Remove legacy auth_token configuration.

Address feedback

• jon marked an inline comment as done.Jul 6 2023, 10:25 AM

Harbormaster completed remote builds in B20770: Diff 28449.Jul 6 2023, 10:43 AM

should we just change the index projection to INCLUDE or ALL?

should we just change the index projection to INCLUDE or ALL?

I'm not sure. DDB is outside my expertise. CC @bartek

• jon added 1 blocking reviewer(s): bartek.Jul 6 2023, 11:33 AM

This revision now requires review to proceed.Jul 6 2023, 11:33 AM

It depends on how often we query by index (UserInfo a.k.a username) vs normal table primary key (userID) vs how many columns (attributes) the column has in total.

Projecting ALL will technically duplicate the whole table in an index (2x storage).
Querying a keys-only index and then re-query the table (done in this diff) consumes another RCU (read capacity unit) because we do 2 read requests instead of one.

In our case:

If we query the users table often and/or it has only few columns, changing projection to ALL makes sense, because storage costs will be lower than query costs
If the query is comparatively rare (to querying by id), then we can leave it as is.
If query-by-username requires only some additional columns, not all, the INCLUDE is the way to go

If we added some monitoring (idk if aws has built-in tools precise enough), we can see in practice how it scales and change the approach later (note that it will need a migration: 1. create a new index, 2. switch code to use it, 3. delete the old one)

I suspect that query-by-username will be a common flow. Here are the situations in which it will happen:

When a user registers an account, we need to check if their selected username is already in use
When a user searches Comm, we need to search usernames
- Current interface: chat composer. This will stay
- Current interface: friend list and block list. This will stay
- Current interface: search in chat list. This will be replaced by the next one
- Future interface: global search

Additionally, it's important to call out that we will need "prefix search" for some of these use cases. Eg. we should be able to return "ashoat" if the user searches for "ash".

It may be the case that DDB is not a good option for solving this problem. I haven't thought about it deeply. The tasks for this are ENG-4131 and ENG-4132

In D8402#249035, @ashoat wrote:

I suspect that query-by-username will be a common flow. Here are the situations in which it will happen:

When a user registers an account, we need to check if their selected username is already in use

Okay, registration is common, but request rates still will be relatively low (we don't expect hundreds of registrations per minute)

When a user searches Comm, we need to search usernames

Current interface: chat composer. This will stay

Current interface: friend list and block list. This will stay

Current interface: search in chat list. This will be replaced by the next one

Future interface: global search

Additionally, it's important to call out that we will need "prefix search" for some of these use cases. Eg. we should be able to return "ashoat" if the user searches for "ash".

It may be the case that DDB is not a good option for solving this problem. I haven't thought about it deeply. The tasks for this are ENG-4131 and ENG-4132

Yeah, prefix search is DDB's weak part - we can do it only by "sort keys" which is not the case here. We would probably need a separate search engine for this purpose.

I think we can just stick with Jon's current approach here

For the searching usernames piece, I think we'll have to use AWS CloudSearch or Elasticsearch on top of DDB. https://docs.aws.amazon.com/cloudsearch/latest/developerguide/searching-dynamodb-data.html

In D8402#249107, @varun wrote:

I think we can just stick with Jon's current approach here

For the searching usernames piece, I think we'll have to use AWS CloudSearch or Elasticsearch on top of DDB. https://docs.aws.amazon.com/cloudsearch/latest/developerguide/searching-dynamodb-data.html

Yep, makes sense

This revision is now accepted and ready to land.Jul 7 2023, 11:42 AM

Continuing on :)

• jon removed a parent revision: D8400: [Keyserver] Apply DB Migrations.Jul 13 2023, 9:14 AM

varun added inline comments.Jul 13 2023, 11:16 AM