Page MenuHomePhabricator

[draft][services][identity] Initial .proto file for identity service
ClosedPublic

Authored by varun on Mar 30 2022, 12:50 PM.
Tags
None
Referenced Files
Unknown Object (File)
Fri, Dec 20, 10:54 PM
Unknown Object (File)
Fri, Dec 20, 8:05 PM
Unknown Object (File)
Sun, Dec 15, 8:49 AM
Unknown Object (File)
Sun, Dec 15, 8:49 AM
Unknown Object (File)
Sun, Dec 15, 8:49 AM
Unknown Object (File)
Sun, Dec 15, 8:49 AM
Unknown Object (File)
Sun, Dec 15, 8:48 AM
Unknown Object (File)
Sun, Dec 15, 8:48 AM

Details

Summary

Problem statement: We need a way to authenticate users and let services verify a user's identity. Rather than have each service manage identities internally, we decided to create an Identity Service that users and other services talk to.

D2950 introduced authentication directly in the Backup Service. We had this notion of full auth and simple auth in that diff, where full auth meant signing in with a password or with an eth wallet, and simple auth meant using the backup ID (secret) to prove a user’s identity to the Backup Service. In this new design, the Identity Service replaces full auth with three APIs to register, login, and look up a user. The login API returns a random, unique token to the user. After 10 minutes of user inactivity, this token will expire. The user then presents this token to other services, which can query the Identity Service to validate it. Once validated, other services can use a service-specific shared secret to perform a “simple auth” going forward.

Functional requirements: This service needs to be able to register, login, and look up users.

Non-functional requirements: The Identity Service needs to be highly available, accessible only by authorized devices and services, and highly secure. If the Identity Service is down, users will not be able to create backups or recover their backup key.

Here is a sequence diagram illustrating a basic use case: https://www.figma.com/file/GF3xZAMmjpBMXrOIERmzQO/Identity-Service?node-id=13%3A14

Failure cases:

  • User is not registered or not logged in, tries to create a new backup, Backup Service cannot find a valid token when it queries the Identity Service so it rejects the request

Corner cases:

Open questions:

  • How should we ensure high availability?
  • Which auth mechanism(s) should we use at the gRPC level?
  • Does this design let us introduce new APIs like "RevokeToken" in the future?
  • How should token be used? Should user just MAC a message with it when communicating with other services (e.g. backup)?

Depends on D3579

Test Plan

just a .proto file, so no testing required

Diff Detail

Repository
rCOMM Comm
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
This revision now requires changes to proceed.Apr 18 2022, 9:47 PM
varun requested review of this revision.Apr 19 2022, 6:17 AM
varun added inline comments.
services/identity/proto/identity.proto
10–11 ↗(On Diff #11195)

My thinking was it's harder for a bad actor to intercept the access token if it's only shared in the VPC with other services. I realize now that it doesn't really make a difference since everything is TLS-encrypted anyway. The latest revision changes the design so that only a user can get their token from the Identity service.

ashoat requested changes to this revision.Apr 19 2022, 12:03 PM

Back to you for a couple questions

services/identity/proto/identity.proto
12 ↗(On Diff #11479)

It seems like registration should also return a token, no?

110–119 ↗(On Diff #11479)

Seems like log-in supports wallet auth but registration doesn't? Is this on purpose?

This revision now requires changes to proceed.Apr 19 2022, 12:03 PM
tomek requested changes to this revision.Apr 20 2022, 6:28 AM

I really like the idea of having questions and answers in the diff - I think it's really effective

services/identity/proto/identity.proto
14–15 ↗(On Diff #11479)

Does that mean, that if this token is stolen, someone can act as long as the token is valid? Should we have a way to invalidate the token? Are there any ways of reducing risk here?

40 ↗(On Diff #11479)

We need to think about having high availability during normal usage of the service and during updates. We can consider e.g. blue-green deployment, so that the downtime is as short as possible. Also, some kind of replication might be necessary so that when instance / db / etc is down, we can still serve the users.

varun added inline comments.
services/identity/proto/identity.proto
12 ↗(On Diff #11479)

No. The registration step is only for PAKE users. The login step (which is for both PAKE and wallet auth) will return a per-device token.

14–15 ↗(On Diff #11479)

Yes, if someone was able to steal a user's token and get their user ID, they could impersonate that user. We briefly considered adding a revoke API, but felt it could be added later... @ashoat what do you think?

40 ↗(On Diff #11479)

Container orchestrators like K8s handle blue-green deployment and replication very well. Have we considered using a CO for our services? CC: @ashoat @karol-bisztyga @jimpo @geekbrother

110–119 ↗(On Diff #11479)

Yeah, it's on purpose. Registration is an extra step that only PAKE users have to do. On any device, a user can register with a per-user password. Then, when they log in with that password, they'll receive a per-device token.

ashoat requested changes to this revision.Apr 20 2022, 8:08 PM

Requesting changes for one inline question

services/identity/proto/identity.proto
12 ↗(On Diff #11479)

Do we really need to structure it that way though? Why are we forcing PAKE users to first register and then log in?

14–15 ↗(On Diff #11479)

Yes, if somebody steals the token they have access. It's like a cookie. I think we can add a revoke API later

40 ↗(On Diff #11479)

This is probably out of scope for this architecture review. I understand K8s is pretty complicated and would love to understand more how it might fit into our infra before getting us involved. But open to it

110–119 ↗(On Diff #11479)

This seems reasonable, but to be honest I haven't thought enough about it to be confident in this design

This revision now requires changes to proceed.Apr 20 2022, 8:08 PM
varun requested review of this revision.Apr 21 2022, 7:50 AM
varun added inline comments.
services/identity/proto/identity.proto
12 ↗(On Diff #11479)

What's the alternative? If we wrap PAKE registration and login in a single API, then users will have to keep track of a different password for each device. That seems like a worse experience.

services/identity/proto/identity.proto
45–46 ↗(On Diff #11479)

I think the answer is yes

55 ↗(On Diff #11479)

Adding another question:
What if a user authenticates with PAKE and then later wants to authenticate with their wallet or vice versa? I think the Identity service should invalidate the existing token and create a new entry in the tokens table for the new token

149–158 ↗(On Diff #11479)

I think we actually need two tables, one for tokens and then one just for pake registration data. Also, in the tokens table I don't think we need to store users' wallet addresses... @ashoat please correct me if I'm missing something.

ashoat requested changes to this revision.Apr 22 2022, 8:21 AM

Continuing to churn here. @varun – if something is still confusing, it might be best to try and ask me about it directly rather than re-requesting review again.

services/identity/proto/identity.proto
12 ↗(On Diff #11479)

Why can't we just have them be logged in after registering? Why can't the registration return a valid token?

What's the alternative?

It should be clear from my line of questioning what I'm asking... see my very first question:

It seems like registration should also return a token, no?

It would be great if you could directly engage on this question and explain why registration should not return a token, rather than just telling me that is the case and asking me for alternatives. This is a design discussion – you need to give me a justification for your proposed design.

This revision now requires changes to proceed.Apr 22 2022, 8:21 AM
services/identity/proto/identity.proto
149–158 ↗(On Diff #11479)

Good call! For the record, pakePasswordCiphertext probably isn't all ciphertext (eg. I expect a plaintext nonce is in there), so maybe it should be named pakeData or pakeRegistrationData

varun added inline comments.
services/identity/proto/identity.proto
12 ↗(On Diff #11479)

I understand your original q now. No reason not to auto-login users after registration. Will update the RPC

varun marked 6 inline comments as done.

Address feedback

ashoat requested changes to this revision.Apr 24 2022, 4:59 PM

Another round, sorry!

services/identity/proto/identity.proto
78 ↗(On Diff #11809)

Could be helpful to clarify PAKE-related concept

83 ↗(On Diff #11809)

Can you remind me what the point of the deviceID is? I feel pretty confident that we discussed it, but in the discussion history on this diff we only talk about how each device should have its own token (which could be trivially achieved by just having each of them log in separately)

Tunnelbroker certainly needs to understand a concept like "device ID", so it might be good to have a way for Tunnelbroker to confirm a given token corresponds to a given device ID. But in that case VerifyUserTokenRequest should take deviceID

90 ↗(On Diff #11809)

Could be helpful to clarify PAKE-related concept. Is this the same as in RegistrationRequest?

102 ↗(On Diff #11809)

Could be helpful to clarify PAKE-related concept

103 ↗(On Diff #11809)

Could be helpful to clarify PAKE-related concept

123 ↗(On Diff #11809)

Could be helpful to clarify PAKE-related concept. Is this is the same as in PakeLoginRequest?

This revision now requires changes to proceed.Apr 24 2022, 4:59 PM
varun marked 5 inline comments as done.

Address latest round of feedback

I added some brief descriptions for the PAKE concepts but I think what's most helpful is the step numbers

services/identity/proto/identity.proto
83 ↗(On Diff #11809)

The deviceID helps us ensure that we only ever have one active token per device in our database. userID (the partition key) + deviceID (sort key) together form the table's primary key. It also provides some extra confirmation that a user is who they say they are.

Makes sense for VerifyUserTokenRequest to take deviceID if Tunnelbroker will use it for stronger verification. I'm thinking it should be optional, though (i.e., if a client provides it, we check it, but we'll still send back a tokenValid response even if it's omitted).

I just realized that RegisterUser will also need deviceID since we're auto-logging users in. Addressed in latest revision.

90 ↗(On Diff #11809)

It is the same as in RegistrationRequest. It's included in both places since we're going to automatically log users in.

123 ↗(On Diff #11809)

That's correct, I've added a comment inline

ashoat added inline comments.
services/identity/proto/identity.proto
134 ↗(On Diff #11902)

Nit: all lines in the codebase should be no longer than 80 chars. Can you check this whole file and make sure we maintain this rule?

169 ↗(On Diff #11902)

Are all fields optional, or is there some specific annotation we need to make to declare it optional?

services/identity/proto/identity.proto
134 ↗(On Diff #11902)

yeah i'll make sure this file maintains that rule before landing

169 ↗(On Diff #11902)

all fields are optional in proto3 syntax

tomek added inline comments.
services/identity/proto/identity.proto
178–192 ↗(On Diff #11902)

It might be a good idea for @karol-bisztyga to take a look. I remember that he mentioned that denormalizing the data in dynamo is a good practice, so we can consider merging these two tables

This revision is now accepted and ready to land.Apr 27 2022, 5:31 AM
karol added inline comments.
services/identity/proto/identity.proto
178–192 ↗(On Diff #11902)

I think these 2 tables should be definitely merged. In dynamo, a rule of thumb says that most apps need a single table and you have to have a really strong reason to make it to a multiple-tables scenario.

Not sure what is a relation between these tables (thinking in a SQL way), looks like 1:1. If this is 1:1, then we should for sure merge.
If the relation's 1:n and there would be really dozens of duplications, then we can consider splitting this into two tables.

This revision now requires changes to proceed.Apr 27 2022, 5:53 AM
varun requested review of this revision.Apr 28 2022, 6:02 AM
varun added inline comments.
services/identity/proto/identity.proto
178–192 ↗(On Diff #11902)

If we denormalized, the table would look something like this:

{
	"userID [PK]": "1",
	"pakeRegistrationData": "foo",
	"devices": [{
		"deviceID": "1",
		"token": "bar",
		"created": "2012 - 04 - 23 T18: 25: 43.511 Z",
		"authType": "password",
		"valid": true
	}]
}

My worry is that we'll exceed the 400kb size limit for items since the number of tokens per user is unbounded. That's why I opted for a normalized approach. Curious to hear your thoughts.

185 ↗(On Diff #11902)

This line should be removed

services/identity/proto/identity.proto
178–192 ↗(On Diff #11902)

@varun is this relationship 1:n? (I think it is, but would be good to clarify as that is something @karol-bisztyga asked about)

services/identity/proto/identity.proto
178–192 ↗(On Diff #11902)

It is 1:n, because a single user can have at most 1 PAKE registration but any number of tokens

services/identity/proto/identity.proto
178–192 ↗(On Diff #11902)

There would be a lot of duplication if we had the same pakeRegistrationData bytes in every item of the token table

Accepting again since it got put on my queue again because of the re-request review. Make sure you update line length before landing!

karol added inline comments.
services/identity/proto/identity.proto
178–192 ↗(On Diff #11902)

Got it, sounds like we should go with 2 tables then. Thx for explaining.

This revision is now accepted and ready to land.Apr 29 2022, 7:49 AM