Page MenuHomePhabricator

[web] Add function (and an 'enum') for creating unique device ID on web
ClosedPublic

Authored by inka on Jul 21 2022, 2:20 AM.
Tags
None
Referenced Files
F3391255: D4587.id14745.diff
Sat, Nov 30, 2:59 AM
F3391131: D4587.id15304.diff
Sat, Nov 30, 2:28 AM
F3391038: D4587.id15312.diff
Sat, Nov 30, 2:04 AM
Unknown Object (File)
Tue, Nov 26, 11:10 PM
Unknown Object (File)
Sat, Nov 23, 8:12 AM
Unknown Object (File)
Sat, Nov 23, 7:50 AM
Unknown Object (File)
Sat, Nov 23, 7:38 AM
Unknown Object (File)
Sat, Nov 23, 7:38 AM

Details

Summary

Add a function and an auxiliary 'enum' (created via Object.freeze) for creating unique
device ID in javascript/flow.
Linear issue: https://linear.app/comm/issue/ENG-1279/initialize-deviceid-on-clients

Test Plan

Run provided tests.

Diff Detail

Repository
rCOMM Comm
Branch
inka/userID
Lint
No Lint Coverage
Unit
No Test Coverage

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes

Change UUID to ID in all names for clarity

I shouldn't be a first-pass reviewer except in cases listed here. It previously made sense to include me since you were updating copy, but now that the work is not copy-related I should be left off the review

Right, I'll get more familiar with the Notion notes to get things like this right in the future.

max requested changes to this revision.Jul 25 2022, 4:35 AM

Thanks for changing the UUID to ID @inka !

In D4587#131634, @max wrote:

I think we should change the function from getDeviceUUID to generateDeviceID because we are generating it here not getting it.

Following my previous comment, I'm pretty sure that the best name for such function is generateDeviceID and not get... because we are not getting it from the storage or from somewhere, we are generating it.
For me, the current function name is pretty confusing.

This revision now requires changes to proceed.Jul 25 2022, 4:35 AM
web/utils/device-uuid.js
12 ↗(On Diff #14774)

Actually since I've been using a hex encoding for now, I believe this was correct. 32 bytes = 8*32 bits = 4*64 bits = 64 hex digits. (I've also checked manually that it does produce 64 hex digits)
But again - I'm not sure about the encoding. If hex is ok, then I think we should change the regex in tunnelbroker to accept only hex format.

I've re-checked it, it returns a 64 characters string. Sorry about this, the hex is ok.

15 ↗(On Diff #14774)

The last thing is that we have a format like prefix:suffix, please check the comment in ENG-1279. And you need to check if the value from your function is valid in our regex which I've mentioned in the task comment.
It's actually not, because you have - delimiter instead of : so the regex always fails.

web/utils/device-uuid.js
15 ↗(On Diff #14774)

Right, I'll fix that. Thank you!

Fix the return value to match the specification

web/utils/device-id.js
12 ↗(On Diff #14832)

Thanks for the changes! Why we've updated the string to the upper case?

tomek requested changes to this revision.Jul 25 2022, 9:11 AM
tomek added inline comments.
web/utils/device-id.js
12 ↗(On Diff #14832)

The regex mentioned by @max https://github.com/CommE2E/comm/blob/f074a9c13b3ba5c33cbdb9a4c1bde54f0d7330ce/services/tunnelbroker/src/Constants.h#L48 contains 64 chars [a-zA-Z0-9] (so about 62 chars to choose from). It gives 62^64 (about 2^384) possible ids. The solution with 32 random bytes gives only 8^32 = 2^96 possible ids. It is still plenty, but we decrease the number significantly. Could you find a solution where we choose from the whole available space of valid ids?

This revision now requires changes to proceed.Jul 25 2022, 9:11 AM
web/utils/device-id.js
12 ↗(On Diff #14832)

I've changed to uppercase because on mobile we already had a function creating random hex strings but it produces uppercase hex. Regardless, I'll change that according to Tomek's comment, so it's no longer relevant.

Change how the ID is generated

tomek requested changes to this revision.Jul 26 2022, 2:53 AM
tomek added inline comments.
web/utils/device-id.js
15–21 ↗(On Diff #14914)

Splitting this string isn't necessary - you can use charAt function of string

16–19 ↗(On Diff #14914)

It is a good idea to give every constant a meaningful name - create a const. It is especially important when the same value is used in multiple places.

20 ↗(On Diff #14914)

The generated chars aren't uniformly distributed: the number of available symbols doesn't divide number of possible values of byte (62 doesn't divide 256). This causes an issue when using this logic, because some chars would appear more often than others (0 will be used when byte value is 0, 62, 124, 186, and 248, at the same time Z will be used when the value is 61, 123, 185, and 247, so we can expect 0 to be 25% more frequent than Z).

Generating secure, uniform string is complicated and it might be a good idea to search for a library that does that. But in our case the uniformity isn't required - the lack of it slightly changes the probability of some ids, but it shouldn't be a threat to our security. I'd probably keep the current implementation, but I'm also curious what @ashoat thinks about it.

This revision now requires changes to proceed.Jul 26 2022, 2:53 AM

Cleanup code - add a constant and remove redundant code

web/utils/device-id.js
20 ↗(On Diff #14914)

We have code that does it in the same way in native/cpp/CommonCpp/CryptoTools/Tools.cpp (https://github.com/CommE2E/comm/blob/master/native/cpp/CommonCpp/CryptoTools/Tools.cpp#:~:text=Tools%3A%3A-,generateRandomString,-(size_t%20size%2C)
So if we change it I suppose we should change it in this file as well

inka retitled this revision from Add function (and an 'enum') for creating unique device ID in js to [web] Add function (and an 'enum') for creating unique device ID in js.Jul 26 2022, 5:10 AM
tomek added inline comments.
web/utils/device-id.js
31 ↗(On Diff #14959)

It might be a good idea to return null here, or even throw - we should never expect this code to be executed. Probably returning null is better, because the code that calls this function can then decide what to do with the result - if throwing is a good solution.

Returning an empty string isn't ideal solution because the client should be warned that this function might fail. If we simply return empty string, it can get accepted and used without checking if it's empty - for optional string an explicit action needs to be performed.

Adding @ashoat because we need to decide if having non-uniform distribution of ids is good enough for us.

ashoat requested changes to this revision.Jul 26 2022, 11:18 AM

I think we should find a way to generate a uniform distribution of IDs here... hopefully it isn't too hard. I think this is a good idea:

Generating secure, uniform string is complicated and it might be a good idea to search for a library that does that.

If this ends up being too difficult, I'm open to landing this. But would be great to do a little bit of research first to see if we can find something.

This revision now requires changes to proceed.Jul 26 2022, 11:18 AM

Rewrite the function to make the random part of the ID
uniformly distributed

inka retitled this revision from [web] Add function (and an 'enum') for creating unique device ID in js to [web] Add function (and an 'enum') for creating unique device ID on web.Aug 3 2022, 7:20 AM
inka edited the summary of this revision. (Show Details)
inka edited the test plan for this revision. (Show Details)
inka removed a reviewer: ashoat. inka added 1 blocking reviewer(s): tomek.
tomek requested changes to this revision.Aug 3 2022, 8:59 AM
tomek added inline comments.
web/utils/device-id.js
27 ↗(On Diff #15264)

What do you think about making it consistent with generateRandomString by adding an invariant at the beginning of the function?

web/utils/device-id.test.js
10 ↗(On Diff #15264)

Maybe use deviceIDTypes.KEYSERVER instead?
Also, this line exceeds our limit (80 chars). It could be the case that prettier is ok with that because this is a test description, but it might be a good idea to check if we can reduce this length (by e.g. splitting this string into multiple lines).

13 ↗(On Diff #15264)
This revision now requires changes to proceed.Aug 3 2022, 8:59 AM
web/utils/device-id.js
27 ↗(On Diff #15264)

We should avoid throwing raw strings, please throw Errors

web/utils/device-id.js
27 ↗(On Diff #15264)

Since it is not possible anyway for type to not be one of deviceIDTypes.KEYSERVER, deviceIDTypes.WEB, deviceIDTypes.MOBILE (as it is of type DeviceIDType), and the throw was only there so that flow wouldn't show "Cannot expect string as the return type of function because string [1] is incompatible with implicitly-returned undefined." error, I'll just change the last else if to an else.

Make the code prettier and more readable

tomek requested changes to this revision.Aug 4 2022, 2:04 AM

This looks great and we're really close to finishing it!

web/utils/device-id.js
20–26 ↗(On Diff #15301)

We're using false so that an invariant would always be violated. We want that because, in every valid scenario, we should return before it.

27 ↗(On Diff #15264)

The issue with this approach is that introducing new DeviceIDType becomes a lot harder. This function won't warn a programmer that a new value should be handled in a different way and instead would happily return mobile... for it.
I think we should keep else if and add an invariant. We can put it at the bottom to avoid the duplication in logic.

This revision now requires changes to proceed.Aug 4 2022, 2:04 AM

Fix the casing logic in generateDeviceID

Add an empty line after @flow

Looks ok to me with a few minor comments.

web/utils/device-id.js
17 ↗(On Diff #15308)

Not sure we need to repeat DEVICEID_CHAR_LENGTH twice.

30 ↗(On Diff #15308)

I think the comment should look like this.

This revision is now accepted and ready to land.Aug 4 2022, 7:06 AM
web/utils/device-id.test.js
28 ↗(On Diff #15308)

It seems like we're repeating the same regex 3 times. It's a good point to create one and reuse it.

web/utils/device-id.js
30 ↗(On Diff #15308)

I think since the function accepts DeviceIDType, and as such accepts only values that actually are present in the deviceIDTypes, reaching this point means that the function doesn't handle all types declared in deviceIDTypes. So it actually makes more sense to say "Unhandled" - it's an information for the developer that the function has to be corrected, to handle all declared types.

web/utils/device-id.js
30 ↗(On Diff #15308)

I think since the function accepts DeviceIDType, and as such accepts only values that actually are present in the deviceIDTypes, reaching this point means that the function doesn't handle all types declared in deviceIDTypes. So it actually makes more sense to say "Unhandled" - it's an information for the developer that the function has to be corrected, to handle all declared types.

Ok, lets use the original one.