
[services][backup] CreateBackup 1/3 - create handler module
ClosedPublic

Authored by bartek on Jan 9 2023, 3:54 AM.
Tags
None

Details

Summary
  • Created module structure for gRPC service handlers to keep it organized (sketched after this list):
    - mod service (existing gRPC service mod)
      - mod handlers (groups all handler submodules)
        - create_backup
        - add_attachments
        - ...
  • Scaffolded the CreateBackupHandler structure, which contains the whole endpoint logic. This is a 1:1 analogy to the Blob service's PutHandler.
  • Used this structure in the endpoint handler function.
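
A minimal sketch of what that module tree looks like as Rust mod declarations (two files shown in one block; handler names beyond create_backup and add_attachments are placeholders):

```rust
// services/backup/src/service/mod.rs
mod handlers; // groups all endpoint handler submodules

// services/backup/src/service/handlers/mod.rs
pub mod add_attachments;
pub mod create_backup;
// ...one submodule per endpoint
```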

The CreateBackupHandler consists of three phases (sketched after this list):

  1. Non-data mode - processing non-chunk inputs like device_id, user_id, etc.
  2. Data mode - processing backup data chunks
  3. Finish - postprocessing, saving to the db, etc.
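
A minimal sketch of how these phases could map onto handler state (names are illustrative, not the actual fields from the diff; this also matches the HandlerState enum the review below converges on):

```rust
/// Which phase of the CreateBackup stream the handler is in.
/// (Illustrative; the real type lives in create_backup.rs.)
enum HandlerState {
  /// Phase 1: receiving non-chunk inputs (device_id, user_id, ...).
  ReceivingParams,
  /// Phase 2: receiving backup data chunks.
  ReceivingData,
}

struct CreateBackupHandler {
  state: HandlerState,
  user_id: Option<String>,
  device_id: Option<String>,
  data_size: u64,
}

impl CreateBackupHandler {
  fn handle_user_id(&mut self, user_id: String) {
    self.user_id = Some(user_id);
  }

  /// The first data chunk flips the handler into data mode.
  fn handle_data_chunk(&mut self, chunk: &[u8]) {
    self.state = HandlerState::ReceivingData;
    self.data_size += chunk.len() as u64;
    // ...buffer or forward the chunk
  }

  /// Phase 3: postprocessing, saving to the db, etc.
  fn finish(self) -> Result<(), Box<dyn std::error::Error>> {
    // ...persist metadata, flush remaining buffers
    Ok(())
  }
}
```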

Depends on D6181

Test Plan

This does nothing yet; subsequent diffs will add logic to this code.

Diff Detail

Repository
rCOMM Comm
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

bartek held this revision as a draft.
bartek published this revision for review. Jan 9 2023, 5:35 AM
bartek added inline comments.
services/backup/src/service/mod.rs
53 ↗(On Diff #20668)

This imports UserId, DeviceId, KeyEntropy, etc.
I use the local import because this Data type has the same name for each endpoint but is a different type in each.

services/backup/src/service/mod.rs
61–82 ↗(On Diff #20704)

I think we should revisit data modeling for backup https://linear.app/comm/issue/ENG-1052.

Having to piece together each individual field is likely a smell; a lot of these fields are related, and I don't think it makes sense to treat them as individual messages. For example, NewCompactionHash and NewCompactionChunk both describe the same bit of information.

Not to mention, there's a non-zero amount of overhead for serializing and deserializing each message.

Handling large files seems to be a non-goal of gRPC.

Protocol Buffers are not designed to handle large messages. As a general rule of thumb, if you are dealing in messages larger than a megabyte each, it may be time to consider an alternate strategy.

We could look into using presigned urls and having the client upload the object directly.

100% agree we need to rethink those things, but also want to make sure we don't block the Rust refactor on it

The original design here is pretty questionable

As for transferring large files via gRPC, I would probably treat that as a separate task from the initial data model rethinking (which I think would be easier to address)

Probably a good argument here for doing this before we start actually using the service

Having to piece together each individual field is likely a smell [...]

Agreed, this was already discussed during the Blob service refactor.
A solution that would significantly reduce the complexity of this code and make the API simpler is to at least group the non-data inputs together, as I proposed for the Blob service here: https://linear.app/comm/issue/ENG-937#comment-1a039b4a

Handling large files seems to be a non-goal of gRPC.

This should have been raised long ago, during the Blob service design. Dealing with large data through gRPC is possible but cumbersome: e.g. in the PullBackup endpoint we have to mix business and transport logic by counting individual message field bytes, then buffering and shrinking the data length accordingly, and I hate it (example).
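
To make that concrete, a simplified sketch of this kind of transport bookkeeping (the constants are made up; the real PullBackup logic also has to account for the other fields in each message):

```rust
// Hypothetical chunking: business code has to slice its payload so that
// every gRPC message stays under the transport's size limit.
const GRPC_CHUNK_SIZE_LIMIT: usize = 4 * 1024 * 1024; // assumed 4 MiB limit
const PER_MESSAGE_OVERHEAD: usize = 1024; // rough allowance for non-data fields

fn split_into_messages(data: &[u8]) -> impl Iterator<Item = &[u8]> {
  let payload_limit = GRPC_CHUNK_SIZE_LIMIT - PER_MESSAGE_OVERHEAD;
  data.chunks(payload_limit)
}
```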

We could look into using presigned urls and having the client upload the object directly.

I'm open to and enthusiastic about this solution, but let's create a Linear task to discuss it further. I've already worked with presigned URLs; this is how Expo's EAS builds and submissions are stored.
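
For reference, a minimal sketch of that approach with the aws-sdk-s3 crate (bucket name and expiry are hypothetical; error handling elided):

```rust
use std::time::Duration;
use aws_sdk_s3::{presigning::PresigningConfig, Client};

/// Produce a URL the client can PUT the backup object to directly,
/// bypassing the gRPC service for the bulk data transfer.
async fn presigned_upload_url(
  client: &Client,
  key: &str,
) -> Result<String, Box<dyn std::error::Error>> {
  let request = client
    .put_object()
    .bucket("backup-service-uploads") // hypothetical bucket name
    .key(key)
    .presigned(PresigningConfig::expires_in(Duration::from_secs(15 * 60))?)
    .await?;
  Ok(request.uri().to_string())
}
```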

100% agree we need to rethink those things, but also want to make sure we don't block the Rust refactor on it

Right, changing the API design isn't a part of this task. One advantage is that the Rust service should be much easier to adapt to API changes than the old C++ one.

tomek added inline comments.
services/backup/src/service/handlers/create_backup.rs
22 ↗(On Diff #20704)

Usually it's more maintainable to use an enum for state instead of a boolean; it makes a difference once we decide to add a second flag.
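
A sketch of the difference (names illustrative):

```rust
// A second bool would create 2x2 implicit states, some of them invalid;
// an enum names exactly the valid ones and keeps match exhaustive.
enum UploadState {
  AwaitingParams,
  Streaming,
  Finished, // a later addition costs one variant, not a second flag
}

fn describe(state: &UploadState) -> &'static str {
  match state {
    UploadState::AwaitingParams => "waiting for metadata",
    UploadState::Streaming => "receiving chunks",
    UploadState::Finished => "done",
  }
}
```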

services/backup/src/service/mod.rs
21–22 ↗(On Diff #20704)

Maybe I'm missing something, but does pub(self) make a difference?
https://github.com/rust-lang/rfcs/blob/master/text/1422-pub-restricted.md#semantics

As noted above, the definition means that pub(self) item is the same as if one had written just item.
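
A tiny illustration of that equivalence:

```rust
mod service {
  // These two declarations have identical visibility: both are private
  // to `service` (and visible to its child modules).
  pub(self) fn handler_a() {}
  fn handler_b() {}
}
```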

55 ↗(On Diff #20704)

Is it a good idea to log the whole request? Can that result in logging binary data?

services/backup/src/service/handlers/create_backup.rs
22 ↗(On Diff #20704)

I have no strong opinion on this; I can refactor to an enum.

services/backup/src/service/mod.rs
21–22 ↗(On Diff #20704)

No difference, I just wanted to be explicit

services/backup/src/service/mod.rs
55 ↗(On Diff #20704)

For stream requests, it won't print any actual inputs, only message: Streaming:

CreateNewBackup request: Request { metadata: MetadataMap { headers: {"te": "trailers", "content-type": "application/grpc", "user-agent": "tonic/0.8.3"} }, message: Streaming, extensions: Extensions }

In fact, I'm wondering if it's worth printing this object at all; there isn't much useful data in it. The main reason I added this log is to signal that a new request has just started processing.
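
If we only want a "request started" marker, a lighter option (assuming the tracing crate is available) would be something like:

```rust
use tonic::Request;

/// Log that a stream opened, plus cheap connection metadata,
/// instead of Debug-printing the entire request object.
fn log_request_started<T>(request: &Request<T>) {
  tracing::info!(
    remote_addr = ?request.remote_addr(),
    "CreateNewBackup stream started"
  );
}
```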

This revision is now accepted and ready to land. Jan 11 2023, 9:20 PM
  • Refactored is_data_mode boolean to HandlerState enum
  • Removed user_id field from logs