
[services][blob] Upload 2/2 - Handle data chunks
ClosedPublic

Authored by bartek on Nov 28 2022, 1:43 AM.
Tags
None
Subscribers

Details

Summary

Support for uploading blobs with the Put RPC call. This is the most complicated part of the service, as it is stateful.

Depends on D5703. This is part 2 of 2.

I implemented it slightly differently from the C++ counterpart: I created a separate PutHandler struct to organize the code. A few notes on how it works (a rough sketch follows the lists below):

  • (parent diff) In each input stream message, the client can send either a holder, a blob hash, or a data chunk.
  • (parent diff) The Put handler starts in the PutAction::None state and waits for both the holder and the hash to be provided (in any order, but each only once).
  • (parent diff) Once both are present, the next state is determined:
    • If the blob hash already exists, the AssignHolder action is scheduled and the input stream is closed by the server.
    • Otherwise, the UploadNewBlob action is initialized and the handler waits for data chunks.

Here is where this diff starts.

  • Messages containing data chunks can only be sent while the handler is in the UploadNewBlob state. Chunks are uploaded to S3 once they are big enough; the multipart upload session is initialized when the first chunk is sent.
  • After the input stream is closed, the finish() method is invoked, which consumes the handler and performs an action depending on the enum value:
    • If there is no action, it just returns.
    • For AssignHolder, it just adds a new row to the reverse_index table.
    • For UploadNewBlob, it finishes the S3 upload and adds rows to both DB tables.
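
Here is a rough sketch of that state machine. PutHandler, PutAction, and the variant names come from the summary above; the fields, method signatures, and stubbed bodies are assumptions for illustration, not the diff's actual code.

```
// Minimal sketch of the state machine described above. Only the type and
// variant names come from the summary; everything else is illustrative.
enum PutAction {
    None,
    AssignHolder,
    UploadNewBlob,
}

struct PutHandler {
    action: PutAction,
    holder: Option<String>,
    blob_hash: Option<String>,
    current_chunk: Vec<u8>,
}

impl PutHandler {
    // Called once both the holder and the blob hash have arrived (in any order).
    fn determine_action(&mut self, blob_already_exists: bool) {
        self.action = if blob_already_exists {
            // Existing blob: only a reverse_index row is needed; the server
            // closes the input stream at this point.
            PutAction::AssignHolder
        } else {
            // New blob: subsequent messages are expected to carry data chunks,
            // and the S3 multipart upload starts with the first chunk.
            PutAction::UploadNewBlob
        };
    }

    // Consumes the handler after the input stream is closed.
    async fn finish(self) -> anyhow::Result<()> {
        match self.action {
            PutAction::None => Ok(()),
            PutAction::AssignHolder => {
                // add a row to the reverse_index table
                Ok(())
            }
            PutAction::UploadNewBlob => {
                // complete the S3 multipart upload, then write rows to both DB tables
                Ok(())
            }
        }
    }
}
```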
Test Plan

Now that all gRPC handlers are implemented, it is possible to run integration tests against this service.
In constants, set the port to 50053 and add the -test suffix to both table names, then run the service and run yarn run-integration-tests blob.

Diff Detail

Repository
rCOMM Comm
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

Harbormaster returned this revision to the author for changes because remote builds failed. Nov 28 2022, 1:52 AM
Harbormaster failed remote builds in B13766: Diff 18851!

Unrelated CI failure

services/blob/src/service.rs
355–359 ↗(On Diff #18851)

I'm wondering if, instead of doing self.current_chunk.clone(), it wouldn't be better to do something like this to avoid copying: https://gist.github.com/barthap/1b6e717f8c659d269530415e7771f556

jon requested changes to this revision. Nov 28 2022, 9:19 AM
jon added inline comments.
services/blob/src/service.rs
373–374 ↗(On Diff #18851)

really not a fan of explicit unwrap usage. Please use expect() at least so we can grep for failures
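
For context, a hypothetical snippet (not the actual service code) showing the difference being asked for: both calls panic when the Option is None, but expect() attaches a message that can be grepped for in crash logs, while unwrap() only prints a generic one.

```
// Hypothetical illustration; maybe_holder and holder_or_panic are made up.
fn holder_or_panic(maybe_holder: Option<String>) -> String {
    // maybe_holder.unwrap()                 // panics with a generic message
    maybe_holder.expect("missing holder")    // panics with a greppable message
}
```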

391 ↗(On Diff #18851)

If the error message gets logged, we should have more meaningful messages

394–396 ↗(On Diff #18851)

does this not work?

This revision now requires changes to proceed. Nov 28 2022, 9:19 AM
services/blob/src/service.rs
373–374 ↗(On Diff #18851)

If the action is set, they must exist at this point. But good catch, this should return an anyhow error in this case.

391 ↗(On Diff #18851)

Intentional.

394–396 ↗(On Diff #18851)

db.put_blob_item fails with a different type of error (anyhow, but subject to change by ENG-2307) than this function returns (Status). Also, logging is added here in a child revision.
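
For readers of the thread, a hedged sketch of the kind of conversion being discussed; into_status is a hypothetical helper, and the diff's actual error handling and messages differ (logging lands in a child revision).

```
use tonic::Status;

// Illustrative only: converting an anyhow-style error from the DB layer into
// the gRPC Status that the handler has to return.
fn into_status(err: anyhow::Error) -> Status {
    Status::internal(format!("database error: {}", err))
}
```

A call site could then do something along the lines of db.put_blob_item(item).await.map_err(into_status)?.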

tomek added inline comments.
services/blob/src/service.rs
355–359 ↗(On Diff #18851)

Yeah, avoiding copying would be really beneficial. I think we can handle it as a separate task.

Regarding your solution: have you checked if there are some built-in solutions that achieve the same goal?

services/blob/src/service.rs
355–359 ↗(On Diff #18851)

I dug through the std lib code (e.g. Vec::append) and the AWS buffer/stream creation too, and I haven't found anything.
Folks on StackOverflow say that std::mem::swap is the way.
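
For reference, a minimal sketch of the std::mem approach under discussion; take_buffer is a hypothetical helper, not code from the gist or the diff.

```
// Sketch only: mem::take replaces the buffer with an empty Vec and returns the
// old one, so no bytes are copied, unlike self.current_chunk.clone().
fn take_buffer(current_chunk: &mut Vec<u8>) -> Vec<u8> {
    std::mem::take(current_chunk)
    // equivalent with an explicit temporary:
    //   let mut out = Vec::new();
    //   std::mem::swap(&mut out, current_chunk);
    //   out
}
```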

services/blob/src/service.rs
355–359 ↗(On Diff #18851)

Ok, thanks!

  • Rebase
  • Refactored direct unwrapping of the holder and blob hash

I don't see any other smells, but I'm going to defer to @varun for the final say.

services/blob/src/service.rs
394–396 ↗(On Diff #18851)

Fair enough, we can revisit this if necessary after anyhow adoption.

varun requested changes to this revision. Nov 29 2022, 8:09 PM
varun added inline comments.
services/blob/src/service.rs
374 ↗(On Diff #18957)

we should use ok_or_else instead of ok_or, here and everywhere else

387–391 ↗(On Diff #18957)
This revision now requires changes to proceed. Nov 29 2022, 8:09 PM
services/blob/src/service.rs
374 ↗(On Diff #18957)

No problem! This change will be made anyway in a child diff when logging is added.

But can you explain the advantage of using ok_or_else over ok_or when we're just returning an error object? Is it because the closure is lazily evaluated, so the error is not created unnecessarily?

  • Used ok_or_else
  • Merged nested if statements
services/blob/src/service.rs
374 ↗(On Diff #18957)

sorry, i'm confused how adding logging will change the behavior here

and yeah, you got it, it's because the closures are lazily evaluated. not a huge deal, but it's a good practice when you are passing the result of a function call. we should also opt for unwrap_or_else over unwrap_or for the same reason.
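
A small hypothetical example of the difference (not the diff's code; require_holder and the message are made up):

```
use tonic::Status;

// ok_or evaluates its argument eagerly even when the Option is Some, while
// ok_or_else only runs the closure in the None case, so the Status isn't
// constructed unnecessarily.
fn require_holder(maybe_holder: Option<String>) -> Result<String, Status> {
    // eager: Status::invalid_argument runs on every call
    //   maybe_holder.ok_or(Status::invalid_argument("holder not provided"))
    // lazy: the closure only runs when maybe_holder is None
    maybe_holder.ok_or_else(|| Status::invalid_argument("holder not provided"))
}
```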

This revision is now accepted and ready to land. Nov 29 2022, 10:34 PM