services/backup/blob_client/src/put_client.rs
28 ↗	(On Diff #16165)	Still not convinced that Arc needs to be used with lazy_static https://linear.app/comm/issue/ENG-1716#comment-5a5f8aef
31 ↗	(On Diff #16165)	What are error messages used for? and do we need to really manage them? Should they just be emitted and moved on?

• karol added inline comments.Sep 2 2022, 2:04 PM

services/backup/blob_client/src/put_client.rs
28 ↗	(On Diff #16165)	https://linear.app/comm/issue/ENG-1732/remove-arc-from-lazy-static
31 ↗	(On Diff #16165)	I don't think so. The purpose is to detect any errors in the async functions and break the flow yielding an error to the c++ if they occur. This was the simplest/fastest way of doing this but I'm open to suggestions. We can change it but I don't think we should just print them and move on. `panic` is not a way to go either.

• karol mentioned this in D5006: [services] Backup - Blob Put Client - Add helper functions.Sep 2 2022, 2:05 PM

• karol mentioned this in D5011: [services] Backup - Blob Put Client - Add write.

• karol mentioned this in D5012: [services] Backup - Blob Put Client - Add terminate.Sep 2 2022, 2:14 PM

• jon requested changes to this revision.Sep 2 2022, 5:40 PM

• jon added inline comments.

services/backup/blob_client/src/put_client.rs
31 ↗	(On Diff #16165)	The purpose is to detect any errors in the async functions and break the flow yielding an error to the c++ if they occur. If we enter an invalid state, then we should add some logic to get back to a valid state if possible. The logic right now in the other diffs is that some errors will be emitted, but the functions just starts returning `Ok(())`'s which means everything is silently failing from the perspective of C++. We can change it but I don't think we should just print them and move on If all we can do is just report the status, then we should print it and return that it failed. I don't see the value in indefinitely accumulating error messages. panic is not a way to go either. Agreed, which is also why I don't like `unwrap()`

This revision now requires changes to proceed.Sep 2 2022, 5:40 PM

• karol added inline comments.Sep 5 2022, 1:24 PM

services/backup/blob_client/src/put_client.rs
31 ↗	(On Diff #16165)	If we enter an invalid state, then we should add some logic to get back to a valid state if possible. What do you mean exactly? I don't think this is a good idea. I think, if we enter an invalid state, we should break the connection and return a grpc status with an error. The logic right now in the other diffs is that some errors will be emitted, but the functions just starts returning `Ok(())`'s which means everything is silently failing from the perspective of C++. If that's what's happening, then it is a mistake, I'm going to look at this code once again, if there's ever an error, we should exit with a proper `Result`(`Err`). I generally agree that the error handling here needs to be repaired big time. I just did it quickly and dirty. I created a task for this https://linear.app/comm/issue/ENG-1734/improve-error-handling-system-in-the-blob-client

• karol requested review of this revision.Sep 6 2022, 7:43 AM

What do you mean exactly?

If the client gets to be in an unhealthy state, then there should be some logic to get try and get the client back into a healthy state and retry the get/put. Currently, an unhealthy client will both report an error and drop the message.

I don't think this is a good idea. I think, if we enter an invalid state, we should break the connection and return a grpc status with an error.

Yes, but the message also gets dropped. There should be some re-try logic (which can be done in another ticket).

• jon requested changes to this revision.Sep 6 2022, 5:13 PM

This revision now requires changes to proceed.Sep 6 2022, 5:13 PM

If the client gets to be in an unhealthy state, then there should be some logic to get try and get the client back into a healthy state and retry the get/put. Currently, an unhealthy client will both report an error and drop the message.

Sorry, but I disagree. I think when the client gets into an unhealthy state, it should drop the connection and return an error.

If we wanted to really attempt to retry, we should at least add some logic of when to do this and when not. If we get an error like provided holder already exists in the database, then what is the point of retrying? If we try again, it won't disappear magically I guess, will it? And adding logic for this will probably result in adding an additional layer of abstraction or something - we could have a type of error that has a flag shouldRetry, something like this. All in all, don't think we should do it here and I don't think this is necessary and important either.

The way I see it now is that we could have a button on the client side in the UI saying retry when there's an error returned.

If we get an error like provided holder already exists in the database, then what is the point of retrying? If we try again, it won't disappear magically I guess, will it?

Wouldn't that mean that a previous put was already successful? Seems like a state that shouldn't ever occur, but I agree it should be handled by dropping.

I think the concept of the client and message are being fused here. There's a difference between a message being handled, and the client being healthy.

I'm refering to the lifecycle of a BidiClient, not the lifecycle of a message passed through a BidiClient.

Either way, seems like we are just pushing this through, and creating tasks for others to address concerns.

This revision is now accepted and ready to land.Sep 7 2022, 5:29 PM

rebase

Harbormaster completed remote builds in B11982: Diff 16437.Sep 8 2022, 11:07 AM

Closed by commit rCOMMfe23b0156554: [services] Backup - Blob Put Client - Add essential objects. · Explain WhySep 8 2022, 11:10 AM

This revision was automatically updated to reflect the committed changes.

• karol added a commit: rCOMMfe23b0156554: [services] Backup - Blob Put Client - Add essential objects.