Page MenuHomePhabricator

[services] Backup/Blob - fix handling reading done in server reactors
ClosedPublic

Authored by karol on Mar 31 2022, 6:30 AM.
Tags
None
Referenced Files
Unknown Object (File)
Wed, Oct 16, 8:36 AM
Unknown Object (File)
Sun, Oct 13, 4:17 PM
Unknown Object (File)
Sun, Oct 13, 4:17 PM
Unknown Object (File)
Sun, Oct 13, 4:17 PM
Unknown Object (File)
Sun, Oct 13, 4:17 PM
Unknown Object (File)
Sun, Oct 13, 4:17 PM
Unknown Object (File)
Sun, Oct 13, 4:17 PM
Unknown Object (File)
Sun, Oct 13, 4:14 PM

Details

Summary

Depends on D3562

we should suppress this as we want to have the ability to gracefully end a connection on the other side. This will result in !ok here. I think it is somehow broken and the simple bool flag doesn't give us enough information on what happened.

We should manually check if the data we received is valid

Test Plan

The same as in for instance D3535, we have to see that the connection end on the other side(so if we're looking at the server, the connection should end gracefully at the client and vice versa).

Diff Detail

Repository
rCOMM Comm
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

karol edited the test plan for this revision. (Show Details)
karol added reviewers: max, varun, tomek, jim.
tomek requested changes to this revision.Apr 6 2022, 11:41 PM

How do we plan to check if the data is valid?

This revision now requires changes to proceed.Apr 6 2022, 11:41 PM

I was planning to check if the hash is valid, we can also send a bool flag manually at the end as suggested in D3562. Anyway, relying on !ok here is a no-go.

tomek requested changes to this revision.Apr 8 2022, 12:50 AM
tomek added inline comments.
services/backup/docker-server/contents/server/src/Reactors/server/base-reactors/ServerBidiReactorBase.h
81–85 ↗(On Diff #10878)

I think we can improve this comment by focusing on facts and changing its structure a bit, e.g. "Ending a connection on the other side results in ok flag being set to false. It makes it impossible to detect a failure based just on the flag. We plan to introduce additional mechanism which will allow distinguishing the outcomes in a task ENG-xxx." When that task is done we can replace the last sentence with the description of the solution (if necessary).

86 ↗(On Diff #10878)

This status is used to determine if we want to write a message to stdout and is used as a parameter of Finish. What actually happens when a different status is passed to Finish?

This revision now requires changes to proceed.Apr 8 2022, 12:50 AM
karol added inline comments.
services/backup/docker-server/contents/server/src/Reactors/server/base-reactors/ServerBidiReactorBase.h
81–85 ↗(On Diff #10878)

Ok, I can change the comment, but I'm not sure about the last sentence:

We plan to introduce additional mechanism which will allow distinguishing the outcomes in a task ENG-xxx.

We do? Here it's just information for devs what they should be aware of - they have to manually detect the failure. I don't see a way we could do this in a generic way.

How about this:

Ending a connection on the other side results in the `ok` flag being set to false. It makes it impossible to detect a failure based just on the flag. We should manually check if the data we received is valid
86 ↗(On Diff #10878)

Then this status is sent to the other side.

This status is used to determine if we want to write a message to stdout

This was not the intention but it indeed works like this. I just decided to print errors.

tomek requested changes to this revision.Apr 8 2022, 2:26 AM
tomek added inline comments.
services/backup/docker-server/contents/server/src/Reactors/server/base-reactors/ServerBidiReactorBase.h
81–85 ↗(On Diff #10878)

Ok, that makes sense, but see my other comment with an idea of a generic solution

86 ↗(On Diff #10878)

So in theory, this approach could result in client receiving OK when a failure occurred, right? What do you think about adding a virtual method that validates the data, is used here and determines whether we want to return OK or something else?

This revision now requires changes to proceed.Apr 8 2022, 2:26 AM
karol added inline comments.
services/backup/docker-server/contents/server/src/Reactors/server/base-reactors/ServerBidiReactorBase.h
86 ↗(On Diff #10878)

What data do you want to validate? The data that's specific to the derived type of the reactor? If so, I honestly find it easier to handle everything in one method(here it's handleRequest) and in a case when something's wrong, just throw an error. What you're proposing is splitting the reactor-specific logic into two methods which would make it harder to implement for us later.

As for the problem itself, what might happen is that somehow the terminate method is called twice, maybe with two different states. But as I think of this now, this should never ever happen and would indicate that there's an error in the logic. So how about we could throw an error when terminate is being called more than once in a reactor's lifetime?

tomek requested changes to this revision.Apr 11 2022, 6:11 AM
tomek added inline comments.
services/backup/docker-server/contents/server/src/Reactors/server/base-reactors/ServerBidiReactorBase.h
86 ↗(On Diff #10878)

What data do you want to validate?

The same data you mentioned in the summary

We should manually check if the data we received is valid

So the plan is to always handle validation in handleRequest, right? It has some advantages, but in defense of introducing validate method, it would be more readable, as the responsibilities of these would be clearer. Also, it would be more intuitive to implement a subclass - the function name would dictate what's need to be done.

I'm not convinced that what you described makes it possible to handle correctly the validation. We should always call terminate at most once, so if we call it with OK in OnReadDone, there would be no way to tell the client that something actually went wrong. The solution where the validation happens in handleRequest would force us to call that method even when !ok which I find a bad practice.

So generally, the flow of OnReadDone would be to always call validate and when it reports some issues we would call terminate with proper error code.

This revision now requires changes to proceed.Apr 11 2022, 6:11 AM
karol added inline comments.
services/backup/docker-server/contents/server/src/Reactors/server/base-reactors/ServerBidiReactorBase.h
86 ↗(On Diff #10878)

The same data you mentioned in the summary

Just to clarify - by this data I meant the data that we will have at the end when terminate is called.

So generally, the flow of OnReadDone would be to always call validate and when it reports some issues we would call terminate with proper error code.

This feels odd. I'd rather call validate in the terminate method. How about this? The validate method could then optionally throw an error and in such a case, there would be a proper status set in the terminate method. SAo something like this:

template <class Request, class Response>
void ServerWriteReactorBase<Request, Response>::terminate(grpc::Status status) {
  this->status = status;
  try {
    this->validate();
  } catch(std::runtime_error &e) {
    this->status = grpc::Status(grpc::StatusCode::INTERNAL, e.what());
  }
  this->terminateCallback();
  if (!this->status.ok()) {
    std::cout << "error: " << this->status.error_message() << std::endl;
  }
  if (finished) {
    return;
  }
  this->Finish(status);
  finished = true;
}
tomek requested changes to this revision.Apr 12 2022, 1:57 AM
tomek added inline comments.
services/backup/docker-server/contents/server/src/Reactors/server/base-reactors/ServerBidiReactorBase.h
86 ↗(On Diff #10878)

This feels odd. I'd rather call validate in the terminate method

Could you explain what is odd in this approach? I think it is intuitive we're validating every chunk of data we receive, but maybe I'm missing something.

Having a validate method that should be called only when we finish doesn't sound too intuitive at first, but it makes some sense: we want to validate if the whole communication was successful. But at the same time, it would require storing all the data in memory which might be an issue - we should try to avoid being constrained by memory.

Regarding your proposed method, it has some issues:

  1. If the status is not ok, do we really need to validate first or can we just finish with that status?
  2. When do we want to call terminateCallback? We're calling it even when status is not ok, but we're not doing that when validation fails

We need to think through it and find some consistent solution.

This revision now requires changes to proceed.Apr 12 2022, 1:57 AM
karol added inline comments.
services/backup/docker-server/contents/server/src/Reactors/server/base-reactors/ServerBidiReactorBase.h
86 ↗(On Diff #10878)

Regarding your proposed method, it has some issues:

  1. We can finish with that status I think.
  2. Ah yes, so there's an issue in the snippet I created. terminateCallback should always be called no matter what the status is.

Could you explain what is odd in this approach? I think it is intuitive we're validating every chunk of data we receive, but maybe I'm missing something.

We can throw in handleRequest. Adding an additional virtual method that has to be implemented doesn't seem handy to me at all, sorry. There's going to be just more boilerplate and I don't see an advantage of decomposing this.

Some pseudocode :

Having this:

enum STATE {
  READ_A,
  READ_B,
  READ_C,
};

We now have this:

handleRequest() {
  switch(this->state) {
    case READ_A: {
      // read
      // validate - throw if err
    }
    case READ_B: {...}
    case READ_C: {...}
  }
}

And you want to convert it to this:

handleRequest() {
  switch(this->state) {
    case READ_A: {
      // read
    }
    case READ_B: {...}
    case READ_C: {...}
  }
}

validate() {
  switch(this->state) {
    case READ_A: {
      // validate - throw if err
    }
    case READ_B: {...}
    case READ_C: {...}
  }
}

Do I understand this correctly?

tomek requested changes to this revision.Apr 12 2022, 8:59 AM

Requesting changes one last time, but I agree with your proposed solution. So please update the comments (and terminate method) appropriately.

services/backup/docker-server/contents/server/src/Reactors/server/base-reactors/ServerBidiReactorBase.h
86 ↗(On Diff #10878)

Thanks for explaining this! Although I think that there's a lot of value in splitting the responsibilities here, it's also the case that with the current approach with a switch it's harder to implement. So I agree to avoid having to call validate after each message.

This revision now requires changes to proceed.Apr 12 2022, 8:59 AM

A bit of logic is duplicated in terminate methods, but I don't think we can improve it significantly

services/backup/docker-server/contents/server/src/Reactors/server/base-reactors/ServerBidiReactorBase.h
88 ↗(On Diff #11380)

terminate method has ServerBidiReactorStatus as the first argument, but here we're passing grpc::Status::OK. Is it ok?

This revision is now accepted and ready to land.Apr 13 2022, 2:37 AM
services/backup/docker-server/contents/server/src/Reactors/server/base-reactors/ServerBidiReactorBase.h
77 ↗(On Diff #11380)

Is it possible for terminate to be called twice in this class? What happens if we call Finish twice? Should we avoid that?

services/backup/docker-server/contents/server/src/Reactors/server/base-reactors/ServerBidiReactorBase.h
77 ↗(On Diff #11380)

terminate may be called multiple times, Finish not - for Finish I fix it in https://phabricator.ashoat.com/D3664

88 ↗(On Diff #11380)

Weird that it compiles, maybe it implicitly calls the ServerBidiReactorStatus ctor. I'm going to change this to an explicit call.

services/backup/docker-server/contents/server/src/Reactors/server/base-reactors/ServerBidiReactorBase.h
77 ↗(On Diff #11380)

Ok

88 ↗(On Diff #11380)

Yeah, that makes sense. Do you think we should make our constructor explicit?

services/backup/docker-server/contents/server/src/Reactors/server/base-reactors/ServerBidiReactorBase.h
88 ↗(On Diff #11380)

Honestly I don't mind