Page MenuHomePhabricator

[services] Backup - fix handling reading done in client reactors
ClosedPublic

Authored by karol on Mar 31 2022, 6:31 AM.
Tags
None
Referenced Files
Unknown Object (File)
Sat, Nov 16, 1:51 AM
Unknown Object (File)
Fri, Nov 15, 11:25 PM
Unknown Object (File)
Fri, Nov 15, 11:25 PM
Unknown Object (File)
Fri, Nov 15, 11:25 PM
Unknown Object (File)
Fri, Nov 15, 11:25 PM
Unknown Object (File)
Fri, Nov 15, 11:25 PM
Unknown Object (File)
Fri, Nov 15, 9:03 AM
Unknown Object (File)
Fri, Nov 15, 9:03 AM

Details

Summary

Depends on D3582

We should see a proper status in the terminateCallback and the error log(if an error occurs).

Test Plan
cd services
yarn run-backup-service

Diff Detail

Repository
rCOMM Comm
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

tomek requested changes to this revision.Apr 6 2022, 11:44 PM

How do we plan to verify the data?

services/backup/docker-server/contents/server/src/Reactors/client/base-reactors/ClientBidiReactorBase.h
56–61 ↗(On Diff #10879)

Why do we need to change the order here? It looks like the only difference is that we will write the error every time this method is called instead of once

This revision now requires changes to proceed.Apr 6 2022, 11:44 PM

I was planning to check if the hash is valid, we can also send a bool flag manually at the end as suggested in D3562. Anyway, relying on !ok here is a no-go.

services/backup/docker-server/contents/server/src/Reactors/client/base-reactors/ClientBidiReactorBase.h
56–61 ↗(On Diff #10879)

We need that because terminate is called from several places, so we should assume there may be a situation when it's called twice and an error occurs between these two calls. So we have something like this:

terminate(OK);
...
terminate(Err);

Now, we want to have that error printed out, right? Of course, there may be a situation when it gets called twice and there's an error for both calls:

terminate(Err);
...
terminate(Err);

In such a case, we will have this error printed twice. If that bothers you, I can add a bool flag preventing that.

tomek requested changes to this revision.Apr 8 2022, 12:55 AM
tomek added inline comments.
services/backup/docker-server/contents/server/src/Reactors/client/base-reactors/ClientBidiReactorBase.h
56–61 ↗(On Diff #10879)

This is confusing. What is the scenario in which terminate is called twice? In this function, we set done to true, so after it is set, there shouldn't be any operation that can fail. Or maybe, the meaning of done is different and we should find a better name.

This revision now requires changes to proceed.Apr 8 2022, 12:55 AM
karol added inline comments.
services/backup/docker-server/contents/server/src/Reactors/client/base-reactors/ClientBidiReactorBase.h
56–61 ↗(On Diff #10879)

This is confusing. What is the scenario in which terminate is called twice?

It should never happen, but we can assume it will somehow. There's no hard check for this, so maybe there's some edge case for this.

In this function, we set done to true, so after it is set, there shouldn't be any operation that can fail. Or maybe, the meaning of done is different and we should find a better name.

The point is to not invoke StartWritesDone multiple times.

tomek requested changes to this revision.Apr 8 2022, 3:19 AM
tomek added inline comments.
services/backup/docker-server/contents/server/src/Reactors/client/base-reactors/ClientBidiReactorBase.h
56–61 ↗(On Diff #10879)

So when this->done we should add a warning Terminate function was called multiple times, which should never happen. Please investigate!

This revision now requires changes to proceed.Apr 8 2022, 3:19 AM
karol added inline comments.
services/backup/docker-server/contents/server/src/Reactors/client/base-reactors/ClientBidiReactorBase.h
56–61 ↗(On Diff #10879)

Yes, as I suggested in D3582, we should probably throw in such a case. That would terminate the connection with an error and we could just set it to an internal server error.
What do you think?

tomek requested changes to this revision.Apr 11 2022, 6:20 AM
tomek added inline comments.
services/backup/docker-server/contents/server/src/Reactors/client/base-reactors/ClientBidiReactorBase.h
56–61 ↗(On Diff #10879)

What would be the consequence of calling terminate twice? If we can continue execution I'd prefer to just log an error. But if it is not possible or it is likely to cause some serious issues, it's safer to throw. Overall, it is better to recover from such issues instead of always throwing.

This revision now requires changes to proceed.Apr 11 2022, 6:20 AM
services/backup/docker-server/contents/server/src/Reactors/client/base-reactors/ClientBidiReactorBase.h
56–61 ↗(On Diff #10879)

Ok so let's just log.

karol edited the summary of this revision. (Show Details)

add log that terminate was called multiple times

tomek requested changes to this revision.Apr 12 2022, 9:07 AM
tomek added inline comments.
services/backup/docker-server/contents/server/src/Reactors/client/base-reactors/ClientBidiReactorBase.h
55 ↗(On Diff #11331)

Shouldn't we call this after this->done check?

85–90 ↗(On Diff #11331)

We also need to update comments - just like in D3582

This revision now requires changes to proceed.Apr 12 2022, 9:07 AM
services/backup/docker-server/contents/server/src/Reactors/client/base-reactors/ClientBidiReactorBase.h
55 ↗(On Diff #11331)

yes

56–61 ↗(On Diff #10879)

I spent some time on this and I think that terminate will be called twice when for example there are no more reads:

  • from OnReadDone
  • from OnDone

And this is just fine. I think we should allow it. What we should prevent is overwriting the error status. So if there's an error and then we somehow call terminate with the status OK, we should not suppress the error.

I'm going to push changes for this to reduce cycles, I think this is a valid change. If you feel otherwise, I can always revert.

tomek requested changes to this revision.Apr 13 2022, 2:46 AM

Do we need to introduce validate just like in D3582?

services/backup/docker-server/contents/server/src/Reactors/client/base-reactors/ClientBidiReactorBase.h
57–59 ↗(On Diff #11383)

Shouldn't we validate before telling that there was an error? Or is this case different than D3582?

56–61 ↗(On Diff #10879)

Agree, it's a good practice to make such changes to reduce cycles.

Regarding terminate - sounds reasonable.

This revision now requires changes to proceed.Apr 13 2022, 2:46 AM
This revision is now accepted and ready to land.Apr 13 2022, 3:56 AM