⚙ D3467 [services] Backup - Add server reactor implementations

Herald added subscribers: • benschac, atul, • adrian and 2 others. · View Herald TranscriptMar 21 2022, 9:39 AM

Harbormaster returned this revision to the author for changes because remote builds failed.Mar 21 2022, 9:56 AM

Harbormaster failed remote builds in B7488: Diff 10543!

• karol edited the summary of this revision. (Show Details)Mar 21 2022, 10:11 AM

• karol edited the test plan for this revision. (Show Details)

• karol added reviewers: • max, tomek, varun, • jim.

• karol added a parent revision: D3466: [services] Backup - Add blob client.

• karol added a child revision: D3468: [services] Backup - Update service implementation.

empty - retrigger CI

• karol retitled this revision from [services] Backup - Add server reactor implementations to [draft] [services] Backup - Add server reactor implementations.Mar 21 2022, 10:18 AM

Harbormaster completed remote builds in B7497: Diff 10553.Mar 21 2022, 10:36 AM

• karol requested review of this revision.Mar 21 2022, 10:36 AM

• karol added inline comments.Mar 22 2022, 10:20 AM

services/backup/docker-server/contents/server/src/Reactors/server/CreateNewBackupReactor.h
71 ↗	(On Diff #10553)	Here we have to decide how exactly the backup id will be generated. One thing is certain - the holder has to be unique across the system. So if the backup id is just random bytes, we can add a prefix - user id for example.

• karol retitled this revision from [draft] [services] Backup - Add server reactor implementations to [services] Backup - Add server reactor implementations - create new backup reactor.Mar 24 2022, 9:16 AM

• karol edited the summary of this revision. (Show Details)

• karol planned changes to this revision.Mar 24 2022, 9:19 AM

• karol mentioned this in D3468: [services] Backup - Update service implementation.

• karol edited the summary of this revision. (Show Details)

• karol added a child revision: D3516: [services] Backup - Add server reactor implementations - send log reactor.

update

• karol edited the summary of this revision. (Show Details)Mar 24 2022, 9:32 AM

• karol added a parent revision: D3517: [services] Backup - Add server reactor implementations - pull backup reactor.

• karol retitled this revision from [services] Backup - Add server reactor implementations - create new backup reactor to [draft] [services] Backup - Add server reactor implementations - create new backup reactor.Mar 24 2022, 9:47 AM

Harbormaster completed remote builds in B7583: Diff 10654.Mar 24 2022, 9:48 AM

Please update the logs

services/backup/docker-server/contents/server/src/Reactors/server/CreateNewBackupReactor.h
45–47 ↗	(On Diff #10654)	Why do we need to mock this function? Usually it's better to first introduce a function and then use it - by taking that approach you make sure that your code is always correct and it's a lot harder to forget about fixing the logic in the future
81–83 ↗	(On Diff #10654)	Can we somehow recover from this? Can we initialize it if it's not initialized?

This revision now requires changes to proceed.Mar 24 2022, 3:12 PM

I have very little context on the reactor stuff, and haven't reviewed the other diffs in this stack... so I don't think I can do a very thorough review here. @palys-swm, can you clarify what you wanted me to look at?

Removing from my queue until then, happy to review once it's clarified!

The logs are going to be removed once this will no longer be a draft.
Once this is accepted as a draft, I'm going to remove the draft label and request a review as for a normal diff.
Back to your queue.

services/backup/docker-server/contents/server/src/Reactors/server/CreateNewBackupReactor.h
45–47 ↗	(On Diff #10654)	I created a task for this https://linear.app/comm/issue/ENG-918/implement-generating-backup-id. I needed to have some kind of mock here as I needed to see that the flow works properly for multiple backups added sequentially and the backup id had to be unique.
81–83 ↗	(On Diff #10654)	It is removed in https://phabricator.ashoat.com/D3488. However, at the time of this diff, we could not. This was a critical error that was supposed to break the connection - it meant that a situation that should never occur happened.

In D3467#95719, @ashoat wrote:

I have very little context on the reactor stuff, and haven't reviewed the other diffs in this stack... so I don't think I can do a very thorough review here. @palys-swm, can you clarify what you wanted me to look at?

Removing from my queue until then, happy to review once it's clarified!

Sorry for the confusion - just wanted for you to take a look on Karol's inline comment

services/backup/docker-server/contents/server/src/Reactors/server/CreateNewBackupReactor.h
81–83 ↗	(On Diff #10654)	It's usually a lot better to not introduce an error at all than to introduce and fix it later. Is there a reason why we should break this rule here? I guess it might be challenging to inverse the order here...
71 ↗	(On Diff #10553)	We have a couple of options: we can try using some UUID, or use a prefix as you suggested. With userID as a prefix we might have some privacy issues - not sure if they are real, but they need to be considered. This is a comment I wanted @ashoat opinion on

This revision now requires changes to proceed.Mar 25 2022, 3:35 PM

ashoat removed a reviewer: ashoat.Mar 28 2022, 1:22 AM

ashoat added inline comments.

services/backup/docker-server/contents/server/src/Reactors/server/CreateNewBackupReactor.h
71 ↗	(On Diff #10553)	Here are some properties I think we should look for in a holder ID scheme: Should be globally unique Should be possible to "reverse" to find the forward index The "forward" index goes from a unique row in a particular service's database to a unique holder row in the blob service's database If the holder ID is constructed in such a way that we can find the corresponding unique row in the corresponding service's database, that makes it easier to recover from database corruption As an example, a holder ID scheme that might work for the backup service might be `backup:<backup_id>:<compaction\|log_id>:<blob_id>:<UUID>`. That way, we can figure out which row in the backup service's database is "holding" onto this blob I don't think we can address privacy very well here Even if we didn't achieve property 2 above, it would still be possible to regenerate the forward index by scanning through the backup service's database Since the backup service needs to store a reference into the blob service, there's no way to prevent this relationship from being visible to us It will be possible for us to determine which users and holding which blobs, but we won't have a way to discern the actual contents of the blobs (as they are encrypted)

back to you

services/backup/docker-server/contents/server/src/Reactors/server/CreateNewBackupReactor.h
81–83 ↗	(On Diff #10654)	Honestly, I'm not sure where did you come up with this rule from... For me it's pretty natural to throw and then fix it in an upcoming diff, a trivial example would be: void foo() { throw("unimplemented"); } I'm still a bit confused with feedback like this which seems to be standing against the philosophy of diffs ("put up small diffs even when you're doing a big feature and if something turns out to need a different solution, just overwrite the code from the prior diffs").
71 ↗	(On Diff #10553)	I think we can follow up in https://linear.app/comm/issue/ENG-918/implement-generating-backup-id I added there what @ashoat said here.

• karol planned changes to this revision.Mar 28 2022, 10:43 AM

update

Harbormaster failed remote builds in B7631: Diff 10719!Mar 28 2022, 11:03 AM

• karol planned changes to this revision.Mar 28 2022, 11:32 AM

rebuild please

Harbormaster completed remote builds in B7639: Diff 10727.Mar 28 2022, 11:46 AM

tomek accepted this revision.Mar 28 2022, 12:35 PM

tomek added inline comments.

services/backup/docker-server/contents/server/src/Reactors/server/CreateNewBackupReactor.h
81–83 ↗	(On Diff #10654)	Actually, the philosophy is totally different. In diff philosophy, you start with things you're sure that are needed and know how to do them. Later you use the code you've introduced. So in the case you mentioned it would be e.g.: 1: void aFunction() { // implementation } 2: void function2() { // implementation } 3: void foo() { aFunction(); function2(); } So again, you start with introducing things that are not used, but at the same time after each diff all the codebase is correct. There are no regressions and exceptions. In real life it is sometimes useful to have a shortcut and start with usage and then add the implementation. The rule you mentioned was a solution to even worse pattern: where we had a stack of diffs opened for weeks and were constantly updating diffs from the begging. But if we started with things the we were sure about then this improvement wouldn't be needed. So to summarize, instead of put up small diffs even when you're doing a big feature and if something turns out to need a different solution, just overwrite the code from the prior diffs I would say: put up small diffs even when you're doing a big feature and when you discover that something needs a different solution, start with introducing that solution

This revision is now accepted and ready to land.Mar 28 2022, 12:35 PM

• karol added inline comments.Mar 28 2022, 1:08 PM

services/backup/docker-server/contents/server/src/Reactors/server/CreateNewBackupReactor.h
81–83 ↗	(On Diff #10654)	The changes have been added in D3488 which is a follow-up, the original stack ended at D3471. Anyway, can we proceed here?

tomek added inline comments.Mar 28 2022, 1:17 PM

services/backup/docker-server/contents/server/src/Reactors/server/CreateNewBackupReactor.h
81–83 ↗	(On Diff #10654)	I think so - it doesn't matter too much in this case

make non-draft

removed draft label

Harbormaster failed remote builds in B7656: Diff 10751!Mar 29 2022, 8:15 AM

tomek requested changes to this revision.Mar 30 2022, 11:04 AM

tomek added inline comments.

services/backup/docker-server/contents/server/src/Reactors/server/CreateNewBackupReactor.h
67 ↗	(On Diff #10751)	In D3466 in `ServiceBlobClient.put` we have if (this->putReactor != nullptr && !this->putReactor->isDone()) { throw std::runtime_error( "trying to run reactor while the previous one is not finished yet"); } And here we're using `ServiceBlobClient::getInstance()`, which means that we're going to throw if we tried to call `put` before other `put` has finished. Does that mean that we can handle only a single request at a time? We should be able to have as many concurrent connections as we want, and not limit ourselves to just a single connection per service instance. (but I might be wrong)

This revision now requires changes to proceed.Mar 30 2022, 11:04 AM

• karol added inline comments.Mar 30 2022, 1:04 PM

services/backup/docker-server/contents/server/src/Reactors/server/CreateNewBackupReactor.h
67 ↗	(On Diff #10751)	Sorry, looks like D3466 is outdated... Managing a big stack like this can be tricky sometimes

• karol added inline comments.Mar 30 2022, 1:06 PM

services/backup/docker-server/contents/server/src/Reactors/server/CreateNewBackupReactor.h
67 ↗	(On Diff #10751)	It's all good, sorry. It is just fixed in D3488 as a follow-up.

• karol requested review of this revision.Mar 30 2022, 1:06 PM

tomek accepted this revision.Mar 31 2022, 3:58 PM

tomek added inline comments.

services/backup/docker-server/contents/server/src/Reactors/server/CreateNewBackupReactor.h
67 ↗	(On Diff #10751)	Issues like this highlight that we should improve diffs / reviews process. It doesn't sound like an approach where we first introduce code with bugs and then fix the bugs later is maintainable long-term. In this case a better ordering would be to have D3488 before this diff. (we shouldn't change it now, but in the future we need to be more mindful about this to avoid forgetting about issues and to reduce the possible confusion when reviewing). Managing a big stack like this can be tricky sometimes It's even harder to review them. We need to find ways of making the stacks more manageable. One easy solution might be to reduce their size by e.g. splitting by feature. In this case, this huge stack could be a number of stacks each introducing only one (or two: client and server) reactors. Do you think it would be more manageable?

This revision is now accepted and ready to land.Mar 31 2022, 3:58 PM

• karol added inline comments.Apr 1 2022, 6:23 AM

services/backup/docker-server/contents/server/src/Reactors/server/CreateNewBackupReactor.h
67 ↗	(On Diff #10751)	I do agree there's room for improvement in terms of the process. I think we can discuss better approaches outside of this diff. For now, I'm going to land this. It's even harder to review them. We need to find ways of making the stacks more manageable. One easy solution might be to reduce their size by e.g. splitting by feature. In this case, this huge stack could be a number of stacks each introducing only one (or two: client and server) reactors. Do you think it would be more manageable? I'm not sure about this. In such a case, we'll end up in a situation when the diffs will be decomposed into smaller stacks but effectively the stacks will have to be considered in certain order as well.

This revision was landed with ongoing or failed builds.Apr 1 2022, 6:35 AM

Closed by commit rCOMMd7d2a98056ef: [services] Backup - Add server reactor implementations - create new backup…. · Explain Why

This revision was automatically updated to reflect the committed changes.

• karol added a commit: rCOMMd7d2a98056ef: [services] Backup - Add server reactor implementations - create new backup….

• karol mentioned this in rCOMM6cede023718f: [services] Backup - Update service implementation.Apr 1 2022, 6:38 AM

tomek added inline comments.Apr 1 2022, 9:49 AM

services/backup/docker-server/contents/server/src/Reactors/server/CreateNewBackupReactor.h
67 ↗	(On Diff #10751)	I think we can discuss better approaches outside of this diff. Yeah, that definitely makes sense. we'll end up in a situation when the diffs will be decomposed into smaller stacks but effectively the stacks will have to be considered in certain order as well. That's true, but there's no issue with the first diff of a stack to depend on one of the diffs from the other. Also, a diff can depend on multiple diffs, if that's helpful. This decomposition will make it easier to focus on one feature at a time.

[services] Backup - Add server reactor implementations - create new backup reactor
ClosedPublic
Actions

Details

Diff Detail

Event Timeline

Revision Contents
Changeset List

Diff 10543

services/backup/docker-server/contents/server/src/Reactors/server/CreateNewBackupReactor.h

services/backup/docker-server/contents/server/src/Reactors/server/PullBackupReactor.h

services/backup/docker-server/contents/server/src/Reactors/server/RecoverBackupKeyReactor.h

services/backup/docker-server/contents/server/src/Reactors/server/SendLogReactor.h

[services] Backup - Add server reactor implementations - create new backup reactorClosedPublicActions

Details

Diff Detail

Event Timeline

Revision ContentsChangeset List

Diff 10543

services/backup/docker-server/contents/server/src/Reactors/server/CreateNewBackupReactor.h

services/backup/docker-server/contents/server/src/Reactors/server/PullBackupReactor.h

services/backup/docker-server/contents/server/src/Reactors/server/RecoverBackupKeyReactor.h

services/backup/docker-server/contents/server/src/Reactors/server/SendLogReactor.h

[services] Backup - Add server reactor implementations - create new backup reactor
ClosedPublic
Actions

Revision Contents
Changeset List