Page MenuHomePhabricator

[services] Backup - Pull more mutual code from base reactors - Apply in client reactors
ClosedPublic

Authored by karol on Apr 20 2022, 3:48 AM.
Tags
None
Referenced Files
Unknown Object (File)
Wed, Oct 23, 4:49 AM
Unknown Object (File)
Sun, Oct 6, 10:08 AM
Unknown Object (File)
Sun, Oct 6, 10:01 AM
Unknown Object (File)
Sun, Oct 6, 9:28 AM
Unknown Object (File)
Sep 27 2024, 4:31 AM
Unknown Object (File)
Sep 15 2024, 1:51 PM
Unknown Object (File)
Sep 15 2024, 5:53 AM
Unknown Object (File)
Sep 15 2024, 5:53 AM

Details

Summary

Depends on D3785

https://linear.app/comm/issue/ENG-916/consider-pulling-more-mutual-code-from-the-async-reactors

Applying inheriting the BaseRactor in the client reactors

Test Plan
cd services
yarn run-backup-service

Diff Detail

Repository
rCOMM Comm
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

ashoat requested changes to this revision.Apr 20 2022, 10:59 AM
ashoat added inline comments.
services/backup/src/Reactors/client/base-reactors/ClientBidiReactorBase.h
13 ↗(On Diff #11660)

I have a very strong opinion that we should avoid multiple inheritance in C++. Can we make this work by having BaseReactor extend grpc::ClientBidiReactor? We'll probably need to parameterize it on Request, Response as well

This revision now requires changes to proceed.Apr 20 2022, 10:59 AM
karol added inline comments.
services/backup/src/Reactors/client/base-reactors/ClientBidiReactorBase.h
13 ↗(On Diff #11660)

I don't think this is possible. I realize that multiple inheritance (MI) is tricky/dangerous, but I couldn't see a better way of doing this.

If you look at the related diffs in this stack, all of the base reactor classes inherit from the BaseReactor and from a class from gRPC which represents either a client or a server reactor class.

Now, Every base class inherits from a different gRPC class so there is no way that the BaseReactor class could inherit from one of them because it wouldn't fit with other base reactor classes.

We'd end up with different versions of the BaseReactor class for every base class which kills the whole purpose of the BaseReactor.

The good news about MI here is that our coding style is different than gRPC's. In gRPC, they start method names with capital letters and we start the names with lowercase letters. That helps avoid collisions/the diamond problem, etc.

If you (or anybody) have a different idea of how we could do this, then I'm open.

ashoat requested changes to this revision.Apr 21 2022, 10:19 AM

We could have a utility class with the functionality, and then have two superclasses (one for client, one for server) that call into the utility class?

Most languages don't have a multiple inheritance feature. What would you do if C++ did not support multiple inheritance?

Curious for @palys-swm's perspective as well.

This revision now requires changes to proceed.Apr 21 2022, 10:19 AM

Every language is different. It depends on what technology I could use. For example with Java I'd use interfaces - you can implement more than one. Python allows multiple inheritance as well.

If I used yet another language without multiple inheritance, I'd solve this problem differently. Depends on the language.

ashoat requested changes to this revision.Apr 22 2022, 11:17 AM

Can we use one of those solutions here?

I think multiple inheritance in C++ is like var declarations in JavaScript... technically possible, but widely considered an anti-pattern with a lot of unexpected / inconsistent behavior, and best to avoid.

Curious for @palys-swm's perspective here as well.

This revision now requires changes to proceed.Apr 22 2022, 11:17 AM

Can we use one of those solutions here?

What solutions? Did you refer to this:

If I used yet another language without multiple inheritance, I'd solve this problem differently. Depends on the language.

? If so, then please note that I mean that I'd use another language's features to achieve the goal differently. C++ may not have these features (like it doesn't have interfaces per se).

I think multiple inheritance in C++ is like var declarations in JavaScript... technically possible, but widely considered an anti-pattern with a lot of unexpected / inconsistent behavior, and best to avoid.

Kind of. I think these anti-patterns come from the fact that they make it easy for people to shoot themselves in the feet. If you are careful and know what you're doing, there shouldn't be a problem.

What we need is a parent class for the reactors, but also every reactor has to extend its equivalent from the gRPC. MI fits here I think. But ok, what can go wrong? We may have a diamond problem or a problem with function overriding in general, you can google the diamond problem, it's well described on the Internet, here is a very simple example of a confused compiler when using MI in C++:

class A { void f(){...} }
class B { void f(){...} }
class C: A, B {}
main() {
  C c;
  c.f(); // ???
}

But, as I mentioned before:

The good news about MI here is that our coding style is different than gRPC's. In gRPC, they start method names with capital letters and we start the names with lowercase letters. That helps avoid collisions/the diamond problem, etc.

I'm going to state what I said before:

I don't think this is possible. I realize that multiple inheritance (MI) is tricky/dangerous, but I couldn't see a better way of doing this.

My point is that we use c++. We don't have JS, python, java, or c#. We have to solve this particular problem somehow. This code extraction is very handy I think (also with the status access synchronization).

As a side note, @ashoat why exactly do you think that MI is an anti-pattern? I read some opinions on the Internet and most people say that we should avoid this if possible but it's indeed sometimes needed. I think in this case we do need this.
Also, is there anything else that could go wrong except methods' names collisions?

I think this became a low-level discussion. We have to solve this in c++, I don't think that saying: "it doesn't look good, let's do it differently" will bring anything new here at this point.

I know that you may not know all the c++ details but you have an intuition waving a red flag in your head that this is dangerous. But since this is a very specific problem to solve, I think we should talk about language details.

As an alternative, we can always give up on extracting this code to the base reactor class. In this case, we won't have a problem with MI but we're going to end up with a lot of boilerplate code.

Curious for @palys-swm's perspective here as well.

Me too.

tomek requested changes to this revision.Apr 25 2022, 3:36 AM

Ah, this is a really interesting problem!

First of all, we need to distinguish two types of inheritance. Unfortunately, in C++ the distinction isn't that clear like in Java where we have classes and interfaces. Interfaces specify which functions an object has and classes provide more - the actual implementation (may not be always true, but that doesn't matter). There's nothing wrong in using multiple interfaces, as they only define the functions that need to be implemented, but it still might cause issues when two interfaces declare the same function.

In C++, this distinction isn't that obvious, as we use the same syntax when inheriting from abstract and concrete classes, which might be a little confusing. The point is, extending a couple of abstract classes is something which is widely used in other languages, e.g. traits, interfaces, protocols, etc. and this technique is rather safe and really useful. On the other hand, extending a couple of concrete classes has a lot of issues and should be avoided - there are better ways of doing that.

So one possible solution here is to create pure abstract classes and have some implementation for them. But how to do that?

OOP in general has a lot of traps and using inheritance extensively is one of them. Even if we can say that some class is another class, it doesn't mean that we always should inherit from it. It's usually a lot more maintainable to use delegation instead.

When using delegation, we don't inherit from a class and instead we include an object of the class as a member. Then we delegate all the necessary calls to that class.

It seems like the solution here might be to use a combination of multiple inheritance of pure virtual methods and a delegation. What are your thoughts?

Makes sense to me. So for the methods that aren't implemented in the base class (terminate, validate, doneCallback, terminateCallback), we'd use a pure abstract class and for the rest (getStatus, setStatus, getState) delegation. @ashoat ?

requesting review so Ashoat can see it (he probably can't right now).

ashoat requested changes to this revision.EditedApr 25 2022, 9:55 AM

I'd like to see some alternatives considered that are not multiple inheritance. There is a lot of discussion here but no alternatives have been seriously considered. I'm not going to be able to give you detailed alternatives and code snippets. I did suggest an approach above that was not considered / explored, but ultimately it's up to @karol-bisztyga whether he wants to consider my suggested approach or something else.

(For context, here's my suggestion above. @karol-bisztyga, have you spent any time considering that?)

We could have a utility class with the functionality, and then have two superclasses (one for client, one for server) that call into the utility class?

The critical thing to understand here: I'm explicitly asking @karol-bisztyga to consider alternatives and see what some of the options are. I am not taking responsibility for finding the alternatives or proposing them (although I am happy to help however I can briefly) – instead, I'm asking @karol-bisztyga
to take that responsibility.

The reason I brought up other languages was that @karol-bisztyga seemed to initially have trouble coming up with alternatives. I know other languages don't have multiple inheritance, and have had to solve this exact issue using "old school" OOP. The idea that in all other languages this is solved with some other magic, I think is misinformed. I'm pretty sure that in languages like classic Java, people address this in a way that could be ported to C++ easily.

If I can be frank, most of @karol-bisztyga's responses to my questions have come across as either "I don't know any alternatives" or "why don't you propose some alternatives?" Neither of those is an appropriate response here – I'm asking you to try and come up with some alternatives so we can discuss them. I am not in a position to spend time researching the alternatives and proposing them here, unfortunately.

Last note – if you're inclined to hit "re-request review" here without proposing any alternatives that don't use multiple inheritance, can you instead schedule a chat with me and @palys-swm to go through this feedback together?

This revision now requires changes to proceed.Apr 25 2022, 9:55 AM
This comment was removed by karol.

First of all, all I said was that we should start talking language-specific concrete stuff instead of general ideas because this discussion just got low-level.

I'd like to see some alternatives considered that are not multiple inheritance. There is a lot of discussion here but no alternatives have been seriously considered.

Sorry, but this is incorrect. First, @palys-swm said

It seems like the solution here might be to use a combination of multiple inheritance of pure virtual methods and a delegation. What are your thoughts?

Then, I responded:

Makes sense to me. So for the methods that aren't implemented in the base class (terminate, validate, doneCallback, terminateCallback), we'd use a pure abstract class and for the rest (getStatus, setStatus, getState) delegation. @ashoat ?

So I do think this very alternative is seriously considered as both me and Tomek agreed this can be a good solution here.

Another thing: I never got a response for this:

As a side note, @ashoat why exactly do you think that MI is an anti-pattern? I read some opinions on the Internet and most people say that we should avoid this if possible but it's indeed sometimes needed. I think in this case we do need this.
Also, is there anything else that could go wrong except methods' names collisions?

I understand you may not have time for this, etc. But, to solve this issue, we have to dive into details. This is a pure conflict - I mean, we need detailed discussion and you refuse to participate due to a lack of time, at the same time having a very strong opinion that's not backed up. With that said, we can of course just treat "no multiple inheritance" as a strict, golden rule, not question it and just obey. I'm ok.

I'm pretty sure that in languages like classic Java, people address this in a way that could be ported to C++ easily.

Is there a point in continuing these discussions around the topic? Are we going to end up posting Java code here, after another couple of cycles...? How about getting to the point of the problem?

Finally, the thing you proposed, sorry I missed this:

We could have a utility class with the functionality, and then have two superclasses (one for client, one for server) that call into the utility class?

This is confusing and I do not understand the idea. Not sure if you understand the problem, maybe I did miss something. Anyway, it's certainly confusing.

Look, we have something like this:

// code derived by grpc
class gprcBidiReactor {}
class gprcReadReactor {}
class gprcWriteReactor {}
// our code
class ServerBidiBase extends gprcBidiReactor {}
class ServerReadBase extends gprcReadReactor {}
class ServerWriteBase extends gprcWriteReactor {}

class ClientBidiBase extends gprcBidiReactor {}
class ClientReadBase extends gprcReadReactor {}
class ClientWriteBase extends gprcWriteReactor {}

then, we extracted the base reactor:

// code derived by grpc
class gprcBidiReactor {}
class gprcReadReactor {}
class gprcWriteReactor {}
// our code
class BaseReactor {}

class ServerBidiBase extends gprcBidiReactor, BaseReactor {}
class ServerReadBase extends gprcReadReactor, BaseReactor {}
class ServerWriteBase extends gprcWriteReactor, BaseReactor {}

class ClientBidiBase extends gprcBidiReactor, BaseReactor {}
class ClientReadBase extends gprcReadReactor, BaseReactor {}
class ClientWriteBase extends gprcWriteReactor, BaseReactor {}

I don't understand the separation you're talking about:

have two superclasses (one for client, one for server)

So this would look like this?

// code derived by grpc
class gprcBidiReactor {}
class gprcReadReactor {}
class gprcWriteReactor {}
// our code
class Util {}
class ServerBaseReactor {}
class ClientBaseReactor {}

class ServerBidiBase extends gprcBidiReactor, ServerBaseReactor {}
class ServerReadBase extends gprcReadReactor, ServerBaseReactor {}
class ServerWriteBase extends gprcWriteReactor, ServerBaseReactor {}

class ClientBidiBase extends gprcBidiReactor, ClientBaseReactor {}
class ClientReadBase extends gprcReadReactor, ClientBaseReactor {}
class ClientWriteBase extends gprcWriteReactor, ClientBaseReactor {}

There's still MI here. I don't get the idea, sorry.

Last note – if you're inclined to hit "re-request review" here without proposing any alternatives that don't use multiple inheritance, can you instead schedule a chat with me and @palys-swm to go through this feedback together?

Yes, we can set up a call, but could you please first address Tomek's idea that I agreed on? You never responded to that. Also, maybe clarifying your idea would help. I'm going to request changes as it seems you missed the former thing.

Thanks!

Tomek's idea technically uses the multiple inheritance but in fact, it does not as one of the parent classes is going to be pure abstract, therefore we can treat this as an interface (the functions aren't going to contain implementations).

If you want, we can also go with full delegation:

// code derived by grpc
class gprcBidiReactor {}
class gprcReadReactor {}
class gprcWriteReactor {}
// our code
class ReactorDelegate { methods (with implementations) mutual for all services }

class ServerBidiBase extends gprcBidiReactor { ReactorDelegate rd; }
class ServerReadBase extends gprcReadReactor { ReactorDelegate rd; }
class ServerWriteBase extends gprcWriteReactor { ReactorDelegate rd; }

class ClientBidiBase extends gprcBidiReactor { ReactorDelegate rd; }
class ClientReadBase extends gprcReadReactor { ReactorDelegate rd; }
class ClientWriteBase extends gprcWriteReactor { ReactorDelegate rd; }

// then in the code somewhere:
ServerBidiBase s;
s.rd.getStatus();

But even in this case we'd probably have to force somehow that in every reactor base class there would have to be getRd() method...

ashoat requested changes to this revision.Apr 27 2022, 10:02 AM

As a side note, @ashoat why exactly do you think that MI is an anti-pattern? I read some opinions on the Internet and most people say that we should avoid this if possible but it's indeed sometimes needed. I think in this case we do need this.
Also, is there anything else that could go wrong except methods' names collisions?

I understand you may not have time for this, etc. But, to solve this issue, we have to dive into details. This is a pure conflict - I mean, we need detailed discussion and you refuse to participate due to a lack of time, at the same time having a very strong opinion that's not backed up. With that said, we can of course just treat "no multiple inheritance" as a strict, golden rule, not question it and just obey. I'm ok.

As clarified above, I have a strong feeling that multiple inheritance is an anti-pattern, but I am not in a position here to argue for it. I am sure you could Google this and find several articles that make this point better than I could.

We absolutely should NOT treat this as a strict, golden rule. Instead, you should take responsibility for understanding why multiple inheritance might be an anti-pattern, and to propose some alternatives that do not have the same issues.

First of all, all I said was that we should start talking language-specific concrete stuff instead of general ideas because this discussion just got low-level.

I'd like to see some alternatives considered that are not multiple inheritance. There is a lot of discussion here but no alternatives have been seriously considered.

Sorry, but this is incorrect. First, @palys-swm said

It seems like the solution here might be to use a combination of multiple inheritance of pure virtual methods and a delegation. What are your thoughts?

My initial perception was that @palys-swm suggested an alternative that still uses multiple inheritance. On second read, I understand his argument that multiple inheritance is primarily an issue when neither superclass is abstract – for instance, Java does allow multiple interfaces to be implemented. (In C++, there is no concept of an interface – only a "pure" abstract class.)

However, I did some research and it's clear that gRPC's ClientBidiReactor is not a "pure abstract class" – see here.

As a result, in order to follow the principle @palys-swm suggested, we need to make BaseReactor a "pure abstract class" (aka interface), which means that @karol-bisztyga's initial intention of code sharing (correct me if I'm wrong about that intention) will not be achieved.

Instead, to achieve that we would need something like what I suggested above:

We could have a utility class with the functionality, and then have two superclasses (one for client, one for server) that call into the utility class?

I think this is basically what @palys-swm was getting at with "delegation". We know we can't implement any shared code in BaseReactor: BaseReactor must be "pure abstract" since we already have ClientBidiReactor which is not "pure abstract". So the shared code must be implemented somewhere else, which is what I meant by a "utility class". We can then make sure we call into the shared code from the relevant places.

The reason I suggested two superclasses was to reduce the amount of boilerplate. I think I was wrong that we would need two – it seems like we would actually need six, as you detailed above:

class ServerBidiBase extends gprcBidiReactor, BaseReactor {}
class ServerReadBase extends gprcReadReactor, BaseReactor {}
class ServerWriteBase extends gprcWriteReactor, BaseReactor {}

class ClientBidiBase extends gprcBidiReactor, BaseReactor {}
class ClientReadBase extends gprcReadReactor, BaseReactor {}
class ClientWriteBase extends gprcWriteReactor, BaseReactor {}

Yes, we can set up a call, but could you please first address Tomek's idea that I agreed on? You never responded to that. Also, maybe clarifying your idea would help. I'm going to request changes as it seems you missed the former thing.

I really wish you had scheduled the call right away as I requested. It would have been much faster for us to come to a consensus over a call. It seems like my suggestion and @palys-swm's suggestion are very similar (and perhaps the same).

This revision now requires changes to proceed.Apr 27 2022, 10:02 AM

Ok, I'm going to implement Tomek's/yours idea then - as they are more or less the same thing.

As for the process:

We absolutely should NOT treat this as a strict, golden rule. Instead, you should take responsibility for understanding why multiple inheritance might be an anti-pattern, and to propose some alternatives that do not have the same issues.

That doesn't make sense to me, sorry. I'd rather do research to see if MI is an anti-pattern, not why it is. What is the point of doing research if we strictly assume the result right away? I did the research and asked clearly:

As a side note, @ashoat why exactly do you think that MI is an anti-pattern? I read some opinions on the Internet and most people say that we should avoid this if possible but it's indeed sometimes needed. I think in this case we do need this.
Also, is there anything else that could go wrong except methods' names collisions?

What I don't understand is, that we either take that MI is an anti-pattern as a golden rule and don't question it, look for alternatives, etc. OR we want to check IF it is an anti-pattern to see if that statement is true.
What you said was basically this: "we don't take this as a golden rule but you should've done the research assuming it's true and just looked for the sentences confirming that statement and for code alternatives".

Either I missed something or the logic here limps a lil bit.

services/backup/src/Reactors/client/base-reactors/ClientBidiReactorBase.h
89–110 ↗(On Diff #12039)

In https://phabricator.ashoat.com/D3800#107105 Tomek said:

It seems like we should hold a mutex for the whole duration of this function. I think it will be hard to prove that nothing could go wrong when multiple threads would concurrently set and get the status in this function. For example, one thread can enter the first if, then another thread would enter the same if, then the first thread would set a status and then the second thread would set it to a different value.

Looks good!

What you said was basically this: "we don't take this as a golden rule but you should've done the research assuming it's true and just looked for the sentences confirming that statement and for code alternatives".

Yeah, that's basically right. Basically, try to come up with a charitable version of the "MI is an anti-pattern" argument. To do that, you start with a faith that it is true, and then come up with an argument for it. That allows you to consider alternatives.

Once the alternatives are in front of you, we can have a discussion on whether "MI is an anti-pattern" is a valid argument. But to understand that argument, it's good to search for the best articulation for it first.

For the record, I'm open to arguments against "MI is an anti-pattern". Normally I would be arguing for it, and you would be arguing against it. But in this case I don't recall enough context to argue for it, so basically I'm asking for you to argue for it. You're already able to argue against it, but arguing for it is an important part of the process of determining the way forward.

This revision is now accepted and ready to land.Apr 29 2022, 5:39 AM