Page MenuHomePhabricator

[services] Tunnelbroker - Adding the pinging loop and online status of the client check
ClosedPublic

Authored by max on Nov 2 2022, 2:58 PM.
Tags
None
Referenced Files
Unknown Object (File)
Sat, Jan 11, 11:23 PM
Unknown Object (File)
Mon, Dec 30, 12:59 PM
Unknown Object (File)
Mon, Dec 30, 12:22 PM
Unknown Object (File)
Fri, Dec 27, 12:53 PM
Unknown Object (File)
Fri, Dec 27, 12:53 PM
Unknown Object (File)
Fri, Dec 27, 12:53 PM
Unknown Object (File)
Fri, Dec 27, 12:53 PM
Unknown Object (File)
Fri, Dec 27, 12:53 PM

Details

Summary

This diff introduces a pinging messages loop implementation to make sure that the stream is alive and the client is online because gRPC doesn't provide the keep-alive checks out of the box. The full context is available in the ENG-1766 task.
We are also changing the client's (session) online status in the database when the client became online or the stream is dropped or closed.

Linear task: ENG-2060

Test Plan

Testing pinging of the device and its online status:
Monitoring the value of the isOnline field in the tunnelbroker-device-sessions database table.

  • Connecting to the bidirectional stream with the valid sessionID in metadata results in the value changing to true right after connection.
  • Disconnecting from the stream or closing the client, resulting in the value becoming false in the `tunnelbroker-device-sessions database table in no more than 3 seconds (pinging interval).

Diff Detail

Repository
rCOMM Comm
Branch
add-grpc-messages-pinging
Lint
No Lint Coverage
Unit
No Test Coverage

Event Timeline

max held this revision as a draft.
max published this revision for review.Nov 2 2022, 4:49 PM
max edited the summary of this revision. (Show Details)
max edited the test plan for this revision. (Show Details)
max added a reviewer: marcin. max added 1 blocking reviewer(s): jon.
max edited the test plan for this revision. (Show Details)
services/tunnelbroker/src/server/mod.rs
140–142 ↗(On Diff #18043)

might be nice to log this

max edited the test plan for this revision. (Show Details)

Rebasing on parent changes.
Adding of debug message.
Adding variables cloning before the move into the spawned task.

max added inline comments.
services/tunnelbroker/src/server/mod.rs
140–142 ↗(On Diff #18043)

might be nice to log this

Makes sense to use tracing debug here. Added. Thanks, @jon!

This revision is now accepted and ready to land.Nov 4 2022, 11:00 AM
max marked an inline comment as done.

Rebase on master changes.

This revision now requires review to proceed.Nov 8 2022, 7:43 AM

Rebasing on parent changes.

Rebasing on parent changes.

tomek requested changes to this revision.Nov 9 2022, 9:06 AM
tomek added inline comments.
services/tunnelbroker/src/server/mod.rs
114 ↗(On Diff #18231)

Why we're doing this?

123–124 ↗(On Diff #18231)

We're setting this to true only here. Does that mean that after any failed ping we would need a new call to messages_stream? Is that ok?

This revision now requires changes to proceed.Nov 9 2022, 9:06 AM
max marked 2 inline comments as done.
max added inline comments.
services/tunnelbroker/src/server/mod.rs
114 ↗(On Diff #18231)

Why we're doing this?

This is not a necessary step because the channel still be destructed at the end of the async task, but I decided to drop it earlier as we can destruct the channel using drop because anyway we can't write to it anymore if there is an error in it.

123–124 ↗(On Diff #18231)

We're setting this to true only here. Does that mean that after any failed ping we would need a new call to messages_stream? Is that ok?

Yes, we are setting the online to true only on the successful connection. If the channel is dropped in a ping loop or another write we are marking it as false. If the client reconnects the true status will be updated by this line. There are no other reasons to mark the client online except for the successful connection to the channel. I've tested this a lot by connecting to the channel, dropping the network, dropping the connection, reconnecting, etc. it works as expected for any reason.

The same approach was in the C++ handler and we had a long discussion before about that )

tomek requested changes to this revision.Nov 17 2022, 2:08 AM
tomek added inline comments.
services/tunnelbroker/src/server/mod.rs
114 ↗(On Diff #18231)

What is the benefit of using explicit drop? In almost all the cases we should let the object go out of scope - is this case somehow different and dropping is beneficial? For me it seems like a premature optimization, but maybe I'm missing something.

As a side note, I'm not sure if this code does what you think it does. Drop just takes an ownership of an object, and if that results in object going out of scope, then the destructor is called. tx_writer takes channel by reference so this call is probably useless. Could you check that?

This revision now requires changes to proceed.Nov 17 2022, 2:08 AM
max marked 2 inline comments as done.

Removing droping of the channel.

This revision is now accepted and ready to land.Nov 17 2022, 2:56 AM
max marked an inline comment as done.
max added inline comments.
services/tunnelbroker/src/server/mod.rs
114 ↗(On Diff #18231)

What is the benefit of using explicit drop? In almost all the cases we should let the object go out of scope - is this case somehow different and dropping is beneficial? For me it seems like a premature optimization, but maybe I'm missing something.

Yes, it will be dropped at the end if it goes out of scope. The only benefit here is that we are not waiting for that and dropping it earlier.

For me it seems like a premature optimization

Maybe it is. I do not insist on this, so I just removed the drop )

As a side note, I'm not sure if this code does what you think it does. Drop just takes an ownership of an object, and if that results in object going out of scope, then the destructor is called. tx_writer takes channel by reference so this call is probably useless. Could you check that?

Drop is just a destructor of the channel copy in our case.

max removed a reviewer: tomek.

Landing this diff because Jon and Tomek accepted it, and the blocking reviewer (Tomek) was removed accidentally.

Rebase and resolve merge conflict.

This revision was not accepted when it landed; it landed in state Needs Review.Nov 21 2022, 5:07 AM
This revision was automatically updated to reflect the committed changes.