Details

Reviewers

• karol
tomek

Commits

rCOMMceb853937917: [services] Tunnelbroker - Amqp Manager in parallel threads messages throughput…

Summary

This diff introduces a multithreaded access test for the AMQP Manager and the AMQP-C++ client implementation. The test spawns 100 threads that each send 10 messages in parallel, which lets us check for possible deadlocks and unsynchronized access to shared resources inside the AMQPManager implementation.

The test can also show the throughput of the single shared AMQP channel implementation when sending and receiving messages.
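One way to turn the throughput remark into a number is to time the send/receive loop with a monotonic clock. The following is a minimal sketch only: `measureThroughput` and the `work` callable are hypothetical names used for illustration and are not part of this diff.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <functional>

// Hypothetical helper (not from the diff): runs `work`, which is assumed to
// send or receive `messageCount` messages, and returns messages per second.
double measureThroughput(std::size_t messageCount,
                         const std::function<void()> &work) {
  assert(messageCount > 0);
  const auto start = std::chrono::steady_clock::now();
  work();
  const std::chrono::duration<double> elapsed =
      std::chrono::steady_clock::now() - start;
  // Avoid dividing by zero if the clock reports no elapsed time.
  const double seconds = elapsed.count() > 0.0 ? elapsed.count() : 1e-9;
  return static_cast<double>(messageCount) / seconds;
}
```

In a test, `work` would wrap the code that spawns the sender threads and pops all delivered messages, and the returned rate could be logged rather than asserted on, since it depends on the machine and the broker.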

Related Linear task: ENG-1495

Test Plan

Run the `yarn run-unit-tests tunnelbroker` command and confirm that the test passes.

Diff Detail

Repository

rCOMM Comm

Lint

Lint Not Applicable

Unit

Tests Not Applicable

Event Timeline

• max created this revision. Aug 6 2022, 1:05 PM

• max held this revision as a draft.

Herald added subscribers: • abosh, • karol, atul and 3 others. Aug 6 2022, 1:05 PM

• max published this revision for review. Aug 6 2022, 1:06 PM

• max retitled this revision from [services] Tunnelbroker - Amqp Manager messages throughput test to [services] Tunnelbroker - Amqp Manager in parallel threads messages throughput test.

• max edited the summary of this revision.

• max added reviewers: • karol, tomek.

• max mentioned this in D4767: [services] Tunnelbroker - Add AMQP shared channel locking.

• max mentioned this in D4740: [services] Tunnelbroker - Wrap `connect()` into `init()` in AmqpManager. Aug 6 2022, 1:13 PM

Harbormaster completed remote builds in B11202: Diff 15411. Aug 6 2022, 1:14 PM

• karol requested changes to this revision. Aug 8 2022, 4:57 AM

• karol added inline comments.

services/tunnelbroker/test/AmqpManagerTest.cpp
138 ↗	(On Diff #15411)	Why detach? I think we shouldn't do that.

This revision now requires changes to proceed. Aug 8 2022, 4:57 AM

Switch to use join() instead of detach().

• max marked an inline comment as done. Aug 8 2022, 6:36 AM

• max added inline comments.

services/tunnelbroker/test/AmqpManagerTest.cpp
138 ↗	(On Diff #15411)
> Why detach? I think we shouldn't do that.

There is no difference here between `join` and `detach`, because detached threads will also exit at the end of the test application. The distinction matters in long-running applications such as servers, where we need to control the lifetime of the threads. Anyway, I don't mind changing it to `join()`.

Harbormaster completed remote builds in B11209: Diff 15420. Aug 8 2022, 6:45 AM

• karol requested changes to this revision. Aug 8 2022, 7:21 AM

• karol added inline comments.

services/tunnelbroker/test/AmqpManagerTest.cpp
138 ↗	(On Diff #15411)
> There is no difference here between join and detach

They're quite different, I'd say.

> detached threads will be exited too at the end of the test application

Will they? Do we have control over how far with the execution they'd get? A quick test:

```cpp
#include <chrono>
#include <iostream>
#include <thread>

using namespace std;

int main() {
  cout << "MAIN BEGIN" << endl;
  thread th([]() {
    cout << "TH BEGIN" << endl;
    std::this_thread::sleep_for(std::chrono::milliseconds(500));
    cout << "TH END" << endl;
  });
  // Detached: main does not wait for the thread to finish.
  th.detach();
  std::this_thread::sleep_for(std::chrono::milliseconds(100));
  cout << "MAIN END" << endl;
}
```

The output is:

```
MAIN BEGIN
TH BEGIN
MAIN END
```

Does the thread end somewhere in the background? Maybe. Do we have control over it? I don't think so, we just let it fly, right?

Now, what you did with the latest update isn't a good solution either I think. Joining every thread right after spawning them kills the whole purpose of using threads. We could just execute the code sequentially.

I think the best practice is to let them run in the background and join in the end so we're sure they're finished. Something like this:

```cpp
#include <chrono>
#include <iostream>
#include <thread>

using namespace std;

int main() {
  cout << "MAIN BEGIN" << endl;
  thread th([]() {
    cout << "TH BEGIN" << endl;
    std::this_thread::sleep_for(std::chrono::milliseconds(500));
    cout << "TH END" << endl;
  });
  std::this_thread::sleep_for(std::chrono::milliseconds(100));
  cout << "MAIN END" << endl;
  // Joined at the end: main waits here until the thread has finished.
  th.join();
}
```

This revision now requires changes to proceed. Aug 8 2022, 7:21 AM

Remove threads joining.

• max added inline comments. Aug 16 2022, 5:50 AM

services/tunnelbroker/test/AmqpManagerTest.cpp
138 ↗	(On Diff #15411)
> Now, what you did with the latest update isn't a good solution either I think. Joining every thread right after spawning them kills the whole purpose of using threads. We could just execute the code sequentially.

That makes sense.

> I think the best practice is to let them run in the background and join in the end so we're sure they're finished. Something like this:

According to the std::thread documentation, the best and easiest way here is to use neither `detach` nor `join` and just create the threads with the constructor. We can join them at the end, but since this is a short-lived app (just a test), we can omit this instead of adding extra code to a simple test. Thanks @karol, I've removed the `join`ing.

Harbormaster completed remote builds in B11379: Diff 15650. Aug 16 2022, 5:55 AM

• karol requested changes to this revision. Aug 16 2022, 6:52 AM

• karol added inline comments.

services/tunnelbroker/test/AmqpManagerTest.cpp
138 ↗	(On Diff #15411)
> The best and the easiest way here according to the std::thread documentation is just don't use detach or join and just create them by the constructor.

Sorry, but I don't understand what using `join`/`detach` has to do with how we spawn threads.

> We can join them at the end but looking that this is a short live running app (just test) we can omit this instead of adding an additional code to the simple test.

I still don't understand why we cannot simply `join` here at the end.

> but looking that this is a short live running app (just test) we can omit this

How is complexity related to correctness? Even the simplest program should be as safe as possible and correctly written.

I still think we should join at the end. Unjoined threads often cause crashes (not sure if always). Please read this SO answer (which, BTW, includes one more argument against using `detach`).
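For context on the crash concern above: in standard C++, destroying a `std::thread` object that is still joinable (neither joined nor detached) calls `std::terminate`. The sketch below shows one common way to rule that out with a small RAII guard. `JoinGuard` and `runGuarded` are illustrative names and are not taken from the diff.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <utility>

// Illustrative RAII wrapper: joins the owned thread in its destructor, so the
// std::thread is never destroyed while it is still joinable.
class JoinGuard {
public:
  explicit JoinGuard(std::thread t) : thread_(std::move(t)) {}
  ~JoinGuard() {
    if (thread_.joinable()) {
      thread_.join();
    }
  }
  JoinGuard(const JoinGuard &) = delete;
  JoinGuard &operator=(const JoinGuard &) = delete;

private:
  std::thread thread_;
};

// Runs a worker under the guard and returns the value the worker stored.
int runGuarded() {
  std::atomic<int> result{0};
  {
    JoinGuard guard(std::thread([&result]() { result.store(42); }));
    // Leaving this scope joins the worker; no explicit join() call is needed.
  }
  assert(result.load() == 42);
  return result.load();
}
```

C++20 offers `std::jthread`, which also joins in its destructor, so a hand-written guard like this is mainly relevant for earlier standards.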

This revision now requires changes to proceed. Aug 16 2022, 6:52 AM

Joining message sending threads at the end of the test execution.
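The pattern settled on here (start every sender first, join them all afterwards) can be sketched as follows. This is an assumption-laden illustration rather than the test's actual code: `FakeChannel` is a hypothetical mutex-protected stand-in for the shared AMQP channel, and `sendInParallel` is an invented helper name.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <string>
#include <thread>
#include <vector>

// Hypothetical stand-in for the shared channel: a mutex-protected vector.
struct FakeChannel {
  std::mutex mutex;
  std::vector<std::string> sent;

  void send(const std::string &message) {
    std::lock_guard<std::mutex> lock(mutex);
    sent.push_back(message);
  }
};

// Spawns `threadsNumber` senders, each sending `messagesNumber` messages,
// and returns the total number of messages recorded by the channel.
std::size_t sendInParallel(std::size_t threadsNumber,
                           std::size_t messagesNumber) {
  FakeChannel channel;
  std::vector<std::thread> senders;
  senders.reserve(threadsNumber);
  // Start every sender before joining any of them, so they can overlap.
  for (std::size_t t = 0; t < threadsNumber; ++t) {
    senders.emplace_back([&channel, messagesNumber, t]() {
      for (std::size_t m = 0; m < messagesNumber; ++m) {
        channel.send("thread " + std::to_string(t) + " message " +
                     std::to_string(m));
      }
    });
  }
  // Join only after all senders exist; joining inside the spawn loop would
  // run the senders one after another.
  for (std::thread &sender : senders) {
    sender.join();
  }
  assert(channel.sent.size() == threadsNumber * messagesNumber);
  return channel.sent.size();
}
```

Calling `sendInParallel(100, 10)` mirrors the 100 threads and 10 messages per thread described in the summary.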

• max marked an inline comment as done. Aug 16 2022, 11:18 AM

• max added inline comments.

services/tunnelbroker/test/AmqpManagerTest.cpp
138 ↗	(On Diff #15411)
> I still think we should join in the end.

I'm sure it's 100% overengineering here, but it's easier to add it than to spend a lot of additional time pushing each other, especially on a test... I'm always good on good ) I've updated the diff to join the threads at the end of the test execution. Looks good now ;)

Harbormaster completed remote builds in B11391: Diff 15670. Aug 16 2022, 11:25 AM

• karol accepted this revision. Aug 17 2022, 2:39 AM

• karol added inline comments.

services/tunnelbroker/test/AmqpManagerTest.cpp
138 ↗	(On Diff #15411)	Ok, cool. As a side note, it would be cool if you responded to my questions/doubts directly. Actually, none of them were addressed.

• max marked an inline comment as done. Aug 17 2022, 3:45 AM

• max added inline comments.

services/tunnelbroker/test/AmqpManagerTest.cpp
138 ↗	(On Diff #15411)
> The best and the easiest way here according to the std::thread documentation is just don't use detach or join and just create them by the constructor.
>
> Sorry, but I don't understand what using `join`/`detach` has to do with how we spawn threads.

A std::thread starts running when it is constructed, and we can then either detach or join it. When we join a thread, the parent thread waits until that thread completes, while detaching makes the thread fully independent of the parent. A detached thread ends when it finishes on its own or when main exits (including on fatal errors).

> We can join them at the end but looking that this is a short live running app (just test) we can omit this instead of adding an additional code to the simple test.
>
> I still don't understand why we cannot simply `join` here in the end.

We can, but does it make sense in this particular scenario? Ask yourself: how long can a detached thread live in a running test? Not long, because the test is not a long-running app like a server. The detached thread will die after all the tests are done or fail. The thread will be killed automatically without the need to join it.

> but looking that this is a short live running app (just test) we can omit this
>
> How is complexity related to correctness? Even the simples program should be as safe as possible and correctly written.

Following this logic, we could implement additional safety checks for situations that will never occur in a given program, just to make it as safe as possible. But ask yourself: what problem can a detached thread cause in a short-lived test run? Memory overflow: not in this case. Segfault: not in this case. "Ghost" threads: not in this case. So why should a developer spend time protecting against a scenario that will never happen under these conditions? There is a tradeoff between additional complexity and necessity.

We should always ask ourselves whether a given piece of code is really necessary. When we work in a team, it is very likely that someone else will have to read the code, and every line takes time to understand. The next developer will ask: why join the threads if they die anyway at the end of the tests? Figuring that out takes time, which is expensive. I respect your thoughts, they are entirely correct! But we should ask ourselves whether it is really necessary to add extra complexity here. Those are my thoughts.

> Unjoined threads often cause crashes (not sure if always).

That sounds like: mutexes always cause deadlocks, and shared data access always causes race conditions... No, they don't. We should not throw something away just because it needs extra care; we just need to know to avoid it when it's not really necessary and to handle it carefully when it is.

@karol, I'm not against your comments, they were very useful! I'm against adding complexity to simple things when it isn't needed. Thanks, @karol ;) Passing to @tomek to review.

In this test, we are spawning 100 threads that send 10 messages each in parallel.

This isn't exactly correct, even if it seems like that. The problem is that we create threads and start sending messages immediately, so it could happen that all the messages from the first thread will be sent before the last thread even started. The solution is to use some synchronization, so we start sending messages only when all the threads are created. This might test the service better and will give us more accurate performance measurements.
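The synchronization suggested above could look roughly like the following start gate, where every worker blocks until the main thread has created all of them. This is a sketch under assumptions: `StartGate` and `runGated` are invented names, and the atomic counter is a placeholder for the real send call.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Illustrative start gate: workers block in wait() until open() is called.
class StartGate {
public:
  void wait() {
    std::unique_lock<std::mutex> lock(mutex_);
    cv_.wait(lock, [this]() { return open_; });
  }

  void open() {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      open_ = true;
    }
    cv_.notify_all();
  }

private:
  std::mutex mutex_;
  std::condition_variable cv_;
  bool open_ = false;
};

// Creates all workers first, then releases them together and returns the
// number of simulated sends.
std::size_t runGated(std::size_t threadsNumber, std::size_t messagesNumber) {
  StartGate gate;
  std::atomic<std::size_t> sent{0};
  std::vector<std::thread> workers;
  workers.reserve(threadsNumber);
  for (std::size_t t = 0; t < threadsNumber; ++t) {
    workers.emplace_back([&gate, &sent, messagesNumber]() {
      gate.wait(); // no worker sends before the gate is opened
      for (std::size_t m = 0; m < messagesNumber; ++m) {
        sent.fetch_add(1); // placeholder for the real send call
      }
    });
  }
  // Every thread has been created and none has sent anything yet.
  assert(sent.load() == 0);
  gate.open();
  for (std::thread &worker : workers) {
    worker.join();
  }
  return sent.load();
}
```

With C++20, `std::latch` or `std::barrier` could replace the hand-written gate. Starting a throughput timer right before `gate.open()` would also keep thread-creation time out of the measurement.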

services/tunnelbroker/test/AmqpManagerTest.cpp
144–148 ↗	(On Diff #15670)	This assertion isn't really useful. If fewer than `MESSAGES_NUMBER * THREADS_NUMBER` messages are delivered, the loop will hang. If more than `MESSAGES_NUMBER * THREADS_NUMBER` are delivered, we will ignore the extras because the loop will already have finished. So we need a way to verify that after `MESSAGES_NUMBER * THREADS_NUMBER` iterations there are no more messages.

This revision now requires changes to proceed. Aug 17 2022, 7:19 AM

• max added inline comments. Aug 17 2022, 7:25 AM

services/tunnelbroker/test/AmqpManagerTest.cpp
144–148 ↗	(On Diff #15670)
> This isn't exactly correct, even if it seems like that. The problem is that we create threads and start sending messages immediately, so it could happen that all the messages from the first thread will be sent before the last thread even started. The solution is to use some synchronization, so we start sending messages only when all the threads are created. This might test the service better and will give us more accurate performance measurements.

I agree with you here, @tomek! But isn't this too complex for a test? In the past, we agreed that tests should be neither complex nor too time-consuming. So the main question is: isn't this (additional synchronization etc.) too much for a test?

• max requested review of this revision. Aug 19 2022, 6:59 AM

• max marked an inline comment as done.

tomek requested changes to this revision. Aug 19 2022, 7:21 AM

tomek added inline comments.

services/tunnelbroker/test/AmqpManagerTest.cpp
144–148 ↗	(On Diff #15670)	If the additional synchronization is time-consuming, we can skip it (I guess it should be rather easy, though). But checking if there are no more messages after the for loop is a lot more important, so we should have it (unless it's really complicated).

This revision now requires changes to proceed. Aug 19 2022, 7:21 AM

• max requested review of this revision. Aug 19 2022, 9:37 AM

• max marked an inline comment as done.

• max added inline comments.

services/tunnelbroker/test/AmqpManagerTest.cpp
144–148 ↗	(On Diff #15670)
> If the additional synchronization is time-consuming, we can skip it (I guess it should be rather easy, though).

To make the threads start together, we can use a condition variable. The test is already complicated, and using a mutex and a CV would add complexity for little improvement.

> But checking if there are no more messages after the for loop is a lot more important, so we should have it (unless it's really complicated).

This is a good idea! But the problem here is that `DeliveryBroker::getInstance().pop(toDeviceID)` is a blocking read, and it will wait indefinitely if there are no more new messages.

tomek accepted this revision. Aug 19 2022, 10:58 AM

tomek added inline comments.

services/tunnelbroker/test/AmqpManagerTest.cpp
144–148 ↗	(On Diff #15670)
> This is a good idea! But the problem here is that DeliveryBroker::getInstance().pop(toDeviceID) is a blocking read and it will wait indefinitely if no more new messages.

That's correct, but `MPMCQueue` is used underneath, and it should be rather easy to use the `isEmpty` method that it has. Nevertheless, if that's too complicated, we can do without it, but not checking this reduces the usefulness of this test significantly.

This revision is now accepted and ready to land. Aug 19 2022, 10:58 AM

Check for an isEmpty was added.
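The idea behind this check is to pop exactly the expected number of messages with the blocking read and then use a non-blocking emptiness query to detect extras. The sketch below illustrates that shape with a hypothetical `BlockingQueue`; it is not the DeliveryBroker or MPMCQueue API, and `drainedExactly` is an invented helper name.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <queue>
#include <string>
#include <utility>

// Hypothetical stand-in for the delivery queue: pop() blocks while the queue
// is empty, and isEmpty() returns immediately.
class BlockingQueue {
public:
  void push(std::string message) {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      items_.push(std::move(message));
    }
    cv_.notify_one();
  }

  std::string pop() {
    std::unique_lock<std::mutex> lock(mutex_);
    cv_.wait(lock, [this]() { return !items_.empty(); });
    std::string message = std::move(items_.front());
    items_.pop();
    return message;
  }

  bool isEmpty() {
    std::lock_guard<std::mutex> lock(mutex_);
    return items_.empty();
  }

private:
  std::mutex mutex_;
  std::condition_variable cv_;
  std::queue<std::string> items_;
};

// Pops exactly `expected` messages, then reports whether the queue is empty,
// that is, whether no unexpected extra messages are waiting.
bool drainedExactly(BlockingQueue &queue, std::size_t expected) {
  for (std::size_t i = 0; i < expected; ++i) {
    queue.pop(); // blocks until a message is available
  }
  return queue.isEmpty();
}
```

Note that an emptiness check only catches extra messages that have already arrived when it runs. If delivery is asynchronous, a late duplicate could still slip past it.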

• max marked an inline comment as done. Aug 19 2022, 11:18 AM

• max added inline comments.

services/tunnelbroker/test/AmqpManagerTest.cpp
144–148 ↗	(On Diff #15670)
> This is a good idea! But the problem here is that DeliveryBroker::getInstance().pop(toDeviceID) is a blocking read and it will wait indefinitely if no more new messages.
>
> That's correct, but `MPMCQueue` is used underneath and it should be rather easy to use `isEmpty` method that it has. Nevertheless, if that's too complicated, we can leave without it, but not checking this reduces usefulness of this test significantly.

Omg, sorry, I missed that we have exposed the `isEmpty` method in DeliveryBroker. So that's easy, I've added this check. Thanks, @tomek!

Harbormaster completed remote builds in B11481: Diff 15793. Aug 19 2022, 11:25 AM

Fixing space.

Harbormaster completed remote builds in B11482: Diff 15794. Aug 19 2022, 11:35 AM

tomek added inline comments. Aug 22 2022, 2:50 AM

services/tunnelbroker/test/AmqpManagerTest.cpp
149–151 ↗	(On Diff #15794)	Just one thought: what do you think about joining the threads before we check if all the messages are delivered?
144–148 ↗	(On Diff #15670)	Great, thanks!

• max mentioned this in rCOMM3f673de2b9e4: [services] Tunnelbroker - Wrap `connect()` into `init()` in AmqpManager. Aug 22 2022, 6:25 AM

• max mentioned this in rCOMMc3089d0e93e1: [services] Tunnelbroker - Add AMQP shared channel locking. Aug 22 2022, 7:03 AM

Rebase on master changes, move threads joining before checks.

• max marked an inline comment as done. Aug 24 2022, 5:34 AM

• max added inline comments.

services/tunnelbroker/test/AmqpManagerTest.cpp
149–151 ↗	(On Diff #15794)
> Just one thought: what do you think about joining the threads before we check if all the messages are delivered?

Yes, we can do this. I've moved the thread joining before the checks. Thanks, @tomek.

Harbormaster completed remote builds in B11577: Diff 15918. Aug 24 2022, 5:44 AM

Closed by commit rCOMMceb853937917: [services] Tunnelbroker - Amqp Manager in parallel threads messages throughput… (authored by • max). Aug 24 2022, 6:09 AM

This revision was automatically updated to reflect the committed changes.

• max marked an inline comment as done.

• max added a commit: rCOMMceb853937917: [services] Tunnelbroker - Amqp Manager in parallel threads messages throughput….

[services] Tunnelbroker - Amqp Manager in parallel threads messages throughput test
ClosedPublic
Diff 15924

services/tunnelbroker/test/AmqpManagerTest.cpp
