[services] Tunnelbroker - Fix AMQP client reconnection algorithm
Summary:
This diff is part of a stack.
This diff introduces a fix to the AMQP-Client reconnection algorithm.
- The current implementation uses the AMQP_SHORTEST_RECONNECTION_ATTEMPT_INTERVAL constant and the current timestamp to compute the delay.
This was changed to use AMQP_RECONNECT_MAX_ATTEMPTS and AMQP_RECONNECT_ATTEMPT_INTERVAL instead.
AMQP_RECONNECT_MAX_ATTEMPTS is the maximum number of reconnect attempts before we exit with an error. The value is 10 attempts.
AMQP_RECONNECT_ATTEMPT_INTERVAL is the interval between reconnect attempts. The value is 3 seconds.
With 10 attempts at 3-second intervals, the maximum total waiting time is 30 seconds, which looks like enough to me to reconnect in case of network issues.
- The current while(true) loop in AmqpManager::connect() doesn't work.
If the channel/connection is closed, it just throws an error or causes a segmentation fault by accessing this->amqpChannel, which is null in that case.
The loop was changed to use a local atomic reconnectAttempt counter bounded by the AMQP_RECONNECT_MAX_ATTEMPTS constant. A fatal error is now thrown only once, outside of the loop, when the maximum number of reconnect attempts is reached.
- The reconnectAttempt counter is now reset on a successful reconnect.
- The waiter method introduced in D4741 was updated to use AMQP_RECONNECT_ATTEMPT_INTERVAL as the wait between checks of whether the connection/channel is ready.
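For reviewers, the bounded retry loop described above can be sketched roughly as follows. This is a minimal illustration and not the code in this diff: connectWithRetry and the tryConnect callback are hypothetical names standing in for whatever opens the connection and channel, and the constant types are assumptions (only the values 10 and 3 seconds come from the summary). Because the sketch returns on success, the counter reset is implicit here.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <functional>
#include <stdexcept>
#include <thread>

// Values quoted in the summary; the types chosen here are assumptions.
constexpr std::size_t AMQP_RECONNECT_MAX_ATTEMPTS = 10;
constexpr std::chrono::milliseconds AMQP_RECONNECT_ATTEMPT_INTERVAL{3000};

// Hypothetical helper: `tryConnect` returns true once the connection and
// channel are usable. Returns the number of attempts that were needed.
inline std::size_t connectWithRetry(
    const std::function<bool()> &tryConnect,
    std::size_t maxAttempts = AMQP_RECONNECT_MAX_ATTEMPTS,
    std::chrono::milliseconds interval = AMQP_RECONNECT_ATTEMPT_INTERVAL) {
  assert(maxAttempts > 0);
  // Local atomic counter, bounded by maxAttempts instead of while(true).
  std::atomic<std::size_t> reconnectAttempt{0};
  while (reconnectAttempt < maxAttempts) {
    if (tryConnect()) {
      return reconnectAttempt + 1;
    }
    ++reconnectAttempt;
    // Fixed delay between attempts: 10 attempts x 3 s = 30 s at most.
    std::this_thread::sleep_for(interval);
  }
  // Thrown exactly once, outside the loop, after the last failed attempt.
  throw std::runtime_error("AMQP: maximum reconnect attempts reached");
}
```

With the default arguments, a caller that never manages to connect gets a single std::runtime_error after roughly 30 seconds, while a caller that succeeds on the third try returns after two 3-second waits.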
Related linear task: ENG-1495
Test Plan:
Successfully built using the yarn run-tunnelbroker-service-in-sandbox command.
All AMQP unit tests pass in D4749, the last diff in the stack.
Reviewers: karol, tomek
Reviewed By: karol, tomek
Subscribers: ashoat, tomek, adrian, atul, karol, abosh
Differential Revision: https://phab.comm.dev/D4744