This diff is part of a stack.
This diff introduces a fix to the AMQP-Client reconnection algorithm.
- The current implementation uses the AMQP_SHORTEST_RECONNECTION_ATTEMPT_INTERVAL constant and the current timestamp to compute the delay between attempts.
This diff replaces that with AMQP_RECONNECT_MAX_ATTEMPTS and AMQP_RECONNECT_ATTEMPT_INTERVAL.
AMQP_RECONNECT_MAX_ATTEMPTS is the maximum number of reconnect attempts before we exit with an error. The value is 10.
AMQP_RECONNECT_ATTEMPT_INTERVAL is the interval between reconnect attempts. The value is 3 seconds.
With 10 attempts at 3-second intervals, the maximum waiting time is 30 seconds, which looks like enough to me to reconnect after a network issue.
- The current while(true) loop in AmqpManager::connect() doesn't work.
If the channel/connection is closed, it either throws an error or hits a segmentation fault, because it accesses this->amqpChannel, which is null in that case.
The loop now uses a local atomic reconnectAttempt counter bounded by the AMQP_RECONNECT_MAX_ATTEMPTS constant. A fatal error is thrown only once, outside the loop, when the maximum number of reconnect attempts is reached.
- The reconnectAttempt counter is now cleared on a successful reconnect.
- The waiter method introduced in D4741 was updated to use AMQP_RECONNECT_ATTEMPT_INTERVAL when waiting for another attempt to check whether the connection/channel is ready.
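
For reviewers, the bounded retry loop described above can be sketched roughly as follows. This is a minimal illustration, not the actual AmqpManager::connect() code: connectWithRetry and the tryConnect callback are hypothetical names, and only the two constants and their values come from this diff.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <functional>
#include <stdexcept>
#include <thread>

// Values taken from the diff summary.
constexpr int AMQP_RECONNECT_MAX_ATTEMPTS = 10;
constexpr std::chrono::seconds AMQP_RECONNECT_ATTEMPT_INTERVAL{3};

// Hypothetical helper. `tryConnect` stands in for whatever opens the
// connection/channel and returns true once the channel is ready.
// Returns the number of attempts that were needed.
inline int connectWithRetry(
    const std::function<bool()> &tryConnect,
    int maxAttempts = AMQP_RECONNECT_MAX_ATTEMPTS,
    std::chrono::milliseconds interval = AMQP_RECONNECT_ATTEMPT_INTERVAL) {
  assert(maxAttempts > 0);
  // Local atomic counter, as described in the summary.
  std::atomic<int> reconnectAttempt{0};
  while (reconnectAttempt < maxAttempts) {
    ++reconnectAttempt;
    if (tryConnect()) {
      const int attemptsUsed = reconnectAttempt;
      reconnectAttempt = 0; // clear the counter on a successful reconnect
      return attemptsUsed;
    }
    // Wait before the next attempt instead of spinning.
    std::this_thread::sleep_for(interval);
  }
  // Thrown exactly once, outside the loop, after the last failed attempt.
  throw std::runtime_error("AMQP: maximum reconnect attempts reached");
}
```

With the default arguments, a caller that never manages to connect sleeps 10 times for 3 seconds each, which gives the 30-second upper bound mentioned above. The attempt limit and interval are parameters here only so the sketch can be exercised quickly in a test.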
Related linear task: ENG-1495