New Sauce Connect tunnels would not start in our European datacenter.
Why Did It Happen?
We inadvertently made a network-level change to our Sauce Connect Tunnel Cloud that left it unable to receive requests to start new tunnels, starting at 1:58 CEST on April 9.
How Was It Resolved?
The change was reverted at 3:10 CEST and network connectivity was restored. New tunnels could be started again, but a large backlog of tunnel requests had built up and all of them arrived for processing at once. This artificial "spike" in demand for new tunnels overwhelmed the Tunnel Cloud and, in effect, new tunnels still could not be started. At 3:58 CEST we made additional changes to throttle the incoming requests and allow our Tunnel Cloud to stabilize. By 4:30 CEST we had fully recovered and established all requested Sauce Connect tunnels.
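To illustrate the throttling idea described above, here is a minimal sketch of draining a backlog at a capped rate instead of all at once. It is not our production code: the function name `drain_backlog`, the `per_tick_limit` parameter, and the notion of a "tick" are all hypothetical and chosen only for illustration.

```python
from collections import deque


def drain_backlog(backlog, per_tick_limit):
    """Admit at most `per_tick_limit` queued requests per tick.

    Hypothetical illustration only: returns one batch per tick, so the
    downstream service never sees more than `per_tick_limit` requests
    in any single tick, however large the backlog is.
    """
    if per_tick_limit < 1:
        raise ValueError("per_tick_limit must be at least 1")

    queue = deque(backlog)
    batches = []
    while queue:
        # Take the oldest requests first, up to the per-tick cap.
        take = min(per_tick_limit, len(queue))
        batches.append([queue.popleft() for _ in range(take)])
    return batches


# Ten queued tunnel requests, admitted four per tick.
batches = drain_backlog(range(10), per_tick_limit=4)
for tick, batch in enumerate(batches):
    print(f"tick {tick}: admitted {len(batch)} requests")
```

The trade-off in a scheme like this is that requests at the back of the queue wait longer, but every tick stays within what the service can handle, so the backlog shrinks steadily rather than stalling.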
What Are We Doing To Ensure It Never Happens Again?
* Resolving the bug found in our configuration that caused the loss of network connectivity.
* Improving our response process to allow our Sauce Connect Tunnel Cloud to recover more rapidly.
* Adding capacity to our Sauce Connect Tunnel Cloud in Europe to improve resiliency and better meet spikes in demand.
* Making additional software updates to improve the responsiveness of our Sauce Connect Tunnel Cloud under load.