Outage

Incident Report for Bitly

Postmortem

Summary:

Data center maintenance on a core router disrupted all inter-datacenter connectivity. Because the redundant link was incorrectly configured, the result was a complete failure. Bitly was in the middle of a datacenter move, so no part of the application could tolerate the loss of inter-datacenter connectivity. Some Bitlink redirect functionality was recovered by engaging a Disaster Recovery plan. Service was restored when the maintenance completed. As a result of this outage, we will be reconfiguring the inter-datacenter backup connection to connect to a router in a different region.
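The remediation above amounts to making sure the primary and backup links do not share a failure domain. The report does not describe the actual misconfiguration or the tooling involved, so the following is only a minimal sketch of that kind of check. The `Link` type, the `shared_failure_domains` helper, and every link, router, and region name are hypothetical and used for illustration only.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Link:
    """Hypothetical inventory record for one inter-datacenter link."""
    name: str
    router: str  # device the link terminates on
    region: str  # region that router is located in


def shared_failure_domains(primary: Link, backup: Link) -> List[str]:
    """Return the failure domains that both links depend on."""
    shared = []
    if primary.router == backup.router:
        shared.append(f"router:{primary.router}")
    if primary.region == backup.region:
        shared.append(f"region:{primary.region}")
    return shared


# Illustrative values only; none of these names come from the report.
primary = Link("interconnect-primary", router="core-rtr-1", region="region-a")
backup_before = Link("interconnect-backup", router="core-rtr-2", region="region-a")
backup_after = Link("interconnect-backup", router="core-rtr-3", region="region-b")

print(shared_failure_domains(primary, backup_before))
print(shared_failure_domains(primary, backup_after))
```

An empty result means the two links share neither a router nor a region under these assumptions. A check like this could run whenever link configuration changes, but it only covers the domains it is told about.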

Timeline:

Feb 18, 2019 - 09:05 UTC: Outage detected

Feb 18, 2019 - 09:13 UTC: Disaster recovery plan executed

Feb 18, 2019 - 09:20 UTC: Status page updated

Feb 18, 2019 - 10:35 UTC: Services restored

Posted Feb 20, 2019 - 15:21 EST

Resolved

This incident has been resolved.
Posted Feb 18, 2019 - 06:07 EST

Monitoring

A fix has been implemented and we are monitoring the results.
Posted Feb 18, 2019 - 05:38 EST

Identified

We have identified the issue and are working with datacenter support teams to resolve it.
Posted Feb 18, 2019 - 05:15 EST

Investigating

We are currently investigating reports of major failures.
Posted Feb 18, 2019 - 04:20 EST
This incident affected: Custom Domain Redirects, Website (Bitly.com), API, and Metrics.