packet.com Site Unavailable
Incident Report for Equinix Metal
Postmortem

Reason for Outage

Start Date (1): June 18th 2019 at 06:27 UTC

End Date (1): June 18th 2019 at 21:25 UTC

Start Date (2): June 18th 2019 at 12:56 UTC

End Date (2): June 19th 2019 at 13:24 UTC

Problem Location: Packet Website

Problem Description: (1) The Packet website was unavailable. (2) Google Mail responded with NXDOMAIN for email sent to packet.com.

Outage Details

Description of Events & Contributing Factors

At 06:27 UTC a DDoS attack by an unidentified botnet was directed at the Packet WWW origin site. When it was clear the attack would not abate, we blackholed the site address and changed the IP of our WWW origin to restore service.

The site's origin IP was exposed because it was also the apex address (the A record at the domain root) for packet.com and packet.net. An attempt by a Packet engineer to replace the packet.com A record with a CNAME violated RFC 1034 (section 3.6.2), which states that if a CNAME record is present at a node, no other data should be present. As a result, Google's internal DNS caches ignored the MX records for packet.com, causing all Gmail and G Suite customers to get an NXDOMAIN error when sending email to packet.com. Although we quickly rectified the issue and made numerous attempts to escalate a cache flush to Google, the problem was resolved only about 24 hours later, presumably when the Google caches expired.
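
To illustrate the RFC 1034 rule described above, here is a minimal sketch of a pre-publish check that flags any name holding a CNAME alongside other record types. It is not Packet's tooling: the function name `find_cname_conflicts`, the `(name, type)` input format, and the sample record set are all assumptions made for illustration. The check is also simplified in that it ignores the DNSSEC-related exceptions to the rule.

```python
from collections import defaultdict


def find_cname_conflicts(records):
    """Return names that hold a CNAME alongside any other record type.

    `records` is an iterable of (name, rtype) pairs, e.g. ("packet.com.", "MX").
    The result maps each offending name to the sorted list of conflicting types.
    """
    types_by_name = defaultdict(set)
    for name, rtype in records:
        # DNS names are case-insensitive; normalise before grouping.
        types_by_name[name.lower().rstrip(".")].add(rtype.upper())

    conflicts = {}
    for name, rtypes in types_by_name.items():
        others = rtypes - {"CNAME"}
        if "CNAME" in rtypes and others:
            conflicts[name] = sorted(others)
    return conflicts


# Hypothetical record set resembling the apex misconfiguration (illustrative only).
apex = [
    ("packet.com.", "CNAME"),
    ("packet.com.", "MX"),
    ("packet.com.", "NS"),
    ("www.packet.com.", "CNAME"),
]
print(find_cname_conflicts(apex))  # {'packet.com': ['MX', 'NS']}
```

A check along these lines could run before a zone change is published. A CNAME on `www` alone passes, while a CNAME at the apex, where MX and NS records must also exist, is reported.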

Packet has replaced its A record with a Fastly-supplied Anycast IP, and additional ACLs have been put in place to protect our origin server from future attacks.

Customer Impact

The initial DDoS impacted only the public website; the subsequent Google caching problem impacted all email sent from Google Mail to packet.com for a period of about 24 hours.

Posted Jun 28, 2019 - 01:26 UTC

Resolved
This incident has been resolved.
Posted Jun 18, 2019 - 21:25 UTC
Update
We are continuing to monitor for any further issues.
Posted Jun 18, 2019 - 12:02 UTC
Monitoring
A fix has been implemented and we are monitoring the results.
Posted Jun 18, 2019 - 12:02 UTC
Investigating
We are also investigating reports of public internet reachability problems on EWR1 that are related to this issue. Please reach out to support@packet.com with any questions.
Posted Jun 18, 2019 - 11:25 UTC
Update
We are continuing to work on a fix for this issue.
Posted Jun 18, 2019 - 09:55 UTC
Identified
The issue has been identified and a fix is being implemented.
Posted Jun 18, 2019 - 08:57 UTC
Investigating
We are currently investigating this issue.
Posted Jun 18, 2019 - 07:35 UTC
Monitoring
A fix has been implemented and we are monitoring the results.
Posted Jun 18, 2019 - 06:38 UTC
Investigating
We are currently investigating an issue with the www.packet.com (site) backend. This should not affect any customer instances or Packet services.
Posted Jun 18, 2019 - 06:27 UTC
This incident affected: Equinix Metal Website.