SJC1 Network Issues
Incident Report for Equinix Metal
Postmortem

Reason for Outage

Start Date: 11/19/2018 at 5:50 PM EST

End Date: 11/19/2018 at 6:43 PM EST

Internal Ticket: #1306

Location: SJC1

Description: SJC1 Network Issue

Outage Details

On Monday, November 19, 2018, Packet performed an emergency modification to the master firewall in our SJC1 facility. The incident started at 5:50 PM EST and was resolved at 6:43 PM EST.

The incident was limited to a single rack, which included a small number of servers of our c1.small config type.

During the incident, our networking team discovered a firewall rule that was discarding legitimate traffic. All of the discarded traffic was IPv4; IPv6 traffic was not affected.

After further investigation, the offending rule was modified to restore normal traffic patterns. Due to the nature of the incident, our internal network monitoring was not able to alert us in time.

Timeline

All times are in EST.

  • Monday, November 19, 2018

    • 5:56 PM - Customer Experience was notified of a potential network issue in our SJC1 facility.
    • 6:21 PM - The issue was identified.
    • 6:43 PM - A fix was implemented and monitoring began.
    • 8:31 PM - Monitoring was completed and the issue was resolved.
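
The timeline entries can be cross-checked against the UTC timestamps on the status updates posted for this incident. The following is a minimal sketch of that conversion, not part of any Packet tooling: the names EST, STATUS_UPDATES_UTC and to_est are illustrative, and it assumes a fixed UTC-5 offset, which holds for this date because US daylight saving time ended on November 4, 2018.

```python
from datetime import datetime, timedelta, timezone

# Assumption: a fixed UTC-5 offset is enough here, since the incident
# took place after US daylight saving time ended on November 4, 2018.
EST = timezone(timedelta(hours=-5), "EST")

# Status-update timestamps as posted on the incident page (UTC).
STATUS_UPDATES_UTC = {
    "Investigating": "2018-11-19 23:00",
    "Identified": "2018-11-19 23:21",
    "Monitoring": "2018-11-19 23:43",
    "Resolved": "2018-11-20 01:31",
}


def to_est(utc_text: str) -> str:
    """Convert a 'YYYY-MM-DD HH:MM' UTC timestamp to a readable EST string."""
    utc_time = datetime.strptime(utc_text, "%Y-%m-%d %H:%M").replace(tzinfo=timezone.utc)
    return utc_time.astimezone(EST).strftime("%b %d, %I:%M %p %Z")


for status, posted in STATUS_UPDATES_UTC.items():
    print(f"{status:<13} {posted} UTC -> {to_est(posted)}")
```

Under that assumption, the Identified, Monitoring and Resolved updates map to 6:21 PM, 6:43 PM and 8:31 PM EST on November 19, matching the timeline.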

Impact Notes

  1. This incident was isolated to a specific rack.
Posted Nov 21, 2018 - 21:24 UTC

Resolved
This incident has been resolved.
Posted Nov 20, 2018 - 01:31 UTC
Monitoring
A fix has been implemented and we are monitoring the results.
Posted Nov 19, 2018 - 23:43 UTC
Identified
The affected servers are limited to a single rack. Our network engineers are working on a fix.
Posted Nov 19, 2018 - 23:21 UTC
Investigating
We are currently investigating reports of network issues in our SJC1 facility.
Posted Nov 19, 2018 - 23:00 UTC
This incident affected: Equinix Metal Network.