Azure - East US 2 (Virginia): INC0123685
Incident Report for Snowflake
Postmortem

Snowflake Engineering has completed the postmortem of this service incident. A detailed Root Cause Analysis (RCA) is available on the Snowflake Community site:
https://community.snowflake.com/s/article/INC0123659INC0123685

Posted Jan 23, 2025 - 22:51 PST

Resolved
Current status: We've coordinated with our third-party service provider to implement the fix for this issue, and we've monitored the environment to confirm that service was restored. If you experience additional issues or have questions, please open a support case via Snowflake Community.

Customer experience: Customers hosted in the specified region may experience delays while attempting to access or use Snowflake services and features. Virtual warehouses may be delayed in a resuming state.

Incident start time: 00:01 UTC January 09, 2025
Incident end time: 15:10 UTC January 10, 2025

Preliminary root cause: A third-party service provider experienced a network configuration issue, which caused a subset of storage partitions to become unhealthy. This resulted in intermittent virtual machine connectivity issues and failures in allocating or communicating with resources in the region. The nature of this incident prevented standard workarounds, such as bypassing the affected availability zones, which prolonged recovery time.

A root cause analysis (RCA) document will be published within ten business days.
Posted Jan 10, 2025 - 22:59 PST
Monitoring
Current status: We continue to work with our third-party service provider as they address the underlying cause of this incident, previously identified as INC0123659, which resulted in prolonged warehouse resume times and delays when accessing or utilizing Snowflake services and features.

The third-party provider's recovery plan has been completed and they continue to see signs of recovery. They are monitoring telemetry to confirm the issue is fully resolved.

Our internal telemetry indicates the majority of the impact to Snowflake services and features recovered at 21:20 UTC on January 9, 2025; the remaining impact recovered at 15:10 UTC on January 10, 2025. Due to the nature of the issue, we're conducting an extended period of monitoring to ensure the impact doesn't return. We'll continue monitoring over the weekend and through peak traffic hours, and we'll provide another update within 72 hours.

Customer experience: Customers hosted in the specified region may experience delays while attempting to access or use Snowflake services and features. Virtual warehouses may be delayed in a resuming state.

Incident start time: 00:01 UTC January 09, 2025
Incident end time: 15:10 UTC January 10, 2025

Preliminary root cause: A third-party service provider experienced a network configuration issue, which caused a subset of storage partitions to become unhealthy. This resulted in intermittent virtual machine connectivity issues and failures in allocating or communicating with resources in the region. The nature of this incident prevented standard workarounds, such as bypassing affected availability zones, which prolonged recovery time.
Posted Jan 10, 2025 - 14:50 PST
Update
Current status: We continue to work with our third-party service provider as they address the underlying cause of this incident, previously identified as INC0123659, which resulted in prolonged warehouse resume times and delays when accessing or utilizing Snowflake services and features.

The third-party provider's recovery plan has been completed and they are seeing signs of recovery; however, they continue to validate that the issue has been fully resolved.

We remain committed to monitoring the situation closely and implementing measures wherever possible to maintain service continuity.

Our next update will be shared within three hours or sooner if we receive significant developments from the third-party provider.

Customer experience: Customers hosted in the specified region may experience delays while attempting to access or use Snowflake services and features. Virtual warehouses may be delayed in a resuming state.

Incident start time: 00:01 UTC January 09, 2025
Posted Jan 10, 2025 - 10:41 PST
Update
Current status: We continue to work with our third-party service provider as they address the underlying cause of this incident, previously identified as INC0123659, which resulted in prolonged warehouse resume times and delays when accessing or utilizing Snowflake services and features.

The third-party provider's recovery plan remains underway; however, full resolution continues to require an extended timeline. Snowflake's previously implemented mitigation strategies - such as reallocating resources and increasing capacity in high-demand areas - are unchanged, minimizing the impact for many of our customers. We remain committed to monitoring the situation closely and implementing measures wherever possible to maintain service continuity.

The third-party provider has confirmed the issue stemmed from a configuration change within their regional networking service, which resulted in an inconsistent service state across a subset of storage partitions. This caused intermittent virtual machine connectivity issues and failures in allocating or communicating with resources in the region. The nature of this incident prevented standard workarounds, such as bypassing the impacted availability zones, which prolonged recovery time.

Our next update will be shared within three hours or sooner if we receive significant developments from the third-party provider.

Customer experience: Customers hosted in the specified region may experience delays while attempting to access or use Snowflake services and features. Virtual warehouses may be delayed in a resuming state.

Incident start time: 00:01 UTC January 09, 2025
Posted Jan 10, 2025 - 07:23 PST
Update
Current status: We continue to collaborate with our third-party service provider to address this incident, previously identified as INC0123659, which resulted in prolonged warehouse resume times and delays when accessing or utilizing Snowflake services and features.

The third-party provider's recovery plan is ongoing; however, they continue to indicate full resolution will require an extended timeline. Snowflake's previously implemented mitigation strategies - such as reallocating resources and increasing capacity in high-demand areas - are holding in place, reducing the impact for many of our customers. We remain committed to monitoring the situation closely and implementing measures wherever possible to maintain service continuity.

The third-party provider has confirmed the issue stemmed from a configuration change within their regional networking service, which resulted in an inconsistent service state across a subset of storage partitions. This caused intermittent virtual machine connectivity issues and failures in allocating or communicating with resources in the region. The nature of this incident prevented standard workarounds, such as bypassing the impacted availability zones, which prolonged recovery time.

Our next update will be shared within three hours or sooner if we receive significant developments from the third-party provider.

Customer experience: Customers hosted in the specified region may experience delays while attempting to access or use Snowflake services and features. Virtual warehouses may be delayed in a resuming state.

Incident start time: 00:01 UTC January 09, 2025
Posted Jan 10, 2025 - 04:25 PST
Update
Current status: We remain in close collaboration with our third-party service provider to address the ongoing incident, previously identified as INC0123659, which contributed to prolonged warehouse resume times and delays when accessing or utilizing Snowflake services and features.

The third-party provider is making progress on their recovery plan but continues to indicate an extended timeline for full resolution. Snowflake's previously implemented mitigation strategies - such as reallocating resources and increasing capacity in high-demand areas - remain effective in reducing the impact for many of our customers. We remain committed to monitoring the situation closely and implementing measures wherever possible to maintain service continuity.

The third-party provider shares that the issue stemmed from a configuration change within their regional networking service, which resulted in an inconsistent service state. This caused intermittent virtual machine connectivity issues and failures in allocating or communicating with resources in the region. The nature of this incident prevented standard workarounds, such as bypassing the impacted availability zones, which prolonged recovery time.

Our next update will be shared within three hours or sooner if we receive significant developments from the third-party provider.

Customer experience: Customers hosted in the specified region may experience delays while attempting to access or use Snowflake services and features. Virtual warehouses may be delayed in a resuming state.

Incident start time: 00:01 UTC January 09, 2025
Posted Jan 10, 2025 - 01:27 PST
Update
Current status: We remain in close collaboration with our third-party service provider to address the ongoing incident, previously identified as INC0123659, which has been contributing to prolonged warehouse resume times and delays when accessing or utilizing Snowflake services and features.

The third-party provider continues to meticulously advance toward recovery but consistently indicates that resolution requires extended recovery time. Snowflake remains committed to monitoring the situation closely and is ready to apply the necessary mitigation strategies, such as reallocating resources and increasing capacity in high-demand areas, if required.

The third-party provider shares that the issue stemmed from a configuration change within their regional networking service, which resulted in an inconsistent service state. This caused intermittent virtual machine connectivity issues or failures in allocating or communicating with resources in the region. The nature of this incident prevents standard workarounds, such as bypassing the impacted availability zones, which prolongs recovery time.

Our next update will be shared within three hours or sooner if we receive significant developments from the third-party provider.

Customer experience: Customers hosted in the specified region may experience delays while attempting to access or use Snowflake services and features. Virtual warehouses may be delayed in a resuming state.

Incident start time: 00:01 UTC January 09, 2025
Posted Jan 09, 2025 - 22:29 PST
Update
Current status: We remain in close collaboration with our third-party service provider to address the ongoing incident, previously identified as INC0123659, which has been contributing to prolonged warehouse resume times and delays when accessing or utilizing Snowflake services and features.

As per the previous update from the third-party provider, a configuration change in a regional networking service resulted in an inconsistent service state, causing intermittent virtual machine connectivity issues or failures in allocating or communicating with resources in the region. The nature of this incident prevented standard workarounds, such as bypassing the impacted availability zones.

The provider is actively advancing its recovery process and continues to indicate that resolution requires extended recovery time. Snowflake's previously implemented mitigation strategies - such as reallocating resources and increasing capacity in high-demand areas - continue to reduce the impact for many of our customers. We remain committed to monitoring the situation closely and optimizing our measures wherever possible to further improve service continuity.

Our next update will be shared within three hours or sooner if we receive significant developments from the third-party provider.

Customer experience: Customers hosted in the specified region may experience delays while attempting to access or use Snowflake services and features. Virtual warehouses may be delayed in a resuming state.

Incident start time: 00:01 UTC January 09, 2025
Posted Jan 09, 2025 - 19:23 PST
Update
Current status: We are continuing to coordinate with our third-party service provider to fully recover from an incident recently posted to our status page (INC0123659).

As per the previous update from the third-party provider, a configuration change in a regional networking service resulted in an inconsistent service state, causing intermittent virtual machine connectivity issues or failures in allocating or communicating with resources in the region. Snowflake is designed to support multiple availability zones within the third-party provider's infrastructure. However, the nature of the impact prevents potential workarounds such as bypassing the impacted availability zones within the provider.

The service provider continues to work towards service restoration and advises an extended recovery time. In parallel, we have taken measures to alleviate impacts by reallocating available resources and increasing the number of virtual warehouses where demand is high. Some customers have experienced improved performance as a result of these actions.

We will provide another update within three hours or as updates are shared by the third-party provider.

Customer experience: Customers hosted in the specified region may experience delays while attempting to access or use Snowflake services and features. Virtual warehouses may be delayed in a resuming state.

Incident start time: 00:01 UTC January 09, 2025
Posted Jan 09, 2025 - 16:17 PST
Update
Current status: We are continuing to coordinate with our third-party service provider to fully recover from an incident posted to our status page earlier today (INC0123659).

Per the third-party provider, a configuration change in a regional networking service resulted in an inconsistent service state, causing intermittent virtual machine connectivity issues or failures in allocating resources or communicating with resources in the region. While Snowflake is designed to support multiple availability zones within the third-party provider's infrastructure, the nature of the impact prevents potential workarounds such as bypassing the impacted availability zones within the provider.

While the service provider continues to work towards service restoration, they report that extended recovery times are expected. In parallel, we are continuing to investigate any opportunities to assist with mitigation or provide impact relief, such as reallocating available resources and increasing the number of available virtual warehouses whenever possible. We will provide another update within three hours or as updates are shared by the third-party provider.

Customer experience: Customers hosted in the specified region may experience delays while attempting to access or use Snowflake services and features. Virtual warehouses may be delayed in a resuming state.

Incident start time: 00:01 UTC January 09, 2025
Posted Jan 09, 2025 - 13:07 PST
Update
Current status: Snowflake has identified residual impact that may be occurring as a result of ongoing efforts by a third-party service provider to fully recover from an incident posted to our Status Page earlier today (INC0123659).

We are continuing to liaise with this service provider, who has confirmed a networking issue in the impacted region that remains the focus of their ongoing recovery actions.

We are still coordinating with the provider as they work towards service restoration. A fix continues to be gradually rolled out, but extended recovery times are expected. We will provide another update within two hours or as updates are shared by the third-party provider.

Customer experience: Customers hosted in the specified region may experience delays while attempting to access or use Snowflake services and features. Virtual warehouses may be delayed in a resuming state.

Incident start time: 00:01 UTC January 09, 2025
Posted Jan 09, 2025 - 11:00 PST
Update
Current status: Snowflake has identified residual impact that may be occurring as a result of ongoing efforts by a third-party service provider to fully recover from an incident posted to our Status Page earlier today (INC0123659).

We are continuing to liaise with this service provider, who has confirmed a networking issue in the impacted region that remains the focus of their ongoing recovery actions.

We are still coordinating with the provider as they work towards service restoration, and a fix continues to be gradually rolled out. We will provide another update within two hours or as updates are shared by the third-party provider.

Customer experience: Customers hosted in the specified region may experience delays while attempting to access or use Snowflake services and features. Virtual warehouses may be delayed in a resuming state.

Incident start time: 00:01 UTC January 09, 2025
Posted Jan 09, 2025 - 08:50 PST
Update
Current status: Snowflake has identified residual impact that may be occurring as a result of ongoing efforts by a third-party service provider to fully recover from an incident posted to our Status Page earlier today (INC0123659).

We are continuing to liaise with this service provider, who has confirmed a networking issue in the impacted region that remains the focus of their ongoing recovery actions.

We are still coordinating with the provider as they work towards service restoration, and a fix continues to be gradually rolled out. Snowflake engineers investigated a potential workaround but determined that it is not feasible for this situation. We will provide another update within 60 minutes.

Customer experience: Customers hosted in the specified region may experience delays while attempting to access or use Snowflake services and features.

Incident start time: 00:01 UTC January 09, 2025
Posted Jan 09, 2025 - 07:41 PST
Update
Current status: Snowflake has identified residual impact that may be occurring as a result of ongoing efforts by a third-party service provider to fully recover from an incident posted to our Status Page earlier today (INC0123659).

We are continuing to liaise with this service provider, who has confirmed a networking issue in the impacted region that remains the focus of their ongoing recovery actions.

This notification has been posted to advise customers in the affected region of this residual impact as the service provider continues their work to fully restore service.

We are still coordinating with the provider as they work towards service restoration. The fix continues to be gradually rolled out. Additionally, Snowflake engineers are investigating available actions to move impacted workloads to a stable service zone in the meantime. We will provide another update within 60 minutes.

Customer experience: Customers hosted in the specified region may experience delays while attempting to access or use Snowflake services and features.

Incident start time: 00:01 UTC January 09, 2025
Posted Jan 09, 2025 - 06:37 PST
Update
Current status: Snowflake has identified residual impact that may be occurring as a result of ongoing efforts by a third-party service provider to fully recover from an incident posted to our Status Page earlier today (INC0123659).

We are continuing to liaise with this service provider, who has confirmed a networking issue in the impacted region that remains the focus of their ongoing recovery actions.

This notification has been posted to advise customers in the affected region of this residual impact as the service provider continues their work to fully restore service.

We are still coordinating with the provider as they work towards service restoration. The fix continues to be gradually rolled out, and customers should start to see signs of recovery. We will provide another update within 60 minutes.

Customer experience: Customers hosted in the specified region may experience delays while attempting to access or use Snowflake services and features.

Incident start time: 00:01 UTC January 09, 2025
Posted Jan 09, 2025 - 05:37 PST
Update
Current status: Snowflake has identified residual impact that may be occurring as a result of ongoing efforts by a third-party service provider to fully recover from an incident posted to our Status Page earlier today (INC0123659).

We are continuing to liaise with this service provider, who has confirmed a networking issue in the impacted region that remains the focus of their ongoing recovery actions.

This notification has been posted to advise customers in the affected region of this residual impact as the service provider continues their work to fully restore service.

We are coordinating with the service provider as they work towards service restoration. The provider is applying a fix, which is being gradually rolled out. We will provide another update within 60 minutes.

Customer experience: Customers hosted in the specified region may experience delays while attempting to access or use Snowflake services and features.

Incident start time: 00:01 UTC January 09, 2025
Posted Jan 09, 2025 - 04:38 PST
Identified
Current status: Snowflake has identified residual impact that may be occurring as a result of ongoing efforts by a third-party service provider to fully recover from an incident posted to our Status Page earlier today (INC0123659).

We are actively liaising with this service provider, who confirmed a networking issue in the impacted region that remains the focus of their ongoing recovery actions.

This notification has been posted to advise customers in the affected region of this residual impact as the service provider continues their work to fully restore service.

We're coordinating with the service provider as they work towards service restoration, and we'll provide another update within 60 minutes.

Customer experience: Customers hosted in the specified region may experience delays while attempting to access or use Snowflake services and features.

Incident start time: 00:01 UTC January 09, 2025
Posted Jan 09, 2025 - 03:42 PST
This incident affected: Azure - East US 2 (Virginia) (Snowflake Data Warehouse (Database)).