AWS - US East (N. Virginia): SI-20201009
Incident Report for Snowflake
Postmortem

Dear Customer,

Snowflake Engineering has completed the postmortem of this service incident. A detailed Root Cause Analysis (RCA) is available on the Snowflake Community site.

https://community.snowflake.com/s/article/SI-20201008

We apologize for the service incident and the impact on your applications.

If you have any questions or difficulty accessing the Snowflake Community site, please submit a support case via the Snowflake Community site.

Thank you,

Snowflake Support

Posted Oct 20, 2020 - 18:19 PDT

Resolved
AWS fixed the issue with the communication channel between S3 and the EC2 servers.

A Root Cause Analysis (RCA) will be posted within the next ten (10) business days.

We apologize for the inconvenience caused by this Incident.

If you have any questions or see any related issues, please open a support request ticket via the Snowflake Community Site.
Posted Oct 09, 2020 - 14:11 PDT
Monitoring
AWS has narrowed down the problem area and is continuing to work with Snowflake to resolve the issue.

Snowflake has increased the server count in the reserve pool to meet customer demand for warehouses. This step ensures that the ongoing resolution work does not affect customer workloads' demand for warehouse resources.

Symptoms: Query Failures

Incident Start Time: 06:14 AM PT Oct 09, 2020
Incident End Time: 10:45 AM PT Oct 09, 2020

We will remain in this monitoring state until the AWS incident is resolved, or provide an update within the next 4 hours.

If you have any questions or see any related issues, please open a support request ticket via the Snowflake Community Site.
Posted Oct 09, 2020 - 10:53 PDT
Update
AWS Platform is continuing to mitigate the communication channel issue between S3 and the EC2 servers.

Snowflake has increased the server count in the reserve pool to meet customer demand for warehouses. This step minimizes the impact of the ongoing AWS communication channel issue.

For additional details, please visit: https://status.aws.amazon.com/

Symptoms: Query Failures
Incident Start Time: 06:14 AM PT Oct 09, 2020

We will provide another update within 60 minutes.
Posted Oct 09, 2020 - 09:22 PDT
Update
AWS Platform is continuing to mitigate the communication channel issue between S3 and the EC2 servers. Snowflake is monitoring charts for communication rates.

For additional details, please visit: https://status.aws.amazon.com/

Symptoms: Query Failures
Incident Start Time: 06:14 AM PT Oct 09, 2020

We will provide another update within 60 minutes.
Posted Oct 09, 2020 - 08:08 PDT
Update
AWS Platform is rolling out a fix to mitigate the communication issue between S3 and the EC2 servers. Snowflake is monitoring charts for improved communication rates.

Symptoms: Query Failures
Incident Start Time: 06:14 AM PT Oct 09, 2020

We will provide another update within 30 minutes.
Posted Oct 09, 2020 - 07:25 PDT
Identified
We have identified an issue with the AWS platform:
- S3 communication channel with servers

The issue with the S3 communication channel has regressed after AWS applied a fix for the Service Incident from 11:25 PM PT Oct 08, 2020. Snowflake engineers are working with AWS to resolve the issue.

Symptoms: Query Failures
Incident Start Time: 06:14 AM PT Oct 09, 2020

We will provide another update within 30 minutes.
Posted Oct 09, 2020 - 06:44 PDT
This incident affected: AWS - US East (N. Virginia) (Snowflake Data Warehouse (Database)).