On Mar 7, 2019, between 9:44 PST and 10:20 PST, some Snowflake customers with deployments on AWS EU - Frankfurt were unable to use Snowflake services and experienced one of the following symptoms:
1. Inability to retrieve or save worksheets from the Snowflake UI
2. Failed queries
Snowflake Engineering resolved the issue by taking the following steps:
First, the Engineering team identified the impacted clusters and immediately isolated them from production.
Second, the team added new clusters to production after successfully verifying the startup sequence and validating the HSM calls.
Third, the team monitored the affected deployment for errors after HSM v2 became active.
The disruption in service was caused by an incorrect configuration file that prevented a service module (HSM) from starting up correctly. The incorrect configuration file resulted from a bug in the security permissions for the HSM module.
Our investigation of the problem identified two areas of opportunity for improvement:
1. Update the Java security file with the appropriate permissions for the HSM module.
2. The upgrade of the HSM module to v2 is a significant change that is being applied sequentially to each of our deployments. For the pending upgrades, we have added a new step (in addition to the code change identified in item 1 above) that verifies the HSM v2 upgrade by checking the permissions and invoking the call to retrieve a test key.
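For illustration only, the verification step described in item 2 might look roughly like the Java sketch below. It is not Snowflake's actual code. The class name `HsmUpgradeCheck`, the `KeyClient` interface, the library name `hsmclient`, and the alias `test-key` are all hypothetical stand-ins. The sketch assumes the granted permissions are available as a `java.security.PermissionCollection` and that the HSM module exposes some call that returns key bytes for an alias.

```java
import java.security.Permission;
import java.security.PermissionCollection;
import java.security.Permissions;
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

/** Hypothetical post-upgrade smoke check: permissions first, then a test key fetch. */
final class HsmUpgradeCheck {

    /** Assumed stand-in for whatever client the HSM module exposes. */
    interface KeyClient {
        Optional<byte[]> retrieveKey(String alias) throws Exception;
    }

    /** Returns a list of problems; an empty list means the check passed. */
    static List<String> verify(PermissionCollection granted,
                               List<Permission> required,
                               KeyClient client,
                               String testKeyAlias) {
        List<String> problems = new ArrayList<>();

        // Step 1: every required permission must be implied by the granted set.
        for (Permission p : required) {
            if (!granted.implies(p)) {
                problems.add("missing permission: " + p.getName());
            }
        }

        // Step 2: the call to retrieve a test key must succeed and return data.
        try {
            Optional<byte[]> key = client.retrieveKey(testKeyAlias);
            if (!key.isPresent() || key.get().length == 0) {
                problems.add("test key not returned: " + testKeyAlias);
            }
        } catch (Exception e) {
            problems.add("test key call failed: " + e.getMessage());
        }
        return problems;
    }

    public static void main(String[] args) {
        Permissions granted = new Permissions();
        granted.add(new RuntimePermission("loadLibrary.hsmclient")); // hypothetical library name

        List<Permission> required = new ArrayList<>();
        required.add(new RuntimePermission("loadLibrary.hsmclient"));

        // Fake client that always returns a one-byte key; a real check would call the HSM.
        KeyClient fake = alias -> Optional.of(new byte[] {1});

        List<String> problems = verify(granted, required, fake, "test-key");
        System.out.println(problems.isEmpty()
                ? "HSM check passed"
                : "HSM check failed: " + problems);
    }
}
```

Returning a list of problems rather than failing on the first one lets a single run report both a missing permission and a failed key retrieval, which is useful when the check gates the rollout to the next deployment.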
We apologize for the inconvenience. If you have any questions or issues, please send feedback to email@example.com or submit a support request ticket via the Snowflake Lodge Portal.