{"id":33568,"date":"2021-12-03T08:10:31","date_gmt":"2021-12-03T13:10:31","guid":{"rendered":"https:\/\/centricconsulting.com\/?p=33568"},"modified":"2022-08-05T12:55:55","modified_gmt":"2022-08-05T16:55:55","slug":"how-snowflake-architecture-delivers-a-modern-data-storage-solution-part-2","status":"publish","type":"post","link":"https:\/\/centricconsulting.com\/blog\/how-snowflake-architecture-delivers-a-modern-data-storage-solution-part-2\/","title":{"rendered":"How Snowflake Architecture Delivers a Modern Data Storage Solution, Part 2"},"content":{"rendered":"
As I explained in my previous blog<\/a>, Snowflake works on a storage and compute separation model, which keeps the storage of data apart from its manipulation.<\/p>\n Now I will explore the question: How does the storage mechanism interact with the compute engine? To answer this question, let\u2019s start by exploring three ways Snowflake manages data.<\/p>\n The persistent storage layer resides in a scalable cloud storage service, such as Amazon S3. This ensures data replication, scaling and availability without any management by customers. Snowflake optimizes and stores data in a columnar format within the storage layer, organized into databases as specified by the user.<\/strong><\/p>\n Snowflake processes queries using Virtual Warehouses (VWs). VWs can work with any database in the storage layer that they are permitted to access, performing operations such as SELECT, DELETE, INSERT, UPDATE and COPY INTO. Snowflake configures VWs to \u201crun\u201d only when in use. When not in use, VWs shut down automatically after a period of inactivity, so you are not charged for compute when you are not actively running queries. Snowflake\u2019s caching and cloud services layers further reduce the need to pay for compute time.<\/strong><\/p>\n In my view, Snowflake\u2019s VW architecture is one of its key benefits and differentiators because it enables elasticity, an optimized execution engine, and self-tuning and self-healing:<\/p>\n VWs make Snowflake very elastic, allowing it to manage costs while improving the user experience. For example, a data load might take 20 hours on a system with four nodes but only four hours with 32 nodes. 
Since the user pays for compute-hours, the overall cost stays comparable (80 node-hours versus 128 in this example), yet the user experience is dramatically different.<\/p>\n Each VW\u2019s execution engine is columnar, vectorized and push-based, giving it a number of advantages.<\/p>\n The services layer is Snowflake\u2019s \u201cbrain\u201d and manages the complete Snowflake system: metadata, security, access control and infrastructure. This layer seamlessly communicates with client applications (including the Snowflake web user interface, JDBC and ODBC clients) to coordinate query processing and return results.<\/strong><\/p>\n Services managed in this layer include:<\/p>\n Snowflake provides several features that set it apart from traditional databases:<\/strong><\/p>\n Let\u2019s take a deeper look at two of the most significant features:<\/p>\n In traditional systems, developers had to wait hours or days to spin up a copy of a production data warehouse<\/a> in a lower environment like test or development. In addition, the organization had to pay extra for disks to store all the replicated data.<\/strong><\/p>\n With the Snowflake CLONE command, customers can create their own copies of tables, schemas or databases without duplicating the underlying data. Unlike a physical copy, a clone leaves the data in only one place and adds virtually no storage cost, saving time and money.<\/p>\n Snowflake can easily handle disaster recovery and business continuity with its Database Replication approach.<\/p>\n Most traditional databases require additional hardware, higher cost and more time to replicate data across different regions for disaster recovery and availability.<\/p>\nManaging Data With Snowflake<\/h2>\n
Storage<\/h3>\n
PAX Architecture<\/h4>\n
\n
S3 Usage<\/h4>\n
\n
Virtual Warehouses<\/h3>\n
Elasticity<\/h4>\n
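The cost side of elasticity comes down to simple arithmetic. The sketch below assumes that billing is proportional to the number of nodes multiplied by wall-clock hours, which is a simplification. The function names are hypothetical and are not part of any Snowflake API. It uses the figures from this article's data-load example.

```python
def node_hours(nodes: int, hours: float) -> float:
    """Billed compute, assuming cost scales with nodes x wall-clock hours."""
    return nodes * hours


def hours_with_linear_scaling(base_nodes: int, base_hours: float, new_nodes: int) -> float:
    """Run time on new_nodes if the workload scaled perfectly linearly (an idealization)."""
    return base_nodes * base_hours / new_nodes


# Figures from the article's example: 20 hours on 4 nodes, 4 hours on 32 nodes.
small_cluster = node_hours(4, 20)
large_cluster = node_hours(32, 4)

# Hypothetical ideal case: the same work spread perfectly across 32 nodes.
ideal_hours = hours_with_linear_scaling(4, 20, 32)

print(f"4 nodes:  {small_cluster} node-hours")
print(f"32 nodes: {large_cluster} node-hours")
print(f"Perfect scaling on 32 nodes would take {ideal_hours} hours")
```

With these figures the larger cluster uses 128 node-hours against 80 for the smaller one, so the bill is of the same order while the wait drops from 20 hours to four. Under perfectly linear scaling the two runs would cost exactly the same.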
Execution Engine<\/h4>\n
\n
\n
Services Layer<\/h3>\n
Infrastructure management<\/h4>\n
\n
Metadata management<\/h4>\n
\n
Query parsing and optimization<\/h4>\n
\n
Security<\/h4>\n
\n
Key Features of Snowflake Architecture<\/h2>\n
\n
Zero Copy Cloning<\/h3>\n
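The idea behind cloning without copying can be illustrated with a toy in-memory model. This is not Snowflake's implementation: the `Partition` and `Table` classes are hypothetical, and the sketch assumes that stored partitions are immutable, so two tables can safely share them. A clone copies only the list of references, and later writes to the clone add new partitions without touching the source.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Partition:
    """Immutable block of stored rows (a stand-in for a storage file)."""
    rows: tuple


@dataclass
class Table:
    name: str
    partitions: list = field(default_factory=list)

    def insert(self, rows) -> None:
        # Writes never modify existing partitions; they add a new one.
        self.partitions.append(Partition(tuple(rows)))

    def clone(self, new_name: str) -> "Table":
        # Copy only the references (metadata), not the stored rows.
        return Table(new_name, list(self.partitions))


def physical_partitions(*tables: Table) -> int:
    """Count distinct partition objects actually held in storage."""
    return len({id(p) for t in tables for p in t.partitions})


prod = Table("orders")
prod.insert([1, 2])
prod.insert([3])

dev = prod.clone("orders_dev")
shared_after_clone = physical_partitions(prod, dev)

dev.insert([4])  # The change lands only in the clone.
total_after_write = physical_partitions(prod, dev)

print(shared_after_clone, total_after_write, len(prod.partitions), len(dev.partitions))
```

In this model the clone adds no stored data until it diverges from the source, which is why a development copy of a production table can be created almost instantly.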
\n
Database Replication<\/h3>\n