{"id":39020,"date":"2022-10-20T07:16:20","date_gmt":"2022-10-20T11:16:20","guid":{"rendered":"https:\/\/centricconsulting.com\/?p=39020"},"modified":"2022-11-28T11:07:46","modified_gmt":"2022-11-28T16:07:46","slug":"snowflake-security-and-data-privacy-using-snowflake-tags-to-boost-privacy-and-security","status":"publish","type":"post","link":"https:\/\/centricconsulting.com\/blog\/snowflake-security-and-data-privacy-using-snowflake-tags-to-boost-privacy-and-security\/","title":{"rendered":"Snowflake Security and Data Privacy: Using Snowflake Tags to Boost Privacy and Security"},"content":{"rendered":"

Snowflake tags are another tool you can use to boost your data\u2019s security. In this blog, we explain what object tagging is and how you can use it for data governance.<\/h2>\n
\n

In part one of our Snowflake security blog series<\/a>, we discussed how to think about storing and organizing your data, from the organization level all the way down to individual tables.<\/p>\n

In part two<\/a>, we discussed ways to think about data access and control at a granular level. In this blog, we\u2019ll look at a newer Snowflake feature, object tagging, which gives us another tool in our toolbox to identify which data needs granular security.<\/p>\n

First, let\u2019s review a key concept relevant to Snowflake tags: data vs. metadata.<\/p>\n

Data is the information itself \u2013 date, amount, product, customer, and so on. Metadata is information about the data. For example, we know that this nine-digit number (our data) is a Social Security number (our metadata).<\/strong> Historically, we had to rely on naming conventions and human judgment to glean this metadata: we hoped the table was called something like \u201cemployee\u201d and the relevant column something like \u201csocial_security_number,\u201d or we looked for patterns like \u201c123-45-6789.\u201d Now, we can tag our tables and columns directly, making it easy to keep track of important fields and control access to them.<\/p>\n

What Are Tags in Snowflake?<\/h2>\n

Typical databases, feeds and files hold only data with minimal metadata (e.g., column names). However, Snowflake<\/a> provides the opportunity to store much more metadata in the form of tags. Snowflake offers the ability to create and apply tags to databases, schemas, tables, views and columns (and also to users and roles, which we will discuss in a future post)<\/strong>.<\/p>\n

Tags are key-value pairs, for example \u201ccost_center = finance,\u201d \u201cprotection_level = PII\u201d or \u201cPII_type = email.\u201d Equally important, you can define multiple tags and apply them to the same piece of data: \u201cpersonally_identifiable = true; sensitive = true; type = SSN; category = employee; owner = HR\u201d and so on. Because tag names and values can be anything we like, they require careful management to be useful.<\/p>\n
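To make the key-value idea concrete, here is a minimal Python sketch that renders the Snowflake statement for setting several tags on one column. The helper name and the table and column names are hypothetical, and the sketch assumes the tags themselves have already been created with CREATE TAG.

```python
# Hypothetical helper: renders the ALTER TABLE statement that sets
# several tags on a single column in one go.
def set_column_tags_sql(table, column, tags):
    # Each tag is a key-value pair; the value is written as a SQL string
    # literal, so single quotes inside it are doubled.
    pairs = ', '.join(
        "{} = '{}'".format(name, value.replace("'", "''"))
        for name, value in tags.items()
    )
    return 'ALTER TABLE {} MODIFY COLUMN {} SET TAG {};'.format(table, column, pairs)


# Illustrative object names only.
statement = set_column_tags_sql(
    'hr.employee_data.employee',
    'ssn',
    {'protection_level': 'PII', 'PII_type': 'ssn'},
)
print(statement)
```

The function only builds text; running the statement still requires a role that is allowed to apply those tags.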

Further, tags are inherited based on where you apply them \u2013 so if you tag a table \u201cprotection_level = private,\u201d every column in that table will also be tagged as \u201cprotection_level = private.\u201d This applies the same way at higher levels: if you tag an entire database as \u201cprotection_level = private,\u201d then every schema, every table, every view and every column in that database will be tagged as \u201cprotection_level = private.\u201d This again argues for careful management but is ultimately very useful. If you write a query that brings back columns from multiple places, each column pulled from anywhere in the private database will carry the \u201cprotection_level = private\u201d tag in Snowflake\u2019s tagging repository.<\/p>\n

You can also override or add to tags. A specific table in the \u201ccost_center = finance\u201d schema may have its tag overridden to \u201ccost_center = finance_north_america.\u201d All the columns in a table with \u201cprotection_level = PII\u201d will have the same \u201cprotection_level = PII\u201d tag but can also have a specific tag such as \u201cPII_type = email\u201d appended, so both pieces of information are returned when you query information about that column.<\/strong><\/p>\n
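The inheritance and override rules described above can be pictured with a small Python model. This is only an illustration of the behavior described in this post, not Snowflake code; the function name and the dictionary layout are assumptions made for the example.

```python
# Object hierarchy from broadest to narrowest, as described in the text.
LEVELS = ['database', 'schema', 'table', 'column']


def effective_tags(tags_by_level):
    # tags_by_level maps a level name to the tags set directly at that level.
    # Walking from database down to column lets a narrower level override
    # a value inherited from a broader one, while new tags are simply added.
    resolved = {}
    for level in LEVELS:
        resolved.update(tags_by_level.get(level, {}))
    return resolved


email_column = effective_tags({
    'database': {'owner': 'HR'},
    'schema': {'cost_center': 'finance'},
    'table': {'cost_center': 'finance_north_america', 'protection_level': 'PII'},
    'column': {'PII_type': 'email'},
})
print(email_column)
```

In this model the column ends up with the database-level owner, the table-level override of cost_center, the table-level protection_level and its own PII_type.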

Important note: Many different tags can be applied to the same table or column, but you cannot set multiple values for the same tag on the same column. We need to plan and organize our tags carefully in situations where multiple pieces of similar information may apply. For example, you cannot combine \u201cprotection_level=PII\u201d and \u201cprotection_level=GDPR\u201d, but you can combine \u201cPII=true\u201d and \u201cGDPR=true\u201d.<\/p>\n
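A short Python sketch shows why one flag tag per classification works better than a single multi-purpose tag. The model assumes that setting the same tag a second time replaces the earlier value; the helper name is hypothetical.

```python
def as_flag_tags(classifications):
    # One tag per classification, each carrying the string value 'true'.
    return {name: 'true' for name in classifications}


# A single tag can hold only one value on a column, so in this model the
# second assignment replaces the first and the PII marker is lost.
single_valued = {}
for level in ['PII', 'GDPR']:
    single_valued['protection_level'] = level

# Separate flag tags keep both pieces of information.
flags = as_flag_tags(['PII', 'GDPR'])

print(single_valued)
print(flags)
```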

Combining Tagging and Masking<\/h2>\n

In the previous entry in this series, we discussed the Dynamic Data Masking feature that lets us hide the contents of sensitive fields from people without access. Now, we can use tagging to support and improve our masking efforts. Snowflake has not yet implemented a fully dynamic combination of tagging and masking: you must set Dynamic Data Masking explicitly on each column, and the masking rule cannot simply look up and use that column\u2019s tags. However, we can still use this information to drive our masking implementation and audit it for completeness. We can query Snowflake to find all tagged columns that should be masked per our rules, check which of them are actually masked, and apply masking where it\u2019s missing.<\/strong> We can also query Snowflake\u2019s history tables to see who accessed tagged tables or columns and when.<\/p>\n
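The completeness audit boils down to a set difference: columns whose tags say they must be masked, minus columns that already have a masking policy. Here is a minimal Python sketch of that comparison. The inputs are hand-written stand-ins; in practice they might come from Snowflake's tag and policy reference views, and the column names and the "internal" value are made up for the example.

```python
def find_unmasked(tagged_columns, masked_columns, tag_name, tag_value):
    # tagged_columns: {column name: {tag name: tag value}}
    # masked_columns: names of columns that already have a masking policy
    must_mask = {
        column
        for column, tags in tagged_columns.items()
        if tags.get(tag_name) == tag_value
    }
    # Whatever must be masked but is not yet masked is a gap to fix.
    return sorted(must_mask - set(masked_columns))


tagged = {
    'employee.ssn': {'protection_level': 'PII', 'PII_type': 'ssn'},
    'employee.work_email': {'protection_level': 'PII', 'PII_type': 'work_email'},
    'employee.hire_date': {'protection_level': 'internal'},
}
masked = {'employee.ssn'}

gaps = find_unmasked(tagged, masked, 'protection_level', 'PII')
print(gaps)
```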

Combining tagging capability with Dynamic Data Masking, we can create and enforce a hierarchy of permissions:<\/p>\n

Database Tag: \u201cowner = HR\u201d\r\nSchema Tag: \u201ccategory = employee_data\u201d\r\nTable Tag: \u201cprotection_level = PII\u201d\r\nColumn Tags: \u201cPII_type = firstname,\u201d \u201cPII_type = lastname,\u201d \u201cPII_type = work_email,\u201d \r\n\u201cPII_type = work_phone,\u201d \u201cPII_type = personal_phone,\u201d \u201cPII_type = dob,\u201d and \u201cPII_type = ssn\u201d<\/pre>\n

You can keep this simple by using tags to inform your masking requirements, or you can write a slightly more complicated masking policy that calls the SYSTEM$GET_TAG function to enforce masking based directly on the tags in place. This isn\u2019t completely dynamic \u2013 you need to code for the specific tag and column combinations you want to check \u2013 but it does make your code more self-documenting and secure.<\/strong> (Because this lookup logic consumes some computing resources, run a proof of concept first to make sure your specific implementation still performs well.)<\/p>\n
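As a rough sketch of the second option, the Python below renders a masking policy whose condition calls SYSTEM$GET_TAG for one specific tag and column. The policy name, tag location, column name and role name are all hypothetical, and the policy body is only one plausible shape; test it in your own account before relying on it.

```python
# Template for a masking policy that checks one tag on one column.
# All object and role names passed in below are illustrative assumptions.
POLICY_TEMPLATE = """CREATE MASKING POLICY {policy} AS (val STRING) RETURNS STRING ->
  CASE
    WHEN SYSTEM$GET_TAG('{tag}', '{column}', 'COLUMN') = '{value}'
         AND CURRENT_ROLE() <> '{role}'
      THEN '***MASKED***'
    ELSE val
  END;"""


def masking_policy_sql(policy, tag, column, value, role):
    # The tag and column are hard-coded into the policy text, which is
    # why this approach is not completely dynamic.
    return POLICY_TEMPLATE.format(
        policy=policy, tag=tag, column=column, value=value, role=role
    )


sql = masking_policy_sql(
    policy='mask_ssn',
    tag='security.tag_library.protection_level',
    column='hr.employee_data.employee.ssn',
    value='PII',
    role='HR_PII_READER',
)
print(sql)
```

Each column still needs the policy attached explicitly, so one such policy is needed per tag and column combination you want to check.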

Given a masking implementation using the example tags above, if any user in your company who has not been granted access to the \u201cemployee_data\u201d tag happens to find the Employee table and tries to query it, they\u2019ll get:<\/p>\n

\"Snowflake<\/a><\/p>\n

An HR user within your company who has been granted access to the \u201cprotection_level = PII\u201d tag and to some specific PII_type tags will get:<\/p>\n

\"Snowflake<\/a><\/p>\n

Tag Management<\/h2>\n

Because tags are so flexible, we must guard against proliferation and inconsistency. The best practice is to create a separate security database that holds all security-related information, and within that, a \u201cTag_Library\u201d schema where you can define and manage all tags in a central location. We recommend defining a specific role for each tag management task:<\/p>\n

    \n
  • Tag_Administrator<\/strong> \u2013 A person who is allowed to create brand-new tags (such as \u201cowner\u201d or \u201cPII_type\u201d)<\/li>\n
  • Tag_Steward<\/strong> \u2013 A person who is allowed to add new values for an existing tag (such as \u201cwork_email\u201d or \u201cpersonal_phone\u201d as new PII_types)<\/li>\n
  • Tag_Manager<\/strong> \u2013 A person (or program) who can apply tags to databases, schemas, and other objects.<\/li>\n<\/ul>\n
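The division of duties between these three roles can be modeled in a few lines of Python. This is a toy model of the governance idea, not Snowflake's privilege system; the class and method names are invented for the illustration.

```python
class TagLibrary:
    # Toy model of a central tag library with three separate duties.

    def __init__(self):
        self.allowed = {}  # tag name -> set of approved values
        self.applied = {}  # object name -> {tag name: value}

    def _require(self, role, needed):
        if role != needed:
            raise PermissionError('{} required, got {}'.format(needed, role))

    def create_tag(self, role, tag):
        # Only the administrator may introduce a brand-new tag.
        self._require(role, 'Tag_Administrator')
        self.allowed.setdefault(tag, set())

    def add_value(self, role, tag, value):
        # Only the steward may approve a new value for an existing tag.
        self._require(role, 'Tag_Steward')
        if tag not in self.allowed:
            raise KeyError('unknown tag: ' + tag)
        self.allowed[tag].add(value)

    def apply(self, role, obj, tag, value):
        # Only the manager may apply tags, and only approved combinations.
        self._require(role, 'Tag_Manager')
        if value not in self.allowed.get(tag, set()):
            raise ValueError('{} = {} is not in the tag library'.format(tag, value))
        self.applied.setdefault(obj, {})[tag] = value


library = TagLibrary()
library.create_tag('Tag_Administrator', 'PII_type')
library.add_value('Tag_Steward', 'PII_type', 'work_email')
library.apply('Tag_Manager', 'employee.work_email', 'PII_type', 'work_email')
print(library.applied)
```

Because every tag and value must pass through the library first, a misspelled or improvised tag is rejected instead of quietly multiplying.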

    Snowflake offers some automatic tagging (currently in early preview). During the process of loading data into a new Snowflake table, Snowflake can look for patterns like \u201c(###) ###-####\u201d and apply best-guess tags to columns. This is a nascent capability, however, so we need to have our own approach to review and augment any automatic tags.<\/strong><\/p>\n
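For our own review step, a simple pattern check can propose tags for a human to confirm. The Python sketch below is not Snowflake's classification logic; it only illustrates the pattern-matching idea, and the patterns, the threshold and the tag values are assumptions.

```python
import re

# Illustrative patterns only: US-style phone and Social Security formats.
PATTERNS = {
    'phone': re.compile(r'^\(\d{3}\) \d{3}-\d{4}$'),
    'ssn': re.compile(r'^\d{3}-\d{2}-\d{4}$'),
}


def guess_pii_type(samples, threshold=0.8):
    # Return a best-guess PII_type for a column based on sample values,
    # or None when no pattern matches enough of the non-empty samples.
    values = [value for value in samples if value]
    if not values:
        return None
    for pii_type, pattern in PATTERNS.items():
        hits = sum(1 for value in values if pattern.match(value))
        if hits / len(values) >= threshold:
            return pii_type
    return None


print(guess_pii_type(['(614) 555-0100', '(212) 555-0199', '']))
print(guess_pii_type(['123-45-6789', '987-65-4321']))
print(guess_pii_type(['hello', 'world']))
```

A guess like this should feed a review queue, not be applied as a tag directly.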

    Snowflake also offers the ability to monitor and audit tag usage:<\/p>\n

      \n
    • The snowflake.account_usage.tags view shows all tags that have been created.<\/li>\n
    • The snowflake.account_usage.tag_references view shows all the places each tag has been applied directly (each database, schema, view, table, column). You can filter it to zero in on a specific tag or object as desired.<\/li>\n
    • The snowflake.account_usage.tag_references_with_lineage function shows not only which objects carry a given tag but also how the tag got there (e.g., column-level tags inherited from the table, which in turn inherits them from the database).<\/li>\n<\/ul>\n
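The three audit sources above can be queried along the following lines. This Python sketch only assembles the query text, assuming tag_references is queried as a view and tag_references_with_lineage is called as a table function with the tag's fully qualified name; the tag names used are hypothetical.

```python
def tag_audit_queries(tag_name, tag_fqn):
    # Unquoted identifiers are stored in uppercase, so the name filter
    # is uppercased before it is compared with the stored tag name.
    name = tag_name.upper()
    return {
        'all_tags': 'SELECT * FROM snowflake.account_usage.tags;',
        'direct_references': (
            'SELECT * FROM snowflake.account_usage.tag_references '
            "WHERE tag_name = '{}';".format(name)
        ),
        'references_with_lineage': (
            'SELECT * FROM TABLE(snowflake.account_usage.'
            "tag_references_with_lineage('{}'));".format(tag_fqn)
        ),
    }


queries = tag_audit_queries(
    'protection_level', 'security.tag_library.protection_level'
)
for label, query in queries.items():
    print(label + ': ' + query)
```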

      Some data cataloging and lineage tools can make use of these tags. The Alation and OneTrust data catalogs are the first tools explicitly supporting Snowflake tags, but many others are expected. Using a tool like this, you can pull descriptions and locations of all your tagged data into a data catalog, making it easy to see where all instances of PII (for example) are stored and how people are using them.<\/p>\n

      Tag Limitations<\/h2>\n

      Tags are meant for big-picture information about an entire table or column \u2013 metadata that describes the contents and can be used for limited enforcement.<\/strong> However, you cannot apply tags to specific data inside a table itself (e.g., we cannot tag certain rows in a table as belonging to Customer 1 or Customer 2). Not to worry \u2013 we can handle this level of filtering using row access control, discussed in the next entry of our series.<\/p>\n

      Why Do Snowflake Tags Matter?<\/h2>\n

      Remember, the first step in keeping our data private and secure is identifying which data needs protection. Traditional data catalogs and governance processes have to make do with educated guesses about your data and are hard to keep in sync with reality.<\/strong> With built-in data tags, we can keep track of important information right at the source, provide it to all our data consumers, and use it to ensure we\u2019re properly protecting our most sensitive assets.<\/p>\n\n

      \n
      \n <\/div>\n
      \n \n\n <\/a>\n <\/div>\n <\/div>\n","protected":false},"excerpt":{"rendered":"

      You can use Snowflake tags to boost your data\u2019s security. We explain what object tagging is and how you can use it for data governance.<\/p>\n","protected":false},"author":178,"featured_media":39027,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"_oasis_is_in_workflow":0,"_oasis_original":0,"_oasis_task_priority":"","_relevanssi_hide_post":"","_relevanssi_hide_content":"","_relevanssi_pin_for_all":"","_relevanssi_pin_keywords":"","_relevanssi_unpin_keywords":"","_relevanssi_related_keywords":"","_relevanssi_related_include_ids":"","_relevanssi_related_exclude_ids":"","_relevanssi_related_no_append":"","_relevanssi_related_not_related":"","_relevanssi_related_posts":"","_relevanssi_noindex_reason":"","footnotes":""},"categories":[1],"tags":[18616],"coauthors":[15561],"acf":[],"publishpress_future_action":{"enabled":false,"date":"2024-07-21 21:48:14","action":"change-status","newStatus":"draft","terms":[],"taxonomy":"category"},"_links":{"self":[{"href":"https:\/\/centricconsulting.com\/wp-json\/wp\/v2\/posts\/39020"}],"collection":[{"href":"https:\/\/centricconsulting.com\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/centricconsulting.com\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/centricconsulting.com\/wp-json\/wp\/v2\/users\/178"}],"replies":[{"embeddable":true,"href":"https:\/\/centricconsulting.com\/wp-json\/wp\/v2\/comments?post=39020"}],"version-history":[{"count":0,"href":"https:\/\/centricconsulting.com\/wp-json\/wp\/v2\/posts\/39020\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/centricconsulting.com\/wp-json\/wp\/v2\/media\/39027"}],"wp:attachment":[{"href":"https:\/\/centricconsulting.com\/wp-json\/wp\/v2\/media?parent=39020"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/centricconsulting.com\/wp-json\/wp\/v2\/categories?post=39020"},{"taxonomy":"post_tag","embeddable":true,"h
ref":"https:\/\/centricconsulting.com\/wp-json\/wp\/v2\/tags?post=39020"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/centricconsulting.com\/wp-json\/wp\/v2\/coauthors?post=39020"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}