About Zach Showalter
Zach Showalter is a Consultant on the Data team.
As organizations increasingly collect, store, and process personal data, protecting Personally Identifiable Information (PII) has never been more critical. One essential practice that organizations can implement at the database level to secure this sensitive information is to obfuscate it using data masking rules.
PII encompasses any data that can be used to identify an individual, such as names, addresses, email addresses, and Social Security numbers (SSN). If this information falls into the wrong hands, it can lead to identity theft, financial fraud, and other severe consequences. Masking PII fields ensures that sensitive data is obscured or anonymized, reducing the risk of exposure and misuse.
Even within organizations, not everyone needs access to raw PII in a database (DB). Role-based access can limit who is able to reach the DB at all, but many organizations have employees who require access to the data for reporting and analytics purposes. For instance, data analysts at a health insurance firm might need access to a Policy Holder table for building reports, but they should not be able to see the policy holders’ SSNs. Data masking rules address this: applied to individual columns of PII data, they allow employees and systems to interact with the data they need from a table without exposing sensitive information.
Many industries are subject to regulations governing the security of PII, and certain countries and states also have laws protecting the PII of their residents (such as the CCPA in California and the GDPR in the EU). Failure to comply with such regulations could lead to fines or legal consequences for an organization.
Beyond legal obligations, an organization serves as a steward for the personal data it collects from its customers and employees, and as such it has an ethical responsibility to protect that data from falling into the wrong hands. Ensuring that PII is handled with care demonstrates respect for individuals’ privacy and fosters positive relationships with stakeholders.
The actual implementation of data masking rules on a database is a relatively straightforward task. However, before any implementation can be done, a standard must be established for defining what constitutes PII within your organization. Your team should collaborate with your business owners and database administrators to align on a PII definition that fits the types and use cases of your organization’s data. As an example, during a recent engagement with a client, our UDig team recommended establishing a PII definition based on that of the GDPR, which treats both directly linked data (full names, addresses, personal email, etc.) and linkable data (first/last name fields, business addresses, business phone numbers, personal characteristics, etc.) as PII. This was a good fit for this particular client because their data covered a wide range of subjects, from individual customers to businesses and other organizations.
For reference, common PII fields include:
Once a definition of PII has been agreed upon, your team should evaluate your database(s) to determine which fields in which tables constitute PII and require masking rules. Depending on the database platform your organization uses, you may have tools available to assist in this process. Azure SQL Database, for instance, has a Dynamic Data Masking feature that suggests columns within the database that could be PII or secure fields. Features such as this are useful because they provide a baseline of columns and tables to investigate within the DB. However, they should not be treated as a comprehensive or correct assessment of which fields contain sensitive data.
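To illustrate what such a baseline amounts to, here is a minimal Python sketch that flags columns whose names look like PII. Everything in it is an assumption made for illustration: the pattern list, the suggest_pii_columns function, and the sample policy_holder columns are hypothetical and do not reproduce how any database platform generates its suggestions.

```python
import re

# Hypothetical name patterns; adjust them to your organization's PII definition.
PII_NAME_PATTERNS = [
    r"ssn",
    r"social",
    r"email",
    r"phone",
    r"address",
    r"(first|last|full)_?name",
    r"birth|dob",
]


def suggest_pii_columns(columns):
    """Return the (table, column) pairs whose column name matches a PII pattern."""
    compiled = [re.compile(pattern, re.IGNORECASE) for pattern in PII_NAME_PATTERNS]
    return [
        (table, column)
        for table, column in columns
        if any(rx.search(column) for rx in compiled)
    ]


# Hypothetical column listing, e.g. exported from the database catalog.
columns = [
    ("policy_holder", "policy_id"),
    ("policy_holder", "first_name"),
    ("policy_holder", "ssn"),
    ("policy_holder", "premium_amount"),
    ("policy_holder", "contact_email"),
]

candidates = suggest_pii_columns(columns)
for table, column in candidates:
    print(f"review: {table}.{column}")
```

A name-based scan like this misses PII stored under unhelpful names (a free-text notes column, for example), which is why it can only serve as a starting point for review.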
Ultimately, the best way to ensure a complete evaluation of all columns in the database is to perform a manual audit of every table. The team performing this audit should document all columns to be flagged as PII, noting the database, schema, table, column name, and data type. This list of fields should then be reviewed by the Database Administrator as well as relevant business stakeholders for approval before moving on to implementation.
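One way to keep the audit output reviewable is to record each flagged column in a fixed structure and export it for sign-off. The sketch below is a hypothetical example: the PiiColumn class, the inventory_to_csv helper, and the sample entries are invented for illustration, and the table field is included on the assumption that reviewers will need it to locate each column.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields


@dataclass(frozen=True)
class PiiColumn:
    """One column flagged as PII during the manual audit."""
    database: str
    schema: str
    table: str
    column: str
    data_type: str


def inventory_to_csv(entries):
    """Serialize the flagged columns to CSV text for DBA and stakeholder review."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=[f.name for f in fields(PiiColumn)])
    writer.writeheader()
    for entry in entries:
        writer.writerow(asdict(entry))
    return buffer.getvalue()


# Hypothetical audit results.
inventory = [
    PiiColumn("insurance", "dbo", "policy_holder", "ssn", "char(11)"),
    PiiColumn("insurance", "dbo", "policy_holder", "contact_email", "varchar(255)"),
]

report = inventory_to_csv(inventory)
print(report)
```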
After identifying the PII fields within your data environment you can define data masking rules for them and implement those rules in your database. Once again, depending on the database platform used by your organization there may be several tools available to your team. Many modern database platforms have native support for data masking. Some examples include:
Using features such as these you can set masking rules on only the PII-flagged columns to obscure the data from unauthorized accounts. There are several types of masking functions that can be used:
Select the technique or combination of techniques that best fits your needs and ensures that data remains protected while still usable for its intended purposes.
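To make the trade-off between protection and usability concrete, the following Python sketch imitates three common masking styles: full replacement, partial masking that keeps a few trailing characters, and email masking that keeps the first character and the domain. These functions are illustrative assumptions only. In practice the masking is applied by the database platform's own feature, and the function names and output formats here are hypothetical.

```python
def mask_full(value, replacement="XXXX"):
    """Replace the entire value with a fixed placeholder."""
    return replacement


def mask_partial(value, visible_suffix=4, pad="X"):
    """Keep the last `visible_suffix` characters and pad everything before them."""
    if len(value) <= visible_suffix:
        # Too short to reveal anything safely, so hide the whole value.
        return pad * len(value)
    return pad * (len(value) - visible_suffix) + value[-visible_suffix:]


def mask_email(value):
    """Keep the first character and the domain; hide the rest of the local part."""
    local, separator, domain = value.partition("@")
    if not separator or not local:
        # Not a recognizable email address, so fall back to full masking.
        return mask_full(value)
    return local[0] + "XXX@" + domain


print(mask_full("Jane Doe"))
print(mask_partial("123-45-6789"))
print(mask_email("jane.doe@example.com"))
```

Partial masking leaves enough of the value visible for tasks such as confirming the last four digits of an SSN, while full replacement is the safer choice when the column is not needed at all.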
Once masking rules are in place, test your implementation by querying the masked columns from an account that does not have unmask privileges and confirming that the data comes back obfuscated.
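A check like this can be automated by comparing what a privileged account sees with what an unprivileged account sees. The sketch below assumes you have already fetched the same rows, in the same order, through both accounts and loaded them as dictionaries. The find_unmasked_values function and the sample rows are hypothetical.

```python
def find_unmasked_values(raw_rows, masked_rows, pii_columns):
    """Return (row_index, column) pairs where the unprivileged result still
    matches the raw value, meaning the masking rule did not take effect."""
    leaks = []
    for index, (raw, masked) in enumerate(zip(raw_rows, masked_rows)):
        for column in pii_columns:
            # NULLs carry no PII, so only compare populated values.
            if raw[column] is not None and masked[column] == raw[column]:
                leaks.append((index, column))
    return leaks


# Rows as seen by an account with unmask privileges (hypothetical data).
raw_rows = [
    {"policy_id": 1, "ssn": "123-45-6789", "email": "jane@example.com"},
]
# The same rows as seen by an account without unmask privileges.
masked_rows = [
    {"policy_id": 1, "ssn": "XXXXXXX6789", "email": "jane@example.com"},
]

leaks = find_unmasked_values(raw_rows, masked_rows, ["ssn", "email"])
for index, column in leaks:
    print(f"row {index}: column '{column}' is not masked")
```

In this sample the email column is reported because its value is identical in both result sets, which is the kind of gap the test is meant to surface.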
After the initial implementation of data masking rules, it is important to document your data masking procedures and maintain records of your compliance efforts so that they are easy to reference in the future. Data environments are constantly evolving, so it is also important to maintain regular processes surrounding data masking to account for updates and new data added to the database. Consider implementing the following processes:
As the laws and regulations surrounding data protection continue to evolve, data security measures such as data masking become an increasingly integral part of data security strategy. By following the steps outlined above you will be positioned to successfully integrate data masking into your data environment, ensure an extra layer of security around your organization’s PII, and reduce the risk of sensitive data being compromised. Doing this will not only strengthen your organization’s security as a whole, but also demonstrate your commitment to safeguarding the privacy of your customers.