Masking Data 101: Safeguarding PII in Your Organization

By Zach Showalter

In today’s digital age, data security and privacy are paramount. As organizations increasingly collect, store, and process personal data, protecting Personally Identifiable Information (PII) has never been more critical. One essential practice that organizations can implement at the database level to secure this sensitive information is to obfuscate it with data masking rules.

Why Masking PII Fields is Crucial

Protecting Sensitive Information

PII encompasses any data that can be used to identify an individual, such as names, addresses, email addresses, and Social Security numbers (SSN). If this information falls into the wrong hands, it can lead to identity theft, financial fraud, and other severe consequences. Masking PII fields ensures that sensitive data is obscured or anonymized, reducing the risk of exposure and misuse.

Enhancing Data Security

Even within organizations, not everyone needs access to raw PII in a database (DB). While this can certainly be mitigated through role-based access to limit who can access the DB, many organizations have employees who require access to the data for reporting and analytics purposes. For instance, data analysts at a health insurance firm might need access to a Policy Holder table for building reports, but they should not be able to see the policy holders’ SSNs. Data masking rules address this because they are applied to individual columns of PII data, allowing employees and systems to interact with the data they need from a table without exposing sensitive information.
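As a rough illustration of the idea, the sketch below applies a masking rule to a single column and leaves the rest of the row untouched, depending on whether the caller has unmask privileges. It is a minimal Python model of the behavior, not how any particular database implements it; the MASKING_RULES mapping, the read_row function, and the sample policy holder row are all hypothetical.

```python
from typing import Callable

# Hypothetical masking rules, keyed by column name. Only flagged columns appear here.
MASKING_RULES: dict[str, Callable[[str], str]] = {
    "ssn": lambda value: "XXX-XX-" + value[-4:],
}


def read_row(row: dict[str, str], can_unmask: bool) -> dict[str, str]:
    """Return the row as the caller is allowed to see it."""
    if can_unmask:
        return dict(row)
    # Mask only the columns that have a rule; every other column passes through.
    return {
        column: MASKING_RULES[column](value) if column in MASKING_RULES else value
        for column, value in row.items()
    }


# Made-up sample row standing in for a Policy Holder record.
policy_holder = {"policy_id": "P-1001", "plan": "Gold", "ssn": "123-45-6789"}
analyst_view = read_row(policy_holder, can_unmask=False)
admin_view = read_row(policy_holder, can_unmask=True)
```

In this sketch the analyst still sees the policy ID and plan needed for reporting, while the SSN column is obscured.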

Ensuring Legal and Regulatory Compliance

Many industries are subject to regulations regarding the security of PII, and certain countries and states also have laws in place regarding the PII of their citizens (such as the CCPA in California and the GDPR in the EU). Failure to comply with such regulations could lead to fines or legal consequences for an organization.

Beyond legal obligations, an organization serves as a steward for the personal data it collects from its customers and employees, and as such it has an ethical responsibility to protect that data from falling into the wrong hands. Ensuring that PII is handled with care demonstrates respect for individuals’ privacy and fosters positive relationships with stakeholders.

Implementing Data Masking Rules

Defining PII in your Organization

The actual implementation of data masking rules on a database is a relatively straightforward task. However, before any implementation can be done, a standard must be established for defining what constitutes PII within your organization. Your team should collaborate with your business owners and database administrators to align on a PII definition which fits the types and use cases of your organization’s data. As an example, during a recent engagement with a client our UDig team recommended establishing a PII definition based on that of the GDPR, which emphasizes both directly linked data (full names, addresses, personal email, etc.) and linkable data (first/last name fields, business addresses, business phone numbers, personal characteristics, etc.) as constituting PII. This was a good fit for this particular client as their data covered a wide range of subjects, from individual customers to businesses and other organizations.

For reference, common PII fields include:

Names
Date of Birth
Email addresses
Phone numbers
Addresses
Social Security numbers
PINs

Identifying PII

Once a definition of PII has been agreed upon, your team should evaluate your database(s) to determine which fields in which tables constitute PII and require masking rules. Depending on the database platform your organization uses, you may have tools available to assist in this process. Azure SQL Database, for instance, has a Dynamic Data Masking feature which provides suggestions of columns within the database that could be PII or secure fields. Features such as this are useful because they can provide a baseline of columns and tables to investigate within the DB. However, they should not be treated as a comprehensive or correct assessment of which fields contain sensitive data.

Ultimately, the best way to ensure a complete evaluation of all columns in the database is to perform a manual audit of all the tables within. The team performing this audit should document all columns to be flagged as PII, noting the database, schema, column name, and data type. This list of fields should then be reviewed by the Database Administrator as well as relevant business stakeholders for approval before moving on to implementation.
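To give the manual audit a starting point, a short script can flag column names that look like PII and record them in the documented format. The sketch below is one possible approach in Python; the ColumnRecord type, the name pattern, and the sample columns are hypothetical, and the table field is an addition to the attributes listed above. Name matching will produce false positives and miss poorly named columns, so it only narrows the list for human review.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class ColumnRecord:
    """One row of the PII inventory described above."""
    database: str
    schema: str
    table: str
    column: str
    data_type: str


# Hypothetical name fragments; adjust these to your organization's PII definition.
PII_NAME_PATTERN = re.compile(
    r"ssn|social|birth|dob|email|phone|address|name|pin", re.IGNORECASE
)


def flag_candidates(columns: list[ColumnRecord]) -> list[ColumnRecord]:
    """Return columns whose names suggest PII, for manual review."""
    return [c for c in columns if PII_NAME_PATTERN.search(c.column)]


# Made-up sample metadata standing in for a real schema export.
columns = [
    ColumnRecord("insurance", "dbo", "policy_holder", "policy_id", "int"),
    ColumnRecord("insurance", "dbo", "policy_holder", "ssn", "char(11)"),
    ColumnRecord("insurance", "dbo", "policy_holder", "date_of_birth", "date"),
    ColumnRecord("insurance", "dbo", "policy_holder", "premium_amount", "decimal"),
]
flagged = flag_candidates(columns)
```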

Implement Masking Rules

After identifying the PII fields within your data environment, you can define data masking rules for them and implement those rules in your database. Once again, depending on the database platform used by your organization, there may be several tools available to your team. Many modern database platforms have native support for data masking; the Dynamic Data Masking feature mentioned earlier is one example.

Using features such as these, you can set masking rules on only the PII-flagged columns to obscure the data from unauthorized accounts. There are several types of masking functions that can be used:

Redaction: Hiding or removing sensitive parts of data. For instance, displaying only the last four digits of a Social Security number. This is often the default masking function.
Substitution: Replacing sensitive data with realistic but fictitious values. For example, replacing real names with placeholder names.
Encryption: Converting data into a format that can only be read with a decryption key. Although encryption isn’t always considered masking, it adds an extra layer of security.
Tokenization: Replacing sensitive data with unique identifiers (tokens) that have no meaningful value outside the database.

Select the technique or combination of techniques that best fits your needs and ensures that data remains protected while still usable for its intended purposes.
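To make the differences between these techniques concrete, here is a minimal Python sketch of redaction, substitution, and tokenization. The function names, the placeholder name list, and the in-memory TokenVault class are all hypothetical; a real database platform provides its own masking functions, and a real token vault would live in secured storage. Encryption is left out because Python's standard library does not include a symmetric cipher.

```python
import hashlib
import secrets


def redact_ssn(ssn: str) -> str:
    """Redaction: hide everything except the last four digits."""
    digits = [ch for ch in ssn if ch.isdigit()]
    return "XXX-XX-" + "".join(digits[-4:])


# Hypothetical pool of fictitious names used for substitution.
PLACEHOLDER_NAMES = ["Alex Doe", "Sam Roe", "Pat Poe"]


def substitute_name(name: str) -> str:
    """Substitution: map a real name to a fictitious one, consistently."""
    digest = hashlib.sha256(name.encode("utf-8")).digest()
    return PLACEHOLDER_NAMES[digest[0] % len(PLACEHOLDER_NAMES)]


class TokenVault:
    """Tokenization: swap values for random tokens and keep the mapping private."""

    def __init__(self) -> None:
        self._token_by_value: dict[str, str] = {}
        self._value_by_token: dict[str, str] = {}

    def tokenize(self, value: str) -> str:
        # Reuse the existing token so the same value always maps to the same token.
        if value not in self._token_by_value:
            token = "tok_" + secrets.token_hex(8)
            self._token_by_value[value] = token
            self._value_by_token[token] = value
        return self._token_by_value[value]

    def detokenize(self, token: str) -> str:
        return self._value_by_token[token]


vault = TokenVault()
token = vault.tokenize("123-45-6789")
```

Note the trade-offs the sketch exposes: redaction and substitution cannot be reversed, while a token can be exchanged for the original value by anyone with access to the vault.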

Once masking rules are in place, test your implementation by querying the masked columns from an account that does not have unmask privileges and confirming that the data is effectively obfuscated.
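One way to automate part of this test is to scan the rows returned to a low-privilege account for values that still look like raw data. The sketch below assumes the rows have already been fetched as dictionaries using such an account; the find_unmasked function, the SSN pattern, and the sample rows are hypothetical.

```python
import re

# Matches a raw, unmasked SSN in the common NNN-NN-NNNN layout.
SSN_PATTERN = re.compile(r"\d{3}-\d{2}-\d{4}")


def find_unmasked(rows: list[dict], column: str, pattern: re.Pattern = SSN_PATTERN) -> list[dict]:
    """Return the rows whose value in `column` still looks like raw data."""
    return [row for row in rows if pattern.fullmatch(str(row[column]))]


# Made-up rows standing in for query results seen by an account without unmask privileges.
rows_seen_by_analyst = [
    {"policy_id": "P-1001", "ssn": "XXX-XX-6789"},
    {"policy_id": "P-1002", "ssn": "XXX-XX-4321"},
]
leaks = find_unmasked(rows_seen_by_analyst, "ssn")
```

An empty result suggests the rule is working for that column; any returned row points to a value that reached the account unmasked.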

Document Processes & Monitor New Data

After the initial implementation of data masking rules, it is important to document your data masking procedures and maintain records of your compliance efforts so that it is easy to reference them in the future. Data environments are constantly evolving, and it is important to maintain regular processes surrounding data masking to account for updates and new data added to the database. Consider implementing the following processes:

Establish a Process for New Table Ingestion: Whenever a new table is added to the database as part of new feature development, the developer implementing the new feature should additionally vet any new tables associated with it for PII. Any identified columns should be documented, and the defined rules should be implemented for that table by the Database Administrator.

Data Masking Rules Maintenance & Support: Define a regular interval (once per month, every other month, etc.) to review your masking rules and adapt them to any changes in business needs, regulatory compliance, or the database tables themselves. This process should be performed by the Database Administrator.
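Both processes are easier to sustain if new columns are detected automatically. The sketch below compares the columns currently in the database against the set that has already been vetted and lists anything still awaiting review. The unreviewed_columns function and the sample data are hypothetical; in practice the current set might be read from a metadata view such as INFORMATION_SCHEMA.COLUMNS, where your platform provides one.

```python
# A column is identified by (schema, table, column).
ColumnKey = tuple[str, str, str]


def unreviewed_columns(current: set[ColumnKey], reviewed: set[ColumnKey]) -> list[ColumnKey]:
    """Return columns present in the database that have not been vetted for PII."""
    return sorted(current - reviewed)


# Made-up sample data: two vetted columns plus a newly added claims table.
reviewed = {
    ("dbo", "policy_holder", "policy_id"),
    ("dbo", "policy_holder", "ssn"),
}
current = reviewed | {
    ("dbo", "claims", "claim_id"),
    ("dbo", "claims", "claimant_email"),
}
pending = unreviewed_columns(current, reviewed)
```

Running a check like this during each scheduled review gives the Database Administrator a concrete list of columns to vet.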

As the laws and regulations surrounding data protection continue to evolve, data security measures such as data masking become an increasingly integral part of data security strategy. By following the steps outlined above you will be positioned to successfully integrate data masking into your data environment, ensure an extra layer of security around your organization’s PII, and reduce the risk of sensitive data being compromised. Doing this will not only strengthen your organization’s security as a whole, but also demonstrate your commitment to safeguarding the privacy of your customers.

Ready to start masking your data? Let’s talk.

About Zach Showalter

Zach Showalter is a Senior Consultant on the Data team.
